Issues with stmincuts in iGraph for R - r

I am trying to have R calculate the minimum cuts of a network. I am having trouble using igraph's st_min_cuts command. It only produces cuts for certain nodes, and the ones it does produce cuts for do not make much sense. For example:
st_min_cuts(net_full, '2', '417', capacity = NULL)
Produces
st_min_cuts(net_full, '2', '346', capacity = NULL)
$value
[1] 2
$cuts
$cuts[[1]]
+ 0/960 edges from 19b17c7 (vertex names):
$partition1s
$partition1s[[1]]
+ 512/525 vertices, named, from 19b17c7:
[1] 346 515 345 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124
[26] 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99
[51] 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74
[76] 73 72 32 31 28 15 8 4 2 524 520 518 517 516 519 513 512 511 510 509 506 504 503 499 497
[101] 491 487 485 480 478 476 474 469 382 380 379 377 370 368 367 366 364 362 361 358 356 354 349 347 338
[126] 336 332 329 319 318 317 316 315 314 313 312 311 310 309 308 307 306 305 304 303 302 301 300 299 298
[151] 297 296 295 30 27 294 293 292 291 290 289 288 287 286 285 284 283 282 281 280 279 278 277 276 275
[176] 274 273 272 24 23 265 264 263 262 261 260 259 258 257 256 255 254 253 252 251 250 249 248 247 246
[201] 245 244 243 242 241 240 239 238 237 21 20 19 18 269 268 267 266 26 17 16 236 235 234 233 232
[226] 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207
+ ... omitted several vertices
I am trying to determine the minimum number of nodes that need to be cut for the source and sink to be disconnected. Any recommendations about how I can fix this issue or find the minimum cuts another way?
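A possible direction, offered as a sketch rather than a diagnosis of net_full: st_min_cuts enumerates edge cuts, and as far as I know its documentation asks for a directed input graph, whereas the goal stated here is a count of nodes. igraph also provides vertex_connectivity() and edge_connectivity(), which accept a source and a target. The toy graph below is hypothetical and only illustrates the two calls.

```r
library(igraph)

# Toy undirected graph (hypothetical, not net_full):
# two routes from vertex 1 to vertex 4, then a single link from 4 to 5.
g <- make_graph(c(1, 2,  1, 3,  2, 4,  3, 4,  4, 5), directed = FALSE)

# Minimum number of vertices whose removal separates vertex 1 from vertex 5.
k <- vertex_connectivity(g, source = 1, target = 5)

# Minimum number of edges whose removal separates the same pair.
e <- edge_connectivity(g, source = 1, target = 5)

k
e
```

With named vertices such as '2' and '346', the same calls should accept the vertex names in place of the numeric ids.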

nls-Fitting data with Weibull density functions

I am trying to fit my data with a Weibull density function.
Eventually, I want to smooth my observations for the entire year so that I can create a smooth GPP (my observation) vs. DOY (day of the year) curve.
The data are attached at the end of my question. Here is the point plot of my data:
Point plot
The formula is quite complex; P(t) stands for my observations.
I managed to build a nonlinear model for my data using the code below:
library(nls2)
library(dplyr)
library(minpack.lm)  # provides nlsLM()
# The data are stored in data.frame d (columns t and y)
# Define the Weibull function
weibull_function <- function(a, b, k, x0, y0, t) {
  ifelse(
    t > (x0 - b * (k - 1) / k),
    y0 + a * ((k - 1) / k)^((1 - k) / k) *
      abs((t - x0) / b + ((k - 1) / k)^(1 / k))^(k - 1) *
      exp(-abs((t - x0) / b + ((k - 1) / k)^(1 / k))^k + (k - 1) / k),
    y0
  )
}
# Fit the model (the original call was missing its closing parenthesis)
lm1 <- nlsLM(y ~ weibull_function(a, b, k, x0, y0, t), data = d,
             start = list(a = 0, b = 10, k = 2, x0 = 1, y0 = 0))
# Plot the predicted values against t
plot(d$t, predict(lm1, d))
But the predicted values do not actually fit my data, as you can see in the plot of the fitted data.
I have gone through quite a lot of answers on StackOverflow,
and I am aware that the bias may relate to the start values I use.
So I changed some of the start values, and here's what surprised me.
As I went through different combinations of start values for a, b, k, x0 and y0, the nls function generated quite a number of different models with different parameter estimates;
however, none of them seems to really fit my data.
Now I am quite confused about which start values I should use, and how I can make sure that the model (supposing I eventually find one that fits my data) is better than any other nls Weibull model, since it is impossible to go through all combinations of start values.
Thank you
t y
1 1 0.0000000
2 2 0.0000000
3 3 0.0000000
4 4 0.0000000
5 5 0.0707867
6 6 0.1712200
7 7 0.4918100
8 8 0.7889240
9 9 0.5143970
10 10 0.7365840
11 11 0.8226880
12 12 0.8913360
13 13 1.9113300
14 14 1.9021600
15 15 2.5347900
16 16 2.9011300
17 17 2.4049000
18 18 0.7344520
19 19 0.1427200
20 20 0.0541768
21 21 0.0000000
22 22 0.0000000
23 23 0.1926340
24 24 0.5145610
25 25 0.8064800
26 26 0.8090040
27 27 2.1381500
28 28 1.8712600
29 29 0.9658490
30 30 0.2964860
31 31 1.2073700
32 32 2.5077900
33 33 3.4101900
34 34 2.8787600
35 35 3.6792400
36 36 2.9349200
37 37 2.6029300
38 38 1.9863700
39 39 1.2938900
40 40 0.4992630
41 41 0.6379650
42 42 0.4024000
43 43 0.1084260
44 44 0.1374730
45 45 0.2230510
46 46 0.1501440
47 47 0.4220550
48 48 0.7916190
49 49 0.6582870
50 50 1.2428100
51 51 1.0643000
52 52 0.4634650
53 53 0.4777060
54 54 0.2625760
55 55 0.3416690
56 56 2.0303200
57 57 1.1497000
58 58 1.4016800
59 59 0.7974760
60 60 1.6967400
61 61 1.5555500
62 62 1.3034300
63 63 2.9090000
64 64 2.0858800
65 65 0.8658620
66 66 3.3597300
67 67 1.0571400
68 68 4.4057700
69 69 3.0252900
70 70 1.2971200
71 71 3.9716500
72 72 3.1547100
73 73 1.6375300
74 74 3.0920600
75 75 4.3314800
76 76 3.6577800
77 77 3.0225800
78 78 3.4114200
79 79 4.1715900
80 80 3.5697300
81 81 3.8911100
82 82 4.4364500
83 83 4.9133700
84 84 5.2404200
85 85 5.7771400
86 86 6.7429000
87 87 6.9022200
88 88 7.4436900
89 89 4.3942800
90 90 0.8826800
91 91 1.4101000
92 92 2.2473800
93 93 2.9795900
94 94 3.9610900
95 95 2.8689700
96 96 2.3157700
97 97 4.2013700
98 98 2.4536200
99 99 2.3285200
100 100 1.6641800
101 101 1.8391400
102 102 3.7247200
103 103 4.4881200
104 104 5.4677000
105 105 7.1896600
106 106 4.5204400
107 107 5.8330400
108 108 3.3793700
109 109 3.8234600
110 110 3.9182200
111 111 3.1710000
112 112 2.9232900
113 113 4.2434700
114 114 4.7464600
115 115 4.6802300
116 116 5.1251200
117 117 6.4484500
118 118 5.6865200
119 119 4.1672000
120 120 4.9955900
121 121 6.9491800
122 122 5.7618500
123 123 2.4349800
124 124 3.7315500
125 125 8.3070800
126 126 4.3468400
127 127 8.4310100
128 128 9.7953500
129 129 5.1387300
130 130 5.6159800
131 131 4.9249800
132 132 5.2035200
133 133 7.3140900
134 134 8.5128400
135 135 8.8445500
136 136 6.4021100
137 137 8.5730400
138 138 9.0752800
139 139 6.9884600
140 140 10.0649000
141 141 10.9208000
142 142 10.4544000
143 143 14.0787000
144 144 12.6344000
145 145 11.9214000
146 146 15.1133000
147 147 15.3369000
148 148 15.4777000
149 149 16.0808000
150 150 15.8116000
151 151 15.3791000
152 152 10.9130000
153 153 11.8881000
154 154 12.5383000
155 155 2.9121600
156 156 4.8731600
157 157 11.6981000
158 158 6.8281600
159 159 8.1552300
160 160 11.3900000
161 161 10.4996000
162 162 9.9490400
163 163 7.3252500
164 164 11.6759000
165 165 10.3756000
166 166 17.2289000
167 167 6.7320000
168 168 13.6835000
169 169 15.4414000
170 170 12.7428000
171 171 13.5159000
172 172 13.8205000
173 173 9.9679200
174 174 11.4347000
175 175 11.8706000
176 176 6.5545700
177 177 13.6308000
178 178 15.3185000
179 179 9.1710900
180 180 13.5977000
181 181 11.2282000
182 182 11.7510000
183 183 11.4871000
184 184 10.4018000
185 185 10.8641000
186 186 9.2063100
187 187 11.3159000
188 188 10.6050000
189 189 12.6539000
190 190 9.2266000
191 191 8.5330400
192 192 9.2949000
193 193 8.2153200
194 194 10.7958000
195 195 7.4245200
196 196 7.2358800
197 197 9.3145700
198 198 8.3644700
199 199 8.4106900
200 200 13.7398000
201 201 12.8421000
202 202 9.3427900
203 203 11.5155000
204 204 12.1537000
205 205 11.3195000
206 206 10.8288000
207 207 11.1031000
208 208 12.6185000
209 209 10.4288000
210 210 8.7446600
211 211 13.1651000
212 212 12.4868000
213 213 7.0671500
214 214 10.6482000
215 215 10.5971000
216 216 11.2978000
217 217 12.0698000
218 218 11.9749000
219 219 11.3467000
220 220 12.7263000
221 221 8.9283400
222 222 9.7184300
223 223 10.2274000
224 224 11.9933000
225 225 12.6712000
226 226 11.4917000
227 227 11.5164000
228 228 11.1688000
229 229 12.1940000
230 230 12.2719000
231 231 12.6843000
232 232 12.0033000
233 233 10.4394000
234 234 10.0225000
235 235 9.3543900
236 236 9.5651400
237 237 8.0770500
238 238 8.2516400
239 239 6.7008700
240 240 10.2780000
241 241 8.4796000
242 242 9.8009400
243 243 8.6459500
244 244 7.7860100
245 245 9.7695600
246 246 8.4967000
247 247 8.2067600
248 248 8.2361900
249 249 7.3512700
250 250 6.2018700
251 251 7.1628900
252 252 7.0082400
253 253 6.9478600
254 254 6.8310100
255 255 4.1930200
256 256 7.1842600
257 257 7.2565500
258 258 3.7791600
259 259 6.7925900
260 260 10.1900000
261 261 7.4041900
262 262 8.6597800
263 263 9.5826000
264 264 8.3029000
265 265 7.2548300
266 266 8.7421600
267 267 4.3173600
268 268 5.5106100
269 269 6.4128400
270 270 5.4460700
271 271 5.8495000
272 272 6.1458700
273 273 6.7045200
274 274 7.3160100
275 275 6.4701900
276 276 4.5038000
277 277 2.7967300
278 278 4.6101100
279 279 3.1605100
280 280 3.4307200
281 281 5.7120700
282 282 4.8887400
283 283 5.2968700
284 284 5.8722500
285 285 6.0290200
286 286 3.8281000
287 287 1.4922500
288 288 4.3007900
289 289 4.7463100
290 290 3.6876100
291 291 3.1633900
292 292 2.5615100
293 293 4.0825100
294 294 2.8859400
295 295 3.1885900
296 296 5.4614400
297 297 4.9645100
298 298 4.4726700
299 299 1.3583300
300 300 1.6828900
301 301 3.0714600
302 302 3.4279900
303 303 1.2706300
304 304 2.2885800
305 305 4.0884900
306 306 1.4124700
307 307 3.6298100
308 308 2.7364700
309 309 2.8791000
310 310 2.6254400
311 311 3.5437700
312 312 1.8247300
313 313 1.6026100
314 314 2.0445300
315 315 1.2098200
316 316 2.9734400
317 317 1.7955200
318 318 1.6497700
319 319 3.7585900
320 320 2.1699300
321 321 1.9716500
322 322 1.0365200
323 323 1.0400600
324 324 1.2130500
325 325 2.7250800
326 326 1.6329600
327 327 3.0840200
328 328 0.7717740
329 329 0.8716610
330 330 1.6803600
331 331 1.3165100
332 332 0.8895280
333 333 1.1678900
334 334 1.3315100
335 335 1.3054600
336 336 0.8515050
337 337 0.4578000
338 338 0.0516099
339 339 0.1484510
340 340 0.2275460
341 341 0.8208840
342 342 0.7448860
343 343 2.3841900
344 344 0.2445460
345 345 0.7701040
346 346 1.9149200
347 347 1.4889100
348 348 0.8986610
349 349 0.3705810
350 350 0.4623590
351 351 0.2586430
352 352 0.1939820
353 353 0.1817090
354 354 0.1586170
355 355 0.0517517
356 356 0.0291422
357 357 0.0269378
358 358 0.0960937
359 359 0.4633600
360 360 0.5766720
361 361 0.8399390
362 362 0.6647790
363 363 0.7475380
364 364 1.6569600
365 365 1.8504600
366 366 1.3835600
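
One common way to approach the start-value question is a multi-start search: fit from a coarse grid of start values, discard the fits that fail, and compare the survivors by residual sum of squares. The sketch below is only an illustration under stated assumptions. It uses base nls on synthetic data, and the single-peak function peak() is a hypothetical stand-in for the Weibull model, not the formula from the question. The same loop should carry over to nlsLM and weibull_function.

```r
# Multi-start least squares on synthetic data (not the GPP series above).
set.seed(1)
t <- 1:100
# Hypothetical single-peak curve standing in for the Weibull model.
peak <- function(a, m, s, t) a * exp(-((t - m) / s)^2)
d <- data.frame(t = t, y = peak(10, 60, 15, t) + rnorm(length(t), sd = 0.5))

# Coarse grid of candidate start values.
starts <- expand.grid(a = c(1, 5, 15), m = c(20, 50, 80), s = c(5, 20))

# Fit from every start; a failed fit yields NULL instead of stopping the loop.
fits <- lapply(seq_len(nrow(starts)), function(i) {
  tryCatch(
    nls(y ~ peak(a, m, s, t), data = d,
        start = list(a = starts$a[i], m = starts$m[i], s = starts$s[i])),
    error = function(e) NULL
  )
})
fits <- Filter(Negate(is.null), fits)

# Compare the converged fits by residual sum of squares; keep the smallest.
rss <- vapply(fits, deviance, numeric(1))
best <- fits[[which.min(rss)]]
coef(best)
```

A grid cannot prove that the global optimum was found, but if many different starts converge to the same lowest residual sum of squares, that is reasonable evidence. If no start gives an acceptable fit, the model form itself (for example a single peak for data with several peaks) may be the limiting factor rather than the start values.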

Using R to create a 'change in x' column for multiple x columns

I'm currently trying to calculate the change in the amount of SWE for multiple sites (50) in order to obtain snow melt rates (change per day). I think I have the right formula, but looping over the sites doesn't seem to work. My code and the error are below.
site <- colnames(winter_pr_swe[,55:104])
for (i in site){
sitex <- paste0(i)
ratex <- paste0('rate',i)
winter_pr_swe1 <- winter_pr_swe %>% mutate(ratex = (sitex - lag(sitex))
}
Error: unexpected '}' in: "winter_pr_swe1 <- winter_pr_swe %>% %>% mutate(ratex = (sitex - lag(sitex))
}"
Here is the head of the data set for reference, each column being the SWE levels for a single site and each row being a day's measurement:
head(winter_pr_swe[,55:104])
X303 X308 X310 X316 X327 X337 X340 X378 X386 X409 X416 X426
1 157 71 234 287 361 163 86 292 241 348 109 183
2 157 150 272 333 384 165 102 300 302 409 130 185
3 157 150 274 340 396 168 107 310 312 419 130 190
4 157 150 274 343 406 168 157 315 312 419 122 190
5 157 147 274 351 409 168 193 318 318 434 107 190
6 165 140 274 351 427 168 193 335 318 434 107 234
X431 X465 X486 X488 X491 X511 X519 X527 X532 X538 X580 X586
1 592 178 69 53 292 427 64 137 450 175 312 246
2 658 208 97 122 340 490 155 137 508 190 340 282
3 668 218 97 124 351 490 157 137 513 203 343 300
4 676 218 94 124 351 490 157 152 521 203 343 302
5 696 224 86 119 351 490 152 160 531 213 353 323
6 726 226 79 109 351 485 145 168 549 218 368 333
X589 X595 X617 X622 X624 X629 X632 X640 X665 X705 X715 X737
1 310 41 351 300 315 206 358 66 46 358 157 478
2 338 43 406 320 330 231 391 155 48 483 170 511
3 353 33 406 328 338 241 396 155 48 485 180 521
4 356 30 406 330 338 241 399 155 48 485 183 523
5 363 25 401 333 353 267 427 155 56 485 183 556
6 368 15 401 345 376 267 460 145 56 485 198 599
X739 X755 X757 X762 X780 X827 X839 X840 X843 X857 X861 X866
1 168 201 274 254 462 356 74 526 183 180 58 112
2 196 244 310 257 533 363 74 584 224 196 145 160
3 201 249 315 267 549 376 74 589 226 196 145 175
4 206 251 318 267 551 376 74 592 229 196 145 160
5 226 249 320 282 587 376 91 620 239 201 142 155
6 226 249 320 282 599 417 91 648 241 201 119 145
X874 X877
1 615 23
2 671 152
3 678 155
4 681 155
5 709 140
6 752 140
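
The parse error comes from the unbalanced parentheses in the mutate() call. Even with that fixed, mutate(ratex = ...) would create a column literally named ratex, and sitex holds a column name as a string rather than the column's values. A possible loop-free alternative, assuming dplyr 1.0 or later, is across() with a .names template. The small data frame below is a stand-in built from the first three columns of the head shown above.

```r
library(dplyr)

# Stand-in for winter_pr_swe[, 55:104], using three of the site columns above.
swe <- data.frame(
  X303 = c(157, 157, 157, 157, 157, 165),
  X308 = c(71, 150, 150, 150, 147, 140),
  X310 = c(234, 272, 274, 274, 274, 274)
)

# For every site column, add a "rate<site>" column holding the day-to-day change.
rates <- swe %>%
  mutate(across(everything(), ~ .x - lag(.x), .names = "rate{.col}"))

rates
```

On the full data set, everything() would be replaced by the column selection, for example across(55:104, ...). The first row of each rate column is NA because there is no previous day to subtract.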

Length of list object returns 1 in R when not 1

I have a list x
X1
1 0.8
2 1.0
3 661.7
4 661.8
5 661.9
6 662.3
7 662.6
8 662.7
9 663.3
10 663.6
11 663.7
12 663.9
13 664.0
14 664.1
15 664.3
16 664.4
17 664.5
18 664.7
19 665.1
20 666.9
21 667.5
22 668.2
23 668.3
24 669.7
25 670.3
26 670.8
27 671.1
28 672.0
29 672.1
30 674.8
31 675.3
32 677.5
33 677.9
34 678.5
35 678.9
36 679.0
37 686.6
38 687.6
39 714.1
40 899.1
41 900.4
42 901.1
43 901.3
44 902.7
45 908.3
46 908.7
47 908.9
48 909.0
49 909.2
50 910.0
51 910.1
52 910.3
53 910.6
54 910.7
55 911.3
56 911.4
57 911.6
58 911.8
59 912.6
60 912.7
61 912.8
62 913.0
63 913.1
64 913.2
65 913.3
66 913.7
67 913.9
68 914.0
69 914.2
70 914.3
71 914.4
72 914.6
73 915.2
74 915.3
75 915.5
76 915.6
77 915.7
78 915.9
79 916.0
80 916.1
81 916.3
82 916.5
83 916.6
84 916.7
85 916.9
86 917.3
87 917.5
88 917.6
89 917.8
90 917.9
91 918.0
92 918.2
93 918.3
94 918.5
95 918.6
96 918.8
97 918.9
98 919.0
99 919.2
100 919.3
101 919.5
102 919.6
103 919.7
104 919.9
105 920.0
106 920.2
107 920.3
108 920.5
109 920.6
110 920.8
111 920.9
112 921.0
113 921.1
114 921.2
115 921.3
116 921.3
117 921.5
118 921.6
119 921.7
120 921.8
121 922.0
122 922.1
123 922.4
124 922.5
125 922.6
126 922.7
127 922.9
128 923.0
129 923.2
130 923.3
131 923.5
132 923.6
133 923.8
134 923.9
135 927.2
136 927.3
137 927.4
138 927.6
139 927.7
140 927.8
141 928.0
142 928.1
143 928.3
144 928.4
145 928.5
146 928.7
147 928.8
148 928.9
149 929.1
150 929.2
151 929.3
152 929.5
153 929.6
154 929.8
155 929.9
156 930.1
157 930.2
158 930.3
159 930.3
160 930.5
161 930.6
162 930.7
163 930.9
164 931.0
165 931.1
166 931.2
167 931.3
168 931.4
169 931.5
170 931.7
171 931.8
172 932.0
173 932.0
174 932.1
175 932.2
176 932.4
177 932.5
178 932.6
179 932.7
180 933.3
181 933.4
182 933.6
183 933.7
184 933.8
185 934.5
186 934.7
187 934.8
188 934.9
189 935.0
190 935.2
191 935.3
192 935.3
193 935.5
194 935.6
195 935.7
196 935.8
197 936.0
198 936.1
199 936.3
200 936.4
201 936.5
202 936.7
203 936.8
204 936.9
205 937.1
206 937.2
207 937.4
208 937.5
209 937.7
210 937.8
211 937.9
212 938.1
213 938.2
214 938.4
215 938.5
216 938.7
217 938.9
218 939.0
219 939.2
220 939.4
221 939.7
222 939.9
223 940.3
224 940.7
225 940.9
226 941.4
227 941.7
228 942.1
229 942.6
230 942.7
231 943.3
232 943.5
233 943.9
234 944.9
235 945.0
236 945.1
237 945.4
238 945.6
239 945.8
240 945.9
241 946.2
242 947.6
243 947.9
244 948.2
245 948.3
246 948.5
247 948.6
248 948.8
249 948.9
250 949.5
251 949.6
252 951.8
253 951.9
254 952.0
255 952.1
256 952.5
257 952.6
258 953.0
259 953.3
260 953.4
261 953.5
262 953.7
263 953.8
264 953.9
265 954.1
266 954.2
267 954.4
268 954.5
269 954.7
270 954.8
271 955.0
272 955.1
273 955.2
274 955.4
275 955.5
276 955.6
277 955.7
278 955.9
279 956.0
280 956.1
281 956.3
282 956.4
283 956.5
284 956.6
285 956.9
286 957.2
287 957.3
288 957.4
289 957.5
290 957.9
291 958.9
292 959.0
293 959.3
294 959.5
295 959.9
296 960.0
297 960.2
298 960.5
299 960.6
300 960.8
301 960.8
302 961.4
303 961.5
304 961.6
305 961.7
306 961.8
307 961.9
308 968.8
309 969.1
310 970.0
311 970.5
312 970.7
313 974.2
314 998.7
315 998.8
316 998.9
317 999.1
318 999.2
319 1000.3
320 1001.2
321 1001.4
322 1001.5
323 1001.6
324 1001.7
325 1003.2
326 1003.4
327 1003.6
328 1004.2
329 1004.3
330 1004.4
331 1004.5
332 1004.6
333 1005.3
334 1005.4
335 1005.5
336 1005.6
337 1005.7
338 1005.9
339 1006.0
340 1006.1
341 1006.8
342 1006.9
343 1007.1
344 1007.2
345 1007.3
346 1007.4
347 1007.6
348 1007.7
349 1007.8
350 1008.0
351 1008.1
352 1008.7
353 1008.8
354 1008.9
355 1009.0
356 1009.2
357 1009.3
358 1009.3
359 1009.5
360 1009.6
361 1009.7
362 1009.8
363 1010.0
364 1010.2
365 1010.4
366 1010.5
367 1010.6
368 1010.7
369 1010.9
370 1011.0
371 1011.1
372 1011.2
373 1011.4
374 1011.5
375 1011.6
376 1011.7
377 1011.9
378 1012.0
379 1012.1
380 1012.2
381 1012.3
382 1012.4
383 1012.6
384 1012.7
385 1012.8
386 1013.0
387 1013.2
388 1013.4
389 1013.5
390 1013.6
391 1013.6
392 1013.8
393 1013.9
394 1014.0
395 1014.0
396 1014.3
397 1014.7
398 1014.8
399 1014.9
400 1015.7
401 1015.8
402 1016.0
403 1016.1
404 1016.2
405 1016.5
406 1016.6
407 1016.9
408 1017.0
409 1017.1
410 1017.3
411 1017.4
412 1017.5
413 1017.7
414 1017.8
415 1017.8
416 1018.3
417 1018.5
418 1026.6
419 1027.0
420 1027.3
421 1027.4
422 1027.7
423 1028.6
424 1029.1
425 1029.9
426 1030.0
427 1030.2
428 1270.0
429 1270.1
430 1270.2
431 1270.3
432 1270.4
433 1270.5
434 1270.7
435 1270.7
436 1270.9
437 1271.0
438 1271.3
439 1271.4
440 1272.3
441 1272.5
442 1273.1
443 1273.2
444 1273.3
445 1273.4
446 1273.5
447 1273.8
448 1274.0
449 1274.1
450 1274.3
451 1274.4
452 1274.6
453 1274.7
454 1274.8
455 1274.9
456 1275.1
457 1275.3
458 1275.5
459 1275.6
460 1275.8
461 1275.9
462 1276.1
463 1276.2
464 1276.3
465 1276.4
466 1276.6
467 1276.7
468 1276.8
469 1277.2
470 1277.3
471 1277.5
472 1277.6
473 1277.7
474 1277.9
475 1278.0
476 1278.1
477 1278.2
478 1278.3
479 1278.4
480 1278.5
481 1278.7
482 1279.0
483 1279.0
484 1279.1
485 1279.3
486 1279.3
487 1279.5
488 1279.6
489 1279.7
490 1279.8
491 1280.3
492 1280.4
493 1280.7
494 1280.8
495 1280.9
496 1281.1
497 1281.3
498 1281.4
499 1281.5
500 1282.3
501 1283.0
502 1283.1
503 1284.0
504 1284.8
505 1284.9
506 1285.0
507 1285.1
508 1285.4
which has a length of 508, although when I use length(x) it returns 1. I have tried the function
length(as.vector(x))
although this also does not work and returns 1. Is there another form that I should convert this list to so that I can accurately find the length? For reference, I am using the length to duplicate other elements using the rep_len() function.
as.vector on a data.frame returns the data.frame itself, as there is no as.vector method for data.frame:
methods('as.vector')
#[1] as.vector,abIndex-method as.vector,ANY-method as.vector,dgCMatrix-method as.vector,dgeMatrix-method
#[5] as.vector,diagonalMatrix-method as.vector,dsCMatrix-method as.vector,ldenseMatrix-method as.vector,Matrix-method
#[9] as.vector,ndenseMatrix-method as.vector,sparseVector-method as.vector.factor as.vector.Matrix*
#[13] as.vector.sparseVector*
We can also check the reverse, i.e. the as.* methods defined for data.frame:
grep('^as\\.', methods(class = 'data.frame'), value = TRUE)
#[1] "as.data.frame.data.frame" "as.data.table.data.frame"
#[3] "as.list.data.frame" "as.matrix.data.frame" "as.tbl.data.frame"
The length of a data.frame is its number of columns, which here is 1. Instead, we need nrow(x). For example, with mtcars:
as.vector(mtcars) # nothing changed
length(as.vector(mtcars))
#[1] 11
But if we use nrow instead:
nrow(mtcars)
#[1] 32
length can also be applied to a single column, extracted as a vector with $ or [[
length(mtcars[[1]])
#[1] 32
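
Applied to the rep_len() use case from the question, a minimal sketch could look like this. The five-row data frame is a shortened stand-in for x, and the labels vector is a made-up example of "other elements" to be repeated.

```r
# Shortened stand-in for the one-column data frame x from the question.
x <- data.frame(X1 = c(0.8, 1.0, 661.7, 661.8, 661.9))

length(x)      # number of columns
nrow(x)        # number of rows
length(x$X1)   # length of the column vector, same as nrow(x)

# Repeat a hypothetical pattern so that it matches the number of rows.
labels <- rep_len(c("a", "b"), nrow(x))
labels
```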

How to partition or subset a nested dataframe with a list of index in R?

I am trying to create a training dataframe for fitting my model. The dataframe I am working with is a nested dataframe. Using createDataPartition, I have created a list of indexes, but I am having trouble subsetting the dataframe with said list.
Here is what the object partitionindex created by caret::createDataPartition looks like:
partitionindex
[[1]]
[[1]]$Resample1
[1] 4 5 6 8 9 10 11 12 14 15 17 18 20 21 23 28 30 32 34 38 39 41 42 46
[25] 47 48 50 52 53 56 57 58 59 60 64 66 67 70 73 75 76 77 78 82 85 87 90 95
[49] 97 99 105 106 110 113 114 116 117 118 119 120 123 124 126 128 129 130 132 134 135 137 139 141
[73] 142 143 144 145 146 148 149 151 153 154 155 157 158 164 165 167 170 174 176 178 182 183 184 186
[97] 189 190 191 193 194 197 198 200 201 202 203 206 210 211 212 213 214 216 219 221 222 223 226 232
[121] 236 237 241 243 247 248 251 254 255 256 258 262 263 264 269 270 271 274 276 277 280 281 284 291
[145] 292 293 295 296 297 299 300 301 302 303 304 309 314 317 318 319 320 323 324 327 328 329 339 341
[169] 342 343 344 345 349 350 351 353 354 355 356 360 361 363 364 365 367 370 371 375 379 380
[[2]]
[[2]]$Resample1
[1] 1 2 4 5 7 8 9 10 14 17 19 22 24 26 28 29 31 32 34 36 37 42 44 45
[25] 47 48 49 51 52 53 56 58 65 66 67 68 72 74 75 77 78 81 83 86 95 96 98 100
[49] 102 104 105 106 110 113 114 115 118 119 122 123 124 125 128 129 130 132 135 137 142 144 145 147
[73] 149 150 151 152 158 160 161 163 165 168 169 170 171 175 176 180 183 186 187 188 191 194 196 199
[97] 203 205 206 207 208 209 210 211 213 215 218 220 221 222 224 225 227 228 231 233 240 241 242 243
[121] 247 248 250 251 254 255 256 257 258 262 263 264 267 268 269 270 272 273 277 278 282 285 286 288
[145] 289 290 292 293 294 295 296 300 301 302 304 305 307 308 312 314 315 316 317 321 323 328 329 332
[169] 333 335 336 339 341 343 344 345 347 348 349 354 355 359 360 362 363 366 369 374 375 376 377
[[3]]
[[3]]$Resample1
[1] 5 8 10 12 17 22 25 26 27 30 32 33 34 36 38 39 42 44 45 46 47 51 52 57
[25] 58 59 62 64 66 70 71 73 75 78 81 82 83 84 86 89 90 95 96 97 98 100 103 104
[49] 105 108 109 111 112 113 114 117 119 120 121 123 124 127 130 131 132 133 137 139 140 141 144 148
[73] 149 150 151 153 154 155 156 157 159 160 163 164 167 168 170 172 173 176 178 179 181 182 184 186
[97] 187 188 189 190 191 207 208 212 214 215 219 220 222 223 227 230 233 234 238 248 250 251 252 253
[121] 256 258 260 261 262 264 265 266 267 270 271 272 275 278 281 285 288 289 291 293 295 297 298 302
[145] 303 305 306 308 312 314 315 318 319 320 321 323 325 326 329 332 333 334 335 336 338 342 343 345
[169] 347 348 349 350 351 352 360 361 363 364 365 366 368 369 370 371 372 374 375 376 377 378
[[4]]
[[4]]$Resample1
[1] 1 2 3 4 5 6 7 8 10 12 14 15 18 19 20 22 23 25 26 27 28 30 31 34
[25] 37 38 40 44 45 46 47 49 50 51 52 59 62 64 66 68 70 71 72 73 75 76 79 80
[49] 81 83 84 86 88 89 91 92 94 95 96 97 99 100 102 105 108 109 112 119 125 126 129 130
[73] 132 134 137 139 140 141 145 150 153 155 156 158 159 162 163 170 178 179 181 182 184 185 187 188
[97] 190 191 192 194 196 197 199 201 205 206 207 218 219 220 223 229 230 231 232 237 238 240 241 242
[121] 244 245 247 248 249 251 252 253 257 258 260 261 263 264 265 266 270 271 273 275 276 283 285 289
[145] 290 291 294 298 299 300 302 303 304 306 307
And the nested dataframe:
> nested_df
# A tibble: 4 x 2
# Groups: League [4]
League data
<chr> <list<df[,133]>>
1 F1 [380 x 133]
2 E0 [380 x 133]
3 SP1 [380 x 133]
4 D1 [308 x 133]
I tried something like this but to no avail:
nested_df%>%
mutate(train = data[map(data,~.x[partitionindex,])])
Error in x[i] : invalid subscript type 'list'
Is there a solution involving purrr::map or lapply?
I think this could work, with purrr::pmap:
nested_df %>%
  ungroup() %>% # make sure the table is not grouped
  mutate(i = row_number()) %>%
  mutate(train = pmap(
    .,
    function(data, i, ...) {
      data[partitionindex[[i]]$Resample1, ]
    }
  )) %>%
  select(-i)
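
An alternative sketch, assuming the index list has one element per row of the nested table and is in the same order, is to walk the list column and the index list in parallel with purrr::map2. The tibble and index list below are small hypothetical stand-ins for nested_df and partitionindex.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-in for nested_df: one data frame per league.
nested <- tibble(
  League = c("A", "B"),
  data = list(
    data.frame(id = 1:5, v = letters[1:5]),
    data.frame(id = 1:4, v = letters[1:4])
  )
)

# Hypothetical stand-in for partitionindex: one Resample1 vector per row.
idx <- list(list(Resample1 = c(1, 3, 5)), list(Resample1 = c(2, 4)))

result <- nested %>%
  mutate(
    train = map2(data, idx, ~ .x[.y$Resample1, ]),   # rows selected for training
    test  = map2(data, idx, ~ .x[-.y$Resample1, ])   # the remaining rows
  )

result$train[[1]]
result$test[[2]]
```

Because nested_df in the question is grouped by League, an ungroup() before the mutate() is likely still needed so that the list column and the index list are matched row by row across the whole table.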

Waves argument in geeglm of geepack in R causes failure

I am trying to fit a GEE model with the R package "geepack". The response variable is a proportion, coded as (Successes, Failures). The explanatory variables are Weight (cont), Rank (cont), ColonySize (cont) and Sex (factor). The data set contains temporal non-independence of observations because, over a study period of 413 days, repeated behavioral measurements of the same individuals were taken. This non-independence is reflected in columns specifying the AnimalID and the day of observation (Ndate). The data set is not very large and contains 1062 observations on 165 different individuals. The complete study period is 413 days (i.e. Ndate range: 1-413).
gee1<-geeglm(wl~WeightScaled+Rank+ColonySize+Sex,
data=allsub, family=binomial, id=AnimalID,
corstr="ar1")
The above model is calculated without difficulties and without noticeable delay. However, the observations are not regularly distributed over the study period (see the vector for Ndate below), which means the model output is not meaningful. When I include the waves argument in the model to correctly account for temporal autocorrelation, R seems to get stuck or takes very long to calculate this model, which should really not take so much time. The R GUI displays "(not responding)" for more than 1 hour and the small circle (Win7) indicates that R is busy. The CPU usage according to the task manager is mostly between 25-30%, sometimes up to 50%. So my question is: did I make a mistake when specifying the "waves" argument that causes R to hang, or is it normal for this process to be computationally very intensive? (See an extract of the variable Ndate below.)
Model including the waves argument:
gee1<-geeglm(wl~Weight+Rank+ColonySize+Sex,
data=allsub, family=binomial, id=AnimalID,
corstr="ar1", waves=Ndate)
The second question is more fundamental and concerns this GEE and its autocorrelation structure: is the model able to deal with this kind of temporal autocorrelation, where there are typically 5-15 repeated observations per individual but the time between them varies widely (sometimes only a few days, but sometimes up to 100 days or more)? Textbook examples all look very different, but as I see it the principle should be the same.
Thanks very much.
> allsub$Ndate
[1] 169 169 169 43 43 5 5 5 267 267 267 267 162 162 162 162 162 256
[19] 256 256 256 256 256 263 263 263 263 263 263 176 176 176 176 176 176 183
[37] 183 183 183 183 183 190 190 190 190 190 190 190 196 196 196 196 196 196
[55] 196 284 284 284 284 291 291 291 291 175 175 175 175 175 175 175 175 199
[73] 199 199 199 199 199 199 186 186 186 186 186 186 189 189 189 189 189 189
[91] 266 266 266 266 266 266 196 196 196 196 196 196 196 242 242 242 242 242
[109] 242 207 207 207 207 207 210 210 210 210 210 245 245 245 245 245 245 302
[127] 302 302 302 302 302 302 302 217 217 217 217 217 217 217 270 270 270 272
[145] 272 272 291 291 291 220 220 220 220 220 220 220 238 238 238 238 238 238
[757] 291 291 291 291 291 291 220 220 238 238 241 241 294 294 294 294 294 294
[775] 303 303 303 263 263 263 263 263 263 263 263 263 263 316 316 309 304 304
[793] 304 323 323 19 50 99 67 67 67 22 22 22 43 60 110 178 178 178
[811] 33 115 115 115 115 96 116 116 116 116 116 116 116 116 116 116 116 26
[829] 26 122 122 122 122 122 122 122 122 122 64 40 40 40 40 40 40 40
[847] 40 40 58 58 58 58 58 58 58 58 58 58 71 71 75 85 127 78
[865] 78 12 12 12 12 12 12 12 12 12 12 15 152 152 152 152 337 337
[883] 337 337 337 337 344 344 344 344 344 344 344 82 82 82 82 82 82 82
[901] 82 82 348 348 348 348 348 348 348 348 348 351 351 351 359 359 355 355
[919] 355 354 354 345 345 345 358 358 358 358 362 362 362 331 331 349 349 361
[937] 361 378 364 364 364 369 369 369 375 375 375 373 373 373 373 342 365 365
[955] 365 365 365 365 365 365 379 379 379 379 379 379 379 379 379 379 379 379
[973] 379 379 352 352 341 382 382 382 385 373 373 373 373 373 373 368 368 368
[991] 389 389 389 389 285 285 285 308 308 309 309 321 322 326 329 329 329 329
[1009] 330 330 330 330 385 385 385 385 385 385 385 380 380 380 380 380 380 380
[1027] 386 386 386 386 390 390 390 390 365 365 393 393 393 393 393 393 393 393
[1045] 393 393 393 393 393 393 399 397 397 397 392 392 392 392 407 407 400 400
[1063] 413 413
I found out why R crashes when including the waves argument: geeglm does not accept two observations on the same individual made on the same day. This makes sense when thinking through what the model does. I hope this helps someone else.
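
A quick way to look for such repeated (AnimalID, Ndate) pairs before fitting is sketched below in base R. The small data frame is a hypothetical stand-in for allsub, and how the duplicates should then be handled (aggregating, dropping, or re-coding the day) is left open.

```r
# Hypothetical stand-in for allsub with one repeated (AnimalID, Ndate) pair.
allsub_demo <- data.frame(
  AnimalID = c("a", "a", "a", "b", "b"),
  Ndate    = c(5, 5, 43, 12, 15)
)

# Flag every row whose (AnimalID, Ndate) combination occurs more than once.
key <- allsub_demo[, c("AnimalID", "Ndate")]
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

allsub_demo[dup, ]   # rows to inspect before passing waves = Ndate
```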
