When do you need Kinhom rather than Kest? - r

The envelope of the K function (and of functions derived from it, such as L) is very useful for validating a fitted spatial point process model. For instance, I fit a Poisson model to a data set, J1a2, which is as follows:
J1a2.points:
# X.1 X Y
1 1 118.544 1638.445
2 2 325.995 1761.223
3 3 681.625 1553.771
4 4 677.392 1816.261
5 5 986.451 1685.016
6 6 1469.093 1354.787
7 7 1608.805 1625.744
8 8 1994.071 1782.391
9 9 1968.669 1375.955
10 10 2362.403 1337.852
11 11 2701.099 1773.924
12 12 2900.083 1820.495
13 13 2963.588 1668.081
14 14 3412.360 1676.549
15 15 3378.490 1456.396
16 16 3721.420 1464.863
17 17 3823.028 1701.951
18 18 4072.817 1790.859
19 19 4089.751 1388.656
20 20 97.375 715.497
21 21 376.799 1033.025
22 22 563.082 1126.166
23 23 935.647 1206.607
24 24 512.277 486.876
25 25 935.647 757.834
26 26 1409.821 410.670
27 27 1435.223 639.290
28 28 1706.180 1045.726
29 29 1968.669 876.378
30 30 2307.365 711.263
31 31 2624.892 897.546
32 32 2654.528 1236.243
33 33 2857.746 423.371
34 34 3039.795 639.290
35 35 3298.050 707.029
36 36 3111.767 1011.856
37 37 3361.555 1227.775
38 38 4047.414 1185.438
39 39 3569.007 508.045
40 40 4250.632 469.942
41 41 4386.110 872.144
42 42 93.141 237.088
43 43 554.614 186.283
44 44 757.832 148.180
45 45 965.283 220.153
46 46 1723.115 296.360
47 47 1744.283 423.371
48 48 1913.631 203.218
49 49 2167.653 292.126
50 50 2629.126 211.685
51 51 3217.610 283.658
52 52 3827.262 325.996
and:
J1a2.Win<-owin(c(0, 4500.42),c(0, 1917.87))
If you draw the envelope for the data with Lest:
library(spatstat)
env.data<-envelope(J1a2, Lest,correction="border",
nsim=19, global=TRUE)
plot(env.data,.-r~r, shade=NULL, legend=FALSE,
xlab=expression(paste("r(",mu,"m)")),ylab="L(r)-r", main = "")
The Lest() curve goes outside the envelope. However, if you use Linhom instead of Lest, the Linhom() curve stays entirely inside the envelope.
This seems to suggest that the intensity of the data is inhomogeneous, so I used y as a covariate in the fit:
poisson.J1a2<-ppm(J1a2~1,Poisson(),correction="border")
y.J1a2<-ppm(J1a2~y,correction="border")
anova(poisson.J1a2,y.J1a2,test="LR") #p=0.6484
I don't find any evidence of a spatial trend in intensity along y, or x, or their combinations.
Then why does Linhom() outperform Lest() in this case?
Furthermore, when should one use Linhom() instead of Lest()?

You should first decide whether or not the intensity can be assumed to be constant. To help with this, you can look at kernel density estimates or run a formal test such as a quadrat test. If you decide that the intensity can be assumed constant, use Lest(); if not, use Linhom().
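As a rough illustration of the quadrat idea, here is a hand-rolled sketch in base R. It uses simulated uniform points in the question's window as a stand-in for J1a2; the toy points, the 4 x 2 grid and the variable names are my assumptions, not part of the question. spatstat's own quadrat.test() is the packaged way to do this on a real point pattern.

```r
set.seed(42)

# Toy points, uniform in the question's window (a stand-in for J1a2)
n <- 52
x <- runif(n, 0, 4500.42)
y <- runif(n, 0, 1917.87)

# Count the points in a 4 x 2 grid of equal-area quadrats
qx <- cut(x, breaks = seq(0, 4500.42, length.out = 5), include.lowest = TRUE)
qy <- cut(y, breaks = seq(0, 1917.87, length.out = 3), include.lowest = TRUE)
counts <- as.vector(table(qx, qy))

# Chi-squared goodness-of-fit test against equal expected counts,
# i.e. against a constant intensity over the window
res <- chisq.test(counts)
res$p.value
```

A small p-value would speak against constant intensity (and so in favour of Linhom()); a large one gives no reason to abandon Lest().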

Related

evaluating neural network performance

I trained my neural network with a sigmoid activation function, so the predicted values lie in the range [0,1). However, the real data have been z-score transformed, so their range goes beyond [0,1). In this case, what would be the appropriate way to evaluate my model? Should I also rescale the original test data to the same range and then evaluate with a criterion like mean squared forecast error?
> real_predicted_neural
predicted real
1 1.909219e-07 -3.57877473
2 4.161819e-08 -2.28704595
3 1.754706e-11 -1.08509429
4 1.149891e-13 -0.46573114
5 7.777560e-02 0.42381300
6 4.173448e-07 -0.44060297
7 1.119703e-01 0.21075550
8 8.682557e-01 -0.01292402
9 4.736056e-08 -0.29830701
10 7.506821e-08 -1.20302227
11 7.341235e-01 -0.03986571
12 7.501776e-05 -0.94315815
13 1.145697e-04 0.49730175
14 2.214929e-13 0.04252241
15 4.597199e-01 -0.38539901
16 2.324931e-03 -0.74468628
17 4.366025e-06 -0.77037244
18 1.394450e-06 0.16679048
19 5.869884e-11 -0.75876486
20 1.817941e-04 0.04303387
21 7.060773e-04 0.06099372
22 8.267170e-06 -1.21687318
23 9.388680e-02 0.61135319
24 1.099290e-01 0.55715201
25 9.757236e-01 -0.33480226
26 9.544055e-01 0.09061006
27 7.322074e-07 0.09290822
28 1.014327e-06 -0.61658893
29 7.848382e-08 -0.78739456
30 1.791908e-04 -0.44073540
31 1.357918e-03 -0.22099008
32 5.192233e-06 -0.32744703
33 2.624779e-06 -0.37644068
34 6.414216e-02 -0.36947939
35 1.388143e-06 -0.00994845
36 3.010872e-05 -0.05984833
37 9.873201e-03 -0.21815268
38 3.896163e-04 -0.24009094
39 2.718760e-02 0.33383333
40 1.025650e-02 0.09779867

GAM Predictions in R have same curve shape

I have a dataframe:
Albedo Year_Since_Burn Summer_SRAD Winter_SRAD
1 397.00 1 17801.70 6589.56
2 289.60 2 18027.20 6633.96
3 615.29 3 17397.10 6952.69
4 258.12 4 17793.63 6627.62
5 139.32 5 17853.00 6675.00
6 463.81 6 17853.00 6675.00
7 532.47 7 17853.00 6675.00
8 300.09 8 17648.00 6890.00
9 118.00 9 17786.13 6724.67
10 238.18 10 18050.13 6916.46
11 439.11 11 18057.20 6893.08
12 366.00 12 17823.00 6618.12
13 441.25 13 17809.50 6673.79
14 450.31 14 17654.40 6849.19
15 275.43 15 17592.80 7202.88
16 147.11 16 17830.20 6672.88
17 285.68 17 18065.13 6897.58
18 309.61 18 17665.80 7036.62
19 264.95 19 18053.47 6867.17
20 125.18 20 17834.40 6661.19
21 289.50 21 17824.00 6684.50
22 293.61 22 17826.90 6681.83
23 368.95 23 17634.55 6914.06
24 563.11 24 17434.23 7043.04
25 434.41 25 17527.60 7070.38
26 199.78 26 17955.40 6704.00
27 153.37 27 17872.70 6637.00
28 287.29 28 17843.20 6659.67
29 173.52 29 17822.93 6616.75
30 239.28 30 17884.00 6580.56
31 292.91 31 17884.00 6580.56
32 323.00 32 18078.70 6758.50
33 282.00 33 18078.70 6758.50
34 237.50 34 17779.10 7303.38
35 225.00 35 17822.80 6617.42
36 237.55 36 17822.80 6617.42
37 247.11 37 17918.50 6695.71
38 336.48 38 17918.50 6695.71
39 290.00 39 17918.50 6695.71
40 248.42 40 17822.80 6617.42
41 304.74 41 17918.50 6695.71
42 311.52 42 17918.50 6695.71
43 281.39 43 17918.50 6695.71
44 234.68 44 17918.50 6695.71
45 297.58 45 17918.50 6695.71
46 265.52 46 17918.50 6695.71
47 186.29 47 17918.50 6695.71
48 291.16 48 17918.50 6695.71
49 185.17 49 17918.50 6695.71
50 288.94 50 17918.50 6695.71
51 269.64 51 17918.50 6695.71
52 255.00 52 17918.50 6695.71
I am fitting a GAM model in R like so:
gam.m1 <- gam(Albedo ~ s(Year_Since_Burn) + s(Summer_SRAD) + s(Winter_SRAD), data=df)
which seems to work fine, and returns result as I would expect.
I have now created some data to predict on. Essentially I randomly selected 2 rows in the original df, duplicated them 52 times each, and then removed the Year_Since_Burn and Albedo columns, and created some new Year_Since_Burn data. So I only manipulated one independent variable. I did this like so:
df <- df[sample(nrow(df), 2),]
df <- df %>% select (-c(Albedo, Year_Since_Burn))
#add an id columns
df$ID <- seq.int(nrow(df))
#initialise the list that collects the expanded rows
combined <- list()
#loop through each row
for (i in 1:nrow(df)) {
#select each row
grp <- (df[i, ])
#repeat each row 52 times
grp <- grp[rep(seq_len(nrow(grp)), each=52),]
#add a column for year since burn
grp$Year_Since_Burn <- seq.int(nrow(grp))
#select rows to keep
grp <- grp %>% select (c(ID, Year_Since_Burn,Summer_SRAD, Winter_SRAD, Winter_Tavg, Summer_Tavg, PFI, Bulk_Density,
SOC_Content, SOC_Stock, L3_Ecoregion))
#append
combined[[i]] <- grp
}
#concat
final = do.call(rbind, combined)
Now for each unique ID in final I predicted the dependent variable like so:
y_hat <- predict(gam.m1, final)
and then I plotted to look at how the predictions varied with Year_Since_Burn:
final2 <- data.frame(as.array(final$ID), as.array(final$Year_Since_Burn), y_hat)
names(final2) <- c("ID", 'Year_Since_Burn', "Predicted")
#plot
plots1 <- lapply(split(final2, final2$ID),
function(x)
ggplot(x, aes(x=Year_Since_Burn, y=Predicted)) +
geom_line())
In the output graphs the curves are identical in shape for every prediction; only the magnitudes shift. I am not sure if this is what a GAM is supposed to do or if this is an error on my part. This is what the predictions for two different IDs look like:
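For what it's worth, a model with purely additive terms and no interactions behaves this way by construction: if only one predictor varies and the others are held fixed, the fixed terms contribute a constant, so the curves differ only by a vertical shift. A minimal sketch of that, using lm() with a polynomial term as a stand-in for gam() with s() terms (the toy data and all names here are assumptions):

```r
set.seed(1)

# Toy data: the response depends additively on t and s
d <- data.frame(t = 1:52, s = rnorm(52))
d$y <- sin(d$t / 8) + 0.5 * d$s + rnorm(52, sd = 0.1)

# Additive fit: a curve in t plus a separate term in s, no interaction
fit <- lm(y ~ poly(t, 3) + s, data = d)

# Two prediction sets that differ only in the fixed value of s
new_a <- data.frame(t = 1:52, s = -1)
new_b <- data.frame(t = 1:52, s = 1)

# The gap between the two predicted curves is the same at every t
gap <- predict(fit, new_b) - predict(fit, new_a)
range(gap)
```

The curves can only take different shapes for different IDs if the model includes an interaction between the varying predictor and the fixed ones.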

Moran's correlogram with only one point. What is wrong?

I'm trying to compute Moran's I and the corresponding plot in R, but the plot has only one point. I have no idea what is going wrong. The code is based on <http://rstudio-pubs-static.s3.amazonaws.com/9688_a49c681fab974bbca889e3eae9fbb837.html>
My data, called "coordenata":
resid x y
1 0.07785411 -53.20342 -22.66700
2 -0.28358702 -53.20389 -22.66864
3 -0.64011338 -53.21392 -22.68122
4 1.22071249 -53.21311 -22.72369
5 0.95734778 -53.28469 -22.75289
6 0.35345302 -53.25822 -22.74850
7 -0.68357738 -53.28344 -22.70694
8 -1.24596010 -53.32950 -22.72872
9 -0.19944162 -53.33669 -22.73561
10 0.67544909 -53.36756 -22.80767
11 0.64002961 -53.35947 -22.79958
12 0.04564233 -53.21889 -22.67419
13 0.01618436 -53.24522 -22.70144
14 -2.65436794 -53.23017 -22.69292
15 0.72096256 -53.25539 -22.69978
16 0.89656515 -53.28489 -22.72222
17 1.85358579 -53.33069 -22.79161
18 -0.03590077 -53.33200 -22.78336
19 0.32348975 -53.33494 -22.78586
20 2.06771402 -53.37781 -22.77869
21 -1.02190709 -53.30492 -22.77244
22 -2.02813250 -53.53917 -22.79856
23 -1.20702445 -53.53858 -22.79406
24 -1.24091732 -53.55272 -22.80536
25 -1.13491596 -53.56181 -22.82914
26 -0.82934613 -53.56422 -22.83417
27 1.23418758 -53.60017 -22.85531
28 -1.72808514 -53.65900 -22.97828
29 -0.02144049 -53.65908 -22.97497
30 0.49174568 -53.64597 -22.95439
31 -0.54408149 -53.64217 -22.91033
32 -0.37111342 -53.61447 -22.86269
33 -0.31121931 -53.27153 -22.70036
34 0.32419211 -53.30308 -22.72183
35 1.57980287 -53.33053 -22.72947
36 -1.91156060 -53.34633 -22.74722
37 -0.79036645 -53.23667 -22.68925
The code:
coordinates(coordenata)<-c("x","y")
fit2<-correlog(coordenata$x,coordenata$y,coordenata$resid,increment=5,resamp=100,quiet=T)
plot(fit2)
Thanks in advance for any help!
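One thing worth checking is how the increment compares with the distances actually present in the data. The sketch below assumes correlog() is the one from the ncf package, where increment is measured in the same units as the coordinates unless latlon = TRUE is set. It uses five rows copied from the data above, and the floor() binning is only a rough stand-in for how distance classes get formed, not ncf's exact rule.

```r
# Five rows taken from the question's data (x = longitude, y = latitude)
xy <- data.frame(
  x = c(-53.20342, -53.36756, -53.53917, -53.65900, -53.23667),
  y = c(-22.66700, -22.80767, -22.79856, -22.97828, -22.68925)
)

# Pairwise distances in coordinate units (degrees here)
d <- dist(xy)
max_d <- max(d)

# Rough count of distance classes if the class width is 5 units
n_classes <- length(unique(floor(as.vector(d) / 5)))

max_d
n_classes
```

If every pairwise distance is smaller than the increment, all pairs land in a single distance class, which would show up as a correlogram with one point.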

Run model on each column and save each prediction output

I am trying to write code that runs a model on each column and saves each prediction into a data.frame.
I am probably missing a basic step, because the result only contains the last prediction. It would be great if someone could give me a tip.
Here is the code:
for(i in 1:ncol(temp[,1:101])){
pred=(exp(predict(gam(temp[,i]~s(w),gamma=1.4,data=temp))))
prediction[i]<- as.data.frame(cbind(pred=pred))
}
Here is a short look at the data and the result:
temp2
tdat.V1 tdat.V2 tdat.V3 tdat.V4 tdat.V5 w
9 0.2468596 0.2468596 -0.47226384 -0.47226384 -0.69767176 9
10 -0.3298719 -0.3298719 -0.61766160 -0.61766160 -1.05190065 10
11 0.2636122 0.2636122 -0.16966523 -0.16966523 -0.98531224 11
12 1.1036205 1.1036205 0.46601526 0.46601526 -0.35346974 12
13 1.1337664 1.1337664 -0.31946816 -0.31946816 -0.78722896 13
14 1.0441290 1.0441290 -0.19397040 -0.19397040 -0.99997758 14
15 0.5904416 0.5904416 -0.49903362 -0.49903362 -1.29327665 15
16 0.2704478 0.2704478 -0.33188601 -0.33188601 -0.89020267 16
17 0.4905354 0.4905354 0.26849660 0.26849660 -0.22608949 17
18 1.4072215 1.4072215 1.43372101 1.43372101 0.74552152 18
19 -0.3510362 -0.3510362 -0.65455175 -0.65455175 -0.67979925 19
20 -0.9471780 -0.9471780 -0.99449245 -0.99449245 -0.94800264 20
21 0.5601007 0.5601007 -0.41078889 -0.41078889 -0.70911666 21
22 0.6337811 0.6337811 0.11769665 0.11769665 -0.37718872 22
23 1.1154420 1.1154420 0.52692499 0.52692499 0.26777430 23
24 0.1314404 0.1314404 0.02146546 0.02146546 -0.03748099 24
25 0.2262661 0.2262661 0.14216196 0.14216196 -0.19273456 25
26 1.7767008 1.7767008 1.19683315 1.19683315 0.55529405 26
27 2.0070761 2.0070761 1.70737151 1.70737151 0.90322033 27
28 2.2252446 2.2252446 1.35160191 1.35160191 0.98155994 28
29 1.7452878 1.7452878 0.86052298 0.86052298 1.27898872 29
30 0.2071554 0.2071554 0.55612163 0.55612163 0.64726184 30
31 1.7144228 1.7144228 0.74949354 0.74949354 0.42433658 31
32 0.2533343 0.2533343 -0.11861726 -0.11861726 -0.63511376 32
33 0.6176735 0.6176735 0.29274750 0.29274750 -0.20402280 33
34 1.0868382 1.0868382 1.19325652 1.19325652 1.57309478 34
35 1.7051584 1.7051584 0.00151082 0.00151082 -0.95416617 35
> for(i in 1:ncol(temp2[,1:4])){
+
+ pred=(exp(predict(gam(temp2[,i]~s(w),gamma=1.4,data=temp2))))
+ prediction <- as.data.frame(cbind(pred=pred))
+ }
> print(prediction)
pred
9 0.6969407
10 0.7291379
11 0.7628225
12 0.7980632
13 0.8349320
14 0.8735040
15 0.9138580
16 0.9560762
17 1.0002448
18 1.0464540
19 1.0947979
20 1.1453751
21 1.1982889
22 1.2536473
23 1.3115630
24 1.3721544
25 1.4355449
26 1.5018639
27 1.5712468
28 1.6438349
29 1.7197765
30 1.7992264
31 1.8823468
32 1.9693071
33 2.0602847
34 2.1554654
35 2.2550432
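The transcript above assigns to prediction on every pass of the loop, so each iteration overwrites the previous one. One possible pattern is to collect the per-column predictions in a list and bind them at the end. A minimal sketch, with toy data standing in for temp2 and lm() standing in for gam() so that it runs without extra packages (toy, resp_cols and pred_list are hypothetical names):

```r
set.seed(1)

# Toy data standing in for temp2: two response columns plus the covariate w
toy <- data.frame(V1 = rnorm(10), V2 = rnorm(10), w = 1:10)
resp_cols <- c("V1", "V2")

# Fit one model per response column and keep every prediction vector.
# lm() is a stand-in here; the question uses gam(... ~ s(w), gamma = 1.4).
pred_list <- lapply(resp_cols, function(col) {
  fit <- lm(reformulate("w", response = col), data = toy)
  exp(predict(fit))
})
names(pred_list) <- resp_cols

# One column of predictions per response column
prediction <- as.data.frame(pred_list)
dim(prediction)
```

Building the formula with reformulate() also keeps the column name in the model, instead of the temp[,i] expression.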

How do I select points in a dataset above the x% contour of a density map?

I have a matrix of data (see below) and I am trying to turn it into a density contour map (of the Can1 and Can2 variables), maybe with the ks or sm packages.
My question is: how do I select those points in the dataset which lie above (say) the 80% contour of the density map?
Thanks
ID Can1 Can2
4 -12.3235137 -1.0788867664
1 -12.2949912 -0.9321009837
5 -12.2835123 -1.0164225574
2 -12.2571822 -0.7094457036
3 -12.2713779 -0.9908419863
10 -12.9870438 -1.0936405526
6 -12.7167605 -1.4620772026
7 -12.8193776 -1.0911349785
8 -12.9781963 -1.1762698594
9 -12.7983478 -1.3453369581
13 -14.0389948 0.2855210115
11 -14.0015922 0.1467552738
15 -14.0723604 0.0244576488
14 -14.0743560 0.1417245145
12 -13.9898266 0.0005437008
20 -6.5881994 0.5124980991
17 -6.1812321 0.6789584579
16 -6.4704200 0.5942317307
18 -6.6960456 0.5720874622
19 -6.1159788 0.5960966790
22 -2.4794887 2.5493267897
24 -2.4918040 2.7823374576
21 -2.5145044 2.5877290160
23 -2.5048371 2.4916280770
25 -2.5018765 2.8536302559
29 -0.1781852 2.0805229401
26 -0.1581308 2.0151355747
28 -0.2118605 1.9658284615
27 -0.4184119 2.0540218901
30 -0.2994573 2.0205573385
35 2.6254869 1.3858705991
31 2.3146430 1.3510499304
33 2.5346138 1.2524229847
34 2.3741699 1.3842499455
32 2.6008389 1.3446707509
37 3.0920503 1.5807032840
38 3.1559727 1.4924092104
36 3.1593556 1.5803284343
39 3.0801444 1.6031732981
40 3.2562384 1.5810975265
43 4.8414364 2.1539254215
41 4.7938193 2.1613978258
44 4.7919209 2.2151527426
42 4.9830802 2.2374622446
45 4.7629268 2.4217335005
46 5.5631728 0.9986762598
50 5.5250403 1.0549399894
48 5.5833619 1.1368625963
47 5.5660312 1.1881215490
49 5.6224256 1.1634998303
53 5.5536366 0.2513665533
54 5.5276808 0.2685455911
51 5.7103045 0.2193839293
52 5.6014729 0.2353172964
55 5.5959034 0.2447836618
56 5.1542133 0.6070006863
59 5.0043394 0.4518710615
58 5.2314146 0.5656457888
60 5.1318728 0.4771275341
57 5.3599822 0.4918185651
61 7.0235173 -0.2669136870
63 7.0216315 -0.0097862523
64 7.0521253 -0.2457722410
62 7.0150637 -0.1456269078
65 7.0729018 -0.3573952321
69 5.8115406 -1.4652084167
67 5.7624475 -1.4147564126
68 5.8692888 -1.4695783153
70 5.9088094 -1.4927034632
66 5.8400205 -1.4817447808
71 4.8586107 -1.3111515744
73 4.7198564 -1.2891991780
72 4.9153659 -1.4499710448
74 4.7653488 -1.2839433419
75 4.7754971 -1.4655359108
77 3.8955675 -7.0922887151
78 3.8338151 -7.1595858283
80 3.7255063 -7.2147373050
79 3.7367055 -7.3468877516
76 4.0166957 -7.1952570639
Calculate the 80% point. One way: y <- x[x > 0.8 * max(x)] (I'm assuming you wanted 80% of the max level, not the 80th percentile).
Then plot y.
After a bit of searching I think it can be achieved using the kde2d function from the MASS package.
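A minimal sketch of how that could look with kde2d, reading "80%" as 80% of the maximum density, as in the answer above. It uses six rows copied from the question's data, and the nearest-grid-cell lookup with findInterval() is my own simplification rather than anything from the original posts.

```r
library(MASS)  # kde2d() lives in MASS, which ships with standard R installs

# Six rows taken from the question's data
pts <- data.frame(
  Can1 = c(-12.3235137, -12.2949912, -12.2835123,
           -6.5881994, 5.5631728, 3.8955675),
  Can2 = c(-1.0788867664, -0.9321009837, -1.0164225574,
           0.5124980991, 0.9986762598, -7.0922887151)
)

# 2D kernel density estimate on a 50 x 50 grid
dens <- kde2d(pts$Can1, pts$Can2, n = 50)

# Look up the grid cell each point falls in and read off its density
ix <- findInterval(pts$Can1, dens$x, all.inside = TRUE)
iy <- findInterval(pts$Can2, dens$y, all.inside = TRUE)
pts$density <- dens$z[cbind(ix, iy)]

# Keep the points whose density exceeds 80% of the grid maximum
threshold <- 0.8 * max(dens$z)
selected <- pts[pts$density > threshold, ]
```

If the 80th percentile is what is wanted instead, the threshold line could use quantile() on the point densities rather than a fraction of the maximum.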
