Correlation between variables and components using pls package - r

Using the data.frame below (Source: http://eric.univ-lyon2.fr/~ricco/tanagra/fichiers/en_Tanagra_PLSR_Software_Comparison.pdf)
Data
df <- read.table(text = c("
diesel twodoors sportsstyle wheelbase length width height curbweight enginesize horsepower horse_per_weight conscity price symboling
0 1 0 97 172 66 56 2209 109 85 0.0385 8.7 7975 2
0 0 0 100 177 66 54 2337 109 102 0.0436 9.8 13950 2
0 0 0 116 203 72 57 3740 234 155 0.0414 14.7 34184 -1
0 1 1 103 184 68 52 3016 171 161 0.0534 12.4 15998 3
0 0 0 101 177 65 54 2765 164 121 0.0438 11.2 21105 0
0 1 0 90 169 65 52 2756 194 207 0.0751 13.8 34028 3
1 0 0 105 175 66 54 2700 134 72 0.0267 7.6 18344 0
0 0 0 108 187 68 57 3020 120 97 0.0321 12.4 11900 0
0 0 1 94 157 64 51 1967 90 68 0.0346 7.6 6229 1
0 1 0 95 169 64 53 2265 98 112 0.0494 9.0 9298 1
1 0 0 96 166 64 53 2275 110 56 0.0246 6.9 7898 0
0 1 0 100 177 66 53 2507 136 110 0.0439 12.4 15250 2
0 1 1 94 157 64 51 1876 90 68 0.0362 6.4 5572 1
0 0 0 95 170 64 54 2024 97 69 0.0341 7.6 7349 1
0 1 1 95 171 66 52 2823 152 154 0.0546 12.4 16500 1
0 0 0 103 175 65 60 2535 122 88 0.0347 9.8 8921 -1
0 0 0 113 200 70 53 4066 258 176 0.0433 15.7 32250 0
0 0 0 95 165 64 55 1938 97 69 0.0356 7.6 6849 1
1 0 0 97 172 66 56 2319 97 68 0.0293 6.4 9495 2
0 0 0 97 172 66 56 2275 109 85 0.0374 8.7 8495 2"), header = T)
and this
Code
library(plsdepot)
df.plsdepot = plsreg1(df[, 1:11], df[, 14, drop = FALSE], comps = 3)
data<-df.plsdepot$cor.xyt
data<-as.data.frame(data)
I got this data.frame of the correlation between variables and components
data
# t1 t2 t3
#diesel -0.23513860 -0.38154681 0.439221649
#twodoors 0.71849247 0.45622386 0.055982798
#sportsstyle 0.51909329 -0.02381952 -0.672617464
#wheelbase -0.86843937 0.34114664 -0.254589548
#length -0.75311884 0.62404991 -0.085596033
#width -0.67444970 0.62282146 -0.158675019
#height -0.67228557 -0.14675385 0.317166599
#curbweight -0.59305898 0.73532560 -0.241983833
#enginesize -0.39475651 0.82353941 -0.252270394
#horsepower 0.04843256 0.96637015 -0.148407288
#horse_per_weight 0.50515322 0.81502376 -0.006045151
#symboling 0.64900253 0.23673633 0.346902434
and I managed to plot them as below
library(plsdepot)
df.plsdepot = plsreg1(df[, 1:11], df[, 14, drop = FALSE], comps = 3)
plot(df.plsdepot, comps = c(1, 2))
However, I have to use the pls package instead of plsdepot. I need to get the correlations between the variables and the components, and plot them.
Using pls, I managed to plot the correlation between variables and components as below
library(pls)
Y <- as.matrix(df[,14])
X <- as.matrix(df[,1:11])
df.pls <- mvr(Y ~ X, ncomp = 3, method = "oscorespls", scale = T)
plot(df.pls, "correlation")
However, I couldn't find a way to get these values (the correlations between variables and components) and convert them to a data.frame using the pls package.
Any help on how to get these correlation values using the pls package would be highly appreciated.

Thanks to Bjørn-Helge Mevik (the maintainer of pls package), for his answer below
==========================================================================
If you look at the corrplot code:
> corrplot
function (object, comps = 1:2, labels, radii = c(sqrt(1/2), 1),
identify = FALSE, type = "p", xlab, ylab, ...) {
nComps <- length(comps)
if (nComps < 2)
stop("At least two components must be selected.")
if (is.matrix(object)) {
cl <- object[, comps, drop = FALSE]
varlab <- colnames(cl)
}
else {
S <- scores(object)[, comps, drop = FALSE]
if (is.null(S))
stop("`", deparse(substitute(object)), "' has no scores.")
cl <- cor(model.matrix(object), S)
varlab <- compnames(object, comps, explvar = TRUE)
}
you will see that it basically does
S <- scores(object)[, comps, drop = FALSE]
cl <- cor(model.matrix(object), S)
to calculate the correlation loadings. Using df.pls in place of object should give you a matrix of correlation loadings.
S <- scores(df.pls)[, 1:2, drop = FALSE]
cl <- cor(model.matrix(df.pls), S)
df.cor <- as.data.frame(cl)
df.cor
# Comp 1 Comp 2
#diesel -0.23513860 -0.38154681
#twodoors 0.71849247 0.45622386
#sportsstyle 0.51909329 -0.02381952
#wheelbase -0.86843937 0.34114664
#length -0.75311884 0.62404991
#width -0.67444970 0.62282146
#height -0.67228557 -0.14675385
#curbweight -0.59305898 0.73532560
#enginesize -0.39475651 0.82353941
#horsepower 0.04843256 0.96637015
#horse_per_weight 0.50515322 0.81502376
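To get all three components, plus a row for the response as in the plsdepot output, the same two calls can be extended. The following is only a sketch: it assumes the pls package is installed and, to stay self-contained, fits a stand-in model on the built-in mtcars data with mpg as the response (not the car data from the question). With the question's model, df.pls would take the place of fit and df[, 14] the place of mtcars$mpg.

```r
library(pls)

# Stand-in model: mtcars with mpg as the response (illustrative choice,
# not the data from the question).
fit <- plsr(mpg ~ ., ncomp = 3, data = mtcars, scale = TRUE)

# Scores for components 1 to 3, then the correlation of each X variable with them.
S <- scores(fit)[, 1:3, drop = FALSE]
cl <- cor(model.matrix(fit), S)

# Correlation of the response with the same scores, added as an extra row
# (the plsdepot cor.xyt output in the question has such a row for symboling).
y.cor <- cor(mtcars$mpg, S)
rownames(y.cor) <- "mpg"

df.cor <- as.data.frame(rbind(cl, y.cor))
df.cor
```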

Related

Reformat output of R table() command for plotting

I wish to plot some count data (likely as a bubble plot). I've some different experiments and for each experiment, I've three replicates. The output from the table() command is given below.
> with(myData.df, table(ChargeGroup,Expt,Repx))
, , Repx = 1
Expt
ChargeGroup Ctrl CV2 Gas n15 n30 n45 n60 p15 p30 v0
<+10 540 512 567 204 642 648 71 2 2 6
+10:+15 219 258 262 156 283 16 0 1 0 7
+15:+20 119 118 14 200 14 0 0 7 0 51
+20:+25 57 38 0 84 1 0 0 31 7 87
+25: 30 16 0 17 0 0 0 24 19 18
, , Repx = 2
Expt
ChargeGroup Ctrl CV2 Gas n15 n30 n45 n60 p15 p30 v0
<+10 529 522 582 201 642 626 77 1 2 5
+10:+15 232 249 264 150 273 14 0 1 0 5
+15:+20 116 113 18 204 13 0 0 12 0 41
+20:+25 53 46 0 82 0 0 0 36 6 94
+25: 28 12 0 26 0 0 0 33 21 28
, , Repx = 3
Expt
ChargeGroup Ctrl CV2 Gas n15 n30 n45 n60 p15 p30 v0
<+10 536 525 591 224 671 641 63 1 2 6
+10:+15 236 238 257 170 276 16 0 2 1 10
+15:+20 113 108 15 212 12 0 0 10 0 47
+20:+25 57 40 0 77 0 0 0 34 3 107
+25: 32 11 0 25 0 0 0 26 15 26
Can anyone help me process this output further so that I can plot it directly in either base graphics or ggplot?
Thanks
There are a couple of methods. With base R, you can loop over the third dimension and plot each slice with barplot:
par(mfrow = c(3, 1))
apply(with(myData.df, table(ChargeGroup,Expt,Repx)), 3, barplot)
-testing
par(mfrow = c(3, 1))
apply(with(mtcars, table(cyl, vs, gear)), 3, barplot)
Or convert the table to a single data.frame with as.data.frame and plot it with ggplot, or get the data.frame/tibble output directly with count:
library(dplyr)
library(ggplot2)
myData.df %>%
count(ChargeGroup,Expt,Repx) %>%
ggplot(aes(x=ChargeGroup, y = n, fill = Expt)) +
geom_col() +
facet_wrap(~ Repx)
-testing
mtcars %>%
count(cyl = factor(cyl), vs = factor(vs), gear = factor(gear)) %>%
ggplot(aes(x = cyl, y = n, fill = vs)) +
geom_col() +
facet_wrap(~ gear)
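The as.data.frame route mentioned above is not shown in code, so here is a minimal sketch of it on the built-in mtcars data (the object names tab and tab.df are made up for the example). Each cell of the three-way table becomes one row, with the count in a column named Freq.

```r
# Three-way contingency table from the built-in mtcars data
tab <- with(mtcars, table(cyl, vs, gear))

# One row per table cell; the counts end up in a column named Freq
tab.df <- as.data.frame(tab)
head(tab.df)

# Cells with a zero count are kept, unlike with dplyr::count()
subset(tab.df, Freq == 0)
```

The resulting data frame can be passed to ggplot in the same way as the count() output, using Freq instead of n for the y aesthetic.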

R - ddply not working as expected - shapiro.test on long-form data by categories

I have a dataset in long form, values in "values" column and categories in "ind". The data looks like this:
values ind
1 42.58666667 le_mean
2 52.35666667 le_mean
64 78.7 le_mean
65 95.49666667 le_mean
66 88.91 le_mean
67 1.295234856 le_sd
68 4.294139417 le_sd
69 0 le_sd
70 7.327416552 le_sd
71 4.007322464 le_sd
72 0 le_sd
73 0 le_sd
74 0 le_sd
75 0 le_sd
76 0.704367328 le_sd
77 1.11 le_sd
78 6.870315374 le_sd
79 10.36559855 le_sd
80 7.589591557 le_sd
86 1.223165293 le_sd
87 7.600019737 le_sd
88 3.655995077 le_sd
89 5.148595278 le_sd
90 0 le_sd
229 2.385211381 re_sd
230 4.465672775 re_sd
231 3.129765699 re_sd
232 3.55056803 re_sd
233 0 re_sd
234 0 re_sd
276 29.34 lf_mean
277 41.66333333 lf_mean
278 39.84666667 lf_mean
279 35.33666667 lf_mean
280 61.68 lf_mean
281 73.22333333 lf_mean
282 75.51666667 lf_mean
283 31.74666667 lf_mean
284 28.37666667 lf_mean
285 40.03333333 lf_mean
286 21.31333333 lf_mean
287 18.90666667 lf_mean
288 0 lf_mean
I am trying to get the p-values for a shapiro.test out in a data frame by category, but I am getting the same p-values, which is incorrect. I have tried:
ddply(bpdata_long, .(ind),
function(x) shapiro.test(bpdata_long$values)$p.value)
and I have also tried:
shapfunc <- function(x){
return(data.frame(pvalues=shapiro.test(bpdata_long$values)$p.value))
}
ddply(bpdata_long, .(ind), shapfunc)
but with both all I'm getting back is:
ddply(bpdata_long, .(ind),
+ function(x) shapiro.test(bpdata_long$values)$p.value)
ind V1
1 le_mean 0.0000000000000000000000000000008028749
2 le_sd 0.0000000000000000000000000000008028749
3 re_mean 0.0000000000000000000000000000008028749
4 re_sd 0.0000000000000000000000000000008028749
5 lf_mean 0.0000000000000000000000000000008028749
6 lf_sd 0.0000000000000000000000000000008028749
Could someone help with this, please? Where does my code go wrong?
The issue is that by using shapiro.test(bpdata_long$values) you apply the Shapiro test to the entire, ungrouped values column instead of to the subset that ddply passes to your function as x (that would be x$values). That's why you get the same value for each group. Additionally, instead of using the retired plyr package, I would suggest switching to dplyr:
library(dplyr)
bpdata_long %>%
group_by(ind) %>%
summarise(p.value = shapiro.test(values)$p.value)
#> # A tibble: 4 × 2
#> ind p.value
#> <chr> <dbl>
#> 1 le_mean 0.450
#> 2 le_sd 0.00774
#> 3 lf_mean 0.471
#> 4 re_sd 0.285
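The same per-group logic can also be sketched in base R without any packages. The data below is made up for illustration (two groups drawn from different distributions), and the names toy and p.by.group are hypothetical. The point is only that each group's values are handed to shapiro.test() separately.

```r
set.seed(1)

# Made-up long-form data: two groups with different distributions
toy <- data.frame(
  values = c(rnorm(30, mean = 50, sd = 10), rexp(30, rate = 1)),
  ind    = rep(c("a_mean", "b_sd"), each = 30)
)

# split() gives one vector of values per group; the test runs on each in turn
p.by.group <- sapply(split(toy$values, toy$ind),
                     function(v) shapiro.test(v)$p.value)

# Collect the result as a data frame with one row per group
data.frame(ind = names(p.by.group), p.value = unname(p.by.group))
```

With plyr itself, the equivalent fix is to refer to x$values inside the function passed to ddply.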

counting the number of times a value appears in a column in relation to other columns in r

I am new to R and have a data frame very close to the one below. I would love to find a general way to get the number of times the number "0" appears, plus 1, for each country (Intro4) and id.
Intro4 number id
221 TAN 0 19
222 TAN 0 73
223 TAN 0 73
224 TOG 0 37
225 TOG 0 58
226 UGA 0 96
227 UGA 0 112
228 UGA 0 96
229 ZAM 0 40
230 ZAM 0 99
231 ZAM 0 139
I can do it by hand, but it is a big data frame and that would take forever. count() gives me the frequency but doesn't split it between the different countries. I have found a way to do it, but I would have to select and filter for each individual country (Intro4) and add 1 to the result. I was wondering if there is a quicker way to do it. The code I have tried is this:
projects <- finalr %>% select (Intro4,number,id)
projects1<-projects %>% filter (str_detect (number, "0"))
projects2<-projects1 %>%arrange (Intro4)
projects3<-sum(projects2$Intro4 == "TAN", na.rm = TRUE)
projects4<-sum(projects2$Intro4=="UGA",na.rm=TRUE)
I would be extremely grateful for any help, thank you :)
You can also do it as follows:
library(dplyr)
dat <- read.table(header = T, text =
"Intro4 number id
TAN 0 19
TAN 0 73
TAN 0 73
TOG 0 37
TOG 0 58
UGA 0 96
UGA 0 112
UGA 0 96
ZAM 0 40
ZAM 0 99
ZAM 0 139", stringsAsFactors = F)
dat %>% group_by(Intro4, id, number) %>% tally()
Which produces:
Intro4 id number n
<chr> <int> <int> <int>
1 TAN 19 0 1
2 TAN 73 0 2
3 TOG 37 0 1
4 TOG 58 0 1
5 UGA 96 0 2
6 UGA 112 0 1
7 ZAM 40 0 1
8 ZAM 99 0 1
9 ZAM 139 0 1
Assuming number can be anything like 0, 1, 2 etc. one can count occurrence of 0 by sum(number==0). A solution using dplyr can be as:
library(dplyr)
df %>% group_by(Intro4, id) %>%
summarise(count = sum(number==0))
# # A tibble: 9 x 3
# # Groups: Intro4 [?]
# Intro4 id count
# <chr> <int> <int>
# 1 TAN 19 1
# 2 TAN 73 2
# 3 TOG 37 1
# 4 TOG 58 1
# 5 UGA 96 2
# 6 UGA 112 1
# 7 ZAM 40 1
# 8 ZAM 99 1
# 9 ZAM 139 1
Data:
df <- read.table(text="
Intro4 number id
221 TAN 0 19
222 TAN 0 73
223 TAN 0 73
224 TOG 0 37
225 TOG 0 58
226 UGA 0 96
227 UGA 0 112
228 UGA 0 96
229 ZAM 0 40
230 ZAM 0 99
231 ZAM 0 139",
header = TRUE, stringsAsFactors = FALSE)
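Neither result above includes the "plus 1" the question asks for. A minimal base R sketch that adds it, assuming the same example data (the column name count_plus_1 is made up for the example):

```r
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Intro4 number id
TAN 0 19
TAN 0 73
TAN 0 73
TOG 0 37
TOG 0 58
UGA 0 96
UGA 0 112
UGA 0 96
ZAM 0 40
ZAM 0 99
ZAM 0 139")

# For every Intro4/id combination: number of zeros in 'number', plus 1
res <- aggregate(number ~ Intro4 + id, data = dat,
                 FUN = function(x) sum(x == 0) + 1)
names(res)[names(res) == "number"] <- "count_plus_1"

# Sort by country, then id, for easier reading
res[order(res$Intro4, res$id), ]
```

In the dplyr version, the same effect would come from writing sum(number == 0) + 1 inside summarise().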

GLM returning negative values for suitability in species distribution modeling

I have started to work with species distribution modeling with GLM. Using BIOCLIM environmental data (for example: Bio10, Bio15, Bio16, Bio17 as predictors), the following data (stored in an object presausTrain):
ID bioclim_10 bioclim_11 bioclim_15 bioclim_16 pres longitude latitude
2 2 225.00000 105.00000000 22.206624 299.18014 1 -58.8786 -34.2269
3 3 228.97882 112.97809077 27.000000 319.94470 1 -59.5144 -33.7806
4 4 219.00000 104.57779206 16.000000 265.57779 1 -57.2555 -35.2549
6 6 188.00000 83.00000000 18.000000 260.42379 1 -57.5419 -38.0551
9 9 224.58419 104.73418836 23.000000 320.08305 1 -58.9186 -34.4132
10 10 243.60300 94.16917531 64.561824 85.17573 1 -68.6146 -32.8886
11 11 224.58433 104.73658836 23.000000 320.09025 1 -58.9187 -34.4133
12 12 253.00000 97.00000000 68.608231 71.99121 1 -68.5041 -32.3345
13 13 224.60863 104.75578836 23.000000 320.02305 1 -58.9195 -34.4128
15 15 245.44112 94.58706179 64.849824 84.25853 1 -68.6026 -32.8416
16 16 264.02281 151.00000000 54.022813 393.34787 1 -60.7727 -28.6506
17 17 244.67617 128.19141384 48.323829 366.28249 1 -60.6717 -31.6380
18 18 263.00000 149.49003689 53.490037 391.42668 1 -60.7500 -28.7500
19 19 272.04463 181.06767992 43.272909 412.80043 1 -58.1522 -25.1102
20 20 250.00000 132.00000000 49.877386 358.92412 1 -60.8829 -31.2539
21 21 268.54597 165.00000000 32.000000 418.09660 1 -58.0293 -28.0340
26 26 263.03251 149.36775948 53.286986 392.57182 1 -60.7333 -28.7333
27 27 262.00000 149.00000000 52.954712 394.07047 1 -60.6666 -28.7857
28 28 194.26954 91.54652958 113.000000 221.44775 1 -70.8308 -33.2159
29 29 195.00139 91.98381950 113.000000 219.30565 1 -70.8255 -33.2179
30 30 194.71515 92.34394042 113.000000 219.32903 1 -70.8312 -33.1968
31 31 194.87274 92.25693323 113.000000 218.64974 1 -70.8271 -33.2033
32 32 262.51488 149.00000000 53.000000 391.44238 1 -60.7334 -28.7999
33 33 236.09116 148.19977261 21.050265 543.87328 1 -53.9738 -25.8564
34 34 244.17649 128.15908399 47.077874 363.03794 1 -60.6339 -31.6890
36 36 249.80369 132.80368760 47.196312 364.22593 1 -60.2472 -31.2462
37 37 268.00000 164.88563766 32.000000 414.86622 1 -58.0654 -28.0482
38 38 268.00000 164.86220565 32.000000 414.68268 1 -58.0699 -28.0454
39 39 256.00000 142.51301366 48.000000 358.57247 1 -60.5333 -29.7500
40 40 255.02037 143.12581264 46.732643 438.70468 1 -59.7161 -29.3281
41 41 264.00000 151.00000000 54.000000 394.65955 1 -60.7500 -28.6500
42 42 254.54615 164.95675375 19.200389 502.30639 1 -54.4563 -25.6887
43 43 272.00000 173.71328176 36.025171 467.51253 1 -58.1000 -26.5833
44 44 286.97773 208.08168096 56.000000 292.08590 1 -59.5522 -21.2787
45 45 224.22325 78.22324976 38.606279 185.39521 1 -63.5337 -37.7471
46 46 248.74987 159.74987480 27.453648 559.43635 1 -54.2713 -25.6734
47 47 209.41746 124.45790111 107.988728 331.33831 1 -71.6073 -33.5050
48 48 244.38027 128.36415875 49.000000 369.61503 1 -60.6817 -31.5992
49 49 162.85989 96.36235347 118.491117 443.99917 1 -71.5244 -33.1645
50 50 130.32560 17.41336935 68.079547 360.58826 1 -71.1000 -40.9500
51 51 139.05510 25.70054673 69.765255 389.11327 1 -71.0837 -40.9810
52 52 209.13482 124.35046642 107.868234 332.58278 1 -71.6089 -33.5031
53 53 256.00458 165.33361100 21.301138 511.40500 1 -54.4162 -25.6967
54 54 271.00000 170.00000000 60.000000 362.54198 1 -60.4542 -25.9167
56 56 229.00000 112.35964626 25.000000 301.35039 1 -59.0210 -33.6877
57 57 119.99753 15.10747321 54.000000 471.71589 1 -71.7248 -42.7099
58 58 135.70071 20.70070732 72.827280 349.44457 1 -71.0065 -41.0595
59 59 264.00000 174.43120494 23.910081 420.64503 1 -57.0766 -26.0751
60 60 262.52382 173.72329246 25.077236 432.73019 1 -57.0500 -26.0167
62 62 179.34210 80.87832470 86.102021 594.32138 1 -72.6524 -37.8537
63 63 154.27204 63.26968212 83.647330 756.03579 1 -72.7667 -37.6333
64 64 170.36894 82.95671452 76.716261 582.33120 1 -72.9125 -38.0167
65 65 255.29339 141.05130937 44.000000 362.34977 1 -59.6919 -30.0224
68 68 244.00000 126.00000000 47.000000 373.97578 1 -60.7068 -31.8564
70 70 169.65447 81.75782454 60.138823 575.48334 1 -72.6000 -38.7333
71 71 280.00000 209.22244349 60.000000 311.98601 1 -60.0000 -20.0000
74 74 173.06376 91.94939494 86.649798 808.16328 1 -72.9333 -37.1667
75 75 93.88276 -3.88122756 123.993938 122.31361 1 -65.7049 -23.1626
77 77 244.73709 128.25037481 48.750699 368.01469 1 -60.7000 -31.6333
78 78 238.25716 118.42208981 26.120460 317.28934 1 -58.5249 -33.0121
79 79 264.68778 215.00000000 54.000000 469.93021 1 -63.0000 -17.0000
81 81 132.00000 77.00000000 37.000000 770.18289 1 -74.1167 -43.3500
82 82 204.24999 73.75029357 31.762202 275.78719 1 -60.2000 -37.3000
84 84 230.00000 113.03251305 23.367766 283.85559 1 -58.7333 -33.4833
85 85 239.68529 122.46175316 12.326192 327.46175 1 -55.7766 -32.5428
86 86 192.89750 78.09241708 19.000000 252.85173 1 -58.0658 -37.8406
87 87 127.72334 35.72334013 73.511696 1099.73574 1 -71.8167 -38.2167
90 90 225.43089 107.43089205 22.000000 304.29268 1 -58.3902 -34.8034
91 91 134.53429 72.02286008 40.000000 865.02286 1 -73.6167 -43.1167
92 92 225.07390 102.00000000 39.238986 337.55187 1 -60.7313 -34.2004
93 93 255.09615 141.09614673 15.688826 373.75971 1 -56.4500 -30.4300
95 95 168.91143 99.08857071 84.088571 593.54574 1 -73.0167 -36.7333
96 96 241.33689 219.33688825 75.000000 952.96431 1 -59.0000 -13.0000
97 97 267.51799 180.35046950 87.000000 353.46857 1 -63.0700 -20.8700
98 98 262.97274 210.03301311 61.289635 734.77698 1 -63.6667 -17.4500
99 99 217.00000 98.96529301 18.652335 283.29342 1 -57.9995 -35.5728
102 102 229.00000 107.00000000 28.000000 311.07590 1 -59.8228 -34.3834
104 104 225.00000 104.96487882 22.610418 318.55003 1 -59.0000 -34.4000
105 105 259.00000 147.04660936 24.512470 410.87977 1 -57.0944 -29.7149
107 107 244.31221 120.02550008 33.687788 366.17838 1 -59.0000 -31.8333
108 108 208.64289 87.07940941 14.000000 206.82547 1 -57.1347 -37.0029
109 109 248.30467 157.12855496 18.887519 493.23855 1 -54.3167 -25.9000
112 112 259.00000 151.00000000 22.000000 434.28496 1 -56.6444 -29.1753
113 113 227.87889 110.74950291 22.000000 310.87978 1 -58.3934 -34.7014
114 114 188.31179 83.92218970 17.311789 259.70139 1 -57.8431 -38.2656
116 116 224.69761 106.55401173 23.302389 327.10194 1 -58.5911 -34.4191
117 117 222.10785 105.00227249 25.889880 343.91302 1 -58.7179 -34.5725
118 118 200.28610 81.28609587 20.000000 248.04349 1 -58.2529 -37.8511
119 119 254.42257 162.36128278 19.840207 503.72360 1 -55.6030 -27.4438
120 120 249.80225 132.00000000 50.000000 358.77755 1 -61.0000 -31.0000
124 124 231.00000 142.00000000 15.000000 389.98814 1 -51.0936 -31.2872
126 126 235.43596 148.97624749 10.953114 404.58779 1 -50.9919 -29.9444
127 127 234.99430 153.85461324 11.073286 388.21653 1 -51.7181 -29.9433
128 128 233.60352 152.60352054 8.912486 383.80058 1 -51.3247 -29.7000
129 129 244.81880 184.81879611 42.716919 1025.62246 1 -46.4197 -23.8900
131 131 213.06989 159.02007109 60.544541 617.99241 1 -46.6339 -23.5503
133 133 212.30438 154.80780343 67.000000 636.61384 1 -46.8800 -23.1803
137 137 223.21176 165.93543578 70.980998 654.28145 1 -46.9797 -22.7003
22 2 194.00099 73.00099051 26.001083 276.00325 0 -59.1797 -37.3630
310 3 205.99766 62.99766278 20.000000 65.00000 0 -66.6797 -40.7797
410 4 267.99982 119.00012978 90.000314 163.00107 0 -67.0130 -30.1547
66 6 218.00083 127.00051598 16.000000 423.99824 0 -52.6380 -31.4047
8 8 272.99900 256.99786347 80.002135 769.99245 0 -48.2213 -10.5714
910 9 258.99943 245.00000000 20.999083 908.00078 0 -75.9297 -1.0297
1010 10 280.00232 267.00116165 86.000000 651.00936 0 -65.6380 8.3036
1110 11 279.00000 174.00000000 87.000000 465.99622 0 -63.2213 -23.1130
121 12 249.00582 217.00581833 70.999999 704.98944 0 -55.4297 -15.3214
14 14 273.00147 251.00146645 83.000000 809.99861 0 -51.3463 -12.9880
151 15 246.00666 221.00665863 85.001131 906.97968 0 -49.9713 -15.6964
161 16 263.00137 249.00250902 50.000000 835.00547 0 -71.0130 -8.1964
171 17 224.99969 99.99969124 43.000000 335.99883 0 -61.1797 -34.1547
181 18 228.99874 203.99940734 80.999335 669.99256 0 -47.6380 -15.4880
191 19 268.99981 254.99946750 98.000347 827.99056 0 -38.3880 -3.9047
201 20 76.98821 -0.01070989 17.000000 132.00969 0 -67.3463 -54.6964
25 25 229.00100 147.99999952 18.000000 521.00148 0 -53.5963 -25.3630
261 26 264.00251 247.00271798 54.000207 966.99373 0 -55.5130 0.1786
271 27 187.01335 46.01069668 35.998369 53.99662 0 -69.4713 -38.4047
281 28 228.00046 213.99999953 73.000000 815.00304 0 -60.7213 -12.6130
291 29 268.00058 262.00000000 45.000000 940.99818 0 -63.5547 -5.3630
301 30 228.01359 218.01360884 54.000000 1191.84478 0 -73.0547 9.4286
311 31 267.99977 257.99977009 12.001378 956.99770 0 -73.5130 -2.5714
321 32 253.00035 243.00034548 70.000000 897.99648 0 -51.8047 -4.6547
331 33 259.00000 242.99977322 58.000000 746.00023 0 -71.0130 -10.3214
35 35 266.00115 234.00023820 129.998376 343.98683 0 -80.1380 -2.8214
361 36 239.00091 158.00203076 9.999796 490.99858 0 -53.3880 -27.0714
371 37 256.00107 223.00214169 82.000471 710.99743 0 -49.6797 -18.4880
381 38 264.99942 250.99884783 29.998848 1096.99310 0 -70.4297 1.0536
391 39 15.01118 -12.98915222 81.002712 506.99793 0 -75.9713 -12.2797
401 40 259.00048 148.99941053 29.000000 407.99775 0 -57.8047 -29.1130
411 41 271.00000 261.99999933 39.998855 874.99310 0 -62.3047 -4.3214
421 42 245.99772 232.99637349 74.000000 1106.00622 0 -54.6380 -9.0714
431 43 270.00000 254.99884138 37.001159 1035.03518 0 -58.9713 5.6786
441 44 210.99715 171.99887466 77.000001 800.00466 0 -47.0547 -19.8214
451 45 258.99980 247.00000000 67.000000 783.00282 0 -53.5130 -4.5714
461 46 290.00000 265.00000000 85.999424 911.00865 0 -68.5547 7.2203
471 47 278.00101 181.99999886 68.998878 276.00123 0 -61.2213 -23.4464
481 48 262.99508 220.99508159 72.002646 313.03257 0 -36.8463 -9.9047
491 49 268.00000 247.99999990 59.000000 800.00140 0 -67.8880 -10.6130
501 50 248.99900 185.99899706 47.000000 460.00236 0 -52.2213 -22.2797
511 51 263.00095 228.00095374 92.001329 343.99639 0 -39.7630 -8.0714
521 52 266.00199 258.00132718 41.000444 959.00026 0 -65.0130 3.1370
531 53 251.00102 214.00036761 23.000651 262.00170 0 -39.2630 -12.6130
55 55 258.00020 248.00020222 74.999546 910.00354 0 -51.3880 -4.7380
561 56 223.00509 177.00376930 87.000000 680.00092 0 -42.7630 -18.0714
571 57 232.00792 162.00508410 91.000000 439.99713 0 -64.1797 -20.3214
581 58 122.97838 92.97610938 109.995439 423.17753 0 -73.3047 -15.0714
591 59 219.00298 160.00429666 73.998683 707.97064 0 -45.2213 -22.9047
61 61 278.00059 265.99955351 73.000000 994.99690 0 -60.5547 1.3870
621 62 264.99965 251.99965249 68.999652 931.99988 0 -54.5547 -3.0297
631 63 262.00057 252.99885628 31.000000 784.00057 0 -71.3047 -5.6964
641 64 276.00000 188.99909010 43.999678 324.00000 0 -58.9297 -23.7797
67 67 140.99782 16.99759906 42.000450 52.99995 0 -69.7630 -44.2380
69 69 91.02975 5.03139201 43.000502 727.18200 0 -72.5547 -43.3630
701 70 238.99996 150.99995612 22.999681 546.00091 0 -53.9297 -25.7797
73 73 156.99829 31.99828703 24.000743 55.00000 0 -70.3880 -48.0297
741 74 231.00043 215.00021260 75.000213 952.00132 0 -59.7630 -12.7380
76 76 187.00000 58.99890366 25.001097 58.00110 0 -65.6797 -43.5714
771 77 267.99832 252.99831923 105.998772 684.01876 0 -38.5547 -4.1130
781 78 269.99889 131.00022074 82.999779 177.00133 0 -66.3880 -30.1964
80 80 262.00102 254.00102180 41.000000 892.99807 0 -67.9713 -5.3214
811 81 255.00000 142.00000000 36.000000 376.99893 0 -58.1797 -29.7380
83 83 193.00076 91.00030596 94.000000 539.00085 0 -72.3880 -36.2797
841 84 278.99976 263.99975911 50.000000 1121.00142 0 -56.5547 -1.0297
851 85 250.00067 227.00067291 82.000000 917.00517 0 -55.3880 -13.7380
861 86 260.00000 250.99977489 23.000000 824.99751 0 -69.7630 -3.6130
89 89 282.00038 256.00019033 99.000532 413.00270 0 -41.8880 -7.5297
901 90 270.00000 225.99941948 57.998288 982.01583 0 -65.1797 -15.3214
911 91 267.99899 253.99898537 46.000000 491.01211 0 -61.5963 7.4703
921 92 219.00059 87.99889395 39.000000 319.99779 0 -61.3047 -36.3214
94 94 225.99978 86.99977939 72.999470 271.99907 0 -66.1380 -33.5297
951 95 116.00396 4.00263725 30.998909 74.99845 0 -71.5130 -48.0714
961 96 208.00089 134.00089034 13.000000 504.99850 0 -52.0130 -28.4880
971 97 259.99964 235.00031380 74.000666 869.99228 0 -57.2213 -14.6547
981 98 231.00000 79.00000000 49.999780 166.99961 0 -66.0130 -37.1547
101 101 245.99606 236.99707761 29.998848 1009.00190 0 -65.4713 1.5536
103 103 259.99783 246.99885797 54.001024 1039.00717 0 -68.7213 -7.5714
1041 104 244.00153 195.00152628 76.999074 658.99242 0 -49.1797 -21.0297
106 106 263.99796 253.99796462 71.001017 1446.01159 0 -64.2213 6.1786
1071 107 173.99788 103.99765761 173.996402 14.00164 0 -69.5130 -19.6130
1081 108 240.00781 232.00780741 34.000000 791.98642 0 -63.4713 3.1786
111 111 216.99799 105.99696451 108.002055 399.00018 0 -71.5963 -34.3630
1121 112 264.00119 249.00118836 57.000594 916.99807 0 -54.0547 1.1370
1131 113 265.99668 250.99667928 40.000000 687.99853 0 -74.0130 -8.0297
115 115 83.00514 44.00536119 101.999774 432.98950 0 -70.2630 -15.8630
1161 116 238.00000 118.00000000 17.000590 353.99879 0 -57.5547 -33.2797
1171 117 251.00441 219.00441215 91.999012 314.99232 0 -40.7213 -8.3214
1181 118 271.00022 262.00021686 33.000000 883.99764 0 -62.8880 -3.1964
1191 119 262.00000 252.99942668 25.000000 766.99989 0 -70.9297 -5.0297
123 123 276.00021 263.99964610 71.000215 887.00057 0 -61.5130 1.9703
125 125 247.99556 205.99494311 57.999540 274.00241 0 -36.8047 -9.1130
1261 126 249.99882 231.99882490 71.999271 452.98961 0 -67.4297 10.0120
1271 127 278.00000 236.00000000 69.000000 578.00059 0 -56.9297 -16.5714
1281 128 288.00021 198.99989033 65.999786 306.99978 0 -60.7630 -22.2797
130 130 283.00046 259.00091135 91.999846 722.00153 0 -42.1797 -5.7380
132 132 191.01856 177.01912717 68.003215 1108.93325 0 -73.5547 11.0953
136 136 261.00000 246.00102657 59.000000 726.00457 0 -71.2213 -9.6130
the expression for the model structure:
model <- pres ~ bioclim_10 + I(bioclim_10^2) + bioclim_11 + I(bioclim_11^2) + bioclim_15 + I(bioclim_15^2) + bioclim_16 + I(bioclim_16^2)
the following expression for GLM:
GLM <- glm(model, family=binomial(link=logit), data=presausTrain)
the results will contain a lot of negative values for the projected suitability for the species:
projecaoSuitability <- predict(predictors, GLM)
plot(projecaoSuitability, main='Myocastor coypus')
For example, at the coordinate point -41.55306 -12.39342, the model predicts:
pointXY = data.frame(-41.55306, -12.39342)
suitabAtPoint = extract(predictors,pointXY)
predictedSuitabilityAtPoint = predict(GLM, as.data.frame(suitabAtPoint))
-2.167515
I think the negative values are occurring because of some mistake of mine, as they make no sense, and my Random Forest, Maxent and Bioclim models return values ranging from 0 to 1.
Can someone help me, please?
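You need to use the type = 'response' argument in your call to predict (the default is 'link'). This will give you fitted probabilities between 0 and 1 rather than values on the scale of the linear predictor, which for a logit link are log-odds and can be negative.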
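The difference between the two scales can be seen on a small simulated example. This is only a sketch: the data, the variable x and the object names are made up, and the model is a toy version of the quadratic logistic GLM in the question.

```r
set.seed(42)

# Simulated presence/absence data with one made-up predictor
n <- 200
x <- rnorm(n)
toy <- data.frame(pres = rbinom(n, size = 1, prob = plogis(0.5 + 1.5 * x)),
                  x = x)

fit <- glm(pres ~ x + I(x^2), family = binomial(link = "logit"), data = toy)

newdata <- data.frame(x = c(-2, 0, 2))
eta <- predict(fit, newdata)                     # default type = "link": log-odds
p   <- predict(fit, newdata, type = "response")  # probabilities in [0, 1]

cbind(newdata, eta, p)

# The inverse logit maps the link scale onto the response scale
all.equal(unname(plogis(eta)), unname(p))
```

For the raster case in the question, the extra argument would presumably be passed through the same way, as in predict(predictors, GLM, type = "response").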

mlogit duplicate 'row.names' are not allowed

I am new to R and want to use the mlogit function.
However, after putting my data into a data frame and running
x <- mlogit.data(mlogit, choice="PlacedN", shape="long", alt.var="RaceID")
I get the error: duplicate 'row.names' are not allowed
I can upload my file if needed. I've spent days trying to get this to work, so any help will be appreciated.
You may want to put "RaceID" into the alt.levels argument instead of alt.var. From the mlogit.data help file:
alt.levels
the name of the alternatives: if null, for a wide data.frame, they are guessed from the variable names and the choice variable (both should be the same), for a long data.frame, they are guessed from the alt.var argument.
Give this a try.
library(mlogit)
m <- read.csv("mlogit.csv")
mlogd <- mlogit.data(m, choice="PlacedN", shape="long", alt.levels="RaceID")
head(mlogd)
# RaceID PlacedN RSP TrA JoA aDS bDS mDS aDH bDH mDH LDH MR eMR
# 1.RaceID 20119552 TRUE 3.00 13 12 0 0 0 0 0 0 0 0 131
# 2.RaceID 20119552 FALSE 4.00 23 26 91 94 94 139 153 145 153 150 150
# 3.RaceID 20119552 FALSE 0.83 15 15 99 127 99 150 153 150 153 159 159
# 4.RaceID 20119552 FALSE 18.00 21 15 0 0 0 0 0 0 0 0 131
# 5.RaceID 20119552 FALSE 16.00 16 12 92 127 92 134 135 134 135 136 136
# 6.RaceID 20119617 TRUE 2.50 12 10 0 0 0 0 0 0 0 0 152
