How to use capscale {vegan} in R

I have read ?capscale and scoured the internet for answers, but I still do not understand what the dependent variable should be for my dataset if I want to use capscale() to analyze my data.
I have NDVI data with some continuous and some categorical variables:
ndvi=structure(list(siteOID = 25840:25939, Elevation = c(1871.92,
1875.38, 1878.28, 1878.54, 1878.33, 1879.2, 1880.51, 1883.78,
1884.6, 1884.85, 1885.46, 1888.72, 1890.94, 1897.19, 1901.95,
1902.47, 1902.81, 1903.49, 1906.62, 1908.73, 1909.4, 1910.65,
1913.44, 1915, 1915.81, 1918.06, 1920.01, 1921.53, 1925.48, 1926.66,
1927.64, 1931.02, 1932.8, 1935.27, 1938.33, 1941.19, 1945.71,
1948.68, 1951.52, 1951.83, 1955.76, 1961.02, 1963.92, 1963.25,
1969.53, 1972.56, 1977.92, 1978.93, 1981.54, 1985.64, 1987.6,
1987.62, 1988.78, 1991.92, 1997.03, 1998.06, 1998.98, 2001.26,
2006.97, 2009.56, 2009.81, 2011.55, 2017.92, 2021.75, 2023.42,
2024.91, 2028.15, 2032.83, 2032.83, 2033.5, 2035.75, 2037.44,
2045.51, 2047.38, 2049.85, 2052.33, 2059.36, 2069.27, 2071.41,
2071.83, 2074.15, 2081.55, 2083.52, 2086.3, 2090.5, 2095.57,
2096.69, 2100.65, 2108.06, 2110.48, 2113.45, 2121.78, 2124.82,
2133.54, 2137.54, 2146.43, 2150.53, 2156.63, 2160.05, 2168.57
), Shape_Area = c(2940.395887, 5105.447128, 3763.362181, 2801.775054,
3854.690283, 4627.01632, 6863.6264, 5452.724569, 3504.284818,
3967.710707, 7004.963815, 3926.00215, 7645.532158, 6306.085153,
3451.101972, 4699.688114, 3880.378241, 4792.898829, 5542.142348,
3674.957345, 3562.897792, 3219.790167, 5369.915585, 3854.684578,
3737.522732, 5190.103216, 5137.457907, 4753.975071, 3605.727759,
4682.430962, 3412.007599, 4955.96479, 0, 5106.057222, 3026.454348,
6814.973732, 5422.439336, 4523.077568, 3092.711952, 2667.1801,
2318.487235, 1623.008863, 2672.648264, 2524.245809, 2164.660806,
3153.921959, 3170.875701, 3755.980623, 4505.277, 3954.724973,
3592.717424, 2877.927426, 3465.37684, 2317.185185, 3249.657309,
2710.26402, 3421.803771, 2556.020604, 3849.407062, 3782.797907,
1950.365079, 3522.797668, 2340.599897, 2451.029503, 3034.109721,
2873.167998, 2278.337947, 2546.02206, 3545.854694, 3514.69201,
2731.819076, 2537.618027, 2116.84627, 2213.553587, 4430.489625,
2648.387315, 4408.844477, 3453.225099, 2457.844425, 3597.718985,
3933.191433, 3280.424579, 2309.053402, 4062.750209, 2755.087578,
3785.974581, 3485.221528, 4698.642524, 3647.400111, 4512.594002,
4509.418612, 3908.621289, 5856.573472, 4084.254238, 4772.464487,
4587.251362, 3275.527576, 3236.108516, 4771.636048, 5241.064376
),slopemean = c(7.012740221, 6.374673005, 6.713881453, 6.219425964,
5.393005565, 5.567550724, 4.557037692, 5.122994391, 5.608577084,
5.054081163, 3.020378928, 3.535192937, 2.910682318, 2.262314184,
1.872473637, 2.04489899, 1.358906129, 1.738190173, 2.190473907,
2.285263883, 1.932403531, 1.318049102, 2.323188104, 2.838744229,
2.5508166, 3.662199524, 2.645026659, 2.691092801, 2.209619006,
2.360828268, 2.83633309, 2.917255029, 3.814524024, 3.594417877,
2.537033654, 2.758014447, 6.487904879, 6.546860137, 6.611400228,
6.548973659, 7.320545057, 7.167488849, 7.486095047, 6.736548642,
6.978404939, 6.209158245, 5.780635711, 5.952286865, 6.21757545,
6.026404989, 8.286706911, 5.013909823, 4.302618208, 5.958519395,
4.735497169, 6.86024694, 5.923437148, 4.814125561, 6.278868822,
6.369820399, 4.211901608, 5.067338774, 7.276210246, 9.342363631,
7.382804547, 7.026542905, 7.386944243, 6.993269548, 4.999933584,
5.386859906, 5.74222567, 6.407413812, 6.220262604, 6.361011563,
7.89187751, 7.504486516, 8.071826326, 7.282463079, 5.730589071,
6.75588336, 5.865557512, 5.567460529, 5.743696501, 6.234486916,
6.672290961, 4.424730467, 3.993647329, 5.934593258, 7.937450668,
8.264807165, 7.39251924, 7.862093222, 6.829388913, 7.447980573,
6.477102849, 6.185640762, 7.760704698, 8.44009344, 8.557933442,
7.60872553), avwidth = c(41.38533, 44.11806, 43.54585, 38.07962,
40.80878, 49.52246, 49.97194, 51.36124, 50.45419, 51.12577, 52.46919,
49.68379, 43.48322, 51.95128, 46.91944, 58.70265, 55.41018, 50.92463,
55.55058, 45.73485, 50.29035, 49.08618, 52.57013, 51.48199, 52.90921,
44.27491, 55.71036, 50.08104, 47.3439, 49.8397, 51.81409, 50.43767,
60.95491, 38.50229, 47.8118, 52.66532, 44.10194, 46.67934, 46.46481,
37.64217, 21.84973, 25.04575, 33.79403, 29.61029, 29.71018, 21.3549,
28.02716, 38.25882, 45.25996, 40.10562, 46.15768, 40.82856, 42.1975,
31.75748, 32.83316, 34.33412, 33.54285, 39.29999, 33.25312, 33.65804,
30.00087, 32.63515, 31.11767, 31.14068, 27.83876, 30.20586, 34.80735,
32.65111, 38.31069, 43.65983, 35.21719, 32.87317, 28.83573, 33.8517,
29.72621, 32.61762, 31.11199, 23.89315, 31.26606, 33.78306, 34.89358,
38.64512, 34.68206, 34.2003, 44.12035, 35.59922, 48.34063, 47.52268,
47.02729, 51.07513, 51.5254, 43.25953, 47.01821, 38.28714, 35.90366,
40.30569, 48.04857, 54.46596, 49.70541, 49.18992), watershed = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("marys", "reese", "wwalker"), class = "factor")), row.names = c(NA,
100L), class = "data.frame")
So, siteOID is each individual site and the categorical variable is watershed.
Now, I am trying to evaluate potential influences of NDVI in multivariate space, by conducting a Canonical Analysis of Principal Coordinates on a Gower similarity matrix of scaled habitat variables using the “capscale” function from the vegan package in R (Oksanen et al. 2013).
ndvi.cap <- capscale(siteOID~ Elevation + Shape_Area + avwidth + slopemean + watershed, ndvi,
dist="bray")
I don't understand how capscale formula is meant to be set up due to the lack of examples with actual explanations of what should be the dependent variable. The example in
?capscale
## Basic Analysis
vare.cap <- capscale(varespec ~ N + P + K + Condition(Al), varechem,
dist="bray")
vare.cap
plot(vare.cap)
anova(vare.cap)
uses two (!) different datasets, which does not make sense to me. Should I be putting the actual NDVI values as the dependent variable and not the site? I do have many NDVI-related variables associated with each site (actual NDVI value, differences between seasons, Sen's slope for trend), but I am not sure whether a continuous variable should be listed as the dependent variable.
My question is similar to "Alternative example for capscale function in vegan package", but the answer given there did not help me.
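For what it is worth, ?capscale describes the left-hand side of the formula as a community data matrix or a dissimilarity matrix, with the constraining variables on the right-hand side. That is why the help example uses two objects: varespec is the response matrix and varechem holds the predictors. A minimal sketch under that reading follows. The NDVI columns (ndvi_mean, ndvi_season_diff, ndvi_trend) and every value are simulated placeholders, not the data from the question, and Gower distance is used only because the question mentions it.

```r
library(vegan)

set.seed(1)
n <- 30

# Hypothetical response matrix: one row per site, one column per NDVI metric
ndvi_resp <- data.frame(
  ndvi_mean        = runif(n, 0.2, 0.8),
  ndvi_season_diff = runif(n, 0, 0.3),
  ndvi_trend       = runif(n, 0, 0.05)
)

# Hypothetical habitat (constraining) variables for the same sites, same row order
habitat <- data.frame(
  Elevation = runif(n, 1870, 2170),
  slopemean = runif(n, 1, 9),
  watershed = factor(sample(c("marys", "reese", "wwalker"), n, replace = TRUE))
)

# Response matrix on the left, constraints on the right.
# siteOID is only a row identifier, so it does not appear in the formula.
ndvi_cap <- capscale(ndvi_resp ~ Elevation + slopemean + watershed,
                     data = habitat, distance = "gower")
ndvi_cap
```

In this layout a continuous variable on the left is fine: each NDVI metric is simply one column of the response matrix, and the rows of the two data frames must refer to the same sites in the same order.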

Related

K-NN in R : Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, : ‘sum’ not meaningful for factors

I am running K-NN in R and keep getting the following error while I try to run the For loop for different 'k' values:
Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, : ‘sum’ not meaningful for factors.
My variable 'Class' in this code (class_pred_valid$Class) is a factor with two levels, 'used' and 'not used'. Could this be the reason it is throwing an error? If so, how do I overcome this issue? I have spent two days trying to solve it but couldn't. Thank you for any help.
test.pred<-knn(class_pred_train[,-1], class_pred_test[,-1], class_pred_train[,1], k=25)
table(class_pred_test[,1],test.pred)
CrossTable(x=class_pred_test[,1] ,y=test.pred,prop.chisq = FALSE) # create a X-table
corr.class.rate<-numeric(25)
for(k in 1:25)
{
pred.class<-knn(class_pred_train[,-1], class_pred_valid[,-1], class_pred_train[,1], k=k)
corr.class.rate[k]<-sum((pred.class=class_pred_valid$Class))/length(pred.class)
}
corr.class.rate
plot(c(1:25),corr.class.rate,type="l",
main="Correct Classification Rates for the Test Data for a range of k",
xlab="k",ylab="Correct Classification Rate",cex.main=0.7)
which.max(corr.class.rate)
pred<-knn(class_pred_train[,-1], class_pred_test[,-1], class_pred_train[,1], k=1)
sum((pred==class_pred_train$Class))/length(pred)
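A likely cause, judging from the code above, is the single = inside the loop: (pred.class=class_pred_valid$Class) assigns the factor instead of comparing, so sum() receives a factor. The final line uses == and does not have this problem. The sketch below reproduces the difference on toy factors; truth, pred and pred_copy are hypothetical stand-ins for the knn() output and the true labels.

```r
# Toy factors standing in for the true labels and the knn() predictions
truth <- factor(c("used", "not used", "used", "used"),
                levels = c("not used", "used"))
pred  <- factor(c("used", "used", "used", "not used"),
                levels = c("not used", "used"))

# `==` compares element-wise and returns a logical vector,
# so mean() gives the proportion of correct predictions directly
rate <- mean(pred == truth)
rate

# `=` inside parentheses is an assignment that returns the factor itself,
# so sum() fails with "'sum' not meaningful for factors"
res <- try(sum((pred_copy = truth)), silent = TRUE)
inherits(res, "try-error")
```

Using mean(pred == truth) also removes the need to divide by length(pred).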

How to annotate geom_segment arrows in ggplot

I have a dataframe:
df_sites <- structure(list(x = c(1.04092250164696, -0.383065216420003, 0.396244810279309,
0.970078841220606, 1.70624019153651, 3.16514402752826, 0.683787687531189,
0.00206174639359557, 0.885459199930364, 0.990634067372794, 0.228548628266029,
5.12827669944002, 0.0950586619539368, -0.275846514997531, 1.5525132408558,
-1.29950430377717, -0.990922674400145, 0.185830660119637, 0.00602127943634668,
-1.02247155743703, -0.251974618425098, 1.87788540164332, 1.28325669941297,
1.02150538568984, -0.865622294371786, -1.96452990510675, -0.524866180755096,
2.17941326700128, -1.34324588367972, -1.81439562296687, -1.13470999575871,
-0.493658775981049, -0.296149601541577, 0.447503914837335, -0.269452469430389,
0.0127337699647291, -1.04287439571777, -0.613105026144241, -1.3890917214799,
-1.90860630718699, -1.16104734632228, -0.584089855574213, -1.2278237710839,
-0.937664406699838, 1.09181991754655, -0.565406792755387, -0.58204838078486,
0.842304932110318), y = c(-3.45147995394394, 2.29349807839102,
0.174644402446899, 3.8468101986443, 2.6412842200453, -0.0665028396276639,
2.05491741522117, 0.165875878990559, -0.25539122973085, 1.74130285620058,
0.396659954165391, -1.65827015730937, 1.17736501075071, -3.72087159136532,
1.89896109873428, 1.68766224921712, -2.92368548480463, -2.42481488216442,
2.20648524060166, -0.486513106980203, 2.05729614246768, 2.51807338395106,
1.9974880289267, -2.67208900165781, -0.749156762561599, 1.93100782500476,
-4.15965374769117, 3.64156647300722, -2.7010471123406, 0.198076035987165,
1.62736086278764, -1.03740092888219, -3.89989372202828, -0.213429351502094,
-0.408170753360095, -1.61011027424538, -0.213306102694109, -0.154504840231308,
0.118730504697768, 1.91054431185776, 0.255125262080179, 0.612701198243207,
-1.21511378377373, 3.29282161162431, 2.50675599190964, -3.80136136529774,
-1.28545510252701, 3.02158440057367), Sites = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "M", class = "factor")), row.names = c("M.T1.R1.S1.16S.S50",
"M.T1.R1.S2.16S.S62", "M.T1.R1.S3.16S.S74", "M.T1.R2.S1.16S.S86",
"M.T1.R2.S2.16S.S3", "M.T1.R2.S3.16S.S15", "M.T1.R3.S1.16S.S27",
"M.T1.R3.S2.16S.S39", "M.T1.R3.S3.16S.S51", "M.T1.R4.S1.16S.S63",
"M.T1.R4.S2.16S.S75", "M.T1.R4.S3.16S.S87", "M.T2.R1.S1.16S.S53",
"M.T2.R1.S2.16S.S65", "M.T2.R1.S3.16S.S77", "M.T2.R2.S1.16S.S89",
"M.T2.R2.S2.16S.S6", "M.T2.R2.S3.16S.S18", "M.T2.R3.S1.16S.S30",
"M.T2.R3.S2.16S.S42", "M.T2.R3.S3.16S.S54", "M.T2.R4.S1.16S.S66",
"M.T2.R4.S2.16S.S78", "M.T2.R4.S3.16S.S90", "M.T3.R1.S1.16S.S56",
"M.T3.R1.S2.16S.S68", "M.T3.R1.S3.16S.S80", "M.T3.R2.S1.16S.S92",
"M.T3.R2.S2.16S.S9", "M.T3.R2.S3.16S.S21", "M.T3.R3.S1.16S.S33",
"M.T3.R3.S2.16S.S45", "M.T3.R3.S3.16S.S57", "M.T3.R4.S1.16S.S69",
"M.T3.R4.S2.16S.S81", "M.T3.R4.S3.16S.S93", "M.T4.R1.S1.16S.S59",
"M.T4.R1.S2.16S.S71", "M.T4.R1.S3.16S.S83", "M.T4.R2.S1.16S.S95",
"M.T4.R2.S2.16S.S12", "M.T4.R2.S3.16S.S24", "M.T4.R3.S1.16S.S36",
"M.T4.R3.S2.16S.S48", "M.T4.R3.S3.16S.S60", "M.T4.R4.S1.16S.S72",
"M.T4.R4.S2.16S.S193", "M.T4.R4.S3.16S.S203"), class = "data.frame")
which I plot as
p<-ggplot()
p<-p+geom_point(data=df_sites,aes(x,y,colour=Sites), shape = "diamond", size = 5)
df_arrows <- structure(list(x = c(-0.0506556191949347, -0.248732307259684,
0.75), y = c(-0.669658874134264, -0.45802558549515, -0.110871926510315
)), class = "data.frame", row.names = c("`POX-C`", "Protein",
"yield"))
p+geom_segment(data=df_arrows, aes(x = 0, y = 0, xend = x, yend = y),
arrow = arrow(length = unit(0.2, "cm")))
I would like to add annotation to these arrows. How do I do it?
We can use geom_text and the data contained in df_arrows:
library(dplyr) # get %>% and mutate
p <- p+geom_segment(data=df_arrows, aes(x = 0, y = 0, xend = x, yend = y),
arrow = arrow(length = unit(0.2, "cm")))
p + geom_text(data = df_arrows %>% mutate(labs = row.names(.)),
aes(x = x, y = y, label = labs))
If you want the plot to be a little easier on the eyes and avoid plotting over things, you can try the geom_text_repel function from the ggrepel package.
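A minimal sketch of the ggrepel variant, assuming the ggrepel package is installed. The coordinates are rounded from df_arrows above, and the object name p_repel is only illustrative.

```r
library(ggplot2)
library(ggrepel)

# Arrow end points, rounded from df_arrows in the question
df_arrows <- data.frame(
  x = c(-0.051, -0.249, 0.75),
  y = c(-0.670, -0.458, -0.111),
  row.names = c("POX-C", "Protein", "yield")
)
# Copy the row names into a column so they can be mapped to the label aesthetic
df_arrows$labs <- rownames(df_arrows)

p_repel <- ggplot(df_arrows) +
  geom_segment(aes(x = 0, y = 0, xend = x, yend = y),
               arrow = arrow(length = unit(0.2, "cm"))) +
  # geom_text_repel() nudges labels away from each other and from the arrow tips
  geom_text_repel(aes(x = x, y = y, label = labs))
p_repel
```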

ggplot2 and CSV "inventing" data that isn't in my input

I'm attempting to produce an attractive graph of bandwidth data across a number of machines and tests. My attempts seem to work for small manually entered amounts of data, but when I feed the "full" 1773 entries, I get results in my graph that don't seem to exist in the input data.
I believe this is likely because the different tests are each of different duration, but I can't seem to prove this. If I use the following input data as csv (sorry, off-site because of size) I end up with a strange upwards-curve on my geom_smooth line, and additional data points that I can't actually see in my .csv input data. (I have much more data in real life, this is a subset that produces the strange behaviour)
I would expect the first four tries (try01-try04) to flat-line at zero, and try05 to carry on at around 1GBit/sec. Here's my code
library("ggplot2")
library("RColorBrewer")
speed = read.csv(file="data.csv")
svg("all_results.svg",width=24)
ggplot(speed,
aes(x = Second, y = Bandwidth, group=Test, colour=Test)) +
scale_fill_brewer(palette="Paired") +
geom_point() +
geom_smooth()
dev.off()
Here's the image produced
@Gregor seems to be exactly right in that the seconds are interpreted as text, when they should represent the number of seconds since the start of that test.
Here's some example input data - please note the times are not always on a .00 second boundary due to the output of iperf.
structure(list(Machine = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "valhalla", class = "factor"),
User = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "alice", class = "factor"),
Test = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "try01", class = "factor"),
Second = structure(c(1L, 2L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), .Label = c("0.00-1.00",
"1.00-2.00", "10.00-11.00", "11.00-12.00", "12.00-13.00",
"13.00-14.00", "14.00-15.00", "15.00-16.00", "16.00-17.00",
"17.00-18.00", "18.00-19.00", "19.00-20.00", "2.00-3.00",
"3.00-4.00", "4.00-5.00", "5.00-6.00", "6.00-7.00", "7.00-8.00",
"8.00-9.00", "9.00-10.00"), class = "factor"), Bandwidth = c(937,
943, 944, 943, 943, 943, 943, 944, 658, 943, 944, 943, 944,
644, 943, 943, 943, 944, 943, 943)), row.names = c(NA, 20L
), class = "data.frame")
I'll try casting (or whatever R calls it) those to a float now.
Points have a single x value, not a range of x-values, so we'll separate your Second column into the beginning and end of the interval and plot the points at the beginning. Calling your data dd:
library(tidyr)
library(dplyr)
dd = dd %>%
separate(Second, into = c("sec_start", "sec_end"), sep = "-", remove = FALSE) %>%
mutate(sec_start = as.numeric(sec_start),
sec_end = as.numeric(sec_end))
After that the plotting should go just fine if you put sec_start or sec_end on the x-axis. (Or calculate the middle, whatever you want...)
If you want to visualize the durations, you could use geom_segment and aes(x = sec_start, xend = sec_end, y = Bandwidth, yend = Bandwidth), but since everything is just about the same duration, it doesn't seem like this would add much value.
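The duration idea can be sketched as follows. This version splits the interval with base R's strsplit() rather than tidyr, purely to keep the example self-contained. The four rows are a small subset in the style of the sample data, and p_dur is an illustrative name.

```r
library(ggplot2)

# A few rows in the same "start-end" format as the Second column
dd <- data.frame(
  Second    = c("0.00-1.00", "1.00-2.00", "2.00-3.00", "10.00-11.00"),
  Bandwidth = c(937, 943, 944, 944)
)

# Split each interval on "-" and convert both halves to numbers
parts <- do.call(rbind, strsplit(as.character(dd$Second), "-", fixed = TRUE))
dd$sec_start <- as.numeric(parts[, 1])
dd$sec_end   <- as.numeric(parts[, 2])

# One horizontal segment per measurement, spanning its interval
p_dur <- ggplot(dd) +
  geom_segment(aes(x = sec_start, xend = sec_end,
                   y = Bandwidth, yend = Bandwidth))
p_dur
```

Because sec_start is numeric, 10.00 now sorts after 2.00 instead of between 1.00 and 2.00 as it does when Second is treated as text.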

How to add gradient color to a surface3d in R when Z axis is between 0 and 1

I've seen many posts (here, here, here, and here) on how to add color gradient on the Z axis (but none on "z" values that range from 0 to 1). The only thing is that when I do this, I end up with only two colors if my data on the Z axis is between 0 and 1.
Here is an example:
I would like a figure where the surface is red where Z is near 0 and yellow where it is near 1.
The other problem is that I have a bunch of NA's in the Z axis because I'm defining the surface for only the x and y values that correspond to the points. Usually, people use "outer(x,y,f)" to compute the surface. I don't have an equation where I can just plug the numbers.
Is there a way that I can do this?
df3d = structure(list(phi = c(0.714779631270897, 0.687691682891498,
0.596648688803568, 0.573930669753368, 0.742367142156744, 0.647098819439728,
0.695488766544905, 0.728284245613654, 0.688278993976676, 0.692076206940355,
0.721356887106184, 0.551532807978921, 0.54294513452377, 0.529948458419129,
0.583705941140962, 0.556086109758564, 0.721770088612814, 0.711284095827769,
0.573741332655988, 0.527342613188125, 0.762709309318822, 0.740228675759072,
0.539713252759555, 0.696487636519962, 0.709494568163841, 0.537216639879562,
0.551801008711386, 0.545341937291782, 0.584139265723182, 0.64967079561165,
0.562544215947123, 0.716870075612315, 0.523337825235807, 0.588702763971338,
0.744644767844755, 0.551489639273234, 0.617165392352849, 0.556723007149084,
0.66554863194508, 0.570156474465965, 0.59324644850682, 0.552326531317577,
0.607405070778153, 0.765602115588822, 0.532910404322836, 0.749202895901834,
0.638084894011913, 0.594508381800896, 0.745877525852658, 0.742265176757939,
0.55200104972317, 0.598724220429779, 0.704160605412078, 0.709273655686999,
0.57882815350951, 0.80558646355475, 0.739236441867173, 0.556469513099474,
0.560730917777703, 0.715514054617767, 0.562095774851614, 0.540152840905987,
0.561824376055385, 0.595049050758879, 0.544700858333275, 0.54379044778355,
0.735023707587803, 0.75761987117526, 0.529370104304623, 0.756142990929929,
0.580486562475464, 0.555099817471069, 0.537232767721754, 0.68405457472067,
0.572070245916932, 0.73826438688156, 0.776877621879421, 0.5417182204358,
0.757617713719944, 0.536922997394714, 0.695880672257972, 0.570816629701256,
0.551885077056955, 0.697426644089613, 0.700677930911186, 0.722074526398648,
0.547841598427244, 0.744115961419341, 0.568163711481982, 0.631039420851915,
0.52569185852275, 0.655488455712025, 0.715875702650255, 0.562828009151803,
0.565017441865273, 0.554557230119741, 0.641911755728664, 0.549787832704858,
0.551682550480448, 0.522229525069209), sp = structure(c(4L, 4L,
1L, 1L, 2L, 2L, 2L, 2L, 4L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 4L, 4L,
1L, 1L, 2L, 2L, 1L, 3L, 2L, 1L, 1L, 1L, 1L, 4L, 1L, 4L, 1L, 1L,
2L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L,
1L, 1L, 4L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L,
1L, 4L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 4L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = c("fortis", "fuliginosa", "magnirostris", "scandens"
), class = "factor"), pc1 = c(0.175880701440334, 0.00718708371839084,
0.141108047117647, -0.0241407292755287, -0.362347619490667, -0.278187055817663,
-0.322472422874688, -0.342113759548994, -0.0480003258625404,
0.471768235224601, -0.324560745197095, 0.0893840127998557, 0.392067958177292,
0.333197422567793, 0.143274241985899, 0.39728316736576, 0.107258309440993,
0.172013966873444, 0.198033002646736, 0.0233433518931576, -0.379151278648175,
-0.360331402784382, 0.0815105012533928, 0.4916774405792, -0.325531606767521,
0.0464793855349116, 0.128993599551295, 0.0393187306328187, 0.116498023384732,
0.0585444918583008, 0.0519773823187942, 0.117485670789894, 0.141592582273004,
0.0866016090395172, -0.353101745830432, 0.0903683502030376, -0.0766571214760896,
0.0189849871337894, 0.0284234379094188, -0.074411018513597, -0.125981989564305,
-0.04066896524291, 0.0513708917900996, -0.384362095569569, 0.133461942504857,
-0.32950028028642, -0.0970510208736005, 0.169708833257483, -0.363153793934809,
-0.358442393985438, 0.0823660510982192, 0.14891498101178, 0.0874718551667044,
-0.286609834093365, 0.247017305539772, -0.42431120384093, -0.323957076921413,
0.120304498088591, 0.0372009683336541, -0.334862217128121, 0.0850391283675992,
0.426550700956589, 0.053540404847934, 0.114569082118706, 0.145035302093536,
0.462956587489796, -0.352558028645024, -0.370105398490897, 0.249974349261361,
-0.374913268845847, 0.209780781689884, 0.313250151589845, 0.46260008422501,
-0.304611304484123, 0.11736172451962, -0.35863773173462, -0.391035427221015,
0.219372693586083, -0.373985839773145, 0.28640321397829, -0.319643095574694,
0.0125879234209831, 0.182454650537706, -0.0307250825972499, -0.32490678343306,
-0.341204851832981, 0.314073748792412, -0.364615463916348, -0.0644240574912661,
-0.267640246495039, 0.10370599000585, -0.288131406123636, 0.0357411052061282,
0.295614964446489, -0.0145385512772513, -0.0451979384514853,
0.190115107687624, 0.159441037623466, 0.0550870424124392, 0.0582226744080579
), pc2 = c(-1.01095247497725, -1.03868939268555, 0.217310975677827,
0.0285247896165632, 0.0526206694724207, 0.029933782968998, 0.0777356682984891,
0.178400497047045, -0.895131692154304, 0.209867904648101, -0.0527418216237663,
0.00827859255924409, 0.112996963663788, -0.0395108234571918,
0.173676295351724, 0.203897905654255, -0.936940800121312, -1.04245666692378,
0.171077913138838, -0.164692367490732, 0.0227473300072106, 0.108660664812142,
-0.0570692402038391, 0.219114322364657, -0.00559526046181254,
-0.0904496365732674, 0.0329879550738144, -0.0513100262471313,
0.157624496486177, -0.430836781866961, 0.0336830138484876, -0.997472053889813,
-0.151743057518861, 0.153748243948929, -0.0290891308461303, 0.00866038555153437,
0.131519041243216, -0.0113322871452352, -0.487378228261218, -0.0178833351102055,
0.0262770136476736, -0.0671756888678338, 0.190653963041647, 0.0874833382301275,
-0.0729306295513451, -0.114781088459982, 0.176113469790657, 0.229289749785351,
0.023115521362388, 0.0124139031005011, 0.00629127323542669, 0.229545586035766,
-0.643425633985522, -0.119025249254049, 0.222273563398108, 0.0949392931025451,
-0.103328613004053, 0.0497069994557915, 0.0169108098226666, 0.0176907608810171,
0.0525638095222423, 0.0991718002465503, 0.031701514651561, 0.194031271868605,
0.00563908525013029, 0.144806228737922, 0.145921630779316, 0.164295633824383,
-0.0579825386055256, 0.104068297238545, 0.204915386707032, 0.153880371324229,
0.0676594796683301, 0.183052585806673, 0.113255499327757, 0.107866805397445,
0.142039558115177, 0.0274014273919194, 0.133609276043029, 0.023767214013592,
0.0322573857202049, 0.0409388634816843, 0.0643799435826686, -0.850272489901295,
0.0430623373727956, 0.0213513249227984, 0.112589167129505, 0.0764778027855769,
-0.0187866951639582, 0.0514999426382286, -0.141852017637047,
0.132798155087113, -0.811488800456735, 0.18297353727076, 0.00129211340539928,
-0.0604306388888919, 0.39467615944551, 0.0406033888777663, -0.0115831761153328,
-0.190035979057187)), .Names = c("phi", "sp", "pc1", "pc2"), row.names = c("phi[1245,12]",
"phi[1058,12]", "phi[594,12]", "phi[1999,12]", "phi[1546,12]",
"phi[353,12]", "phi[312,12]", "phi[21,12]", "phi[1371,12]", "phi[1874,12]",
"phi[384,12]", "phi[124,12]", "phi[2085,12]", "phi[163,12]",
"phi[221,12]", "phi[1321,12]", "phi[1767,12]", "phi[1883,12]",
"phi[490,12]", "phi[225,12]", "phi[1719,12]", "phi[1613,12]",
"phi[268,12]", "phi[2132,12]", "phi[1458,12]", "phi[1173,12]",
"phi[1335,12]", "phi[1357,12]", "phi[388,12]", "phi[985,12]",
"phi[184,12]", "phi[945,12]", "phi[2143,12]", "phi[1273,12]",
"phi[1738,12]", "phi[2081,12]", "phi[822,12]", "phi[1236,12]",
"phi[2044,12]", "phi[2018,12]", "phi[1065,12]", "phi[314,12]",
"phi[943,12]", "phi[514,12]", "phi[448,12]", "phi[1535,12]",
"phi[1798,12]", "phi[960,12]", "phi[22,12]", "phi[128,12]", "phi[190,12]",
"phi[2037,12]", "phi[772,12]", "phi[1553,12]", "phi[417,12]",
"phi[1659,12]", "phi[1529,12]", "phi[1369,12]", "phi[2075,12]",
"phi[1722,12]", "phi[712,12]", "phi[80,12]", "phi[1050,12]",
"phi[1877,12]", "phi[1195,12]", "phi[1138,12]", "phi[1549,12]",
"phi[1886,12]", "phi[90,12]", "phi[1990,12]", "phi[423,12]",
"phi[783,12]", "phi[165,12]", "phi[1975,12]", "phi[951,12]",
"phi[1681,12]", "phi[1647,12]", "phi[1286,12]", "phi[1666,12]",
"phi[1029,12]", "phi[1989,12]", "phi[668,12]", "phi[1859,12]",
"phi[763,12]", "phi[879,12]", "phi[1639,12]", "phi[839,12]",
"phi[1366,12]", "phi[731,12]", "phi[34,12]", "phi[250,12]", "phi[25,12]",
"phi[457,12]", "phi[465,12]", "phi[1010,12]", "phi[1388,12]",
"phi[2055,12]", "phi[917,12]", "phi[188,12]", "phi[130,12]"), class = "data.frame")
library(scatterplot3d) #http://www.statmethods.net/graphs/scatterplot.html
library(rgl)
library(akima)
sp= c("fortis","fuliginosa","magnirostris","scandens")
open3d()
par3d(windowRect = c(10, 10, 600, 600))
plot3d(x = df3d$pc1,
y = df3d$pc2,
z = df3d$phi,
col=c("#FF3030","#9ACD31", "#1D90FF", "#FF8001")[(as.factor(df3d$sp))],
xlab = "PC1",
ylab = "PC2",
zlab = "Fitness",
type = "p",
# size = round(as.numeric(df3d$z.mean)),
main = "yo")
for(j in 1:length(sp)){
df3d.sp = df3d[df3d$sp == sp[j],]
if(nrow(df3d.sp) == 1){next} else{
s = interp(df3d.sp$pc1,
df3d.sp$pc2,
df3d.sp$phi,
duplicate="strip")
z = s$z*2
zlim <- range(df3d$phi)
zlen <- zlim[2] - zlim[1] + 1
colorlut <- heat.colors(zlen) # height color lookup table
col <- colorlut[ z-zlim[1]+1 ] # assign colors to heights for each point
surface3d(s$x,s$y,s$z,color=col, alpha = 0.5)
}
}
The best I could do is something like this:
for(j in 1:length(sp)){
df3d.sp = df3d[df3d$sp == sp[j],]
if(nrow(df3d.sp) == 1){next} else{
s = interp(df3d.sp$pc1,
df3d.sp$pc2,
df3d.sp$phi,
duplicate="strip")
rbPal <- colorRampPalette(c('yellow','red'))
nb.div = 10
data.col =as.data.frame(matrix(as.factor(cut(s$z,breaks = nb.div)),
dim(s$z)[1],dim(s$z)[2]))
col.index=matrix(as.numeric(unlist(data.col)),
dim(s$z)[1],dim(s$z)[2])
Col <- rbPal(nb.div)[col.index]
col= matrix(Col,dim(s$z)[1],dim(s$z)[2])
surface3d(s$x,s$y,s$z,color=col, alpha = 0.5)
}
}
The problem is that the colors are not going from red to yellow (0->1). They are randomly associated:
Also, the colors are not constrained to be between 0 and 1.
How could I do this?
I've just tried some new code and it seems to work, but not with the data that I have.
library(scatterplot3d)
library(rgl)
library(akima)
x = rnorm(100)
y = rnorm(100)
z1 = -x^2-y^2+x^3
expit <- function(x) 1/(1+exp(-x))
logit <- function(x) log(x/(1-x))
z = expit(z1+1)
plot3d(x = x,
y = y,
z = z,
col="red",
xlab = "PC1",
ylab = "PC2",
zlab = "Fitness",
type = "p",
# size = round(as.numeric(df3d$z.mean)),
main = "yo")
s = interp(x,
y,
z,
duplicate="strip")
rbPal <- colorRampPalette(c('red','yellow'))
nb.div = 10
data.col = as.data.frame(matrix(as.factor(cut(s$z, breaks = nb.div)),
dim(s$z)[1],dim(s$z)[2]))
col.index = matrix(as.numeric(unlist(data.col)),
dim(s$z)[1],dim(s$z)[2])
Col <- rbPal(nb.div)[col.index]
col= matrix(Col, dim(s$z)[1], dim(s$z)[2])
surface3d(s$x,s$y,s$z,color=col, alpha = 1)
Why would that one work?
I found the answer. I needed to order the cut values and then remap the values of the range with the colors. Not elegant, but working...
data.col = as.data.frame(matrix(as.factor(cut(s$z,ordered_result = T,
include.lowest = TRUE,
right = TRUE,
breaks = nb.div)),
dim(s$z)[1],
dim(s$z)[2],byrow = FALSE))
range = levels(cut(s$z,ordered_result = T,
include.lowest = TRUE,
right = TRUE,
breaks = nb.div))
library(plyr)
for(i in 1:ncol(data.col)){
data.col[,i] <- mapvalues(data.col[,i],
from=range,
to=rbPal(nb.div),
warn_missing = FALSE)
}
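A possibly simpler route to the same result is to cut the z values at fixed breaks between 0 and 1 and ask cut() for integer bin codes, which index the palette directly. This is a sketch, not the method used above: z_grid is a made-up stand-in for s$z from interp(), and n_col, pal, idx and col_grid are illustrative names.

```r
# Stand-in for the interpolated surface heights (s$z); NA marks cells with no data
z_grid <- matrix(c(0.02, 0.5, NA, 0.98), nrow = 2)

n_col <- 10
pal <- colorRampPalette(c("red", "yellow"))(n_col)  # red near 0, yellow near 1

# Fixed breaks on [0, 1] keep the colour scale independent of the observed range;
# labels = FALSE returns integer bin numbers instead of a factor
idx <- cut(z_grid,
           breaks = seq(0, 1, length.out = n_col + 1),
           include.lowest = TRUE, labels = FALSE)

# Rebuild the matrix shape; cells with NA heights get NA colours
col_grid <- matrix(pal[idx], nrow = nrow(z_grid), ncol = ncol(z_grid))
col_grid
```

The resulting matrix could then be passed as color = col_grid in the surface3d(s$x, s$y, s$z, ...) call shown earlier. The first attempt probably showed only two colours because z - zlim[1] + 1 stays between 1 and about 3 for values in [0, 1], and the index is truncated to an integer.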

Increasing size of circles in ggplot2 graphs [duplicate]

I want to increase the scale of circles in ggplot2. I tried something like this aes(size=100*n) but it did not work for me.
df <-
structure(list(Logit = c(-2.9842723737754, 1.49511606166294,
-2.41756623714116, -2.96160412831003, -2.12996384688938, -1.61751836789074,
-0.454353048358851, 0.9284099250287, -0.144082412641708, -2.30422500981431,
-0.658367257547178, 0.082600042011989, -0.318343575566633, -0.717447827238429,
-1.0508122312565, -2.82559465551781, 0.361703788394458, -1.85086010050691,
-0.0916611209129359, -0.740116072703798, 0.0599317965466193,
-0.370764867295404, -0.703703748477917, -0.749040239408657, -2.7575899191217,
-2.51532401980067, 1.38177483433609, 1.47244781619757, -0.205002348239784,
0.135021333740761), PRes = c(-0.661648371860934, 1.63444424896772,
-0.30348016008728, -0.230651042355737, 1.07487559116003, -0.460143991337599,
-0.823052248365889, -0.999903730870253, -0.959022180953211, -0.321344960297977,
-1.40881799070885, -0.674754839222841, 0.239931843185434, -1.81660411888874,
0.830318780187542, -0.24702802619469, 0.692695708496924, -0.40412065378683,
-0.977640032689132, -0.715192962242284, -1.06270128658429, -0.856103053117159,
-0.731162073769824, 1.51334938767359, 4.02946801536109, 3.56902361409375,
0.505952430753934, 0.483660641952208, 1.13712619443209, 0.951889504154342
), n = c(7L, 38L, 1L, 1L, 11L, 1L, 1L, 4L, 1L, 1L, 3L, 9L, 2L,
8L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L)), .Names = c("Logit", "PRes", "n"), row.names = c(NA, -30L
), class = "data.frame")
library(ggplot2)
ggplot(data=df, mapping=aes(x=Logit, y=PRes, label=rownames(df))) +
geom_point(aes(size=n), shape=1, color="black") +
geom_text() +
theme_bw() +
theme(legend.position="none")
Simply add a scale for size:
+ scale_size_continuous(range = c(10, 15))
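Put together with the plot from the question, that might look like the sketch below. df_small holds four rows rounded from the question's df, and p_size is an illustrative name. Multiplying inside aes(size = 100 * n) has no visible effect because the size scale rescales whatever values it is given onto its output range, so it is the range argument that controls how big the circles are drawn.

```r
library(ggplot2)

# Four rows rounded from the question's data
df_small <- data.frame(
  Logit = c(-2.98, 1.50, -2.13, 0.93),
  PRes  = c(-0.66, 1.63, 1.07, -1.00),
  n     = c(7L, 38L, 11L, 4L)
)

p_size <- ggplot(df_small, aes(x = Logit, y = PRes)) +
  geom_point(aes(size = n), shape = 1, color = "black") +
  # range = c(smallest, largest) drawn point size
  scale_size_continuous(range = c(10, 15)) +
  theme_bw() +
  theme(legend.position = "none")
p_size
```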
