How to plot multiple ACF values on the one graph - r

I've just started using R and would like to look at the autocorrelation in my data using ACF. My data frame (GL) looks something like this:
GL
well year month value area
684 1994 Jan 8.53 H
684 1994 Feb 8.62 H
684 1994 Mar 8.12 H
684 1994 Apr 8.21 H
684 1995 Jan 8.53 H
684 1995 Feb 8.62 H
684 1995 Mar 8.12 H
684 1995 Apr 8.21 H
684 1996 Jan 8.53 H
684 1996 Feb 8.62 H
684 1996 Mar 8.12 H
684 1996 Apr 8.21 H
101 1994 Jan 8.53 R
101 1994 Feb 8.62 R
101 1994 Mar 8.12 R
101 1994 Apr 8.21 R
101 1995 Jan 8.53 R
101 1995 Feb 8.62 R
101 1995 Mar 8.12 R
101 1995 Apr 8.21 R
101 1996 Jan 8.53 R
101 1996 Feb 8.62 R
101 1996 Mar 8.12 R
101 1996 Apr 8.21 R
I would like to:
1. Calculate the ACF for each well using lapply or some kind of loop (my actual data set has about 100 wells and three groups).
2. Plot the ACF values (as lines) for each well on one graph per group (so in this case I would have two ACF graphs, H and R).
I can use split and lapply to calculate ACF for each well e.g.
split <- split(GL$value,GL$well)
test <- lapply(split,acf)
But splitting this way doesn't save the area information. If I split like this:
split1 <- split(GL,GL$well)
Then I don't know how to perform lapply on the values for each well.

As you split the data by well,
spl1 <- split(GL, GL$well)
the lapply would look like this.
lapply(spl1, function(x) acf(x$value))
We could make this somewhat nicer, though.
When we do the lapply by list number we get a "counter" with which we can access the list names to paste together informative titles. With par(mfrow=c(<rows>, <columns>)) we can set the arrangement of the plots.
par(mfrow=c(1, 2))
lapply(seq_along(spl1), function(x) acf(spl1[[x]]$value,
main=paste0("well ", names(spl1)[x], ", ",
"area ", unique(spl1[[x]]$area))))
Result
This will probably have to be adapted according to how your wells are divided into groups.
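If the goal is one graph per area with one line per well, rather than one panel per well, a possible variation is to compute the ACF without plotting and draw the lines afterwards. The following is only a sketch in base R under some assumptions: the toy data are made up (four wells, numeric values), every well belongs to exactly one area, and the lag limit of 6 is arbitrary.

```r
# Toy data with the same columns as GL (values are made up for illustration)
set.seed(1)
GL <- data.frame(
  well  = rep(c("684", "685", "101", "102"), each = 12),
  value = rnorm(48, mean = 8.4, sd = 0.2),
  area  = rep(c("H", "R"), each = 24)
)

max_lag <- 6  # arbitrary choice for this sketch

# ACF per well as a plain numeric vector (lags 0..max_lag), nothing drawn yet
acf_by_well <- lapply(split(GL$value, GL$well), function(v) {
  drop(acf(v, lag.max = max_lag, plot = FALSE)$acf)
})

# Area of each well (assumes one area per well)
area_of_well <- vapply(split(as.character(GL$area), GL$well),
                       function(a) a[1], character(1))

# One panel per area, one line per well
areas <- unique(area_of_well)
par(mfrow = c(1, length(areas)))
for (a in areas) {
  wells <- names(area_of_well)[area_of_well == a]
  m <- do.call(cbind, acf_by_well[wells])  # lags in rows, wells in columns
  matplot(0:max_lag, m, type = "l", lty = 1, ylim = c(-1, 1),
          xlab = "Lag", ylab = "ACF", main = paste("area", a))
  abline(h = 0, col = "grey")
  legend("topright", legend = wells, col = seq_along(wells), lty = 1)
}
```

With about 100 wells the legend will get crowded, so it may be better to drop it or to colour the lines by some other attribute.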
(As a sidenote: Better avoid overwriting function names. You use split() and give the result the same name as the function which could induce confusion, both of yourself and of R. Other popular candidates are data, df, table. We can always quickly check with ? whether the name is "free", e.g. ?df.)
Data
# result of `dput(GL)`
GL <- structure(list(well = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("101", "684"), class = "factor"), year = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("1994", "1995", "1996"
), class = "factor"), month = structure(c(3L, 2L, 4L, 1L, 3L,
2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L,
2L, 4L, 1L), .Label = c("Apr", "Feb", "Jan", "Mar"), class = "factor"),
value = structure(c(3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L,
1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L), .Label = c("8.12",
"8.21", "8.53", "8.62"), class = "factor"), area = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("H", "R"), class = "factor")), row.names = c(NA,
-24L), class = "data.frame")

You can solve it with data.table:
Let's start with the data (slightly modified from yours, so there will be different values for each well):
library(data.table)
GL <- structure(list(well = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), .Label = c("101", "684"), class = "factor"), year = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("1994", "1995", "1996"), class = "factor"), month = structure(c(3L, 2L, 4L, 1L, 3L,
2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L), .Label = c("Apr", "Feb", "Jan", "Mar"), class = "factor"),
value = c(4.65144120692275, 8.98342372477055, 17.983893298544,
15.3687085728161, 8.9577708535362, 7.47583840973675, 16.6564453896135, 11.6158618542831, 23.6109819535632, 14.1604918171652, 11.3882310683839, 20.4579487598967, 3.31275907787494, 22.109053656226, 13.598402187461, 12.3686389743816, 17.9585587936454, 17.3689122993965, 7.38424337399192, 6.93579732463695, 13.2789171519689, 21.2500206897967, 13.5766511948314, 3.58588649751619), area = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("H", "R"), class = "factor")), row.names = c(NA, -24L), class = c("data.table", "data.frame"))
Then we create a list for each well:
GL[, datos := .(list(value)) , by = well]
Each row in the datos variable will have a list with all the values corresponding to the well, so we can drop most of them and keep only the first row of each well, as it has all the information already. That is done with GL[, .SD[1,], by = well] so the result will be a two-row data table. After that, we can chain another expression that will produce and save each plot:
GL[, .SD[1,], by = well][
, {png(filename = paste0(well, "-", area, ".png"),
width = 1600,
height = 1600,
units = "px",
res = 100);
plot(acf(datos[[1]], plot = FALSE), main = paste("Well:", well,
"Area:", area, sep = " "));
dev.off()},
by = well]
Your two plots will be saved in the current directory with names like "684-H.png" and "101-R.png".
Key point here: data.table takes expressions and not just functions, so it's absolutely possible to produce the plots and save them to any given location.

Related

Why am I getting the same result from Anova and aov_car in R, and why is it different from SPSS?

I am conducting a reanalysis of some data. The dv is continuous (beta value ie neural activity) and the iv is categorical (position) with three levels (1, 2, 3). Position is set as a factor. It is repeated measures, and there are 126 observations. The original analysis was done in SPSS, and I am trying to replicate those results with R.
I don't understand how to make an MRE of this, so my data from dput is at the bottom.
My ANOVA results are different from those reported in the original paper (the data is identical). Specifically, they reported F(2,82) = 18.262, p = 0.00, but my table (below) is totally different. I used the Anova function, and now get the impression that I should be using aov_car but the output is the same between the two.
> Anova(lm(Beta ~ Position, data = stack_ex))
Anova Table (Type II tests)
Response: Beta
Sum Sq Df F value Pr(>F)
Position 60.57 2 1.5213 0.2225
Residuals 2448.70 123
> aov_car(Beta ~ Position + Error(Beta), data = stack_ex)
Contrasts set to contr.sum for the following variables: Position
Anova Table (Type 3 tests)
Response: Beta
Effect df MSE F ges p.value
1 Position 2, 123 19.91 1.52 .024 .222
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1
I didn't know what to use as the error term, so I put Beta. Is this the issue?
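One observation on the reported degrees of freedom: 126 observations over 3 positions is 42 per position, and 2 × (42 − 1) = 82, which is the error df of a one-way within-subject design with 42 subjects. The sketch below uses made-up data and a hypothetical `id` column (the posted data contains no subject identifier, so how rows map to subjects is an assumption) to show that putting a subject factor in Error() yields df of 2 and 82, whereas a model without it yields 2 and 123.

```r
# Made-up repeated-measures data: 42 hypothetical subjects x 3 positions
set.seed(42)
n_subj <- 42
toy <- data.frame(
  id       = factor(rep(seq_len(n_subj), times = 3)),  # hypothetical subject ID
  Position = factor(rep(1:3, each = n_subj)),
  Beta     = rnorm(3 * n_subj, mean = 4, sd = 3)
)

# Within-subject ANOVA: Position is tested against the id:Position stratum
fit <- aov(Beta ~ Position + Error(id / Position), data = toy)
within_tab <- summary(fit)[["Error: id:Position"]][[1]]
print(within_tab)

# Between-subject model for comparison: residual df is 126 - 3 = 123
between_tab <- anova(lm(Beta ~ Position, data = toy))
print(between_tab)
```

In afex the analogous call would place the subject factor, not the response, inside Error(), but that again needs an identifier column in the data.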
Here is the data, apologies for not being able to reduce it down.
> dput(stack_ex)
structure(list(Beta = c(9.97627322506813, 4.51007015616003, 12.5899137493145,
5.16107528902195, 0.69934803628816, 3.05576441003722, 9.73415586595716,
3.48253752326239, 8.72271400892749, 6.58254223482513, 9.64252595049282,
6.2575247824253, 5.55086088416984, 3.26575073266956, -0.189486641765607,
-6.34627220217585, 3.03699535774724, 5.38452950644857, 7.2247809046584,
1.05684383099248, 0.745997758871227, 13.4708766693015, 6.22313273382721,
7.60691743953363, 7.95869706610072, 0.0733745510036445, 5.74455260852637,
9.10243217750976, 3.83463985621549, 6.51540068169028, 6.74657874951813,
9.06748922888841, 4.18661204617864, 8.13865720827057, 4.97289378228525,
4.79399790512039, -12.5433736154914, 3.22520674616528, 4.83924807559523,
6.89780284608954, 2.01175994751707, 1.58936731656692, 8.65646845487533,
2.03332866864119, 6.59573013233866, 4.35624613417537, 3.22584501764675,
3.01812749198894, 8.67739700219412, 5.14273744714805, 7.54959191256081,
7.83244934217214, 8.67126128885367, 3.99955715822518, 2.95804569815409,
2.25327292671231, 0.258342636171449, -6.87648408967595, 1.9848049507549,
2.45033479610578, 7.41525416520838, 1.11896377050173, 0.0698315480648937,
9.90975895502056, 5.03717210651178, 4.67127493715398, 7.90306051043896,
3.0618932143297, 5.43781266582611, 8.9383987897543, 4.7982992164727,
6.90576740201611, 4.43862196057089, 9.06484925843098, 3.35645527138813,
5.42103905597134, 2.32859166774007, 3.65962841104834, -11.716124636774,
7.15256990819002, 4.02640955184303, 7.10747478179406, 2.81026958853589,
1.21494403713035, 9.06256308202033, 2.40170878761068, 6.45729748790901,
4.88232212084591, 1.55722661655526, 3.09556060018938, 6.6629967466337,
4.38848062553557, 4.38871083406173, 6.40367918458127, 6.361735558817,
4.21279189431753, 2.08838813524482, 2.21632202746396, -0.491401226521853,
-7.3685373528786, 2.12839354041543, 4.22958686769682, 4.25606944426722,
0.330400668298046, 1.02776552933976, 10.6734745608271, 3.01238218831987,
4.03318609054561, 6.45849154079659, 0.45593329021199, 5.76390726591623,
7.21202360734704, 4.62140561321984, 3.72714943200746, 5.49911004676976,
9.15658405382221, 3.25231083403689, 3.67627240704932, 3.48390458422993,
2.98674297337782, -19.5189775914798, 2.59812967326379, 2.78334604762499,
3.70635047793331, -0.223282095324164, 2.17552096286021), Position = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1",
"2", "3"), class = "factor")), row.names = c(NA, -126L), class = c("tbl_df",
"tbl", "data.frame"))

How to separate compact letter display (CLD) in multcomp by group without changing the p-value adjustment method?

Problem
I would like to plot estimated marginal means from a three-way factorial experiment with letters indicating significantly different means, adjusted for multiple comparisons. My current workflow is to fit the model with lmer(), calculate estimated marginal means with emmeans(), then implement the compact letter display algorithm with cld().
My problem is that the graph is too busy when you plot all three-way interactions on the same plot. So I would like to split up the plot and generate different sets of letters for each subplot, starting with "a". The problem is that when I use the by argument in cld to split it up, it does a separate correction for multiple comparisons within each by group. Because there are now fewer tests within each group, this results in a less conservative correction. But if I try to manually split up the output of cld() without a by group, I would have to manually re-implement the letter algorithm for each subplot. I guess I could do that but it seems cumbersome. I am trying to share this code with a client for him to modify later, so that solution would probably be too complex. Does anyone have an easy way to either:
1. Get the output of cld() to use one combined correction for all by groups.
2. Using a relatively simple method, reduce the compact letter display for each subgroup to the minimal necessary number of letters.
Reproducible example
Load packages and data.
library(lme4)
library(emmeans)
library(multcomp)
dat <- structure(list(y = c(2933.928571, 930.3571429, 210.7142857, 255.3571429,
2112.5, 1835.714286, 1358.928571, 1560.714286, 9192.857143, 3519.642857,
2771.428571, 7433.928571, 4444.642857, 3025, 3225, 2103.571429,
3876.785714, 925, 1714.285714, 3225, 1783.928571, 2223.214286,
2537.5, 2251.785714, 7326.785714, 5130.357143, 2539.285714, 6116.071429,
5808.928571, 3341.071429, 2212.5, 7562.5, 3907.142857, 3241.071429,
1294.642857, 4325, 4487.5, 2551.785714, 5648.214286, 3198.214286,
1075, 335.7142857, 394.6428571, 1605.357143, 658.9285714, 805.3571429,
1580.357143, 1575, 2037.5, 1721.428571, 1014.285714, 2994.642857,
2116.071429, 800, 2925, 3955.357143, 9075, 3917.857143, 2666.071429,
6141.071429, 3925, 1626.785714, 2864.285714, 7271.428571, 3432.142857,
1826.785714, 514.2857143, 1319.642857, 1782.142857, 2637.5, 1355.357143,
3328.571429, 1914.285714, 817.8571429, 1896.428571, 2121.428571,
521.4285714, 360.7142857, 1114.285714, 1139.285714, 7042.857143,
2371.428571, 2287.5, 4967.857143, 2180.357143, 1944.642857, 2408.928571,
5289.285714, 7028.571429, 3080.357143, 5394.642857, 5973.214286,
7323.214286, 1419.642857, 1455.357143, 4657.142857, 7069.642857,
2451.785714, 4319.642857, 5562.5, 3953.571429, 1182.142857, 1957.142857,
3796.428571, 1773.214286, 400, 871.4285714, 842.8571429, 657.1428571,
1360.714286, 1853.571429, 1826.785714, 3405.357143, 2605.357143,
5983.928571, 4935.714286, 4105.357143, 7666.071429, 3619.642857,
5085.714286, 1592.857143, 1751.785714, 5992.857143, 2987.5, 794.6428571,
3187.5, 825, 3244.642857), f1 = structure(c(4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A",
"B", "C", "D"), class = "factor"), f2 = structure(c(2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("foo",
"bar"), class = "factor"), f3 = structure(c(4L, 3L, 2L, 1L, 3L,
4L, 1L, 2L, 4L, 2L, 1L, 3L, 3L, 2L, 4L, 1L, 3L, 1L, 4L, 2L, 2L,
4L, 3L, 1L, 2L, 4L, 1L, 3L, 2L, 3L, 1L, 4L, 3L, 4L, 1L, 2L, 3L,
2L, 4L, 1L, 2L, 1L, 3L, 4L, 1L, 2L, 4L, 3L, 2L, 1L, 3L, 4L, 3L,
1L, 4L, 2L, 4L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 3L, 4L, 1L, 2L, 1L,
4L, 3L, 2L, 3L, 1L, 4L, 2L, 1L, 3L, 4L, 2L, 4L, 3L, 1L, 2L, 1L,
3L, 4L, 2L, 3L, 1L, 4L, 2L, 4L, 1L, 3L, 2L, 2L, 3L, 4L, 1L, 4L,
1L, 2L, 3L, 4L, 1L, 3L, 2L, 1L, 2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L,
4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L, 2L, 4L), .Label = c("L1",
"L2", "L3", "L4"), class = "factor"), block = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1",
"2", "3", "4"), class = "factor")), row.names = c(NA, -128L), class = "data.frame")
Fit model and get estimated marginal means.
fit <- lmer(log10(y) ~ f1 * f2 * f3 + (1 | block), data = dat)
emm <- emmeans(fit, ~ f1 + f2 + f3, mode = 'Kenward-Roger', type = 'response')
Version 1
In this version, I take the CLD as a whole, which correctly uses the Sidak adjustment for 496 tests. However, let's say I wanted to plot only those rows where f2 == 'bar'. The letters are no longer minimal because some are redundant (fewer than 8 are needed). Is there any function that can reduce the letters down?
cldisplay1 <- cld(emm, adjust = 'sidak', Letters = letters)
subset(as.data.frame(cldisplay1), f2 == 'bar') # correct comparisons but contains redundant letters
output
f1 f2 f3 response SE df lower.CL upper.CL .group
8 D bar L1 365.6732 76.1231 96 185.9699 719.0244 a
24 D bar L3 582.8573 121.3349 96 296.4229 1146.0742 ab
16 D bar L2 682.9238 142.1659 96 347.3136 1342.8353 ab
7 C bar L1 898.1560 186.9714 96 456.7740 1766.0470 abcd
6 B bar L1 1627.7069 338.8438 96 827.8006 3200.5652 bcdefg
15 C bar L2 1635.4393 340.4534 96 831.7330 3215.7694 bcdefg
32 D bar L4 1746.6052 363.5951 96 888.2685 3434.3552 bcdefg
31 C bar L4 2348.6629 488.9270 96 1194.4562 4618.1832 cdefgh
21 A bar L3 2499.6772 520.3640 96 1271.2573 4915.1230 cdefgh
5 A bar L1 2545.4594 529.8946 96 1294.5407 5005.1448 cdefgh
23 C bar L3 2561.0138 533.1326 96 1302.4512 5035.7294 cdefgh
30 B bar L4 3158.6969 657.5538 96 1606.4140 6210.9556 efgh
22 B bar L3 3364.9438 700.4887 96 1711.3047 6616.4994 efgh
14 B bar L2 3411.4009 710.1598 96 1734.9313 6707.8482 efgh
13 A bar L2 3769.4223 784.6900 96 1917.0098 7411.8269 efgh
29 A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551 h
Version 2
In this version, I use the by argument to cld() to split by f2. This reduces the letters within each group, but the Sidak adjustment is now less conservative. For example, row 8 and row 16 are not significantly different at the adjusted alpha-level from the comparison above, but now they are different. But I do not want to change the tests used, just to plot only a subset of the data. Is there a way to specify the number of tests I'm adjusting for as a whole, even though cld is split up with by groups?
cldisplay2 <- cld(emm, adjust = 'sidak', by = 'f2', Letters = letters)
subset(as.data.frame(cldisplay2), f2 == 'bar')
output
f1 f2 f3 response SE df lower.CL upper.CL .group
8 D bar L1 365.6732 76.1231 96 185.9699 719.0244 a
24 D bar L3 582.8573 121.3349 96 296.4229 1146.0742 ab
16 D bar L2 682.9238 142.1659 96 347.3136 1342.8353 abc
7 C bar L1 898.1560 186.9714 96 456.7740 1766.0470 abcd
6 B bar L1 1627.7069 338.8438 96 827.8006 3200.5652 bcde
15 C bar L2 1635.4393 340.4534 96 831.7330 3215.7694 bcde
32 D bar L4 1746.6052 363.5951 96 888.2685 3434.3552 cde
31 C bar L4 2348.6629 488.9270 96 1194.4562 4618.1832 de
21 A bar L3 2499.6772 520.3640 96 1271.2573 4915.1230 def
5 A bar L1 2545.4594 529.8946 96 1294.5407 5005.1448 def
23 C bar L3 2561.0138 533.1326 96 1302.4512 5035.7294 def
30 B bar L4 3158.6969 657.5538 96 1606.4140 6210.9556 ef
22 B bar L3 3364.9438 700.4887 96 1711.3047 6616.4994 ef
14 B bar L2 3411.4009 710.1598 96 1734.9313 6707.8482 ef
13 A bar L2 3769.4223 784.6900 96 1917.0098 7411.8269 ef
29 A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551 f
With the two separate tables (or plots?) you are displaying a total of 120 + 120 = 240 comparisons, since each f2 group has 16 means and choose(16, 2) = 120 pairs. An overall multiplicity adjustment for these 240 comparisons can be considerably less conservative than one for 496 comparisons. It is possible to specify a different value of alpha so that the Sidak adjustment works out correctly. For example, if you want the overall alpha to be 0.05, use
cld(emm, adjust = 'sidak', by = 'f2', Letters = letters,
alpha = 1 - sqrt(0.95))
With this, you are specifying alpha = 0.02532. Note that if
p.adj = 1 - (1 - p)^120 < 1 - sqrt(.95)
then
(1 - p)^120 > sqrt(.95)
so that
(1 - p)^240 > .95
thus
1 - (1 - p)^240 < .05
That is, by splitting the CLD table into two parts showing 120 comparisons each, we correctly apply the Sidak adjustment for the 240 comparisons in total at a significance level of .05.
Enhancement
Another idea based on this that results in a less conservative adjustment is to specify the Tukey adjustment instead:
cld(emm, adjust = 'tukey', by = 'f2', Letters = letters,
alpha = 1 - sqrt(0.95))
Thus, each separate table has an exact familywise error rate of 1 - sqrt(0.95); and we used the Sidak adjustment (slightly conservative) to combine the two tables, so that the error rate for the whole family of 240 tests is at most 0.05.
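The arithmetic behind alpha = 1 - sqrt(0.95) can be checked numerically. The following is a small sketch in base R; the helper name `overall_sidak` is made up for illustration and is not part of emmeans or multcomp. It takes a comparison sitting exactly at the per-table Sidak threshold and adjusts it for both tables together, for several values of k (the number of comparisons per table).

```r
# Hypothetical helper (not an emmeans function): find the largest unadjusted
# p-value that is still significant in one table of k comparisons at level
# alpha_split, then Sidak-adjust it for two such tables (2 * k comparisons).
overall_sidak <- function(k, alpha_split = 1 - sqrt(0.95)) {
  p_crit <- 1 - (1 - alpha_split)^(1 / k)  # solves 1 - (1 - p)^k = alpha_split
  1 - (1 - p_crit)^(2 * k)
}

alpha_split <- 1 - sqrt(0.95)
print(round(alpha_split, 5))  # 0.02532, the value quoted in the answer

# The overall level comes out at 0.05 (up to rounding error) for any k
overall <- sapply(c(10, 90, 120), overall_sidak)
print(overall)
```

Because the k-th root and the 2k-th power cancel, the per-table level 1 - sqrt(0.95) always corresponds to an overall Sidak level of 0.05 across two equally sized tables.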

split block in R lmer

A factorial combination of 16 treatments (4*2*2) was replicated three times and laid out in a strip-split block. Treatments consisted of eight site preparations (4*2) applied as whole-plot treatments, and two levels of weeding (weeding/no-weeding) applied randomly to subplots. The analysis was run in Genstat, giving the following results:
Variate: result
Source of variation              d.f.      s.s.     m.s.    v.r.   F pr.
Rep stratum                         2    35.735   17.868
Rep.Burning stratum
Burning                             1     0.003    0.003    0.00   0.972
Residual                            2     3.933    1.966    1.53
Rep.Site_prep stratum
Site_prep                           3     7.981    2.660    0.45   0.727
Residual                            6    35.477    5.913    4.61
Rep.Burning.Site_prep stratum
Burning.Site_prep                   3     2.395    0.798    0.62   0.626
Residual                            6     7.691    1.282    0.60
Rep.Burning.Site_prep.*Units* stratum
Weeding                             1    13.113   13.113    6.13   0.025
Burning.Weeding                     1     0.486    0.486    0.23   0.640
Site_prep.Weeding                   3    17.703    5.901    2.76   0.076
Burning.Site_prep.Weeding           3     3.425    1.142    0.53   0.666
Residual                           16    34.248    2.141
Total                              47   162.190
I want to reproduce these results in R. I used both stats::aov and lmerTest::lmer. I managed to get the correct results with aov using the formula
result ~ Burning * Weeding * Site.prep + Error(Rep/Burning*Site.prep). With lmer I used the formula
result ~ Burning*Site.prep*Weeding+(1|Rep/(Burning:Site.prep)), which gave only partially correct results. The SS and F values for Burning, Site.prep and Burning:Site.prep deviated (although not by much) from the Genstat results, but Weeding and the Weeding interactions gave the same SS and F values as the Genstat output.
I would like to know how I should specify the lmer model to reproduce the Genstat and aov results.
Data and code below:
x <- structure(list(
Rep = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1", "2", "3"
), class = "factor"),Burning = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("Burn",
"No-burn"), class = "factor"), Site.prep = structure(c(4L, 4L,4L, 4L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L),
.Label = c("Chop_Pit", "Chop_Rip", "Pit", "Rip"), class = "factor"), Weeding = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L),
.Label = c("Weedfree", "Weedy"), class = "factor"),
Dbh14 = c(27.4, 28.4083333333333, 27.7066666666667, 27.3461538461538, 28.6, 28.3333333333333, 27.0909090909091,
27.8076923076923, 27.1833333333333, 27.5461538461538, 24.3076923076923,
29.3461538461538, 27.4, 25.1, 26.61, 28.0461538461538, 27.71,
25.2533333333333, 25.3833333333333, 24.2307692307692, 24.2533333333333,
24.95, 24.34375, 26.9909090909091, 24.775, 25.9076923076923,
25.1666666666667, 25.9933333333333, 27.0466666666667, 30.5625,
27.36, 25.2636363636364, 29.6846153846154, 27.7, 28.3071428571429,
29.4857142857143, 27.025, 30.1, 31.2454545454545, 24.2888888888889,
28.4875, 29.23, 30, 28.5, 29.3615384615385, 27.45, 28.8153846153846,
29.1866666666667)), .Names = c("Rep", "Burning", "Site.prep",
"Weeding", "result"), class = "data.frame", row.names = c(NA, -48L))
library(lmerTest)  # provides lmer() with F tests in anova()
model1 <- aov(result ~ Burning* Weeding*Site.prep+ Error(Rep/Burning*Site.prep), data=x)
summary(model1)
model2 <- lmer(result ~ Burning*Site.prep*Weeding+(1|Rep/(Burning:Site.prep)),data=x)
anova(model2)
Applying the three-way split-plot-factorial ANOVA example from the site mentioned by @cuttlefish44 leads to:
library(lme4)
library(nlme)
m1 <- aov(result ~ Weeding*Burning*Site.prep + Error(Rep/Burning*Site.prep), data=x)
m2 <- lmer(result ~ Weeding*Burning*Site.prep + (1|Rep) + (1|Burning:Rep) +
(1|Site.prep:Rep), data=x)
m3 <- anova(lme(result ~ Weeding*Burning*Site.prep,
random=list(Rep=pdBlocked(list(~1, pdIdent(~Burning-1), pdIdent(~Site.prep-1)))),
method="ML", data=x))
summary(m1)
anova(m2)
m3
Except for Site.prep, the results match. Moreover, the results between lmer() and lme() are pretty similar (also for Site.prep). I'm not sure whether this is the result of differences in modelling approaches: the multi-level approach takes both within and between effects into account.

Looping through class ltraj?

Apologies if this is not the best forum for this question.
Has anyone been able to loop/iterate through multiple GPS-collared individuals stored in an object of class ltraj (adehabitatLT)?
I've been trying to calculate Prox (https://cran.r-project.org/web/packages/wildlifeDI/vignettes/wildlifeDI-vignette.pdf) for multiple individuals but am struggling with how exactly to loop over class ltraj because it's different from a data frame (which I'm accustomed to).
Thanks in advance.
install.packages('wildlifeDI', dependencies=TRUE)
library(wildlifeDI)
library(adehabitatLT)
chupacabra <- structure(list(CollarID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L), .Label = c("A4116F", "A4117M", "A4118F"), class = "factor"),
DateTime = structure(c(1433653200, 1433667600, 1433682060,
1433682300, 1433682600, 1433682900, 1433683200, 1433683500,
1433683800, 1433684100, 1433684400, 1433684700, 1433685000,
1433685300, 1433685600, 1433685900, 1433686200, 1433686500,
1433686800, 1433687100, 1433687400, 1433687700, 1433688000,
1433688300, 1433688600, 1433688900, 1433689200, 1433689500,
1433689800, 1433690100, 1433690400, 1433690700, 1433691000,
1433691300, 1433691600, 1433691900, 1433692200, 1433692500,
1433692800, 1433693100, 1433693400, 1433693700, 1433694000,
1433694300, 1433694600, 1433694900, 1433695200, 1433695500,
1433695800, 1433696100, 1433696400, 1433710860, 1433714400,
1433714700, 1433715000, 1433715300, 1433715600, 1433715900,
1433716200, 1433716500, 1433716800, 1433717100, 1433717400,
1433717700, 1433718000, 1433718300, 1433718600, 1433718900,
1433719200, 1433719500, 1433719800, 1433720100, 1433720400,
1433720700, 1433721000, 1433721300, 1433721600, 1433721900,
1433722200, 1433722500, 1433722800, 1433723100, 1433723400,
1433723700, 1433724060, 1433724300, 1433724600, 1433724900,
1433725200, 1433653200, 1433667660, 1433682060, 1433682300,
1433682600, 1433682900, 1433683200, 1433683500, 1433683800,
1433684100, 1433684400, 1433684700, 1433685000, 1433685300,
1433685660, 1433685900, 1433686200, 1433686500, 1433686800,
1433687100, 1433687400, 1433687700, 1433688000, 1433688300,
1433688660, 1433688900, 1433689200, 1433689500, 1433689800,
1433690100, 1433690400, 1433690700, 1433691000, 1433691300,
1433691600, 1433691900, 1433692200, 1433692500, 1433692800,
1433693100, 1433693400, 1433693700, 1433694060, 1433694300,
1433694600, 1433694900, 1433695200, 1433695500, 1433695800,
1433696100, 1433696400, 1433710860, 1433714400, 1433714700,
1433715000, 1433715300, 1433715600, 1433715900, 1433716200,
1433716500, 1433716800, 1433717100, 1433717400, 1433717700,
1433718000, 1433718300, 1433718600, 1433718900, 1433719200,
1433719500, 1433719800, 1433720100, 1433720400, 1433720700,
1433721000, 1433721300, 1433721600, 1433721900, 1433722200,
1433722500, 1433722800, 1433723100, 1433723400, 1433723700,
1433724000, 1433724300, 1433724600, 1433724900, 1433725200,
1433653249, 1433667666, 1433682089, 1433682349, 1433682632,
1433682936, 1433683234, 1433683536, 1433683837, 1433684144,
1433684443, 1433684739, 1433685031, 1433685370, 1433685634,
1433685935, 1433686236, 1433686536, 1433686826, 1433687142,
1433687448, 1433687736, 1433688034, 1433688337, 1433688649,
1433688936, 1433689236, 1433689531, 1433689827, 1433690139,
1433690433, 1433690736, 1433691048, 1433691336, 1433691634,
1433691941, 1433692236, 1433692535, 1433692833, 1433693129,
1433693434, 1433693735, 1433694028, 1433694373, 1433694642,
1433694931, 1433695234, 1433695542, 1433695831, 1433696148,
1433696448, 1433710908, 1433714437, 1433714737, 1433715036,
1433715366, 1433715636, 1433715969, 1433716234, 1433716536,
1433716827, 1433717137, 1433717435, 1433717733, 1433718048,
1433718336, 1433718636, 1433718973, 1433719272, 1433719530,
1433719837, 1433720136, 1433720431, 1433720736, 1433721031,
1433721336, 1433721640, 1433721946, 1433722236, 1433722528,
1433722842, 1433723137, 1433723434, 1433723730, 1433724035,
1433724370, 1433724634, 1433724936, 1433725236), class = c("POSIXct",
"POSIXt"), tzone = ""), UTM_X = c(636979.2503, 636977.6583,
637402.4471, 637400.3063, 637402.3105, 637407.1977, 637406.3305,
637408.2991, 637407.1907, 637407.8414, 637406.7617, 637407.1614,
637409.8019, 637431.5235, 637465.9644, 637495.9583, 637525.2219,
637573.6033, 637645.3501, 637683.3844, 637691.6229, 637693.4815,
637693.4973, 637691.2483, 637691.9061, 637693.6377, 637692.1106,
637692.3169, 637690.9989, 637691.4503, 637693.6252, 637692.4915,
637694.9434, 637692.6685, 637692.8116, 637694.6787, 637694.4404,
637695.9109, 637696.8945, 637695.2403, 637695.4283, 637694.6085,
637693.4962, 637695.6229, 637734.7283, 637773.2897, 637774.9891,
637787.6573, 637792.285, 637807.0486, 637834.6231, 637497.3348,
637149.9982, 637145.0345, 637178.159, 637181.8251, 637181.1075,
637178.023, 637175.327, 637179.9138, 637180.2833, 637181.5512,
637185.8749, 637181.0011, 637177.401, 637177.4498, 637176.787,
637176.0093, 637175.5126, 637177.9578, 637178.5819, 637188.3911,
637188.7303, 637189.496, 637204.3885, 637195.2063, 637204.9823,
637201.5235, 637212.3355, 637274.4294, 637293.0009, 637296.3954,
637331.3382, 637358.4369, 637365.1677, 637357.5562, 637355.3896,
637345.4827, 637339.1054, 628920.3789, 628869.9781, 630028.6781,
630156.4557, 629878.756, 629658.9786, 629412.6432, 629257.5965,
629405.8967, 629113.4479, 628955.5124, 628852.0231, 628711.9202,
628632.7134, 628621.7724, 628622.2565, 628683.6018, 628771.1182,
628790.8437, 628867.7592, 628881.9794, 628830.9898, 628681.9202,
628575.3395, 628578.1836, 628656.4902, 628659.2271, 628656.689,
628660.4677, 628657.294, 628657.077, 628689.6585, 628727.0131,
628716.6979, 628703.8397, 628678.6953, 628679.3594, 628681.3549,
628625.6275, 628563.1372, 628488.425, 628482.5023, 628469.2209,
628417.9697, 628407.7352, 628405.374, 628393.143, 628394.0092,
628396.2344, 628395.05, 628395.7787, 627684.7989, 627704.889,
627702.5528, 627702.0422, 627708.7906, 627706.9374, 627687.0371,
627622.0573, 627605.7932, 627603.5707, 627587.8803, 627606.0471,
627603.2967, 627602.954, 627603.5844, 627604.1232, 627601.697,
627581.6104, 627599.7062, 627616.327, 627661.7402, 627889.446,
627883.5896, 627803.1167, 627792.5918, 627716.0886, 627720.8854,
627671.8217, 627666.9994, 627586.7035, 627584.4273, 627532.492,
627502.6326, 627430.6781, 627408.8845, 627357.5049, 627406.0466,
627427.1382, 636666.3215, 636629.7032, 637179.9041, 637187.7067,
637183.5281, 637193.2082, 637227.2331, 637290.2543, 637347.9311,
637373.0887, 637368.8923, 637371.0722, 637383.95, 637480.1799,
637510.543, 637558.428, 637676.2714, 637682.3564, 637680.8591,
637682.8516, 637680.8317, 637680.8341, 637681.9818, 637681.2897,
637681.3658, 637681.9234, 637681.8824, 637682.0629, 637684.8756,
637681.602, 637682.7548, 637680.8578, 637682.9887, 637680.2496,
637681.4629, 637682.3731, 637682.2223, 637684.1076, 637681.7127,
637681.1249, 637681.6758, 637681.595, 637682.5253, 637702.3094,
637728.9487, 637784.0853, 637776.5727, 637785.2538, 637786.6413,
637807.9935, 637834.8672, 637485.5191, 637148.5674, 637139.2974,
637174.9104, 637191.9371, 637179.4262, 637175.7715, 637176.3455,
637174.5459, 637174.2012, 637173.7462, 637177.3967, 637176.6907,
637177.8458, 637178.0774, 637178.4151, 637178.3272, 637178.2442,
637177.6655, 637176.734, 637186.2713, 637185.0998, 637197.4201,
637197.9147, 637204.1485, 637203.1784, 637204.4993, 637205.3515,
637279.9058, 637303.773, 637303.5724, 637330.3473, 637354.416,
637366.5627, 637340.7274, 637357.5505, 637350.709, 637349.689
), UTM_Y = c(3365828.581, 3365826.066, 3364992.673, 3364991.006,
3364989.036, 3364990.816, 3364989.486, 3364991.849, 3364991.37,
3364990.059, 3364989.58, 3364991.403, 3364991.536, 3364985.614,
3365030.733, 3365054.446, 3365091.064, 3365138.444, 3365289.033,
3365390.111, 3365398.839, 3365387.124, 3365390.427, 3365386.696,
3365387.104, 3365379.344, 3365386.131, 3365388.805, 3365385.152,
3365385.158, 3365386.394, 3365385.637, 3365385.48, 3365386.071,
3365385.397, 3365387.416, 3365387.269, 3365389.505, 3365389.971,
3365387.833, 3365389.676, 3365390.685, 3365385.96, 3365384.934,
3365352.152, 3365369.878, 3365376.795, 3365390.013, 3365382.689,
3365382.189, 3365410.939, 3365683.847, 3365620.829, 3365574.121,
3365527.084, 3365501.513, 3365502.801, 3365512.739, 3365514.733,
3365512.885, 3365511.016, 3365512.562, 3365510.255, 3365511.235,
3365509.494, 3365509.439, 3365509.431, 3365509.388, 3365510.678,
3365509.534, 3365511.083, 3365511.85, 3365514.659, 3365513.371,
3365525.476, 3365526.036, 3365529.429, 3365528.676, 3365513.172,
3365507.793, 3365514.623, 3365512.105, 3365504.477, 3365512.401,
3365495.238, 3365490.863, 3365441.075, 3365411.542, 3365403.003,
3371496.516, 3371594.382, 3370587.966, 3370380.241, 3370270.012,
3370346.817, 3370433.295, 3370488.189, 3370225.222, 3370122.896,
3370174.202, 3370232.298, 3370192.371, 3370255.722, 3370283.548,
3370283.21, 3370305.674, 3370344.002, 3370354.4, 3370348.973,
3370200.353, 3370078.071, 3370123.589, 3370194.686, 3370393.878,
3370500.265, 3370498.635, 3370498.882, 3370497.663, 3370499.687,
3370500.172, 3370633.763, 3370704.904, 3370839.426, 3370879.943,
3370950.842, 3370957.988, 3370963, 3371031.496, 3371082.487,
3371109.89, 3371112.17, 3371118.807, 3371167.072, 3371168.581,
3371170.127, 3371178.074, 3371177.097, 3371178.11, 3371176.777,
3371178.482, 3371566.662, 3371622.632, 3371621.252, 3371619.772,
3371623.975, 3371627.245, 3371636.71, 3371612.734, 3371598.776,
3371590.192, 3371636.009, 3371656.352, 3371656.719, 3371656.471,
3371656.755, 3371659.1, 3371656.401, 3371688.243, 3371717.065,
3371741.492, 3371755.505, 3371618.156, 3371595.308, 3371615.82,
3371560.55, 3371552.166, 3371572.884, 3371547.544, 3371530.616,
3371559.755, 3371591.63, 3371612.877, 3371657.663, 3371727.149,
3371739.263, 3371823.645, 3371912.149, 3371969.549, 3366104.602,
3365712.344, 3365494.627, 3365496.045, 3365484.575, 3365475.02,
3365485.304, 3365467.377, 3365477.805, 3365507.809, 3365510.682,
3365519.888, 3365527.19, 3365491.394, 3365490.37, 3365468.274,
3365393.413, 3365389.355, 3365386.964, 3365391.977, 3365389.125,
3365388.937, 3365389.35, 3365389.375, 3365387.159, 3365387.133,
3365386.578, 3365386.735, 3365386.161, 3365387.472, 3365387.487,
3365387.064, 3365387.977, 3365385.016, 3365387.836, 3365388.036,
3365387.048, 3365389.909, 3365387.074, 3365384.939, 3365387.717,
3365388.026, 3365388.16, 3365385.728, 3365344.996, 3365374.693,
3365377.679, 3365387.866, 3365389.823, 3365391.779, 3365410.631,
3365698.174, 3365622.297, 3365571.954, 3365511.957, 3365510.265,
3365505.086, 3365509.196, 3365512.44, 3365513.438, 3365508.777,
3365509.049, 3365509.838, 3365506.403, 3365507.748, 3365510.711,
3365509.075, 3365507.666, 3365508.152, 3365505.285, 3365498.401,
3365508.531, 3365508.483, 3365513.538, 3365520.783, 3365519.376,
3365523.92, 3365529.634, 3365529.866, 3365498.661, 3365512.941,
3365509.801, 3365503.056, 3365513.548, 3365502.683, 3365482.215,
3365438.852, 3365412.317, 3365406.363)), .Names = c("CollarID",
"DateTime", "UTM_X", "UTM_Y"), row.names = c(NA, -267L), class = "data.frame")
Commenting out incorrect transformation
# chupacabra$DateTime <-as.POSIXct(strptime(chupacabra$DateTime, format='%m/%d/%Y %H:%M:%S'),origin='1970-01-01')
chupacabra2<-as.ltraj(chupacabra[, c("UTM_X","UTM_Y")], date=chupacabra$DateTime,id=chupacabra$CollarID, typeII=TRUE)
monster1<-chupacabra2[1] #extract the first chupacabra
monster2<-chupacabra2[2] #extract the second chupacabra
proxdf <-Prox(monster1,monster2, tc=0.5*60,dc=210, local =TRUE)
Here's a sample of chupacabras that we've been tracking. We'd like to examine how often they interact with each other. This dataset has 3 individuals (but we have many, many more chupacabras), and it is inefficient to pull out the animals one by one to calculate proximity. I'd like to use a for loop (for i in unique ID, perhaps), but I don't understand how to do this when the data is in ltraj format. Any assistance would be appreciated.
After removing the harmful code that incorrectly reformatted the DateTime value, I plotted the "trajectory" for the three two-animal combinations. Clearly animal 1 did not interact with animal 2, since their ranges were disjoint. Animal 1 and animal 3 do appear to interact, since about halfway through their joint sojourns they have roughly the same trajectory:
plot( chupacabra2[[1]]$x, chupacabra2[[1]]$y, type="l",
xlim=range( c(chupacabra2[[1]]$x, chupacabra2[[3]]$x)),
ylim= range(c(chupacabra2[[1]]$y, chupacabra2[[3]]$y)))
lines( chupacabra2[[3]]$x, chupacabra2[[3]]$y, col="red")
This appears in proxdf13 as:
> monster1<-chupacabra2[1]
>
> monster3<-chupacabra2[3]
>
> proxdf13 <-Prox(monster1,monster3, tc=0.5*60,dc=210, local =TRUE)
> proxdf13
date prox
1 2015-06-07 06:01:00 549.074863
2 2015-06-07 07:20:00 104.169909
3 2015-06-07 08:10:00 6.205875
4 2015-06-07 09:05:00 14.409016
5 2015-06-07 09:20:00 11.189309
6 2015-06-07 15:40:00 6.481131
7 2015-06-07 16:25:00 4.259042
8 2015-06-07 17:15:00 10.648210
9 2015-06-07 17:35:00 4.181297
10 2015-06-07 17:41:00 7.574566
So my guess is that "interact" might mean something along the lines of "is within 15 distance units for more than 2 days in succession". A natural function to consider would be rle:
> rle( proxdf13$prox < 15 )
Run Length Encoding
lengths: int [1:2] 2 8
values : logi [1:2] FALSE TRUE
> RL13 <- rle( proxdf13$prox < 15 )
> max( RL13$lengths [ RL13$values] )
[1] 8
We can then test whether this is greater than some value, say 2.
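The rle step can be wrapped in a small function so the threshold test is explicit. This is only a sketch: longest_close_run() is a hypothetical helper name, the distances are copied from the proxdf13 output above, and it assumes the distance vector contains no NA values. It returns 0 when no fix is below the threshold, instead of the -Inf (plus a warning) that max() gives for an empty vector.

```r
# Hypothetical helper: longest run of consecutive fixes closer than `threshold`.
# Assumes `prox` contains no NA values.
longest_close_run <- function(prox, threshold = 15) {
  runs <- rle(prox < threshold)
  close_runs <- runs$lengths[runs$values]
  # max() of an empty vector returns -Inf with a warning, so return 0 instead
  if (length(close_runs) == 0) 0 else as.numeric(max(close_runs))
}

# Distances copied from the proxdf13 output above
prox13 <- c(549.074863, 104.169909, 6.205875, 14.409016, 11.189309,
            6.481131, 4.259042, 10.648210, 4.181297, 7.574566)

longest_close_run(prox13)        # 8
longest_close_run(prox13) > 2    # TRUE
longest_close_run(c(500, 300))   # 0, no fix below the threshold
```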
So that was the method to handle a single two-way animal-animal combination. The comments of the questioner below suggest I may have lost him in the material that follows. To get the two-way combinations of a sequence, use combn:
> combn(1:3, 2)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 3 3
That is a matrix that will be used to generate the index values for pulling two single-animal "trajectory" dataframes at a time from the 'ltraj' object, using each column separately. The apply function can "loop" over either the rows or the columns of a matrix; passing 2 as its second argument (MARGIN) tells apply to loop over the columns.
So putting this all together (using apply to loop over the column indices of combn-result to get the two-way combinations):
( max.days.prox <- apply( combn( seq(length(chupacabra2)), 2 ), 2,
# loops over columns of the "combinations matrix"
function(x) {
proxcomb <- Prox( chupacabra2[ x[1] ], chupacabra2[ x[2] ],
tc=0.5*60,dc=210, local =TRUE)
RLcomb <- rle( proxcomb$prox < 15 )
Interact.days <- max( RLcomb$lengths [ RLcomb$values] ) } ) )
# [1] -Inf 8 -Inf
We can rbind the combinations matrix and that result to look at the items of interest:
> rbind(combn( seq(length(chupacabra2)), 2 ) , max.days.prox)
[,1] [,2] [,3]
1 1 2
2 3 3
max.days.prox -Inf 8 -Inf
So only the pairing of animal 1 and animal 3 provided evidence of an interaction. This would generalize to larger 'ltraj' objects.
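With many animals it may be easier to read the result as a table that keeps the pair labels. The sketch below shows that idea with base R only. The distance vectors in toy_prox are invented stand-ins for what Prox() would return for each pair, and the IDs "A", "B" and "C" are hypothetical; with real data the lookup would be replaced by the Prox() call shown above.

```r
# Toy stand-in for the Prox() results: one distance vector per pair of animals.
# These numbers are invented for illustration only.
toy_prox <- list(
  "A-B" = c(900, 850, 870),
  "A-C" = c(549, 104, 6, 14, 11, 6, 4, 10, 4, 7),
  "B-C" = c(700, 650)
)

ids <- c("A", "B", "C")          # hypothetical animal IDs
pairs <- combn(ids, 2)           # character matrix, one pair per column

longest_run <- vapply(seq_len(ncol(pairs)), function(j) {
  key <- paste(pairs[1, j], pairs[2, j], sep = "-")
  runs <- rle(toy_prox[[key]] < 15)
  close_runs <- runs$lengths[runs$values]
  # 0 rather than -Inf when the pair was never within the threshold
  if (length(close_runs) == 0) 0 else as.numeric(max(close_runs))
}, numeric(1))

result <- data.frame(id1 = pairs[1, ], id2 = pairs[2, ],
                     longest_run = longest_run)
result
```

Rows with longest_run above the chosen cut-off can then be picked out with ordinary subsetting, e.g. result[result$longest_run > 2, ].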

Place 1 heatmap on another with transparency in R

I'm new to R and have the following challenge:
I want to create a visualization that basically combines two kinds of 'heatmaps' in order to visualize at what times there are truly dark skies (for astronomy). For this I want a heatmap that visualizes the brightness of the moon, based on the moonrise and moonset times and the phase of the moon. On top of this we can then plot a band-like heatmap, with some transparency, for the time the sun is up.
I'm not sure if this is going to work visually or if I need to find some other solution; however, this seems like a good challenge to get into R some more.
But I could use some pointers, as I'm already stuck loading the matrix of size 24 (hours) x 30 (days) with all 720 values. When trying to create a basic data.frame from the vectors, I get the error that the numbers of rows are inconsistent.
Furthermore, I already have some heatmap examples working, but I'm not sure how to combine two of them in the same plot as I described.
As an illustration, here is the current 'heatmap' as it is in Excel:
And some data:
MOON
moon <- structure(list(X1.9.12 = structure(c(2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L), .Label = c("0%", "100%"), class = "factor"), X2.9.12 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("0%", "98%"), class = "factor"),
X3.9.12 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L
), .Label = c("0%", "94%"), class = "factor"), X4.9.12 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0%", "89%"), class = "factor"),
X5.9.12 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L
), .Label = c("0%", "82%"), class = "factor"), X6.9.12 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("0%", "74%"), class = "factor"),
X7.9.12 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = c("0%", "65%"), class = "factor"), X8.9.12 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0%", "56%"), class = "factor"),
X9.9.12 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("0%", "47%"), class = "factor"), X10.9.12 = structure(c(2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("0%", "37%"), class = "factor"),
X11.9.12 = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("0%", "28%"), class = "factor"), X12.9.12 = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("0%", "20%"), class = "factor"),
X13.9.12 = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L
), .Label = c("0%", "12%"), class = "factor"), X14.9.12 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), .Label = c("0%", "6%"), class = "factor"),
X15.9.12 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("0%", "2%"), class = "factor"), X16.9.12 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "0%", class = "factor"),
X17.9.12 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("0%", "1%"), class = "factor")), .Names = c("X1.9.12",
"X2.9.12", "X3.9.12", "X4.9.12", "X5.9.12", "X6.9.12", "X7.9.12",
"X8.9.12", "X9.9.12", "X10.9.12", "X11.9.12", "X12.9.12", "X13.9.12",
"X14.9.12", "X15.9.12", "X16.9.12", "X17.9.12"), class = "data.frame", row.names = c("0:00:00",
"1:00:00", "2:00:00", "3:00:00", "4:00:00", "5:00:00", "6:00:00",
"7:00:00", "8:00:00", "9:00:00", "10:00:00", "11:00:00", "12:00:00",
"13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00",
"19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00"))
SUN
September
Day Sunrise Sunset
1 6:52 20:26
2 6:54 20:24
3 6:56 20:22
4 6:57 20:20
5 6:59 20:17
6 7:00 20:15
7 7:02 20:13
8 7:04 20:10
9 7:05 20:08
10 7:07 20:06
11 7:08 20:05
12 7:09 20:02
13 7:11 20:00
14 7:13 19:58
15 7:14 19:55
16 7:16 19:53
17 7:17 19:51
18 7:19 19:48
19 7:21 19:46
20 7:22 19:44
21 7:25 19:40
22 7:26 19:38
23 7:28 19:35
24 7:30 19:33
25 7:31 19:31
26 7:33 19:28
27 7:35 19:26
28 7:36 19:24
29 7:38 19:21
30 7:40 19:19
So from what I understood, there are basically two questions:
Data organization
The easiest approach is to have all the data in one data.frame in long format, i.e. one row for each combination of time and date, with additional columns for the moon and sun intensity.
So we start with melting and fixing the moon data:
library(reshape2)
moon$time <- row.names(moon)
moon <- melt(moon, id.vars="time", variable.name="date", value.name="moon" )
moon$date <- sub("X(.*)", "\\1", moon$date)
moon$moon <- 1 - as.numeric(sub("%", "", moon$moon)) /100
Now we bring the sun data into a comparable form by at least giving it the same identifier for the date:
sun$Day <- paste( sun$Day, "9.12", sep ="." )
The next step is to merge the data by date (Day in the sun data) and to create a sun intensity column comparable to the moon intensity column. This can be done by converting the times to a date-time class and comparing Sunrise and Sunset with the actual time:
mdf <- merge( moon, sun, by.x = "date", by.y = "Day" )
mdf$time.tmp <- strptime(mdf$time, format="%H:%M")
mdf$Sunrise <- round(strptime(mdf$Sunrise, format="%H:%M"), units = "hours")
mdf$Sunset <- round(strptime(mdf$Sunset, format="%H:%M"), units = "hours")
mdf$sun <- ifelse( mdf$Sunrise <= mdf$time.tmp & mdf$Sunset >= mdf$time.tmp, 1, 0 )
mdf <- mdf[c("date", "time", "moon", "sun")]
mdf[ 5:10, ]
date time moon sun
1.9.12 4:00:00 0 0
1.9.12 5:00:00 0 0
1.9.12 6:00:00 0 0
1.9.12 7:00:00 0 1
1.9.12 8:00:00 1 1
1.9.12 9:00:00 1 1
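The comparison above rounds sunrise and sunset to the nearest hour before comparing. As a variation, the sketch below makes the same kind of comparison on decimal hours without rounding. to_hours() is a hypothetical helper, not part of the original answer, and the sunrise and sunset values are the ones listed for 1 September in the question. Here an hour counts as sunlit only if its start lies between sunrise and sunset, so results can differ from the rounded version near the boundaries.

```r
# Hypothetical helper: convert "H:M" or "H:M:S" strings to decimal hours.
to_hours <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60,
         numeric(1))
}

hour_of_day <- to_hours(c("6:00:00", "7:00:00", "12:00:00",
                          "20:00:00", "21:00:00"))
sunrise <- to_hours("6:52")    # 1 September, from the table in the question
sunset  <- to_hours("20:26")

# 1 when the hour starts between sunrise and sunset, 0 otherwise
sun <- as.integer(hour_of_day >= sunrise & hour_of_day <= sunset)
sun   # 0 1 1 1 0
```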
Plotting
Adding multiple layers with different transparencies practically begs for ggplot2. To use it properly, one more data manipulation step is necessary to ensure the proper order on the axes: date and time have to be converted to factors whose levels are ordered chronologically rather than lexically:
mdf <- within( mdf, {
date <- factor( date, levels=unique(date)[ order(as.Date( unique(date), "%d.%m.%y" ) ) ] )
time <- factor( time, levels=unique(time)[ order(strptime( unique(time), format="%H:%M:%S"), decreasing=TRUE ) ] )
} )
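To see why the explicit levels matter, here is a small base R illustration with made-up date strings in the same d.m.yy style. By default, factor levels are sorted as text, which puts the 10th before the 2nd; ordering the unique values by their parsed dates gives chronological levels.

```r
dates <- c("1.9.12", "10.9.12", "2.9.12", "1.9.12")

# Default levels are sorted as text, so "10.9.12" comes before "2.9.12"
levels(factor(dates))

# Ordering the unique values by their parsed date gives chronological levels
u <- unique(dates)
chrono <- factor(dates, levels = u[order(as.Date(u, format = "%d.%m.%y"))])
levels(chrono)   # "1.9.12" "2.9.12" "10.9.12"
```

Note that order() is applied to the unique values, so the resulting indices match the vector being subset.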
This can now be plotted:
library( ggplot2 )
ggplot( data = mdf, aes(x = date, y = time ) ) +
geom_tile( aes( alpha = sun ), fill = "goldenrod1" ) +
geom_tile( aes( alpha = moon ), fill = "dodgerblue3" ) +
scale_alpha_continuous( "moon", range=c(0,0.5) ) +
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
This gives the following result:

Resources