Displaying Multiple Regression Equations - r

I have three regression lines in one plot and I am trying to display the equation of each one. I've been working off of this question to try and do this. However, the filtering doesn't seem to do anything and the same equation is displayed 3 times.
The end goal is to compare cpue in relation to veg, while controlling for location (block), and get the slopes/r^2 values for each of the three regression lines.
Data
cpue<- structure(list(lake = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L), veg = c(254.8026498, 219.9422136, 450.9662078, 484.8605026,
407.1662151, 286.7015617, 351.6441798, 179.9959443, 340.4276843,
247.2907435, 502.4119071, 336.4259995, 349.1543197, 281.7493811,
201.8284859, 325.6380404, 288.3855723, 230.8755861, 214.8890894,
326.6376698, 214.7468224, 132.0511504, 335.2727641, 336.8727253,
143.8923225, 277.3053436, 302.7005649, 355.0332852, 307.5736711,
371.8407176, 168.7645221, 365.9156811, 349.205548, 273.8392697,
171.4513348, 197.1067049, 350.5833827, 202.9605797, 365.3415045,
413.2762633, 329.8539209, 377.1415341, 180.8524994, 217.4007852,
258.5909286, 146.7092479, 258.7440138, 393.2014549, 492.6719497,
208.5002392, 219.1466664, 182.1366352, 308.0534171, 317.6037795,
131.7534807, 324.0011761, 469.5861988, 237.4492916, 318.6897863,
47.94967582, 223.5382632, 386.2227607, 343.7657123, 493.6393726,
204.2960349, 294.4218332, 178.7555635, 454.0358039, 207.1363947,
364.6063223, 462.8508521, 292.8613255, 330.3893897, 209.1769838,
237.4264742, 427.8856667), cpue = c(32.63512612, 47.98168449,
33.26735173, 14.41435377, 30.94664495, 40.26817963, 41.26204388,
31.63227286, 36.97932408, 21.54620143, 34.27556883, 6.506644061,
32.24677471, 38.24536746, 30.95968644, 24.86408391, 31.15438304,
21.69779047, 39.86223079, 27.92263229, 23.55684281, 34.6157024,
42.06943746, 24.70597527, 28.36396188, 50.34591832, 55.06361184,
48.69468021, 26.00084784, 44.77320597, 14.56328001, 33.29291085,
21.55078237, 29.95980975, 40.61006429, 43.46931237, 26.26407484,
15.87009067, 39.47297313, 20.50811378, 35.66157343, 35.64563497,
44.47319537, 42.06574907, 40.16356125, 35.57462201, 32.10051291,
34.1254268, 34.21084448, 28.18410732, 32.11249307, 38.39890418,
31.24778375, 29.76951583, 41.52508487, 34.48914051, 28.30923803,
29.33886042, 37.57268795, 59.29849175, 28.9317113, 41.27342427,
38.44878019, 44.53768204, 44.48611219, 33.15553274, 34.48894561,
34.86722967, 31.92515626, 50.04825584, 53.67528105, 37.53150868,
33.16255301, 33.22374846, 28.28172263, 42.5795616), block = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1",
"2", "3"), class = "factor")), row.names = c(NA, -76L), class = "data.frame")
Code
# Make lm() with blocking variable----------
lm_eqn2 <- function(df2){
m2 <- lmer(cpue ~ veg + (1|block), cpue);
eq2 <- substitute(italic(CPUE) == a + b*","~~italic(r)^2~"="~r2, # Write CPUE = a+b, r^2 = x
list(a = format(unname(coef(m2)[1]), digits = 4), # define 'a'
b = format(unname(coef(m2)[2]), digits = 2), # define 'b'
r2 = format(summary(m)$r.squared, digits = 3))) # define 'r2'
as.character(as.expression(eq)); # declare expression as a character
}
ggplot(cpue, aes(x=veg, y=cpue, col=block))+
geom_point()+
geom_smooth(method="lm", show.legend=F, se=F)+
annotate("text", x=100, y=20, label= lm_eqn2(cpue %>% filter(block==1)), parse=T)+
annotate("text", x=200, y=30, label= lm_eqn2(cpue %>% filter(block==2)), parse=T)+
annotate("text", x=300, y=40, label= lm_eqn2(cpue %>% filter(block==3)), parse=T)
When I try to view the equation for each line with the following code:
lm_eqn2(cpue %>% filter(block==2))
it returns the same equation for each blocking number that I filter it by. This makes me think there's something wrong with the code that I made the model and the equation with? The only thing different (that I can tell) from the linked question is that my model has a blocking variable. Not sure if that would actually affect anything though.
Any help would be greatly appreciated.

You have a few problems here.
Firstly, it isn't good practice to use the same name for the data frame and a column within it. It makes lines like lmer(cpue ~ veg + (1|block), cpue); and ggplot(cpue, aes(x=veg, y=cpue, col=block))+ confusing to read.
More importantly, because you refer to the data frame cpue inside your function, the function doesn't care what you pass to it later. m2 <- lmer(cpue ~ veg + (1|block), cpue); fits the same model to the full data every time, hence the same equation is produced. cpue %>% filter(block==2) is ignored as an argument because df2 is never used inside your function. So you need something like this:
lm_eqn2 <- function(df2){
m2 <- lmer(cpue ~ veg + (1|block), df2); ## note the change to df2 here
eq2 <- substitute(italic(CPUE) == a + b*","~~italic(r)^2~"="~r2,
list(a = format(unname(coef(m2)[1]), digits = 4),
b = format(unname(coef(m2)[2]), digits = 2),
r2 = format(summary(m2)$r.squared, digits = 3)))
as.character(as.expression(eq2));
}
Also note that m and eq were not found (in your original code), so I changed them to m2 and eq2 respectively.
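The scoping problem is easier to see in a stripped-down form. The following is only a minimal sketch with made-up names (`values`, `mean_global` and `mean_arg` are not from the question): a function that refers to a global object instead of its argument returns the same result whatever you pass in.

```r
# A global object, playing the role of the full cpue data frame
values <- c(1, 2, 3)

# Bug: the body uses the global `values`, so the argument x is ignored
mean_global <- function(x) mean(values)

# Fix: the body uses the argument that was passed in
mean_arg <- function(x) mean(x)

mean_global(c(10, 20))  # 2, the mean of the global vector
mean_arg(c(10, 20))     # 15, the mean of the argument
```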
Calling the corrected function on one of the filtered data frames gives the error:
Error: grouping factors must have > 1 sampled level
which makes sense, because you've fit block as a random intercept in your model code, yet you are filtering your data by the blocking factor. So there is only one "type" of blocking factor in each of the lines cpue %>% filter(block==1), cpue %>% filter(block==2), and cpue %>% filter(block==3). That means there is no information added to your regression when you use (1|block), since block is now a constant.
You might want to explain what you are hoping to do with this blocking factor. Some relevant posts: https://stats.stackexchange.com/q/4700/238878 and https://stats.stackexchange.com/q/31569/238878
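If the goal is simply the slope and r^2 of each line that geom_smooth(method="lm") draws, note that those lines are separate ordinary lm() fits per block, so one option is to fit lm() to each block's rows and leave out the random effect. This does not "control for" block in the mixed-model sense; it only reproduces the three plotted lines. A minimal sketch, assuming a toy data frame `dat` with simulated values and a hypothetical helper `fit_block` (neither comes from the question):

```r
# Toy data standing in for the question's data frame (values are simulated)
set.seed(1)
dat <- data.frame(
  block = factor(rep(1:3, each = 20)),
  veg   = runif(60, min = 100, max = 500)
)
dat$cpue <- 30 + 0.02 * dat$veg + rnorm(60, sd = 5)

# Hypothetical helper: ordinary lm() on the rows it is given
fit_block <- function(df) {
  m <- lm(cpue ~ veg, data = df)
  data.frame(
    intercept = unname(coef(m)[1]),
    slope     = unname(coef(m)[2]),
    r2        = summary(m)$r.squared
  )
}

# One fit per block; row names of the result are the block levels
per_block <- do.call(rbind, lapply(split(dat, dat$block), fit_block))
per_block
```

Each row of per_block could then be formatted with the same substitute() approach and passed to annotate().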

Related

Why am I getting the same result from Anova and aov_car in R, and why is it different from SPSS?

I am conducting a reanalysis of some data. The DV is continuous (beta value, i.e. neural activity) and the IV is categorical (position) with three levels (1, 2, 3). Position is set as a factor. It is repeated measures, and there are 126 observations. The original analysis was done in SPSS, and I am trying to replicate those results with R.
I don't understand how to make a MRE of this, so my data from dput is at the bottom.
My ANOVA results are different from those reported in the original paper (the data is identical). Specifically, they reported F(2,82) = 18.262, p = 0.00, but my table (below) is totally different. I used the Anova function, and now get the impression that I should be using aov_car but the output is the same between the two.
> Anova(lm(Beta ~ Position, data = stack_ex))
Anova Table (Type II tests)
Response: Beta
           Sum Sq  Df F value Pr(>F)
Position    60.57   2  1.5213 0.2225
Residuals 2448.70 123
> aov_car(Beta ~ Position + Error(Beta), data = stack_ex)
Contrasts set to contr.sum for the following variables: Position
Anova Table (Type 3 tests)
Response: Beta
    Effect     df   MSE    F  ges p.value
1 Position 2, 123 19.91 1.52 .024    .222
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1
I didn't know what to use as the error term, so I put Beta. Is this the issue?
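For what it's worth, in a repeated-measures model the Error() term names the subject identifier rather than the dependent variable, and the reported F(2, 82) is what three positions measured on 42 subjects would give (126 / 3 = 42). The sketch below only illustrates the difference in degrees of freedom. It assumes a hypothetical subject column `id`, which is not in the posted data, and uses simulated values with base R's aov().

```r
# Simulated repeated-measures data: 42 subjects x 3 positions (not the real data)
set.seed(42)
n_subj <- 42
toy <- data.frame(
  id       = factor(rep(seq_len(n_subj), times = 3)),  # hypothetical subject id
  Position = factor(rep(1:3, each = n_subj))
)
subj_effect <- rnorm(n_subj, sd = 4)
toy$Beta <- 5 + subj_effect[as.integer(toy$id)] -
  0.8 * as.integer(toy$Position) + rnorm(nrow(toy), sd = 1)

# Between-subjects analysis, as in Anova(lm(...)): residual df = 123
between_tab <- anova(lm(Beta ~ Position, data = toy))

# Within-subjects analysis: Position is tested in the id:Position stratum,
# which has residual df = (3 - 1) * (42 - 1) = 82
rm_tab <- summary(aov(Beta ~ Position + Error(id / Position), data = toy))
rm_tab
```

If the real data has such an identifier, the same Error(id/Position) term should be usable in aov_car as well.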
Here is the data, apologies for not being able to reduce it down.
> dput(stack_ex)
structure(list(Beta = c(9.97627322506813, 4.51007015616003, 12.5899137493145,
5.16107528902195, 0.69934803628816, 3.05576441003722, 9.73415586595716,
3.48253752326239, 8.72271400892749, 6.58254223482513, 9.64252595049282,
6.2575247824253, 5.55086088416984, 3.26575073266956, -0.189486641765607,
-6.34627220217585, 3.03699535774724, 5.38452950644857, 7.2247809046584,
1.05684383099248, 0.745997758871227, 13.4708766693015, 6.22313273382721,
7.60691743953363, 7.95869706610072, 0.0733745510036445, 5.74455260852637,
9.10243217750976, 3.83463985621549, 6.51540068169028, 6.74657874951813,
9.06748922888841, 4.18661204617864, 8.13865720827057, 4.97289378228525,
4.79399790512039, -12.5433736154914, 3.22520674616528, 4.83924807559523,
6.89780284608954, 2.01175994751707, 1.58936731656692, 8.65646845487533,
2.03332866864119, 6.59573013233866, 4.35624613417537, 3.22584501764675,
3.01812749198894, 8.67739700219412, 5.14273744714805, 7.54959191256081,
7.83244934217214, 8.67126128885367, 3.99955715822518, 2.95804569815409,
2.25327292671231, 0.258342636171449, -6.87648408967595, 1.9848049507549,
2.45033479610578, 7.41525416520838, 1.11896377050173, 0.0698315480648937,
9.90975895502056, 5.03717210651178, 4.67127493715398, 7.90306051043896,
3.0618932143297, 5.43781266582611, 8.9383987897543, 4.7982992164727,
6.90576740201611, 4.43862196057089, 9.06484925843098, 3.35645527138813,
5.42103905597134, 2.32859166774007, 3.65962841104834, -11.716124636774,
7.15256990819002, 4.02640955184303, 7.10747478179406, 2.81026958853589,
1.21494403713035, 9.06256308202033, 2.40170878761068, 6.45729748790901,
4.88232212084591, 1.55722661655526, 3.09556060018938, 6.6629967466337,
4.38848062553557, 4.38871083406173, 6.40367918458127, 6.361735558817,
4.21279189431753, 2.08838813524482, 2.21632202746396, -0.491401226521853,
-7.3685373528786, 2.12839354041543, 4.22958686769682, 4.25606944426722,
0.330400668298046, 1.02776552933976, 10.6734745608271, 3.01238218831987,
4.03318609054561, 6.45849154079659, 0.45593329021199, 5.76390726591623,
7.21202360734704, 4.62140561321984, 3.72714943200746, 5.49911004676976,
9.15658405382221, 3.25231083403689, 3.67627240704932, 3.48390458422993,
2.98674297337782, -19.5189775914798, 2.59812967326379, 2.78334604762499,
3.70635047793331, -0.223282095324164, 2.17552096286021), Position = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1",
"2", "3"), class = "factor")), row.names = c(NA, -126L), class = c("tbl_df",
"tbl", "data.frame"))

Setting up an Mlogit in R with many observations for each category

I'm trying to use mlogit in R. I'm a little new to logit models, and I'm having trouble setting up my problem in the mlogit framework. I'm actually not entirely sure that mlogit is the right approach. Here is an analogous problem.
Consider a baseball dataset, with an outcome variable that takes on "out" "single" "double" "triple" and "homerun." For explanatory variables, we have the name of the batter, the name of the pitcher, and the stadium. There are hundreds of observations for each batter, including many with the batter facing the same pitcher.
I figured this is definitely a multinomial logit because I have multiple categorical outcomes, but I am not sure because all of the documentation seems to be dealing with "choices" between alternatives, which this isn't really. I tried to start my logit model by having a factor variable for the hitter, another one for the pitcher, and another one for the stadium. When I tried this in R, I get
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : invalid 'row.names' length
With some googling I think maybe it is expecting only one observation for each combination of hitter, pitcher, and park? Maybe not? What am I doing wrong? How should I set this up?
Edit:
Example of data here
https://docs.google.com/spreadsheets/d/19fiq_QEMj4nAPcTqIRxeaYNPgqeHxKAEuPrfHMeIJ7o/edit?usp=sharing
Here are some suggestions on how you can start analyzing your data.
# Your dataset
dts <- structure(list(outcome = c(1L, 1L, 2L, 3L, 1L, 3L, 2L, 3L, 3L,
3L, 3L, 1L, 2L, 2L, 2L, 1L, 3L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L,
2L, 2L, 1L, 1L, 2L, 3L, 2L, 3L, 1L, 2L, 2L, 3L, 2L, 3L, 3L, 3L,
2L, 1L, 1L, 1L, 2L, 3L, 2L, 1L), hitter = structure(c(3L, 3L,
3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 2L, 2L, 2L, 1L, 1L, 1L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("james",
"jill", "john"), class = "factor"), pitcher = structure(c(3L,
3L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 3L, 1L, 1L,
2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 3L, 2L, 1L, 2L, 3L, 2L,
3L, 2L, 1L, 1L, 2L, 2L, 1L, 3L, 3L, 1L, 2L, 2L, 1L, 1L, 2L, 2L
), .Label = c("bill", "bob", "brett"), class = "factor"), place = structure(c(3L,
3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 5L,
5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L
), .Label = c("ca", "co", "dc", "ny", "tn"), class = "factor")), .Names = c("outcome",
"hitter", "pitcher", "place"), class = "data.frame", row.names = c(NA,
-49L))
# Estimation of a multinomial logistic regression model
library(mlogit)
dts.wide <- mlogit.data(dts, choice="outcome", shape="wide")
fit.mlogit <- mlogit(outcome ~ 1 | hitter+pitcher+place, data=dts.wide)
# Results
library(stargazer)
stargazer(fit.mlogit, type="text")
# Model coefficients with standard errors and statistical significance (stars)
==========================================
                   Dependent variable:
               ---------------------------
                         outcome
------------------------------------------
2:(intercept)            19.456
                      (3,056.626)
3:(intercept)            35.179
                      (4,172.540)
2:hitterjill            -17.543
                      (3,056.625)
3:hitterjill            -33.117
                      (4,172.540)
2:hitterjohn             -0.188
                        (0.996)
3:hitterjohn             -1.410
                        (1.056)
2:pitcherbob             -0.070
                        (1.005)
3:pitcherbob             -1.270
                        (1.091)
2:pitcherbrett           -0.908
                        (1.063)
3:pitcherbrett           -2.284*
                        (1.257)
2:placeco                -1.655
                        (1.557)
3:placeco               -17.688
                      (2,840.270)
2:placedc               -19.428
                      (3,056.626)
3:placedc               -34.479
                      (4,172.540)
2:placeny               -18.802
                      (3,056.625)
3:placeny               -32.873
                      (4,172.540)
2:placetn               -18.885
                      (3,056.626)
3:placetn               -32.140
                      (4,172.540)
------------------------------------------
Observations               49
R2                        0.155
Log Likelihood           -44.605
LR Test             16.388 (df = 18)
==========================================
Note:          *p<0.1; **p<0.05; ***p<0.01
More details on the estimation of multinomial logistic models in R are available here.
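Since the outcome here is not really a "choice" between alternatives, another option worth considering is multinom() from the nnet package, which fits a multinomial logit directly from one row per observation without any reshaping. This is a different function from mlogit, shown only as a sketch. The toy data below is simulated and the column names just mirror the example above.

```r
library(nnet)  # recommended package that ships with standard R installs

# Simulated plate appearances (not the real data)
set.seed(7)
n <- 200
toy <- data.frame(
  outcome = factor(sample(c("out", "single", "double"), n, replace = TRUE,
                          prob = c(0.6, 0.3, 0.1)),
                   levels = c("out", "single", "double")),
  hitter  = factor(sample(c("james", "jill", "john"), n, replace = TRUE)),
  pitcher = factor(sample(c("bill", "bob", "brett"), n, replace = TRUE))
)

# "out" is the reference outcome; repeated hitter/pitcher pairs are fine
fit <- multinom(outcome ~ hitter + pitcher, data = toy, trace = FALSE)
coef(fit)  # one row of coefficients per non-reference outcome

# Predicted outcome probabilities for one hitter/pitcher match-up
new_pa <- data.frame(
  hitter  = factor("john", levels = levels(toy$hitter)),
  pitcher = factor("bob",  levels = levels(toy$pitcher))
)
probs <- predict(fit, newdata = new_pa, type = "probs")
probs
```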

Pairwise analyse at once in r

I have data as follows. For each site I have a number of different measurements (value1, value2, value3). My goal is to perform, for example, a Bartlett test for all possible pairs of sites on every variable (site_id 1 vs site_id 2 for all the values, site_id 1 vs site_id 3, and so on).
Could you please show me how to do this in an automated way? Choosing pairs with subset or %in% is quite time-consuming and seems to be the wrong way.
pair1 = subset(mydata, site_id==1 | site_id==2)
pair2 = subset(mydata, site_id==1 | site_id==3)
etc...
DATA
dput(el)
structure(list(nr = 1:62, site_id = c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), value1 = c(0.135956723, 0.244470396,
0.986831591, 0.272748803, 0.089672362, 0.087918874, 0.29432428,
0.281550906, 0.491512301, 0.202822283, 0.636965524, 0.439072133,
0.512626669, 0.076218623, 0.537676093, 0.410301432, 0.704414491,
0.028086268, 0.934842257, 0.319693894, 0.038503085, 0.724755387,
0.933940599, 0.293119698, 0.206668204, 0.931947832, 0.570267962,
0.153459278, 0.761549617, 0.168553595, 0.125666771, 0.072239583,
0.585168488, 0.434769948, 0.693265848, 0.507971072, 0.784221012,
0.625158967, 0.734257194, 0.745229936, 0.40953356, 0.070758169,
0.468803818, 0.482476343, 0.329618097, 0.690907203, 0.043867132,
0.335846451, 0.910523185, 0.337186798, 0.94565722, 0.468518602,
0.269354849, 0.357422627, 0.660574954, 0.636926103, 0.558315665,
0.489907305, 0.47082103, 0.808036842, 0.80682936, 0.486316865
), value2 = c(0.072786841, 0.53838031, 0.41372062, 0.927891345,
0.681514932, 0.099571511, 0.356290822, 0.22791718, 0.222255425,
0.274876628, 0.215780917, 0.679079775, 0.557144492, 0.768317182,
0.209794907, 0.756651704, 0.950439091, 0.394732921, 0.477008544,
0.248762115, 0.452692267, 0.479918885, 0.617401621, 0.107246095,
0.968902896, 0.581772822, 0.654269288, 0.2403724, 0.309798716,
0.305768959, 0.184387495, 0.035095852, 0.513505392, 0.976717695,
0.713275402, 0.948746684, 0.44320735, 0.222039163, 0.440820346,
0.914348945, 0.824638633, 0.392305879, 0.711367921, 0.013197053,
0.990004958, 0.46783633, 0.368384378, 0.105245106, 0.01894147,
0.351691108, 0.689240176, 0.281890828, 0.643299941, 0.295450072,
0.929042677, 0.451298968, 0.087512416, 0.367461399, 0.101109718,
0.388519279, 0.886552629, 0.371934921), value3 = c(0.862942279,
0.306199206, 0.815403468, 0.120029065, 0.120468166, 0.97214058,
0.605333252, 0.381385396, 0.501217425, 0.159266606, 0.712387132,
0.532604745, 0.581300843, 0.764953483, 0.833804202, 0.576785884,
0.739833632, 0.894288301, 0.533339352, 0.454653122, 0.141139261,
0.820376994, 0.804809068, 0.097680334, 0.286965944, 0.610407569,
0.084827216, 0.428986455, 0.080766377, 0.435308821, 0.93199262,
0.453242669, 0.106639551, 0.191650525, 0.807339195, 0.53331683,
0.101494804, 0.952323476, 0.243649472, 0.903883695, 0.265602323,
0.364928386, 0.239852295, 0.388701845, 0.964790214, 0.031507745,
0.922879901, 0.419279331, 0.923975616, 0.370413352, 0.159053801,
0.450200201, 0.262717668, 0.258232936, 0.604593393, 0.625352584,
0.086596067, 0.876201214, 0.95281149, 0.728431032, 0.232121342,
0.53337486)), .Names = c("nr", "site_id", "value1", "value2",
"value3"), row.names = c(NA, -62L), class = "data.frame")
This is probably not very efficient, but it does what you need.
First we create a matrix with all possible pairs of site_id values. We then create a list with one subsetted data frame per pair. Finally we apply the test to each data frame in the list for all value columns.
# All pairs of site ids, one pair per column
m1 <- combn(unique(el$site_id), 2)
# One data frame per pair
l2 <- lapply(seq_len(ncol(m1)), function(i) el[el$site_id %in% m1[, i], ])
# Bartlett test for each value column (nr and site_id are skipped)
final.list <- lapply(l2, function(d) sapply(d[c("value1", "value2", "value3")], function(v) bartlett.test(v, d$site_id)))
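If only the p-values are needed, the same idea can be collected into a single data frame with one row per pair and variable. This is a rough sketch on simulated data; `toy`, `site_pairs` and `res` are made-up names, and the group sizes are chosen arbitrarily.

```r
# Simulated data with the same layout as el (values are random)
set.seed(3)
toy <- data.frame(
  site_id = rep(1:3, times = c(26, 26, 10)),
  value1  = runif(62),
  value2  = runif(62)
)
value_cols <- c("value1", "value2")

# All pairs of sites, one pair per column
site_pairs <- combn(unique(toy$site_id), 2)

# One block of rows per pair: a p-value for each value column
res <- do.call(rbind, lapply(seq_len(ncol(site_pairs)), function(k) {
  d <- toy[toy$site_id %in% site_pairs[, k], ]
  data.frame(
    site_a   = site_pairs[1, k],
    site_b   = site_pairs[2, k],
    variable = value_cols,
    p.value  = sapply(d[value_cols],
                      function(v) bartlett.test(v, d$site_id)$p.value),
    row.names = NULL
  )
}))
res
```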

Looping through class ltraj?

Apologies if this is not the best forum to ask this question.
Has anyone been able to loop/iterate through multiple GPS collared individuals that are in class ltraj (adehabitatLT)?
I've been trying to calculate Prox (https://cran.r-project.org/web/packages/wildlifeDI/vignettes/wildlifeDI-vignette.pdf) for multiple individuals but am struggling with how exactly to loop class ltraj because it's different than dataframe (which I'm accustomed to).
Thanks in advance.
install.packages('wildlifeDI', dependencies=TRUE)
library(wildlifeDI)
library(adehabitatLT)
chupacabra <- structure(list(CollarID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L), .Label = c("A4116F", "A4117M", "A4118F"), class = "factor"),
DateTime = structure(c(1433653200, 1433667600, 1433682060,
1433682300, 1433682600, 1433682900, 1433683200, 1433683500,
1433683800, 1433684100, 1433684400, 1433684700, 1433685000,
1433685300, 1433685600, 1433685900, 1433686200, 1433686500,
1433686800, 1433687100, 1433687400, 1433687700, 1433688000,
1433688300, 1433688600, 1433688900, 1433689200, 1433689500,
1433689800, 1433690100, 1433690400, 1433690700, 1433691000,
1433691300, 1433691600, 1433691900, 1433692200, 1433692500,
1433692800, 1433693100, 1433693400, 1433693700, 1433694000,
1433694300, 1433694600, 1433694900, 1433695200, 1433695500,
1433695800, 1433696100, 1433696400, 1433710860, 1433714400,
1433714700, 1433715000, 1433715300, 1433715600, 1433715900,
1433716200, 1433716500, 1433716800, 1433717100, 1433717400,
1433717700, 1433718000, 1433718300, 1433718600, 1433718900,
1433719200, 1433719500, 1433719800, 1433720100, 1433720400,
1433720700, 1433721000, 1433721300, 1433721600, 1433721900,
1433722200, 1433722500, 1433722800, 1433723100, 1433723400,
1433723700, 1433724060, 1433724300, 1433724600, 1433724900,
1433725200, 1433653200, 1433667660, 1433682060, 1433682300,
1433682600, 1433682900, 1433683200, 1433683500, 1433683800,
1433684100, 1433684400, 1433684700, 1433685000, 1433685300,
1433685660, 1433685900, 1433686200, 1433686500, 1433686800,
1433687100, 1433687400, 1433687700, 1433688000, 1433688300,
1433688660, 1433688900, 1433689200, 1433689500, 1433689800,
1433690100, 1433690400, 1433690700, 1433691000, 1433691300,
1433691600, 1433691900, 1433692200, 1433692500, 1433692800,
1433693100, 1433693400, 1433693700, 1433694060, 1433694300,
1433694600, 1433694900, 1433695200, 1433695500, 1433695800,
1433696100, 1433696400, 1433710860, 1433714400, 1433714700,
1433715000, 1433715300, 1433715600, 1433715900, 1433716200,
1433716500, 1433716800, 1433717100, 1433717400, 1433717700,
1433718000, 1433718300, 1433718600, 1433718900, 1433719200,
1433719500, 1433719800, 1433720100, 1433720400, 1433720700,
1433721000, 1433721300, 1433721600, 1433721900, 1433722200,
1433722500, 1433722800, 1433723100, 1433723400, 1433723700,
1433724000, 1433724300, 1433724600, 1433724900, 1433725200,
1433653249, 1433667666, 1433682089, 1433682349, 1433682632,
1433682936, 1433683234, 1433683536, 1433683837, 1433684144,
1433684443, 1433684739, 1433685031, 1433685370, 1433685634,
1433685935, 1433686236, 1433686536, 1433686826, 1433687142,
1433687448, 1433687736, 1433688034, 1433688337, 1433688649,
1433688936, 1433689236, 1433689531, 1433689827, 1433690139,
1433690433, 1433690736, 1433691048, 1433691336, 1433691634,
1433691941, 1433692236, 1433692535, 1433692833, 1433693129,
1433693434, 1433693735, 1433694028, 1433694373, 1433694642,
1433694931, 1433695234, 1433695542, 1433695831, 1433696148,
1433696448, 1433710908, 1433714437, 1433714737, 1433715036,
1433715366, 1433715636, 1433715969, 1433716234, 1433716536,
1433716827, 1433717137, 1433717435, 1433717733, 1433718048,
1433718336, 1433718636, 1433718973, 1433719272, 1433719530,
1433719837, 1433720136, 1433720431, 1433720736, 1433721031,
1433721336, 1433721640, 1433721946, 1433722236, 1433722528,
1433722842, 1433723137, 1433723434, 1433723730, 1433724035,
1433724370, 1433724634, 1433724936, 1433725236), class = c("POSIXct",
"POSIXt"), tzone = ""), UTM_X = c(636979.2503, 636977.6583,
637402.4471, 637400.3063, 637402.3105, 637407.1977, 637406.3305,
637408.2991, 637407.1907, 637407.8414, 637406.7617, 637407.1614,
637409.8019, 637431.5235, 637465.9644, 637495.9583, 637525.2219,
637573.6033, 637645.3501, 637683.3844, 637691.6229, 637693.4815,
637693.4973, 637691.2483, 637691.9061, 637693.6377, 637692.1106,
637692.3169, 637690.9989, 637691.4503, 637693.6252, 637692.4915,
637694.9434, 637692.6685, 637692.8116, 637694.6787, 637694.4404,
637695.9109, 637696.8945, 637695.2403, 637695.4283, 637694.6085,
637693.4962, 637695.6229, 637734.7283, 637773.2897, 637774.9891,
637787.6573, 637792.285, 637807.0486, 637834.6231, 637497.3348,
637149.9982, 637145.0345, 637178.159, 637181.8251, 637181.1075,
637178.023, 637175.327, 637179.9138, 637180.2833, 637181.5512,
637185.8749, 637181.0011, 637177.401, 637177.4498, 637176.787,
637176.0093, 637175.5126, 637177.9578, 637178.5819, 637188.3911,
637188.7303, 637189.496, 637204.3885, 637195.2063, 637204.9823,
637201.5235, 637212.3355, 637274.4294, 637293.0009, 637296.3954,
637331.3382, 637358.4369, 637365.1677, 637357.5562, 637355.3896,
637345.4827, 637339.1054, 628920.3789, 628869.9781, 630028.6781,
630156.4557, 629878.756, 629658.9786, 629412.6432, 629257.5965,
629405.8967, 629113.4479, 628955.5124, 628852.0231, 628711.9202,
628632.7134, 628621.7724, 628622.2565, 628683.6018, 628771.1182,
628790.8437, 628867.7592, 628881.9794, 628830.9898, 628681.9202,
628575.3395, 628578.1836, 628656.4902, 628659.2271, 628656.689,
628660.4677, 628657.294, 628657.077, 628689.6585, 628727.0131,
628716.6979, 628703.8397, 628678.6953, 628679.3594, 628681.3549,
628625.6275, 628563.1372, 628488.425, 628482.5023, 628469.2209,
628417.9697, 628407.7352, 628405.374, 628393.143, 628394.0092,
628396.2344, 628395.05, 628395.7787, 627684.7989, 627704.889,
627702.5528, 627702.0422, 627708.7906, 627706.9374, 627687.0371,
627622.0573, 627605.7932, 627603.5707, 627587.8803, 627606.0471,
627603.2967, 627602.954, 627603.5844, 627604.1232, 627601.697,
627581.6104, 627599.7062, 627616.327, 627661.7402, 627889.446,
627883.5896, 627803.1167, 627792.5918, 627716.0886, 627720.8854,
627671.8217, 627666.9994, 627586.7035, 627584.4273, 627532.492,
627502.6326, 627430.6781, 627408.8845, 627357.5049, 627406.0466,
627427.1382, 636666.3215, 636629.7032, 637179.9041, 637187.7067,
637183.5281, 637193.2082, 637227.2331, 637290.2543, 637347.9311,
637373.0887, 637368.8923, 637371.0722, 637383.95, 637480.1799,
637510.543, 637558.428, 637676.2714, 637682.3564, 637680.8591,
637682.8516, 637680.8317, 637680.8341, 637681.9818, 637681.2897,
637681.3658, 637681.9234, 637681.8824, 637682.0629, 637684.8756,
637681.602, 637682.7548, 637680.8578, 637682.9887, 637680.2496,
637681.4629, 637682.3731, 637682.2223, 637684.1076, 637681.7127,
637681.1249, 637681.6758, 637681.595, 637682.5253, 637702.3094,
637728.9487, 637784.0853, 637776.5727, 637785.2538, 637786.6413,
637807.9935, 637834.8672, 637485.5191, 637148.5674, 637139.2974,
637174.9104, 637191.9371, 637179.4262, 637175.7715, 637176.3455,
637174.5459, 637174.2012, 637173.7462, 637177.3967, 637176.6907,
637177.8458, 637178.0774, 637178.4151, 637178.3272, 637178.2442,
637177.6655, 637176.734, 637186.2713, 637185.0998, 637197.4201,
637197.9147, 637204.1485, 637203.1784, 637204.4993, 637205.3515,
637279.9058, 637303.773, 637303.5724, 637330.3473, 637354.416,
637366.5627, 637340.7274, 637357.5505, 637350.709, 637349.689
), UTM_Y = c(3365828.581, 3365826.066, 3364992.673, 3364991.006,
3364989.036, 3364990.816, 3364989.486, 3364991.849, 3364991.37,
3364990.059, 3364989.58, 3364991.403, 3364991.536, 3364985.614,
3365030.733, 3365054.446, 3365091.064, 3365138.444, 3365289.033,
3365390.111, 3365398.839, 3365387.124, 3365390.427, 3365386.696,
3365387.104, 3365379.344, 3365386.131, 3365388.805, 3365385.152,
3365385.158, 3365386.394, 3365385.637, 3365385.48, 3365386.071,
3365385.397, 3365387.416, 3365387.269, 3365389.505, 3365389.971,
3365387.833, 3365389.676, 3365390.685, 3365385.96, 3365384.934,
3365352.152, 3365369.878, 3365376.795, 3365390.013, 3365382.689,
3365382.189, 3365410.939, 3365683.847, 3365620.829, 3365574.121,
3365527.084, 3365501.513, 3365502.801, 3365512.739, 3365514.733,
3365512.885, 3365511.016, 3365512.562, 3365510.255, 3365511.235,
3365509.494, 3365509.439, 3365509.431, 3365509.388, 3365510.678,
3365509.534, 3365511.083, 3365511.85, 3365514.659, 3365513.371,
3365525.476, 3365526.036, 3365529.429, 3365528.676, 3365513.172,
3365507.793, 3365514.623, 3365512.105, 3365504.477, 3365512.401,
3365495.238, 3365490.863, 3365441.075, 3365411.542, 3365403.003,
3371496.516, 3371594.382, 3370587.966, 3370380.241, 3370270.012,
3370346.817, 3370433.295, 3370488.189, 3370225.222, 3370122.896,
3370174.202, 3370232.298, 3370192.371, 3370255.722, 3370283.548,
3370283.21, 3370305.674, 3370344.002, 3370354.4, 3370348.973,
3370200.353, 3370078.071, 3370123.589, 3370194.686, 3370393.878,
3370500.265, 3370498.635, 3370498.882, 3370497.663, 3370499.687,
3370500.172, 3370633.763, 3370704.904, 3370839.426, 3370879.943,
3370950.842, 3370957.988, 3370963, 3371031.496, 3371082.487,
3371109.89, 3371112.17, 3371118.807, 3371167.072, 3371168.581,
3371170.127, 3371178.074, 3371177.097, 3371178.11, 3371176.777,
3371178.482, 3371566.662, 3371622.632, 3371621.252, 3371619.772,
3371623.975, 3371627.245, 3371636.71, 3371612.734, 3371598.776,
3371590.192, 3371636.009, 3371656.352, 3371656.719, 3371656.471,
3371656.755, 3371659.1, 3371656.401, 3371688.243, 3371717.065,
3371741.492, 3371755.505, 3371618.156, 3371595.308, 3371615.82,
3371560.55, 3371552.166, 3371572.884, 3371547.544, 3371530.616,
3371559.755, 3371591.63, 3371612.877, 3371657.663, 3371727.149,
3371739.263, 3371823.645, 3371912.149, 3371969.549, 3366104.602,
3365712.344, 3365494.627, 3365496.045, 3365484.575, 3365475.02,
3365485.304, 3365467.377, 3365477.805, 3365507.809, 3365510.682,
3365519.888, 3365527.19, 3365491.394, 3365490.37, 3365468.274,
3365393.413, 3365389.355, 3365386.964, 3365391.977, 3365389.125,
3365388.937, 3365389.35, 3365389.375, 3365387.159, 3365387.133,
3365386.578, 3365386.735, 3365386.161, 3365387.472, 3365387.487,
3365387.064, 3365387.977, 3365385.016, 3365387.836, 3365388.036,
3365387.048, 3365389.909, 3365387.074, 3365384.939, 3365387.717,
3365388.026, 3365388.16, 3365385.728, 3365344.996, 3365374.693,
3365377.679, 3365387.866, 3365389.823, 3365391.779, 3365410.631,
3365698.174, 3365622.297, 3365571.954, 3365511.957, 3365510.265,
3365505.086, 3365509.196, 3365512.44, 3365513.438, 3365508.777,
3365509.049, 3365509.838, 3365506.403, 3365507.748, 3365510.711,
3365509.075, 3365507.666, 3365508.152, 3365505.285, 3365498.401,
3365508.531, 3365508.483, 3365513.538, 3365520.783, 3365519.376,
3365523.92, 3365529.634, 3365529.866, 3365498.661, 3365512.941,
3365509.801, 3365503.056, 3365513.548, 3365502.683, 3365482.215,
3365438.852, 3365412.317, 3365406.363)), .Names = c("CollarID",
"DateTime", "UTM_X", "UTM_Y"), row.names = c(NA, -267L), class = "data.frame")
# Commenting out the incorrect transformation:
# chupacabra$DateTime <-as.POSIXct(strptime(chupacabra$DateTime, format='%m/%d/%Y %H:%M:%S'),origin='1970-01-01')
chupacabra2<-as.ltraj(chupacabra[, c("UTM_X","UTM_Y")], date=chupacabra$DateTime,id=chupacabra$CollarID, typeII=TRUE)
monster1<-chupacabra2[1] #extract the first chupacabra
monster2<-chupacabra2[2] #extract the second chupacabra
proxdf <-Prox(monster1,monster2, tc=0.5*60,dc=210, local =TRUE)
Here's a sample of chupacabras that we've been tracking. We'd like to examine how often they interact with each other. This dataset has 3 individuals (but we have many more chupacabras) and it is inefficient to pull out the animals one by one to calculate proximity. I'd like to do a for loop (for i in unique ID, perhaps) but I don't understand how to do this when the data is in ltraj format. Any assistance would be appreciated.
After removing the harmful code that incorrectly reformatted the DateTime value, I plotted the "trajectory" for the three two-animal combinations. Clearly animal 1 did not interact with animal 2, since their ranges were disjoint. Animals 1 and 3 do appear to interact, since about halfway through their joint sojourns they have roughly the same trajectory:
plot( chupacabra2[[1]]$x, chupacabra2[[1]]$y, type="l",
xlim=range( c(chupacabra2[[1]]$x, chupacabra2[[3]]$x)),
ylim= range(c(chupacabra2[[1]]$y, chupacabra2[[3]]$y)))
lines( chupacabra2[[3]]$x, chupacabra2[[3]]$y, col="red")
This appears in proxdf13 as:
> monster1<-chupacabra2[1]
>
> monster3<-chupacabra2[3]
>
> proxdf13 <-Prox(monster1,monster3, tc=0.5*60,dc=210, local =TRUE)
> proxdf13
date prox
1 2015-06-07 06:01:00 549.074863
2 2015-06-07 07:20:00 104.169909
3 2015-06-07 08:10:00 6.205875
4 2015-06-07 09:05:00 14.409016
5 2015-06-07 09:20:00 11.189309
6 2015-06-07 15:40:00 6.481131
7 2015-06-07 16:25:00 4.259042
8 2015-06-07 17:15:00 10.648210
9 2015-06-07 17:35:00 4.181297
10 2015-06-07 17:41:00 7.574566
So my guess is that "interact" might mean something along the lines of "is within 15 distance units for more than 2 days in succession". A natural function to consider is rle, which counts runs of consecutive identical values:
> rle( proxdf13$prox < 15 )
Run Length Encoding
lengths: int [1:2] 2 8
values : logi [1:2] FALSE TRUE
> RL13 <- rle( proxdf13$prox < 15 )
> max( RL13$lengths [ RL13$values] )
[1] 8
And test whether this is greater than some value, say 2?
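A minimal sketch of that test in base R, using made-up distances rather than the actual Prox output. The helper name longest_close_run and its default threshold are assumptions for illustration, and the sketch assumes there are no missing distances. It returns 0 rather than -Inf when no observation falls within range.

```r
# Hypothetical helper: length of the longest run of consecutive
# observations whose distance is below `dist_max`.
longest_close_run <- function(prox, dist_max = 15) {
  runs <- rle(prox < dist_max)
  close_runs <- runs$lengths[runs$values]
  # max() of an empty vector gives -Inf with a warning, so return 0 instead
  if (length(close_runs) == 0) 0L else max(close_runs)
}

# Made-up distances: two far observations followed by eight close ones
prox_example <- c(549, 104, 6, 14, 11, 6, 4, 11, 4, 8)

# Does the longest close run exceed the chosen cutoff of 2?
longest_close_run(prox_example) > 2
```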
So that was the method to handle a single two-way animal-animal combination. The comments of the questioner below suggest I may have lost him in the material that follows. To get the two-way combinations of a sequence, use combn:
> combn(1:3, 2)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 3 3
That is a matrix that will be used to generate the index values for pulling two single-animal "trajectory" dataframes at a time from the 'ltraj' object, using each column separately. The apply function can "loop" over either the rows or the columns of a matrix, and passing 2 as the margin argument to apply specifies the columns.
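As a small illustration of looping over columns, the sketch below only builds a text label for each pair, so it runs in base R without the tracking data. The label format is made up for the example.

```r
pairs <- combn(1:3, 2)   # one column per two-way combination

# MARGIN = 2 calls the function once per column; x holds one pair of indices
labels <- apply(pairs, 2, function(x) paste0("animal", x[1], "-animal", x[2]))
labels
```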
So putting this all together (using apply to loop over the column indices of combn-result to get the two-way combinations):
( max.days.prox <- apply( combn( seq(length(chupacabra2)), 2 ), 2,
# loops over columns of the "combinations matrix"
function(x) {
proxcomb <- Prox( chupacabra2[ x[1] ], chupacabra2[ x[2] ],
tc=0.5*60,dc=210, local =TRUE)
RLcomb <- rle( proxcomb$prox < 15 )
Interact.days <- max( RLcomb$lengths [ RLcomb$values] ) } ) )
# [1] -Inf 8 -Inf
We rbind that result to the combinations matrix to look at the items of interest:
> rbind(combn( seq(length(chupacabra2)), 2 ) , max.days.prox)
[,1] [,2] [,3]
1 1 2
2 3 3
max.days.prox -Inf 8 -Inf
So only the pairing of animal 1 and animal 3 provided evidence of an interaction. This would generalize to larger ltraj objects.
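One possible way to package the pattern for more animals is sketched below. The helper pairwise_summary, the stand-in tracks, and the stand-in pair function are all hypothetical and use plain numeric vectors, not the ltraj object or Prox, so that the example runs in base R.

```r
# Hypothetical helper: apply `pair_fun` to every two-way combination of the
# elements of `items` and return one row per pair.
pairwise_summary <- function(items, pair_fun) {
  idx <- combn(seq_along(items), 2)
  data.frame(
    first  = idx[1, ],
    second = idx[2, ],
    result = apply(idx, 2, function(x) pair_fun(items[[x[1]]], items[[x[2]]]))
  )
}

# Stand-in data: three made-up tracks of x positions
tracks <- list(c(0, 1, 2), c(10, 11, 12), c(0, 1, 3))

# Stand-in pair function: smallest absolute difference between two tracks
pair_table <- pairwise_summary(tracks, function(a, b) min(abs(a - b)))
pair_table
```

With the real data, the pair function would presumably wrap the Prox and rle steps shown above, and each animal would be extracted with single brackets (as in chupacabra2[ x[1] ]) so that it remains an ltraj object.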

How to use facet_grid correctly in ggplot2?

I'm trying to generate one chart per profile with the following code, but I keep getting "At least one layer must contain all variables used for facetting." errors. I spent the last few hours trying to make it work but I couldn't.
I believe the answer must be simple. Can anyone help?
d = structure(list(category = structure(c(2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L,
3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("4X4",
"HATCH", "SEDAN"), class = "factor"), profile = structure(c(1L,
1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L), .Label = c("FIXED", "FREE", "MOBILE"), class = "factor"),
value = c(6440.32, 6287.22, 9324, 7532, 7287.63, 6827.27,
6880.48, 7795.15, 7042.51, 2708.41, 1373.69, 6742.87, 7692.65,
7692.65, 8116.56, 7692.65, 7692.65, 7692.65, 7962.65, 8116.56,
5691.12, 2434, 8343, 7727.73, 7692.65, 7721.15, 1944.38,
6044.23, 8633.65, 7692.65, 7692.65, 8151.65, 7692.65, 7692.65,
2708.41, 3271.45, 3333.82, 1257.48, 6223.13, 7692.65, 6955.46,
7115.46, 7115.46, 7115.46, 7115.46, 6955.46, 7615.46, 2621.21,
2621.21, 445.61)), .Names = c("category", "profile", "value"
), class = "data.frame", row.names = c(NA, -50L))
library(ggplot2)
p = ggplot(d, aes(x=d$value, fill=d$category)) + geom_density(alpha=.3)
p + facet_grid(d$profile ~ .)
Your problem comes from referring to variables explicitly (i.e. d$profile), not with respect to the data argument in the call to ggplot. There is no need for d$ anywhere.
When faceting with facet_grid or facet_wrap, you must refer to variables by their bare column names so they are looked up in the data. It is also good practice to do the same in calls to aes:
p <- ggplot(d, aes(x = value, fill = category)) + geom_density(alpha = .3)
p + facet_grid(profile ~ .)
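The same idea on a small made-up data frame, as a sketch that runs without the question's data. The toy values are invented, and facet_wrap is shown only as an alternative layout, not as something the question asked for.

```r
library(ggplot2)

# Made-up data with the same column names as the question
toy <- data.frame(
  category = rep(c("4X4", "SEDAN"), each = 20),
  profile  = rep(c("FIXED", "MOBILE"), times = 20),
  value    = c(seq(6000, 7900, by = 100), seq(1000, 2900, by = 100))
)

# Bare column names are looked up in `toy`; no toy$ prefix is needed
p <- ggplot(toy, aes(x = value, fill = category)) +
  geom_density(alpha = 0.3)

p_rows <- p + facet_grid(profile ~ .)   # one row of panels per profile
p_wrap <- p + facet_wrap(~ profile)     # same split, wrapped layout
```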

Resources