Marginal effects from the multinomial model - r

I am trying to get the marginal effects from a multinomial model derived from the mlogit package but it shows an error. Can anyone provide some guidance to solve the problem? Many thanks!
# data
df1 <- structure(list(Y = c(3, 4, 1, 2, 3, 4, 1, 5, 2, 3, 4, 2, 1, 4,
1, 5, 3, 3, 3, 5, 5, 4, 3, 5, 4, 2, 5, 4, 3, 2, 5, 3, 2, 5, 5,
4, 5, 1, 2, 4, 3, 1, 2, 3, 1, 1, 3, 2, 4, 2, 2, 4, 1, 5, 3, 1,
5, 2, 3, 4, 2, 4, 5, 2, 4, 1, 4, 2, 1, 5, 3, 2, 1, 4, 4, 1, 5,
1, 1, 1, 4, 5, 5, 3, 2, 3, 3, 2, 4, 4, 5, 3, 5, 1, 2, 5, 5, 1,
2, 3), D = c(12, 8, 6, 11, 5, 14, 0, 22, 15, 13, 18, 3, 5, 9,
10, 28, 9, 16, 17, 14, 26, 18, 18, 23, 23, 12, 28, 14, 10, 15,
26, 9, 2, 30, 18, 24, 27, 7, 6, 25, 13, 8, 4, 16, 1, 4, 5, 18,
21, 1, 2, 19, 4, 2, 16, 17, 23, 15, 13, 21, 24, 14, 27, 6, 20,
6, 19, 8, 7, 23, 11, 11, 1, 22, 21, 4, 27, 6, 2, 9, 18, 30, 26,
22, 10, 1, 4, 7, 26, 15, 26, 18, 30, 1, 11, 29, 25, 3, 19, 15
), x1 = c(13, 12, 4, 3, 16, 16, 15, 13, 1, 15, 10, 16, 1, 17,
7, 13, 12, 6, 8, 16, 16, 11, 7, 16, 5, 13, 12, 16, 17, 6, 16,
9, 14, 16, 15, 5, 7, 2, 8, 2, 9, 9, 15, 13, 9, 4, 16, 2, 11,
13, 11, 6, 4, 3, 7, 4, 12, 2, 16, 14, 3, 13, 10, 11, 10, 4, 11,
16, 8, 12, 14, 9, 4, 16, 16, 12, 9, 10, 6, 1, 3, 8, 7, 7, 5,
16, 17, 10, 4, 15, 10, 8, 3, 13, 9, 16, 12, 7, 4, 11), x2 = c(12,
19, 18, 19, 15, 12, 15, 16, 15, 11, 12, 16, 17, 14, 12, 17, 17,
16, 12, 20, 11, 11, 15, 14, 18, 10, 14, 13, 10, 14, 18, 18, 18,
17, 18, 14, 16, 19, 18, 16, 18, 14, 17, 10, 16, 12, 16, 15, 11,
18, 19, 15, 19, 11, 16, 10, 20, 14, 10, 12, 10, 15, 13, 15, 11,
20, 11, 12, 16, 16, 11, 15, 11, 11, 10, 10, 16, 11, 20, 17, 20,
17, 16, 11, 18, 19, 18, 14, 17, 11, 16, 11, 18, 14, 15, 16, 11,
14, 11, 13)), class = "data.frame", row.names = c(NA, -100L))
library(mlogit)
mld <- mlogit.data(df1, choice="Y", shape="wide") # shape data for `mlogit()`
mlfit <- mlogit(Y ~ 1 | D + x1 + x2, reflevel="1", data=mld) # fit the model
effects(mlfit) # this shows the following error:
Error in if (rhs %in% c(1, 3)) { : argument is of length zero
Called from: effects.mlogit(mlfit)

I believe you are missing the covariate argument: effects(mlfit, covariate = 'D') should work. The error arises because covariate defaults to NULL. NULL is special in R: it has length zero, which is why you get "argument is of length zero". Please let me know if this fixes your issue.
The documentation of effects.mlogit says:
covariate
the name of the covariate for which the effect should be computed,
I am getting this output at my end:
R> effects(mlfit, covariate = 'D')
              1               2               3               4               5
-0.003585105992 -0.070921137682 -0.026032167377  0.078295227196  0.022243183855
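The "argument is of length zero" message described in the answer can be reproduced in base R without mlogit. This is only a sketch of the mechanism: the variable names are stand-ins, and the actual check inside effects.mlogit (which, per the error message, tests a variable called rhs) is not reproduced here.

```r
# A zero-length condition makes `if` fail; a NULL default can lead to exactly that.
covariate <- NULL                 # stand-in for an argument left at its NULL default
length(covariate)                 # NULL has length 0

cond <- covariate %in% c(1, 3)    # logical(0): there is nothing to compare

# Capture the error message instead of stopping the script
msg <- tryCatch(
  if (cond) "matched" else "not matched",
  error = function(e) conditionMessage(e)
)
msg
#> [1] "argument is of length zero"
```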

Related

Density plot of a vector shows tails before and after its minimum and maximum

I have the following vector:
v<-c(1, 1, 8, 3, 1, 9, 4, 21, 13, 13, 1, 1, 3, 10, 1, 13, 22, 1,
1, 4, 2, 1, 13, 1, 5, 1, 2, 1, 1, 2, 12, 10, 26, 15, 2, 9, 6,
5, 1, 3, 18, 2, 10, 2, 8, 9, 4, 1, 11, 4, 2, 12, 3, 14, 2, 1,
27, 3, 6, 2, 1, 1, 3, 16, 3, 36, 13, 9, 11, 10, 24, 2, 27, 4,
4, 2, 9, 1, 3, 13, 3, 1, 8, 5, 5, 15, 1, 1, 3, 1, 4, 14, 8, 1,
1, 2, 20, 1, 9, 3, 1, 2, 5, 14, 5, 11, 1, 3, 2, 9, 10, 21, 9,
1, 20, 5, 11, 23, 2, 1, 1, 2, 1, 7, 2, 9, 1, 19, 9, 9, 2, 15,
17, 8, 11, 17, 2, 14, 2, 8, 13, 1, 2, 9, 15, 25, 3, 8, 32, 4,
11, 1, 1, 2)
I would like to estimate its density in R with the density() function. With a few lines of code:
d<-density(v)
df<-data.frame(x=d$x,y=d$y,stringsAsFactors = FALSE)
plot(df)
I obtained the following picture:
But the resulting plot doesn't look right: min(v) is 1 and max(v) is 36, yet the curve has tails extending below 0 and beyond 40.
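For context, density() by default evaluates the estimate on a grid that extends cut * bw beyond the extremes of the data (cut defaults to 3), which is where such tails come from. A minimal sketch, using a short hypothetical stand-in vector v_demo rather than the full v above:

```r
# Hypothetical stand-in data with the same minimum (1) and maximum (36) as v
v_demo <- c(1, 1, 8, 3, 1, 9, 4, 21, 13, 13, 36)

# Default: the evaluation grid runs 3 bandwidths past min and max
d_default <- density(v_demo)
range(d_default$x)

# cut = 0 restricts the grid to the observed range
d_cut <- density(v_demo, cut = 0)
range(d_cut$x)
```

Note that restricting the grid only truncates the curve; the kernel estimate itself still places some mass outside [min, max], so the truncated curve no longer integrates to 1.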

Using text3D in ribbon plot in R

I want to construct a 3D ribbon plot with the following data.
tf14 <- structure(c(10, 10, 10, 10, 10, 10, 21, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 20, 10, 10, 10, 10, 10, 10, 10, 21, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 20, 10, 10, 10, 19,
10, 10, 10, 21, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
20, 10, 20, 9, 9, 9, 9, 9, 21, 9, 9, 9, 18, 9, 9, 9, 9, 9, 9,
9, 9, 19, 9, 8, 8, 16, 8, 16, 8, 21, 20, 8, 8, 16, 8, 8, 8, 8,
8, 18, 8, 8, 19, 8, 9, 9, 9, 9, 9, 9, 21, 20, 9, 9, 9, 9, 9,
9, 9, 9, 19, 9, 9, 18, 9, 8, 8, 16, 8, 16, 8, 21, 20, 8, 8, 8,
8, 8, 8, 8, 8, 19, 8, 8, 18, 8, 7, 7, 14, 7, 16, 7, 21, 20, 7,
18, 7, 7, 7, 7, 14, 7, 19, 7, 7, 16, 7, 8, 8, 16, 8, 8, 8, 20,
19, 8, 21, 8, 8, 8, 8, 16, 8, 18, 8, 8, 8, 8, 8, 8, 16, 8, 8,
8, 20, 19, 16, 21, 8, 8, 8, 8, 16, 8, 18, 8, 8, 8, 8, 8, 8, 17,
8, 16, 8, 20, 18, 8, 21, 8, 8, 8, 8, 16, 8, 18, 8, 8, 8, 8, 7,
7, 16, 16, 16, 7, 18, 20, 7, 21, 16, 7, 7, 7, 7, 7, 19, 7, 7,
7, 7), .Dim = c(21L, 12L), .Dimnames = list(c("colmA", "colmB",
"colmC", "colmD", "colmE", "colmF", "colmG", "colmH", "colmI",
"colmJ", "colmK", "colmL", "colmM", "colmN", "colmO", "colmP",
"colmQ", "colmR", "colmS", "colmT", "colmU"), c("2005", "2006",
"2007", "2008", "2009", "2010", "2011", "2012", "2013", "2014",
"2015", "2016")))
I worked out some code in the meantime, as I did not get any response. Here it is:
library(plot3D) # provides ribbon3D() and text3D()
ribbon3D(x = 1:21, y = 1:12, z = tf14, scale = TRUE, expand = 0.01, bty = "g", along = "y",
         col = "pink", border = "black", shade = 0.2, ltheta = -90, lphi = 30, space = 0.5,
         ticktype = "detailed", d = 2, curtain = TRUE, xlab = "", ylab = "", zlab = "")
# Use text3D to label x axis
text3D(x = 1:21, y = rep(0.5, 21), z = rep(1, 21),
labels = rownames(tf14),
add = TRUE, adj = 0, lphi = 30, ltheta = -90)
# Use text3D to label y axis
text3D(x = rep(0.5, 12), y = 1:12, z = rep(1, 12),
labels = colnames(tf14),
add = TRUE, adj = 1, lphi = 30, ltheta = -90)
But the image that I get is not the desired one. The axis labels are cluttered, and the years should be displayed on the right-hand side. Also, the ribbons are too low.
Can somebody improve the code?

Computing a few difficult metrics from an integer vector in R

For some context, I am working with sports / basketball data. The following vector is for 1 NBA game, and contains the number of points that the home team was ahead or behind at any given point in the game.
dput(leads_vector)
c(0, 0, 0, 0, 0, 0, 0, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2,
-2, -2, -2, -2, -2, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 4, 2,
5, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 8, 8, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 11, 11, 9, 9, 9, 9, 9, 9, 9, 9, 11,
11, 9, 9, 9, 11, 11, 11, 11, 12, 13, 13, 13, 13, 13, 13, 15,
14, 14, 13, 13, 13, 13, 11, 14, 14, 14, 14, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 16,
16, 13, 13, 11, 11, 11, 11, 11, 9, 9, 9, 7, 9, 9, 9, 10, 10,
11, 11, 11, 11, 11, 11, 13, 13, 13, 13, 13, 11, 11, 11, 11, 11,
12, 13, 13, 13, 13, 13, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 12, 13, 13, 13, 13, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 15, 15, 15, 13, 13, 13, 13, 15, 12, 12, 12, 9,
9, 9, 9, 9, 11, 11, 11, 11, 13, 13, 10, 10, 10, 8, 8, 8, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 10, 8, 7, 7, 7, 7, 7, 7, 7, 7, 8, 9, 9, 9, 11, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 10, 12, 10, 12, 12, 12,
12, 14, 14, 14, 12, 12, 12, 12, 12, 12, 12, 12, 14, 14, 14, 15,
16, 16, 16, 16, 14, 14, 11, 11, 11, 11, 11, 11, 9, 9, 9, 9, 9,
9, 9, 10, 11, 11, 9, 9, 9, 9, 7, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 3, 3, 3, 3, 3, 3, 2, 1, 1, 1,
3, 3, 3, 3, 2, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 6, 6, 6, 6, 6,
6, 6, 6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 8, 8, 7, 7, 7,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 11, 11, 11, 11,
9, 9, 9, 9, 9, 9, 10, 11, 11, 11, 8, 11, 8, 10, 10, 11, 11, 11,
11, 11, 9, 11, 11, 11, 10, 10, 10, 12, 12, 12, 12, 13, 13, 16,
16, 16, 16, 17, 18, 19, 19, 19, 19, 19, 18, 18, 18, 20, 20, 20,
20, 20, 20, 20, 18, 18, 18, 16, 16, 16, 13, 13, 13, 11, 10, 10,
10, 10, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13)
These vectors always start with 0, since the game begins tied at 0-0. leads_vector[100] equals 14, which means the home team was winning by 14 at this point in the game. Note that the numbers in the vector repeat, since the score can remain the same for several plays in a row in a basketball game.
The 4 metrics I would like to compute are:
Biggest Lead
Number of times the game was tied
Longest run (consecutive points for one team)
Lead changes
Biggest Lead is easy to compute:
biggest_lead <- max(abs(leads_vector))
Number of times the game was tied is a bit more difficult to compute:
times_tied <- sum(leads_vector[2:length(leads_vector)] == 0 & leads_vector[1:(length(leads_vector)-1)] != 0)
times_tied checks for all instances in the vector where the value is 0 (the score is tied), and the preceding value in the vector is not 0. This ensures that each sequence of zeros counts as the score being tied only once.
I am not sure how to compute longest run. The longest run in the game is the largest monotonically increasing or decreasing sequence in the vector. Just using the eye test, I notice a long run of 8 at leads_vector[38:65].
Number of lead changes is difficult to compute as well. It would be equal to the number of times the lead went from positive to negative, or from negative to positive, in this vector. The following leads_vector:
c(3, -3, 2, 5, 4, 3, 0, 2, -3, -1, -4, -5, -2, 0, 1)
... would have 4 lead changes (from 3 to -3, from -3 to 2, from 2 to -3, and from -2 to 0 to 1).
Any help with this is appreciated!
EDIT - longest run is the tough stat to compute here, but I'm working on it.
EDIT2 - I think longest run will be easier to compute if I remove repeated values from leads_vector. But I cannot use the duplicated() function, because that would also remove duplicates that occur at different places in the vector. Instead I'd want to remove only repeated values that are next to each other (get c(0, -2, 5, 3, 5, 8, 10, 11, 9, 11, 9, 11, ... ))
Computing the longest run:
compute_longest_run <- function(x) {
  # Collapse repetitions
  x_unique <- rle(x)$values
  # Compute score change
  score_change <- diff(x_unique)
  # Need to compute sum of all subvectors with the same sign
  run_side <- sign(score_change)
  run_id <- c(1, cumsum(diff(run_side) != 0) + 1)
  run_value <- tapply(score_change, run_id, sum)
  max(abs(run_value))
}
compute_longest_run(leads_vector)
#> [1] 10
# biggest_lead
with(rle(leads_vector), max(abs(values)))
# number_ties
with(rle(leads_vector), sum(values == 0))
# longest_run
# lead_changes: the number of stretches with one team ahead, minus 1
length(rle(leads_vector[leads_vector != 0] < 0)$values) - 1
I found out how to compute lead changes using the sign() and diff() functions. First I need to filter out the values where the lead equals 0, since these are not lead changes for my calculations, even though R's sign() function returns different values for (+), (-) and 0. Each change of leader then shows up as a difference of +2 or -2, so I sum the absolute differences and divide by 2:
lead_changes <- sum(abs(diff(sign(leads_vector[leads_vector != 0])))) / 2
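As a quick check, here is a sketch of the sign-and-diff approach applied to the short example vector from the question, which is expected to have 4 lead changes. The names x, side and changes are only illustrative.

```r
x <- c(3, -3, 2, 5, 4, 3, 0, 2, -3, -1, -4, -5, -2, 0, 1)

side <- sign(x[x != 0])          # drop ties, keep only which team is ahead
changes <- sum(diff(side) != 0)  # count the positions where the leader flips
changes
#> [1] 4

# Summing the raw differences does not work: they telescope to last - first
sum(diff(side))
#> [1] 0
```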
For longest run, I think removing the repeated values like this is a good start:
leads_vector[c(TRUE, leads_vector[-1] != leads_vector[-length(leads_vector)])]
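A small sketch of dropping adjacent repeats, on a hypothetical short vector x, showing that comparing each value with its predecessor gives the same result as rle(x)$values:

```r
x <- c(0, 0, -2, -2, 0, 2, 2, 4, 2, 2)

# TRUE for the first element and wherever a value differs from the one before it
keep <- c(TRUE, x[-1] != x[-length(x)])
collapsed <- x[keep]
collapsed
#> [1]  0 -2  0  2  4  2

# Same result as the run-length encoding approach
identical(collapsed, rle(x)$values)
#> [1] TRUE
```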

Histogram of categorical data subdivided into subcategories with ggplot2?

method <- c(rep("method1", 25), rep("method2", 80), rep("method3", 177))
exc <- c(1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
pr <- c(9.778158e+01, 2.383512e+00, 7.017030e-02, 5.895421e-02, 9.979526e+01, 9.996787e+01, 7.162376e+01, 8.286646e-01, 3.059980e-01, 5.724184e-02, 7.823576e-01, 2.301170e-01, 1.622905e+00, 1.689215e+00, 9.845610e+01, 1.123755e+01, 3.875266e-01, 3.167882e+00, 4.120089e+01, 1.052839e+00, 6.381193e-01, 1.757222e+01, 9.984169e+01, 1.493086e+00, 9.998443e+01, 3.005090e-01, 8.079704e+01, 9.032584e-03, 2.828906e-02, 7.804222e+01, 1.763882e-01, 3.713532e-02, 5.388052e-02, 5.403792e-02, 1.249938e-01, 1.998921e-01, 4.402396e-03, 3.342248e-03, 1.759018e-02, 2.420178e-02, 4.375781e-04, 6.137413e-03, 6.512810e+01, 4.668736e-01, 1.045915e-02, 1.513374e-01, 8.237510e-02, 9.512328e-02, 6.231771e-02, 2.316358e-02, 9.073479e-02, 1.911786e-02, 2.136473e-02, 4.618955e-01, 1.019962e+00, 5.266899e-02, 1.339400e-02, 3.811348e-02, 8.515192e+00, 3.003492e-02, 7.733972e-03, 5.277713e-01, 4.700602e-02, 8.439378e-03, 5.790735e-03, 1.029814e-02, 1.994390e-02, 1.083500e+00, 5.702771e-02, 1.247394e-02, 1.464561e-01, 3.259409e-02, 6.687443e-03, 1.139584e-02, 2.847002e-02, 2.377251e+00, 2.834861e+01, 4.786969e-01, 1.458309e-03, 1.174660e-01, 4.023188e-03, 2.273696e-01, 2.558717e-01, 2.559825e-03, 7.208975e+01, 1.195069e+00, 8.448632e+01, 2.057061e-02, 6.576548e-03, 3.235016e+01, 4.556932e-02, 4.185363e-01, 7.901099e+00, 1.921646e-02, 2.226815e-01, 2.566391e-01, 1.899886e-01, 9.862253e+01, 9.937875e-02, 1.956194e-02, 2.190072e-02, 9.479701e-03, 9.008552e+01, 1.590398e-01, 5.293134e-02, 1.412295e-02, 6.126795e-03, 1.617179e-02, 9.723586e+01, 2.291093e-01, 3.501661e-02, 2.065283e-01, 8.174733e+01, 9.902566e-02, 2.623761e-03, 7.848924e-03, 5.918991e+01, 4.847605e-03, 9.998977e+01, 2.598032e-03, 6.423861e+01, 5.152965e-03, 7.230036e+01, 3.566760e-01, 6.076995e-02, 5.944848e-02, 2.120722e-02, 4.765753e-03, 1.923560e-02, 3.962496e-02, 2.374236e-01, 2.629203e-02, 2.163854e+00, 1.699109e+00, 1.253692e-01, 6.023272e-01, 1.287564e-02, 4.878132e+01, 2.272958e-01, 3.276120e-01, 8.198424e+01, 9.638494e+01, 
9.549004e-02, 2.739015e-03, 3.203505e-02, 4.889431e-03, 2.291124e+01, 1.524237e-02, 6.829877e+00, 2.879291e-02, 9.585328e-02, 5.195482e-02, 4.460379e+00, 1.788282e-01, 7.170554e-02, 1.260122e-02, 2.214409e-02, 3.507428e+00, 4.991801e-03, 6.293137e-02, 5.844928e-03, 3.248548e+00, 2.591625e-01, 3.426773e-02, 1.774687e-01, 9.756274e+01, 7.076785e+01, 6.956855e-02, 2.147921e-01, 3.928065e-01, 4.460488e-02, 1.764221e-01, 5.277367e-02, 4.594422e-02, 1.060427e-01, 1.114139e+00, 3.626841e+01, 2.771743e-03, 1.589661e-02, 7.022587e-02, 3.651647e-03, 1.252429e+01, 4.773260e-01, 5.942872e-02, 2.483184e+01, 1.420190e-01, 3.652907e-03, 9.955749e-03, 1.111343e-01, 6.028820e+01, 4.564798e-02, 8.623486e-03, 4.440423e-03, 2.574466e-01, 3.370628e-02, 7.822098e-03, 1.608214e-02, 2.638099e-02, 8.873891e+01, 4.317737e-01, 1.735514e-02, 1.277323e-01, 5.741704e+01, 2.375506e-02, 5.359338e-02, 1.709943e+00, 3.987456e-01, 1.505662e-02, 2.816364e-02, 9.348211e-02, 4.637968e-02, 4.608277e-02, 2.236911e+01, 1.326066e+00, 1.613083e+01, 3.346656e+00, 2.611929e-02, 9.568675e+01, 2.724418e-02, 2.839759e-03, 9.563275e-03, 9.963634e+01, 9.959123e+01, 6.270345e-01, 4.705309e-02, 5.428094e-02, 1.648435e-02, 8.296800e-02, 4.055409e-02, 1.621184e-02, 1.314828e-01, 7.527521e+01, 5.230978e-02, 2.607093e-01, 6.399820e+01, 1.088062e-02, 9.429669e-02, 1.953214e-01, 1.474039e-01, 9.416921e+01, 2.380873e-03, 2.471462e+00, 3.633414e-02, 1.125673e+00, 1.344756e-01, 1.064287e+00, 8.415448e-01, 1.344756e-01, 2.472510e+01, 1.972818e-01, 2.448721e-02, 8.257077e-02, 6.035718e-02, 5.909623e+00, 3.034434e-02, 5.397369e-01, 7.197956e+01, 2.213201e-01, 1.816337e-01, 3.026522e+00, 4.670512e-03, 6.729115e-02, 8.528516e-02, 8.076168e-03, 2.702415e-01, 4.297185e-03, 2.110990e-02, 8.785337e-02, 1.729627e-01, 7.216669e-03, 1.097661e-02, 3.911452e-02, 2.573924e-02, 4.284408e-01, 1.453148e-01, 2.295089e-02, 2.240505e-01, 1.232520e-01, 1.360767e+00, 5.738281e-01, 6.136543e-02, 3.749292e-02)
loc <- c(18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 20, 18, 19, 20, 19, 19, 20, 20, 19, 19, 18, 20, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15)
data <- data.frame(method, pr, exc, loc)
How can I create a histogram/density plot with ggplot2 of data$pr divided into categories ($method) and each category divided into subcategories ($exc), colored by $loc?
You can use faceting:
library(ggplot2)
ggplot(data) +
  geom_histogram(aes(x = pr, fill = factor(loc))) +
  facet_grid(method ~ exc)
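For anyone who wants to try the faceting idea without pasting the long vectors, here is a self-contained sketch on hypothetical toy data (the data frame toy and its values are made up; only the column names follow the question). It also sets bins explicitly and uses label_both so each panel shows which variable and value it belongs to.

```r
library(ggplot2)

# Hypothetical toy data with the same column names as in the question
set.seed(1)
toy <- data.frame(
  method = rep(c("method1", "method2"), each = 50),
  exc    = rep(c(1, 2), times = 50),
  loc    = sample(c(15, 18), 100, replace = TRUE),
  pr     = runif(100, 0, 100)
)

# Rows = method, columns = exc, fill colour = loc
p <- ggplot(toy) +
  geom_histogram(aes(x = pr, fill = factor(loc)), bins = 20) +
  facet_grid(method ~ exc, labeller = label_both) +
  labs(fill = "loc")
p
```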

R circular LOESS function over 24 hours (a day)

I have data for free parking slots over hours and days.
Here's a random sample of 100.
sl <- list(EmptySlots = c(7, 6, 20, 5, 16, 20, 24, 5, 24, 24, 15, 11,
8, 6, 13, 2, 21, 6, 1, 6, 9, 1, 8, 0, 20, 9, 20, 11, 22, 24,
1, 2, 12, 6, 8, 2, 23, 18, 8, 3, 20, 2, 1, 0, 5, 21, 1, 4, 20,
15, 24, 12, 4, 14, 2, 4, 20, 16, 2, 10, 2, 1, 24, 9, 22, 7, 6,
3, 20, 13, 1, 16, 12, 5, 2, 7, 4, 1, 6, 1, 1, 2, 0, 13, 24, 6,
13, 7, 24, 24, 15, 6, 10, 1, 2, 9, 5, 2, 11, 15), hour = c(8,
16, 23, 14, 18, 7, 17, 15, 19, 19, 17, 17, 16, 14, 17, 12, 19,
10, 10, 13, 16, 10, 16, 11, 12, 9, 0, 15, 16, 21, 10, 11, 17,
11, 16, 15, 23, 7, 16, 14, 18, 14, 14, 9, 15, 2, 10, 9, 19, 17,
20, 16, 12, 17, 12, 9, 23, 9, 15, 17, 10, 12, 18, 17, 18, 17,
13, 10, 7, 8, 10, 18, 11, 11, 12, 17, 12, 9, 14, 15, 10, 11,
10, 10, 20, 16, 18, 15, 21, 18, 17, 13, 8, 11, 15, 16, 11, 9,
12, 18))
A quick way to calculate a LOESS function via ggplot2.
sl <- as.data.frame(sl)
library(ggplot2)
qplot(hour, EmptySlots, data=sl, geom="jitter") + theme_bw() + stat_smooth(size = 2)
What is the best way to tell the LOESS function that 0 and 24 are neighbours? I.e. the line on the left and the right should be the same value if we were to estimate it this way.
Pointers on where to start will do fine.
I'd be tempted just to replicate the data on either side:
library(ggplot2)
empty <- c(7, 6, 20, 5, 16, 20, 24, 5, 24, 24, 15, 11, 8, 6, 13, 2, 21, 6, 1, 6, 9, 1, 8, 0, 20, 9, 20, 11, 22, 24, 1, 2, 12, 6, 8, 2, 23, 18, 8, 3, 20, 2, 1, 0, 5, 21, 1, 4, 20, 15, 24, 12, 4, 14, 2, 4, 20, 16, 2, 10, 2, 1, 24, 9, 22, 7, 6, 3, 20, 13, 1, 16, 12, 5, 2, 7, 4, 1, 6, 1, 1, 2, 0, 13, 24, 6, 13, 7, 24, 24, 15, 6, 10, 1, 2, 9, 5, 2, 11, 15)
hour <- c(8, 16, 23, 14, 18, 7, 17, 15, 19, 19, 17, 17, 16, 14, 17, 12, 19, 10, 10, 13, 16, 10, 16, 11, 12, 9, 0, 15, 16, 21, 10, 11, 17, 11, 16, 15, 23, 7, 16, 14, 18, 14, 14, 9, 15, 2, 10, 9, 19, 17, 20, 16, 12, 17, 12, 9, 23, 9, 15, 17, 10, 12, 18, 17, 18, 17, 13, 10, 7, 8, 10, 18, 11, 11, 12, 17, 12, 9, 14, 15, 10, 11, 10, 10, 20, 16, 18, 15, 21, 18, 17, 13, 8, 11, 15, 16, 11, 9, 12, 18)
emptyrep <- rep.int(empty,3)
hourrep <- c(hour,hour+24,hour-24)
sl <- data.frame(empty=emptyrep, hour=hourrep)
qplot(hour, empty, data=sl, geom="jitter") + theme_bw() + geom_smooth(method="loess",size = 1.5,span=0.2) + coord_cartesian(xlim=c(0,24))
... just like joran said a few minutes earlier (woops)
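The replication trick can also be checked outside ggplot2 with base R's loess(). The sketch below uses hypothetical toy data (not the parking data above) and fits on three copies of the day, then compares the fitted values at hour 0 and hour 24, which should agree. The choice of surface = "direct" is an assumption made here so that the local fit is evaluated exactly rather than interpolated.

```r
# Hypothetical toy data: 3 observations per hour with a daily cycle
hour  <- rep(0:23, each = 3)
empty <- round(12 + 10 * cos(2 * pi * hour / 24)) + rep(c(-1, 0, 1), times = 24)

# Add one copy shifted a day earlier and one a day later,
# so the smoother sees neighbours on both sides of midnight
wrapped <- data.frame(
  hour  = c(hour - 24, hour, hour + 24),
  empty = rep(empty, 3)
)

# surface = "direct" computes the local regression exactly at each point
fit <- loess(empty ~ hour, data = wrapped, span = 0.25,
             control = loess.control(surface = "direct"))

# Fitted values at both ends of the day
ends <- predict(fit, newdata = data.frame(hour = c(0, 24)))
ends
all.equal(unname(ends[1]), unname(ends[2]), tolerance = 1e-6)
```

Because the data are tripled, span refers to a fraction of the replicated data set, so it needs to be roughly a third of the span you would use on a single day.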
