Plotting random intercepts of clmm - r

I'm having trouble plotting random intercepts from a clmm() model with 4 random effects in 31 countries.
I tried following this SO post: "In R, plotting random effects from lmer (lme4 package) using qqmath or dotplot: how to make it look fancy?" However, I cannot get the confidence intervals to show up. I've managed to use dotchart to plot the intercepts by country.
library(ggplot2)
library(ordinal)
# create data frame with intercepts and variances of all random effects
# the first column is the grouping factor, followed by 5 columns of intercepts;
# columns 7-11 are the variances.
randoms <- as.data.frame(ranef(nodual.logit, condVar = F))
var <- as.data.frame(condVar(nodual.logit))
df <- merge(randoms, var, by ="row.names")
# calculate the CI
df[,7:11] <- (1.96*(sqrt(df[,7:11])/sqrt(length(df[,1]))))
# dot plot of intercepts and CI.
p <- ggplot(df,aes(as.factor(Row.names),df[,2]))
p <- p + geom_hline(yintercept=0) +
geom_errorbar(aes(xmax=df[,2]+df[,7], xmin=df[,2]-df[,7]), width=0, color="black") +
geom_point(aes(size=2))
p <- p + coord_flip()
print(p)
Error: Discrete value supplied to continuous scale
Here is another way I tried to plot them:
D <- dotchart(df[,2], labels = df[,1])
D <- D + geom_errorbarh(aes(xmax=df[,2]+df[,7], xmin=df[,2]-df[,7],))
Error in dotchart(df[, 2], labels = df[, 1]) + geom_errorbarh(aes(xmax = df[, : non-numeric argument to binary operator

I found a solution based on R. H. B. Christensen (2013), “A Tutorial on fitting Cumulative Link Mixed Models with clmm2 from the ordinal Package”, p. 5.
First plot the intercept points for all 31 countries, then add labels using axis(), then add the CIs using segments().
plot(1:31,df[,2], ylim=range(df[,2]), axes =F, ylab ="intercept")
abline(h = 0, lty=2)
axis(1, at=1:31, labels = df[,1], las =2)
axis(2, at= seq(-2,2, by=.5))
for(i in 1:31) segments(i, df[i,2]+df[i,7], i, df[i,2]-df[i, 7])
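The same code can go inside another loop to plot the coefficients of each random effect. Base graphics calls are separate statements, so they go in braces rather than being chained with +:
for (n in 2:6) {
  # column n holds the estimates, column n + 5 the matching CI half-widths
  plot(1:31, df[, n], ylim = range(df[, n]), axes = FALSE, ylab = colnames(df)[n])
  abline(h = 0, lty = 2)
  axis(1, at = 1:31, labels = df[, 1], las = 2)
  axis(2, at = seq(-2, 2, by = 0.5))
  for (i in 1:31) segments(i, df[i, n] + df[i, n + 5], i, df[i, n] - df[i, n + 5])
}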
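For the ggplot2 attempt in the question, one likely culprit is that geom_errorbar() is given xmin/xmax while x is the discrete country factor; the bars belong on the y aesthetic (ymin/ymax), with coord_flip() applied afterwards. A minimal sketch with made-up values follows. The data frame re and its columns country, intercept and cond_var are hypothetical stand-ins for the ranef() and condVar() output, and the half-width qnorm(0.975) * sqrt(cond_var) is an assumption about how the interval should be formed (it leaves out the division by sqrt(n) used in the question), so check it against the tutorial before relying on it.

```r
library(ggplot2)

# Illustrative values only: 'country', 'intercept' and 'cond_var' are
# hypothetical stand-ins for what ranef() and condVar() would return.
re <- data.frame(
  country   = paste0("C", 1:5),
  intercept = c(-0.8, -0.2, 0.1, 0.4, 0.9),
  cond_var  = c(0.04, 0.09, 0.05, 0.06, 0.10)
)

# Assumed half-width of an approximate 95% interval
re$half <- qnorm(0.975) * sqrt(re$cond_var)

p <- ggplot(re, aes(x = country, y = intercept)) +
  geom_hline(yintercept = 0, linetype = 2) +
  # the interval is mapped to ymin/ymax, not xmin/xmax
  geom_errorbar(aes(ymin = intercept - half, ymax = intercept + half), width = 0) +
  geom_point(size = 2) +
  coord_flip()
print(p)
```

Using named columns inside aes() instead of df[,2] and df[,7] also keeps the aesthetics tied to the data frame passed to ggplot().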

Related

superimposing two probability plots with probplot

I can create a lognormal probability plot using the probplot() function from the e1071 package. A problem arises when I try to add another set of lognormal data to the first plot. Although I use the command par(new=T), the x-axes of the two plots differ and don't align.
Is there another way to go about this?
I tried using the points() function. However, it appears I need the x and y coordinates to plot it and I don't know how to extract the x, y coordinates from the probplot() function.
# Program to plot random logn failure times with probability plot
library(e1071)
logn_prob_plot <- function() {
set.seed(1)
x<-rlnorm(10,1,1)
par(bty="l")
par(col.lab="white")
p<-probplot(x,qdist=qlnorm)
par(col.lab="black")
mtext(text="failure time", col="black",side=1,line=3,outer=F)
mtext(text="lognormal probability", col="black",side=2,line=3,outer=F)
set.seed(2)
y=rlnorm(10,2,3)
par(new=T)
par(col.lab="white")
probplot(y,qdist=qlnorm,xlab="fail time",ylab="lognormal probability")
par(col.lab="black")
mtext(text="failure time", col="black",side=1,line=3,outer=F)
mtext(text="lognormal probability", col="black",side=2,line=3,outer=F)
}
logn_prob_plot()
My expected result is two groups of data on the same probability plot with the same x and y axes. Instead, I get two different x-axes that are not aligned.
First, let's simulate the variables:
set.seed(1)
x<-rlnorm(10,1,1)
set.seed(2)
y=rlnorm(10,2,3)
The first probplot is:
p<-probplot(x,qdist=qlnorm, meanlog = 1, sdlog = 1)
which produces the output:
The second probplot is:
q <- probplot(y,qdist=qlnorm,meanlog = 2, sdlog = 3)
which produces the output:
Your best shot at merging them is to use the scale of the smaller one and discard some points:
p<-probplot(x,qdist=qlnorm, meanlog = 1, sdlog = 1)
points(sort(x), p[[1]](ppoints(length(x))), col = "red", pch = 19)
lines(q, col = "blue")
points(sort(y), q[[1]](ppoints(length(y))), col = "blue", pch = 19)
which gives:
The red line and points are from the distribution with meanlog = 1, sdlog = 1 and the
blue ones are from the one with meanlog = 2, sdlog = 3.
I should also warn you that, judging from the code of the probplot() function:
xl <- quantile(x, c(0.25, 0.75))
yl <- qdist(c(0.25, 0.75), ...)
slope <- diff(yl)/diff(xl)
the slope of the line is determined only by the positions of the first and third quartiles, not by what happens elsewhere.
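To see what that means in practice, here is a small base R sketch that repeats the three quoted lines outside of probplot(). The helper name quartile_slope is made up for illustration and qlnorm is hard-coded as the quantile function. Inflating the largest observation leaves the slope unchanged, because with 10 points the quartiles are interpolated between interior order statistics.

```r
# Quartile-based slope, mirroring the three lines quoted above
# (quartile_slope is a hypothetical helper, not part of e1071)
quartile_slope <- function(x, ...) {
  xl <- quantile(x, c(0.25, 0.75))
  yl <- qlnorm(c(0.25, 0.75), ...)
  unname(diff(yl) / diff(xl))
}

set.seed(1)
x <- rlnorm(10, 1, 1)

# Inflate the largest observation; the sample quartiles do not move
x_out <- x
x_out[which.max(x_out)] <- 1000 * max(x)

slope_x   <- quartile_slope(x, meanlog = 1, sdlog = 1)
slope_out <- quartile_slope(x_out, meanlog = 1, sdlog = 1)
print(c(original = slope_x, with_outlier = slope_out))
```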

How do you implement rgamma and dgamma in a single plot

For an assignment I was asked this:
For the values of
(shape=5,rate=1),(shape=50,rate=10),(shape=.5,rate=.1), plot the
histogram of a random sample of size 10000. Use a density rather than
a frequency histogram so that you can add in a line for the population
density (hint: you will use both rgamma and dgamma to make this plot).
Add an abline for the population and sample mean. Also, add a subtitle
that reports the population variance as well as the sample variance.
My current code looks like this:
library(ggplot2)
set.seed(1234)
x = seq(1, 1000)
s = 5
r = 1
plot(x, dgamma(x, shape = s, rate = r), rgamma(x, shape = s, rate = r), sub =
paste0("Shape = ", s, "Rate = ", r), type = "l", ylab = "Density", xlab = "", main =
"Gamma Distribution of N = 1000")
After running it I get this error:
Error in plot.window(...) : invalid 'xlim' value
What am I doing incorrectly?
plot() does not take y1 and y2 arguments. See ?plot. You need to do a plot (or histogram) of one y variable (e.g., from rgamma), then add the second y variable (e.g., from dgamma) using something like lines().
Here's one way to get what you want:
#specify parameters
s = 5
r = 1
# plot histogram of random draws
set.seed(1234)
N = 1000
hist(rgamma(N, shape=s, rate=r), breaks=100, freq=FALSE)
# add true density curve
x = seq(from=0, to=20, by=0.1)
lines(x=x, y=dgamma(x, shape=s, rate=r))
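The remaining parts of the assignment (mean lines and a variance subtitle) could be added along these lines. This is only a sketch for the first parameter pair: it relies on the standard gamma results mean = shape/rate and variance = shape/rate^2, and the names draws, pop_mean, pop_var and grid_x are introduced here for illustration.

```r
s <- 5
r <- 1
N <- 10000

set.seed(1234)
draws <- rgamma(N, shape = s, rate = r)

# Population moments of a gamma(shape, rate) distribution
pop_mean <- s / r
pop_var  <- s / r^2

# Density histogram of the sample
hist(draws, breaks = 100, freq = FALSE, xlab = "x",
     main = sprintf("Gamma(shape = %g, rate = %g), N = %d", s, r, N))

# Population density on top
grid_x <- seq(0, max(draws), length.out = 500)
lines(grid_x, dgamma(grid_x, shape = s, rate = r))

# Vertical lines for the population mean (solid) and sample mean (dashed)
abline(v = pop_mean, col = "red")
abline(v = mean(draws), col = "blue", lty = 2)

# Subtitle reporting both variances
title(sub = sprintf("population var = %.3f, sample var = %.3f",
                    pop_var, var(draws)))
```

Wrapping this in a function of s and r and calling it for the other two parameter pairs would cover the rest of the exercise.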

Add error bars to scatterplot [duplicate]

How can I generate the following plot in R? The points shown in the plot are the averages, and their ranges correspond to the minimal and maximal values.
I have data in two files (below is an example).
x y
1 0.8773
1 0.8722
1 0.8816
1 0.8834
1 0.8759
1 0.8890
1 0.8727
2 0.9047
2 0.9062
2 0.8998
2 0.9044
2 0.8960
.. ...
First of all: it is very unfortunate and surprising that R cannot draw error bars "out of the box".
Here is my favourite workaround; the advantage is that you do not need any extra packages. The trick is to draw arrows (!) but with little horizontal bars instead of arrowheads (!!!). This not-so-straightforward idea comes from the R Wiki Tips and is reproduced here as a worked-out example.
Let's assume you have a vector of "average values" avg and another vector of "standard deviations" sdev, both of the same length n. Let's make the abscissa just the number of these "measurements", so x <- 1:n. Using these, here come the plotting commands:
plot(x, avg,
ylim=range(c(avg-sdev, avg+sdev)),
pch=19, xlab="Measurements", ylab="Mean +/- SD",
main="Scatter plot with std.dev error bars"
)
# hack: we draw arrows but with very special "arrowheads"
arrows(x, avg-sdev, x, avg+sdev, length=0.05, angle=90, code=3)
The result looks like this:
In the arrows(...) function length=0.05 is the size of the "arrowhead" in inches, angle=90 specifies that the "arrowhead" is perpendicular to the shaft of the arrow, and the particularly intuitive code=3 parameter specifies that we want to draw an arrowhead on both ends of the arrow.
For horizontal error bars the following changes are necessary, assuming that the sdev vector now contains the errors in the x values and the y values are the ordinates:
plot(x, y,
xlim=range(c(x-sdev, x+sdev)),
pch=19,...)
# horizontal error bars
arrows(x-sdev, y, x+sdev, y, length=0.05, angle=90, code=3)
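The question asks for means with min-to-max ranges rather than standard deviations. A sketch of the same arrows() trick applied to the twelve sample rows from the question, with the per-group summaries computed by tapply(); the names d, lo, hi and xs are introduced here for illustration.

```r
# The twelve rows shown in the question
d <- data.frame(
  x = c(rep(1, 7), rep(2, 5)),
  y = c(0.8773, 0.8722, 0.8816, 0.8834, 0.8759, 0.8890, 0.8727,
        0.9047, 0.9062, 0.8998, 0.9044, 0.8960)
)

# Per-group mean, minimum and maximum
avg <- tapply(d$y, d$x, mean)
lo  <- tapply(d$y, d$x, min)
hi  <- tapply(d$y, d$x, max)
xs  <- as.numeric(names(avg))

# Means as points, min-to-max range as "arrows" with flat heads
plot(xs, avg, ylim = range(c(lo, hi)), pch = 19, xaxt = "n",
     xlab = "x", ylab = "mean (min to max)")
axis(1, at = xs)
arrows(xs, lo, xs, hi, length = 0.05, angle = 90, code = 3)
```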
Using ggplot and a little dplyr for data manipulation:
set.seed(42)
df <- data.frame(x = rep(1:10,each=5), y = rnorm(50))
library(ggplot2)
library(dplyr)
df.summary <- df %>% group_by(x) %>%
summarize(ymin = min(y),
ymax = max(y),
ymean = mean(y))
ggplot(df.summary, aes(x = x, y = ymean)) +
geom_point(size = 2) +
geom_errorbar(aes(ymin = ymin, ymax = ymax))
If there's an additional grouping column (OP's example plot has two errorbars per x value, saying the data is sourced from two files), then you should get all the data in one data frame at the start, add the grouping variable to the dplyr::group_by call (e.g., group_by(x, file) if file is the name of the column) and add it as a "group" aesthetic in the ggplot, e.g., aes(x = x, y = ymean, group = file).
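A sketch of that grouped variant, using simulated data in place of the two files. The file column and its values are hypothetical, and the colour mapping and position_dodge() are additions of mine so that the two bars at each x do not overlap.

```r
library(ggplot2)
library(dplyr)

# Hypothetical data: 'file' marks which of the two input files a row came from
set.seed(42)
df <- data.frame(
  x    = rep(rep(1:10, each = 5), times = 2),
  file = rep(c("file1", "file2"), each = 50),
  y    = rnorm(100)
)

# One summary row per (x, file) combination
df.summary <- df %>%
  group_by(x, file) %>%
  summarize(ymin = min(y), ymax = max(y), ymean = mean(y), .groups = "drop")

# Shift the two groups sideways so their bars sit next to each other
dodge <- position_dodge(width = 0.5)
p <- ggplot(df.summary, aes(x = x, y = ymean, group = file, colour = file)) +
  geom_point(size = 2, position = dodge) +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), width = 0.3, position = dodge)
print(p)
```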
#some example data
set.seed(42)
df <- data.frame(x = rep(1:10,each=5), y = rnorm(50))
#calculate mean, min and max for each x-value
library(plyr)
df2 <- ddply(df,.(x),function(df) c(mean=mean(df$y),min=min(df$y),max=max(df$y)))
#plot error bars
library(Hmisc)
with(df2,errbar(x,mean,max,min))
grid(nx=NA,ny=NULL)
To summarize Laryx Decidua's answer:
define and use a function like the following
plot.with.errorbars <- function(x, y, err, ylim=NULL, ...) {
if (is.null(ylim))
ylim <- c(min(y-err), max(y+err))
plot(x, y, ylim=ylim, pch=19, ...)
arrows(x, y-err, x, y+err, length=0.05, angle=90, code=3)
}
where one can override the automatic ylim, and also pass extra parameters such as main, xlab, ylab.
Another (easier - at least for me) way to do this is below.
install.packages("ggplot2movies")
data(movies, package="ggplot2movies")
Plot average Length vs Rating
rating_by_len = tapply(movies$length,
movies$rating,
mean)
plot(as.numeric(names(rating_by_len)), rating_by_len, ylim=c(0, 200),
xlab = "Rating", ylab = "Length", main="Average Movie Length by Rating", pch=21)
Add error bars to the plot: mean - sd, mean + sd
sds = tapply(movies$length, movies$rating, sd)
upper = rating_by_len + sds
lower = rating_by_len - sds
segments(x0=as.numeric(names(rating_by_len)),
y0=lower,
y1=upper)
Hope that helps.
I put together start-to-finish code for a hypothetical experiment with ten measurements, each replicated three times. Just for fun, with the help of other Stack Overflow users. Thank you... Loops are obviously optional, since apply can be used instead, but I like to see what happens.
#Create fake data
x <-rep(1:10, each =3)
y <- rnorm(30, mean=4,sd=1)
#Loop to get standard deviation from data
sd.y = NULL
for(i in 1:10){
sd.y[i] <- sd(y[(1+(i-1)*3):(3+(i-1)*3)])
}
sd.y<-rep(sd.y,each = 3)
#Loop to get mean from data
mean.y = NULL
for(i in 1:10){
mean.y[i] <- mean(y[(1+(i-1)*3):(3+(i-1)*3)])
}
mean.y<-rep(mean.y,each = 3)
#Put together the data to view it so far
data <- cbind(x, y, mean.y, sd.y)
#Make an empty matrix to fill with shrunk data
data.1 = matrix(data = NA, nrow=10, ncol = 4)
colnames(data.1) <- c("X","Y","MEAN","SD")
#Loop to put data into shrunk format
for(i in 1:10){
data.1[i,] <- data[(1+(i-1)*3),]
}
#Create atomic vectors for arrows
x <- data.1[,1]
mean.exp <- data.1[,3]
sd.exp <- data.1[,4]
#Plot the data
plot(x, mean.exp, ylim = range(c(mean.exp-sd.exp,mean.exp+sd.exp)))
abline(h = 4)
arrows(x, mean.exp-sd.exp, x, mean.exp+sd.exp, length=0.05, angle=90, code=3)
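For comparison, the same plot without explicit loops might look like the sketch below, where tapply() computes the per-group mean and standard deviation directly. The seed and the name xs are additions for reproducibility and illustration; the loop version above does not set a seed.

```r
# Fake data: ten measurements, three replicates each
set.seed(42)
x <- rep(1:10, each = 3)
y <- rnorm(30, mean = 4, sd = 1)

# Group-wise mean and SD in one call each, no reshaping needed
mean.exp <- tapply(y, x, mean)
sd.exp   <- tapply(y, x, sd)
xs       <- as.numeric(names(mean.exp))

# Same plot as the loop version
plot(xs, mean.exp, ylim = range(c(mean.exp - sd.exp, mean.exp + sd.exp)),
     xlab = "x", ylab = "mean +/- SD")
abline(h = 4)
arrows(xs, mean.exp - sd.exp, xs, mean.exp + sd.exp,
       length = 0.05, angle = 90, code = 3)
```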

Resources