This sample from a study is very close to what I need. The question is: how do I get a conditional background color like the one in the chart below? That chart has two categories; I have three, so I would use some texture for the third.
The categories for the condition that changes over time are in a vector with names CL, C, and CR.
Here's some sample data. There's the index, and then there are the categories, which are government types (center-left, center, center-right). In the data there are 72 government terms, so there are 72 consecutive runs; doing it by hand with rects is cumbersome, to say the least. I do understand that I first need to plot the categories and then add the line to the plot. I'll worry about the axes after the fact and add them last.
shareindex categ
100 C
103 C
104 C
102 CL
99 CL
98 CR
99 CR
101 CL
104 CL
105 CR
104 CR
102 C
103 C
Here's some example data and a call to plot using the panel.first argument to draw the rectangles. I've suggested here using an lapply call to simplify drawing the many rectangles.
# data
set.seed(1)
x <- rnorm(1000)
x2 <- cumsum(x)
y <- rnorm(1000)
y2 <- cumsum(y)-5
ranges <- list(c(5,10), c(20,100), c(200,250), c(500,600), c(800,820), c(915,930))
# expression to be used for plotting gray boxes
boxes <- expression(lapply(ranges, function(z) rect(z[1],-100,z[2],100, col='gray', border=NA)))
# the actual plotting
plot(1:1000, x2, type='l', xlab='time', panel.first = eval(boxes))
lines(1:1000, y2, col='red')
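The ranges list above is typed by hand, which does not scale to 72 government terms. A minimal sketch of deriving the runs from the category vector with rle() follows, using the sample data from the question. The names run_df and cols, the colour choices, and the half-unit padding around each run are assumptions made for illustration.

```r
# Sample data from the question
shareindex <- c(100, 103, 104, 102, 99, 98, 99, 101, 104, 105, 104, 102, 103)
categ <- c("C", "C", "C", "CL", "CL", "CR", "CR", "CL", "CL", "CR", "CR", "C", "C")

# rle() collapses consecutive repeats into (value, length) pairs
runs <- rle(categ)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
run_df <- data.frame(categ = runs$values, start = starts, end = ends)

# One background colour per category (arbitrary choices)
cols <- c(C = "grey85", CL = "lightblue", CR = "mistyrose")

# Empty plot first, then one rectangle per run, then the line on top
plot(seq_along(shareindex), shareindex, type = "n",
     xlab = "id", ylab = "Share index")
rect(xleft = run_df$start - 0.5, ybottom = par("usr")[3],
     xright = run_df$end + 0.5, ytop = par("usr")[4],
     col = cols[as.character(run_df$categ)], border = NA)
lines(seq_along(shareindex), shareindex, lwd = 2)
```

With this approach the number of rect() calls stays at one, however many runs the data contains.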
You can use rect to make rectangles and plot lines on top of that
For your data example:
set.seed(1)
x <- 1:100
y <- cumsum(rnorm(100))
z <- c(rep(1, 10), rep(2,20), rep(1,40), rep(3,30))
plot(x, y, type="n")
rect(xleft = x - 1, xright = x, ybottom=par("usr")[3], ytop=par("usr")[4], col=z, border=NA )
lines(x, y, col="white")
Edit for your data:
## Data frame with the data
dat <- data.frame(shareindex=c(100,103,104,102, 99,98,99,101,104,105,104,102,103),
categ=c("C","C","C","CL","CL","CR","CR", "CL", "CL","CR", "CR","C", "C"))
## Add index column
dat$id <- seq(along.with=dat$shareindex)
# Add your background colors here
cols <- c("lightgray","grey", "lightblue")
## Just an empty plot
plot(dat$id, dat$shareindex, type="n", ylab="Share index", xlab="id")
## Plot the rectangles for the background
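## factor() maps C, CL, CR to the indices 1, 2, 3; since R 4.0 categ is a plain
## character vector, and indexing cols with it directly would give NA colors
rect(xleft = dat$id - 1, xright = dat$id,
ybottom = par("usr")[3], ytop = par("usr")[4],
col = cols[factor(dat$categ)], border = NA)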
## Plot the line
lines(dat$id, dat$shareindex, lwd=2)
The output looks like this:
Cheers,
alex
I have a function,
x= (z-z^2.5)/(1+2*z-z^2)
y = z-z^2.5
where z is the only variable. How to draw a graph where x-axis shows value of function x, and y-axis shows value of function y as z range from 0 to 5?
You can get a very basic plot by simply following your own instructions.
## z ranges from 0 to 5
z = seq(0,5,0.01)
## x and y are functions of z
x = (z-z^2.5)/(1+2*z-z^2)
y = z-z^2.5
##plot
plot(x,y, pch=20, cex=0.5)
If you want a smooth curve, it is a little trickier. The curve has a discontinuity at z = 1 + sqrt(2) ≈ 2.414, where the denominator 1 + 2*z - z^2 is zero. If you just draw the curve as one piece, you get an unwanted line connecting across the discontinuity. So, draw it in two pieces:
plot(x[1:242],y[1:242], type='l', xlab='x', ylab='y',
xlim=range(x), ylim=range(y))
lines(x[243:501],y[243:501])
But be careful about interpreting this. There is something tricky going on from z=0 to z=1.
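The split at indices 242 and 243 above depends on the step size of z. As a small sketch, the same split can be computed from the location of the discontinuity instead of being hard-coded; the names z0, left, and right are introduced here only for illustration.

```r
# Same definitions as above
z <- seq(0, 5, 0.01)
x <- (z - z^2.5) / (1 + 2 * z - z^2)
y <- z - z^2.5

# The denominator 1 + 2*z - z^2 vanishes at z = 1 + sqrt(2)
z0 <- 1 + sqrt(2)
left <- which(z < z0)    # indices before the discontinuity
right <- which(z > z0)   # indices after the discontinuity

# Draw the two branches separately so no line crosses the gap
plot(x[left], y[left], type = "l", xlab = "x", ylab = "y",
     xlim = range(x), ylim = range(y))
lines(x[right], y[right])
```

If the grid for z changes, left and right adjust automatically.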
Using ggplot2
# z ranges from -1000 to 1000 (negative z gives NaN for z^2.5; those rows are dropped by the subset below)
z = seq(-1000,1000,.25)
# x as a function of z
x = (z-z^2.5) / ((1+2*z)-z^2)
# y as a function of z
y = z-z^2.5
# make a dataframe of x,y and z
df <- data.frame(x=x, y=y, z=z)
# subset the df where z is between 0 and 5
df_5 <- subset(df, (df$z>=0 & df$z<=5))
# plot the graph
library(ggplot2)
ggplot(df_5, aes(x,y))+ geom_point(color="red")
The only additions to @G5W's answer are the subset() call, which keeps the values of z between 0 and 5, and the use of ggplot2.
So I want to superimpose a regression line in a barplot in R. Similar to the attached image by Rosindell et al. 2011. However, when I try to do this with my data the line does not stretch the entire length of the barplot.
For a reproducible example, I made a dummy code:
x = 20:1
y = 1:20
barplot(x, y, space = 0)
lines(x, y, col = 'red')
How do I get the line to traverse the entire stretch of the barplot bins?
PS: the line does not need to be non-linear. I just want to superimpose a straight line on the barplot
Thank you.
A more general solution could be to rely on the x-values that are generated by barplot(). This way, you can deal with scenarios where you only have counts (rather than x and y values). I am referring to a variable like this one, where your "x" is categorical (precisely, x-axis values correspond to the names of y).
p.x <- c(8,12,14,9,5,3,2)
x <- sample(c("A","B","C","D","E","F","G"),
prob = p.x/sum(p.x),
replace = TRUE,
size = 200)
y <- table(x)
y
# A B C D E F G
# 27 52 46 36 21 11 7
When you use barplot(), you can collect the x-positions of the bars in a variable (plot.dim in this case) and use them to position your line.
plot.dim <- barplot(y)
lines(plot.dim, y, col = "red", lwd = 2)
The result
Now, back to your data. Even if you have both x and y, in a barplot you are displaying only your y variable, while x is used for the labels of y.
x <- 20:1
y <- as.integer(22 - 1 * sample(seq(0.7, 1.3, length.out = length(x))) * x)
names(y) <- x
y <- y[order(as.numeric(names(y)))]
Let's plot your y values again. Collect the barplot positions in the xpos variable.
xpos <- barplot(y, las = 2)
Note that the first bar (x=1) is not positioned at 1. Similarly, the last bar is positioned at 23.5 (and not 20).
xpos[1]
# x=1 is indeed at 0.7
xpos[length(xpos)]
# x=20 is indeed at 23.5
Do your regression (for example, use lm()). Compute the predicted y values at the first and the last x (y labels).
lm.fit <- lm(y~as.numeric(names(y)))
y.init <- lm.fit$coefficients[2] * as.numeric(names(y))[1] + lm.fit$coefficients[1]
y.end <- lm.fit$coefficients[2] * as.numeric(names(y))[(length(y))] + lm.fit$coefficients[1]
You can now superimpose a line using segments(), but remember to set your x-values according to what is stored in xpos.
segments(xpos[1], y.init, xpos[length(xpos)], y.end, lwd = 2, col = "red")
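As a variation on the steps above, the fitted values can be taken from the model at every bar and drawn with lines() at the bar midpoints, which avoids computing the two endpoints by hand. This is only a sketch with made-up data; the variable names x, y, fit, and xpos are assumptions and the data are not the original poster's.

```r
# Made-up data: a noisy decreasing trend over 20 categories
set.seed(1)
x <- 1:20
y <- as.integer(22 - sample(seq(0.7, 1.3, length.out = 20)) * x)

# barplot() returns the x-coordinates of the bar midpoints
xpos <- barplot(y, names.arg = x, las = 2)

# Fit on the real x values, but draw at the bar midpoints
fit <- lm(y ~ x)
lines(xpos, fitted(fit), col = "red", lwd = 2)
```

With the default bar width of 1 and spacing of 0.2, the midpoints run from 0.7 to 23.5 in steps of 1.2, which is why plotting against 1:20 falls short of the last bars.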
Check out the help page ?barplot: the second argument is width - optional vector of bar widths, not the y coordinate. The following code does what you want, but I don't believe it's a general purpose solution.
barplot(y[x], space = 0)
lines(x, y, col = 'red')
Edit:
A probably better way would be to use the return value of barplot.
bp <- barplot(y[x], space = 0)
lines(c(bp), y[x], col = 'red')
How one can get the following visualization in R (see below):
let's consider a simple case of three points.
# Define two vectors
x <- c(12,21,54)
y <- c(2, 7, 11)
# OLS regression
ols <- lm(y ~ x)
# Visualisation
plot(x,y, xlim = c(0,60), ylim =c(0,15))
abline(ols, col="red")
What I desire is, to draw the vertical distance lines from OLS line (red line) to points.
You can do this really nicely with ggplot2
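library(ggplot2)
set.seed(1)
x <- 1:10
y <- 3*x + 2 + rnorm(10)
m <- lm(y ~ x)
# qplot() is deprecated in recent ggplot2, so build the plot from a data frame
df <- data.frame(x = x, y = y, yhat = fitted(m))
ggplot(df, aes(x = x, y = y)) + geom_point() + geom_line(aes(y = yhat)) +
geom_segment(aes(xend = x, yend = yhat, color = "error")) +
labs(title = "regression errors", color = "series")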
There is a much simpler solution:
segments(x, y, x, predict(ols))
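For completeness, here is that one-liner combined with the three-point example from the question as a self-contained sketch. The dashed line type and the variable name yhat are illustrative choices, not part of the original answer.

```r
# Three points from the question
x <- c(12, 21, 54)
y <- c(2, 7, 11)

# OLS fit and the fitted value at each x
ols <- lm(y ~ x)
yhat <- predict(ols)

# Scatter plot, regression line, then one vertical segment per point
plot(x, y, xlim = c(0, 60), ylim = c(0, 15))
abline(ols, col = "red")
segments(x0 = x, y0 = y, x1 = x, y1 = yhat, lty = 2)
```

segments() is vectorized, so all residual lines are drawn in a single call.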
If you construct a matrix of points, you can use apply to plot the lines like this:
Create a matrix of coordinates:
cbind(x,x,y,predict(ols))
# x x y
#1 12 12 2 3.450920
#2 21 21 7 5.153374
#3 54 54 11 11.395706
This can be plotted as:
apply(cbind(x,x,y,predict(ols)),1,function(coords){lines(coords[1:2],coords[3:4])})
effectively a for loop running over the rows of the matrix and plotting one line for each row.
I need to plot several data points that are defined as
c(x,y, stdev_x, stdev_y)
as a scatter plot with a representation of their 95% confidence limits, for example showing the point and one contour around it. Ideally I'd like to plot an oval around the point, but I don't know how to do it. I was thinking of building samples and plotting them, adding stat_density2d(), but I would need to limit the number of contours to 1 and could not figure out how to do it.
require(ggplot2)
n=10000
d <- data.frame(id=rep("A", n),
se=rnorm(n, 0.18,0.02),
sp=rnorm(n, 0.79,0.06) )
g <- ggplot (d, aes(se,sp)) +
scale_x_continuous(limits=c(0,1))+
scale_y_continuous(limits=c(0,1)) +
theme(aspect.ratio=0.6)
g + geom_point(alpha=I(1/50)) +
stat_density2d()
First, save the whole plot as an object (I changed the limits).
g <- ggplot (d, aes(se,sp, group=id)) +
scale_x_continuous(limits=c(0,0.5))+
scale_y_continuous(limits=c(0.5,1)) +
theme(aspect.ratio=0.6) +
geom_point(alpha=I(1/50)) +
stat_density2d()
Use ggplot_build() to extract all the information used for the plot. The contours are stored in data[[2]].
gg<-ggplot_build(g)
str(gg$data)
head(gg$data[[2]])
level x y piece group PANEL
1 10 0.1363636 0.7390318 1 1-1 1
2 10 0.1355521 0.7424242 1 1-1 1
3 10 0.1347814 0.7474747 1 1-1 1
4 10 0.1343692 0.7525253 1 1-1 1
5 10 0.1340186 0.7575758 1 1-1 1
6 10 0.1336037 0.7626263 1 1-1 1
There are 12 contour lines in total. To keep only the outer one, subset to group=="1-1" and replace the original data.
gg$data[[2]]<-subset(gg$data[[2]],group=="1-1")
Then use ggplot_gtable() and grid.draw() (from the grid package) to draw your plot.
library(grid)
p1<-ggplot_gtable(gg)
grid.draw(p1)
latticeExtra provides panel.ellipse, a lattice panel function that computes and draws a confidence ellipsoid from bivariate data, possibly grouped by a third variable.
Here I draw the 0.65 and 0.95 levels using your data.
library(latticeExtra)
xyplot(sp~se,data=d,groups=id,
par.settings = list(plot.symbol = list(cex = 1.1, pch=16)),
panel = function(x,y,...){
panel.xyplot(x, y,alpha=0.2)
panel.ellipse(x, y, lwd = 2, col="green", robust=FALSE, level=0.65,...)
panel.ellipse(x, y, lwd = 2, col="red", robust=TRUE, level=0.95,...)
})
Looks like the stat_ellipse function that you found is really a great solution, but here's another one (non-ggplot), just for the record, using dataEllipse from the car package.
# some sample data
n=10000
g=4
d <- data.frame(ID = unlist(lapply(letters[1:g], function(x) rep(x, n/g))))
d$x <- unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2)))
d$y <- unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2)))
# plot points with 95% normal-probability contour
# default settings...
library(car)
with(d, dataEllipse(x, y, ID, level=0.95, fill=TRUE, fill.alpha=0.1))
# with a little more effort...
# random colours with alpha-blending
d$col <- unlist(lapply(1:g, function (x) rep(rgb(runif(1), runif(1), runif(1), runif(1)),n/g)))
# plot points first
with(d, plot(x,y, col=col, pch="."))
# then ellipses over the top
with(d, dataEllipse(x, y, ID, level=0.95, fill=TRUE, fill.alpha=0.1, plot.points=FALSE, add=TRUE, col=unique(col), ellipse.label=FALSE, center.pch="+"))
Just found the function stat_ellipse() here (and here) and it takes care of this beautifully.
g + geom_point(alpha=I(1/10)) +
stat_ellipse(aes(group=id), color="black")
Different data set, of course:
I don't know anything about the ggplot2 library, but you can draw ellipses with plotrix. Does this plot look anything like what you're asking for?
library(plotrix)
n=10
d <- data.frame(x=runif(n,0,2),y=runif(n,0,2),seX=runif(n,0,0.1),seY=runif(n,0,0.1))
plot(d$x,d$y,pch=16,ylim=c(0,2),xlim=c(0,2))
draw.ellipse(d$x,d$y,d$seX,d$seY)
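The call above draws ellipses whose semi-axes equal one standard error. If a 95% region is wanted and the x and y errors are assumed independent and normal (an assumption, not something stated in the question), the semi-axes can be scaled by sqrt(qchisq(0.95, df = 2)), roughly 2.45. A base-R sketch without extra packages follows; ellipse_points is a hypothetical helper written for this example.

```r
# Hypothetical helper: outline of an axis-aligned confidence ellipse.
# Assumes independent normal errors in x and y (no covariance term).
ellipse_points <- function(cx, cy, sx, sy, level = 0.95, n = 100) {
  r <- sqrt(qchisq(level, df = 2))        # radius in units of standard deviations
  t <- seq(0, 2 * pi, length.out = n)
  cbind(x = cx + r * sx * cos(t), y = cy + r * sy * sin(t))
}

# Made-up points with their standard deviations
set.seed(1)
n <- 10
d <- data.frame(x = runif(n, 0.5, 1.5), y = runif(n, 0.5, 1.5),
                sx = runif(n, 0, 0.1), sy = runif(n, 0, 0.1))

# Plot the points, then one ellipse outline around each
plot(d$x, d$y, pch = 16, xlim = c(0, 2), ylim = c(0, 2))
for (i in seq_len(n)) {
  lines(ellipse_points(d$x[i], d$y[i], d$sx[i], d$sy[i]))
}
```

If the errors are correlated, an axis-aligned ellipse is no longer appropriate and the covariance would have to enter the calculation.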
I have data conditioned on two variables, one major condition, one minor condition. I want a xyplot (lattice) with points and lines (type='b'), in one panel so that the major condition determines the color and the minor condition is used for drawing the lines.
Here is an example that is representative of my problem (see the code below to produce the data frame). d is the major condition, and c is the minor condition.
> dat
x y c d
1 1 0.9645269 a A
2 2 1.4892217 a A
3 3 1.4848654 a A
....
10 10 2.4802803 a A
11 1 1.5606218 b A
12 2 1.5346806 b A
....
98 8 2.0381943 j B
99 9 2.0826099 j B
100 10 2.2799917 j B
The way to get the connecting lines to be conditioned on c is to use groups=c in the plot. Then the way to tell them apart is to use a formula conditioned on d:
xyplot(y~x|d, data=dat, type='b', groups=c)
However, I want the plots in the same panel. Removing the formula condition on d produces one panel, but when groups=d is specified, there are "retrace" lines drawn:
xyplot(y~x, data=dat, type='b', groups=d, auto.key=list(space='inside'))
What I want looks very like the above plot, only without these "retrace" lines.
It's possible to set the colors explicitly in this example, as I know that there are five lines of category 'A' followed by five of category 'B', but this won't easily work for my real problem. In addition, auto.key is useless when setting the colors this way:
xyplot(y~x, data=dat, type='b', groups=c, col=rep(5:6, each=5))
The data:
set.seed(1)
dat <- do.call(
rbind,
lapply(1:10,
function(x) {
firsthalf <- x < 6
data.frame(x=1:10, y=log(1:10 + rnorm(10, .25) + 2 * firsthalf),
c=letters[x],
d=LETTERS[2-firsthalf]
)
}
)
)
The default graphical parameters are obtained from the superpose.symbol and superpose.line settings. One solution is to set them using the par.settings argument.
## I compute the color by group
col <-by(dat,dat$c,
FUN=function(x){
v <- ifelse(x$d=='A','darkgreen','orange')
v[1] ## I return one parameter , since I need one color
}
)
xyplot(y~x, data=dat, type='b', groups=c,
auto.key = list(text = levels(factor(dat$d)), points = FALSE), ## factor() is needed since d is a character column in R >= 4.0
par.settings=
list(superpose.line = list(col = col), ## color of lines
superpose.symbol = list(col=col), ## colors of points
add.text = list(col=c('darkgreen','orange')))) ## color of text in the legend
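The per-group colour lookup can also be written without by(), using a small table with one row per minor group. This is a sketch under the assumption that each value of c belongs to exactly one value of d, as in the example data; the names key and line_cols are introduced here for illustration.

```r
# Rebuild the example data from the question
set.seed(1)
dat <- do.call(rbind, lapply(1:10, function(x) {
  firsthalf <- x < 6
  data.frame(x = 1:10,
             y = log(1:10 + rnorm(10, .25) + 2 * firsthalf),
             c = letters[x],
             d = LETTERS[2 - firsthalf])
}))

# One row per minor group c, with the major group d it belongs to
key <- unique(dat[, c("c", "d")])
key <- key[order(key$c), ]

# Map each minor group to the colour of its major group
line_cols <- c(A = "darkgreen", B = "orange")[as.character(key$d)]

# Draw with lattice if it is installed; groups = c avoids the retrace lines
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::xyplot(y ~ x, data = dat, type = "b", groups = c,
                        col = line_cols))
}
```

The colours are passed in the order of the sorted group names, which matches the order lattice uses for the levels of groups.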
Does it have to be lattice? In ggplot it is rather easy:
library(ggplot2)
ggplot(dat, aes(x=x,y=y,colour=d)) + geom_line(aes(group=c),size=0.8) + geom_point(shape=1)
This is a quick and dirty example. You can customize the colour of the lines, the legend, the axes, the background, and so on.