I have four variables: y1, y2, y3 and y4. I want to make a plot that shows how y2, y3 and y4 behave in relation to y1. I have tried a scatterplot, but I do not get much information from it.
matplot might be useful here as well:
dat <- data.frame(y1=1:3,y2=1:3,y3=2:4,y4=3:5)
matplot(dat[1],dat[-1],type="l",lty=1)
par(mfrow=c(1,3)) #make a plot area with space for three plots
y1=rnorm(100)
y2=rnorm(100)
y3=rnorm(100)
y4=rnorm(100)
plot(y1,y2)
plot(y1,y3)
plot(y1,y4)
I have been searching for hours, but I can't find a function that does this.
How do I generate a plot like
Let's say I have an array x1 = c(2,13,4) and y2=c(5,23,43). I want to create 3 blocks with heights from 2-5,13-23...
How would I approach this problem? I'm hoping that I could be pointed in the right direction as to what built-in function to look at?
I have not used your data because you say you are working with an array, but you gave us two vectors. Moreover, the data you showed us is overlapping. This means that if you chart three bars, you only see two.
Based on the little image you provided, you have three ranges you want to plot for each individual or date. With time series, this kind of chart is usually used to plot the min/max, the standard deviation and the current data.
The trick is to chart the series as layers. The first series is the one with the largest range (the beige band in this example). In the following example, I chart an empty plot first and I add three layers of rectangles, one for beige, one for gray and one for red.
#Create data.frame
n=100
df <-data.frame(1:n,runif(n)*10,60+runif(n)*10,25+runif(n)*10,40+runif(n)*10,35-runif(n)*10,35+runif(n)*10)
colnames(df) <-c("id","beige.min","beige.max","gray.min","gray.max","red.min","red.max")
#Create chart
plot(x=df$id,y=NULL,ylim=range(df[,-1]), type="n") #blank chart, ylim is the range of the data
rect(df$id-0.5,df[,2],df$id+0.5,df[,3],col="beige", border=FALSE) #first layer
rect(df$id-0.5,df[,4],df$id+0.5,df[,5],col="gray", border=FALSE) #second layer
rect(df$id-0.5,df[,6],df$id+0.5,df[,7],col="darkred", border=FALSE) #third layer
It's not entirely clear what you want based on the png, but based on what you've written:
x1 <- c(2,13,4)
y2 <- c(5,23,43)
foo <- data.frame(id=1:3, x1, y2)
library(ggplot2)
ggplot(data=foo) + geom_rect(aes(ymin=x1, ymax=y2, xmin=id-0.4, xmax=id+0.4))
I am trying to plot points in a plot where each dot is represented by a number. However, it seems that the points can only be one character long, as you can see in the plot produced by the code below:
set.seed(1); plot(rnorm(15), pch=paste(1:15))
I wonder if there is any workaround for this. Thanks.
set.seed(1)
y <- rnorm(15)
plot(y, type='n') # empty plot with the right axis ranges
text(x=1:15, y=y, labels=1:15) # draw each point as its (multi-character) number
Another option, using lattice (which is built on grid):
library(lattice)
dat <- data.frame(x=1:15,y=rnorm(15))
xyplot(y~x,data=dat,
panel=function(x,y,...){
panel.xyplot(x,y,...)
panel.text(x,y,labels=round(y,2),adj=2,col='red')})
I have been trying to plot simple density plots using R as:
plot(density(Data$X1),col="red")
plot(density(Data$X2),col="green")
Since I want to compare, I'd like to plot both in one figure. But 'matplot' doesn't work!! I also tried with ggplot2 as:
library(ggplot2)
qplot(Data$X1, geom="density")
qplot(Data$X2, add = TRUE, geom="density")
Also in this case, plots appear separately (though I wrote add=TRUE)!! Can anyone come up with an easy solution to the problem, please?
In ggplot2 or lattice you need to reshape the data in order to superpose the curves.
For example :
dat <- data.frame(X1= rnorm(100),X2=rbeta(100,1,1))
library(reshape2)
dat.m <- melt(dat)
Using lattice:
library(lattice)
densityplot(~value, groups = variable, data = dat.m, auto.key = TRUE)
Using ggplot2:
ggplot(data=dat.m)+geom_density(aes(x=value, color=variable))
EDIT: adding X1+X2
Using lattice and the extended formula interface, it is extremely easy to do this:
densityplot(~X1+X2+I(X1+X2) , data=dat) ## no need to reshape data!!
You can try:
plot(density(Data$X1),col="red")
lines(density(Data$X2),col="green")
I must add that the xlim and ylim values should ideally be set to include the ranges of both X1 and X2, which could be done as follows:
foo <- density(Data$X1)
bar <- density(Data$X2)
plot(foo,col="red", xlim=range(foo$x,bar$x), ylim=range(foo$y,bar$y))
lines(bar,col="green")
In base graphics you can overlay density plots if you keep the ranges identical and use par(new=TRUE) between them. I think add=TRUE is a base graphics strategy that some functions but not all will honor.
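A minimal sketch of the par(new=TRUE) approach described above. The vectors X1 and X2 are simulated stand-ins for Data$X1 and Data$X2 (an assumption, since the original data are not shown). The key point is that both plot() calls receive the same xlim and ylim, so the second curve lands in the same coordinate system as the first.

```r
# Simulated stand-ins for Data$X1 and Data$X2 (assumed to be numeric vectors)
set.seed(42)
X1 <- rnorm(200)
X2 <- rnorm(200, mean = 1, sd = 2)

d1 <- density(X1)
d2 <- density(X2)

# Shared axis limits so that both plots use the same coordinate system
xlim <- range(d1$x, d2$x)
ylim <- range(d1$y, d2$y)

plot(d1, col = "red", xlim = xlim, ylim = ylim, main = "Densities of X1 and X2")
par(new = TRUE)  # the next high-level plot is drawn over the current one
plot(d2, col = "green", xlim = xlim, ylim = ylim,
     axes = FALSE, ann = FALSE)  # suppress the second set of axes and labels
```

If the two calls used different limits, the curves would still be drawn, but the visible axis would only be correct for the first one.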
If you specify n, from, and to in the calls to density and make sure that they match between the 2 calls then you should be able to use matplot to plot both in one step (you will need to bind the 2 sets of y values into a single matrix).
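The matplot suggestion above could look roughly like this. X1 and X2 are simulated stand-ins for the question's data, and the padding of 1 unit around the data range is an arbitrary choice for illustration. Because n, from and to match, both density objects share the same x grid and the y values can be bound into one matrix.

```r
# Simulated stand-ins for Data$X1 and Data$X2 (assumed to be numeric vectors)
set.seed(42)
X1 <- rnorm(200)
X2 <- rbeta(200, 1, 1)

# Common evaluation grid: same n, from and to in both calls
from <- min(X1, X2) - 1   # padding of 1 is an arbitrary illustrative choice
to   <- max(X1, X2) + 1
d1 <- density(X1, n = 512, from = from, to = to)
d2 <- density(X2, n = 512, from = from, to = to)

# One shared x vector, one column of y values per density
ymat <- cbind(d1$y, d2$y)
matplot(d1$x, ymat, type = "l", lty = 1, col = c("red", "green"),
        xlab = "value", ylab = "Density")
legend("topright", legend = c("X1", "X2"), col = c("red", "green"), lty = 1)
```

matplot picks the y-axis range from the whole matrix, so no manual ylim is needed here.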
Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the glm plot function has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve
3) predicted points
4) aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the real data survival rates for given levels of x1
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model and all sorts of other things, but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot along the same axes.
Besides this specific task, I often wonder how to superimpose different plots that show different variables but have a similar scale/range on two-dimensional plots. I would really appreciate your help.
The first column of AggBd is a factor, so you need to convert its levels to numeric before you can add the points to the plot.
AggBd$size <- as.numeric (levels (AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points:
points (AggBd$size, AggBd$x, pch = 3)
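For readers without Hmisc, here is a sketch of the same idea in base R only. It swaps cut2(..., g=5, levels.mean=T) for quantile-based bins built with cut() and computes the per-bin means with tapply(), so it is a stand-in for the approach above rather than the exact same code. The names bin, bin_x and bin_y are made up for this illustration.

```r
set.seed(1)
bodysize <- sort(rnorm(20, 30, 2))
survive  <- c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat <- data.frame(bodysize, survive)

g <- glm(survive ~ bodysize, family = binomial, data = dat)

# Five quantile-based bins (base-R stand-in for cut2(..., g = 5))
breaks  <- quantile(dat$bodysize, probs = seq(0, 1, by = 0.2))
dat$bin <- cut(dat$bodysize, breaks = breaks, include.lowest = TRUE)

# Per-bin mean body size (x position) and observed survival rate (y position)
bin_x <- tapply(dat$bodysize, dat$bin, mean)
bin_y <- tapply(dat$survive, dat$bin, mean)

plot(dat$bodysize, dat$survive, xlab = "Body size", ylab = "Probability of survival")
curve(predict(g, data.frame(bodysize = x), type = "response"), add = TRUE)
points(dat$bodysize, fitted(g), pch = 20)     # predicted points
points(bin_x, bin_y, pch = 3, col = "red")    # aggregated points, same axes
```

Because bin_x is already numeric, no conversion from factor levels is needed before calling points().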
You are best off specifying your y-axis explicitly, and perhaps also using par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
#then
par(new=TRUE)
#
plot(AggBd$Group.1,AggBd$x,pch=30)
Obviously, remove or change the axis ticks to prevent overlap, e.g.
plot(AggBd$Group.1,AggBd$x,pch=30,xaxt="n",yaxt="n",xlab="",ylab="")
I am using the sm package in R to draw a density plot of several variables with different sample sizes, like this:
var1 <- density(vars1[,1])
var2 <- density(vars2[,1])
var3 <- density(vars3[,1])
pdf(file="density.pdf",width=8.5,height=8)
plot(var1,col="BLUE")
par(new=T)
plot(var2,axes=FALSE,col="RED")
par(new=T)
plot(var3,axes=FALSE,col="GREEN")
dev.off()
The problem I'm having is that I want the y-axis to show the proportions so I can compare the different variables with each other in a more meaningful way. The maxima of all three density plots are now exactly the same, and I'm pretty sure that they wouldn't be if the y-axis showed proportions. Any suggestions? Many thanks!
Edit:
I just learned that I should not plot on top of an existing plot, so now the plotting part of the code looks like this:
pdf(file="density.pdf",width=8.5,height=8)
plot(var1,col="BLUE")
lines(var2,col="RED")
lines(var3,col="GREEN")
dev.off()
The maxima of those lines however are now very much in line with the sample size differences. Is there a way to put the proportions on the y-axis for all three variables, so the area under the curve is equal for all three variables? Many thanks!
Don't plot on top of an existing plot, because the axes may be different. Instead, use lines() to plot the second and third densities after plotting the first. If necessary, adjust the ylim parameter in plot() so that they all fit.
An example of how sample size ought not to matter:
set.seed(1)
D1 <- density(rnorm(1000))
D2 <- density(rnorm(10000))
D3 <- density(rnorm(100000))
plot(D1$x,D1$y,type='l',col="red",ylim=c(0,.45))
lines(D2$x,D2$y,lty=2,col="blue")
lines(D3$x,D3$y,lty=3,col="green")
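To check numerically that the curves are on a comparable scale, one can approximate the area under each density estimate. The sketch below uses a simple trapezoidal rule; the helper name area is made up for this illustration. Each area should come out close to 1 regardless of sample size.

```r
set.seed(1)
D1 <- density(rnorm(1000))
D2 <- density(rnorm(100000))

# Trapezoidal rule: interval widths times the average of adjacent heights
area <- function(d) sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

area(D1)  # approximately 1
area(D2)  # approximately 1, despite 100 times more observations
```

So the y-axis of a density plot is already normalised; if the curves differ a lot in height, that reflects their spread rather than the number of observations.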
You could make tim's solution a little more flexible by not hard-coding in the limits.
plot(D1$x,D1$y,type='l',col="red",ylim=c(0, max(sapply(list(D1, D2, D3),
function(x) {max(x$y)}))))
This would also cater for Vincent's point that the density functions are not necessarily constrained in their range.
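Taking this one step further, both axis limits can be derived from a list of density objects, which also covers the case where the curves span different x ranges. This is only a sketch: the shifted and rescaled samples, and the names dens and cols, are assumptions chosen to make the ranges differ.

```r
set.seed(1)
dens <- list(density(rnorm(1000)),
             density(rnorm(10000, mean = 2)),
             density(rnorm(100000, sd = 0.5)))
cols <- c("red", "blue", "green")

# Limits computed over all curves instead of being hard-coded
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- c(0, max(sapply(dens, function(d) max(d$y))))

# Empty frame with the combined limits, then one lines() call per density
plot(dens[[1]]$x, dens[[1]]$y, type = "n", xlim = xlim, ylim = ylim,
     xlab = "x", ylab = "Density")
for (i in seq_along(dens)) lines(dens[[i]], col = cols[i], lty = i)
```

Adding a fourth density then only requires appending to dens and cols; the limits adapt on their own.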