How to plot in R

I am struggling with the following R code for plotting:
set.seed(1234)
x1=rnorm(100)^2
Index=1:length(x1)
cutoff=3
plot(Index,x1,type="h")
abline(h=cutoff,lty=2)
First, I also want to label the index of each value that is greater than the cutoff.
Second, I have five plots similar to the one above, and I am using
par(mfrow=c(3,2))
This gives a 3 x 2 grid, but I need the fifth plot to be centered.

The layout() function will be your best friend in terms of arranging the plots.
Check http://www.statmethods.net/advgraphs/layout.html for a good tutorial on the function.
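Both parts of the question can be sketched roughly as follows. This is only an illustration, not a definitive solution: the layout matrix uses four narrow columns so that the fifth plot can span the middle two (0 marks an empty cell), and text() labels the values above the cutoff with their index. The names above and m, the plot titles, and the choice to draw the same data five times are assumptions made for the example.

```r
set.seed(1234)
x1 <- rnorm(100)^2
Index <- seq_along(x1)
cutoff <- 3
above <- which(x1 > cutoff)          # indices of values greater than the cutoff

# Null device so the sketch runs non-interactively; remove to draw on screen
pdf(NULL)

# 3 rows x 4 columns: plots 1-4 each span two columns,
# plot 5 spans the two middle columns of the last row
m <- matrix(c(1, 1, 2, 2,
              3, 3, 4, 4,
              0, 5, 5, 0), nrow = 3, byrow = TRUE)
n_fig <- layout(m)                   # layout() returns the number of figures

for (k in 1:5) {
  plot(Index, x1, type = "h", ylim = c(0, max(x1) * 1.15),
       main = paste("Plot", k))
  abline(h = cutoff, lty = 2)
  # write each index just above its spike
  text(Index[above], x1[above], labels = above, pos = 3, cex = 0.7)
}
invisible(dev.off())
```

Calling layout.show(n_fig) right after layout(m) outlines the five regions, which is a quick way to check the arrangement before plotting.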

Related

save multiple plots in R as a .jpg file, how?

I am very new to R and I am using it for my probability class. I searched for this question here, but what I found does not seem to match what I want to do. (If there is already an answer, please point me to it.)
The problem is that I want to save multiple plots of histograms in the same file. For example, if I do this in the R prompt, I get what I want:
library(PASWR)
data(Grades)
attach(Grades) # Grades has gpa and sat variables
par(mfrow=c(2,1))
hist(gpa)
hist(sat)
So I get both histograms in the same plot. But if I want to save it as a jpeg:
library(PASWR)
data(Grades)
attach(Grades) # Grades has gpa and sat variables
par(mfrow=c(2,1))
jpeg("hist_gpa_sat.jpg")
hist(gpa)
hist(sat)
dev.off()
It saves the file, but with just one plot. Why? How can I fix this?
Thanks.
Also, if there is some good article or tutorial about how to plot with gplot and related stuff it will be appreciated, thanks.
Swap the order of these two lines:
par(mfrow=c(2,1))
jpeg("hist_gpa_sat.jpg")
so that you have:
jpeg("hist_gpa_sat.jpg")
par(mfrow=c(2,1))
hist(gpa)
hist(sat)
dev.off()
That way you are opening the jpeg device before doing anything related to plotting.
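The reason the order matters is that par() settings belong to the currently active device, and a newly opened device starts with default settings. A small sketch of this, using pdf(NULL) as a stand-in device so that no file is written (the variable names first and second are only for illustration):

```r
pdf(NULL)                  # first device (null PDF device, writes no file)
par(mfrow = c(2, 1))       # applies to this device only
first <- par("mfrow")      # 2 rows, 1 column

pdf(NULL)                  # second device starts with default settings
second <- par("mfrow")     # back to 1 row, 1 column

graphics.off()             # close both devices
```

In the original code, par(mfrow=c(2,1)) was applied to whatever device was active before jpeg() was called, so the jpeg device itself kept the default single-panel layout and each hist() call started a new page.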
You could also have a look at the function layout. With this, you can arrange plots more freely. This example gives you a 2 column layout of plots with 3 rows.
The first row is occupied by one plot, the second row by 2 plots and the third row again by one plot. This can come in very handy.
x <- rnorm(1000)
jpeg("normdist.jpg")
layout(mat=matrix(c(1,1,2,3,4,4),nrow=3,ncol=2,byrow=TRUE))
boxplot(x, horizontal=TRUE)
hist(x)
plot(density(x))
plot(x)
dev.off()
Check ?layout to see how the matrix 'mat' (layout's first argument) is interpreted.

Plotting distribution of differences in R

I have a dataset with numbers indicating daily difference in some measure.
https://dl.dropbox.com/u/22681355/diff.csv
I would like to create a plot of the distribution of the differences with special emphasis on the rare large changes.
I tried plotting each column using the hist() function but it doesn't really provide a detailed picture of the data.
For example plotting the first column of the dataset produces the following plot:
https://dl.dropbox.com/u/22681355/Rplot.pdf
My problem is that this gives very little detail to the infrequent large deviations.
What is the easiest way to do this?
Also any suggestions on how to summarize this data in a table? For example besides showing the min, max and mean values, would you look at quantiles? Any other ideas?
You could use boxplots to visualize the distribution of the data:
sdiff <- read.csv("https://dl.dropbox.com/u/22681355/diff.csv")
boxplot(sdiff[,-1])
Outliers are printed as circles.
I back @Sven's suggestion for identifying outliers, but you can get more refinement in your histograms by specifying a denser set of breakpoints than hist chooses by default.
d <- read.csv('https://dl.dropbox.com/u/22681355/diff.csv', header=TRUE, row.names=1)
with(d, hist(a, breaks=seq(min(a), max(a), length.out=100)))
Violin plots could be useful:
df <- read.csv('https://dl.dropbox.com/u/22681355/diff.csv')
library(vioplot)
with(df,vioplot(a,b,c,d,e,f,g,h,i,j))
I would use a boxplot on transformed data, e.g. a signed square root (written with sign() so that zero differences do not produce NaN):
boxplot(sign(df[,-1]) * sqrt(abs(df[,-1])))
A histogram would also look better after the transformation.
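As a rough sketch of the transformation idea and of a quantile-based summary table: a signed square root keeps the sign of each difference while compressing large magnitudes, so rare large changes stay visible without dominating the axis. The data below are made up for illustration (the linked file is not used), and signed_sqrt, d, and summary_probs are hypothetical names.

```r
set.seed(1)
# Made-up daily differences: mostly small changes plus a few rare large ones
d <- c(rnorm(500, sd = 1), 15, -12, 20)

# Signed square root: sign(x) * sqrt(|x|), well defined at zero
signed_sqrt <- function(x) sign(x) * sqrt(abs(x))

# Tail-focused summary: extreme quantiles alongside the usual ones
summary_probs <- c(0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99)
q <- quantile(d, probs = summary_probs)
print(q)

# Null device so the sketch runs non-interactively; remove to draw on screen
pdf(NULL)
boxplot(signed_sqrt(d), horizontal = TRUE,
        main = "Signed square root of differences")
invisible(dev.off())
```

Applying quantile() column by column, for example with sapply() over the data frame's numeric columns, would give one such row per series.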

R - logistic curve plot with aggregate points

Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the glm plot function has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve
3) predicted points
4) aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the observed survival rates for given levels of bodysize:
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model and all sorts of other things, but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot on the same axes.
Besides this specific task, I often wonder how to superimpose plots of different variables that have a similar scale or range on a two-dimensional plot. I would really appreciate your help.
The first column of AggBd is a factor, so you need to convert its levels to numeric before you can add the points to the plot.
AggBd$size <- as.numeric (levels (AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points:
points (AggBd$size, AggBd$x, pch = 3)
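The same idea can be sketched without Hmisc. This is a swapped-in variant, not the code from the question: it bins with base R's cut() on quintile breaks instead of cut2(), and computes the mean body size and the observed survival rate per bin with tapply(). The names brks, bin_size, and bin_rate, the seed, and the red colour are assumptions for the example.

```r
set.seed(1)
bodysize <- sort(rnorm(20, 30, 2))
survive <- c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat <- data.frame(bodysize, survive)

# Five bins with roughly equal counts (base R stand-in for cut2(..., g = 5))
brks <- quantile(dat$bodysize, probs = seq(0, 1, by = 0.2))
dat$bin <- cut(dat$bodysize, breaks = brks, include.lowest = TRUE)

# Numeric x (mean body size) and y (observed survival rate) for each bin
bin_size <- tapply(dat$bodysize, dat$bin, mean)
bin_rate <- tapply(dat$survive, dat$bin, mean)

g <- glm(survive ~ bodysize, family = binomial, data = dat)

# Null device so the sketch runs non-interactively; remove to draw on screen
pdf(NULL)
plot(dat$bodysize, dat$survive, xlab = "Body size",
     ylab = "Probability of survival")
curve(predict(g, data.frame(bodysize = x), type = "response"), add = TRUE)
points(dat$bodysize, fitted(g), pch = 20)            # predicted points
points(bin_size, bin_rate, pch = 3, col = "red")     # aggregate points
invisible(dev.off())
```

Because points() draws in the coordinate system of the existing plot, the aggregate points line up with the curve as long as their x values are numeric body sizes rather than factor codes.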
It is best to specify your y-axis explicitly. You could also use par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
#then
par(new=TRUE)
#
plot(AggBd$Group.1,AggBd$x,pch=30)
Obviously, remove or change the axis ticks to prevent overlap, e.g.
plot(AggBd$Group.1,AggBd$x,pch=30,xaxt="n",yaxt="n",xlab="",ylab="")

In R, how to plot bars only in a certain interval of data?

My problem is very simple.
I have to plot a data series in R, using bars. Data are contained in a vector vet.
I've used barplot, that plots my data from the first to the last:
barplot(vet), and everything was fine.
Now, however, I would like to plot only part of my data: from 10% to the end.
How could I do this with barplot()?
How could I do this with plot()?
Thanx
You need to subset your data before plotting:
##Work out the 10% quantile and subset
v = vet[vet > quantile(vet, 0.1)]
It is not clear exactly what you want to do.
If you want to plot only a subset of the bars (but the whole bars) then you could just subset the data before passing it to barplot.
If you want to plot all the bars, but only the part of each bar beyond 10% (not including 0), then you can do this by setting the ylim argument. However, a barplot whose axis does not include 0 is strongly discouraged. You may be better off using a dotplot instead of a barplot if 0 is not meaningful.
If you want the regular plot, but want to exclude plotting outside of a given window within the plot then the clip function may be what you want.
The gap.barplot function from the plotrix package may also be what you want.
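Since "from 10% to the end" can be read in more than one way, here is a small sketch of two subsetting interpretations. The vector vet below is made up for illustration, and start, tail_part, and by_value are hypothetical names.

```r
# Made-up data standing in for the question's vector
vet <- c(12, 7, 3, 9, 15, 4, 8, 11, 6, 14, 2, 10, 5, 13, 1, 16, 18, 17, 20, 19)

# Interpretation 1: skip the first 10% of the elements by position
start <- floor(0.1 * length(vet)) + 1
tail_part <- vet[start:length(vet)]

# Interpretation 2: keep only values above the 10% quantile
by_value <- vet[vet > quantile(vet, 0.1)]

# Null device so the sketch runs non-interactively; remove to draw on screen
pdf(NULL)
barplot(tail_part, main = "Elements after the first 10%")
plot(by_value, type = "h", main = "Values above the 10% quantile")
invisible(dev.off())
```

With plot(), type = "h" draws vertical lines from zero, which is the closest base-graphics equivalent to bars.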

for loop generates multiple plots

I have a question about the for loop in combination with the plot function.
I want to use a for loop (see below) to plot multiple points in one plot, but my loop generates a separate plot for each point. So with 35 rows I get 35 plots. Is there a way to plot all the points in the same plot?
pdf("test plot.pdf")
for (i in 1:nrow(MYC)){
plot(MYC[i,1], MYC[i,2])
}
dev.off()
Thank you all!
As mentioned in the comments, you are in essence creating a new plot on every iteration of the loop. R doesn't understand that you actually want to add only points. There's a cure for that, and it comes in vials of points(). Before the loop, construct an empty plot using type = "n", something along the lines of:
plot(your.data, type = "n")
You can then use your loop (with points) to add points to this existing plot.
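Put together, the approach might look like the sketch below. MYC here is a made-up 35 x 2 matrix standing in for the question's data, and the output goes to a temporary file rather than "test plot.pdf"; both are assumptions for the example.

```r
set.seed(42)
MYC <- cbind(x = runif(35), y = runif(35))   # hypothetical stand-in data

out <- tempfile(fileext = ".pdf")
pdf(out)

# Empty plot: sets up the axes from the full data range, draws no points
plot(MYC, type = "n", xlab = "x", ylab = "y")

# Each iteration adds one point to the existing plot instead of starting a new one
for (i in seq_len(nrow(MYC))) {
  points(MYC[i, 1], MYC[i, 2])
}

invisible(dev.off())
```

If nothing else happens per iteration, the loop is not needed at all: plot(MYC[, 1], MYC[, 2]) draws all points in one call, since plot() is vectorized over its x and y arguments.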
