r program side-by-side boxplots - r

I have three different boxplots,
k1<-boxplot(decreased$Group.1)
k2<-boxplot(unchanged$Group.1)
k3<-boxplot(created$Group.1)
Is there a way to draw them side by side in one plot, or do I have to combine the columns into one table and use the ~ formula interface to get them side by side?

It can be done, but you will need to work with the xlim, ylim, at and add arguments.
See this example:
boxplot(1:10, xlim=c(1,6), ylim=c(0,20), at=1.5)
boxplot(2:10, add=TRUE, at=3.5)
boxplot(3:20, add=TRUE, at=5.5)
So, you need to set the x-limits and y-limits on the first plot, along with the location where the first boxplot is drawn (specified by at). Each subsequent boxplot then needs its own location (again via at) and also the add=TRUE argument.
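If manual positioning with at is not required, another option is to pass the three vectors to boxplot() as a named list, which draws them side by side in one call. The sketch below assumes the question's data frames decreased, unchanged and created each have a numeric Group.1 column. The values here are invented placeholders, not the asker's data.

```r
# Placeholder data standing in for the question's three data frames
# (the numbers are made up for illustration only).
set.seed(42)
decreased <- data.frame(Group.1 = rnorm(20, mean = 5))
unchanged <- data.frame(Group.1 = rnorm(30, mean = 6))
created   <- data.frame(Group.1 = rnorm(15, mean = 7))

pdf(NULL)  # null device so no file is written; drop this line for interactive use

# A named list gives one box per element, labelled with the element name.
bp <- boxplot(list(decreased = decreased$Group.1,
                   unchanged = unchanged$Group.1,
                   created   = created$Group.1),
              ylab = "Group.1")

invisible(dev.off())
```

The list form also chooses xlim and ylim automatically, so the limits do not have to be worked out by hand.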

Related

R plot and barplot how to fix ylim not alike?

I am trying to use base R to plot a time series both as a bar plot and as an ordinary line plot. I want to write a flexible function that draws such a plot without axes and then adds a common axis manually.
Now I am hampered by a strange problem: the same ylim values result in different axes. Consider the following example:
data(presidents)
# shorten this series a bit
pw <- window(presidents,start=c(1965))
barplot(t(pw),ylim = c(0,80))
par(new=T)
plot(pw,ylim = c(0,80),col="blue",lwd=3)
I intentionally draw the y-axes from both plots here to show that they are not the same. I know I can achieve the intended result by drawing the bar plot first and then adding lines using the x and y arguments of lines().
But I am looking for a flexible solution that lets you add lines to bar plots the way you add lines to points or other line plots. So is there a way to make sure the y-axes are the same?
EDIT: also adding the usr parameter to par doesn't help me here.
par(new=T,usr = par("usr"))
Add yaxs="i" to your lineplot. Like this:
plot(pw,ylim = c(0,80),col="blue",lwd=3, yaxs="i")
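R draws bar plots with the y-axis starting exactly at the lower ylim value (here y=0), while line plots by default extend the axis range by 4% at each end (yaxs="r"). The padding makes sure that you still see a line if your data happen to lie at y=0; otherwise the line would coincide with the x-axis line. Setting yaxs="i" removes the padding, so both plots use the same y range.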
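The effect of yaxs can be checked by inspecting par("usr"), which holds the actual extent of the plotting region. This is a small sketch with arbitrary data (1:10) rather than the presidents series. The variable names usr_regular and usr_internal are made up for the illustration.

```r
pdf(NULL)  # null device: nothing is written to disk; omit for interactive use

# Default style "r": the requested range is extended by 4% at each end.
plot(1:10, ylim = c(0, 80))
usr_regular <- par("usr")[3:4]

# Style "i": the axis spans exactly the requested ylim.
plot(1:10, ylim = c(0, 80), yaxs = "i")
usr_internal <- par("usr")[3:4]

invisible(dev.off())

print(usr_regular)   # lower and upper y extent with padding
print(usr_internal)  # lower and upper y extent without padding
```

Note that this only aligns the y-axes. The x coordinates of a bar plot are bar positions, not time values, so the x-axes of the two plots may still differ.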

Produce a legend in R

In R I often add legends to my plots like this
legend("topright",c("a=1","b=1"),lwd=c(1,2))
However, what I want to do is produce a plot that contains nothing but that legend. How do I do it? (Preferably without using a package such as ggplot.)
You can generate a new, empty plot frame using frame() or plot.new()
plot.new()
legend("topright",c("a=1","b=1"),lwd=c(1,2))
Alternatively, use the type='n' parameter, as in:
plot(x,y,type='n')
See ?plot.default for details. If you will want to add some text/points/lines to the plot afterward you may want to provide the x and y parameters, and/or the ylim and xlim parameters in order to set up the plotting region.
You can also drop the axes with the argument axes=FALSE, and you can set xlab, ylab, and main to NA if you really want a blank plot.
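Putting the first suggestion together, a legend-only figure might be produced as in the sketch below. Shrinking the margins to zero and centering the legend are choices made for this illustration, not part of the original answers.

```r
pdf(NULL)  # null device so no file is written; omit for interactive use

op <- par(mar = c(0, 0, 0, 0))  # no margins: the legend is the only content
plot.new()                      # empty plot frame, no axes or labels
lg <- legend("center", legend = c("a=1", "b=1"), lwd = c(1, 2))
par(op)                         # restore the previous margins

invisible(dev.off())
```

legend() invisibly returns the position of the legend box and of its text entries, which can help when other elements need to be placed relative to it.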

R Programming on Graphics

Suppose the following R code produces a figure containing four graphs. There is a lot of space between the graphs. How can I reduce the space between them? Secondly, how can I show axis names only on the outer side, i.e., remove the x-axis label from the first and second graphs?
getOption("device")()
par(mfrow =c(2,2))
x<-seq(0.01,10,by=0.01)
plot(x,2*x)
plot(x,sin(x))
plot(x,cos(x))
plot(x,x^3)
Try setting the margins with par(mar=...). The x-axis labels of the first and second plots are removed using the argument xlab="":
par(mfrow =c(2,2), mar=c(4,4,1,1))
x<-seq(0.01,10,by=0.01)
plot(x,2*x, xlab="") # here the label for the x axis is removed
plot(x,sin(x), xlab="") # and here as well
plot(x,cos(x))
plot(x,x^3)
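To tighten the layout further, the top row can also drop its x-axis ticks with xaxt="n", and a single shared label can be written into an outer margin with mtext(). The sketch below is one possible arrangement. The specific mar and oma values are guesses to be tuned by eye.

```r
pdf(NULL)  # null device so no file is written; omit for interactive use

# Small inner margins; an outer bottom margin (oma) holds the shared x label.
op <- par(mfrow = c(2, 2), mar = c(2, 4, 1, 1), oma = c(3, 0, 0, 0))
x <- seq(0.01, 10, by = 0.01)

plot(x, 2 * x,  xlab = "", xaxt = "n")  # top row: no x label, no x ticks
plot(x, sin(x), xlab = "", xaxt = "n")
plot(x, cos(x), xlab = "")              # bottom row keeps its x ticks
plot(x, x^3,    xlab = "")

mtext("x", side = 1, outer = TRUE, line = 1)  # one label for the whole figure

layout_used <- par("mfrow")
par(op)
invisible(dev.off())
```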

Plotting histograms with R; y axis keeps changing to frequency from proportion/probability

I am trying to overlay two histograms in the same plot, but the option probability=TRUE (relative frequencies) in hist() has no effect with the code below. This is a problem because the two samples have very different sizes (length(cl1)=9 and length(cl2)=339) and, with this script, I cannot visualize differences between the two histograms because each shows frequencies. How can I overlay two histograms with the same bin width, showing relative frequencies?
c1<-hist(dataList[["cl1"]],xlim=range(minx,maxx),breaks=seq(minx,maxx,pasx),col=rgb(1,0,0,1/4),main=paste(paramlab,"Group",groupnum,"cl1",sep=" "),xlab="",probability=TRUE)
c2<-hist(dataList[["cl2"]],xlim=range(minx,maxx),breaks=seq(minx,maxx,pasx),col=rgb(0,0,1,1/4),main=paste(paramlab,"Group",groupnum,"cl2",sep=" "),xlab="",probability=TRUE)
plot(c1, col=rgb(1,0,0,1/4), xlim=c(minx,maxx), main=paste(paramlab,"Group",groupnum,sep=" "),xlab="")# first histogram
plot(c2, col=rgb(0,0,1,1/4), xlim=c(minx,maxx), add=T)
cl1Col <- rgb(1,0,0,1/4)
cl2Col <- rgb(0,0,1,1/4)
legend('topright',c('Cl1','Cl2'),
fill = c(cl1Col , cl2Col ), bty = 'n',
border = NA)
Thanks in advance for your help!
When you call plot on an object of class histogram (like c1), it calls the S3 method for the histogram. Namely, plot.histogram. You can see the code for this function if you type graphics:::plot.histogram and you can see its help under ?plot.histogram. The help file for that function states:
freq logical; if TRUE, the histogram graphic is to present a
representation of frequencies, i.e, x$counts; if FALSE, relative
frequencies (probabilities), i.e., x$density, are plotted. The default
is true for equidistant breaks and false otherwise.
So, when plot renders a histogram, it doesn't use the previously specified probability or freq arguments; it tries to figure it out for itself. The reason for this is obvious if you dig around inside c1: it contains all of the data necessary for the plot, but does not specify how it should be rendered.
So, the solution is to reiterate the argument freq=FALSE when you run the plot functions. Notably, freq=FALSE works whereas probability=TRUE does not because plot.histogram does not have a probability option. So, your plot code will be:
plot(c1, col=rgb(1,0,0,1/4), xlim=c(minx,maxx), main=paste(paramlab,"Group",groupnum,sep=" "),xlab="",freq=FALSE)# first histogram
plot(c2, col=rgb(0,0,1,1/4), xlim=c(minx,maxx), add=T, freq=FALSE)
This all seems like an oversight/idiosyncratic decision (or lack thereof) on the part of the R devs. To their credit, it is appropriately documented and is not "unexpected behavior" (although I certainly didn't expect it). I wonder where such oddness should be reported, if it should be reported at all.
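A self-contained way to try this is sketched below. It uses simulated samples of sizes 9 and 339 in place of dataList[["cl1"]] and dataList[["cl2"]], and a bin width of 0.5 in place of pasx. Those values are assumptions for the illustration. The histograms are computed with plot=FALSE and then drawn with freq=FALSE so that both show densities.

```r
# Simulated stand-ins for the question's two samples (sizes match the question).
set.seed(1)
cl1 <- rnorm(9,   mean = 0)
cl2 <- rnorm(339, mean = 1)

# Common breaks covering both samples, so the bins line up.
all_values <- c(cl1, cl2)
brks <- seq(floor(min(all_values)), ceiling(max(all_values)), by = 0.5)

h1 <- hist(cl1, breaks = brks, plot = FALSE)
h2 <- hist(cl2, breaks = brks, plot = FALSE)

pdf(NULL)  # null device so no file is written; omit for interactive use

# freq = FALSE must be given to plot(), which uses the $density component.
plot(h2, freq = FALSE, col = rgb(0, 0, 1, 1/4), xlab = "", main = "cl1 vs cl2",
     ylim = c(0, max(h1$density, h2$density)))
plot(h1, freq = FALSE, col = rgb(1, 0, 0, 1/4), add = TRUE)
legend("topright", c("cl1", "cl2"),
       fill = c(rgb(1, 0, 0, 1/4), rgb(0, 0, 1, 1/4)), bty = "n", border = NA)

invisible(dev.off())
```

Because each histogram is scaled so that its bars integrate to 1, the small and the large sample become directly comparable.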

plot multiple line segments on one graph using R

How can I duplicate this style of graph, with multiple plots on one graph, and, preferably, legends attached as below.
I have tried the concept of "facet" but ggplot2 and trellis:xyplot both think of facets as separate panels rather than overlaid plots.
I can do it using plain Jane plot() and lines(), but I was using ggplot2 and would like to get multiple lines on one plot in that package.
Here is some example data in long form (captured from the plot using a nifty app called "Graphclick")
comp <- read.table(pipe("pbpaste"), header=T, sep=',')
company, year, sales
Apple,1975.003,17298.457
Apple,1977.302,16784.502
Apple,1978.314,17298.457
Apple,1980.246,20730.098
Apple,1981.533,27608.426
Apple,1984.293,40862.852
Apple,1986.408,50468.617
Apple,1987.328,48236.188
Apple,1988.892,35676.547
Apple,1989.904,34616.582
Apple,1991.192,44732.742
Apple,1992.387,44732.742
Apple,1993.399,39055.324
Apple,1995.791,37894.922
Apple,1996.895,39648.746
Apple,1998.274,52804.367
Apple,1999.378,61399.512
Apple,2001.770,2.350e5
Apple,2005.265,7.735e5
Toshiba,1999.378,86856.6
Toshiba,2001.862,1.192e5
Toshiba,2004.069,1.495e5
Toshiba,2004.069,1.495e5
IBM,1975.003,22019.092
IBM,1975.830,27195.193
IBM,1976.934,30682.320
IBM,1978.130,31148.527
IBM,1980.430,35676.547
IBM,1981.625,35676.547
IBM,1983.005,39648.746
IBM,1985.305,40862.852
IBM,1986.408,46102.508
IBM,1987.512,64241.156
IBM,1989.996,75832.898
IBM,1991.100,84276.039
IBM,1992.295,85556.641
IBM,1993.307,79342.539
IBM,1994.779,79342.539
IBM,1995.791,84276.039
IBM,1996.895,95082.484
IBM,1996.895,95082.484
Commodore,1975.003,33588.051
Commodore,1975.830,34616.582
Commodore,1977.118,25219.982
Commodore,1978.130,23388.229
Commodore,1979.326,25992.234
Commodore,1980.521,21689.514
Commodore,1981.717,25219.982
Commodore,1984.201,6999.029
Commodore,1985.213,1670.460
Commodore,1986.408,1458.447
(source: asymco.com)
If you're looking for the most control, you could just use the low-level plot and lines commands. Use "plot" to generate the first graph (with title, xlimits, and ylimits), then use "lines" to add lines to that graph.
plot(0,type="n", xlim=c(0,10), ylim=c(0,10), xlab="X Label", ylab="Y Label", main="Title")
Then add lines using the lines command:
lines(1:10, 1:10, type="l", lty=2)
lines(2:4, 10:8, col=2, type="l")
lines(6:9, c(5,6,5,6), col=3, type="l")
You can fine-tune the look by using all of the parameters listed in the "par" help file ("?par")
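Applied to the company data from the question, the plot() plus lines() approach might look like the following sketch. Only a handful of rows copied from the question's table are used to keep the example short, and the log-scaled y-axis and legend position are choices made here for readability.

```r
# A few rows copied from the question's data (the full table is longer).
comp <- read.csv(text = "company,year,sales
Apple,1975.003,17298.457
Apple,1986.408,50468.617
Apple,2005.265,7.735e5
IBM,1975.003,22019.092
IBM,1986.408,46102.508
IBM,1996.895,95082.484
Commodore,1975.003,33588.051
Commodore,1981.717,25219.982
Commodore,1986.408,1458.447")

companies <- as.character(unique(comp$company))

pdf(NULL)  # null device so no file is written; omit for interactive use

# Empty frame sized to hold every series, then one lines() call per company.
plot(range(comp$year), range(comp$sales), type = "n", log = "y",
     xlab = "Year", ylab = "Sales", main = "Sales by company")
for (i in seq_along(companies)) {
  d <- comp[comp$company == companies[i], ]
  lines(d$year, d$sales, col = i, lwd = 2)
}
legend("topleft", legend = companies, col = seq_along(companies), lwd = 2)

invisible(dev.off())
```

Computing the ranges from the whole data frame up front is what keeps later series from being clipped by limits derived from the first one.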
So, in ggplot2, this code works:
qplot(year, sales, data=comp, colour=as.factor(company), group= company, geom="path", log="y")
The only things left now are to format the values on the Y axis as plain numbers (not scientific notation) and to put the labels on the plot itself rather than in an off-graph legend. Final suggestions welcomed.
This is a lot easier in the end than plot() + lines(), as that required support code to get the ranges, iterate over the group levels etc.
