R - Violin plot x-axis names - r

I am creating violin plots with lots (will be ~100) of columns (violins). The problem is that the name of each column is very long. What I am doing current is as follows:
jpeg("stats/AllDistanceViolinPlot.jpg", width = 1000, height = 1000);
do.call(vioplot, c(lapply(data, na.omit),list(names=c("veryveryveryverylongname1", "veryveryveryverylongname2", "veryveryveryverylongname4", "veryveryveryverylongname4", "veryveryveryverylongname5", "veryveryveryverylongname6", "veryveryveryverylongname7", "veryveryveryverylongname8"))));
dev.off()
Which gives me this plot:
As you can see, the names of the columns are very long and some actually are not shown. I have also tried something without the list:
jpeg("stats/plot.jpg", width = 1000, height = 1000);
do.call(vioplot, c(lapply(data, na.omit)));
dev.off()
Which gives me this plot:
What I'd like is one of two things:
The names of the columns would be vertical so that they are shown and aren't cut off
or
Make the main plot like the second image I posted and have a separate legend that would correlate each column with the full name. For example, something like the following:
1 - veryveryveryverylongname1
2 - veryveryveryverylongname2
...
8 - veryveryveryverylongname8
Could someone please suggest the better way (or both) and comment on how to implement them?
Greatly appreciated.

Unfortunately the vioplot function in the vioplot package does not accept the usual base graphics parameters for modifying the orientation of axis annotation. You will need to make a new vioplot function and change this code:
if (!horizontal) {
if (!add) {
plot.window(xlim = xlim, ylim = ylim)
axis(2)
axis(1, at = at, label = label)
To this:
if (!horizontal) {
if (!add) {
plot.window(xlim = xlim, ylim = ylim)
axis(2)
axis(1, at = at, label = label , las=2)

Related

Move title of plots in a list of plots in R

I have a list of plots that I have assigned names to, and then converted to plot titles as suggested by https://stackoverflow.com/a/14790376/9335733. The titles happen to appear over the top x-axis title and so I attempt to move them as suggested here: https://stackoverflow.com/a/44618277/9335733. The overall code looks as follows:
lapply(names(Cast.files), function (x) plot(Cast.files[[x]],
main = x,
adj = 0, #adjust title to the farthest left
line =2.5 #adjust title up 2.5
)
)
It should be noted that plot is now converted from base R to the oce package for analyzing oceanographic data, but calls the same arguments from base R plot.
The problem becomes that in trying to move the title, the axis labels move as well and overlap. Any suggestions?
Edit: Here is what the image looks like before:
And after:
You might also want to look into the oma= argument in par(), which provides an "outer" margin which can be used to put a nice title. Something like:
library(oce)
data(ctd)
par(oma=c(0, 0, 1, 0))
plot(ctd)
title('Title', outer=TRUE)
This was solved by adding a title argument outside of the plot function as follows:
lapply(names(Cast.files), function (x) plot(Cast.files[[x]],
which = c("temperature", "salinity", "sigmaT","conductivity"),
Tlim = c(11,12),
Slim = c(29,32),
col = "red")
+ title(main = x, adj = 0.48, line = 3.5)#adding the titles at a specific location
)
This allowed for plots that looked like:
If you use the title function, rather than setting main within plot, it would allow you to change the line without affecting anything else in the plot.

How do I exclude parameters from an RDA plot

I'm still relatively inexperienced manipulating plots in R, and am in need of assistance. I ran a redundancy analysis in R using the rda() function, but now I need to simplify the figure to exclude unnecessary information. The code I'm currently using is:
abio1516<-read.csv("1516 descriptors.csv")
attach(abio1516)
bio1516<-read.csv("1516habund.csv")
attach(bio1516)
rda1516<-rda(bio1516[,2:18],abio1516[,2:6])
anova(rda1516)
RsquareAdj(rda1516)
summary(rda1516)
varpart(bio1516[,2:18],~Distance_to_source,~Depth, ~Veg._cover, ~Surface_area,data=abio1516)
plot(rda1516,bty="n",xaxt="n",yaxt="n",main="1516; P=, R^2=",
ylab="Driven by , Var explained=",xlab="Driven by , Var explained=")
The produced plot looks like this:
Please help me modify my code to: exclude the sites (sit#), all axes, and the internal dashed lines.
I'd also like to either expand the size of the field, or move the vector labels to all fit in the plotting field.
updated as per responses, working code below this point
plot(rda,bty="n",xaxt="n",yaxt="n",type="n",main="xxx",ylab="xxx",xlab="xxx
Overall best:xxx")
abline(h=0,v=0,col="white",lwd=3)
points(rda,display="species",col="blue")
points(rda,display="cn",col="black")
text(rda,display="cn",col="black")
Start by plotting the rda with type = "n" which generates an empty plot to which you can add the things you want. The dotted lines are hard coded into the plot.cca function, so you need either make your own version, or use abline to hide them (then use box to cover up the holes in the axes).
require(vegan)
data(dune, dune.env)
rda1516 <- rda(dune~., data = dune.env)
plot(rda1516, type = "n")
abline(h = 0, v = 0, col = "white", lwd = 3)
box()
points(rda1516, display = "species")
points(rda1516, display = "cn", col = "blue")
text(rda1516, display = "cn", col = "blue")
If the text labels are not in the correct position, you can use the argument pos to move them (make a vector as long as the number of arrows you have with the integers 1 - 4 to move the label down, left, up, or right. (there might be better solutions to this)

(R) Axis widths in gbm.plot

Hoping for some pointers or some experiences insight as i'm literally losing my mind over this, been trying for 2 full days to set up the right values to have a function spit out clean simple line plots from the gbm.plot function (packages dismo & gbm).
Here's where I start. bty=n in par to turn off the box & leave me with only left & bottom axes. Gbm.plot typically spits out one plot per explanatory variable, so usually 6 plots etc, but I'm tweaking it to do one per variable & looping it. I've removed the loop & lots of other code so it's easy to see what's going on.
png(filename = "whatever.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res = NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, bty="n", mgp=c(1.6,0.5,0))
gbm.plot(my_gbm_model,
n.plots=1,
plot.layout = c(1,1),
y.label = "",
write.title=F,
variable.no = 1, #this is part of the multiple plots thing, calls the explanatory variable
lwd=8, #this controls the width of the main result line ONLY
rug=F)
dev.off()
So this is what the starting condition looks like. Aim: make the axes & ticks thicker. That's it.
Putting "lwd=20" in par does nothing.
Adding axes=F into gbm.plot() turns the axes and their numbers off. So I conclude that the control of these axes is handled by gbm.plot, not par. Here's where it get's frustrating and crap. Accepted wisdom from searches says that lwd should control this but it only controls the wiggly centre line as per my note above. So maybe I could add axis(side=1, lwd=8) into gbm.plot() ?
It runs but inexplicably adds a smoother! (which is very thin & hard to see on the web but it's there, I promise). It adds these warnings:
In if (smooth & is.vector(predictors[[j]])) { ... :
the condition has length > 1 and only the first element will be used
Fine, R's going to be a dick for seemingly no reason, I'll keep plugging the leaks as they come up. New code with axis as before and now smoother turned off:
png(filename = "whatever.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res = NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, bty="n", mgp=c(1.6,0.5,0))
gbm.plot(my_gbm_model,
n.plots=1,
plot.layout = c(1,1),
y.label = "",
write.title=F,
variable.no = 1,
lwd=8,
rug=F,
smooth=F,
axis(side=1,lwd=8))
dev.off()
Gives error:
Error in axis(side = 1, lwd = 8) : plot.new has not been called yet
So it's CLEARLY drawing axes within plot since I can't affect the axes from par and I can turn them off in plot. I can do what I want and make one axis bold, but that results in a smoother and warnings. I can turn the smoother off, but then it fails because it says plot.new hadn't been called. And this doesn't even account for the other axis I have to deal with, which also causes the plot.new failure if I call 2 axis sequentially and allow the smoother.
Am I the butt of a big joke here, or am I missing something obvious? It took me long enough to work out that par is supposed to be before all plots unless you're outputting them with png etc in which case it has to be between png & plot - unbelievably this info isn't in ?par. I know I'm going off topic by ranting, sorry, but yeah, 2 full days. Has this been everyone's experience of plotting in R?
I'm going to open the vodka in the freezer. I appreciate I've not put the full reproducible code here, apologies, I can do if absolutely necessary, but it's such a huge timesuck to get to reproducible stage and I'm hoping someone can see a basic logical/coding failure screaming out at them from what I've given.
Thanks guys.
EDIT: reproducibility
core data csv: https://drive.google.com/file/d/0B6LsdZetdypkWnBJVDJ5U3l4UFU
(I've tried to make these data reproducible before and I can't work out how to do so)
samples<-read.csv("data.csv", header = TRUE, row.names=NULL)
my_gbm_model<-gbm.step(data=samples, gbm.x=1:6, gbm.y=7, family = "bernoulli", tree.complexity = 2, learning.rate = 0.01, bag.fraction = 0.5))
Here's what will widen your axis ticks:
..... , lwd.ticks=4 , ...
I predict on the basis of no testing because I keep getting errors with what limited code you have provided) that it will get handled correctly in either gbm.plot or in a subsequent axis call. There will need to be a subsequent axis call, two of them in fact (because as you noted 'lwd' gets passed around indiscriminately):
png(filename = "whatever.png",width=4*480, height=4*480, units="px", pointsize=80, bg="white", res = NA, family="", type="cairo-png")
par(mar=c(2.6,2,0.4,0.5), fig=c(0,1,0.1,1), las=1, bty="n", mgp=c(1.6,0.5,0))
gbm.plot(my_gbm_model,
n.plots=1,
plot.layout = c(1,1),
y.label = "",
write.title=F,
variable.no = 1,
lwd=8,
rug=F,
smooth=F, axes="F",
axis(side=1,lwd=8))
axis(1, lwd.ticks=4, lwd=4)
# the only way to prevent `lwd` from also affecting plot line
axis(2, lwd.ticks=4, lwd=4)
dev.off()
This is what I see with a simple example:
png(); Speed <- cars$speed
Distance <- cars$dist
plot(Speed, Distance,
panel.first = lines(stats::lowess(Speed, Distance), lty = "dashed"),
pch = 0, cex = 1.2, col = "blue", axes=FALSE)
axis(1, lwd.ticks=4, lwd=4)
axis(2, lwd.ticks=4, lwd=4)
dev.off()

Convenient wrapper for plots, lines, and points

One of the things that most bugs me about R is the separation of the plot, points, and lines commands. It's somewhat irritating to have to change plot to whatever variant for the first plot done, and to have to replot from scratch if you failed to have set the correct the ylim and xlim initially. Wouldn't it be nice to have one command that:
Picks lines, points or both via an argument, as in plot(..., type = "l") ?
By default, chooses whether to create a new plot, or add to an existing one according to whether the current device is empty or not.
Rescales axes automatically if the added elements to the plot exceed the current bounds.
Has anyone done anything like this? If not, and there's no strong reason why this isn't possible, I'll answer this myself in a bit...
Some possible functionality that may help with what you want:
The matplot function uses base graphics and will plot several sets of points or lines in one step, figuring out the correct ranges in one step.
There is an update method for lattice graphics that can be used to add/change things in the plot and will therefore result in automatic recalculation of things like limits and axes.
If you add additional information (useing +) to a ggplot2 plot, then the things that are automatically calculated will be recalculated.
You already found zoomplot and there is always the approach of writing your own function like you did.
Anyway, this is what I came up with: (It uses zoomplot from TeachingDemos)
fplot <- function(x, y = NULL, type = "l", new = NULL, xlim, ylim, zoom = TRUE,...){
require(TeachingDemos)
if (is.null(y)){
if (length(dim(x)) == 2){
y = x[,2]
x = x[,1]
} else {
y = x
x = 1:length(y)
}
}
if ( is.null(new) ){
#determine whether to make a new plot or not
new = FALSE
if (is.null(recordPlot()[[1]])) new = TRUE
}
if (missing(xlim)) xlim = range(x)
if (missing(ylim)) ylim = range(y)
if (new){
plot(x, y, type = type, xlim = xlim, ylim = ylim, ...)
} else {
if (type == "p"){
points(x,y, ...)
} else {
lines(x,y, type = type, ...)
}
if (zoom){
#rescale plot
xcur = par("usr")[1:2]
ycur = par("usr")[3:4]
#shrink coordinates and pick biggest
xcur = (xcur - mean(xcur)) /1.08 + mean(xcur)
ycur = (ycur - mean(ycur)) /1.08 + mean(ycur)
xlim = c(min(xlim[1], xcur[1]), max(xlim[2], xcur[2]))
ylim = c(min(ylim[1], ycur[1]), max(ylim[2], ycur[2]))
#zoom plot
zoomplot(xlim, ylim)
}
}
}
So you can do, e.g.
dev.new()
fplot(1:4)
fplot(1:4 +1, col = 2)
fplot(0:400/100 + 1, sin(0:400/10), type = "p")
dev.new()
for (k in 1:20) fplot(sort(rnorm(20)), type = "b", new = (k==1) )
par(mfrow) and log axis don't currently work well with zooming, but, it's a start...

ploting artefact with points over raster

I noticed some weird behavior when resizing the plot window. Consider
library(sp)
library(rgeos)
library(raster)
rst.test <- raster(nrows=300, ncols=300, xmn=-150, xmx=150, ymn=-150, ymx=150, crs="NA")
sap.krog300 <- SpatialPoints(coordinates(matrix(c(0,0), ncol = 2)))
sap.krog300 <- gBuffer(spgeom = sap.krog300, width = 100, quadsegs = 20)
shrunk <- gBuffer(spgeom = sap.krog300, width = -30)
shrunk <- rasterize(x = shrunk, y = rst.test)
shrunk.coords <- xyFromCell(object = rst.test, cell = which(shrunk[] == 1))
plot(shrunk)
points(shrunk.coords, pch = "+")
If you resize the window, plotted points get different extent compared to the underlying raster. If you resize the window and plot shrunk and shrunk.coords again, the plot turns out fine. Can anyone explain this?
If you plot directly with the RasterLayer method for plot the resize problem does not occur.
## gives an error, but still plots
raster:::.imageplot(shrunk)
points(shrunk.coords, pch = ".")
So it must be something in the original plot call before the .imageplot method is called.
showMethods("plot", classes = "RasterLayer", includeDefs = TRUE)
It does occur if we call raster:::.plotraster directly, and this is the function that calls raster:::.imageplot:
raster:::.plotraster(shrunk, col = rev(terrain.colors(255)), maxpixels = 5e+05)
points(shrunk.coords, pch = ".")
It is actually in the axis labels, not the image itself. See with this, this plots faithfully on resize:
raster:::.imageplot(shrunk)
abline(h = c(-80, 80), v = c(-80, 80))
But do it like this, and the lines are no longer at [-80, 80] after resize:
plot(shrunk)
abline(h = c(-80, 80), v = c(-80, 80))
So it is actually the points plotted after the raster that are showing incorrectly: the plot method keeps the aspect ratio fixed, so widening the plot doesn't "stretch" out the raster circle to an ellipse. But, it does something to the points that are added afterwards so the calls to par() must not be handled correctly (probably in raster:::.imageplot).
Another way of seeing the problem is to show that axis() does not know the space being used by the plot, which is the same problem you see when overplotting:
plot(shrunk)
axis(1, pos = 1)
When you resize the x-axis length the two axes are no longer synchronized.
Because you have a raster, try replacing plot() with image(). I had the same problem but this solved it for me.

Resources