Since I updated R to version 3.5.2, my pirateplots (yarrr library) show more y-axis ticks than before. Previously they only showed ticks at whole-number values (1, 2, 3, etc.), but now they also show ticks at .5 values (1, 1.5, 2, 2.5, 3, etc.). This is despite the fact that the data and scripts are exactly the same as before.
Do you know how I can remove the .5 value ticks and make the plots look like they looked before?
I think the yaxt.y argument is what you're looking for as it allows you to override the default y axis construction.
Here's a plot using default arguments
pirateplot(weight ~ Diet, data = ChickWeight)
Original plot
Now with the yaxt.y argument
pirateplot(weight ~ Diet, data = ChickWeight, yaxt.y = seq(0, 400, 25))
Second version
You can also specify the grid line widths with gl.lwd:
pirateplot(weight ~ Diet, data = ChickWeight, yaxt.y = seq(0, 400, 25), gl.lwd = c(.5, 1.5), gl.col = "black")
Third version
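For the whole-number ticks asked about in the question, one possible sketch is to build the tick vector from the data range and pass it to yaxt.y. The data frame `ratings` and its columns are made up for illustration, and the plotting call assumes the yarrr package is installed.

```r
# Made-up example data (hypothetical): scores in two groups
ratings <- data.frame(
  group = rep(c("a", "b"), each = 20),
  score = c(seq(1, 3, length.out = 20), seq(1.5, 3.5, length.out = 20))
)

# Whole-number tick positions spanning the data range
ticks <- seq(floor(min(ratings$score)), ceiling(max(ratings$score)), by = 1)

# Draw the plot only if yarrr is available
if (requireNamespace("yarrr", quietly = TRUE)) {
  yarrr::pirateplot(score ~ group, data = ratings, yaxt.y = ticks)
}
```

Deriving the ticks from the data keeps the call reusable when the range changes, instead of hard-coding a sequence.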
Hope this helps!
Related
In the following example, the last x-axis label ("4.0") is omitted.
df <- data.frame(x = c(1, 2, 3.8), y = c(1, 2, 3))
#png(filename = "cutoff.png")
plot(df$x, df$y, xaxt = "n")
axis(side = 1, at = seq(0, 4, 0.5), labels = seq(0, 4, 0.5))
#dev.off()
How to prevent this behaviour?
Your axis limits do not include 4; you need to override the default limits of the plot (which it derives from the data) using xlim:
plot(df$x, df$y, xaxt = "n", xlim = c(1, 4))
Note that when you call axis, the values in at are used as the labels unless you override them, so you don't need to specify labels. The call can become:
axis(side = 1, at = seq(0, 4, 0.5))
As @griffinevo answered (+1), if you want the axis limits to go to 4, you must specify that using xlim. However, it is probably worth explaining how the default limits are computed. This is explained in the documentation, but in a slightly obscure place. On the help page ?par, search for xaxs. There you will see:
Style "r" (regular) first extends the data range by 4 percent at each
end and then finds an axis with pretty labels that fits within the
extended range.
In your case, the data ranges from 1 to 3.8. So plot will look for pretty labels inside the range
1 - 0.04*(3.8-1) = 0.888
to
3.8 + 0.04*(3.8-1) = 3.912
4 is outside of this range and so will not appear as an axis label. For completeness, it is worth noting that "pretty" sounds like just a word, but it actually has a technical meaning here, related to the pretty function. If you look at the help page ?pretty, you will see the description:
Compute a sequence of about n+1 equally spaced ‘round’ values which
cover the range of the values in x. The values are chosen so that they
are 1, 2 or 5 times a power of 10.
There is additional detail on the help page.
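The 4 percent rule quoted above can be checked with a few lines of arithmetic. This is only a sketch of the calculation described in ?par, using the x values from the question; the variable names are made up for illustration.

```r
x <- c(1, 2, 3.8)

# Style "r" pads the data range by 4% of its width at each end
pad <- 0.04 * diff(range(x))
extended <- range(x) + c(-pad, pad)
extended

# Is a tick at 4 inside the default extended range?
inside_default <- 4 >= extended[1] && 4 <= extended[2]

# With xlim = c(1, 4) the same padding is applied to the requested limits
pad_xlim <- 0.04 * diff(c(1, 4))
extended_xlim <- c(1, 4) + c(-pad_xlim, pad_xlim)
inside_xlim <- 4 >= extended_xlim[1] && 4 <= extended_xlim[2]

inside_default
inside_xlim
```

After an actual plot() call, par("usr")[1:2] reports the x range that was used, which is a quick way to confirm the limits.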
I have plots that are .25 ha and I need my data to be displayed as 1 ha. I'm trying to make the following graph but multiplying the counts by 4 (so I have a full hectare instead of a quarter). However, all posts seem to deal with changing axis titles, values, etc., but I need to change the actual histogram frequency counts.
Histogram x-variable in size classes plotted by factor variable
ggplot(liveTrees, aes(diam1DBH)) +
geom_histogram(binwidth =10) +
facet_wrap(~site) +
ggtitle("Stems/0.25ha by Size Class") +
ylab("Stems/0.25ha") +
xlab("Diameter Class")
liveTrees = my data
diam1DBH = diameter (numeric, continuous)
site = plot location (factor)
The code above is my original code. What I've tried:
for (i in 1:length(unique(liveTrees$site))) {
test<-hist(liveTrees[liveTrees$site== unique(liveTrees$site)[i], "diam1DBH"], plot = F)
b <- barchart(test$counts*4, width = 10, xlim=c(0,350), cex.axis = 0.85)
axis(side = 1, at = "b", cex.axis = 0.85)
}
But I keep getting
Error in axis(side = 1, at = "b", cex.axis = 0.85) : no locations are finite
In addition: Warning message:
In axis(side = 1, at = "b", cex.axis = 0.85) : NAs introduced by coercion
So, with this I can get the counts, but the numbers aren't right and they're not in a useful format.
My data is a data.frame, example: data example
What I need is the sum of each diameter class, each bin frequency amount, multiplied by 4. I've been trying to do this but can't get it to work, any help is appreciated!
If you multiply the frequencies by 4, the values will change but the graphs will still look the same. So there are two options: one is to simply change the axis value labels; the other, simpler way is to add the data 4 times. For example:
ggplot(rbind(data, data,data,data), aes(variable_X)) + geom_histogram(binwidth =10)
This way the data is multiplied, and no new data.frame is made that could confuse analysis later on.
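A different route, not part of the answer above, is to scale the computed bin counts inside the plot itself. The sketch below assumes ggplot2 3.3.0 or later (for after_stat()); the data frame `trees` is a made-up stand-in for liveTrees.

```r
library(ggplot2)

# Made-up stand-in for liveTrees (hypothetical values)
trees <- data.frame(
  diam1DBH = c(5, 12, 14, 23, 27, 28, 35, 41),
  site = rep(c("A", "B"), times = 4)
)

p <- ggplot(trees, aes(diam1DBH)) +
  # after_stat(count) is the per-bin count computed by the histogram stat;
  # multiplying by 4 converts stems per 0.25 ha to stems per ha
  geom_histogram(aes(y = after_stat(count) * 4), binwidth = 10) +
  facet_wrap(~site) +
  ylab("Stems/ha") +
  xlab("Diameter Class")
print(p)
```

This leaves the data frame untouched and makes the scaling factor visible in the plotting code.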
Using this example:
x<-mtcars;
barplot(x$mpg);
you get a bar plot with many bars on a y axis running from 0 to 30.
My question is how can you adjust it so that the y axis is (10-30) with a split at the bottom indicating that there was data below the cut off?
Specifically, I want to do this in base R using only the barplot function and not functions from plotrix (unlike the suggestions already provided). Is this possible?
This is not recommended. It is generally considered bad practice to chop off the bottoms of bars. However, if you look at ?barplot, it has a ylim argument which can be combined with xpd = FALSE (which turns on "clipping") to chop off the bottom of the bars.
barplot(mtcars$mpg, ylim = c(10, 30), xpd = FALSE)
Also note that you should be careful here. I followed your question and used 10 and 30 as the y-bounds, but the maximum mpg is 33.9, so I also clipped the top of the 4 bars that have values > 30.
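To avoid cutting off the tall bars, the upper limit can be derived from the data instead of being typed in. A minimal sketch, where rounding up to the next multiple of 5 is just one arbitrary choice:

```r
mpg <- mtcars$mpg

# Round the maximum up to the next multiple of 5 so no bar is cut off at the top
upper <- ceiling(max(mpg) / 5) * 5

# The lower bound of 10 still clips the bottoms of the bars
barplot(mpg, ylim = c(10, upper), xpd = FALSE)
```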
The only way I know of to make a "split" in an axis is using plotrix. So, based on
Specifically, I want to do this in base R using only the barplot function and not functions from plotrix (unlike the suggestions already provided). Is this possible?
the answer is "no, this is not possible" in the sense that I think you mean. plotrix certainly does it, and it uses base R functions, so you could do it however they do it, but then you might as well use plotrix.
You can plot on top of your barplot, perhaps a horizontal dashed line (like below) could help indicate that you're breaking the commonly accepted rules of what barplots should be:
abline(h = 10.2, col = "white", lwd = 2, lty = 2)
The resulting image is below:
Edit: You could use segments to spoof an axis break, something like this:
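barplot(mtcars$mpg, ylim = c(10, 30), xpd = FALSE)
xbase <- -1.5             # x position of the break marks (adjusted by hand)
xoff <- 0.5               # half-width of each break mark
ybase <- c(10.3, 10.7)    # heights of the two break marks
yoff <- 0                 # vertical slant of the marks (0 = horizontal)
segments(x0 = xbase - xoff, x1 = xbase + xoff,
         y0 = ybase - yoff, y1 = ybase + yoff, xpd = TRUE, lwd = 2)
abline(h = mean(ybase), lwd = 2, lty = 2, col = "white")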
As-is, this is pretty fragile: the xbase was adjusted by hand, as it will depend on the range of your data. You could switch the barplot to xaxs = "i" and set xbase = 0 for more predictability, but why not just use plotrix, which has already done all this work for you?!
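One way to reduce the hand-tuning, offered here as an assumption rather than as part of the answer above, is to read the left edge of the plot region from par("usr") after barplot() has drawn, and to size the marks relative to the plot width. The 1.5 percent half-width is an arbitrary choice.

```r
barplot(mtcars$mpg, ylim = c(10, 30), xpd = FALSE)

# par("usr") gives c(x1, x2, y1, y2) of the plot region in user coordinates;
# the y axis is drawn at the left edge, x1
usr <- par("usr")
xbase <- usr[1]
xoff <- 0.015 * (usr[2] - usr[1])   # half-width of each break mark
ybase <- c(10.3, 10.7)              # heights of the two break marks

# Two short horizontal marks across the axis line
segments(x0 = xbase - xoff, y0 = ybase, x1 = xbase + xoff, y1 = ybase,
         xpd = TRUE, lwd = 2)
abline(h = mean(ybase), lwd = 2, lty = 2, col = "white")
```

The y positions are still chosen by hand, so this remains a workaround and not a general axis-break function.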
ggplot: In comments you said you don't like the look of ggplot. This is easily customized, e.g.:
library(ggplot2)
# mtcars has no id column, so one is added here from the row index (an assumption)
x$id <- seq_len(nrow(x))
ggplot(x, aes(y = mpg, x = id)) +
  geom_bar(stat = "identity", color = "black", fill = "gray80", width = 0.8) +
  theme_classic()
The data for some of these types of graphs that I'm graphing in R,
http://graphpad.com/faq/images/1352-1(1).gif
has outliers that are way out of range and I can't just exclude them. I attempted to use the axis.break() function from plotrix, but the function doesn't rescale the y axis. It just places a break mark on the axis. The purpose of doing this is to be able to show the medians for both groups, as well as the data points and the outliers, all in one plot frame. Essentially, the data points that are far from the majority are taking up a chunk of space, and the majority of points are being squished, so the differences between them are hard to see. Here is the code:
https://gist.github.com/9bfb05dcecac3ecb7491
Any suggestions would be helpful.
Thanks
Unfortunately the code you link to isn't self-contained, but possibly the code you have for gap.plot() there doesn't work as you expect because you are setting ylim to cover the full data range rather than the plotted sections only. Consider the following plot:
As you can see, the y axis has tickmarks for every 50 pg/ml, but there is a gap between 175 and 425. So the data range (to the nearest 50) is c(0, 500) but the range of the y axis is c(0, 250) - it's just that the tickmarks for 200 and 250 are being treated as those for 450 and 500.
This plot was produced using the following modified version of your code:
## made up data
GRO.Controls <- c(25, 40:50, 60, 150)
GRO.Breast <- c(70, 80:90, 110, 500)
## Scatter plot for both groups
library(plotrix)
gap.plot(jitter(rep(0, length(GRO.Controls)), amount = 0.2), GRO.Controls,
         gap = c(175, 425), xtics = -2, # no xtics visible
         ytics = seq(0, 500, by = 50),
         xlim = c(-0.5, 1.5), ylim = c(0, 250),
         xlab = "", ylab = "Concentrations (pg/ml)", main = "GRO(P=0.0010)")
gap.plot(jitter(rep(1, length(GRO.Breast)), amount = 0.2), GRO.Breast,
         gap = c(175, 425), col = "blue", add = TRUE)
## Adds x-variable (groups) labels
mtext("Controls", side = 1, at = 0.0)
mtext("Breast Cancer", side = 1, at = 1.0)
## Adds median lines for each group
segments(-0.25, median(GRO.Controls), 0.25, median(GRO.Controls), lwd = 2.0)
segments(0.75, median(GRO.Breast), 1.25, median(GRO.Breast), lwd = 2.0,
         col = "blue")
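The relationship between the data range and the ylim passed to gap.plot(), as described in this answer, can be written out as a short calculation. This is a sketch of that description only, with made-up variable names; it does not call plotrix.

```r
gap <- c(175, 425)         # values hidden by the break
data_range <- c(0, 500)    # data range, to the nearest 50

# Points above the gap are drawn shifted down by the width of the gap,
# so the plotted y range is shorter than the data range by that width
gap_width <- diff(gap)
ylim <- c(data_range[1], data_range[2] - gap_width)
ylim
```

With these numbers the result matches the ylim = c(0, 250) used in the gap.plot() call.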
You could be using gap.plot() which is easily found by following the link on the axis.break help page. There is a worked example there.
This is a follow-up of this question.
I wanted to plot multiple curves on the same graph but so that my new curves respect the same y-axis scale generated by the first curve.
Notice the following example:
y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)
# first plot
plot(x, y1)
# second plot
par(new = TRUE)
plot(x, y2, axes = FALSE, xlab = "", ylab = "")
That actually plots both sets of values on the same coordinates of the graph (because I'm hiding the new y-axis that would be created with the second plot).
My question then is how to maintain the same y-axis scale when plotting the second graph.
(The typical method would be to use plot just once to set up the limits, possibly to include the range of all series combined, and then to use points and lines to add the separate series.) To use plot multiple times with par(new=TRUE), you need to make sure that your first plot has a ylim wide enough to accommodate all the series (and in another situation, you may need to use the same strategy for xlim):
# first plot
plot(x, y1, ylim=range(c(y1,y2)))
# second plot EDIT: needs to have same ylim
par(new = TRUE)
plot(x, y2, ylim=range(c(y1,y2)), axes = FALSE, xlab = "", ylab = "")
This next code does the task more compactly. By default matplot draws the column numbers as plotting symbols; the second call gives you the typical R points:
matplot(x, cbind(y1,y2))
matplot(x, cbind(y1,y2), pch=1)
points or lines come in handy if y2 is generated later, or if the new data does not have the same x but should still go into the same coordinate system.
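As an illustration of the second case, here is a minimal sketch in which the second series has its own x values. All values are made up, and both limits are taken from the combined data so that nothing falls outside the plot.

```r
x1 <- c(1, 2, 3, 4, 5)
y1 <- c(100, 200, 300, 400, 500)

# Second series with different x positions (hypothetical values)
x2 <- c(1.5, 2.5, 3.5, 6)
y2 <- c(50, 250, 350, 600)

# Set up the coordinate system once, wide enough for both series
plot(x1, y1, xlim = range(x1, x2), ylim = range(y1, y2),
     xlab = "x", ylab = "y")

# Add the second series to the existing coordinate system
points(x2, y2, pch = 19, col = "blue")
lines(x2, y2, col = "blue")
```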
As your ys share the same x, you can also use matplot:
matplot (x, cbind (y1, y2), pch = 19)
(without the pch argument, matplot will plot the column numbers of the y matrix instead of dots).
You aren't being very clear about what you want here, since I think @DWin's answer is technically correct, given your example code. I think what you really want is this:
y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)
# first plot
plot(x, y1,ylim = range(c(y1,y2)))
# Add points
points(x, y2)
DWin's solution was operating under the implicit assumption (based on your example code) that you wanted to plot the second set of points overlaid on the original scale. That's why his image looks like the points are plotted at 1, 101, etc. Calling plot a second time isn't what you want; you want to add to the plot using points. So the above code on my machine produces this:
But DWin's main point about using ylim is correct.
My solution is to use ggplot2. It takes care of these types of things automatically. The biggest thing is to arrange the data appropriately.
y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)
df <- data.frame(x=rep(x,2), y=c(y1, y2), class=c(rep("y1", 5), rep("y2", 5)))
Then use ggplot2 to plot it
library(ggplot2)
ggplot(df, aes(x=x, y=y, color=class)) + geom_point()
This is saying plot the data in df, and separate the points by class.
The plot generated is
I'm not sure what you want, but I'll use lattice.
library(lattice)
x <- rep(x, 2)
y <- c(y1, y2)
fac.data <- as.factor(rep(1:2, each = 5))
df <- data.frame(x = x, y = y, z = fac.data)
# this creates a data frame with a factor variable, z, that tells me which data I have (y1 or y2)
Then, just plot
xyplot(y ~ x | z, df)
# or maybe
xyplot(x ~ y | z, df)
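If the goal is a single panel with one shared y axis, rather than one panel per series, lattice's groups argument is another option. A minimal sketch that rebuilds the data from the question; the data frame name `df_long` is made up for illustration.

```r
library(lattice)

y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)

# Long format: one row per point, with z marking the series
df_long <- data.frame(
  x = rep(x, 2),
  y = c(y1, y2),
  z = factor(rep(c("y1", "y2"), each = 5))
)

# groups = z overlays both series in one panel on a common y scale
p <- xyplot(y ~ x, groups = z, data = df_long, auto.key = TRUE)
print(p)
```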