Labels on axis are overwritten - r

I'm sorry for asking this question, but I have already googled and searched here and found nothing useful (lots of different functions for drawing the plot, but none that deal with my problem).
I have a vector containing the data I have to plot (named "rmse"), and a vector containing the names of the columns on the x-axis (named "nomi"). I simply want to plot the data with the labels on the x-axis rotated by 90°, because of space constraints.
I found this useful site: http://harding.edu/fmccown/r/
Following it, I worked out how to rotate the labels on the axis, but even though I have 12 columns, 6 of them end up with two labels printed on top of each other and the other 6 have no label at all.
Here's my code:
library(lattice)
library(gstat)
nomi<-c("Quota","No Quota","Mare","No Mare","Slope","No Slope","Terreno","No Terreno","Facet","No Facet","Po","No Po")
rmse<-c(1.79,1.97,1.82,1.84,1.82,1.82,1.80,1.83,1.82,1.84,1.82,1.81)
g_range <- range(0, rmse)
plot(rmse, type='h',axes=F, ann=F)
axis(1, at=1:12, lab=F)
text(axTicks(1),par("usr")[3], srt=90, adj=1, labels=nomi, xpd=T, cex=0.8)
axis(2, las=1)
box()
And here's the plot:
Do you know what I am doing wrong? I know it's a simple question, but I'm quite a beginner and sometimes I need help :)
Thank you for the attention!

Solved! Adding "las=2" as an argument to axis() was enough. Thanks to joran for pointing out that I can avoid text() altogether ;)
nomi<-c("Quota","No Quota","Mare","No Mare","Slope","No Slope","Terreno","No Terreno","Facet","No Facet","Po","No Po")
rmse<-c(1.79,1.97,1.82,1.84,1.82,1.82,1.80,1.83,1.82,1.84,1.82,1.81)
g_range <- range(0, rmse)
plot(rmse, type='h',axes=F, ann=F)
axis(1, at=1:12, lab=nomi, las=2)
axis(2, las=1)
box()
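For anyone wondering why the original text() call misbehaved: a likely explanation is that axTicks(1) returns R's default tick positions rather than 1:12, so there are fewer positions than labels, and text() recycles them. Some positions then get two labels and the rest get none. Here is a minimal sketch under that assumption, keeping text() but passing one position per label (the names ticks and at are mine, not from the question):

```r
nomi <- c("Quota", "No Quota", "Mare", "No Mare", "Slope", "No Slope",
          "Terreno", "No Terreno", "Facet", "No Facet", "Po", "No Po")
rmse <- c(1.79, 1.97, 1.82, 1.84, 1.82, 1.82, 1.80, 1.83, 1.82, 1.84, 1.82, 1.81)

plot(rmse, type = "h", axes = FALSE, ann = FALSE)

# Default tick positions chosen by R; fewer of them than there are labels
ticks <- axTicks(1)

# One position per label instead of the default ticks
at <- seq_along(nomi)
axis(1, at = at, labels = FALSE)

# par("usr")[3] is the bottom edge of the plot region (usr = c(x1, x2, y1, y2))
text(at, par("usr")[3], labels = nomi, srt = 90, adj = 1, xpd = TRUE, cex = 0.8)
axis(2, las = 1)
box()
```

Comparing length(ticks) with length(nomi) shows the mismatch directly. The las=2 version above is still the simpler fix.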

Another approach is to use ggplot2 to produce the chart:
dt <- data.frame(
rownum = 1:length(nomi),
nomi=c("Quota","No Quota","Mare","No Mare","Slope","No Slope","Terreno","No Terreno","Facet","No Facet","Po","No Po"),
rmse=c(1.79,1.97,1.82,1.84,1.82,1.82,1.80,1.83,1.82,1.84,1.82,1.81)
)
library(ggplot2)
ggplot(dt) + aes(x =reorder(nomi,rownum), y = rmse) + geom_bar(stat = "identity")+
theme(axis.text.x = element_text(angle=90, face="bold", colour="black"))+
scale_x_discrete(name="" )

Related

How to reduce the size of the legend in R Plot, while still making it readable?

I am trying to plot some data over years with two y-axes in R. However, whenever I try to include a legend, the legend dominates my plot. When I use solutions suggested elsewhere, such as keywords and/or the cex argument (suggested in another post here), it either becomes unreadable or is still too big.
Here is my example with randomly generated data:
#Create years
year.df <- seq(1974, 2014, 1)
# Create y-axis data
set.seed(75)
mean1 <- rnorm(length(year.df), 52.49, 0.87)
mean2 <- rnorm(length(year.df), 52.47, 0.96)
#Create dataframe
df <- data.frame(cbind(year.df, mean1, mean2))
I want a second y-axis, the difference of the two means over the years
df$diff <- abs(df$mean1 - df$mean2)
When I plot using the code below to create two y-axes:
par(mfrow=c(1,1), mar=c(5.1,4.1,4.1,5.1))
with(df, plot(year.df, mean1, type = "l", lwd=4, xlab="Year", ylab="Mean", ylim=c(48,58)))
with(df, lines(year.df, mean2, type = "l", col="green", lwd=4))
par(new=TRUE)
with(df, plot(year.df, diff, type="l", axes=FALSE, xlab=NA, ylab=NA, col="red", lty=5, ylim=c(0,10)))
axis(side = 4)
mtext(side = 4, line = 3, "Annual Difference")
legend("topleft",
legend=c("Calculated", "MST", "Diff"),
lty=c(1,1,5), col=c("black", "green", "red"))
I get:
When I use the cex=0.5 argument in the legend(), it starts to become unreadable:
Is there a way to format my legend in a clear, readable manner? Better than what I have?
The white space in the legend tells me that you manually widened your plot window. Legends do not scale well when it comes to manual re-sizing.
The solution is opening a plot of the exact size you need before plotting. In Windows, this is done with windows(width=10, height=8). Units are in inches.
As you can see below, the legend sits tightly in the corner.
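windows() only exists on Windows. As a hedged sketch of the same idea on any platform, you can open a file device of a fixed size instead; pdf() takes its width and height in inches. The file name legend_demo.pdf (written to the session's temporary directory) is only an illustrative choice, and the data are regenerated from the question's example:

```r
# Recreate the example data from the question
year.df <- seq(1974, 2014, 1)
set.seed(75)
df <- data.frame(year.df,
                 mean1 = rnorm(length(year.df), 52.49, 0.87),
                 mean2 = rnorm(length(year.df), 52.47, 0.96))
df$diff <- abs(df$mean1 - df$mean2)

# Open a device of the exact size wanted BEFORE plotting (units: inches)
out_file <- file.path(tempdir(), "legend_demo.pdf")  # illustrative file name
pdf(out_file, width = 10, height = 8)

par(mar = c(5.1, 4.1, 4.1, 5.1))
with(df, plot(year.df, mean1, type = "l", lwd = 4,
              xlab = "Year", ylab = "Mean", ylim = c(48, 58)))
with(df, lines(year.df, mean2, col = "green", lwd = 4))
par(new = TRUE)
with(df, plot(year.df, diff, type = "l", axes = FALSE, xlab = NA, ylab = NA,
              col = "red", lty = 5, ylim = c(0, 10)))
axis(side = 4)
mtext(side = 4, line = 3, "Annual Difference")
legend("topleft", legend = c("Calculated", "MST", "Diff"),
       lty = c(1, 1, 5), col = c("black", "green", "red"), cex = 0.75)

invisible(dev.off())  # close the device so the file is written
```

Because the device size is fixed up front, the legend is laid out for that size and is not distorted by later resizing. For an on-screen window, dev.new(width = 10, height = 8) should have a similar effect on most platforms.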
Apparently, I forgot the first step of troubleshooting: turn it off and on again. I woke up this morning and ran the script again, and even with cex = 0.5 it turned out fine. I chose to use cex = 0.75. I would still appreciate any explanation of why that might be. I spent many hours yesterday trying to fix my legend, and now the same code works and produces this result (cex=0.75):

How to add colour matched legend to a R matplot

I plot several lines on a graph using matplot:
matplot(cumsum(as.data.frame(daily.pnl)),type="l")
This gives me default colours for each line - which is fine,
But I now want to add a legend that reflects those same colours - how can I achieve that?
PLEASE NOTE - I am trying NOT to specify the colours to matplot in the first place.
legend(0,0,legend=spot.names,lty=1)
Gives me all the same colour.
The default colour parameter of matplot is a sequence over the number of columns of your data.frame. So you can add a legend like this:
nn <- ncol(daily.pnl)
legend("top", colnames(daily.pnl),col=seq_len(nn),cex=0.8,fill=seq_len(nn))
Using the cars data set as an example, here is the complete code to add a legend. It is better to use layout to place the legend neatly.
daily.pnl <- cars
nn <- ncol(daily.pnl)
layout(matrix(c(1,2),nrow=1), widths=c(4,1))
par(mar=c(5,4,4,0)) #No margin on the right side
matplot(cumsum(as.data.frame(daily.pnl)),type="l")
par(mar=c(5,0,4,2)) #No margin on the left side
plot(c(0,1),type="n", axes=F, xlab="", ylab="")
legend("center", colnames(daily.pnl),col=seq_len(nn),cex=0.8,fill=seq_len(nn))
I have tried to reproduce what you are looking for using the iris dataset. I get the plot with the following expression:
matplot(cumsum(iris[,1:4]), type = "l")
Then, to add a legend, you can specify the default lines colour and type, i.e., numbers 1:4 as follows:
legend(0, 800, legend = colnames(iris)[1:4], col = 1:4, lty = 1:4)
Now you have the same in the legend and in the plot. Note that you might need to change the coordinates for the legend accordingly.
I like @agstudy's trick for getting a nice legend.
For the sake of comparison, I took @agstudy's example and plotted it with ggplot2:
The first step is to "melt" the data-set
require(reshape2)
df <- data.frame(x=1:nrow(cars), cumsum(data.frame(cars)))
df.melted <- melt(df, id="x")
The second step looks rather simple in comparison to the solution with matplot
require(ggplot2)
qplot(x=x, y=value, color=variable, data=df.melted, geom="line")
Interestingly, @agstudy's solution does the trick, but only for n ≤ 6.
Here we have a matrix with 8 columns. The colours of the first 6 labels are correct.
The 7th and 8th are wrong: the colours in the plot restart from the beginning (black, red, ...), whereas in the legend they continue (yellow, grey, ...).
I still haven't figured out why this is the case. I may update this post with my findings.
matplot(x = lambda, y = t(ridge$coef), type = "l", main="Ridge regression", xlab="λ", ylab="Coefficient-value", log = "x")
nr = nrow(ridge$coef)
legend("topright", rownames(ridge$coef), col=seq_len(nr), cex=0.8, lty=seq_len(nr), lwd=2)
Just discovered that matplot uses linetypes 1:5 and colors 1:6 to establish the appearance of the lines. If you want to create a legend try the following approach:
## Plot multiple columns of the data frame 'GW' with matplot
cstart = 10 # from column
cend = cstart + 20 # to column
nr <- cstart:cend
ltyp <- rep_len(1:5, length(nr)) # the line types matplot cycles through
cols <- rep_len(1:6, length(nr)) # the colours matplot cycles through
matplot(x,GW[,nr],type='l')
legend("bottomright", as.character(nr), col=cols, cex=0.8, lty=ltyp, ncol=3)
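A self-contained sketch of the same idea, in case the 'GW' data is not at hand. It assumes matplot's documented defaults (col = 1:6, lty = 1:5) and uses made-up random-walk data; the names m, k, cols and ltys are illustrative only:

```r
# Made-up data: 8 random walks, i.e. more columns than matplot has default colours
set.seed(1)
m <- apply(matrix(rnorm(8 * 20), ncol = 8), 2, cumsum)
colnames(m) <- paste0("series", 1:8)

k <- ncol(m)
cols <- rep_len(1:6, k)  # colours cycle 1..6, then restart at 1
ltys <- rep_len(1:5, k)  # line types cycle 1..5, then restart at 1

# Passing the vectors explicitly keeps plot and legend in sync
matplot(m, type = "l", col = cols, lty = ltys)
legend("topleft", legend = colnames(m), col = cols, lty = ltys,
       cex = 0.8, ncol = 2)
```

With 8 columns, the 7th and 8th lines reuse colours 1 and 2 but get line types 2 and 3, which is why a legend built from seq_len(n) goes wrong beyond six series.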

rotate X axis labels 45 degrees on grouped bar plot R

How can I rotate the X axis labels 45 degrees on a grouped bar plot in R?
I have tried the solution suggested here but got something very messy, the labels seem to have been added multiple times (only showing the axis part to protect data privacy):
This solution (gridBase) was also unsuccessful for me, for some reason I get the following error:
"Cannot pop the top-level viewport (grid and graphics output mixed?)"
PS.
Most people seem to recommend this solution in R base but I am stuck with that too because I don't understand what data they are referring to (I need some kind of example data set to understand new command lines...).
Are these solutions not working because my barplot is a grouped barplot? Or should it work nevertheless? Any suggestions are welcome, I have been stuck for quite some time. Thank you.
[edit] On request I am adding the code that I used to generate the picture above (based on one of the text() solutions):
data <- #this is a matrix with 4 columns and 20 rows;
#colnames and rownames are specified.
#the barplot data is grouped by rows
lablist <- as.vector(colnames(data))
barplot(data, beside=TRUE, col=c("darkred","red","grey20","grey40"))
text(1:100, par("usr")[1], labels=lablist, srt=45, pos=1, xpd=TRUE)
I am not proficient with base plotting, so my solution may not be the simplest. I think that using ggplot2 is better here.
def.par <- par(no.readonly = TRUE)
## divide device into two rows and 1 column
## allocate figure 1 for barplot
## allocate figure 2 for barplot labels
## respect relations between widths and heights
nf <- layout(matrix(c(1,1,2,2),2,2,byrow = TRUE), c(1,3), c(3,1), TRUE)
layout.show(nf)
## barplot
par(mar = c(0,1,1,1))
set.seed(1)
nKol <- 8 ## you can change this, but with more than 11 columns
## the result is not really readable
data <- matrix(sample(1:4,nKol*4,rep=TRUE),ncol=nKol)
xx <- barplot(data, beside=TRUE,
col=c("darkred","red","grey20","grey40"))
## labels: create a dummy plot for the scales
par(mar = c(1,1,0,1))
plot(seq_len(length(xx)),rep(1,length(xx)),type='n',axes=FALSE)
## Create some text labels
labels <- paste("Label", seq_len(ncol(xx)), sep = " ")
## Plot text labels with some rotation at the top of the current figure
text(seq_len(length(xx)),rep(1.4,length(xx)), adj = 1,
labels = labels, xpd = TRUE,cex=0.8,srt=60,
col=c("darkred","red","grey20","grey40"))
par(def.par) #- reset to default
Try the first answer:
x <- barplot(table(mtcars$cyl), xaxt="n")
labs <- paste(names(table(mtcars$cyl)), "cylinders")
text(cex=1, x=x-.25, y=-1.25, labs, xpd=TRUE, srt=45)
But change cex=1 to cex=.8 or .6 in the text() function:
text(cex=.6, x=x-.25, y=-1.25, labs, xpd=TRUE, srt=45)
In the picture you posted, it appears to me that the labels are just too big. cex sets the size of these labels.
I had the same problem with a grouped bar plot. I assume that you only want one label below each group. I may be wrong about this, since you don't state it explicitly, but it seems to be the case because your labels are repeated in the image. In that case you can use the solution proposed by Stu, although you have to apply colMeans to the x variable when you pass it to the text function. With beside=TRUE, barplot returns a matrix of bar midpoints with one column per group, so colMeans gives the centre of each group:
# grouped example: gears (bars) within cylinders (groups)
counts <- table(mtcars$gear, mtcars$cyl)
x <- barplot(counts, beside=TRUE, xaxt="n")
labs <- paste(colnames(counts), "cylinders")
text(cex=1, x=colMeans(x)-.25, y=-1.25, labs, xpd=TRUE, srt=45)
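The hard-coded y=-1.25 only suits data on that particular scale. As a hedged sketch, the label position can instead be derived from par("usr"), which holds c(x1, x2, y1, y2), so element 3 is the bottom of the plot region (element 1, used in the question, is the left edge). The built-in VADeaths matrix stands in for the real data here, and the 3% offset and margin sizes are arbitrary choices of mine:

```r
# Leave extra room at the bottom for rotated labels; remember old settings
old_par <- par(mar = c(7, 4, 2, 1))

# VADeaths is a built-in 5 x 4 matrix; beside = TRUE gives a grouped bar plot
mids <- barplot(VADeaths, beside = TRUE, axisnames = FALSE)

# One x position per group: the mean midpoint of the bars in each column
group_centres <- colMeans(mids)

# Place labels slightly below the bottom edge of the plot region
usr <- par("usr")
y_pos <- usr[3] - 0.03 * diff(usr[3:4])

text(group_centres, y_pos, labels = colnames(VADeaths),
     srt = 45, adj = 1, xpd = TRUE, cex = 0.8)

par(old_par)  # restore the previous margins
```

With adj = 1 the end of each label is anchored at the group centre, so the rotated text runs down and to the left instead of overlapping the bars.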

Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:
hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))
This gives me a histogram, but the density between 0 to 1 is so great (about a million values difference) that you can barely make out any of the other bars.
Then I've tried doing:
mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(mydata_hist$counts, log="xy", pch=20, col="blue")
It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.
A histogram is a poor-man's density estimate. Note that in your call to hist() using default arguments, you get frequencies not probabilities -- add ,prob=TRUE to the call if you want probabilities.
As for the log axis problem, don't use 'x' if you do not want the x-axis transformed:
plot(mydata_hist$counts, log="y", type='h', lwd=10, lend=2)
gets you bars on a log-y scale -- the look-and-feel is still a little different but can probably be tweaked.
Lastly, you can also do hist(log(x), ...) to get a histogram of the log of your data.
Another option would be to use the package ggplot2.
ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()
It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.
Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:
buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
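# barplot needs one name per bar (6 bins), so label each bin with its range
bin_names <- paste(head(buckets, -1), tail(buckets, -1), sep="-")
bp <- barplot(mydata_hist$counts, log="y", col="white", names.arg=bin_names)
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)
The last line is optional; it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.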
I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.
Run the hist() function without making a graph, log-transform the counts, and then draw the figure.
hist.data = hist(my.data, plot=F)
hist.data$counts = log(hist.data$counts, 2)
plot(hist.data)
It should look just like the regular histogram, but the y-axis will be log2 Frequency.
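One caveat with this approach is that an empty bin has count 0 and log(0, 2) is -Inf. A hedged sketch that only transforms the non-empty bins is shown below, using simulated log-normal data in place of my.data. Note that bins holding a single value also end up at height 0, so they become indistinguishable from empty ones:

```r
# Simulated stand-in for my.data
set.seed(1)
my.data <- rlnorm(5000)

hist.data <- hist(my.data, plot = FALSE)

# Transform only the non-empty bins; empty bins stay at 0 instead of -Inf
nonzero <- hist.data$counts > 0
hist.data$counts[nonzero] <- log2(hist.data$counts[nonzero])

plot(hist.data, ylab = "log2(Frequency)",
     main = "Histogram of my.data (log2 counts)")
```

Setting ylab explicitly avoids the default "Frequency" label, which would no longer describe the axis.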
I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.
The original problem would be solved with:
myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")
The function:
myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  xname = paste(deparse(substitute(x), 500), collapse="\n")
  h = hist(x, breaks=breaks, plot=FALSE)
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}
Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.
Here's a pretty ggplot2 solution:
library(ggplot2)
library(scales) # makes pretty labels on the x-axis
breaks=c(0,1,2,3,4,5,25)
ggplot(mydata,aes(x = V3)) +
geom_histogram(breaks = log10(breaks)) +
scale_x_log10(
breaks = breaks,
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
Note that to set the breaks in geom_histogram, they had to be transformed to work with scale_x_log10

change look-and-feel of plot to resemble hist

I used the information from this post to create a histogram with logarithmic scale:
Histogram with Logarithmic Scale
However, the output from plot looks nothing like the output from hist. Does anyone know how to configure the output from plot to resemble the output from hist? Thanks for the help.
A simplified, reproducible version of the linked answer is
x <- rlnorm(1000)
hx <- hist(x, plot=FALSE)
plot(hx$counts, type="h", log="y", lwd=10, lend="square")
To get the axes looking more "hist-like", replace the last line with
plot(hx$counts, type="h", log="y", lwd=10, lend="square", axes = FALSE)
Axis(side=1)
Axis(side=2)
Getting the bars to join up is going to be a nightmare using this method. I suggest using trial and error with values of lwd (in this example, 34 is somewhere close to looking right), or learning to use lattice or ggplot.
EDIT:
You can't set a border colour, because the bars aren't really rectangles – they are just fat lines. We can fake the border effect by drawing slightly thinner lines over the top. The updated code is
par(lend="square")
bordercol <- "blue"
fillcol <- "pink"
linewidth <- 24
plot(hx$counts, type="h", log="y", lwd=linewidth, col=bordercol, axes = FALSE)
lines(hx$counts, type="h", lwd=linewidth-2, col=fillcol)
Axis(side=1)
Axis(side=2)
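A different technique from the fat-lines trick, offered only as a sketch: draw real rectangles with rect() on an empty log-y plot. The bars then join up exactly at the breaks and can have a true border colour. The baseline of 0.5 is an arbitrary choice of mine (a log axis cannot start at 0), and empty bins are dropped for the same reason:

```r
set.seed(1)
x <- rlnorm(1000)
hx <- hist(x, plot = FALSE)

# Keep only non-empty bins; a count of 0 cannot be drawn on a log axis
keep   <- hx$counts > 0
lower  <- head(hx$breaks, -1)[keep]
upper  <- tail(hx$breaks, -1)[keep]
counts <- hx$counts[keep]

baseline <- 0.5  # arbitrary positive floor for the bars

# Empty plot with the right ranges, then one rectangle per bin
plot(range(hx$breaks), c(baseline, max(counts)), type = "n", log = "y",
     xlab = "x", ylab = "Frequency", main = "Histogram of x (log y)")
rect(lower, baseline, upper, counts, col = "pink", border = "blue")
```

Because the rectangles are positioned by the actual break values, the x-axis shows data values rather than bin indices, and no trial and error with lwd is needed.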
How about using ggplot2?
library(ggplot2)
x <- rnorm(1000)
qplot(x) + scale_y_log10()
But I agree with Hadley's comment on the other post that having a histogram with a log scale seems weird to me =).

Resources