How to write labels in barplot on x-axis with duplicated names? - r

I am trying to make a simple barplot, but I have duplicated names on the x-axis, so whenever I try to write the names on the x-axis the complete strings are not shown. I have the following data:
x <- c(1.8405917,0.3265986,1.5723623,464.7370299,0.0000000,3.2235716,
3.1223534, 7.0999787, 1.7122258,3.2005524,3.7531266,469.4436828)
and I am using barplot
barplot(x,xlab=c("AA/AA","AA/CC","AA/AC","AA/NC","CC/AA","CC/CC","CC/AC",
"CC/NC","AC/AA","AC/CC","AC/AC","AC/NC"))
But it does not work. I also used
axis()
But it does not work as well.
Thanks in advance.

No, xlab is for providing a label for the entire x-axis of the plot, not for labelling the individual bars.
barplot() takes the labels for the bars from the names of the vector plotted (or something that can be derived into a set of names).
> names(x) <- c("AA/AA", "AA/CC", "AA/AC", "AA/NC", "CC/AA", "CC/CC", "CC/AC",
+ "CC/NC", "AC/AA", "AC/CC", "AC/AC", "AC/NC")
> barplot(x)
> ## or with labels rotated, see ?par
> barplot(x, las = 2)
Edit: As @Aaron mentions, barplot() also has a names.arg argument to supply the labels for the bars. This is what ?barplot has to say:
names.arg: a vector of names to be plotted below each bar or group of
bars. If this argument is omitted, then the names are taken
from the names attribute of height if this is a vector,
or the column names if it is a matrix.
This explains the default behaviour when names.arg is not supplied: the names are taken from the object being plotted. Which usage is most useful for you is mainly a matter of taste. Not having the row/column/names might speed code up slightly, but many of R's functions take the names attribute (or similar, e.g. row names) directly from objects, so you don't have to keep providing labels when plotting or labelling results.

xlab should be names.arg. See ?barplot for details.

The way to use axis() is to capture the midpoints, which is what the barplot function returns. See ?barplot:
mids <- barplot(x, xlab="")
axis(1, at=mids, labels=c("AA/AA","AA/CC","AA/AC","AA/NC","CC/AA","CC/CC",
"CC/AC","CC/NC","AC/AA","AC/CC","AC/AC","AC/NC"),
las=3)
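A minimal sketch of the same midpoint idea, assuming you want the labels drawn at a 45-degree angle, which las cannot do. It uses the vector returned by barplot() as the x positions for text(). The 2% offset below the axis is an arbitrary choice for illustration, not something from ?barplot.

```r
x <- c(1.8405917, 0.3265986, 1.5723623, 464.7370299, 0.0000000, 3.2235716,
       3.1223534, 7.0999787, 1.7122258, 3.2005524, 3.7531266, 469.4436828)
labs <- c("AA/AA", "AA/CC", "AA/AC", "AA/NC", "CC/AA", "CC/CC", "CC/AC",
          "CC/NC", "AC/AA", "AC/CC", "AC/AC", "AC/NC")

# x has no names, so barplot() draws no bar labels; it returns the bar midpoints
mids <- barplot(x)

# par("usr") holds the plot region limits: c(x1, x2, y1, y2)
usr <- par("usr")
y_pos <- usr[3] - 0.02 * (usr[4] - usr[3])  # slightly below the axis (arbitrary offset)

# srt rotates the text, adj = 1 right-aligns it, xpd = TRUE allows drawing in the margin
text(x = mids, y = y_pos, labels = labs, srt = 45, adj = 1, xpd = TRUE, cex = 0.8)
```

Because the positions come from barplot() itself, the labels stay aligned with the bars even if you change width or space.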

Try this:
barplot(x, cex.names=0.7,
names.arg=c("AA/AA","AA/CC","AA/AC","AA/NC","CC/AA","CC/CC","CC/AC",
"CC/NC","AC/AA","AC/CC","AC/AC","AC/NC"))

Related

R Lattice Plot Multiple Lines with Specific Color

I have two problems that I am having trouble solving. Firstly, when I plot a multiple-column matrix using lattice xyplot, I find that all the points are connected. How can I get separate, disconnected lines?
x<-cbind(rnorm(10),rnorm(10))
xyplot(x~1:nrow(x),type="l")
Secondly, I am having trouble figuring out how to make one line thicker than the other. For example, if I want to emphasise column 1, then column 1's line should be thicker than that of column 2.
The lattice plotting paradigm, like that of ggplot2 that followed it, expects data to be in long format in dataframes:
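library(lattice)
# long format: one row per observation, with a grouping column
dfrm <- data.frame(y = c(rnorm(10), rnorm(10)),
                   x = 1:10,
                   grp = rep(c("a", "b"), each = 10))
xyplot(y ~ x, groups = grp, type = "l", data = dfrm, col = c("red", "blue"))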
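As a sketch of how the long-format approach could also cover the line-thickness part of the question: the matrix is reshaped by hand, and a vector is passed to lwd so that each group gets its own width. The column names a and b and the widths 3 and 1 are made up for illustration.

```r
library(lattice)

set.seed(42)
m <- cbind(a = rnorm(10), b = rnorm(10))

# Reshape the matrix to long format: one row per (observation, column) pair
long <- data.frame(
  y   = as.vector(m),                             # column-major: all of a, then all of b
  x   = rep(seq_len(nrow(m)), times = ncol(m)),
  grp = rep(colnames(m), each = nrow(m))
)

# col and lwd are matched to the groups in level order (a first, then b)
p <- xyplot(y ~ x, groups = grp, data = long, type = "l",
            col = c("red", "blue"), lwd = c(3, 1))
print(p)
```

The explicit print() is only needed when the code runs inside a function or a script; at the console the plot is displayed automatically.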
This might not be the most elegant solution but it gets the job done:
library(latticeExtra) # also attaches lattice; provides "+" for trellis objects
x<-cbind(rnorm(10),rnorm(10))
plot1<-xyplot(x[,1]~1:nrow(x),type="l",col="red",lwd=3)
plot2<-xyplot(x[,2]~1:nrow(x),type="l")
plot1+plot2
I assumed that you wanted V1 and V2 plotted against the number of observations.
Otherwise you indeed only have one line.
You can adjust the axis and labels according to taste.

R heat map: Ordering by value; label issues

I am looking to improve upon output I implemented in R based on Jeromy's answer here (thanks!). Mine is a 31x31 matrix with positive and negative values, and uses basically the same ggplot2 code:
library(ggplot2)
library(reshape)
z<-cor(insheet3,use="complete.obs",method="kendall")
zm<-melt(z)
ggplot(zm, aes(X1,X2, fill=value)) + geom_tile() +
scale_fill_gradient2(low = "blue", high = "dark violet")
I need to change three things:
1. Right now, the rows appear in reverse alphabetical order, which means no visible data trends. How can I influence the order of the rows and columns, such that either:
   A. (Preferred:) The columns are ordered by correlation value (negative to positive or vice versa), as they are in the ellipse package output on that same page; or
   B. The columns are manually ordered, so that I can group similar variables?
2. Along the bottom X-axis, my variable names are overlapping dramatically and are unreadable. They need to remain long (i.e., OrthoPhos, Ammonia, Residential...), so how can I rotate their labels 90 degrees?
3. Is there a way to remove the "X1" and "X2" labels along each axis?
Thank you!
Following what I'll call an extensive/religious R journey into correlation matrix possibilities, I wanted to share what I'm finally going to use. Also, thanks to the previous answerers; I've found that there are many "right" answers to this.
Since my reviewers insisted I include numbers and not just colors, and that I stay away from more "confusing" and "busy" output like correlogram, I finally found "image" and based my final output on this example. Thanks @Marcinthebox.
Also to appease StackOverflow, here is a link to the image, rather than the image itself.
Because some of these specifications took a while to figure out and were critical to the final output, here's my code, shortened as much as I could.
#Subsetting to only the vectors I want to see in the correlation, as ordered
insheet<-subset(insheet1,
select=c("Cond", "CL", "SO4", "TN", "TP", "OrthoPhos", "DO", ...., "Rural"))
#Defining "high" and "low" colors
library(colorspace)
mycolors<-diverge_hcl(8, h = c(8, 240), c = 80, l = c(50,100), power = 1)
#Correlating them into a matrix
sheet<-cor(insheet,use="complete.obs")
#Making it!
par(mar=c(5.5, 5.5, 2, 1)) #Widens margins to allow for axis labels; must be set before plotting
image(x=seq(dim(sheet)[2]), y=seq(dim(sheet)[2]), z=sheet, ann=FALSE,
col=mycolors, xlab="x column", ylab="y column", xaxt='n', yaxt='n')
text(expand.grid(x=seq(dim(sheet)[2]), y=seq(dim(sheet)[2])),
labels=round(c(sheet),2), cex=0.5)
axis(1, 1:dim(insheet)[2], colnames(insheet), las=2)
axis(2, 1:dim(insheet)[2], colnames(insheet), las=2)
I was also able to for-loop this to output multiple .wmf files, once errors were suppressed. Too bad I couldn't visualize significant p-values as well... another time. Thanks!
I assume that you mean "clustering" for point 1.?
For such tasks I prefer the heatmap.2() function from the gplots package, which offers various clustering options.
For point 2 and 3: The heatmap.2() function will also take care of the 90º rotation and the labels since it is using a data matrix as input instead of a data table.
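If you would rather stay in base R, the clustering idea can be sketched by reordering the correlation matrix with hclust() before plotting. This is only an illustration under assumptions: the built-in mtcars data stands in for insheet3, and using 1 minus the correlation as the distance is one common choice among several, not what heatmap.2() does by default.

```r
# mtcars is a stand-in for the asker's data (assumption)
z <- cor(mtcars, use = "complete.obs", method = "kendall")

# Treat 1 - correlation as a dissimilarity, so strongly correlated variables cluster together
ord <- hclust(as.dist(1 - z))$order

# Apply the same ordering to rows and columns
z_ord <- z[ord, ord]

n <- ncol(z_ord)
op <- par(mar = c(5.5, 5.5, 2, 1))   # room for rotated labels; set before plotting
image(seq_len(n), seq_len(n), z_ord, xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(1, at = seq_len(n), labels = colnames(z_ord), las = 2)  # las = 2: perpendicular labels
axis(2, at = seq_len(n), labels = rownames(z_ord), las = 2)
par(op)
```

For a manual ordering (option B in the question), replace ord with a character vector of column names in the order you want.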

Histogram of two variables in R

I have two variables that I want to compare in a histogram like the one below. For each bin of the histogram, the frequency of both variables is shown, which makes it easy to compare them.
You can use the add parameter to hist (see ?hist, ?plot.histogram):
hist(rnorm(1000, mean=0.2, sd=0.1), col='blue', xlim=c(0, 1))
hist(rnorm(1000, mean=0.8, sd=0.1), col='red', add=T)
To find out about the add parameter I noticed that in ?hist the ... argument says that these are arguments passed to plot.histogram, and add is documented in ?plot.histogram. Alternatively, one of the examples at the bottom of ?hist uses the add parameter.
You can use prop.table and barplot like this:
smokes <- sample(c('Y','N'),10,replace=TRUE)
amount <- sample(c(1,2,3),10,replace=TRUE)
barplot(prop.table(table(smokes,amount)),beside=TRUE)
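For two continuous variables, one way to get side-by-side bars per bin is to cut both variables with the same breaks and pass the counts to barplot(). The following is a rough sketch with simulated data; the variable names a and b and the choice of about ten bins are assumptions for illustration.

```r
set.seed(1)
a <- rnorm(1000, mean = 0.4, sd = 0.1)
b <- rnorm(1000, mean = 0.6, sd = 0.1)

# Shared breaks that cover the combined range of both variables
breaks <- pretty(range(c(a, b)), n = 10)

# Bin both variables with the same breaks; include.lowest keeps the minimum value
bins_a <- cut(a, breaks, include.lowest = TRUE)
bins_b <- cut(b, breaks, include.lowest = TRUE)

# One row per variable, one column per bin
counts <- rbind(a = as.vector(table(bins_a)),
                b = as.vector(table(bins_b)))
colnames(counts) <- levels(bins_a)

# beside = TRUE draws the two variables next to each other within each bin
barplot(counts, beside = TRUE, las = 2, cex.names = 0.7,
        col = c("blue", "red"), legend.text = TRUE)
```

Wrapping counts in prop.table(counts, margin = 1) would show proportions per variable instead of raw frequencies.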

How to plot data grouped by a factor, but not as a boxplot

In R, given a vector
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001, 0.5097587, 0.6476510, 0.4552577, 0.5578296, 0.5728478, 0.1927945, 0.2624068, 0.2732615)
and a factor:
trans.factor <- factor (rep (c("t0", "t12", "t24", "t72"), c(4,3,3,3)))
I want to create a plot where the data points are grouped as defined by the factor. So the categories should be on the x-axis, values in the same category should have the same x coordinate.
Simply doing plot(trans.factor, casp6) does almost what I want, it produces a boxplot, but I want to see the individual data points.
require(ggplot2)
qplot(trans.factor, casp6)
You can do it with ggplot2, using facets. When I read "I want to create a plot where the data points are grouped as defined by the factor", the first thing that came to my mind was facets.
But in this particular case, a faster alternative should be:
plot(as.numeric(trans.factor), casp6)
And you can play with plot options afterwards (type, fg, bg...), but I recommend sticking with ggplot2, since it has much cleaner code, great functionality, you can avoid overplotting... etc. etc.
Learn how to deal with factors. You got a boxplot when evaluating plot(trans.factor, casp6) because trans.factor is of class factor (ironically, you even named it in such a manner) and it was passed before the continuous (numeric) variable in the plot() call. Hence plot() "feels" the need to subset the data and draw a boxplot for each part (if you pass the continuous variable first, you'll get an ordinary graph, right?). ggplot2, on the other hand, interprets a factor in a different way, as "an ordinary" numeric variable (this applies to the syntax provided by Jonathan Chang; you must specify a geom when doing something more complex in ggplot2).
But, let's presuppose that you have one continuous variable and a factor, and you want to apply histogram on each part of continuous variable, defined by factor levels. This is where the things become complicated with base graph capabilities.
# create dummy data
> set.seed(23)
> x <- rnorm(200, 23, 2.3)
> g <- factor(round(runif(200, 1, 4)))
By using base graphs (package:graphics):
par(mfrow = c(1, 4))
tapply(x, g, hist)
ggplot2 way:
qplot(x, facets = . ~ g)
Try to do this with graphics in one line of code (semicolons and custom functions are considered cheating!):
qplot(x, log(x), facets = . ~ g)
Let's hope that I haven't bored you to death, but helped you!
Kind regards,
aL3xa
I find the following solution:
stripchart(casp6~trans.factor,data.frame(casp6,trans.factor),pch=1,vertical=T)
simple and direct.
(Refer e.g. to http://www.mail-archive.com/r-help@r-project.org/msg34176.html)
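A possible refinement of the stripchart() approach, sketched under the assumption that overlapping points or a per-group summary are a concern: method = "jitter" spreads points that would otherwise overlap, and tapply() gives group means that can be overlaid. The jitter amount and the red cross marker are arbitrary choices.

```r
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001, 0.5097587,
           0.6476510, 0.4552577, 0.5578296, 0.5728478, 0.1927945, 0.2624068,
           0.2732615)
trans.factor <- factor(rep(c("t0", "t12", "t24", "t72"), c(4, 3, 3, 3)))

# One column of points per factor level, with a little horizontal jitter
stripchart(casp6 ~ trans.factor, vertical = TRUE, method = "jitter",
           jitter = 0.1, pch = 1)

# Group means; the groups sit at x = 1, 2, 3, 4 in level order
means <- tapply(casp6, trans.factor, mean)
points(seq_along(means), means, pch = 4, col = "red")
```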
You may be able to get close to what you want using lattice graphics by doing:
library(lattice)
xyplot(casp6 ~ trans.factor,
scales = list(x = list(at = 1:4, labels = levels(trans.factor))))
I think there's a better solution (I wrote it for a workshop a few days ago), but it slipped my mind. Here's an ugly substitute with base graphics. Feel free to annotate the x axis ad libitum. Personally, I like Greg's solution.
plot(0, 0, xlim = c(1, 4), ylim = range(casp6), type = "n")
points(casp6 ~ trans.factor)
No extra package needed
I'm a bit late to the party, but I found that you can get the desired result very easily with the standard plot function -- simply convert the factor to a numeric value:
plot(as.numeric(trans.factor), casp6)
10 year old question...but if you want a neat base R solution:
plot(trans.factor, casp6, border=NA, outline=FALSE)
points(trans.factor, casp6)
The first line sets up the plot but draws nothing. The second adds the points. This is slightly neater than the solutions that force x to be numeric.

How do I set what plot() labels the x-axis with?

I have a plot() that I'm trying to make, but I do not want the x-values to be used as the axis labels...I want a different character vector that I want to use as labels, in the standard way: Use as many as will fit, drop the others, etc. What should I pass to plot() to make this happen?
For example, consider
d <- data.frame(x=1:5,y=10:14,x.names=c('a','b','c','d','e'))
In barplot, I would pass barplot(height=d$y,names.arg=d$x.names), but in this case the actual x-values are important. So I would like an analog such as plot(x=d$x,y=d$y,type='l',names.arg=d$x.names), but that does not work.
I think you want to first suppress the labels on the x axis with the xaxt="n" option:
plot(flow~factor(month),xlab="Month",ylab="Total Flow per Month",ylim=c(0,55000), xaxt="n")
then use the axis command to add in your own labels. This example assumes the labels are in an object called month.name
axis(1, at=1:12, labels=month.name)
I had to look up how to do this and I stole the example from here.
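Applied to a data frame like the one in the question, the same two steps might look like the sketch below. Note that y is 10:14 here so that all three columns have five values, which data.frame() requires.

```r
d <- data.frame(x = 1:5, y = 10:14, x.names = c("a", "b", "c", "d", "e"))

# Draw the line at the true x positions, but suppress the default x-axis
plot(d$x, d$y, type = "l", xaxt = "n", xlab = "x", ylab = "y")

# Add the custom labels at the same x positions
axis(1, at = d$x, labels = d$x.names)
```

axis() omits labels that would overlap their neighbours, which gives roughly the "use as many as will fit" behaviour asked for.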
