Putting x-axis labels directly under tick marks in barplots in R

I have a table (below) showing the percentage of tree species (categorical variable) present in a group experiment. My objective is to plot the percentage of tree species on the y-axis and 'Species' on the x-axis within a barplot.
Issue
I am having trouble formatting the x-axis correctly. I want the x-axis labels for 'Species' to be:
Positioned directly underneath their bar, at the tick mark
Kept clear of the plotting area (no overlap)
If anyone can help solve this issue, I would be incredibly grateful.
R code
df <- leaf.percent[order(leaf.percent$Leaf.Percentge, decreasing = TRUE),]
Tree.labels <- c("Quercus robur", "Quercus petraea",
                 "Deciduous", "Oak",
                 "Plant", "Shrub")
par(mar=c(6, 6, 3, 3))
Tree <- barplot(df$Leaf.Percentge, names.arg = df$Species,
                xaxt = "n",
                ylab="Percentage %",
                xlab="Tree Species",
                col="lightblue",
                ylim = c(0, 60))
axis(1, at=Tree, labels=FALSE)
text(seq(1, 6, by=1), par("usr")[3] - 0.2,
     labels=unique(Tree.labels),
     srt = 25, pos = 1,
     xpd = TRUE, cex=0.7)
DATA
structure(list(Species = structure(1:6, .Label = c("Deciduous",
"Oak", "Plant", "Quercus_petraea", "Quercus_robur", "Shrub"), class = "factor"),
Frequency = c(48L, 29L, 6L, 70L, 206L, 4L), Leaf.Percentge = c(13.2231404958678,
7.98898071625344, 1.65289256198347, 19.2837465564738, 56.7493112947658,
1.10192837465565)), .Names = c("Species", "Frequency", "Leaf.Percentge"
), row.names = c(NA, -6L), class = "data.frame")
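One possible fix, offered as a sketch rather than an answer taken from the thread: barplot() returns the x-coordinates of the bar midpoints (stored in Tree above), and with the default bar width and spacing these are 0.7, 1.9, 3.1, ... rather than 1, 2, 3, ..., so labels placed at seq(1, 6) drift away from the ticks. The version below rebuilds a rounded copy of the data, reuses the midpoints for both axis() and text(), and takes the labels from the sorted Species column. The name mids, the rounding, and the offset of 2 y-units are assumptions chosen for illustration.

```r
# Rounded copy of the posted data (assumption: two decimals are enough here)
leaf.percent <- data.frame(
  Species = c("Deciduous", "Oak", "Plant",
              "Quercus_petraea", "Quercus_robur", "Shrub"),
  Leaf.Percentge = c(13.22, 7.99, 1.65, 19.28, 56.75, 1.10)
)
df <- leaf.percent[order(leaf.percent$Leaf.Percentge, decreasing = TRUE), ]

par(mar = c(6, 6, 3, 3))
# barplot() returns the x position of each bar's midpoint
mids <- barplot(df$Leaf.Percentge, xaxt = "n",
                ylab = "Percentage %", xlab = "",
                col = "lightblue", ylim = c(0, 60))
axis(1, at = mids, labels = FALSE)
# adj = 1 right-aligns each rotated label so that it ends at its tick;
# xpd = TRUE allows drawing in the margin, outside the plot region
text(x = mids, y = par("usr")[3] - 2,
     labels = gsub("_", " ", df$Species),
     srt = 25, adj = 1, xpd = TRUE, cex = 0.7)
mtext("Tree Species", side = 1, line = 4)
```

Because the y offset is in data units, it may need adjusting if ylim changes; the bottom margin set by par(mar = ...) controls how much room the rotated labels have.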

Related

point labels in R scatter plot

I have the following toy data
Xeafield (1999) (PER) ,1,0.5745375408
Lancelot et al. (1989),0.9394939494,0.4733405876
LemLM Xeafield (1997) (TER) ,0.6265126513,0.2959738847
Almore and Flemin (2001) (KER),0.4218921892,0.5745375408
Malek et al. (2006) (HER) ,0.4125412541,1
Charles and Osborne (2003),0.0308030803,0.1414581066
I am trying to make a simple 2D plot in R with the points labeled using the first column.
pdf('data.pdf', width = 7, height = 8)
d1 <- read.csv("data.csv", header=F, dec=".",sep = ",")
plot(as.matrix(d1[,2]), as.matrix(d1[,3]), col= "blue", pch = 19, cex = 1, lty = "solid", lwd = 2, ylim=c(0,1), xaxt = "n",yaxt = "n")
text(as.matrix(d1[,2]), as.matrix(d1[,3]), labels=as.matrix(d1[,1]), cex= 0.7, pos=3)
x_axis_range <- c(0,1)
x_axis_labels <- c("Small","Large")
axis(1,at = x_axis_range, labels = x_axis_labels)
y_axis_range <- c(0,1)
y_axis_labels <- c("Slow","Fast")
axis(2,at = y_axis_range, labels = y_axis_labels)
title(xlab="Memory", ylab="Speed",cex.lab=1)
dev.off()
But the plot doesn't come out right. A few issues I have: the axis labels are messed up (the plot shows as.matrix ... instead of the labels I specified), and the margins of the plot are too small, so the point labels are cut off. I am new to R and plotting, and I would appreciate your help.
A simple solution for your problem is to define axis labels and axis ranges in the plot function.
d1 <- structure(list(V1 = structure(c(6L, 3L, 4L, 1L, 5L, 2L), .Label = c("Almore and Flemin (2001) (KER)",
"Charles and Osborne (2003)", "Lancelot et al. (1989)", "LemLM Xeafield (1997) (TER) ",
"Malek et al. (2006) (HER) ", "Xeafield (1999) (PER) "), class = "factor"),
V2 = c(1, 0.9394939494, 0.6265126513, 0.4218921892, 0.4125412541,
0.0308030803), V3 = c(0.5745375408, 0.4733405876, 0.2959738847,
0.5745375408, 1, 0.1414581066)), .Names = c("V1", "V2", "V3"
), class = "data.frame", row.names = c(NA, -6L))
# Use xlab and ylab for axis labels
# and xlim and ylim for setting axis ranges
plot(as.matrix(d1[,2]), as.matrix(d1[,3]), col= "blue", pch = 19,
cex = 1, lty = "solid", lwd = 2, ylim=c(-0.1,1.1), xaxt = "n",yaxt = "n",
xlab="Memory", ylab="Speed",cex.lab=1, xlim=c(-0.1,1.1))
text(as.matrix(d1[,2]), as.matrix(d1[,3]),
labels=as.matrix(d1[,1]), cex= 0.7, pos=3)
x_axis_range <- c(0,1)
x_axis_labels <- c("Small","Large")
axis(1,at = x_axis_range, labels = x_axis_labels)
y_axis_range <- c(0,1)
y_axis_labels <- c("Slow","Fast")
axis(2,at = y_axis_range, labels = y_axis_labels)
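As a side note, the as.matrix(...) text appears in the original plot because plot() builds its default axis titles from the expressions passed as x and y; the later title() call then draws a second label on top. A compact variant of the answer, given here only as a sketch, passes plain columns, sets xlab and ylab directly, and adds xpd = NA so that labels near the edge are not clipped. The use of trimws() to drop trailing spaces from the labels is an assumption about what is wanted.

```r
# Same toy data as above, built directly as a data frame
d1 <- data.frame(
  V1 = c("Xeafield (1999) (PER) ", "Lancelot et al. (1989)",
         "LemLM Xeafield (1997) (TER) ", "Almore and Flemin (2001) (KER)",
         "Malek et al. (2006) (HER) ", "Charles and Osborne (2003)"),
  V2 = c(1, 0.9394939494, 0.6265126513, 0.4218921892, 0.4125412541, 0.0308030803),
  V3 = c(0.5745375408, 0.4733405876, 0.2959738847, 0.5745375408, 1, 0.1414581066)
)

# Axis titles are given explicitly, so no expression text is shown
plot(d1$V2, d1$V3, col = "blue", pch = 19,
     xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1),
     xaxt = "n", yaxt = "n", xlab = "Memory", ylab = "Speed")
# xpd = NA disables clipping so long labels can extend past the plot region
text(d1$V2, d1$V3, labels = trimws(d1$V1), cex = 0.7, pos = 3, xpd = NA)
axis(1, at = c(0, 1), labels = c("Small", "Large"))
axis(2, at = c(0, 1), labels = c("Slow", "Fast"))
```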

How to add a second vertical line in R package forestplot

I'd like to distinguish between statistical significance (OR = 1.0) and clinical significance (OR = 1.5) in my forest plot. I created this plot using the forestplot package, sample code below. Is adding a vertical line possible (while maintaining the line of no difference)?
library(forestplot)
test_data <- structure(list(
mean = c(NA, NA, 1, 0.5, 2),
lower = c(NA, NA, .5, .25, 1.5),
upper = c(NA, NA, 1.5, .75, 2.5)),
.Names = c("mean", "lower", "upper"),
row.names = c(NA, -5L),
class = "data.frame")
tabletext <- cbind(
c("", "Outcome", "Outcome 1", "Outcome 2", "Outcome 3"),
c("", "OR", "1 (0.5 - 1.5)", "0.5 (0.25 - 0.75)", "2.0 (1.5 - 2.5)"))
forestplot(tabletext,
test_data,
new_page = TRUE,
xlog = TRUE,
boxsize = .25
)
Is this what you were looking for?
forestplot(tabletext,
           test_data,
           new_page = TRUE,
           xlog = TRUE,
           grid = structure(c(log(1.5)),
                            gp = gpar(lty = 2, col = "#CCCCFF")),
           zero = 1,
           boxsize = .25)
A suboptimal (and not very elegant) solution could be: (1) create an empty plot with no axes or labels, (2) draw a vertical line with abline(v=1.5), and (3) call forestplot with new_page = FALSE.

Add median trend line and p-value for one-sided repeated measures test in 2-y axis scatter plot [R]

Load sample data frame
df <- structure(list(ID = c(1,1,1,2,2,2,3,3,3),
time = c(0L,1L,2L,0L,1L,2L,0L,1L,2L),
M1a = c(0, 0.2, 0.3, 0, 1.5, 2.9,0, 2.4, 3.9),
M2a = c(0, 0.4, 0.6,0,0.9, 0.9,0,0.5, 0.7),
M3a = c(0,0.3, 0.4, 0, 0.6, 0.9,0, 0.5, 0.8),
M4a = c(0,0.6, 0.6,0, 0.4, 0.6,0, 0.2, 0.9),
M1b = c(0L, 200L, 300L,0L, 300L, 900L,0L, 900L, 1000L),
M2b = c(0L, 400L, 600L,0L, 600L, 900L,0L, 600L, 1000L),
M3b = c(0L, 300L, 400L,0L, 200L, 800L,0L, 200L, 900L),
M4b = c(0L, 600L, 600L,0L, 800L, 1000L,0L, 400L, 1100L)),
.Names = c("ID", "time", "M1a", "M2a", "M3a", "M4a","M1b", "M2b","M3b", "M4b"), class = "data.frame", row.names = c(NA, -9L))
Now plot a scatter plot with two y-axes
par(mar=c(5,4,4,5)+.1)
plot(df$time,df$M1a,type="p",col="red", main="M1", cex=0.5, cex.main=2, cex.lab=1.0, cex.axis=0.7)
par(new = TRUE)
plot(df$time,df$M1b,type="p",col="blue",xaxt="n",yaxt="n",xlab="",ylab="")
mtext("Relative change (%)",side=4,line=3)
axis(4)
legend("topleft",col=c("red","blue"),lty=1,legend=c("Absolute Change","Relative Change"))
What am I stuck on?
1. Median trend line
I was able to add a regression line, but I want a median trend line connecting the M1a and M1b medians for the three time points.
2. Adding p-values to the plot (repeated-measures one-way ANOVA)
fit1=aov(df$M1a~df$time + Error(ID/time),na.action=na.exclude,data=df);
sig1= summary(fit1)$"Error: Within"$"Pr(>F)"
if (sig<0.001) star='**' else if (sig>=0.001&sig<0.05) star='*' else star='';
if (sig1<0.001) star='**' else star='';
I was planning to use the code above to add the p-value to my two-y-axis plot. However, sig1 comes back as NULL, although it should be 0.153.
The final result should include a * mark in the main title of the plot (M1) if the result is significant.
Any tips? Thanks in advance!
To answer #2 first, one needs to look at the inner structures of a summary.aov object:
dput(summary(fit1))
structure(list(`Error: ID` = structure(list(structure(list(Df = 1,
`Sum Sq` = 5.60666666666667, `Mean Sq` = 5.60666666666667,
`F value` = NA_real_, `Pr(>F)` = NA_real_), .Names = c("Df",
"Sum Sq", "Mean Sq", "F value", "Pr(>F)"), class = c("anova",
"data.frame"), row.names = "Residuals")), class = c("summary.aov",
"listof")), `Error: ID:time` = structure(list(structure(list(
Df = 1, `Sum Sq` = 11.3157142857143, `Mean Sq` = 11.3157142857143), .Names = c("Df",
"Sum Sq", "Mean Sq"), class = c("anova", "data.frame"), row.names = "df$time")), class = c("summary.aov",
"listof")), `Error: Within` = structure(list(structure(list(Df = c(1,
5), `Sum Sq` = c(0.325952380952381, 0.573888888888889), `Mean Sq` = c(0.325952380952381,
0.114777777777778), `F value` = c(2.83985617480293, NA), `Pr(>F)` = c(0.152766396924706,
NA)), .Names = c("Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)"
), class = c("anova", "data.frame"), row.names = c("df$time ",
"Residuals"))), class = c("summary.aov", "listof"))), .Names = c("Error: ID",
"Error: ID:time", "Error: Within"), class = "summary.aovlist")
And note that the values within summary(fit1)$"Error: Within" are actually buried one level deeper (and don't have names, so they need a numeric index). Do this:
summary(fit1)$"Error: Within"[[1]]$`Pr(>F)`
[1] 0.1527664 NA
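To connect this back to the star in the title that the question asks for, here is a minimal sketch. It refits the model on the M1a column only, takes the first element of the extracted vector (the second is the NA from the Residuals row), and builds the title string. The names p, star, and main_title are hypothetical, and the thresholds are copied from the question.

```r
# Only the columns needed for the M1a model, copied from the question's data
df <- data.frame(ID   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 time = c(0L, 1L, 2L, 0L, 1L, 2L, 0L, 1L, 2L),
                 M1a  = c(0, 0.2, 0.3, 0, 1.5, 2.9, 0, 2.4, 3.9))

# Same model call as in the question
fit1 <- aov(df$M1a ~ df$time + Error(ID/time), data = df)

# Pr(>F) has one entry per row; [1] keeps the term and drops the Residuals NA
p <- summary(fit1)$"Error: Within"[[1]]$`Pr(>F)`[1]

# Significance stars, using the question's cut-offs
star <- if (p < 0.001) "**" else if (p < 0.05) "*" else ""
main_title <- paste0("M1", star)

plot(df$time, df$M1a, col = "red", main = main_title)
```

With this data the p-value is about 0.153, so no star is appended and the title stays "M1".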
Now to see if I can figure out the two-ordinate plot issue. I am pretty sure one would need to do any median plotting before the par(new=TRUE) operation, because that call changes the user coordinate system based on the new data.
Adding a title with the extracted value to your plot, augmented by the helpful comment by @VincentBonhomme:
plot(df$time,df$M1a,type="p",col="red", cex=0.5, cex.main=2, cex.lab=1.0, cex.axis=0.7)
lines(unique(df$time),
tapply(df$M1a, df$time, median))
par(new = TRUE)
plot( df$time, df$M1b,type="p", col="blue", xaxt="n", yaxt="n", xlab="",ylab="")
lines(unique(df$time),
tapply(df$M1b, df$time, median))
mtext("Relative change (%)",side=4,line=3)
axis(4)
legend("topleft",col=c("red","blue"), lty=1,legend=c("Absolute Change","Relative Change"))
title(main=bquote("P-value for M1 (absolute scale)"==
.(round(summary(fit1)$"Error: Within"[[1]]$`Pr(>F)`[1], 3) ) ) )
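The median lines above rely on unique(df$time) coming out in the same order as the groups returned by tapply(), which holds here because the times first appear in increasing order. A slightly more defensive sketch, under the assumption that time is numeric, takes the x positions from the names of the tapply() result instead; med and x_med are hypothetical names.

```r
time <- c(0, 1, 2, 0, 1, 2, 0, 1, 2)
M1a  <- c(0, 0.2, 0.3, 0, 1.5, 2.9, 0, 2.4, 3.9)

# Median of M1a at each time point; the result is named by time value
med <- tapply(M1a, time, median)
# Recover the x positions from those names so x and y always line up
x_med <- as.numeric(names(med))

plot(time, M1a, col = "red", cex = 0.5)
lines(x_med, med, col = "red")
```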

Move x-axis labels down from x-axis

x<- structure(list(count = c(4259120, 4317840, 4444000, 4254240,
4656800), the_date = structure(c(1389589200, 1389675600, 1389762000,
1389848400, 1389934800), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("count",
"the_date"), row.names = c(51L, 406L, 664L, 197L, 196L), class = "data.frame")
par(mar = c(8, 4, 4, 2) + 0.1)
plot(x$the_date, x$count, type="l", xaxt = "n", xlab = "")
axis(1, labels = FALSE)
labels<-x$the_date
labels<-format(labels, format="%b-%d-%Y")
text(x$the_date, par("usr")[3] - 0.75, srt = 55, adj = 1, labels = labels, xpd = TRUE)
I've tried adjusting the par("usr")[3] - 0.75 offset as specified here, but the labels aren't moving at all.
You can use a trick like this, with two calls to axis functions. I am using axis.Date here since you are dealing with dates (it is better for formatting). You can then use the line argument to adjust the position of the labels.
axis(1,labels=FALSE)
axis.Date(1,at = x$the_date,las=2, format= "%m-%d",line=0.5,tick=FALSE)
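A likely reason the labels did not move in the question: the y coordinate passed to text() is in data units, and the counts here are in the millions, so an offset of 0.75 is too small to see. Below is a sketch of the text() approach with an offset scaled to the axis range. The 5% factor, the UTC time zone, and the name offset are assumptions for illustration.

```r
x <- data.frame(
  count = c(4259120, 4317840, 4444000, 4254240, 4656800),
  the_date = as.POSIXct(c(1389589200, 1389675600, 1389762000,
                          1389848400, 1389934800),
                        origin = "1970-01-01", tz = "UTC")
)

par(mar = c(8, 4, 4, 2) + 0.1)
plot(x$the_date, x$count, type = "l", xaxt = "n", xlab = "")
# Ticks only, at the observation dates
axis(1, at = as.numeric(x$the_date), labels = FALSE)

usr <- par("usr")
# Offset expressed as a fraction of the y range, so it scales with the data
offset <- 0.05 * (usr[4] - usr[3])
text(x$the_date, usr[3] - offset,
     labels = format(x$the_date, format = "%b-%d-%Y"),
     srt = 55, adj = 1, xpd = TRUE)
```

Increasing the 0.05 factor moves the labels further down; the bottom margin in par(mar = ...) has to be large enough to hold them.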

Midcentered bar graphs

I have a data set with percentages in four categories: the top two categories are "positive" and the bottom two categories are "negative", so I want to align the boundary between 2 and 3 so that it is the zero point on all the bars. (I'm plotting pairs of bars: one set of bars for ext.obs=0 and one for ext.obs=1.) Here's a portion of the data:
structure(list(ext.obs = c(0, 0, 0, 1, 1, 1), comp = c(1, 2,
3, 1, 2, 3), `1` = c(0.00617283950617284, 0.00609756097560976,
0.0111111111111111, 0, 0, 0), `2` = c(0.154320987654321, 0.195121951219512,
0.161111111111111, 0.211180124223602, 0.392638036809816, 0.23030303030303
), `3` = c(0.709876543209877, 0.676829268292683, 0.666666666666667,
0.745341614906832, 0.521472392638037, 0.721212121212121), `4` = c(0.12962962962963,
0.121951219512195, 0.161111111111111, 0.0434782608695652, 0.0858895705521472,
0.0484848484848485)), .Names = c("ext.obs", "comp", "1", "2",
"3", "4"), row.names = c(1L, 2L, 3L, 11L, 12L, 13L), class = "data.frame")
I would like to be able to put together a matrix with these data that I can just do barplot(datamatrix) and have it come out nice. But I can't figure out any way other than plotting the top two categories and then adding the bottom two categories using barplot(..., add=T).
Here's the code I wrote (I actually plot 10 pairs of bars with par(mfrow=c(1, 10)), looping through for(i in 1:10)):
bar.loc <- barplot(t(as.matrix(tab3[c(i, i+10), c(5,6)])),
ylim=c(-0.5, 1.0),
col=my.pal[3:4],
xaxt="n",
yaxt="n",
ylab="",
xlab=components[i]
)
barplot(t(as.matrix(tab3[c(i, i+10), c(4, 3)]*(-1))),
add=T,
col=my.pal[2:1],
yaxt="n",
xaxt="n",
ylab="",
xlab="")
Can anyone think of a more elegant way to do this?
Try this:
barplot( t(cbind(tab3[,5:6],-tab3[,6:5],-tab3[,4:3])),
col=c('lightblue','darkblue',NA,NA,'tan','brown') )
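The idea behind this one-call version, as far as I can tell, is that barplot() stacks segments cumulatively: the positive categories go up from zero, two uncoloured segments of the same size bring the running total back to zero, and the negative categories then extend below the axis. Here is a self-contained sketch on the first three rows of the posted data, with values rounded; the names tab, pos, neg, and h are hypothetical.

```r
# First three rows of the posted data, rounded (categories "1" to "4")
tab <- data.frame("1" = c(0.006, 0.006, 0.011),
                  "2" = c(0.154, 0.195, 0.161),
                  "3" = c(0.710, 0.677, 0.667),
                  "4" = c(0.130, 0.122, 0.161),
                  check.names = FALSE)

pos <- as.matrix(tab[, c("3", "4")])  # drawn above zero
neg <- as.matrix(tab[, c("2", "1")])  # drawn below zero

# Rows of h are stacked in order: up, up, back down to zero (no fill), down, down
h <- t(cbind(pos, -pos[, 2:1], -neg))

barplot(h, col = c("lightblue", "darkblue", NA, NA, "tan", "brown"))
abline(h = 0)
```

Each bar's segments sum to minus the total of its negative categories, which is where the stack ends after the invisible segments cancel the positive part.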
