I'm trying to make a grouped barplot in R, but there are some things I cannot figure out. This is what I have so far:
I would like:
to create a matrix from the data.frame (.csv file, see below)
the ablines to appear, but not in front of the bars
labels for the grouped bars (November, December, January, ... ->see data below)
for the plot layout to be as shown below. (I basically want the plot border)
I used the following code:
x<-matrix(nrow=3,ncol=7, data=c(200,227,196,210,279,319,220,126,111,230,196,123,240,106,94,250,154,233,260,226,218))
tiff("p_month_all.tiff", width=600, height=300)
par(mar=c(5,4,0.5,0.5))
a=c("November","December","January","February","March","April","May")
barplot(x, beside=TRUE, ylim=c(0,350),xlab="Month", axes=TRUE,axis.lty=1, ylab="Monthly Precipitation [mm]", col=c("darkblue","dodgerblue3","deepskyblue1"),panel.first= abline(h = c(50,100,150,200,250,300), col = "grey", lty = 2), xaxt="n", yaxt="n")
par(ps=12, cex =1, cex.main=2)
axis(2, c(0,350, c(50, 100, 150, 200, 250, 300)), las=1)
dev.off()
The data set (.csv file) looks like this:
Month Hornberg Strick Huetten
November 120 278 234
December 279 156 145
January 328 300 299
February 267 259 234
March 190 201 187
April 150 199 177
May 147 156 160
I've rewritten your code for clarity so you can see more easily what the problem is.
You were suppressing the axes with xaxt = "n" and yaxt = "n". I removed those lines.
Adding a call to box draws the box around the plot.
Adding a call to grid draws gridlines in the plot.
I've added row and column names to your data matrix so the plot knows what to use in the axes.
I've updated the plot margins.
I also tidied a few bits like replacing month names with month.name and using seq.int rather than a hard-coded sequence.
x <- matrix(
c(
200, 227, 196,
210, 279, 319,
220, 126, 111,
230, 196, 123,
240, 106, 94,
250, 154, 233,
260, 226, 218
),
nrow = 3,
ncol = 7
)
colnames(x) <- month.name[c(11:12, 1:5)]
rownames(x) <- c("Hornberg", "Strick", "Huetten")
par(mar = c(5, 4, 1.5, 0.5), ps = 12, cex = 1, cex.main = 2, las = 1)
barplot(
x,
beside = TRUE,
ylim = c(0,350),
xlab = "Month",
axes = TRUE,
axis.lty = 1,
ylab = "Monthly Precipitation [mm]",
col = c("darkblue", "dodgerblue3", "deepskyblue1"),
panel.first = abline(
h = seq.int(50, 300, 50),
col = "grey",
lty = 2
)
)
box()
grid()
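Two of the original requests (building the matrix from the data frame, and keeping the lines behind the bars) can also be handled with a two-pass approach. This is only a sketch under assumptions: the table from the question is read with read.table(text = ...) as a stand-in for the real .csv file, and the bars are drawn a second time with add = TRUE so they cover the lines.

```r
# Read the table from the question; text = stands in for the real .csv file.
precip <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Month Hornberg Strick Huetten
November 120 278 234
December 279 156 145
January 328 300 299
February 267 259 234
March 190 201 187
April 150 199 177
May 147 156 160
")

# One row per station and one column per month, as barplot(beside = TRUE) expects.
x <- t(as.matrix(precip[, -1]))
colnames(x) <- as.character(precip$Month)

cols <- c("darkblue", "dodgerblue3", "deepskyblue1")

# First pass sets up the coordinate system, axes, and labels.
barplot(x, beside = TRUE, ylim = c(0, 350), las = 1, col = cols,
        xlab = "Month", ylab = "Monthly Precipitation [mm]")
# The horizontal lines are drawn on top of the first pass...
abline(h = seq.int(50, 300, 50), col = "grey", lty = 2)
# ...and the second pass repaints the bars over them.
barplot(x, beside = TRUE, col = cols, axes = FALSE, axisnames = FALSE,
        add = TRUE)
box()
```

Drawing the bars twice costs little and avoids depending on how barplot handles panel.first.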
First of all, look through the ggplot2 documentation; it's pretty good: http://docs.ggplot2.org/0.9.3.1/index.html
If you haven't found an answer for your question, never give up googling :)
Ok, about your question:
Create data
help(read.csv) -> import your data to data.frame named x
Prepare data for the plot:
Melt your data to use it for the plot
library(reshape2)  # provides melt()
x <- melt(x, id.vars = "Month")
Use Month variable as a factor and order by month:
x$Month <- factor(x$Month, levels = month.name)
x<-x[order(x$Month),]
Plot the graph using ggplot2 (as you tagged it here, and it's straightforward to use). Because the bar heights are already in the data, use stat="identity" rather than stat="bin":
library(ggplot2)
ggplot(x,aes(x=Month,y=value,fill=variable))+geom_bar(stat="identity",position="dodge")+theme_bw()+ylab("Monthly Precipitation [mm]")+xlab("Month")
For the colours, you can use scale_fill_brewer() (great tutorials here: http://www.cookbook-r.com/Graphs/Colors_%28ggplot2%29/)
ggplot(x,aes(x=Month,y=value,fill=variable))+geom_bar(stat="identity",position="dodge")+theme_bw()+ylab("Monthly Precipitation [mm]")+xlab("Month")+scale_fill_brewer(palette="Blues")
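The reshaping step can be sketched in base R as well, which makes it easier to see what the long format looks like. This assumes the table from the question (read here from inline text rather than the real file). Note that levels taken from month.name would put January first, so the sketch instead keeps the order in which the months appear in the file. The ggplot2 call only runs if the package is installed.

```r
precip <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Month Hornberg Strick Huetten
November 120 278 234
December 279 156 145
January 328 300 299
February 267 259 234
March 190 201 187
April 150 199 177
May 147 156 160
")

# Base R equivalent of melting: one row per month/station pair.
long <- data.frame(
  Month    = rep(precip$Month, times = ncol(precip) - 1),
  variable = rep(names(precip)[-1], each = nrow(precip)),
  value    = unlist(precip[-1], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Keep the months in file order (November first) rather than calendar order.
long$Month <- factor(long$Month, levels = unique(long$Month))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Month, y = value, fill = variable)) +
    geom_bar(stat = "identity", position = "dodge") +
    theme_bw()
  print(p)
}
```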
Related
I would like to change the Y axis of these histograms to start at 20 and end at 180, with a step of 20 between the numbers (20, 40, 60, ...). How should I do it?
I read about the yaxis command, but I just don't know how to make it work, as I am a total noob in coding (no education in that area).
This is the graph I am working on:
And this is the code I have:
orientation$head_linear <- ifelse(orientation$head > 180, 360 - orientation$head, orientation$head)
orientation$body_linear <- ifelse(orientation$body > 180, 360 - orientation$body, orientation$body)
par(mfrow = c(2,1))
hist(orientation$head_linear, main = NULL, ylim=c(20,180), ylab = NULL, xlab = NULL)
hist(orientation$body_linear, main = NULL, ylim=c(20, 180), ylab = NULL, xlab = " Odchylka od vletového otvoru ")
I have set the limits of the Y axis with ylim, but it doesn't seem to work (I have successfully used it before in different work).
Maybe you mean the yaxt= argument, with which you can omit the y-axis. If you use it, you can create a custom axis afterwards. Use seq(from=0, to=360, by=20) for its at= argument.
par(mfrow=c(2, 1))
hist(orientation$head_linear, main=NULL, yaxt='n', ylab=NULL, xlab=NULL)
axis(side=2, at=seq(from=0, to=360, by=20))
hist(orientation$body_linear, main=NULL, yaxt='n', ylab=NULL, xlab=" Odchylka od vletového otvoru ")
axis(2, seq(0, 360, 20))
Data:
n <- 500
orientation <- data.frame(
head_linear=sample(360, n, replace=TRUE),
body_linear=sample(360, n, replace=TRUE)
)
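If the tick marks should start at 20 and stop at a multiple of 20 just above the tallest bar, the upper limit can be derived from the bin counts instead of being hard-coded. A minimal sketch, assuming made-up angles in place of the real orientation data:

```r
set.seed(1)  # made-up angles, only so the sketch runs on its own
head_linear <- sample(180, 500, replace = TRUE)

# Compute the histogram without drawing it, to see how tall the bars get.
h <- hist(head_linear, plot = FALSE)

# Round the tallest bar up to the next multiple of 20.
top <- 20 * ceiling(max(h$counts) / 20)

# Draw without the default y-axis, then add ticks at 20, 40, ..., top.
plot(h, main = "", xlab = "", ylab = "", yaxt = "n", ylim = c(0, top))
axis(side = 2, at = seq(20, top, by = 20), las = 1)
```

The bars still start at zero; only the tick labels begin at 20.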
What is the correct way to enter coordinates with geom_polygon?
In this plot I would like to draw 2 rectangles.
One going from .5 to 1.5 on the x axis and 148 to 161 on the y axis.
The other going from 1.5 to 2.5 on the x axis and 339 to 352 on the y axis.
The coordinates in the polygon() call below work, but I'd like to confirm how the coordinates must be entered. Here the bottom edge of each rectangle is entered first (148, 148, 339, 339), followed by the top edge of each rectangle (352, 352, 161, 161). Is that how the coordinates must be entered: bottom edge first, then top edge?
plot(1, type="n", main="test",
xlim=c(0, 5), xlab="y",
ylim=c(0, max( 0,400 ) ), ylab="")
polygon(
x=c(0.5 ,1.5, 1.5, 2.5, 2.5, 1.5, 1.5, 0.5),
y= c(148, 148, 339, 339, 352, 352, 161, 161),
col = "blue", border = NA)
When I enter all 4 coordinates for each rectangle for the first rectangle first and then all 4 coordinates for the second rectangle the plot is wrong:
plot(1, type="n", main="test",
xlim=c(0, 5), xlab="y",
ylim=c(0, max( 0,400 ) ), ylab="")
polygon( x=c(.5,1.5,.5,1.5,1.5,2.5,1.5,2.5 ), y=c(148,148,161,161,339,339,352,352 ),
col = "red", border = NA)
Thank you.
This is a base plot question rather than ggplot2
polygon is trying to draw a single polygon rather than the two you want. It also assumes that the points are in order and that the last point is connected to the first point.
So your second example might work better if you separated the rectangles and reordered the points, perhaps trying
plot(1, type="n", main="test",
xlim=c(0, 5), xlab="y",
ylim=c(0, max(0, 400)), ylab="")
polygon(x=c(0.5, 1.5, 1.5, 0.5), y=c(148, 148, 161, 161),
col = "red", border = NA)
polygon(x=c(1.5, 2.5, 2.5, 1.5), y=c(339, 339, 352, 352),
col = "red", border = NA)
so rather than one distorted shape you would get two separate rectangles, which is what I assume you want.
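Both rectangles can also be drawn in a single call, because polygon() treats NA values in the coordinates as a break between separate polygons. A short sketch of that, plus rect() as an alternative for axis-aligned rectangles:

```r
plot(1, type = "n", main = "test",
     xlim = c(0, 5), xlab = "x",
     ylim = c(0, 400), ylab = "")

# Corners of each rectangle are listed in order around its outline;
# the NA tells polygon() where one shape ends and the next begins.
xs <- c(0.5, 1.5, 1.5, 0.5, NA, 1.5, 2.5, 2.5, 1.5)
ys <- c(148, 148, 161, 161, NA, 339, 339, 352, 352)
polygon(xs, ys, col = "red", border = NA)

# For axis-aligned rectangles, rect() takes the edges directly and is vectorised.
rect(xleft = c(0.5, 1.5), ybottom = c(148, 339),
     xright = c(1.5, 2.5), ytop = c(161, 352),
     border = "black")
```

The order that matters is the order around the outline of each shape, not "bottom first, then top".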
Sorry if image 1 is a little basic - layout sent by my project supervisor! I have created a scatterplot of total grey seal abundance (Total) over observation time (Obsv_time), and fitted a gam over the top, as seen in image 2:
plot(Total ~ Obsv_time,
data = R_Count,
ylab = "Total",
xlab = "Observation Time (Days)",
pch = 20, cex = 1, bty = "l",col="dark grey")
lines(R_Count$Obsv_time, fitted(gam.tot2))
I would like to somehow show on the graph the corresponding Season (Image 1) - from a categorical factor variable (4 levels: Pre-breeding,Breeding,Post-breeding,Moulting), which corresponds to Obsv_time.
I am unsure if I need to plot a secondary axis or just add labels to the graph...and how to do each! Thanks!
Image 1: Wanted graph layout - indicate season from factor variable
Image 2: Scatterplot with GAM curve
You can do this with base R graphics. Leave off the x-axis in the original plot, and add an axis with the season labels separately. You can indicate the season by overlaying polygons.
## Some bogus data
x = sort(runif(50,0,250))
y = 800*(sin(x/40) + x/100 + rnorm(50,0, 0.2)) + 500
FittedY = 800*(sin(x/40) + x/100)+500
plot(x,y, pch= 20, col='lightgray', ylim=c(300,2700), xaxt='n',
xlab="", ylab='Total')
lines(x, FittedY)
axis(1, at=c(25,95,155,215), tick=FALSE,
labels=c('PreBreed', 'Repro', 'PostBreed', 'Moulting'))
rect(c(-10,65,125,185), 0, c(65,125,185,260), 3000,
col=rainbow(4, alpha=0.05), border=NA)
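The season boundaries do not have to be typed in by hand; they can be taken from the factor itself. The sketch below assumes a data frame shaped like R_Count, with columns Obsv_time, Total and Season. The data here are made up, and the four equal-length seasons are purely illustrative.

```r
set.seed(42)
# Hypothetical stand-in for R_Count; the real data will differ.
R_Count <- data.frame(Obsv_time = 1:240)
R_Count$Season <- cut(R_Count$Obsv_time, breaks = c(0, 60, 120, 180, 240),
                      labels = c("Pre-breeding", "Breeding",
                                 "Post-breeding", "Moulting"))
R_Count$Total <- 800 * (sin(R_Count$Obsv_time / 40) + R_Count$Obsv_time / 100) +
  500 + rnorm(240, 0, 150)

# First and last observation day of each season, taken from the data.
lo <- tapply(R_Count$Obsv_time, R_Count$Season, min)
hi <- tapply(R_Count$Obsv_time, R_Count$Season, max)

plot(Total ~ Obsv_time, data = R_Count, pch = 20, col = "darkgrey",
     bty = "l", xaxt = "n", xlab = "", ylab = "Total")

# Shade each season over the full height of the plot region.
usr <- par("usr")  # plot region limits: x1, x2, y1, y2
rect(lo, usr[3], hi, usr[4], col = rainbow(4, alpha = 0.05), border = NA)

# Label each season at the midpoint of its time range.
axis(1, at = (lo + hi) / 2, labels = names(lo), tick = FALSE)
```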
If you are able to use ggplot2, you could add (or compute from time) another factor variable to your data-frame which would be your season. Then it is just a matter of using color (or any other) aesthetic which would use this season variable.
require(ggplot2)
df <- data.frame(total = c(26, 41, 31, 75, 64, 32, 7, 89),
time = c(1, 2, 3, 4, 5, 6, 7, 8))
df$season <- cut(df$time, breaks=c(0, 2, 4, 6, 8),
labels=c("winter", "spring", "summer", "autumn"))
ggplot(df, aes(x=time, y=total)) +
geom_smooth(color="black") +
geom_point(aes(color=season))
The data for some of these types of graphs that I'm graphing in R,
http://graphpad.com/faq/images/1352-1(1).gif
has outliers that are way out of range and I can't just exclude them. I attempted to use the axis.break() function from plotrix, but the function doesn't rescale the y axis. It just places a break mark on the axis. The purpose of doing this is to be able to show the medians for both groups, as well as the data points and the outliers, all in one plot frame. Essentially, the data points that are far apart from the majority are taking up a chunk of space, and the majority of points are being squished, not displaying much difference. Here is the code:
https://gist.github.com/9bfb05dcecac3ecb7491
Any suggestions would be helpful.
Thanks
Unfortunately the code you link to isn't self-contained, but possibly the code you have for gap.plot() there doesn't work as you expect because you are setting ylim to cover the full data range rather than the plotted sections only. Consider the following plot:
As you can see, the y axis has tickmarks for every 50 pg/ml, but there is a gap between 175 and 425. So the data range (to the nearest 50) is c(0, 500) but the range of the y axis is c(0, 250) - it's just that the tickmarks for 200 and 250 are being treated as those for 450 and 500.
This plot was produced using the following modified version of your code:
## made up data
GRO.Controls <- c(25, 40:50, 60, 150)
GRO.Breast <- c(70, 80:90, 110, 500)
##Scatter plot for both groups
library(plotrix)
gap.plot(jitter(rep(0,length(GRO.Controls)),amount = 0.2), GRO.Controls,
gap = c(175,425), xtics = -2, # no xtics visible
ytics = seq(0, 500, by = 50),
xlim = c(-0.5, 1.5), ylim = c(0, 250),
xlab = "", ylab = "Concentrations (pg/ml)", main = "GRO(P=0.0010)")
gap.plot(jitter(rep(1,length(GRO.Breast)),amount = 0.2), GRO.Breast,
gap = c(175, 425), col = "blue", add = TRUE)
##Adds x- variable (groups) labels
mtext("Controls", side = 1, at= 0.0)
mtext("Breast Cancer", side = 1, at= 1.0)
##Adds median lines for each group
segments(-0.25, median(GRO.Controls), 0.25, median(GRO.Controls), lwd = 2.0)
segments(0.75, median(GRO.Breast), 1.25, median(GRO.Breast), lwd = 2.0,
col = "blue")
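The arithmetic behind the gapped axis described above can be sketched in a few lines. The helper squash() is hypothetical and written only for illustration; it is not part of plotrix. It shows why a y range of c(0, 250) is enough for data that reach 500 when the gap is c(175, 425).

```r
gap <- c(175, 425)  # same gap as in the gap.plot() call above

# Values above the gap slide down by the width of the gap;
# values below the gap stay where they are.
squash <- function(y, gap) ifelse(y > gap[2], y - diff(gap), y)

squash(c(150, 450, 500), gap)
# 150 stays at 150, while 450 and 500 land at 200 and 250
```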
You could be using gap.plot() which is easily found by following the link on the axis.break help page. There is a worked example there.
I want to change the values on the x axis in my histogram in R.
The computer currently has it set as
0, 20, 40, 60, 80, 100.
I want the x axis to go by 10 as in:
0,10,20,30,40,50,60,70,80,90,100.
I know that to get rid of the current axis I have to do this
hist(x, ..., xaxt = 'n')
and then
axis(side = 1, ...)
But how do I get it to show the numbers that I need it to show?
Thanks.
The answer is right there in ?axis...
dat <- sample(100, 1000, replace=TRUE)
hist(dat, xaxt='n')
axis(side=1, at=seq(0, 100, 10), labels=seq(0, 100, 10))
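If the bars should also line up with the new tick marks, the bin edges can be set to the same sequence through the breaks argument. A minimal sketch with made-up data:

```r
set.seed(1)
dat <- sample(100, 1000, replace = TRUE)

# Bins of width 10, so the bar edges coincide with the tick marks.
h <- hist(dat, breaks = seq(0, 100, by = 10), xaxt = "n")

# Ticks at 0, 10, ..., 100; labels default to the values in at.
axis(side = 1, at = seq(0, 100, by = 10))
```

When labels is omitted, axis() labels each tick with its position, which is what the question asks for.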