R histogram axes too small for the dataset [duplicate]

This question already has an answer here:
R - Customizing X Axis Values in Histogram
(1 answer)
Closed 7 years ago.
I'm learning R and trying out the hist() histogram function. My code is pasted below. When I run it, the axes (A) do not connect at the origin and (B) do not extend far enough for the dataset. I've searched and haven't found a fix; none of xlim, ylim, or axes=FALSE solves it.
bluegill = read.table(file="lab2.csv", header=TRUE, sep=",")
attach(bluegill)
hist(Length, main="", xlab="Length (mm)", ylab="Number of individuals", col="gray")
This is the resulting chart: the maximum length in the data set is 220, but the x axis only goes to 200.

A simple solution could be this:
bluegill = read.table(file="lab2.csv", header=TRUE, sep=",")
attach(bluegill)
hist(Length, main="", xlab="Length (mm)", ylab="Number of individuals", col="gray", xaxt = "n") ## no x axis
Note the new option, xaxt = "n", which suppresses the x axis completely. You can then add the x axis later with another command, e.g.
axis(1, at = seq(0, 220, 20))
The first argument, 1, selects the x axis (the bottom side of the plot).
The second argument, at, gives the positions where tick marks and labels are drawn. Here they run to 220 so that the axis covers the maximum length in the data.
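To avoid hard-coding the tick range, the bin edges that hist() computes can be reused as tick positions, since they always span the data. This is only a sketch: the Length values below are made up to stand in for the column in lab2.csv, which is not shown in the question.

```r
# Hypothetical stand-in for the Length column of lab2.csv (maximum is 220)
set.seed(1)
Length <- c(round(runif(50, min = 40, max = 210)), 220)

# Draw the histogram without the default x axis and keep the returned object
h <- hist(Length, main = "", xlab = "Length (mm)",
          ylab = "Number of individuals", col = "gray", xaxt = "n")

# The bin edges cover the full range of the data, so use them as tick positions
ticks <- h$breaks
axis(1, at = ticks)
```

Because the ticks come from h$breaks, the axis is drawn out to the last bin edge even when the default labelling would stop short of it.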

Related

2 vectors in one plot [duplicate]

This question already has answers here:
Plotting multiple curves same graph and same scale
(5 answers)
Closed 4 years ago.
I spent a long time trying to figure something out which I thought would be very easy. I have three vectors (or a data frame if you want to make it into one)
date <- c("Q1","Q2","Q3","Q4")
group1 <- c(12,13,16,11)
group2 <- c(9,11,10,9)
Now I want to create one graph with the date along the x-axis and two lines representing the 2 groups. For a bit of context, I did a difference-in-difference regression and want to show the average values for the treatment and control groups around the event. I'm using panel data and have already calculated the mean for both groups at each point in time. Here is a screenshot I took so you can see how I want it to look.
# plot solid line, set plot size, but omit axes
plot(x=seq_along(date), y=group1, type="l", lty=1, ylim=c(5,20),
     axes=FALSE, bty="n", xaxs="i", yaxs="i", main="My Title",
     xlab="", ylab="Total Risk-Based Capital Ratio")
# plot dashed line
lines(x=seq_along(date), y=group2, lty=2)
# add axes
axis(side=1, labels=date, at=seq_along(date))
axis(side=2, at=seq(5,20,3), las=1)
# add vertical red line
abline(v=2, col="red")
# add legend below the plot region (xpd=TRUE allows drawing outside it)
par(xpd=TRUE)
legend(x=1.5, y=2, legend=c("solid", "dashed"), lty=1:2, box.lty=0, ncol=2)
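If the two groups are held as columns of one matrix, matplot() from base graphics can draw both lines in a single call. The sketch below is one possible variant, not the original answer's method; the column names "treatment" and "control" and the y-axis padding are assumptions made for illustration.

```r
quarters <- c("Q1", "Q2", "Q3", "Q4")

# Assumed mapping of the question's group1/group2 to treatment/control
groups <- cbind(treatment = c(12, 13, 16, 11),
                control   = c(9, 11, 10, 9))

# Pad the data range so neither line touches the plot border
y_limits <- range(groups) + c(-2, 2)

# One call draws both columns; lty distinguishes the groups
matplot(seq_along(quarters), groups, type = "l", lty = 1:2, col = "black",
        ylim = y_limits, xaxt = "n", xlab = "", ylab = "Mean value")
axis(side = 1, at = seq_along(quarters), labels = quarters)
abline(v = 2, col = "red")   # vertical line marking the event
legend("topright", legend = colnames(groups), lty = 1:2, bty = "n")
```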

How to reduce the size of the legend in R Plot, while still making it readable?

I am trying to plot some data over years with two y-axes in R. However, whenever I try to include a legend, the legend dominates my plot. When I use solutions suggested elsewhere, like keyword and/or the cex argument suggested in another post here, the legend either becomes unreadable or is still too big.
Here is my example with randomly generated data:
#Create years
year.df <- seq(1974, 2014, 1)
# Create y-axis data
set.seed(75)
mean1 <- rnorm(length(year.df), 52.49, 0.87)
mean2 <- rnorm(length(year.df), 52.47, 0.96)
#Create dataframe
df <- data.frame(cbind(year.df, mean1, mean2))
I want a second y-axis showing the difference of the two means over the years:
df$diff <- abs(df$mean1 - df$mean2)
When I plot using the code below to create two y-axes:
par(mfrow=c(1,1), mar=c(5.1,4.1,4.1,5.1))
with(df, plot(year.df, mean1, type = "l", lwd=4, xlab="Year", ylab="Mean", ylim=c(48,58)))
with(df, lines(year.df, mean2, type = "l", col="green", lwd=4))
par(new=TRUE)
with(df, plot(year.df, diff, type="l", axes=FALSE, xlab=NA, ylab=NA, col="red", lty=5, ylim=c(0,10)))
axis(side = 4)
mtext(side = 4, line = 3, "Annual Difference")
legend("topleft",
legend=c("Calculated", "MST", "Diff"),
lty=c(1,1,5), col=c("black", "green", "red"))
I get:
When I use the cex=0.5 argument in the legend(), it starts to become unreadable:
Is there a way to format my legend in a clear, readable manner? Better than what I have?
The white space in the legend tells me that you manually widened your plot window. Legends do not scale well when it comes to manual re-sizing.
The solution is opening a plot of the exact size you need before plotting. In Windows, this is done with windows(width=10, height=8). Units are in inches.
As you can see below, the legend sits tightly in the corner.
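Since windows() is only available on Windows, the same idea can be sketched with a file device whose size is fixed when it is opened. Using pdf() with a temporary file is an assumption here, as is the placeholder line being plotted; only the legend call mirrors the question.

```r
# Open a device with an explicit size (in inches) before plotting
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 10, height = 8)

# Device width and height in inches, as reported by the open device
device_size <- par("din")

# Placeholder series standing in for the question's data
plot(1974:2014, seq(48, 58, length.out = 41), type = "l", lwd = 4,
     xlab = "Year", ylab = "Mean")
legend("topleft", legend = c("Calculated", "MST", "Diff"),
       lty = c(1, 1, 5), col = c("black", "green", "red"), cex = 0.75)

dev.off()
```

Because the device is never resized by hand, the legend is laid out for the final plot dimensions.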
Apparently, I forgot the first step of troubleshooting: turn it off and on again. I woke up this morning and ran the script again, and even with cex = 0.5 it turned out fine. I chose to use cex = 0.75. I would still appreciate any insight into why that might be. I spent many hours yesterday trying to fix my legend, and now the same code works and produces this plot (cex=0.75):

How to plot actual year in x axis of r plot instead of year 0,1,2......20

I am a newbie in R; please point me to the right place if I have asked my question in the wrong forum. I am reading and extracting data from a netCDF file. I want to plot the actual years, i.e. 1980, 1981, ..., 1999, on the x-axis instead of 0, 1, ..., 20. I tried to change the range using xrange or xaxt and the axis command in plot, but was unable to do so. Also, I want to plot the lines between 1980 and 1999, but the line continues after 1999 (see image: http://i.stack.imgur.com/z8sEL.png). I have been trying for the last 7 days without any success and cannot move on. I have copied partial code and the image. I will appreciate your help. Thank you.
for (j in 1:length(station_rchid)){
  for (i in 1:length(rchid)){
    if(identical(station_rchid[j],rchid[i])){
      windows()
      per<-'Average Annual '
      an_time<-1:nyear
      heading <- paste(per,vari,tper,station_name[j])
      yrange<- max(varX_year[1,,j],varX_year[2,,j])
      plot(an_time,varX_year[1,,j],main=heading,type="l",ylim=c(0,yrange),xlab="Year",ylab=unity,col="red",cex.lab=1.5,cex.axis=1.5)
      lines(an_time,varX_year[2,,j],col="blue")
      legend("topleft", c("Pred","Obs"),lty=c(1,1),lwd=c(2.5,2.5), col=c("red","blue"),inset = 1.4)
      filename<-paste("NZ_Annual_swe_",station_name[i],".xls")
      # write.xls(varX_year[ , ,j],file=filename,colNames=TRUE,rowNames=FALSE)
      # (the rest of the loop body is not shown in the original post)
    }
  }
}
One way is to suppress the axes in the plot call and then add them back afterwards. The pretty() function can help you automate the label placement. Pass any axis options to the axis() calls.
plot(an_time, varX_year[1,,j], main = heading, type = "l", ylim = c(0,yrange),
     xlab = "Year", ylab = unity, col = "red", cex.lab = 1.5,
     axes = FALSE)
# an_time starts at 1, so adding 1979 labels the first year as 1980
axis(1, at = pretty(an_time), labels = pretty(an_time) + 1979, cex.axis = 1.5)
axis(2, cex.axis = 1.5)
box()
Be cautious with this approach, since it only changes the labels, not the underlying x values.
Another workaround is simply to shift the year vector before plotting, i.e.
an_time <- an_time + 1979
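A self-contained sketch of the second workaround, plotting against calendar years directly, might look like this. The values of nyear and first_year and the placeholder series are assumptions, since the netCDF data is not available here. Because an_time starts at 1, the offset is first_year - 1, so that the first point lands on 1980.

```r
nyear <- 20                      # assumed: one value per year, 1980 to 1999
first_year <- 1980               # assumed start year
an_time <- seq_len(nyear)        # 1, 2, ..., 20 as in the question

# Convert the index into calendar years; index 1 maps to first_year
years <- an_time + first_year - 1

# Placeholder series standing in for varX_year[1, , j]
values <- sqrt(an_time)

# Plot against the real years so the axis labels match the data positions
plot(years, values, type = "l", col = "red",
     xlab = "Year", ylab = "Value", xlim = range(years))
```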

How to remove e-notation on y axis when plotting graph with R? [duplicate]

I regularly do all kinds of scatter plots in R using the plot command.
Sometimes both, sometimes only one of the plot axes is labelled in scientific notation. I do not understand when R makes the decision to switch to scientific notation. Surprisingly, it often prints numbers which no sane human would write in scientific notation when labelling a plot, for example it labels 5 as 5e+00. Let's say you have a log-axis going up to 1000, scientific notation is unjustified with such "small" numbers.
I would like to suppress that behaviour, I always want R to display integer values. Is this possible?
I tried options(scipen=10) but then it starts writing 5.0 instead of 5, while on the other axis 5 is still 5 etc. How can I have pure integer values in my R plots?
I am using R 2.12.1 on Windows 7.
Use options(scipen=5) or some other sufficiently high number. The scipen option determines how likely R is to switch to scientific notation: the higher the value, the less likely the switch. Set the option before making your plot; if the labels are still in scientific notation, set it to a higher number.
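The effect of scipen can be checked without plotting, because format() follows the same option. A small sketch (the variable names are just for illustration):

```r
# Start from R's default penalty and remember the previous setting
old <- options(scipen = 0)

# With the default, 1e5 is shorter in scientific notation, so R uses it
default_label <- format(1e5)   # "1e+05"

# A positive penalty makes fixed notation win for this value
options(scipen = 5)
fixed_label <- format(1e5)     # "100000"

# Restore whatever was set before
options(old)
```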
You can use format or formatC to, ahem, format your axis labels.
For whole numbers, try
x <- 10 ^ (1:10)
format(x, scientific = FALSE)
formatC(x, digits = 0, format = "f")
If the numbers are convertible to actual integers (i.e., not too big), you can also use
formatC(x, format = "d")
How you get the labels onto your axis depends upon the plotting system that you are using.
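For base graphics, one way to get such labels onto an axis is to suppress the default axis and pass the formatted strings to axis(). This is a sketch with made-up data; trim = TRUE and big.mark are optional format() arguments chosen here for readability, not something the answer above prescribes.

```r
x <- 10^(1:6)

# Fixed notation without padding, with and without thousands separators
labels_plain   <- format(x, scientific = FALSE, trim = TRUE)
labels_grouped <- format(x, scientific = FALSE, trim = TRUE, big.mark = ",")

# Suppress the default x axis, then draw one with the formatted labels
plot(x, seq_along(x), log = "x", xaxt = "n", xlab = "x", ylab = "index")
axis(1, at = x, labels = labels_grouped)
```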
Try this. I purposely broke out various parts so you can move things around.
library(sfsmisc)
#Generate the data
x <- 1:100000
y <- 1:100000
#Setup the plot area
par(pty="m", plt=c(0.1, 1, 0.1, 1), omd=c(0.1,0.9,0.1,0.9))
#Plot a blank graph without completing the x or y axis
plot(x, y, type = "n", xaxt = "n", yaxt="n", xlab="", ylab="", log = "x", col="blue")
mtext(side=3, text="Test Plot", line=1.2, cex=1.5)
#Complete the x axis
eaxis(1, padj=-0.5, cex.axis=0.8)
mtext(side=1, text="x", line=2.5)
#Complete the y axis and add the grid
aty <- seq(par("yaxp")[1], par("yaxp")[2], (par("yaxp")[2] - par("yaxp")[1])/par("yaxp")[3])
axis(2, at=aty, labels=format(aty, scientific=FALSE), hadj=0.9, cex.axis=0.8, las=2)
mtext(side=2, text="y", line=4.5)
grid()
#Add the line last so it will be on top of the grid
lines(x, y, col="blue")
You can use the axis() command for that, e.g.:
x <- 1:100000
y <- 1:100000
marks <- c(0,20000,40000,60000,80000,100000)
plot(x,y,log="x",yaxt="n",type="l")
axis(2,at=marks,labels=marks)
EDIT: if you want all of them in the same format, you can use @Richie's solution to get them:
x <- 1:100000
y <- 1:100000
marks <- c(0,20000,40000,60000,80000,100000)
plot(x,y,log="x",yaxt="n",type="l")
axis(2,at=marks,labels=format(marks,scientific=FALSE))
You could try lattice:
require(lattice)
x <- 1:100000
y <- 1:100000
xyplot(y~x, scales=list(x = list(log = 10)), type="l")
The R graphics package has the function axTicks, which returns the tick locations that the axis and plot functions would set automatically. The other answers to this question define the tick locations manually, which might not be convenient in some situations.
myTicks = axTicks(1)
axis(1, at = myTicks, labels = formatC(myTicks, format = 'd'))
A minimal example would be
plot(10^(0:10), 0:10, log = 'x', xaxt = 'n')
myTicks = axTicks(1)
axis(1, at = myTicks, labels = formatC(myTicks, format = 'd'))
There is also a log parameter in the axTicks function, but in this situation it does not need to be set to get the proper logarithmic tick locations.
Normally, setting the axis limit at the max of your variable is enough:
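One case that format = 'd' does not cover is a log axis with ticks below 1, since integer formatting cannot represent them. A possible variant, sketched here with made-up data, formats each tick on its own with format(scientific = FALSE):

```r
# Data spanning values below and above 1 on a log x axis
plot(10^(-2:3), 1:6, log = "x", xaxt = "n", xlab = "x", ylab = "y")

# Tick locations R would have chosen for the x axis
ticks <- axTicks(1)

# Format each tick separately so 0.01 and 1000 keep their natural widths
tick_labels <- vapply(ticks, format, character(1), scientific = FALSE)

axis(1, at = ticks, labels = tick_labels)
```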
a <- c(0:1000000)
b <- c(0:1000000)
plot(a, b, ylim = c(0, max(b)))
