i have this data and plot
mydata <- data.frame(a=c(1:5),b=c(6:10),c=c(11:15),e=c(16:20))
plot <- stripchart(mydata, method="jitter", vertical=T,main='plot',pch=19)
I would like to subset the x axis into two labels named 'a+b' and 'c+d' labels
Thanks in advance
In your case you can simply use mtext on side 1:
mydata <- data.frame(a=c(1:5),b=c(6:10),c=c(11:15),d=c(16:20))
plot <- stripchart(mydata, method="jitter", vertical=T,main='plot',pch=19)
mtext(c('a+b','c+d'),side=1,line=3,at=c(1.5,3.5))
Argument line is to set up the vertical position and at the position on the x-axis.
Edit: To add a distance between the two groups, you can do like this (there may be a cleaner way to do that but that's the only one I can think of from the top of my head):
mydata <- data.frame(a=c(1:5),b=c(6:10),c=c(11:15),d=c(16:20))
plot <- stripchart(mydata, method="jitter", vertical=T, main='plot',pch=19,
at=c(1,2,4,5),xlim=c(0,6))
mtext(c('a+b','c+d'),1,line=3,at=c(1.5,4.5))
Argument at of stripchart is the one to fiddle with, but you then have to modify the plot limits (xlim) and the x-value at which you write the axis label (in mtext).
Related
I am working in R and I have to make many boxplots. This is a visualization of group differences. I want to relabel the x-axis to only have one title instead of five (one for each subplot). My biggest problem is that I also want the y-axis of all the subplots to have different labels.
This is what I tried so far:
par(mfrow=c(1,5))
lapply(NEW8[,c("gawayf", "humf", "sgamesf", "swtoyf", "kissf")],
function(x) boxplot(x ~ NEW8$PAPA_p4_adhd,col=rainbow(2),
names=c("CN","ADHD"),
ylab=c("gawayf", "humf", "sgamesf", "swtoyf", "kissf")))
All the y-labels are added to each subplots so each subplots has 5 lines of y-axis labels (gawayf, humf, sgamef, swtoyf, kissf), and each plot says what data was used to create the boxplot (PAPA_P4_ADHD).
I want each plots to only have the corresponding y-axis label and the x-axis to have 1 label for all five plots.
This is my current output:
Thank you very much
Instead of lapply try mapply - that will allow to pass different argument to each function call:
par(mfrow=c(1,5))
myBox <- function(x, y, ...) boxplot(x ~ y, col=rainbow(2), names=c("CN", "ADHA"), ...)
mapply(myBox,
x = NEW8[,c("gawayf", "humf", "sgamesf", "swtoyf", "kissf")],
y = list(NEW8$PAPA_p4_adhd), # we make this a list so it has length(1)
ylab = c("gawayf", "humf", "sgamesf", "swtoyf", "kissf"),
xlab = "" # empty x-lab
)
For x-lab you will have to do a trick - start a new empty plot that overlays all of the plots, and only add x-axis:
par(fig=c(0,1,0,1), oma=c(0,0,0,0), mar=par("mar"), new=TRUE)
plot.new()
title(xlab="my x-axis")
NOTE: I didn't try to run this code myself, if anything here doesn't work - please leave a comment and will try to address it.
I want to make a strip chart for each of the following two vectors with same x-axis using R.
Two vectors are:
x <- c(40,35,30,45,35,45,65,65,70,70)
y <- c(45,45,45,45,45,45,45,45,45,95)
When I made strip charts for the two, it came out like this:
How do I make it so that two x-axis will be the same?
Thank you
How do I make it so that two x-axis will be the same?
The stripchart() function takes an optional xlim parameter that lets you define the horizontal plot limits. For example:
x <- c(40,35,30,45,35,45,65,65,70,70)
y <- c(45,45,45,45,45,45,45,45,45,95)
par(bty="n", mfrow=c(2,1), mar=c(2,1,3,1)+0.1)
stripchart(x, xlim=c(20,100), method="stack", pch=16)
stripchart(y, xlim=c(20,100), method="stack", pch=21)
I have data that is mostly centered in a small range (1-10) but there is a significant number of points (say, 10%) which are in (10-1000). I would like to plot a histogram for this data that will focus on (1-10) but will also show the (10-1000) data. Something like a log-scale for th histogram.
Yes, i know this means not all bins are of equal size
A simple hist(x) gives
while hist(x,breaks=c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,3,4,5,7.5,10,15,20,50,100,200,500,1000,10000))) gives
none of which is what I want.
update
following the answers here I now produce something that is almost exactly what I want (I went with a continuous plot instead of bar-histogram):
breaks <- c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,4,8)
ggplot(t,aes(x)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10('true size/predicted size', breaks = breaks, labels = breaks)![alt text][3]
the only problem is that I'd like to match between the scale and the actual bars plotted. There two options for doing that : the one is simply use the actual margins of the plotted bars (how?) then get "ugly" x-axis labels like 1.1754,1.2985 etc. The other, which I prefer, is to control the actual bins margins used so they will match the breaks.
Log scale histograms are easier with ggplot than with base graphics. Try something like
library(ggplot2)
dfr <- data.frame(x = rlnorm(100, sdlog = 3))
ggplot(dfr, aes(x)) + geom_histogram() + scale_x_log10()
If you are desperate for base graphics, you need to plot a log-scale histogram without axes, then manually add the axes afterwards.
h <- hist(log10(dfr$x), axes = FALSE)
Axis(side = 2)
Axis(at = h$breaks, labels = 10^h$breaks, side = 1)
For completeness, the lattice solution would be
library(lattice)
histogram(~x, dfr, scales = list(x = list(log = TRUE)))
AN EXPLANATION OF WHY LOG VALUES ARE NEEDED IN THE BASE CASE:
If you plot the data with no log-transformation, then most of the data are clumped into bars at the left.
hist(dfr$x)
The hist function ignores the log argument (because it interferes with the calculation of breaks), so this doesn't work.
hist(dfr$x, log = "y")
Neither does this.
par(xlog = TRUE)
hist(dfr$x)
That means that we need to log transform the data before we draw the plot.
hist(log10(dfr$x))
Unfortunately, this messes up the axes, which brings us to workaround above.
Using ggplot2 seems like the most easy option. If you want more control over your axes and your breaks, you can do something like the following :
EDIT : new code provided
x <- c(rexp(1000,0.5)+0.5,rexp(100,0.5)*100)
breaks<- c(0,0.1,0.2,0.5,1,2,5,10,20,50,100,200,500,1000,10000)
major <- c(0.1,1,10,100,1000,10000)
H <- hist(log10(x),plot=F)
plot(H$mids,H$counts,type="n",
xaxt="n",
xlab="X",ylab="Counts",
main="Histogram of X",
bg="lightgrey"
)
abline(v=log10(breaks),col="lightgrey",lty=2)
abline(v=log10(major),col="lightgrey")
abline(h=pretty(H$counts),col="lightgrey")
plot(H,add=T,freq=T,col="blue")
#Position of ticks
at <- log10(breaks)
#Creation X axis
axis(1,at=at,labels=10^at)
This is as close as I can get to the ggplot2. Putting the background grey is not that straightforward, but doable if you define a rectangle with the size of your plot screen and put the background as grey.
Check all the functions I used, and also ?par. It will allow you to build your own graphs. Hope this helps.
A dynamic graph would also help in this plot. Use the manipulate package from Rstudio to do a dynamic ranged histogram:
library(manipulate)
data_dist <- table(data)
manipulate(barplot(data_dist[x:y]), x = slider(1,length(data_dist)), y = slider(10, length(data_dist)))
Then you will be able to use sliders to see the particular distribution in a dynamically selected range like this:
I had some problems while trying to plot a histogram to show the frequency of every value while plotting the value as well. For example, suppose I use the following code:
x <- sample(1:10,1000,replace=T)
hist(x,label=TRUE)
The result is a plot with labels over the bar, but merging the frequencies of 1 and 2 in a single bar.
Apart from separate this bar in two others for 1 and 2, I also need to put the values under each bar.
For example, with the code above I would have the number 10 under the tick at the right margin of its bar, and I needed to plot the values right under the bars.
Is there any way to do both in a single histogram with hist function?
Thanks in advance!
Calling hist silently returns information you can use to modify the plot. You can pull out the midpoints and the heights and use that information to put the labels where you want them. You can use the pos argument in text to specify where the label should be in relation to the point (thanks #rawr)
x <- sample(1:10,1000,replace=T)
## Histogram
info <- hist(x, breaks = 0:10)
with(info, text(mids, counts, labels=counts, pos=1))
I have data that is mostly centered in a small range (1-10) but there is a significant number of points (say, 10%) which are in (10-1000). I would like to plot a histogram for this data that will focus on (1-10) but will also show the (10-1000) data. Something like a log-scale for th histogram.
Yes, i know this means not all bins are of equal size
A simple hist(x) gives
while hist(x,breaks=c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,3,4,5,7.5,10,15,20,50,100,200,500,1000,10000))) gives
none of which is what I want.
update
following the answers here I now produce something that is almost exactly what I want (I went with a continuous plot instead of bar-histogram):
breaks <- c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,4,8)
ggplot(t,aes(x)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10('true size/predicted size', breaks = breaks, labels = breaks)![alt text][3]
the only problem is that I'd like to match between the scale and the actual bars plotted. There two options for doing that : the one is simply use the actual margins of the plotted bars (how?) then get "ugly" x-axis labels like 1.1754,1.2985 etc. The other, which I prefer, is to control the actual bins margins used so they will match the breaks.
Log scale histograms are easier with ggplot than with base graphics. Try something like
library(ggplot2)
dfr <- data.frame(x = rlnorm(100, sdlog = 3))
ggplot(dfr, aes(x)) + geom_histogram() + scale_x_log10()
If you are desperate for base graphics, you need to plot a log-scale histogram without axes, then manually add the axes afterwards.
h <- hist(log10(dfr$x), axes = FALSE)
Axis(side = 2)
Axis(at = h$breaks, labels = 10^h$breaks, side = 1)
For completeness, the lattice solution would be
library(lattice)
histogram(~x, dfr, scales = list(x = list(log = TRUE)))
AN EXPLANATION OF WHY LOG VALUES ARE NEEDED IN THE BASE CASE:
If you plot the data with no log-transformation, then most of the data are clumped into bars at the left.
hist(dfr$x)
The hist function ignores the log argument (because it interferes with the calculation of breaks), so this doesn't work.
hist(dfr$x, log = "y")
Neither does this.
par(xlog = TRUE)
hist(dfr$x)
That means that we need to log transform the data before we draw the plot.
hist(log10(dfr$x))
Unfortunately, this messes up the axes, which brings us to workaround above.
Using ggplot2 seems like the most easy option. If you want more control over your axes and your breaks, you can do something like the following :
EDIT : new code provided
x <- c(rexp(1000,0.5)+0.5,rexp(100,0.5)*100)
breaks<- c(0,0.1,0.2,0.5,1,2,5,10,20,50,100,200,500,1000,10000)
major <- c(0.1,1,10,100,1000,10000)
H <- hist(log10(x),plot=F)
plot(H$mids,H$counts,type="n",
xaxt="n",
xlab="X",ylab="Counts",
main="Histogram of X",
bg="lightgrey"
)
abline(v=log10(breaks),col="lightgrey",lty=2)
abline(v=log10(major),col="lightgrey")
abline(h=pretty(H$counts),col="lightgrey")
plot(H,add=T,freq=T,col="blue")
#Position of ticks
at <- log10(breaks)
#Creation X axis
axis(1,at=at,labels=10^at)
This is as close as I can get to the ggplot2. Putting the background grey is not that straightforward, but doable if you define a rectangle with the size of your plot screen and put the background as grey.
Check all the functions I used, and also ?par. It will allow you to build your own graphs. Hope this helps.
A dynamic graph would also help in this plot. Use the manipulate package from Rstudio to do a dynamic ranged histogram:
library(manipulate)
data_dist <- table(data)
manipulate(barplot(data_dist[x:y]), x = slider(1,length(data_dist)), y = slider(10, length(data_dist)))
Then you will be able to use sliders to see the particular distribution in a dynamically selected range like this: