Using R, can anyone show me how to draw a simple histogram, with no gaps between the bins, of the following data?
Class Width      Freq. Dist
0 <= x < 5       0.2
5 <= x < 15      0.1
15 <= x < 20     1.2
20 <= x < 30     0.4
30 <= x < 40     0.4
So I want the X axis to go from 0-5,5-15,15-20,20-30 and 30-40 with the bars drawn appropriately.
Thanks in advance !
How about this one?
breaks <- c(0,5,15,20,30,40)
counts <- c(0.2, 0.1, 1.2, 0.4, 0.4)
barplot(counts,
names=sprintf("[%g,%g)",
breaks[-length(breaks)], breaks[-1]
),
space=0
)
This will give you bars of equal width. If you would rather have bars whose widths match the class widths, pass diff(breaks) as the width argument:
barplot(counts, diff(breaks),
names=sprintf("[%g,%g)", breaks[-length(breaks)], breaks[-1]),
space=0
)
Moreover, this will give you an "ordinary" X axis:
barplot(counts, diff(breaks), space=0)
axis(1)
And if you'd like to get axis breaks exactly at points in breaks, type:
axis(1, at=breaks)
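The barplot approach lines up with the breaks here because the first break is 0 and the bar widths accumulate from the origin. As a minimal sketch of an alternative for breaks that do not start at 0, the bars can be drawn directly with rect() in data coordinates. The axis labels and the grey fill below are illustrative choices, not part of the original answer.

```r
breaks <- c(0, 5, 15, 20, 30, 40)
counts <- c(0.2, 0.1, 1.2, 0.4, 0.4)

# left and right edge of every bar, taken straight from the breaks
left  <- head(breaks, -1)
right <- breaks[-1]

# empty plot with the right extent, then one rectangle per class
plot(NA, xlim = range(breaks), ylim = c(0, max(counts)),
     xlab = "x", ylab = "Freq. Dist", xaxt = "n", bty = "n")
rect(xleft = left, ybottom = 0, xright = right, ytop = counts, col = "grey")

# tick marks exactly at the class boundaries
axis(1, at = breaks)
```

Because the rectangles share their edges, there are no gaps between neighbouring bars.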
I would look into the "HistogramTools" package for R.
breaks <- c(0, 5, 15, 20, 30, 40)
counts <- c(0.2, 0.1, 1.2, 0.4, 0.4)
library(HistogramTools)
plot(PreBinnedHistogram(breaks, counts), main = "")
I have plotted a bar graph with values smaller than 1 and combined it with two lines, one with positive and the other with negative larger values. When I plot it, the bar lines are much smaller and difficult to see. I would like to change the scale of the Y axis so that the bars that go from 0 to 0.5 are seen bigger. The objective would be to try to break the y axis in 2, from 0 to 0.5 and the rest. I thought about applying log = "y" to barplot, but the axis goes from negative to positive and cannot be logged. (Error in barplot.default(data$bv, data$year, ylim = c(-3, 3), log = "y") : log scale error: at least one 'height + offset' value <= 0 ). Any ideas about how to solve this?
data <- data.frame(year = c(2000,2001,2002,2003,2004,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017,2018))
data$bv <- as.numeric(c(0.29,-0.15,0.1, 0.3, 0.2, -0.1, 0.25, -0.2, -0.3,0.08,-0.54, -0.24, -0.15, 0.26, 0.12, 0.23,-0.16,0.3))
data$pvp <- as.numeric(rep(c(2.8,2.9,3),times=6))
data$pvn <- as.numeric(rep(c(-2.8,-2.9,-3),times=6))
data$year <- as.numeric(c(2000,2001,2002,2003,2004,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017,2018))
bar <- barplot(data$bv, data$year, ylim=c(-3,3))
par(new = T)
plot(data$pvp,ylim=c(-3,3),axes=F,xlab="",ylab="",type="b",lty=3,lwd=1.5,pch=15,cex=0.8)
points(data$pvn,type="b",lty=3,lwd=1.5,pch=17,cex=0.8)
You could take the log of the absolute value, then multiply that by the sign of the original value:
data$bv <- log(abs(data$bv)) * sign(data$bv)
Which gives you this:
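One thing to keep in mind with log(abs(v)) * sign(v): for |v| < 1 the logarithm is negative, so the bars point in the opposite direction to the original values, and a value of exactly 0 becomes non-finite. A possible alternative, swapped in here and not part of the answer above, is an asinh transform, which is roughly linear near zero and roughly logarithmic for large values while keeping the sign. The scale s = 0.1, the tick positions and the shortened data below are assumptions made for this sketch.

```r
# shortened example data (first nine values of the question's bv)
bv  <- c(0.29, -0.15, 0.1, 0.3, 0.2, -0.1, 0.25, -0.2, -0.3)
pvp <- rep(c(2.8, 2.9, 3), times = 3)
pvn <- -pvp

s  <- 0.1                          # assumed scale: values below s stay roughly linear
tr <- function(v) asinh(v / s)     # sign-preserving, defined at 0

# bars on the transformed scale; barplot() returns the bar midpoints
mids <- barplot(tr(bv), ylim = tr(c(-3.5, 3.5)), axes = FALSE)

# label the axis with the original, untransformed values
ticks <- c(-3, -1, -0.5, -0.1, 0, 0.1, 0.5, 1, 3)
axis(2, at = tr(ticks), labels = ticks, las = 1)

# the two lines, drawn at the bar midpoints on the same scale
lines(mids, tr(pvp), type = "b", lty = 3, pch = 15)
lines(mids, tr(pvn), type = "b", lty = 3, pch = 17)
```

With this transform the bars between -0.5 and 0.5 take up a much larger share of the axis, and no second plot with par(new = TRUE) is needed.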
I would like to make a plot like the one in this image, but I don't know how. I wrote the code below, but it does not produce the plot I want. The goal is to add density lines to my original plot (the mass-SFR relation, "Relacion Masa-SFR"), with one density curve for every 0.3 in x: one line from 7 to 7.3, the next one from 7.3 to 7.6, and so on. With the code below (which continues in the same way until x = 12), I obtain this plot:
plot(SFsl$MEDMASS, SFR_SalpToMPA, xlim = c(7, 12),
ylim = c(-3, 2.5), ylab = "log(SFR(M(sun)/yr)",
xlab = "log(M(star)/(M(sun)")
title("Relacion Masa-SFR")
par(new = TRUE)
# select the points with mass between 7 and 7.3
FCUTsfrsl1 <- (SFsl$MEDMASS >= 7 & SFsl$MEDMASS <= 7.3 &
SFR_SalpToMPA < 2 & SFR_SalpToMPA > -3)
x <- SFR_SalpToMPA[FCUTsfrsl1]
y <- density(x)
# range(-3:2.5) evaluates to c(-3, 2), so the limits are written out explicitly
plot(y$y, y$x, type = 'l', ylim = c(-3, 2.5), col = "red",
ylab = "", xlab = "", axes = FALSE)
I did what you said but I obtained this plot, I don't know if I did something wrong
Since I don't have your data, I had to make some up. If this does what you want, I think you can adapt it to your actual data.
set.seed(7)
x <- runif(1000, 7, 12)
y <- runif(1000, -3, 3)
DF <- data.frame(x = x, y = y)
plot(DF$x, DF$y)
# Cut the x axis into segments of width 0.333 (roughly 0.3), compute the density and plot
br <- seq(7, 12, 0.333)
intx <- cut(x, br) # intervals
intx2 <- as.factor(cut(x, br, labels = FALSE)) # intervals by code
intx3 <- split(x, intx) # x values
inty <- split(y, intx2) # corresponding y values for density calc
for (i in 1:length(intx3)) {
xx <- seq(min(intx3[[i]]), max(intx3[[i]]), length.out = 512)
lines(xx, density(inty[[i]])$y, col = "red")
}
This produces the following image. You need to look closely, but there is a separate density plot for each interval of roughly 0.3 units.
EDIT Change the dimension that is used to compute the density
set.seed(7)
x <- runif(1000, 7, 12)
y <- runif(1000, -3, 3)
DF <- data.frame(x = x, y = y)
plot(DF$x, DF$y, xlim = c(7, 15))
# Cut the x axis into segments of width 0.333 (roughly 0.3), compute the density and plot
br <- seq(7, 12, 0.333)
intx <- cut(x, br) # intervals
intx2 <- as.factor(cut(x, br, labels = FALSE)) # intervals by code
intx3 <- split(x, intx) # x values
inty <- split(y, intx2) # corresponding y values
# This gives the density values in the horizontal direction (desired)
# This is the change, the above is unchanged.
for (i in 1:length(intx3)) {
yy <- seq(min(inty[[i]]), max(inty[[i]]), length.out = 512)
offset <- min(intx3[[i]])
lines(density(intx3[[i]])$y + offset, yy, col = "red")
}
Which gives:
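Note that the loop in the edit evaluates density() on the x values of each bin. If what is wanted is the distribution of y within each x bin, drawn sideways inside that bin, a minimal sketch could look like the following. The simulated data, the exact 0.3 bin width and the 0.9 scaling factor are assumptions for illustration only.

```r
set.seed(7)
x <- runif(1000, 7, 12)
y <- rnorm(1000, 0, 1)

bin_width <- 0.3
br <- 7 + bin_width * (0:17)        # 7, 7.3, ..., 12.1 covers the whole x range

plot(x, y, pch = ".", xlim = range(br))

# integer code of the bin each point falls into
bin <- cut(x, br, include.lowest = TRUE, labels = FALSE)

for (i in sort(unique(bin))) {
  d <- density(y[bin == i])         # density of y inside this x bin
  # rescale so each curve fits inside its own bin; heights are therefore
  # not comparable between bins
  w <- 0.9 * bin_width * d$y / max(d$y)
  lines(br[i] + w, d$x, col = "red")  # density runs horizontally, y vertically
}
```

Each red curve starts at the left edge of its bin and bulges to the right where the y values are most concentrated.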
I know that I can add a horizontal line to a boxplot using a command like
abline(h=3)
When there are multiple boxplots in a single panel, can I add different horizontal lines for each single boxplot?
In the above plot, I would like to add lines 'y=1.2' for 1, 'y=1.5' for 2, and 'y=2.1' for 3.
I am not sure that I understand exactly what you want, but it might be this: add a line for each boxplot that covers the same x-axis range as the boxplot.
The width of the boxes is controlled by pars$boxwex which is set to 0.8 by default. This can be seen from the argument list of boxplot.default:
formals(boxplot.default)$pars
## list(boxwex = 0.8, staplewex = 0.5, outwex = 0.5)
So, the following produces a line segment for each boxplot:
# create sample data and box plot
set.seed(123)
datatest <- data.frame(a = rnorm(100, mean = 10, sd = 4),
b = rnorm(100, mean = 15, sd = 6),
c = rnorm(100, mean = 8, sd = 5))
boxplot(datatest)
# create data for segments
n <- ncol(datatest)
# width of each boxplot is 0.8
x0s <- 1:n - 0.4
x1s <- 1:n + 0.4
# these are the y-coordinates for the horizontal lines
# that you need to set to the desired values.
y0s <- c(11.3, 16.5, 10.7)
# add segments
segments(x0 = x0s, x1 = x1s, y0 = y0s, col = "red")
This gives the following plot:
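The y-coordinates do not have to be typed in by hand. As a small sketch that reuses the same half-width of 0.4 from the answer above, the per-group values can be computed from the data, here the column means (chosen purely as an example).

```r
set.seed(123)
datatest <- data.frame(a = rnorm(100, mean = 10, sd = 4),
                       b = rnorm(100, mean = 15, sd = 6),
                       c = rnorm(100, mean = 8, sd = 5))
boxplot(datatest)

centers    <- seq_len(ncol(datatest))  # boxes are drawn at x = 1, 2, 3
half_width <- 0.4                      # same value as in the answer above
means      <- colMeans(datatest)       # one reference value per box

# one dashed horizontal segment per box, at that group's mean
segments(x0 = centers - half_width, y0 = means,
         x1 = centers + half_width, y1 = means,
         col = "blue", lty = 2)
```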
I have two vectors:
x = c(0, 20, 10000, 50, 30000)
y = c(0, 3, 800, 1000, 7000)
I would like to do a scatterplot of my data in R. This is not complicated with the plot function. It would look best on a log scale, but values equal to 0 are not shown on the graph. I know log(0) is nonexistent. But I was hoping there was a way to show them on the scatterplot? (for example a point on the y-axis or the x-axis). Does anyone know how to do that?
In order to plot the data points, add a very small increment to all values:
plot(x + 0.1, y + 0.1, log = 'xy')
However, this hides which values are 0. You can make them visible by using a different symbol for the zero values:
plot(x + 0.1, y + 0.1, log = 'xy', pch = ifelse(x == 0 | y == 0, 17, 16))
Alternatively, you could also choose a different colour.
In order to plot the actual log values, don't use the log='xy' argument but rather apply the log to the numbers directly:
plot(log(x + 0.1), log(y + 0.1), pch = ifelse(x == 0 | y == 0, 17, 16))
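A variation on the same idea, sketched under the assumption that it is acceptable to draw zeros at an arbitrary small position: shift only the zero values, leave the others untouched, and label that position "0" on both axes. The placeholder eps = 0.1 and the tick positions are illustrative choices.

```r
x <- c(0, 20, 10000, 50, 30000)
y <- c(0, 3, 800, 1000, 7000)

eps <- 0.1                          # assumed stand-in position for zero
is_zero <- x == 0 | y == 0

# replace only the zeros, so non-zero points keep their exact values
x_plot <- ifelse(x == 0, eps, x)
y_plot <- ifelse(y == 0, eps, y)

plot(x_plot, y_plot, log = "xy", xlab = "x", ylab = "y",
     pch = ifelse(is_zero, 17, 16), xaxt = "n", yaxt = "n")

# custom axes: the eps position is labelled "0"
ticks <- 10^(0:4)
axis(1, at = c(eps, ticks), labels = c("0", ticks))
axis(2, at = c(eps, ticks), labels = c("0", ticks))
```

The "0" tick is only a label: the distance between it and the tick at 1 has no meaning on a log scale.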
For each X-value I calculated the average Y-value and the standard deviation (sd) of each Y-value
x = 1:5
y = c(1.1, 1.5, 2.9, 3.8, 5.2)
sd = c(0.1, 0.3, 0.2, 0.2, 0.4)
plot (x, y)
How can I use the standard deviation to add error bars to each datapoint of my plot?
A solution with ggplot2:
library(ggplot2)
qplot(x,y)+geom_errorbar(aes(x=x, ymin=y-sd, ymax=y+sd), width=0.25)
In addition to @csgillespie's answer, segments is also vectorised to help with this sort of thing:
plot (x, y, ylim=c(0,6))
segments(x,y-sd,x,y+sd)
epsilon <- 0.02
segments(x-epsilon,y-sd,x+epsilon,y-sd)
segments(x-epsilon,y+sd,x+epsilon,y+sd)
You can use arrows:
arrows(x,y-sd,x,y+sd, code=3, length=0.02, angle = 90)
You can use segments to add the bars in base graphics. Here epsilon controls the half-width of the short horizontal caps at the top and bottom of each bar.
plot (x, y, ylim=c(0, 6))
epsilon = 0.02
for(i in 1:5) {
up = y[i] + sd[i]
low = y[i] - sd[i]
segments(x[i],low , x[i], up)
segments(x[i]-epsilon, up , x[i]+epsilon, up)
segments(x[i]-epsilon, low , x[i]+epsilon, low)
}
As @thelatemail points out, I should really have used vectorised function calls:
segments(x, y-sd,x, y+sd)
epsilon = 0.02
segments(x-epsilon,y-sd,x+epsilon,y-sd)
segments(x-epsilon,y+sd,x+epsilon,y+sd)
A problem with csgillespie's solution appears when you have a logarithmic X axis. The small caps will then have different lengths on the left and right sides, because epsilon is added to and subtracted from the x-values in data units.
It is better to use the errbar function from the Hmisc package:
d = data.frame(
x = c(1:5)
, y = c(1.1, 1.5, 2.9, 3.8, 5.2)
, sd = c(0.2, 0.3, 0.2, 0.0, 0.4)
)
##install.packages("Hmisc", dependencies=T)
library("Hmisc")
# add error bars (without adjusting yrange)
plot(d$x, d$y, type="n")
with (
data = d
, expr = errbar(x, y, y+sd, y-sd, add=T, pch=1, cap=.1)
)
# new plot (adjusts Yrange automatically)
with (
data = d
, expr = errbar(x, y, y+sd, y-sd, add=F, pch=1, cap=.015, log="x")
)
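If adding a package is not an option, the arrows() approach from the earlier answer also copes with a logarithmic X axis, because its length argument is measured in inches on the device and not in data units. A minimal base-graphics sketch, using the data from the question and a cap length of 0.05 inches chosen for illustration:

```r
x  <- 1:5
y  <- c(1.1, 1.5, 2.9, 3.8, 5.2)
sd <- c(0.1, 0.3, 0.2, 0.2, 0.4)

lower <- y - sd
upper <- y + sd

# logarithmic x axis; the y range is set so that all bars fit
plot(x, y, log = "x", ylim = range(c(lower, upper)))

# code = 3 draws a head at both ends, angle = 90 turns the heads into flat caps
arrows(x, lower, x, upper, code = 3, angle = 90, length = 0.05)
```

Note that arrows() warns about, and skips, arrows of zero length, so a point with sd = 0 gets no cap.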