I know that I can add a horizontal line to a boxplot using a command like
abline(h=3)
When there are multiple boxplots in a single panel, can I add different horizontal lines for each single boxplot?
In the above plot, I would like to add lines 'y=1.2' for 1, 'y=1.5' for 2, and 'y=2.1' for 3.
I am not sure I understand exactly what you want, but it might be this: add a line for each boxplot that covers the same x-axis range as the box.
The width of the boxes is controlled by pars$boxwex which is set to 0.8 by default. This can be seen from the argument list of boxplot.default:
formals(boxplot.default)$pars
## list(boxwex = 0.8, staplewex = 0.5, outwex = 0.5)
So, the following produces a line segment for each boxplot:
# create sample data and box plot
set.seed(123)
datatest <- data.frame(a = rnorm(100, mean = 10, sd = 4),
b = rnorm(100, mean = 15, sd = 6),
c = rnorm(100, mean = 8, sd = 5))
boxplot(datatest)
# create data for segments
n <- ncol(datatest)
# width of each boxplot is 0.8
x0s <- 1:n - 0.4
x1s <- 1:n + 0.4
# these are the y-coordinates for the horizontal lines
# that you need to set to the desired values.
y0s <- c(11.3, 16.5, 10.7)
# add segments
segments(x0 = x0s, x1 = x1s, y0 = y0s, col = "red")
This gives the following plot:
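The same idea can be wrapped in a small helper, shown here as a minimal sketch. The function name add_box_lines and the sample data are hypothetical, and the sketch assumes more than one box drawn at the default positions 1, 2, 3, ... with the default boxwex of 0.8. It uses the heights from the question (1.2, 1.5 and 2.1).

```r
# Hypothetical helper: draw one horizontal segment per box.
# 'y' holds the desired heights, in the order of the boxes;
# 'boxwex' should match the value used by boxplot() (0.8 by default).
add_box_lines <- function(y, boxwex = 0.8, ...) {
  at <- seq_along(y)                 # boxes sit at x = 1, 2, 3, ...
  x0 <- at - boxwex / 2              # left edge of each box
  x1 <- at + boxwex / 2              # right edge of each box
  segments(x0 = x0, y0 = y, x1 = x1, y1 = y, ...)
  invisible(data.frame(x0 = x0, x1 = x1, y = y))
}

# Made-up data centred near the requested line heights
set.seed(123)
d <- data.frame(a = rnorm(100, 1.2, 0.3),
                b = rnorm(100, 1.5, 0.3),
                c = rnorm(100, 2.1, 0.3))
boxplot(d)
coords <- add_box_lines(c(1.2, 1.5, 2.1), col = "red", lwd = 2)
```

Returning the coordinates invisibly makes it easy to check where the segments were drawn without cluttering the console.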
While plotting a boxplot in R, I noticed not all values in the y-axis are presented. Possible values are -5 to 5, but actual values are -1.3 to 4.6, so the values presented on the y-axis are -2 to 5. I want it to be presented with all values: -5 to 5, even though there's no data for this entire range.
My code looks like this:
boxplot(depvar ~ indepvar, data = a, pars = list(outlwd = 2, outcex = 1.8), axes = FALSE)
axis(side = 2, at = seq(-5, 5, by = 1), las = 1, tck = 7)
What should be added/changed for the y-axis to be fully-presented?
This appears similar to this question: How to set the y range in boxplot graph?
I think you are looking for ylim.
a <- c((randu$x*3)-2)
boxplot(x = a,
ylim = c(-5,5))
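To also get a tick at every integer, as in the question's axis() call, ylim can be combined with a custom axis. This is a sketch with made-up data; the data frame a and its columns depvar and indepvar only mimic the names used in the question.

```r
set.seed(1)
# Hypothetical data whose values stay well inside the -5..5 range
a <- data.frame(depvar   = runif(60, -1.3, 4.6),
                indepvar = rep(c("g1", "g2", "g3"), each = 20))

# ylim fixes the plotted range; yaxt = "n" suppresses the default y-axis
boxplot(depvar ~ indepvar, data = a, ylim = c(-5, 5), yaxt = "n")

# draw a tick at every integer from -5 to 5
axis(side = 2, at = seq(-5, 5, by = 1), las = 1)

# actual y-limits of the plot region, for checking
yr <- par("usr")[3:4]
```

Using yaxt = "n" instead of axes = FALSE keeps the x-axis and the box around the plot, so only the y-axis has to be redrawn.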
Load the package:
install.packages("dplyr")
library(dplyr)
Create a small data set with two columns (data_frame() is deprecated, so tibble() is used here):
df <- dplyr::tibble(
x = 1:5,
y = 1:5)
Visualize in a boxplot with expanded axis:
boxplot(x~y,
df,
xlim =c(-5,5),
ylim =c(-5,5))
For the following boxplot:
How can I modify the y-axis to remove the range 0.1 to 0.8? I want to do this so that each boxplot is clearly visible, since all of them lie in the range 0.81 to 1.
For this boxplot, I wrote the following R script:
dataset <- read.csv("/boxplot.csv")
x <- boxplot(dataset)
This is generally discouraged and considered bad practice, as it leads to misleading visualizations. But you can do this using gap.boxplot() from the package plotrix.
Here is an arbitrary example:
library(plotrix)
test_data <- c(rnorm(100, 1000, 20), 10, 20)
par(mfrow = c(1, 2))
boxplot(test_data, main = "boxplot")
gap.boxplot(test_data, gap = list(bottom = c(50, 900), top = c(NA, NA)),
main = "gap.boxplot")
I want to plot two continuous variables using ggplot.
Assume I have a data frame where one column is a ratio between 0 and 1, and the other one is an amount. I want to be able to break the ratio into bins on the x axis using something like
breaks=seq(0, 5, by = .1)
and on the y axis I want the sum of the amount for each bin. It would look like a histogram, but the y axis should show the sum of all amounts whose ratio falls within the bin. If I were making a histogram, it would look like this:
ggplot(data=data, aes(ratio)) +
geom_histogram(breaks=seq(0, 1, by = .1), aes(fill=..count..))
Try this example script. Here x represents the variable you want to break into bins, and y represents the variable you want to sum within each bin. At the end, the column named "SUM" holds the sums and the column named "facet" holds the bins, which you can then plot.
library(dplyr)
dataframe1<-data.frame(x=seq(0,1, length.out = 100), y = seq(0,1000, length.out = 100))
x<-mutate(dataframe1,facet = factor(rep(c("0-0.25", "0.25 - 0.50", "0.50 - 0.75", "0.75 - 1.0"), each = length(dataframe1$x)/4)))
x[,"SUM"]<-NA
x$SUM
list1<-as.list(matrix(unique(x$facet),nrow = 4, ncol = 1))
list1[[1]]
i<-1:4
facetfill<-function(i){
sum1<-sum(x$y[x$facet==list1[[i]]])
x$SUM[x$facet==list1[[i]]]<-sum1
x$SUM
}
for (j in 1:4) {
x$SUM<-facetfill(j)
x$SUM
}
x$SUM
x
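A shorter route to the same kind of result, sketched here under assumptions, is to bin the ratio with cut() and sum per bin with aggregate(). The data frame dat and its columns ratio and amount are hypothetical stand-ins for the asker's data, and the bin width of 0.1 follows the breaks used in the question's histogram.

```r
# Hypothetical data: 'ratio' in [0, 1] and an 'amount' to be summed per bin
dat <- data.frame(ratio  = seq(0, 1, length.out = 100),
                  amount = seq(0, 1000, length.out = 100))

# Assign each ratio to a bin of width 0.1 (include.lowest keeps ratio == 0)
dat$bin <- cut(dat$ratio, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

# Sum the amounts within each bin
sums <- aggregate(amount ~ bin, data = dat, FUN = sum)

# Bar heights are the per-bin sums rather than counts
barplot(sums$amount, names.arg = sums$bin, las = 2, ylab = "Sum of amount")
```

Note that aggregate() only returns bins that contain at least one row, so empty bins would be missing from the bar plot.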
If all my data is a long row of comma separated values (duration times), then R can give me a lot of info by
summary(dat)
Question
Can R also make a plot of the data with confidence intervals?
Following the first Google result, you can try something like this:
n<-50
x<-sample(40:70,n,replace=TRUE)
y<-.7*x+rnorm(n,sd=5)
plot(x,y,xlim=c(20,90),ylim=c(0,80))
mylm<-lm(y~x)
abline(mylm,col="red")
newx<-seq(20,90)
prd<-predict(mylm,newdata=data.frame(x=newx),interval = c("confidence"),
level = 0.90,type="response")
lines(newx,prd[,2],col="red",lty=2)
lines(newx,prd[,3],col="red",lty=2)
Of course, this is only one of the possibilities.
Another example would be:
x <- rnorm(15)
y <- x + rnorm(15)
new <- data.frame(x = seq(-3, 3, 0.5))
pred.w.clim <- predict(lm(y ~ x), new, interval="confidence")
# Just create a blank plot region with axes first. We'll add to this
plot(range(new$x), range(pred.w.clim), type = "n", ann = FALSE)
# For convenience
CI.U <- pred.w.clim[, "upr"]
CI.L <- pred.w.clim[, "lwr"]
# Create a 'loop' around the x values. Add values to 'close' the loop
X.Vec <- c(new$x, tail(new$x, 1), rev(new$x), new$x[1])
# Same for y values
Y.Vec <- c(CI.L, tail(CI.U, 1), rev(CI.U), CI.L[1])
# Use polygon() to create the enclosed shading area
# We are 'tracing' around the perimeter as created above
polygon(X.Vec, Y.Vec, col = "grey", border = NA)
# Use matlines() to plot the fitted line and CI's
# Add after the polygon above so the lines are visible
matlines(new$x, pred.w.clim, lty = c(1, 2, 2), type = "l", col =
c("black", "red", "red"))
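Both examples above draw confidence bands around a regression line. If the data really is a single vector of durations, as in the question, one possible reading is a confidence interval for the mean. The following is a sketch under that assumption; the vector dat is simulated, and the t-based interval from t.test() is only one of several reasonable choices, particularly for skewed durations.

```r
set.seed(1)
# Hypothetical durations; replace with the real comma-separated values
dat <- rexp(50, rate = 1 / 30)

# 95% t-based confidence interval for the mean
ci <- t.test(dat, conf.level = 0.95)$conf.int

# Horizontal boxplot of the raw data; the single box sits at y = 1
boxplot(dat, horizontal = TRUE, xlab = "Duration")

# Draw the interval and the mean just above the box
segments(x0 = ci[1], y0 = 1.3, x1 = ci[2], y1 = 1.3, col = "red", lwd = 2)
points(mean(dat), 1.3, pch = 19, col = "red")
```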
I'd like to do a vertical histogram. Ideally I should be able to put multiple on a single plot per day.
If this could be combined with quantmod experimental chart_Series or some other library capable of drawing bars for a time series that would be great. Please see the attached screenshot. Ideally I could plot something like this.
Is there anything built in or existing libraries that can help with this?
I wrote something a year or so ago to do vertical histograms in base graphics. Here it is, with a usage example.
VerticalHist <- function(x, xscale = NULL, xwidth, hist,
fillCol = "gray80", lineCol = "gray40") {
## x (required) is the x position to draw the histogram
## xscale (optional) is the "height" of the tallest bar (horizontally),
## it has sensible default behavior
## xwidth (required) is the horizontal spacing between histograms
## hist (required) is an object of type "histogram"
## (or a list / df with $breaks and $density)
## fillCol and lineCol... exactly what you think.
binWidth <- hist$breaks[2] - hist$breaks[1]
if (is.null(xscale)) xscale <- xwidth * 0.90 / max(hist$density)
n <- length(hist$density)
x.l <- rep(x, n)
x.r <- x.l + hist$density * xscale
y.b <- hist$breaks[1:n]
y.t <- hist$breaks[2:(n + 1)]
rect(xleft = x.l, ybottom = y.b, xright = x.r, ytop = y.t,
col = fillCol, border = lineCol)
}
## Usage example
require(plyr) ## Just needed for the round_any() in this example
n <- 1000
numberOfHists <- 4
data <- data.frame(ReleaseDOY = rnorm(n, 110, 20),
bin = as.factor(rep(c(1, 2, 3, 4), n / 4)))
binWidth <- 1
binStarts <- c(1, 2, 3, 4)
binMids <- binStarts + binWidth / 2
axisCol <- "gray80"
## Data handling
DOYrange <- range(data$ReleaseDOY)
DOYrange <- c(round_any(DOYrange[1], 15, floor),
round_any(DOYrange[2], 15, ceiling))
## Get the histogram objects
histList <- with(data, tapply(ReleaseDOY, bin, hist, plot = FALSE,
breaks = seq(DOYrange[1], DOYrange[2], by = 5)))
DOYmean <- with(data, tapply(ReleaseDOY, bin, mean))
## Plotting
par(mar = c(5, 5, 1, 1) + .1)
plot(c(0, 5), DOYrange, type = "n",
ann = FALSE, axes = FALSE, xaxs = "i", yaxs = "i")
axis(1, cex.axis = 1.2, col = axisCol)
mtext(side = 1, outer = F, line = 3, "Length at tagging (mm)",
cex = 1.2)
axis(2, cex.axis = 1.2, las = 1, line = -.7, col = "white",
at = c(75, 107, 138, 169),
labels = c("March", "April", "May", "June"), tck = 0)
mtext(side = 2, outer = F, line = 3.5, "Date tagged", cex = 1.2)
box(bty = "L", col = axisCol)
## Gridlines
abline(h = c(60, 92, 123, 154, 184), col = "gray80")
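## Use $density by name: the position of "density" in a histogram object differs between R versions
biggestDensity <- max(unlist(lapply(histList, function(h){max(h$density)})))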
xscale <- binWidth * .9 / biggestDensity
## Plot the histograms
for (lengthBin in 1:numberOfHists) {
VerticalHist(binStarts[lengthBin], xscale = xscale,
xwidth = binWidth, histList[[lengthBin]])
}
Violin plots might be close enough to what you want. They are density plots that have been mirrored through one axis, like a hybrid of a boxplot and a density plot. (Much easier to understand by example than by description. :-) )
Here is a simple (somewhat ugly) example of the ggplot2 implementation of them:
library(ggplot2)
library(lubridate)
data(economics) #sample dataset
# calculate year to group by using lubridate's year function
economics$year<-year(economics$date)
# get a subset
subset<-economics[economics$year>2003&economics$year<2007,]
ggplot(subset,aes(x=date,y=unemploy))+
geom_line()+geom_violin(aes(group=year),alpha=0.5)
A prettier example would be:
ggplot(subset,aes(x=date,y=unemploy))+
geom_violin(aes(group=year,colour=year,fill=year),alpha=0.5,
kernel="rectangular")+ # passed on to the density estimate; a rectangular kernel gives a blocky violin
geom_line(size=1.5)+ # make the line (wider than normal)
xlab("Year")+ # label one axis
ylab("Unemployment")+ # label the other
theme_bw()+ # make white background on plot
theme(legend.position = "none") # suppress legend
To include ranges instead of or in addition to the line, you would use geom_linerange or geom_pointrange.
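As a minimal sketch of the geom_pointrange variant, the yearly minimum, mean and maximum can be computed first and then plotted as one range per year. The summary table ranges and its columns lo, mid and hi are hypothetical names, and min/mean/max is only one possible definition of a range.

```r
library(ggplot2)

# Same subset as above: the years 2004 to 2006 of the economics data
econ <- economics
econ$year <- as.integer(format(econ$date, "%Y"))
econ <- econ[econ$year > 2003 & econ$year < 2007, ]

# One row per year holding the minimum, mean and maximum of unemploy
ranges <- do.call(rbind, lapply(split(econ$unemploy, econ$year), function(u) {
  data.frame(lo = min(u), mid = mean(u), hi = max(u))
}))
ranges$year <- as.integer(rownames(ranges))

# A point at the mean with a vertical line from minimum to maximum
p <- ggplot(ranges, aes(x = year, y = mid, ymin = lo, ymax = hi)) +
  geom_pointrange() +
  xlab("Year") +
  ylab("Unemployment")
print(p)
```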
If you use grid graphics, you can create rotated viewports wherever you want them and plot into the rotated viewport. You just need a function that plots into a specified viewport using grid graphics; I would suggest ggplot2 or possibly lattice for this.
In base graphics you could write your own function to plot the rotated histogram (modify the plot.histogram function or just write your own from scratch using rect or other tools). Then you can use the subplot function from the TeachingDemos package to place the plot wherever you want on a larger plot.
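For a single rotated histogram in base graphics, one simple option that avoids modifying plot.histogram is to compute the bins with hist(plot = FALSE) and hand the counts to barplot(horiz = TRUE). This is a sketch with simulated data; it does not cover placing several histograms on one plot.

```r
set.seed(42)
x <- rnorm(500, mean = 110, sd = 20)   # hypothetical sample

# Compute the bins without drawing anything
h <- hist(x, plot = FALSE)

# space = 0 removes the gaps so the bars touch, as in a histogram
barplot(h$counts, horiz = TRUE, space = 0, xlab = "Count")

# Label the bin edges: with space = 0 and width 1, bar i spans i - 1 to i
axis(side = 2, at = seq_along(h$breaks) - 1, labels = h$breaks, las = 1)
```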