Deviation chart in base graphics - r

I need to draw a deviation chart (a lollipop chart with lines running from the mean to the values above and below it). From the question and answer "Drawing line segments in R", it is clear that I need to plot the segments and then add the points. However, my x-axis is a factor and that solution fails.
This works:
df <- data.frame(ID = c(1, 2, 3),
score = c(30, 42, 48))
mid <- mean(df$score)
plot(range(df$ID), range(df$score),type="n")
segments(df$ID, df$score, df$ID, mid)
But changing my identifier variable into a factor breaks it.
df$ID2 <- factor(df$ID)
plot(range(df$ID2), range(df$score),type="n")
segments(df$ID2, df$score, df$ID2, mid)
How can I set up the plot area and x-axis values to deal with a factor?
Note that I need a base graphics solution to fit with the other charts in a dashboard style report.

You can convert the factor to a numeric variable, suppress the x-axis and then add the correct labels to the plot (range() is not defined for factors, which is why the original plot() call fails):
df$ID2 <- factor(letters[df$ID]) # Use letters to show that this is working
plot(range(as.numeric(df$ID2)), range(df$score), type = "n", xaxt = "n")
segments(as.numeric(df$ID2), df$score, as.numeric(df$ID2), mid)
axis(1, at = seq_along(levels(df$ID2)), labels = levels(df$ID2))
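For completeness, here is a minimal sketch of the whole deviation chart built on the same idea, with a reference line at the mean and the points drawn on top of the segments. The colours, the widened xlim and the helper name x_pos are illustrative assumptions, not part of the original answer.

```r
df <- data.frame(ID = factor(c("a", "b", "c")),
                 score = c(30, 42, 48))
mid <- mean(df$score)

# integer codes of the factor levels give the x positions (1, 2, 3)
x_pos <- as.integer(df$ID)

# empty plot with a little padding on each side, x-axis suppressed
plot(x_pos, df$score, type = "n", xaxt = "n",
     xlim = c(0.5, nlevels(df$ID) + 0.5), xlab = "ID", ylab = "score")
abline(h = mid, lty = 2)                       # reference line at the mean
segments(x_pos, mid, x_pos, df$score)          # stems from the mean to each value
points(x_pos, df$score, pch = 19,
       col = ifelse(df$score >= mid, "steelblue", "firebrick"))
axis(1, at = seq_len(nlevels(df$ID)), labels = levels(df$ID))
```

Drawing the points last keeps them on top of the stems.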

Related

how to add labels above the bar of "barplot" graphics?

I asked a question before, and now I would like to know how to put labels above the bars.
Previous post: how to create a frequency histogram with predefined non-uniform intervals?
dataframe <- c (1,1.2,40,1000,36.66,400.55,100,99,2,1500,333.45,25,125.66,141,5,87,123.2,61,93,85,40,205,208.9)
Update
Following the guidance of the colleague I am updating the question.
I have a dataset and I would like to count how often its values fall within pre-defined ranges, for example: 0-50, 50-150, 150-500, 500-2000.
In the previous post (how to create a frequency histogram with predefined non-uniform intervals?) I managed to do this, but I don't know how to add labels above the bars. I tried:
barplot(data, labels = labels), but it didn't work.
I used barplot because the previous post recommended it, but a ggplot solution would be fine too.
Based on the answer to your first question, here is one way to add a text() element to your Base R plot, that serves as a label for each one of your bars (assuming you want to double-up the information that is already on the x axis).
data <- c(1,1.2,40,1000,36.66,400.55,100,99,2,1500,333.45,25,125.66,141,5,87,123.2,61,93,85,40,205,208.9)
# Cut your data into categories using your breaks
data <- cut(data,
breaks = c(0, 50, 150, 500, 2000),
labels = c('0-50', '50-150', '150-500', '500-2000'))
# Make a data table (i.e. a frequency count)
data <- table(data)
# Plot with `barplot`, making enough space for the labels
p <- barplot(data, ylim = c(0, max(data) + 1))
# Add the labels with some offset to be above the bar
text(x = p, y = data + 0.5, labels = names(data))
If it is the y values that you are after, you can change what you pass to the labels argument:
p <- barplot(data, ylim = c(0, max(data) + 1))
text(x = p, y = data + 0.5, labels = data)
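As a variation on the same approach, text() also accepts pos = 3, which places each label just above its anchor point, so no manual y offset is needed. The sketch below assumes the same data and breaks; the names values, counts and bar_mid and the 15% headroom in ylim are illustrative choices.

```r
values <- c(1, 1.2, 40, 1000, 36.66, 400.55, 100, 99, 2, 1500, 333.45, 25,
            125.66, 141, 5, 87, 123.2, 61, 93, 85, 40, 205, 208.9)

# frequency count per pre-defined interval
counts <- table(cut(values,
                    breaks = c(0, 50, 150, 500, 2000),
                    labels = c("0-50", "50-150", "150-500", "500-2000")))

# barplot() returns the x positions of the bar midpoints
bar_mid <- barplot(counts, ylim = c(0, max(counts) * 1.15))

# pos = 3 draws each label just above the top of its bar
text(x = bar_mid, y = as.vector(counts), labels = as.vector(counts), pos = 3)
```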
Created on 2020-12-11 by the reprex package (v0.3.0)

R horizontal barplot with axis labels split between two axis

I have a horizontal barplot, with zero in the middle of the x-axis and would like the name for each bar to appear on the same side as the bar itself. The code I am using is:
abun<-data$av.slope
species<-data$Species
cols <- c("blue", "red")[(abun > 0)+1]
barplot(abun, main="Predicted change in abundance", horiz=TRUE,
xlim=c(-0.04,0.08), col=cols, names.arg=species, las=1, cex.names=0.6)
I have tried creating two separate axes; the names then appear on the desired side for each bar, but are not level with the appropriate bar. I will try to upload an image of the barplot. I am still very new to R, so apologies if I am missing something basic!
barplot1- names in correct position but all on one axis
barplot2- names on both sides of plot but not in line with appropriate bar
We can accomplish this using mtext:
generate data
Since you didn't include your data in the question I generated my own dummy data set. If you post a dput of your data, we could adapt this solution to your data.
set.seed(123)
df1 <- data.frame(x = rnorm(20),
y = LETTERS[1:20])
df1$colour <- ifelse(df1$x < 0, 'blue', 'red')
make plot
bp <- barplot(df1$x, col = df1$colour, horiz = T)
mtext(side = ifelse(df1$x < 0, 2, 4),
text = df1$y,
las = 1,
at = bp,
line = 1)
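If the names should sit right at the end of each bar rather than in the margins, one possible alternative is text() with a per-bar pos value. This is only a sketch on the same dummy data; the widened xlim, the cex value and the helper names neg and bar_mid are assumptions made for illustration.

```r
set.seed(123)
df1 <- data.frame(x = rnorm(20),
                  y = LETTERS[1:20])
neg <- df1$x < 0

# widen the x range a little so the labels fit beyond the bar ends
bar_mid <- barplot(df1$x, horiz = TRUE,
                   col = ifelse(neg, "blue", "red"),
                   xlim = range(df1$x) * 1.2)

# pos = 2 puts a label left of the bar end, pos = 4 puts it to the right
text(x = df1$x, y = bar_mid, labels = df1$y,
     pos = ifelse(neg, 2, 4), cex = 0.8, xpd = NA)
```

Because barplot() returns the bar midpoints, the labels stay level with their bars.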

plot time series with discontinuous axis in R

I want to plot a time series, excluding in the plot a stretch of time in the middle. If I plot the series alone, with only an index on the x-axis, that is what I get. The interior set of excluded points do not appear.
x <- rnorm(50)
Dates <- seq(as.Date("2008-1-1"), by = "day", length.out = length(x))
dummy <- c(rep(1, 25), rep(0, 10), rep(1, length(x) - 35))
plot(x[dummy == 1])
Once the dates are on the x-axis, however, R dutifully presents an accurate true time scale, including the excluded dates. This produces a blank region on the plot.
plot(Dates[dummy == 1], x[dummy == 1])
How can I get dates on the x-axis, but not show the blank region of the excluded dates?
Three alternatives:
1. ggplot2: ggplot2 does not appear to support a discontinuous axis, but you could use facet_wrap to get a similar effect.
library(ggplot2)
# get the data
x <- rnorm(50)
df <- data.frame( x = x,
Dates = seq(as.Date("2008-1-1"), by = "day", length.out = length(x)) ,
dummy = c(rep(1, 25), rep(0, 10), rep(1, length(x) - 35)))
df$f <- ifelse(df$Dates <= "2008-01-25", c("A"), c("B"))
# plot
ggplot( subset(df, dummy==1)) +
geom_point(aes(x= Dates, y=x)) +
facet_wrap(~f , scales = "free_x")
2. base R
# subset first so that both the points and the colours use only the kept rows
sub <- subset(df, dummy == 1)
plot(x ~ Dates, data = sub, col = ifelse(sub$f == "A", "blue", "red"))
3. plotrix: Another alternative would be to use gap.plot{plotrix}. The code would be something like the one below. However, I couldn't figure out how to make a break in an axis with date values. Perhaps this would be an additional question.
library(plotrix)
gap.plot(Dates[dummy == 1], x[dummy == 1], gap=c(24,35), gap.axis="x")
I think I figured it out. Along the lines I proposed above, I had to fiddle with the axis command for a long time to get it to put the date labels in the right place. Here's what I used:
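Dates1 <- Dates[dummy == 1] # dates of the plotted points (Dates1 is not defined in the post; assumed here)
plot(x[dummy == 1], xaxt = "n", xlab = "") # plot with no x-axis title or tick labels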
Dates1_index <- seq(1,length(Dates1), by = 5) # set the tick positions
axis(1, at = Dates1_index, labels = format(Dates1[Dates1_index], "%b %d"), las = 2)
Having succeeded, I now agree with @alistaire that it looks pretty misleading. Maybe if I put a vertical dashed line at the break...
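The vertical dashed line at the break could be sketched roughly as follows. This is a self-contained version of the index-based plot; the names keep, kept_dates, ticks and gap_after are hypothetical, and the break is located by looking for a jump of more than one day between consecutive kept dates.

```r
set.seed(1)
x <- rnorm(50)
Dates <- seq(as.Date("2008-01-01"), by = "day", length.out = length(x))
keep <- c(rep(TRUE, 25), rep(FALSE, 10), rep(TRUE, length(x) - 35))
kept_dates <- Dates[keep]

# plot against the index so the excluded stretch takes up no space
plot(x[keep], xaxt = "n", xlab = "", ylab = "x")
ticks <- seq(1, length(kept_dates), by = 5)
axis(1, at = ticks, labels = format(kept_dates[ticks], "%b %d"), las = 2)

# index of the last point before each jump of more than one day
gap_after <- which(as.numeric(diff(kept_dates)) > 1)
abline(v = gap_after + 0.5, lty = 2)   # dashed line marks the omitted dates
```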

Adding color to circular data based on group membership

I'm trying to add color to specific points in my circular data based on group membership (I have two groups: one with individuals with a certain medical condition and another group of just healthy controls). I've converted their data from degrees to radians and put it on the plot, but I haven't managed to selectively change the color of the points based on the factor variable I have.
Note that I've loaded library(circular), which doesn't allow me to use ggplot. Here's the syntax I've been working with:
plot(bcirc, stack=FALSE, bins=60, shrink= 1, col=w$dx, axes=FALSE, xlab ="Basal sCORT", ylab = "Basal sAA")
As you can see, I specified the factor variable (which has two levels) in the col argument, but it just keeps drawing everything in one color. Any suggestions?
Seems plot.circular does not like to assign multiple colours. Here's one potential work-around:
library(circular)
## simulate circular data
bcirc1 <- rvonmises(100, circular(90, units="degrees"), 10, control.circular=list(units="degrees"))
bcirc2 <- rvonmises(100, circular(0, units="degrees"), 10, control.circular=list(units="degrees"))
bcirc <- c(bcirc1, bcirc2)
dx <- c(rep(1,100),rep(2,100))
## start with blank plot, then add group-specific points
plot(bcirc, stack=FALSE, bins=60, shrink= 1, col=NA,
axes=FALSE, xlab ="Basal sCORT", ylab = "Basal sAA")
points(bcirc[dx==1], col=rgb(1,0,0,0.1), cex=2) # note: a loop would be cleaner if dealing with >2 levels
points(bcirc[dx==2], col=rgb(0,0,1,0.1), cex=2)
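The loop mentioned in the comment above might look something like this for more than two levels. It is only a sketch: the three group names, their mean directions and the colours are made up for illustration, and the pooled angles are rebuilt with circular() rather than concatenated directly.

```r
library(circular)

set.seed(42)
group_names <- c("control", "patient", "other")        # hypothetical groups
centres <- c(control = 0, patient = 90, other = 200)   # mean directions in degrees
n_per_group <- 50

# simulate one von Mises sample per group and pool the angles (in degrees)
angles <- unlist(lapply(group_names, function(g) {
  as.numeric(rvonmises(n_per_group, circular(centres[[g]], units = "degrees"), 10,
                       control.circular = list(units = "degrees")))
}))
bcirc <- circular(angles, units = "degrees")
dx <- factor(rep(group_names, each = n_per_group), levels = group_names)

# blank plot first, then one points() call per factor level
group_cols <- c("red", "blue", "darkgreen")
plot(bcirc, stack = FALSE, shrink = 1, col = NA, axes = FALSE)
for (i in seq_along(group_names)) {
  points(bcirc[dx == group_names[i]],
         col = adjustcolor(group_cols[i], alpha.f = 0.3), cex = 2)
}
```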
Inspired by Paul Regular's example, here is a version using the same data where one condition is plotted stacking inwards and the other is plotted stacking outwards.
library(circular)
## simulate circular data
bcirc1 <- rvonmises(100, circular(90, units = 'degrees'), 10, control.circular=list(units="degrees"))
bcirc2 <- rvonmises(100, circular(0, units = 'degrees'), 10, control.circular=list(units="degrees"))
bcirc <- data.frame(condition = c(
rep(1,length(bcirc1)),
rep(2,length(bcirc2)) ),
angles = c(bcirc1,
bcirc2) )
## start with blank plot, then add group-specific points
dev.new(); par(mai = c(1, 1, 0.1,0.1))
plot(circular(subset(bcirc, condition == 1)$angles, units = 'degrees'), stack=T, bins=60, shrink= 1, col=1,sep = 0.005, tcl.text = -0.073,#text outside
axes=T, xlab ="Basal sCORT", ylab = "Basal sAA")
par(new = T)
plot(circular(subset(bcirc, condition == 2)$angles, units = 'degrees'), stack=T, bins=60, shrink= 1.05, col=2,
sep = -0.005, axes=F)#inner circle, no axes, stacks inwards

adding spread data to dotplots in R

I have a table with averages and interquartile ranges. I would like to create a dotplot, where the dot would show this average, and a bar would stretch through the dot, to show the interquartile range. In other words, the dot would be at the midpoint of a bar, the length of which would equal my interquartile range data. I am working in R.
For example,
labels<-c('a','b','c','d')
averages<-c(10,40,20,30)
ranges<-c(5,8,4,10)
dotchart(averages,labels=labels)
where the ranges would then be added to this plot as bars.
Any ideas?
Thanks!
Yet another method, using base.
labels <- c('a', 'b', 'c', 'd')
averages <- c(10, 40, 20, 30)
ranges <- c(5, 8, 4, 10)
dotchart(averages, labels=labels, xlab='average', pch=20,
xlim=c(min(averages-ranges/2), max(averages+ranges/2)))
# each bar spans the interquartile range, centred on the average
segments(averages-ranges/2, 1:4, averages+ranges/2, 1:4)
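If end caps on the bars are wanted, a common base R idiom is arrows() with angle = 90 and code = 3. The sketch below assumes that the range values are full interquartile widths, so each bar extends half the range on either side of the dot; the names iqr, low and high are illustrative.

```r
labels <- c("a", "b", "c", "d")
averages <- c(10, 40, 20, 30)
iqr <- c(5, 8, 4, 10)        # assumed to be full interquartile widths

low <- averages - iqr / 2
high <- averages + iqr / 2

dotchart(averages, labels = labels, xlab = "average", pch = 19,
         xlim = range(low, high))

# flat-headed arrows at both ends act as error-bar caps
arrows(low, seq_along(averages), high, seq_along(averages),
       angle = 90, code = 3, length = 0.05)
```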
For the record, here's a lattice solution, which uses a couple of functions from the Hmisc package:
library(lattice)
library(Hmisc)
labels<-c('a','b','c','d')
averages<-c(10,40,20,30)
ranges<-c(5,8,4,10)
low <- averages - ranges/2
high <- averages + ranges/2
d <- data.frame(labels, averages, low, high)
Dotplot(labels ~ Cbind(averages, low, high), data = d,
col = 1, # for black points
par.settings = list(plot.line = list(col = 1)), # for black bars
xlab = "Value")
ggplot2 has a good facility for doing this:
library(ggplot2)
labels<-c('a','b','c','d')
averages<-c(10,40,20,30)
ranges<-c(5,8,4,10)
x <- data.frame(labels,averages,ranges)
ggplot(x, aes(averages,labels)) +
geom_point() +
geom_errorbarh(aes(xmin=averages-ranges/2,xmax=averages+ranges/2)) # bar length equals the range
