Add pair lines in R

I have some data measured pair-wise (e.g. 1C, 1M, 2C and 2M), which I have plotted separately (as C and M). However, I would like to add a line between each pair (e.g. a line from point 1 in the C column to point 1 in the M 'column').
A small section of the entire dataset:
PairNumber Type M
1 M 0.117133
2 M 0.054298837
3 M 0.039734
4 M 0.069247069
5 M 0.043053957
1 C 0.051086898
2 C 0.075519
3 C 0.065834198
4 C 0.084632915
5 C 0.054254946
I have generated the below picture using the following tiny R snippet:
boxplot(test$M ~ test$Type)
stripchart(test$M ~ test$Type, vertical = TRUE, method="jitter", add = TRUE, col = 'blue')
Current plot:
I would like to know what command or what function I would need to achieve this (a rough sketch of the desired result, with only some of the lines, is presented below).
Desired plot:
Alternatively, doing this with ggplot is also fine by me, I have the following alternative ggplot code to produce a plot similar to the first one above:
ggplot(,aes(x=test$Type, y=test$M)) +
geom_boxplot(outlier.shape=NA) +
geom_jitter(position=position_jitter(width=.1, height=0))
I have been trying geom_path, but I have not found the correct syntax to achieve what I want.

I would probably recommend breaking this up into multiple visualizations -- with more data, I feel this type of plot would become difficult to interpret. In addition, I am not sure it's possible to make the lines drawn by geom_line() meet points that are jittered by a separate call to geom_jitter(). That being said, this gets you most of the way there:
ggplot(df, aes(x = Type, y = M)) +
geom_boxplot(outlier.shape = NA) +
geom_line(aes(group = PairNumber)) +
geom_point()
The trick is to specify your group aesthetic within geom_line() and not up top within ggplot().
Additional Note: There is no reason to fully qualify your aesthetic variables within ggplot() -- that is, no reason to do ggplot(data = test, aes(x = test$Type, y = test$M)); rather, just use: ggplot(data = test, aes(x = Type, y = M)).
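For the base R plot in the question, a comparable result can probably be reached with segments(). The following is a minimal sketch, assuming the data frame is called test as in the question and that every pair has exactly one C and one M row; the helper name wide is hypothetical, and jitter is left out so the lines end exactly on the points.

```r
# Sample rows from the question; `test` is the name the question uses
test <- data.frame(
  PairNumber = rep(1:5, times = 2),
  Type = rep(c("M", "C"), each = 5),
  M = c(0.117133, 0.054298837, 0.039734, 0.069247069, 0.043053957,
        0.051086898, 0.075519, 0.065834198, 0.084632915, 0.054254946)
)

# One row per pair, with the C and M measurements side by side
# (`wide` is a hypothetical helper name)
wide <- merge(
  test[test$Type == "C", c("PairNumber", "M")],
  test[test$Type == "M", c("PairNumber", "M")],
  by = "PairNumber", suffixes = c("_C", "_M")
)

boxplot(M ~ Type, data = test)
stripchart(M ~ Type, data = test, vertical = TRUE, add = TRUE,
           col = "blue", pch = 19)
# boxplot() draws the first level (C) at x = 1 and the second (M) at x = 2
segments(x0 = 1, y0 = wide$M_C, x1 = 2, y1 = wide$M_M, col = "grey40")
```

Reshaping to one row per pair first keeps the pairing explicit, so the row order of the original data does not matter.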
UPDATE
Leveraging cowplot to visualize this data in different plots could prove helpful:
library(cowplot)
p1 <- ggplot(df, aes(x = Type, y = M, color = Type)) +
geom_boxplot()
p2 <- ggplot(df, aes(x = Type, y = M, color = Type)) +
geom_jitter(position = position_jitter(width = 0.1, height = 0))
p3 <- ggplot(df, aes(x = M, color = Type, fill = Type)) +
geom_density(alpha = 0.5)
p4 <- ggplot(df, aes(x = Type, y = M)) +
geom_line(aes(group = PairNumber, color = factor(PairNumber)))
plot_grid(p1, p2, p3, p4, labels = c(LETTERS[1:4]), align = "v")

Related

How to stop ggplot line plot adding fill

I am producing a ggplot which looks at a curve in a dataset. When I build the plot, ggplot is automatically adding fill to data which is on the negative side of the x axis. Script and plot shown below.
ggplot(df, aes(x = Var1, y = Var2)) +
geom_line() +
geom_vline(xintercept = 0) +
geom_hline(yintercept = Var2[1])
Using base R, I am able to get the plot shown below which is how it should look.
plot(x = df$Var1, y = df$Var2, type = "l",
xlab = "Var1", ylab = "Var2")
abline(v = 0)
abline(h = df$Var2[1])
If anyone could help identify why I might be getting the automatic fill and how I could make it stop, I would be very appreciative. I would like to make this work in ggplot so I can later animate the line as it is a time series that can be used to compare between other datasets from the same source.
Can add data if necessary. Data set is 1561 obs long however. Thanks in advance.
I guess you should try
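ggplot(df, aes(x = Var1, y = Var2)) +
  geom_path() +
  geom_vline(xintercept = 0) +
  # arguments outside aes() are not looked up in df, hence df$Var2[1]
  geom_hline(yintercept = df$Var2[1])
instead. geom_line() connects the points in order of the variable on the x-axis, whereas geom_path() connects them in the order in which they appear in the data.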
Take a look at this example
dt <- data.frame(
x = c(seq(-pi/2,3*pi,0.001),seq(-pi/2,3*pi,0.001)),
y = c(sin(seq(-pi/2,3*pi,0.001)), cos(seq(-pi/2,3*pi,0.001)))
)
ggplot(dt, aes(x,y)) + geom_line()
The two points with x-coordinate -pi/2 will be connected first, creating a vertical black line. Next x = -pi/2 + 0.001 will be processed and so on. The x values will be processed in order.
Therefore you should use geom_path() to get the desired result:
dt <- data.frame(
x = c(seq(-pi/2,3*pi,0.001),seq(-pi/2,3*pi,0.001)),
y = c(sin(seq(-pi/2,3*pi,0.001)), cos(seq(-pi/2,3*pi,0.001)))
)
ggplot(dt, aes(x,y)) + geom_path()
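The difference is perhaps easiest to see on a curve that doubles back on itself. The sketch below uses a hypothetical spiral data frame (spiral is not from the question) whose rows are stored in drawing order, and builds the same plot with both geoms.

```r
library(ggplot2)

# Hypothetical data: a spiral, stored in the order it should be drawn
t <- seq(0, 4 * pi, length.out = 200)
spiral <- data.frame(x = t * cos(t), y = t * sin(t))

# geom_line() sorts the rows by x before connecting them,
# so the curve zigzags between the arms of the spiral
p_line <- ggplot(spiral, aes(x, y)) + geom_line()

# geom_path() connects the rows in the order they are stored,
# so the spiral is drawn as intended
p_path <- ggplot(spiral, aes(x, y)) + geom_path()

p_line
p_path
```

If the x values of a time series are not monotonic, geom_path() is usually the geom that reproduces what plot(..., type = "l") shows in base R.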

How to use sec_axis() for discrete data in ggplot2 R?

I have discrete data that looks like this:
height <- c(1,2,3,4,5,6,7,8)
weight <- c(100,200,300,400,500,600,700,800)
person <- c("Jack","Jim","Jill","Tess","Jack","Jim","Jill","Tess")
set <- c(1,1,1,1,2,2,2,2)
dat <- data.frame(set,person,height,weight)
I'm trying to plot a graph with the same x-axis (person) and two different y-axes (weight and height). All the examples I find either plot the secondary axis (sec_axis) or plot discrete data using base plots.
Is there an easy way to use sec_axis for discrete data in ggplot2?
Edit: Someone in the comments suggested I try the suggested reply. However, I run into this error now
Here is my current code:
p1 <- ggplot(data = dat, aes(x = person, y = weight)) +
geom_point(color = "red") + facet_wrap(~set, scales="free")
p2 <- p1 + scale_y_continuous("height",sec_axis(~.*1.2, name="height"))
p2
I get the error: Error in x < range[1] :
comparison (3) is possible only for atomic and list types
Alternately, now I have modified the example to match this example posted.
p <- ggplot(dat, aes(x = person))
p <- p + geom_line(aes(y = height, colour = "Height"))
# adding the relative weight data, transformed to match roughly the range of the height
p <- p + geom_line(aes(y = weight/100, colour = "Weight"))
# now adding the secondary axis, following the example in the help file ?scale_y_continuous
# and, very important, reverting the above transformation
p <- p + scale_y_continuous(sec.axis = sec_axis(~.*100, name = "Relative weight [%]"))
# modifying colours and theme options
p <- p + scale_colour_manual(values = c("blue", "red"))
p <- p + labs(y = "Height [inches]",
x = "Person",
colour = "Parameter")
p <- p + theme(legend.position = c(0.8, 0.9))+ facet_wrap(~set, scales="free")
p
I get an error that says
"geom_path: Each group consists of only one observation. Do you need to
adjust the group aesthetic?"
I get the template, but no points get plotted
R function arguments are matched by position if argument names are not specified explicitly. As mentioned by @Z.Lin in the comments, you need sec.axis= before your sec_axis() call to indicate that you are passing it to the sec.axis argument of scale_y_continuous. If you don't do that, it is passed to the second argument of scale_y_continuous, which is breaks=. The error message thus comes from supplying an unacceptable data type for the breaks argument:
p1 <- ggplot(data = dat, aes(x = person, y = weight)) +
geom_point(color = "red") + facet_wrap(~set, scales="free")
p2 <- p1 + scale_y_continuous("weight", sec.axis = sec_axis(~.*1.2, name="height"))
p2
The first argument (name=) of scale_y_continuous is for the first y scale, whereas the sec.axis= argument is for the second y scale. I changed your first y scale name to correct that.
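The positional-matching behaviour can be reproduced without ggplot2. In this rough sketch, toy_scale is a hypothetical stand-in, not the real signature of scale_y_continuous; it only shares the property that breaks is its second argument, and it reports which arguments actually received a value.

```r
# Hypothetical stand-in: `breaks` is the second argument, as in the text above
toy_scale <- function(name = NULL, breaks = NULL, sec.axis = NULL) {
  supplied <- c(
    name = !is.null(name),
    breaks = !is.null(breaks),
    sec.axis = !is.null(sec.axis)
  )
  # Return the names of the arguments that were given a value
  names(supplied)[supplied]
}

# Unnamed: the second value is matched by position and lands in `breaks`
toy_scale("weight", "secondary axis spec")

# Named: the second value goes where it was meant to go
toy_scale("weight", sec.axis = "secondary axis spec")
```

Naming every argument after the first is a cheap way to avoid this class of error in calls with long argument lists.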

Level-dependent axis values using facet_wrap [duplicate]

I am trying to figure out a neat way to remove unused factors from a facet in ggplot2. Here is a minimal example
# DUMMY DATA
mydf = data.frame(
x = rpois(6, 25),
y = LETTERS[1:6],
cat = c(rep('AA', 3), rep('BB', 3)))
# PLOT IT!
p0 = ggplot(mydf, aes(x = x, y = y)) +
geom_point() +
facet_wrap(~ cat, ncol = 1)
From the plot below, you can see that factors D, E and F are plotted in facet AA despite the fact that there is no corresponding data. What I want is a way to eliminate {D, E, F} from facet AA and similarly {A, B, C} from facet BB.
Is there a neat way to do this? Even a hack would be acceptable.
I think all you need is scales = "free_y":
p0 = ggplot(mydf, aes(x = x, y = y)) +
geom_point() +
facet_wrap(~ cat, ncol = 1,scales = "free_y")
p0
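If the facets end up with different numbers of levels, facet_grid() may be worth a look as well, since its space argument can size each panel by the levels it keeps. This is only a sketch of that variant on the same dummy data; p1 is a hypothetical name, and whether the stacked layout suits the real data is an assumption.

```r
library(ggplot2)

# Same dummy data as in the question
mydf <- data.frame(
  x = rpois(6, 25),
  y = LETTERS[1:6],
  cat = c(rep("AA", 3), rep("BB", 3))
)

# scales = "free_y" drops the unused levels in each panel;
# space = "free_y" makes panel height proportional to the levels kept
p1 <- ggplot(mydf, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(cat ~ ., scales = "free_y", space = "free_y")
p1
```

With three levels in each category the two panels come out the same height; the space argument only makes a visible difference when the counts differ.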

In ggplot2, geom_text() labels are misplaced below my data points (as pictured). How to overlay them onto points?

I'm using ggplot2 to create a simple dot plot of -1 to +1 correlation values using the following R code:
ggplot(dataframe, aes(x = exit)) +
geom_point(aes(y= row.names(dataframe))) +
geom_text(aes(y=exit, label=samplesize))
The y-axis has text labels, and I believe those text labels may be the reason that my geom_text() data point labels are squished down into the bottom of the plot as pictured here:
How can I change my plotting so that the data point labels appear on the dots themselves?
I understand that you would like to have the samplesize appear above each data point in the plot. Here is a sample plot with a sample data frame that does this:
EDIT: Per note by Gregor, changed the geom_text() call to utilize aes() when referencing the data. Thanks for the heads up!
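# sample data, with the same rows and row names as the printed table
top10_rank <- data.frame(
  String = c("h", "a", "w", "z", "z", "b", "q", "k", "r", "x", "l"),
  Number = c(0, 1, 1, 3, 3, 4, 5, 6, 9, 10, 11),
  row.names = c("4", "1", "11", "3", "7", "2", "8", "6", "9", "5", "10")
)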
x<-ggplot(data=top10_rank, aes(x = Number,
y = String)) + geom_point(size=3) + scale_y_discrete(limits=top10_rank$String)
x + geom_text(data=top10_rank, size=5, color = 'blue',
aes(x = Number,label = Number), hjust=0, vjust=0)
Not sure if this is what you wanted though.
Your problem is simply that you switched the y variables:
# your code
ggplot(dataframe, aes(x = exit)) +
geom_point(aes(y = row.names(dataframe))) + # here y is the row names
geom_text(aes(y =exit, label = samplesize)) # here y is the exit column
Since you want the same y-values for both layers, you can define them in the initial ggplot() call and not worry about repeating them later:
# working version
ggplot(dataframe, aes(x = exit, y = row.names(dataframe))) +
geom_point() +
geom_text(aes(label = samplesize))
Using row names is a little fragile. It is safer and more robust to create a data column holding the y values you want:
# nicer code
dataframe$y = row.names(dataframe)
ggplot(dataframe, aes(x = exit, y = y)) +
geom_point() +
geom_text(aes(label = samplesize))
Having done this, you probably don't want the labels right on top of the points, maybe a little offset would be better:
# best of all?
ggplot(dataframe, aes(x = exit, y = y)) +
geom_point() +
geom_text(aes(x = exit + .05, label = samplesize), vjust = 0)
In the last case, you'll have to play with the adjustment to the x aesthetic; what looks right will depend on the dimensions of your final plot.
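An alternative to editing the x aesthetic is the nudge_x argument of geom_text(), which shifts the labels by a fixed amount in data units. Below is a minimal sketch on made-up data: the values in dataframe are hypothetical, and only the column names exit and samplesize come from the question.

```r
library(ggplot2)

# Hypothetical data: `exit` holds correlations, `samplesize` the labels
dataframe <- data.frame(
  y = c("item_a", "item_b", "item_c"),
  exit = c(-0.40, 0.15, 0.70),
  samplesize = c(32, 58, 21)
)

p <- ggplot(dataframe, aes(x = exit, y = y)) +
  geom_point() +
  # nudge_x moves the labels right without changing the x mapping;
  # hjust = 0 left-aligns each label at the nudged position
  geom_text(aes(label = samplesize), nudge_x = 0.05, hjust = 0) +
  xlim(-1, 1)
p
```

As with the offset in aes(), the right nudge value depends on the axis range and on the size of the final plot.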
