R - ggplot dodging geom_lines

I've been experimenting for a while to find a fix for this: is there a quick way to "dodge" line plots for two different data sets in ggplot2?
My code is currently:
#Example data
id <- c("A","A")
var <- c(1,10)
id_num <- c(1,1)
df1 <- data.frame(id,var,id_num)
id <- c("A","A")
var <- c(1,15)
id_num <- c(0.9,0.9)
df2 <- data.frame(id,var,id_num)
#Attempted plot
dodge <- position_dodge(width=0.5)
p<- ggplot(data= df1, aes(x=var, y=id)) +
geom_line(aes(colour="Group 1"),position="dodge") +
geom_line(data= df2,aes(x=var, y=id,colour="Group 2"),position="dodge") +
scale_color_manual("",values=c("salmon","skyblue2"))
p
Which produces:
Here the "Group 2" line is hiding all of the "Group 1" line which is not what I want. Instead, I want the "Group 2" line to be below the "Group 1" line. I've looked around and found this previous post: ggplot2 offset scatterplot points but I can't seem to adapt the code to get two geom_lines to dodge each other when using separate data frames.
I've been converting my y-variables to numeric and slightly offsetting them to get the desired output, but I was wondering if there was a faster/easier way to get the same result using the dodge functionality of ggplot or something else.
My work around code is simply:
p<- ggplot(data= df1, aes(x=var, y=id_num)) +
geom_line(aes(colour="Group 1")) +
geom_line(data= df2,aes(x=var, y=id_num,colour="Group 2")) +
scale_color_manual("",values=c("salmon","skyblue2")) +
scale_y_continuous(lim=c(0,1))
p
This gives me my desired output:
The numeric approach can be a little cumbersome when I try to expand it to fit my actual data. I have to convert my y-values to factors, change them to numeric and then merge the values onto the second data set, so a quicker way would be preferable. Thanks in advance for your help!

You actually have two issues here:
1. If the two lines are plotted using two layers of geom_line() (because you have two data frames), then each line "does not know" about the other. Therefore, they cannot dodge each other.
2. position_dodge() dodges in the horizontal direction. The standard example is a bar plot, where you place several bars next to each other (instead of on top of each other). However, you want to dodge in the vertical direction.
Issue 1 is solved by combining the data frames into one as follows:
library(dplyr)
df_all <- bind_rows(Group1 = df1, Group2 = df2, .id = "group")
df_all
## Source: local data frame [4 x 4]
##
##    group     id   var id_num
##    (chr) (fctr) (dbl)  (dbl)
## 1 Group1      A     1    1.0
## 2 Group1      A    10    1.0
## 3 Group2      A     1    0.9
## 4 Group2      A    15    0.9
Note how setting .id = "group" lets bind_rows() create a column group whose labels are taken from the names given to df1 and df2 in the call.
You can then plot both lines with a single geom_line():
library(ggplot2)
ggplot(data = df_all, aes(x=var, y=id, colour = group)) +
geom_line(position = position_dodge(width = 0.5)) +
scale_color_manual("",values=c("salmon","skyblue2"))
I also used position_dodge() to show you issue 2 explicitly. If you look closely, you can see the red line stick out a little on the left side. This is the consequence of the two lines dodging each other (not very successfully) in the horizontal direction.
You can solve issue 2 by exchanging x and y coordinates. In that situation, dodging horizontally is the right thing to do:
ggplot(data = df_all, aes(y=var, x=id, colour = group)) +
geom_line(position = position_dodge(width = 0.5)) +
scale_color_manual("",values=c("salmon","skyblue2"))
The last step is then to use coord_flip() to get the desired plot:
ggplot(data = df_all, aes(y=var, x=id, colour = group)) +
geom_line(position = position_dodge(width = 0.5)) +
scale_color_manual("",values=c("salmon","skyblue2")) +
coord_flip()
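To check that the lines really are displaced, the data after the position adjustment can be inspected. This is a small self-contained sketch rather than part of the original answer: it rebuilds the example data with base R instead of dplyr and uses ggplot2's layer_data() to look at the dodged x positions.

```r
library(ggplot2)

# Example data from the question, combined into one data frame.
df1 <- data.frame(id = c("A", "A"), var = c(1, 10))
df2 <- data.frame(id = c("A", "A"), var = c(1, 15))
df_all <- rbind(cbind(group = "Group1", df1),
                cbind(group = "Group2", df2))

p <- ggplot(df_all, aes(x = id, y = var, colour = group)) +
  geom_line(position = position_dodge(width = 0.5)) +
  scale_color_manual("", values = c("salmon", "skyblue2")) +
  coord_flip()

# layer_data() returns the layer's data after the position adjustment.
# If dodging worked, each group sits at its own x position around "A".
dodged <- layer_data(p, 1)
dodged_x <- tapply(as.numeric(dodged$x), dodged$group, unique)
print(dodged_x)
```

If both groups reported the same x value, the lines would still overlap and the width argument of position_dodge() would be the first thing to check.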

Related

create a scatterplot with x axis coming from one dataset and y axis coming from a second dataset and colour the points according to the dataframe

I have two dataframes containing results from an epigenetic analysis.
The column from df1 which is important to the plot is labelled beta_ADHD.
The column from df2 which is important to the plot is also labelled beta_ADHD.
I would like to make the column from df1 the x axis and the column from df2 the y axis.
I would also like to label the points on the graph according to the data set they are from.
This is what I've tried so far, but nothing has worked yet:
ggp <- ggplot(NULL, aes(Beta_ADHD, Beta_ADHD)) + # Draw ggplot2 plot based on two data frames
geom_point(data = df1, col = "red") +
geom_point(data = df2, col = "blue")
ggp # Draw plot
and i also tried this:
ggplot(data=data.frame(x=df1$Beta_ADHD, y=df2$Beta_ADHD), aes(x=x, y=y)) + geom_point()
I'm at a complete loss here and any help would be greatly appreciated.
I think you need to combine the inputs into a single data frame in order to use them as co-ordinates for a scatter plot. (Also, the 2 data sets must have the same number of values.)
I don't believe it makes sense to label or colour the points according to which data set they are from. As we are taking the x-coordinate from df1 and the y-coordinate from df2, that means that every point comes from both data sets. It is the labels on the x-axis beta_ADHD1 and y-axis beta_ADHD2 that show which data set the value came from. You can change the text and color of the axis titles using xlab(), ylab() and theme().
# create some sample data
df1 <- data.frame(beta_ADHD=runif(100,0,10))
df2 <- data.frame(beta_ADHD=rnorm(100,0,10))
# create a new data frame containing the required co-ordinates
# the values from df1 are named beta_ADHD1 and the values from df2 are named beta_ADHD2
new_df <- data.frame(beta_ADHD1 = df1$beta_ADHD, beta_ADHD2 = df2$beta_ADHD)
# plot this data using ggplot
ggplot(new_df, aes(x = beta_ADHD1, y = beta_ADHD2)) + geom_point() +
xlab('beta_ADHD from df1') + ylab('beta_ADHD from df2') +
theme(axis.title.x = element_text(color ='red'), axis.title.y = element_text(color = 'blue'))
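The answer above pairs the two columns by row position, which only makes sense if row i in df1 and row i in df2 describe the same thing. If the data frames share an identifier, merging on it is a safer way to build the coordinates. The following is a minimal sketch under that assumption: the key column probe_id and all values are made up for illustration and do not come from the question.

```r
library(ggplot2)

# Made-up data: `probe_id` is a hypothetical key column, not from the question.
df1 <- data.frame(probe_id  = c("cg01", "cg02", "cg03", "cg04"),
                  beta_ADHD = c(0.10, 0.25, 0.40, 0.55))
df2 <- data.frame(probe_id  = c("cg03", "cg01", "cg05", "cg02"),
                  beta_ADHD = c(0.42, 0.12, 0.90, 0.20))

# merge() keeps only the keys present in both data frames and appends
# the suffixes to the clashing beta_ADHD columns.
new_df <- merge(df1, df2, by = "probe_id", suffixes = c("1", "2"))

# Each row of new_df is now one point: x from df1, y from df2.
p <- ggplot(new_df, aes(x = beta_ADHD1, y = beta_ADHD2)) +
  geom_point() +
  xlab("beta_ADHD from df1") + ylab("beta_ADHD from df2")
p
```

Rows whose key appears in only one data frame are dropped by default, so the row order and the row counts of the inputs no longer need to match.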

R reverse the small lines in a ggplot with geom_rug

I have a dataframe with 2 columns: date and var1.
Now I want to plot these 2 variables in a ggplot and add small lines with geom_rug().
df<-tibble(date=lubridate::today() -0:14,
var1= c(1,2.5,NA,3,NA,6.5,1,NA,3,2,NA,7,3,NA,1))
df%>%ggplot(aes(x=date,y=var1))+
geom_point()+
geom_rug(sides = "tr",outside = T) +
# Need to turn clipping off if rug is outside plot area
coord_cartesian(clip = "off")
And here is my plot:
But my problem is that the small lines for var1 are on the left side. I want to have them on the top.
With the argument sides= you can change the disposition of the small lines, like here:
df%>%ggplot(aes(x=date,y=var1))+
geom_point()+
geom_rug(sides = "t",outside = T) +
# Need to turn clipping off if rug is outside plot area
coord_cartesian(clip = "off")
But in this example the small lines represent the date and not var1. (var1 has only 10 values, but there are 15 small lines.)
Can someone help me reverse the geom_rug element and avoid this problem?
You do have 15 rows in df$var1 and in df$date. In the former, 5 are NA.
One way to approach this is to define a reduced data set that contains only the rows relevant to what is plotted (no NAs) and pass it to geom_rug. With complete.cases() we can omit the rows with NAs from our data set.
You could use the following code to get the plot you want:
library(ggplot2)
library(tibble)
library(dplyr)
df %>% ggplot(aes(x = date, y = var1)) +
geom_point() +
geom_rug(data = df[complete.cases(df), ] , ## selected data: rows without NA
sides = "lt",
outside = TRUE) +
coord_cartesian(clip = "off")
Please let me know whether this is what you want.
Another option if you want to drop all cases with NA values while plotting is to use the ggplot2 remove_missing() function:
df %>% remove_missing() %>%
ggplot(mapping = aes(x = date, y = var1)) +
geom_point() +
geom_rug(sides = "t", outside = TRUE) +
coord_cartesian(clip = "off")
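The effect of filtering can be checked by counting rows before and after. The sketch below assumes a plain data.frame with Sys.Date() in place of the tibble and lubridate::today() from the question, only to keep the dependencies small. It confirms that the rug layer receives 10 rows while the full data has 15.

```r
library(ggplot2)

# Same values as in the question, built with base R.
df <- data.frame(
  date = Sys.Date() - 0:14,
  var1 = c(1, 2.5, NA, 3, NA, 6.5, 1, NA, 3, 2, NA, 7, 3, NA, 1)
)

# Keep only rows where neither date nor var1 is NA.
df_complete <- df[complete.cases(df), ]

p <- ggplot(df, aes(x = date, y = var1)) +
  geom_point() +
  geom_rug(data = df_complete, sides = "lt", outside = TRUE) +
  coord_cartesian(clip = "off")

# Layer 2 is the rug; it should only see the complete rows.
rug_rows <- nrow(layer_data(p, 2))
print(c(all_rows = nrow(df), rug_rows = rug_rows))
```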

Keeping the width constant in ggplot in R

I have already tried the approach in the question linked below, but it fails to work. That question was asked 5 years ago, so I hope there is now a better way than that lengthy code.
How to make the width of bars and spaces between them fixed for several barplots using ggplot, having different number of bars on each plot?
#with 10 bins
data <- data.frame(x=1:10,y=runif(10))
library(ggplot2)
ggplot(data, aes(x,y)) + geom_bar(stat="identity")
#with 3 bins
ggplot(data[1:3,], aes(x,y)) + geom_bar(stat="identity")
Adding width=1 to geom_bar(...) doesn't help either. I need the second plot to automatically be narrower overall while keeping the same bar width and spacing as the first one.
One solution would be to adjust the coordinates to match:
ggplot(data[1:3,], aes(x,y)) + geom_bar(stat="identity") +
scale_x_continuous(limits = c(0.5,10)) # This is approximate but pretty close.
#
# For an exact match, you can go into the ggplot_build object and
# extract [["layout"]][["panel_params"]][[1]][["x.range"]]
# You could then use the exact values:
# scale_x_continuous(limits = c(0.055,10.945), expand = c(0,0))
Another would be to combine the bars into one plot, and then use facets to show the data on the same scale.
library(dplyr)
data2 <-
data %>% mutate(plot = "A") %>%
bind_rows(
data[1:3,] %>% mutate(plot = "B")
)
(a <- ggplot(data2, aes(x,y)) + geom_bar(stat="identity") +
facet_grid(~plot)
)
If you want to use this with plotly, you could then use plotly::ggplotly(a).
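The exact-match idea mentioned in the code comments of the first solution can be sketched as follows. It reads the x range from the built 10-bar plot and reuses it as the limits of the 3-bar plot. The set.seed() call is added only for reproducibility, and the path into the ggplot_build() result is the one named above; it may differ between ggplot2 versions.

```r
library(ggplot2)

set.seed(1)  # added for reproducibility, not in the original question
data <- data.frame(x = 1:10, y = runif(10))

p10 <- ggplot(data, aes(x, y)) + geom_bar(stat = "identity")

# x range actually drawn for the 10-bar plot, including axis expansion.
built <- ggplot_build(p10)
x_range <- built[["layout"]][["panel_params"]][[1]][["x.range"]]

# Reuse that range for the 3-bar plot and switch off further expansion,
# so both plots share the same x scale.
p3 <- ggplot(data[1:3, ], aes(x, y)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(limits = x_range, expand = c(0, 0))
p3
```

With identical x scales and the same output width, the bars in both plots get the same width and spacing.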

Adding Custom Legend to 2 Data sets in ggplot2

I am trying to simply add a legend to my Nyquist plot where I am plotting 2 sets of data: 1 is an experimental set (~600 points), and 2 is a data frame calculated using a transfer function (~1000 points)
I need to plot both and label them. Currently I have them both plotted okay, but when I try to add the label using scale_colour_manual no label appears. A way to move this label around would also be appreciated! Code below.
pdf("nyq_2elc.pdf")
nq2 <- ggplot() + geom_point(data = treat, aes(treat$V1,treat$V2), color = "red") +
geom_point(data = circuit, aes(circuit$realTF,circuit$V2), color = "blue") +
xlab("Real Z") + ylab("-Imaginary Z") +
scale_colour_manual(name = 'hell0',
values =c('red'='red','blue'='blue'), labels = c('Treatment','EQ')) +
ggtitle("Nyquist Plot and Equivilent Circuit for 2 Electrode Treatment Setup at 0 Minutes") +
xlim(0,700) + ylim(0,700)
print(nq2)
dev.off()
Ggplot works best with long dataframes, so I would combine the datasets like this:
treat$Cat <- "treat"
circuit$Cat <- "circuit"
CombData <- data.frame(rbind(treat, circuit))
ggplot(CombData, aes(x=V1, y=V2, col=Cat))+geom_point()
This should give you the legend you want.
You probably have to change the names/order of the columns of dataframes treat and circuit so they can be combined, but it's hard to tell because you're not giving us a reproducible example.
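As a hedged sketch of what that renaming could look like: the data values below are made up, and the column names (V1 and V2 in treat, realTF and V2 in circuit) are guessed from the aes() calls in the question. The legend appears because colour is mapped inside aes(); scale_colour_manual() then sets colours and labels, and theme(legend.position = ...) moves the legend.

```r
library(ggplot2)

# Made-up stand-ins for the asker's data; column names follow the question.
treat   <- data.frame(V1 = c(100, 200, 300), V2 = c(50, 120, 80))
circuit <- data.frame(realTF = c(110, 210, 310, 410), V2 = c(55, 115, 85, 20))

# Give both data frames the same column names before stacking them.
names(circuit)[names(circuit) == "realTF"] <- "V1"
treat$Cat   <- "treat"
circuit$Cat <- "circuit"
CombData <- rbind(treat, circuit)

nq2 <- ggplot(CombData, aes(x = V1, y = V2, colour = Cat)) +
  geom_point() +
  scale_colour_manual(name = NULL,
                      breaks = c("treat", "circuit"),
                      values = c(treat = "red", circuit = "blue"),
                      labels = c("Treatment", "EQ")) +
  xlab("Real Z") + ylab("-Imaginary Z") +
  theme(legend.position = "bottom")  # or "top", "left", "right"
nq2
```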

ggplot 2 - Change legend categories with numeric values (no factors)

Suppose I work with the mtcars data set. I would like to set the size of the points according to the weight (wt). If I do that as shown below, R/ggplot2 will give me a legend with 4 categories (2,3,4,5).
library(ggplot2)
mtc <- mtcars
p1 <- ggplot(mtc, aes(x = hp, y = mpg))
p1 <- p1 + geom_point(aes(size = wt))
print(p1)
How can I change the scale/names/categories of the legend? I found information on how to do that when the "categories" are factors, but I don't know how to do this with numeric values. I need to keep them numeric, otherwise it no longer works with the size of the dots.
My real data set has about 100 values for wt (everything from 1-150) and I want to keep 5 values. (ggplot 2 gives me 2 -> 50 and 100)
1) How can I change the scale of that legend? In the mtc example for example I just want 2 points of size 2 and 5
2) I was thinking about making categories such as:
mtc$wtCat[which(mtc$wt<=2)]=1
mtc$wtCat[which(mtc$wt>2 & mtc$wt<=3)]=2
mtc$wtCat[which(mtc$wt>3)]=3
p1 <- ggplot(mtc, aes(x = hp, y = mpg))
p2 <- p1 + geom_point(aes(size = wtCat), stat="identity")
print(p2)
and then just rename 1, 2, 3 in the legend to <=2, 2-3 and >3, but I couldn't figure out how to do that either.
Thank you so much.
You can use scale_size_continuous(): the breaks= argument sets the values you want to see in the legend, and the labels= argument changes how those legend entries are labelled.
ggplot(mtcars,aes(hp,mpg,size=wt))+geom_point()+
scale_size_continuous(breaks=c(2,5),labels=c("<=2",">2"))
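For the second part of the question (binned categories), one possible approach, not taken from the answer above, is to bin wt with cut() and assign a point size to each bin with scale_size_manual(). The bin edges follow the question; the point sizes 2, 4 and 6 are arbitrary choices for illustration.

```r
library(ggplot2)

mtc <- mtcars

# Bin wt into (-Inf, 2], (2, 3] and (3, Inf) and label the bins directly.
mtc$wtCat <- cut(mtc$wt,
                 breaks = c(-Inf, 2, 3, Inf),
                 labels = c("<=2", "2-3", ">3"))

# Map the factor to size and choose one point size per category.
p2 <- ggplot(mtc, aes(x = hp, y = mpg, size = wtCat)) +
  geom_point() +
  scale_size_manual(name = "wt",
                    values = c("<=2" = 2, "2-3" = 4, ">3" = 6))
p2
```

Because the labels are the factor levels themselves, no separate renaming step is needed in the legend.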
