Drawing a multiple line ggplot figure - r

I am working on a figure that should contain 3 different lines on the same graph. The data frame I am working with is the following:
I would like to use ind (my data point) on the x axis and then draw 3 different lines using the data from the columns med, b and c.
I only managed to draw one line.
Could you please help me? The code I am using now is:
ggplot(data=f, aes(x=ind, y=med, group=1)) +
geom_line(aes())+ geom_line(colour = "darkGrey", size = 3) +
theme_bw() +
theme(plot.background = element_blank(),panel.grid.major = element_blank(),panel.grid.minor = element_blank())

The key is to gather the columns in question into a single pair of key and value columns. This happens in the gather() step in the code below. The rest is pretty much boilerplate ggplot2.
library(ggplot2)
library(tidyr)
xy <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10),
ind = 1:10)
# we gather a and b into a new key column (myvariable) and value column (myvalue)
xy <- gather(xy, key = myvariable, value = myvalue, a, b)
ggplot(xy, aes(x = ind, y = myvalue, color = myvariable)) +
theme_bw() +
geom_line()
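In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following is a minimal sketch of the same idea applied to all three columns from the question. The data frame f here is simulated (an assumption, since the original data is not shown in the question), and the names myvariable and myvalue are simply carried over from the answer above.

```r
library(ggplot2)
library(tidyr)

# Simulated stand-in for the asker's data: columns med, b, c plus an index
set.seed(1)
f <- data.frame(med = rnorm(10), b = rnorm(10), c = rnorm(10), ind = 1:10)

# Collect med, b and c into one name column and one value column
f_long <- pivot_longer(f, cols = all_of(c("med", "b", "c")),
                       names_to = "myvariable", values_to = "myvalue")

# One line per original column, coloured by column name
p <- ggplot(f_long, aes(x = ind, y = myvalue, colour = myvariable)) +
  geom_line() +
  theme_bw()
```

Printing p draws the figure. Each of the 10 rows of f becomes 3 rows in f_long, one per gathered column.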

With melt() from reshape2 and ggplot2:
library(reshape2)
library(ggplot2)
df$ind <- 1:nrow(df)
head(df)
           a          b       med         c ind
1  -87.21893  -84.72439 -75.78069 -70.87261   1
2 -107.29747  -70.38214 -84.96422 -73.87297   2
3 -106.13149 -105.12869 -75.09039 -62.61283   3
4  -93.66255  -97.55444 -85.01982 -56.49110   4
5  -88.73919  -95.80307 -77.11830 -47.72991   5
6  -86.27068  -83.24604 -86.86626 -91.32508   6
df <- melt(df, id='ind')
ggplot(df, aes(ind, value, group=variable, col=variable)) + geom_line(lwd=2)

Related

R ggplot boxplot: order filling variable

I am generating a number of boxplots, each for a different marker, filled by a categorical variable: 'CR' and 'No CR'.
I want the left box in the plot to be the 'No CR'-fill and the right one 'CR'. Like this one:
However, for some plots, I get it the other way around (left 'CR' and right 'No CR')
How can I control (order?) which filling category is left and which one is right?
Here is part of my code:
head(df)
# ID y CR
# 1 1 0 No CR
# 2 2 0 No CR
# 3 3 0 CR
# 4 4 4 No CR
ggplot(df, aes(x = CR, y = y)) +
geom_boxplot(aes(fill=CR)) +
labs(title="Highly differential peptides") +
scale_fill_manual(values=c("#35978f","#D6604D"))+
stat_compare_means( label.y = maxadn,size=5)
You can relevel your CR variable:
df$CR <- factor(df$CR, levels = c("No CR", "CR"))
and then replot.
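As a minimal sketch of why releveling works: ggplot2 places discrete values in factor level order, and factor() sorts the levels alphabetically unless told otherwise. The small data frame below is a stand-in built from the head(df) output in the question. Naming the colour vector is an optional extra (not part of the answer above) that pins each colour to a category regardless of order.

```r
library(ggplot2)

# Stand-in for the asker's data, taken from the head(df) output
df <- data.frame(CR = c("No CR", "No CR", "CR", "No CR"), y = c(0, 0, 0, 4))

# Default levels are sorted alphabetically, which puts "CR" first
default_levels <- levels(factor(df$CR))

# Explicit levels put "No CR" on the left
df$CR <- factor(df$CR, levels = c("No CR", "CR"))

# A named vector ties each colour to a category, whatever the level order
p <- ggplot(df, aes(x = CR, y = y)) +
  geom_boxplot(aes(fill = CR)) +
  scale_fill_manual(values = c("No CR" = "#35978f", "CR" = "#D6604D"))
```

Without explicit levels, the order can differ between plots whenever the data is re-read or the column is rebuilt, so setting the levels once before plotting keeps every plot consistent.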
It's nice to include a minimal, reproducible example to make it easier to answer your question thoroughly. First I'll simulate some data:
library("ggplot2")
df <- data.frame(
CR = sample(c("CR", "No CR"), 20, replace=TRUE),
y = rpois(20, 2)
)
Then, as explained in this question, you can either set the limits directly:
ggplot(df, aes(x = CR, y = y)) +
geom_boxplot(aes(fill=CR)) +
scale_fill_manual(values=c("#35978f","#D6604D")) +
scale_x_discrete(limits=c("No CR", "CR"))
or use factor levels to control the order:
ggplot(df, aes(x = factor(CR, levels=c("No CR", "CR")), y = y)) +
geom_boxplot(aes(fill=CR)) +
scale_fill_manual(values=c("#35978f","#D6604D")) +
labs(x = "CR")
Without any reordering:
ggplot(df, aes(x = CR, y = y)) +
geom_boxplot(aes(fill=CR)) +
scale_fill_manual(values=c("#35978f","#D6604D"))

Trouble graphing two columns on one graph in R

I just started learning R. I melted my dataframe and used ggplot to get this graph. There's supposed to be two lines on the same graph, but the lines connecting seem random.
Correct points plotted, but wrong lines.
# Melted my data to create new dataframe
AvgSleep2_DF <- melt(AvgSleep_DF , id.vars = 'SleepDay_Date',
variable.name = 'series')
# Plotting
ggplot(AvgSleep2_DF, aes(SleepDay_Date, value, colour = series)) +
geom_point(aes(colour = series)) +
geom_line(aes(colour = series))
With or without the aes(colour = series) in the geom_line results in the same graph. What am I doing wrong here?
The following might explain what geom_line() does when you specify aesthetics in the ggplot() call.
I assign a deliberate colour column that differs from the series specification!
df <- data.frame(
x = c(1,2,3,4,5)
, y = c(2,2,3,4,2)
, colour = factor(c(rep(1,3), rep(2,2)))
, series = c(1,1,2,3,3)
)
df
  x y colour series
1 1 2      1      1
2 2 2      1      1
3 3 3      1      2
4 4 4      2      3
5 5 2      2      3
Each layer inherits the aesthetics defined in the ggplot() call unless the layer overrides them.
ggplot(data = df, aes(x = x, y = y, colour = colour)) +
geom_point(size = 3) + # setting the size to stress point layer call
geom_line() # geom_line will "inherit" a "grouping" from the colour set above
This gives you a line that is split by colour: points 1 to 3 are joined, and points 4 and 5 are joined.
We can instead control the grouping associated with each line (segment) as follows:
ggplot(data = df, aes(x = x, y = y, colour = colour)) +
geom_point(size = 3) +
geom_line(aes(group = series)) # defining a specific grouping
Note: Because the series column puts the 3rd point in a group of its own, that group contains a single point, so no line segment is drawn through it.
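To check which group each layer actually uses, one option is to inspect the computed layer data with layer_data() from ggplot2. This is a sketch using the same example data frame as above. The variable names point_groups and line_groups are only illustrative.

```r
library(ggplot2)

df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 2, 3, 4, 2),
  colour = factor(c(rep(1, 3), rep(2, 2))),
  series = c(1, 1, 2, 3, 3)
)

p <- ggplot(data = df, aes(x = x, y = y, colour = colour)) +
  geom_point(size = 3) +
  geom_line(aes(group = series))

# Layer 1 (points): no explicit group, so it is derived from the discrete colour
point_groups <- layer_data(p, 1)$group

# Layer 2 (lines): the explicit group = series mapping takes over
line_groups <- layer_data(p, 2)$group
```

Comparing the two vectors shows that the point layer is grouped by colour while the line layer follows series. In the original question, adding group = series to the aesthetics is a reasonable first thing to try when lines connect the wrong points.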

How to create two barplots with different x and y axis in the same plot in R?

I need to plot two grouped barplots from two data frames that have different numbers of rows: 6 and 5.
I tried a lot of code in R, but I don't know how to fix it.
Here are my data frames. The Freq column must be on the y axis, and the inter and intra columns must be on the x axis.
> freqinter
inter Freq
1 0.293040975264367 17
2 0.296736775990729 2
3 0.297619926364764 4
4 0.587377012109561 1
5 0.595245125315916 4
6 0.597022018595893 2
> freqintra
intra Freq
1 0 3
2 0.293040975264367 15
3 0.597022018595893 4
4 0.598809552335782 2
5 0.898227748764939 6
I expect to plot the barplots in the same plot and to distinguish the inter and intra values by colour.
I want a picture like this one:
You probably want a histogram. Use the raw data if possible. For example:
library(tidyverse)
freqinter <- data.frame(x = c(
0.293040975264367,
0.296736775990729,
0.297619926364764,
0.587377012109561,
0.595245125315916,
0.597022018595893), Freq = c(17,2,4,1,4,2))
freqintra <- data.frame(x = c(
0 ,
0.293040975264367,
0.597022018595893,
0.598809552335782,
0.898227748764939), Freq = c(3,15,4,2,6))
df <- bind_rows(freqinter, freqintra, .id = "id") %>%
uncount(Freq)
ggplot(df, aes(x, fill = id)) +
geom_histogram(binwidth = 0.1, position = 'dodge', col = 1) +
scale_fill_grey() +
theme_minimal()
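The uncount() step above is what turns the frequency table back into raw observations that geom_histogram() can bin. A minimal sketch of what it does, on a small made-up table (the values here are illustrative only, not the asker's data):

```r
library(tidyr)

# A tiny frequency table: the value 0.1 was seen 3 times, 0.2 once
counts <- data.frame(x = c(0.1, 0.2), Freq = c(3, 1))

# uncount() repeats each row Freq times and drops the Freq column
raw <- uncount(counts, Freq)
```

Here raw has one row per observation, so its number of rows equals sum(counts$Freq).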
With the data you posted, I don't think you can make this graph look good. You can't have bars thin enough to differentiate 0.293 and 0.296 when your data ranges from 0 to 0.9.
Maybe you could try to treat x as a factor, just to illustrate what you want to do:
library(dplyr)
library(ggplot2)
freqinter <- data.frame(x = c(
0.293040975264367,
0.296736775990729,
0.297619926364764,
0.587377012109561,
0.595245125315916,
0.597022018595893), Freq = c(17,2,4,1,4,2))
freqintra <- data.frame(x = c(
0 ,
0.293040975264367,
0.597022018595893,
0.598809552335782,
0.898227748764939), Freq = c(3,15,4,2,6))
df <- bind_rows(freqinter, freqintra, .id = "id")
ggplot(df, aes(x = as.factor(x), y = Freq, fill = id)) +
geom_bar(stat = "identity", position = position_dodge2(preserve = "single")) +
theme(axis.text.x = element_text(angle = 90)) +
scale_fill_discrete(labels = c("inter", "intra"))
You can also check the problem by not treating your x variable as a factor:
ggplot(df, aes(x = x, y = Freq, fill = id)) +
geom_bar(stat = "identity", width = 0.05, position = "dodge") +
theme(axis.text.x = element_text(angle = 90)) +
scale_fill_discrete(labels = c("inter", "intra"))
Either the bars must be very thin (small width), or you'll get overlapping x intervals breaking the plot.

Remove Factors with no data in facet grouping variable

I have the following data :
data <- data.frame(x = letters[1:6],
group = rep(letters[1:2], each = 3),
y = 1:6)
  x group y
1 a     a 1
2 b     a 2
3 c     a 3
4 d     b 4
5 e     b 5
6 f     b 6
And I would like to plot y ~ x and split into facets by groups with ggplot2.
ggplot(data, aes(x, y)) +
geom_bar(stat = "identity") +
facet_grid(group ~ .)
The problem is that some (x, group) combinations don't exist in my data (for example, there is no data for x = a and group = b), but they are still kept on the x-axis of both facets. I would like to drop them so that the facets have no empty space where a factor level is missing from the respective group.
I thought scales = "free_x" or drop = TRUE could do the trick but I couldn't manage to do it.
Any help would be appreciated, Thanks !
Use facet_wrap instead:
ggplot(data, aes(x, y)) +
geom_col() +
facet_wrap(~group, scales = 'free', nrow = 2, strip.position = 'right')
Also note geom_col() as an alternative to geom_bar(stat = "identity").

How to manage parameters with different length in R

I have two data sets (80211 and mine) as follows. Each file has one column of data.
80211
1
2
3
4
5
mine
1
2
3
I need to read these two files and plot the CDF with ggplot2, but it says that the lengths of the parameters are different.
The code is here.
library(ggplot2)
data1 <- read.csv('80211')
data2 <- read.csv('mine')
df <- data.frame(x = c(data1, data2), ggg=factor(rep(1:2, c(5,3))))
ggplot(df, aes(x, colour = ggg)) +
stat_ecdf()+
scale_colour_hue(name="my legend", labels=c('80211','mine'))
This seems to work here:
require(ggplot2)
data1 <- 1:5
data2 <- 1:3
df <- data.frame(x = c(data1, data2), ggg=factor(rep(1:2, c(5,3))))
ggplot(df, aes(x, colour = ggg)) +
stat_ecdf()+
scale_colour_hue(name="my legend", labels=c('80211','mine'))
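The working version above uses plain vectors, whereas read.csv() returns a data frame, so c(data1, data2) in the question builds a list rather than one numeric vector. The following is a sketch of one way to read the files into vectors first. It assumes each file holds one number per line with no header row. The temporary files are stand-ins for the '80211' and 'mine' files.

```r
library(ggplot2)

# Assumption: one number per line, no header row.
# Temporary files stand in for the '80211' and 'mine' files.
f1 <- tempfile()
writeLines(as.character(1:5), f1)
f2 <- tempfile()
writeLines(as.character(1:3), f2)

# read.csv() returns a data frame; [[1]] extracts the single column as a vector
data1 <- read.csv(f1, header = FALSE)[[1]]
data2 <- read.csv(f2, header = FALSE)[[1]]

# Build the group labels from the actual lengths instead of hard-coding 5 and 3
df <- data.frame(
  x = c(data1, data2),
  ggg = factor(rep(c("80211", "mine"),
                   times = c(length(data1), length(data2))))
)

p <- ggplot(df, aes(x, colour = ggg)) +
  stat_ecdf() +
  scale_colour_hue(name = "my legend")
```

Note that with the default header = TRUE, read.csv() would treat the first value of each file as a column name, which also changes the row counts.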
