ggplot2 not including each value on x-axis - r

I have the following data which I am trying to plot as a stacked area chart:
week Wildtype ARE
3 3770 3740
4 3910 3920
5 3660 3640
6 3750 3790
7 3940 3930
8 3940 3940
9 3830 3810
10 3710 3720
11 3730 3720
12 357 358
Using this code for a stacked area chart
library(reshape2)
library(ggplot2)
rm(list=ls())
df <- read.csv("Mo_data/mo_qpcr_data2.csv", comment.char = "#", sep=",")
df_melt <- melt(df, id=c("week"))
p1 <- ggplot() + geom_area(aes(y = value, x = week, fill = variable), data = df_melt)
p1
I get the plot that I want but it isn't quite right.
How do I change the plot so that the x-axis displays each week in the time series rather than just 5.0, 7.5 and 10.0?

I would add this to the code
+ scale_x_continuous(breaks= unique(df$week) )
library(reshape2)
library(ggplot2)
rm(list=ls())
df <- read.csv("Mo_data/mo_qpcr_data2.csv", comment.char = "#", sep=",")
df_melt <- melt(df, id=c("week"))
p1 <- ggplot() + geom_area(aes(y = value, x = week, fill = variable), data = df_melt) + scale_x_continuous(breaks= unique(df$week) )
p1

Ggplot is treating your Week column as a number (rightly so, because it is a number), so your x axis appears as continuous. If you want to treat weeks as discrete values you can:
1) Change the week column to characters or factors
df_melt$week <- as.factor(df_melt$week)
2) Tell the x axis where you want the breaks: use scale_x_continuous(breaks = ...) while week stays numeric, or scale_x_discrete() once it has been converted to a factor.
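One caveat with option 1, offered as a hedged sketch rather than as part of the original answer: once week is a factor, ggplot2 by default groups by every discrete variable in the plot, so each week/variable pair becomes its own one-point group and geom_area() has nothing to connect. Setting group = variable explicitly is one way around that. The data frame below is retyped from the question (first three weeks only) as a stand-in for the CSV file.

```r
library(reshape2)
library(ggplot2)

# First three rows of the question's data, typed in by hand instead of read from the CSV
df <- data.frame(week = 3:5,
                 Wildtype = c(3770, 3910, 3660),
                 ARE = c(3740, 3920, 3640))
df_melt <- melt(df, id.vars = "week")

# Option 1: treat week as discrete so every week gets its own axis label
df_melt$week <- as.factor(df_melt$week)

# group = variable keeps one area per series; without it each week/variable
# combination would form a separate one-point group
p1 <- ggplot(df_melt, aes(x = week, y = value, fill = variable, group = variable)) +
  geom_area()
p1
```

If week should stay numeric (for example because some weeks are missing and the gaps should remain visible), the scale_x_continuous(breaks = ...) approach is probably the simpler choice.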

Related

How can I line up graphs with a common x axis made using 2 different data frames when the x axis values are not identical?

I have 2 datasets, both with the column Depth. However df1 has ~400 rows and df2 7000 rows. The depth values, which I want to be my common x axis, for df1 go from 48-120 and df2 48-133. When I make my plots this difference in the range stops the plots from lining up.
df1 sample data
depth L F P Depo
67.48 1.003 1.063 1.066 Turb
67.63 1.004 1.020 1.024 Dri
67.73 1.011 1.017 1.028 Dri
67.83 1.006 1.007 1.014 Turb
67.92 1.003 1.029 1.032 Pro
68.06 1.004 1.007 1.011 Pro
df2 sample data
depth Ca Ti
67.41 378 241
67.91 422 253
67.94 402 262
67.95 412 264
67.98 377 266
68.01 386 263
68.02 326 266
68.08 338 219
I tried making individual plots and then using grid.draw but this doesn't work.
creating plots from DF1
Lin <- ggplot(DF1, aes(x=depth, y=L)) + geom_line() + geom_point(data = DF1, aes(x=depth, y=L, color = Depo))
Fab <- ggplot(DF1, aes(x=depth, y=P)) + geom_path() + geom_point(data = DF1, aes(x=depth, y=P, color = Depo))
Fol <- ggplot(DF1, aes(x=depth, y=F)) + geom_path() + geom_point(data = DF1, aes(x=depth, y=F, color = Depo))
Combining plots with grid.draw works for the df1 graphs
grid.draw(rbind(ggplotGrob(Fol), ggplotGrob(Lin), ggplotGrob(Fab), size = "last"))
Creating plot from DF2
Ca1 <- ggplot(DF2, aes(x=depth, y=Ca)) + geom_path()
When I try to combine the plots from the 2 dataframes it throws an error saying that x and y must have the same number of columns.
grid.draw(rbind(ggplotGrob(Fol), ggplotGrob(Lin), ggplotGrob(Fab), ggplotGrob(Ca1), size = "last"))
Cowplot works but the depths don't line up for my df2 graph (Ca1)
plot_grid(Fol, Lin, Fab, Ca1, align="h", axis="b", nrow = 4, rel_widths = c(1,2))
I tried some other ways of lining the graphs up, but they all seem to line up the plot panels, not the actual values on the x axis. I also tried to use facet wrap but couldn't work out how to combine the 2 dfs. In my searching I keep seeing advice to combine the 2 dataframes, but I can't see how this would work with my data.
Does anyone know how I can line these graphs up? I have so many variables I need to compare from both datasets.
To integrate the two data sets, which only have "depth" in common, you can gather the remaining numeric columns into "long" format, where we label the type of data in one column ('col' here) and the value in another ('val' here).
Once the data is combined, we can use facet_wrap(~col, scales = "free_y") to make facets for each variable, but with a common x axis.
library(tidyverse)
df_combo <-
  bind_rows(
    df1 %>% gather(col, val, L:P),
    df2 %>% gather(col, val, Ca:Ti)
  )
ggplot(df_combo, aes(depth, val, color = Depo)) +
  geom_path() +
  facet_wrap(~col, scales = "free_y", ncol = 1)
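If the plots need to stay separate objects (as in the question's cowplot attempt), another option is to give every plot the same x limits, computed from both data frames. This is only a sketch under that assumption, not part of the answer above; depth_range and shared_x are illustrative names, and the data are the first few sample rows from the question.

```r
library(ggplot2)

# A few sample rows from the question
df1 <- data.frame(depth = c(67.48, 67.63, 67.73), L = c(1.003, 1.004, 1.011))
df2 <- data.frame(depth = c(67.41, 67.91, 68.08), Ca = c(378, 422, 338))

# Common depth range across both data sets
depth_range <- range(c(df1$depth, df2$depth))

# Helper (hypothetical name) that returns the same x scale for every plot
shared_x <- function() scale_x_continuous(limits = depth_range)

Lin <- ggplot(df1, aes(x = depth, y = L)) + geom_line() + shared_x()
Ca1 <- ggplot(df2, aes(x = depth, y = Ca)) + geom_path() + shared_x()
```

With identical limits on each plot, stacking them with something like cowplot's plot_grid(Lin, Ca1, ncol = 1, align = "v", axis = "lr") should put equal depths directly above one another.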

Show difference value over time in trend line

My question is similar to this one.
The plot in the linked question shows the difference between two values over time. I want to show a line plot along with that difference.
In other words, I also want a trend line across the years for each set of values. How can I do that?
Data to replicate (similar to the linked question):
library(ggplot2)
library(dplyr)
library(reshape2) # melt() comes from reshape2
original.df <- read.table(text = "year Arabica Robusta
1990 100 200
1995 180 120
2000 200 190
2005 190 210
2012 230 120", header = TRUE)
df <- original.df %>%
mutate(direction = ifelse(Robusta - Arabica > 0, "Up", "Down"))%>%
melt(id = c("year", "direction"))
g1 <- ggplot(df, aes(x=year, y = value, color = variable, group = year )) +
geom_point(size=4) +
geom_path(aes(color = direction), arrow=arrow())
[Figure: the plot from the linked question]
If I add geom_smooth, it does not show anything, which makes sense to me: as I understand it, geom_smooth does not know which points to refer to, Arabica or Robusta.
I tried a few things and was apparently able to solve it:
ggplot(df, aes(x=year, y = value, color = variable))+geom_line()+geom_point(size=4)+geom_path(aes(color=direction,group=year),arrow = arrow())
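If a fitted trend line is wanted rather than a line joining the points, one possibility is to give geom_smooth its own grouping by variable, so that it fits one line per series. The following is a sketch only: method = "lm" is an illustrative choice (with five points per series a straight line is about all that can be fitted sensibly), and the data are the ones from the question.

```r
library(ggplot2)
library(dplyr)
library(reshape2)

original.df <- read.table(text = "year Arabica Robusta
1990 100 200
1995 180 120
2000 200 190
2005 190 210
2012 230 120", header = TRUE)

df <- original.df %>%
  mutate(direction = ifelse(Robusta - Arabica > 0, "Up", "Down")) %>%
  melt(id.vars = c("year", "direction"))

p <- ggplot(df, aes(x = year, y = value, color = variable)) +
  geom_point(size = 4) +
  # arrows still connect the two values within each year
  geom_path(aes(color = direction, group = year), arrow = arrow()) +
  # one straight trend line per series; "lm" is an illustrative choice
  geom_smooth(aes(group = variable), method = "lm", formula = y ~ x, se = FALSE)
p
```

The key point is that the grouping is set per layer: group = year for the arrows, group = variable for the trend lines.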

Reordering columns by y-value in R?

I have a dataframe structured like this:
> head(df)
Zip Crimes Population CPC
1 78701 2103 6841 0.3074
2 78719 186 1764 0.1054
3 78702 1668 21334 0.0782
4 78723 2124 28330 0.0750
5 78753 3472 49301 0.0704
6 78741 2973 44935 0.0662
And I'm plotting it using this function:
p = ggplot(df, aes(x=Zip, y=CPC)) + geom_col() + theme(axis.text.x = element_text(angle = 90))
And this is the graph I get:
How can I order the plot by CPC, so that the Zip codes with the highest CPC are on the left?
Convert Zip to a factor ordered by negative CPC. E.g., try df$Zip <- reorder(df$Zip, -df$CPC) before plotting. Here's a small example:
d <- data.frame(
x = c('a', 'b', 'c'),
y = c(5, 15, 10)
)
library(ggplot2)
# Without reordering
ggplot(d, aes(x, y)) + geom_col()
# With reordering
d$x <- reorder(d$x, -d$y)
ggplot(d, aes(x, y)) + geom_col()
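Sort your data frame in descending order, fix the factor levels of Zip in that order (sorting the rows alone does not change the axis order), and then plot it:
library(dplyr)
df <- arrange(df,desc(CPC))
df$Zip <- factor(df$Zip, levels = df$Zip) # keep the sorted order on the x-axis
ggplot...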

2x1 faceting with ggplot2

I'm trying to make a simple facet with histograms in ggplot2
data <- read.csv("/hist_distances.csv", check.names = FALSE, sep = ",")
mdata <- melt(data)
m <- ggplot(data, aes(x=Distance))
m + geom_histogram()
head(data)
Gives:
Times Distance
1 3.093060 260.8840
2 2.557780 187.4960
3 0.263611 10.6584
4 2.880000 184.5970
5 5.035000 281.3490
6 6.952780 251.4730
head(mdata)
gives:
variable value
1 Times 3.093060
2 Times 2.557780
3 Times 0.263611
4 Times 2.880000
5 Times 5.035000
6 Times 6.952780
and
tail(mdata)
gives:
variable value
1739 Distance 1.103670
1740 Distance 1.695610
1741 Distance 3.795020
1742 Distance 6.651960
1743 Distance 0.719843
1744 Distance 6.504050
This produces this graphic:
[Figure: histogram produced by the code above]
I have tried:
m <- ggplot(mdata, aes(x=value)) +
geom_histogram() +
m + facet_wrap(~ variable)
With no success.
How can I produce a facetted graph instead, with a histogram of variable "times" at the top and a histogram of variable "distances" at the bottom?
Use facet_grid(variable ~ .), where facet_grid(row ~ column):
library(reshape2)
library(ggplot2)
df <- data.frame(Time = rnorm(100),
Distance = rnorm(100)
)
dfm <- melt(df)
ggplot(dfm, aes(x=value)) + geom_histogram() + facet_grid(variable ~ .)
Edit for follow-up comment:
If your data are on different scales, use facet_grid(variable ~ ., scales = "free").
See help(facet_grid) for options.

Stacked bar chart

I would like to create a stacked chart using ggplot2 and geom_bar.
Here is my source data:
Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10
I want a stacked chart where x is the rank and y is the values in F1, F2, F3.
# Getting Source Data
sample.data <- read.csv('sample.data.csv')
# Plot Chart
c <- ggplot(sample.data, aes(x = sample.data$Rank, y = sample.data$F1))
c + geom_bar(stat = "identity")
This is as far as I can get. I'm not sure how to stack the rest of the field values.
Maybe my data.frame is not in a good format?
You said:
Maybe my data.frame is not in a good format?
Yes, this is true. Your data is in wide format; you need to put it in long format. Generally speaking, long format is better for comparing variables.
Using reshape2, for example, you do this with melt:
dat.m <- melt(dat,id.vars = "Rank") ## id.vars is needed: a bare melt(dat) would also melt the numeric Rank column
Then you get your barplot:
ggplot(dat.m, aes(x = Rank, y = value,fill=variable)) +
geom_bar(stat='identity')
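But using lattice and barchart's extended formula notation, you don't need to reshape your data; just do this (stack = TRUE is what stacks the bars, and Rank is wrapped in factor() so it is treated as the category axis):
library(lattice)
barchart(F1+F2+F3~factor(Rank),data=dat,stack=TRUE)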
You need to transform your data to long format and shouldn't use $ inside aes:
DF <- read.table(text="Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header=TRUE)
library(reshape2)
DF1 <- melt(DF, id.var="Rank")
library(ggplot2)
ggplot(DF1, aes(x = Rank, y = value, fill = variable)) +
geom_bar(stat = "identity")
Building on Roland's answer, using tidyr to reshape the data from wide to long:
library(tidyr)
library(ggplot2)
df <- read.table(text="Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header=TRUE)
df %>%
gather(variable, value, F1:F3) %>%
ggplot(aes(x = Rank, y = value, fill = variable)) +
geom_bar(stat = "identity")
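In current tidyr releases gather() is marked as superseded in favour of pivot_longer(). As a sketch of the same reshape with the newer function (assuming a tidyr version that provides pivot_longer(); df_long is an illustrative name):

```r
library(tidyr)
library(ggplot2)

df <- read.table(text = "Rank F1 F2 F3
1 500 250 50
2 400 100 30
3 300 155 100
4 200 90 10", header = TRUE)

# Wide to long: one row per Rank/field pair, with the same columns gather() produces
df_long <- pivot_longer(df, cols = F1:F3, names_to = "variable", values_to = "value")

ggplot(df_long, aes(x = Rank, y = value, fill = variable)) +
  geom_bar(stat = "identity")
```

The resulting plot should match the gather() version; only the row order of the long data frame differs.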
You will need to melt your dataframe to get it into the so-called long format, keeping Rank as the id variable:
require(reshape2)
sample.data.M <- melt(sample.data, id.vars = "Rank")
Now your field values are represented by their own rows and identified through the variable column. This can now be leveraged within the ggplot aesthetics:
require(ggplot2)
c <- ggplot(sample.data.M, aes(x = Rank, y = value, fill = variable))
c + geom_bar(stat = "identity")
Instead of stacking you may also be interested in showing multiple plots using facets:
c <- ggplot(sample.data.M, aes(x = Rank, y = value))
c + facet_wrap(~ variable) + geom_bar(stat = "identity")

Resources