I have been searching for ways to visualize missing values in R, and even though there are many nice options, I haven't found code that produces exactly what I need.
My data frame (df) is
data.frame(
stringsAsFactors = FALSE,
Date = c("01/01/2000","02/01/2000",
"03/01/2000","04/01/2000","05/01/2000","06/01/2000",
"07/01/2000","08/01/2000","09/01/2000"),
Site.1 = c(NA,0.952101337,0.066766616,
0.77279551,0.715427011,NA,NA,NA,0.925705179),
Site.2 = c(0.85847963,0.663818831,NA,NA,
0.568488712,0.002833073,0.349365844,0.652482654,
0.334879886),
Site.3 = c(0.139854891,0.057024999,
0.297705256,0.914754178,NA,0.14108163,0.282896932,
0.823245136,0.153609705),
Site.4 = c(0.758317946,0.284147119,
0.756356853,NA,NA,0.313465424,NA,0.013689324,0.654615632)
) -> df
I would like to get a plot similar to the following, taking into account that my actual data consists of 51 sites and around 9,000 dates.
You can try with something like this:
library(tidyr)
library(dplyr)
library(ggplot2)
df %>%
# from wide to long
pivot_longer(!Date, names_to = "sites", values_to = "value") %>%
# add a column of ones and NAs following your data
mutate(fake = ifelse(is.na(value),NA, 1),
sites = as.factor(sites)) %>%
# plot it
ggplot(aes(x = Date, y = reorder(sites,desc(sites)), color = fake, group = sites)) +
# line size
geom_line( size = 2) +
# some aesthetics
ylab('sites') +
scale_color_continuous(high="black",na.value="white") +
theme(legend.position = 'none',
panel.background = element_rect(fill ='white'),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
That said, I prefer something simpler like this:
df %>%
pivot_longer(!Date, names_to = "sites", values_to = "value") %>%
ggplot(aes(x = Date, y = sites, fill =value)) +
geom_tile() +
theme_light() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
This answer is similar to the first one, but uses NA values to interrupt the lines.
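library(ggplot2)
library(tidyr)
library(dplyr)    # needed for mutate() and case_when()
library(stringr)
df %>% pivot_longer(., cols = 2:5) %>%
# TRUE where a value was recorded
mutate(present = !is.na(value)) %>%
# use the site number as the line height
mutate(height = as.numeric(str_remove(name, "Site.")) * present) %>%
# NA wherever the value is missing, so geom_line() breaks there
mutate(value2 = case_when(!is.na(value) ~ height)) %>%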
ggplot(aes(Date, value2, group = name)) +
geom_line() +
theme(legend.position = "none") +
scale_x_discrete(guide = guide_axis(angle = 90))
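With around 9,000 dates it may help to parse Date into a real Date and to reduce each site to its runs of observed values before plotting. The following is only a base-R sketch under that assumption; `toy` and `observed_runs()` are hypothetical names used for illustration, not part of any answer above.

```r
# Toy data in the same shape as the question: a Date column plus one column per site.
toy <- data.frame(
  Date   = c("01/01/2000", "02/01/2000", "03/01/2000", "04/01/2000", "05/01/2000"),
  Site.1 = c(NA, 0.95, 0.07, NA, 0.93),
  Site.2 = c(0.86, 0.66, NA, NA, 0.33),
  stringsAsFactors = FALSE
)
# Parse day/month/year strings so the axis can be treated as continuous time.
toy$Date <- as.Date(toy$Date, format = "%d/%m/%Y")

# Hypothetical helper: start and end date of every run of non-missing values.
observed_runs <- function(dates, values) {
  r <- rle(!is.na(values))          # run lengths of observed / missing
  ends <- cumsum(r$lengths)         # last index of each run
  starts <- ends - r$lengths + 1    # first index of each run
  keep <- r$values                  # TRUE for runs of observed values
  data.frame(start = dates[starts[keep]], end = dates[ends[keep]])
}

site_cols <- setdiff(names(toy), "Date")
runs <- do.call(rbind, lapply(site_cols, function(s) {
  r <- observed_runs(toy$Date, toy[[s]])
  data.frame(site = rep(s, nrow(r)), r, stringsAsFactors = FALSE)
}))
print(runs)
```

A data frame like `runs` could then be drawn with geom_segment(aes(x = start, xend = end, y = site, yend = site)). Runs that last a single day have zero length, so they may need a point layer or a small padding to be visible.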
How can I make a bar chart, using barplot() or ggplot(), of an Excel file that has 83 columns?
I need to plot every column that has a value > 0 in each row. (Each column represents a gene function, and I need to know how many functions there are in each cluster.)
I was trying this, but it didn't work:
ggplot(x, aes(x=Cluster, y=value, fill=variable)) +
geom_bar(stat="bin", position="dodge") +
theme_bw() +
ylab("Funções no cluster") +
xlab("Cluster") +
scale_fill_brewer(palette="Blues")
Link to the excel:
https://github.com/annabmarques/GenesCorazon/blob/master/AllclusPathwayEDIT.xlsx
What about a heatmap? A rough example:
library(dplyr)
library(tidyr)
library(ggplot2)
library(openxlsx)
data <- read.xlsx("AllclusPathwayEDIT.xlsx")
data <- data %>%
mutate(cluster_nr = row_number()) %>%
pivot_longer(cols = -c(Cluster, cluster_nr),
names_to = "observations",
values_to = "value") %>%
mutate(value = as.factor(value))
ggplot(data, aes(x = cluster_nr, y = observations, fill = value)) +
geom_tile() +
scale_fill_brewer(palette = "Blues")
Given the large number of observations, consider breaking this up into multiple charts.
It's difficult to understand exactly what you're trying to do. Is this what you're trying to achieve?
#install.packages("readxl")
library(tidyverse)
library(readxl)
read_excel("AllclusPathwayEDIT.xlsx") %>%
pivot_longer(!Cluster, names_to = "gene_counts", values_to = "count") %>%
mutate(Cluster = as.factor(Cluster)) %>%
ggplot(aes(x = Cluster, y = count, fill = gene_counts)) +
geom_bar(position="stack", stat = "identity") +
theme(legend.position = "right",
legend.key.size = unit(0.4,"line"),
legend.text = element_text(size = 7),
legend.title = element_blank()) +
guides(fill = guide_legend(ncol = 1))
ggsave(filename = "example.pdf", height = 20, width = 35, units = "cm")
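If the goal is only the number of functions present in each cluster, the count can be computed before any plotting. This is a minimal base-R sketch on made-up data; `toy` and the column names `funA`, `funB`, `funC` are hypothetical stand-ins for the spreadsheet, and only the `Cluster` column name is taken from the code above.

```r
# Hypothetical stand-in for the spreadsheet: one row per cluster,
# one column per gene function.
toy <- data.frame(
  Cluster = c("c1", "c2", "c3"),
  funA = c(2, 0, 1),
  funB = c(0, 0, 4),
  funC = c(5, 1, 0)
)

# Every column except the cluster label is treated as a gene function.
function_cols <- setdiff(names(toy), "Cluster")

# For each row, count how many function columns have a value above zero.
n_functions <- rowSums(toy[function_cols] > 0)
names(n_functions) <- toy$Cluster
print(n_functions)
```

The named vector can be passed straight to barplot(n_functions), which gives one bar per cluster instead of 83 fill categories.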
My data looks like this:
Fem_Applied <- c(10,15,10)
Fem_Success <- c(3,5,2)
Mal_Applied <- c(20,15,20)
Mal_Success <- c(4,3,3)
Role <- c("A","B","C")
df <- data.frame(Role,Fem_Applied,Fem_Success,Mal_Applied,Mal_Success)
While I can plot it fine using melt(df), with Role as the ID variable by default, I end up with 4 columns. What I want is two columns, one red for women and one blue for men, with the applied counts stacked on top of the successful ones and drawn with a lower alpha.
It sounds like you're looking for something like this:
library(tidyr)
library(reshape2)
library(ggplot2)
library(dplyr)
separate(melt(df), "variable", into = c("Gender", "Result"), sep = "_") %>%
mutate(fillcat = paste(Gender, Result)) %>%
ggplot(aes(Gender, value, fill = fillcat)) +
geom_col(aes(group = Result)) +
scale_fill_manual(values = c("#FF3456", "#FF345680", "#3456FF", "#3456FF80")) +
facet_grid(~Role, switch = "x") +
labs(x = "Role", y = "Count") +
theme_classic() +
theme(panel.spacing = unit(0, "points"),
legend.position = "none",
strip.placement = "outside",
strip.background = element_blank())
You can use something like the following
library(tidyverse)
df %>%
pivot_longer(cols = -Role) %>%
separate(name, c("Gender", "b"), convert = T) %>%
ggplot(aes(x = Gender, y = value, fill = b)) +
geom_col()
To get a separate panel for each role, you can use
df %>%
pivot_longer(cols = -Role) %>%
separate(name, c("Gender", "b"), convert = T) %>%
ggplot(aes(x = Gender, y = value, fill = b)) +
geom_col() + facet_wrap(Role~.)
I am currently plotting data from an Excel sheet using R. The problem I am having concerns the legend. Here is the picture: https://i.stack.imgur.com/Key98.jpg
As you can see in the legend, the values appear in this order: PP1, PP10, PP15, PP3, PP30, PP5. I have been trying to make them appear in numerical order: PP1, PP3, PP5, PP10, PP15, PP30. I am not sure how to fix this, as I am very new to R. Any help would be greatly appreciated!
This is how I have my Excel sheet formatted: https://i.stack.imgur.com/OfNaY.jpg
Here is my code:
library("dplyr")
install.packages("ggplot2")
library("ggplot2")
install.packages("tidyverse")
library("tidyverse")
install.packages('reshape')
library('reshape')
# import data
NPPdata <- read.csv("C:\\Users\\rrami\\Desktop\\R-Data\\NPPdata.csv", header = TRUE)
ggplot(NPPdata , aes(x = N_Gradient, y=Values, colour = Group))+
geom_errorbar(aes(ymin=Values-Stdvalue, ymax=Values+Stdvalue), lwd =1.2)+
geom_line(lwd=1.5)+
ggtitle("Year 1 MONO Phrag [Branch Prob 0.1]")+
theme(plot.title = element_text(hjust =0.5)) +
labs(x = "N-Gradient", y ="INV%")+
theme(axis.text.x = element_text(size = 14), axis.title.x = element_text(size = 16),
axis.text.y = element_text(size = 14), axis.title.y = element_text(size = 16))
I've made an example with "iris". In the second figure, scale_fill_discrete() is used to change the order of the legend entries.
library (tidyverse)
data(iris)
figure_1 <- iris %>%
gather(key = floral_components, value = values, -Species) %>%
ggplot(aes(x = floral_components, y = values, fill = Species)) +
geom_bar(stat='identity') +
labs(x = "Floral Components",
y = "Values",
fill = "Species")
figure_2 <- iris %>%
gather(key = floral_components, value = values, -Species) %>%
ggplot(aes(x = floral_components, y = values, fill = Species)) +
geom_bar(stat='identity') +
labs(x = "Floral Components",
y = "Values",
fill = "Species") +
# breaks sets the order of the legend entries; labels alone would only rename them
scale_fill_discrete(breaks = c("versicolor", "virginica", "setosa"))
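For the PP1, PP10, PP15, ... legend in the question, another common route is to turn the grouping column into a factor whose levels are already in numeric order, since ggplot2 orders a discrete legend by factor levels. The sketch below uses base R only and assumes the labels always have the form "PP" followed by a number; the vector `group` is a made-up stand-in for the `Group` column.

```r
# Stand-in for the Group column; character values sort alphabetically by default.
group <- c("PP1", "PP10", "PP15", "PP3", "PP30", "PP5")

# Strip the "PP" prefix and order the distinct labels by the remaining number.
labels <- unique(group)
lvls <- labels[order(as.numeric(sub("PP", "", labels)))]

# A factor with explicit levels keeps that order in legends and on axes.
group_f <- factor(group, levels = lvls)
print(levels(group_f))
```

Applied to the question's data, this would be something like NPPdata$Group <- factor(NPPdata$Group, levels = lvls) before the call to ggplot().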
I am learning R and I have a problem sorting the double bars in ascending or descending order. I also want to place the legend at the top of the plot, with the two colours laid out in one row and two columns, for example:
The title Time
box color Breakfast box color Dinner
And the plot here
Here is my dataframe:
dat <- data.frame(
time = factor(c("Breakfast","Breakfast","Breakfast","Breakfast","Breakfast","Lunch","Lunch","Lunch","Lunch","Lunch","Lunch","Dinner","Dinner","Dinner","Dinner","Dinner","Dinner","Dinner"), levels=c("Breakfast","Lunch","Dinner")),
class = c("a","a","b","b","c","a","b","b","c","c","c","a","a","b","b","b","c","c"))
And here is my code to make change:
dat %>%
filter(time %in% c("Breakfast", "Dinner")) %>%
droplevels %>%
count(time, class) %>%
group_by(time) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = class, y = prop, fill = time, label = scales::percent(prop))) +
geom_col(position = 'dodge') +
geom_text(position = position_dodge(width = 0.9), vjust = 0.5, size = 3) +
scale_y_continuous(labels = scales::percent)+
coord_flip()
Any help would be appreciated.
Something like this should be close to what you are asking. Feel free to ask for more.
Resources consulted during the answer: http://www.sthda.com/english/wiki/ggplot2-legend-easy-steps-to-change-the-position-and-the-appearance-of-a-graph-legend-in-r-software
Using part of the answer you can look further into https://ggplot2.tidyverse.org/reference/theme.html
library(tidyverse)
dat <- data.frame(
time = factor(c("Breakfast","Breakfast","Breakfast","Breakfast","Breakfast","Lunch","Lunch","Lunch","Lunch","Lunch","Lunch","Dinner","Dinner","Dinner","Dinner","Dinner","Dinner","Dinner"), levels=c("Breakfast","Lunch","Dinner")),
class = c("a","a","b","b","c","a","b","b","c","c","c","a","a","b","b","b","c","c"))
dat %>%
filter(time %in% c("Breakfast", "Dinner")) %>%
droplevels %>%
count(time, class) %>%
group_by(time) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = fct_reorder(class,prop), y = prop, fill = time, label = scales::percent(prop))) +
geom_col(position = 'dodge') +
geom_text(position = position_dodge(width = 0.9), vjust = 0.5, size = 3) +
scale_y_continuous(labels = scales::percent)+
coord_flip() +
labs(x = "class",fill = "Time") +
theme(legend.position = "top", legend.direction="vertical", legend.title=element_text(hjust = 0.5,face = "bold",size = 12))
Created on 2020-05-08 by the reprex package (v0.3.0)
Getting the legend title above the legend keys requires a few additional adjustments to the theme and guides.
dat %>%
filter(time %in% c("Breakfast", "Dinner")) %>%
droplevels %>%
count(time, class) %>%
group_by(time) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = class, y = prop, fill = time, label = scales::percent(prop))) +
geom_col(position = 'dodge') +
geom_text(position = position_dodge(width = 0.9), vjust = 0.5, size = 3) +
scale_y_continuous(labels = scales::percent)+
coord_flip() +
theme(legend.position="top", legend.direction="vertical", legend.title=element_text(hjust = 0.5))+
guides(fill = guide_legend(title = "Time", nrow = 1))
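On the sorting part of the question: each class has two bars, so the order depends on how the two proportions are summarised into one score per class. fct_reorder() uses the median by default. The base-R sketch below reproduces that idea with reorder() on the Breakfast and Dinner rows of the question's data; `props` is a hypothetical name.

```r
# Breakfast and Dinner rows from the question's data.
dat <- data.frame(
  time  = c(rep("Breakfast", 5), rep("Dinner", 7)),
  class = c("a", "a", "b", "b", "c",
            "a", "a", "b", "b", "b", "c", "c")
)

# Proportion of each class within each time (rows sum to 1).
tab <- prop.table(table(time = dat$time, class = dat$class), margin = 1)
props <- as.data.frame(tab, responseName = "prop")

# Order the class levels by the median proportion across the two times.
props$class <- reorder(props$class, props$prop, FUN = median)
print(levels(props$class))
```

Swapping `FUN = median` for `max` or `min`, or negating `props$prop`, changes which bar drives the order and whether it ascends or descends.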
I have a matrix with many zero elements. The column names are labeled on the horizontal axis. I'd like to show the nonzero elements explicitly as the bias from the vertical line for each column.
How should I construct a figure such as the example using ggplot2?
Example data can be generated as follows:
set.seed(2018)
N <- 5
p <- 40
dat <- matrix(0.0, nrow=p, ncol=N)
dat[2:7, 1] <- 4*rnorm(6)
dat[4:12, 2] <- 2.6*rnorm(9)
dat[25:33, 3] <- 2.1*rnorm(9)
dat[19:26, 4] <- 3.3*rnorm(8)
dat[33:38, 5] <- 2.9*rnorm(6)
colnames(dat) <- letters[1:5]
print(dat)
Here is another option using facet_wrap and geom_col with theme_minimal.
library(tidyverse)
dat %>%
as.data.frame() %>%
rowid_to_column("row") %>%
gather(key, value, -row) %>%
ggplot(aes(x = row, y = value, fill = key)) +
geom_col() +
facet_wrap(~ key, ncol = ncol(dat)) +
coord_flip() +
theme_minimal()
To further increase the aesthetic similarity to the plot in your original post we can
move the facet strips to the bottom,
rotate strip labels,
add "zero lines" in matching colours,
remove the fill legend, and
get rid of the x & y axis ticks/labels/title.
library(tidyverse)
dat %>%
as.data.frame() %>%
rowid_to_column("row") %>%
gather(key, value, -row) %>%
ggplot(aes(x = row, y = value, fill = key)) +
geom_col() +
geom_hline(data = dat %>%
as.data.frame() %>%
gather(key, value) %>%
count(key) %>%
mutate(y = 0),
aes(yintercept = y, colour = key), show.legend = FALSE) +
facet_wrap(~ key, ncol = ncol(dat), strip.position = "bottom") +
coord_flip() +
guides(fill = "none") +
theme_minimal() +
theme(
strip.text.x = element_text(angle = 45),
axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank())
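Since most entries of the matrix are zero, it can also be useful to extract only the nonzero cells into a long data frame before plotting. This is a small base-R sketch on a made-up matrix `m`; the names `idx` and `nonzero` are hypothetical.

```r
# Small made-up matrix with the same structure: mostly zeros, named columns.
m <- matrix(0, nrow = 6, ncol = 2, dimnames = list(NULL, c("a", "b")))
m[2:3, 1] <- c(1.5, -0.7)
m[5, 2] <- 2.2

# Row/column positions of the nonzero cells, in column-major order.
idx <- which(m != 0, arr.ind = TRUE)

# One row per nonzero cell: its row index, column name and value.
nonzero <- data.frame(
  row   = idx[, "row"],
  key   = colnames(m)[idx[, "col"]],
  value = m[idx]
)
print(nonzero)
```

The resulting columns match the `row`, `key` and `value` names used in the pipelines above, so the same ggplot() call could be fed with `nonzero` instead of the full reshaped matrix.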
It would be much easier if you could provide some sample data. Since you didn't, I had to create some, so there is no guarantee that this will work for your purpose.
set.seed(123)
# creating some random sample data
df <- data.frame(id = rep(1:100, each = 3),
x = rnorm(300),
# factor, so that as.numeric(df$group) and levels(df$group) work below
group = factor(rep(letters[1:3], each = 100)),
bias = sample(0:1, 300, replace = TRUE, prob = c(0.7, 0.3)))
# introducing bias
df$bias <- df$bias*rnorm(nrow(df))
# calculate lower/upper bias for errorbar
df$biaslow <- pmin(0, df$bias)
df$biasupp <- pmax(0, df$bias)
Then I used a bit of a hack to draw the groups far enough apart that they do not overlap: based on the group, I shifted the bias variable and also the lower and upper bias.
# shift each group by a multiple of 5 so the groups do not overlap
df$bias <- as.numeric(df$group)*5 + df$bias
df$biaslow <- as.numeric(df$group)*5 + df$biaslow
df$biasupp <- as.numeric(df$group)*5 + df$biasupp
And now it is possible to plot it:
library(ggplot2)
ggplot(df, aes(x = x, col = group)) +
geom_errorbar(aes(ymin = biaslow, ymax = biasupp), width = 0) +
coord_flip() +
geom_hline(aes(yintercept = 5, col = "a")) +
geom_hline(aes(yintercept = 10, col = "b")) +
geom_hline(aes(yintercept = 15, col = "c")) +
theme(legend.position = "none") +
scale_y_continuous(breaks = c(5, 10, 15), labels = letters[1:3])
EDIT:
To match the design of your example, you can add
theme_bw() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 1),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
to your plot.
EDIT2:
To draw several horizontal lines, you can create a separate data set:
df2 <- data.frame(int = unique(as.numeric(df$group)*5),
gr = levels(df$group))
And use
geom_hline(data = df2, aes(yintercept = int, col = gr))
instead of copy/pasting geom_hline for each group level.
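One caveat about the offsets used above: as.numeric(df$group) and levels(df$group) only behave as intended when group is a factor. Since R 4.0.0, data.frame() keeps strings as character by default, so the column may need an explicit factor(). The short sketch below illustrates the difference; `group_chr`, `group_fct` and `offsets` are hypothetical names.

```r
# The same labels as a character vector and as a factor.
group_chr <- rep(letters[1:3], each = 2)
group_fct <- factor(group_chr)

# A factor converts to its integer level codes, which gives the 5/10/15 offsets.
offsets <- as.numeric(group_fct) * 5
print(offsets)

# A character vector of letters cannot be parsed as numbers and becomes NA.
from_chr <- suppressWarnings(as.numeric(group_chr))
print(from_chr)
```

If the group column arrives as character, wrapping it as df$group <- factor(df$group) before computing the shifted columns avoids the NA result.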