Not showing all x-axis labels in ggplot - r

I have a dataset with many columns. The last column (Labels) gives the cluster membership of each user (row). How can I edit my code to show only some of the x-axis labels? Right now the dates overlap and cannot be read. I want to show the first date, the last date, and every fifth date in between: for example, dates 1, 5, 10, 15, ..., 133, where 1 and 133 are the first and last dates.
BTW, I tried scale_x_date() without success.
Data Sample
mat <- structure(c(1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 2, 3),
.Dim = c(3L, 5L),
.Dimnames = list(c("A", "B", "C"),
c("2011-1-6", "2011-1-9", "2011-1-15", "2011-2-19", "Labels")))
Code
library(tidyverse)
mat %>%
as.data.frame() %>%
mutate(id=1:nrow(mat),
Labels = as.factor(Labels)) %>%
pivot_longer(cols=starts_with("2011")) %>%
filter(value==1) %>%
ggplot(aes(x=name, y=id, color=Labels)) +
geom_point() +
theme(axis.text.x = element_text(angle = 90))

You can use scale_x_date. Following @Rui Barradas' comment, you first have to convert the dates to class "Date".
Then, with scale_x_date, you can control the breaks with date_breaks and the label format with date_labels. See ?scale_x_date for more info. Here is how to get an axis label every 5 days:
mat %>%
as.data.frame() %>%
mutate(id=1:nrow(mat),
Labels = as.factor(Labels)) %>%
pivot_longer(cols=starts_with("2011")) %>%
mutate(name = as.Date(name)) %>%
filter(value==1) %>%
ggplot(aes(x=name, y=id, color=Labels)) +
geom_point() +
scale_x_date(date_labels = "%Y-%m-%d", date_breaks = "5 days") +
theme(axis.text.x = element_text(angle = 90))
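Note that date_breaks = "5 days" places a label every five calendar days, which is not quite the same as the first, the last and every fifth observed date that the question asks for. A minimal sketch of one way to build such breaks by hand follows; pick_breaks is a hypothetical helper written for illustration, not a ggplot2 function, and the example dates are made up.

```r
# Hypothetical helper (not part of ggplot2): keep the first element,
# the last element, and every `step`-th element of a sorted vector.
pick_breaks <- function(x, step = 5) {
  n <- length(x)
  if (n == 0) return(x)
  idx <- seq_len(n)
  x[idx == 1 | idx == n | idx %% step == 0]
}

# Made-up example: 12 consecutive dates.
dates <- seq(as.Date("2011-01-06"), by = "day", length.out = 12)
breaks <- pick_breaks(dates)
print(breaks)
```

The result could then be passed to the plot as scale_x_date(breaks = pick_breaks(sort(unique(df$name))), date_labels = "%Y-%m-%d"), where df stands for the pivoted data frame with name already converted to Date.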

Related

How to plot filtered data with loop in R and combine them with facet_grid?

I am new to R, so my question may seem trivial to some, but I need a solution. I have a data frame:
`structure(list(Time = c(0, 0, 0), Node = 1:3, Depth = c(0, -10,
-20), Head = c(-1000, -1000, -1000), Moisture = c(0.166, 0.166,
0.166), HeadF = c(-1000, -1000, -1000), MoistureF = c(0.004983,
0.004983, 0.004983), Flux = c(-0.00133, -0.00133, -0.00133),
FluxF = c(-0.00122, -0.00122, -0.00122), Sink = c(0, 0, 0
), Transf = c(0, 0, 0), TranS = c(0, 0, 0), Temp = c(20,
20, 20), ConcF = c(0, 0, 0), ConcM = c(0, 0, 0)), row.names = c(NA,
3L), class = "data.frame")`.
I am able to plot a single TranS vs. Time plot, with color = Transf (using scale_color_viridis). I want to create plots from filtered data (Depth = -20, -40, -60, -80 and -100). Note that the title also has to change according to the depth value. I then want to put these plots next to each other using facet_grid.
I have tried it this way:
plot_d20 <-plot_node %>% filter(plot_node$Depth == -20)
plot_d40 <-plot_node %>% filter(plot_node$Depth == -40)
plot_d60 <-plot_node %>% filter(plot_node$Depth == -60)
plot_d80 <-plot_node %>% filter(plot_node$Depth == -80)
plot_d100 <-plot_node %>% filter(plot_node$Depth == -100)
depth_plot <- c(plot_d20,plot_d40,plot_d60,plot_d80,plot_d100)
for (p in depth_plot){
ggpS<-ggplot(p, aes(Time, TranS, color=Transf) ) +
geom_point(alpha = 1)+
scale_color_viridis(option = "D")+
scale_x_continuous(limits = c(0,1400), breaks = seq(0,1400,200))+
ggtitle('Solute Mass Transfer for depth = 20mm')
ggpS
}
But it doesn't work.
R says:
data must be a data frame, or another object coercible by fortify(), not a numeric vector.
I also don't know how to make my title dynamic and combine the plots with facet_grid. Putting everything on a single plot instead would make it difficult to distinguish the lines and to assign a legend by color, because color already represents another variable. What is a possible way to accomplish that?
Edit: Understand the question differently.
facet_grid accepts a single data.frame and uses the values of one of that frame's columns to split a chart into multiple subplots. Your question describes combining multiple charts into a single chart, which is available as a function from the cowplot library. However, if you are interested in faceting the data, here is a way to filter and facet_wrap.
Example with Iris data:
library(tidyverse)
iris %>%
filter(Sepal.Length %in% c(6.4,5.7,6.7,5.1,6.3,5)) %>% ### Your values here
ggplot(aes(Petal.Length, Petal.Width, color=Species)) +
geom_point(alpha = 1) +
scale_color_viridis_d()+ #(option = "D") + ### New function name
#scale_x_continuous(limits = c(0,1400), breaks = seq(0,1400,200))+
facet_wrap("Sepal.Length") +
# facet_grid("Sepal.Length") + ### Alternative Layout
ggtitle('Sepal Length Range')
To create a "grid" of plots with only one faceting variable, you'll actually want to use facet_wrap(). You can create your facet titles before plotting, and change the formatting of strip.text within theme() to make them look more "title-like."
library(dplyr)
library(ggplot2)
plot_node %>%
mutate(
facet = paste0("Solute Mass Transfer for Depth = ", abs(Depth), "mm")
) %>%
ggplot(aes(Time, TranS, color=Transf)) +
geom_point(alpha = 1) +
scale_color_viridis_c(option = "D") +
scale_x_continuous(limits = c(0, 1400), breaks = seq(0, 1400, 200)) +
facet_wrap(vars(facet), ncol = 2, scales = "free") +
theme_minimal() +
theme(strip.text = element_text(size = 12, face = "bold"))
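As for the error in the question's loop: c() applied to data frames does not produce a collection of data frames. It concatenates their columns into one plain list, so the for loop hands ggplot a numeric vector. The sketch below illustrates this and shows one possible alternative using split() with per-depth titles. The small plot_node here is made-up stand-in data, and the plotting step only runs if ggplot2 is installed.

```r
# Made-up stand-in for plot_node with two depths.
plot_node <- data.frame(
  Time   = rep(c(0, 100, 200), times = 2),
  Depth  = rep(c(-20, -40), each = 3),
  TranS  = c(0, 0.1, 0.2, 0, 0.05, 0.1),
  Transf = c(0, 1, 2, 0, 0.5, 1)
)

d20 <- plot_node[plot_node$Depth == -20, ]
d40 <- plot_node[plot_node$Depth == -40, ]

# c() concatenates the columns of both data frames into one plain list,
# so a for loop over it visits numeric vectors, not data frames.
flat <- c(d20, d40)
print(is.data.frame(flat[[1]]))

# split() keeps one data frame per depth, named by the depth value.
by_depth <- split(plot_node, plot_node$Depth)

# One title per subset, built from the list names.
titles <- paste0("Solute Mass Transfer for depth = ",
                 abs(as.numeric(names(by_depth))), "mm")

# Build one plot per depth, only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  plots <- Map(function(d, ttl) {
    ggplot2::ggplot(d, ggplot2::aes(Time, TranS, color = Transf)) +
      ggplot2::geom_point() +
      ggplot2::ggtitle(ttl)
  }, by_depth, titles)
}
```

A list of separate plots like this would still need a package such as cowplot to arrange them side by side, which is why the faceting approach above is usually the shorter route.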

How to get r to not remove a row in ggplot - geom_line

I'm trying to produce a graph of growth rates over time based on the following data, which has blanks in two groups.
When I try to make a growth plot of this using geom_line to join the points, there is no line for group c.
I'm just wondering if there is any way to fix this.
One option would be to get rid of the missing values, which prevent the points from being connected by the line.
Making use of the code from the answer I provided on your previous question but adding tidyr::drop_na:
Growthplot <- data.frame(
Site = letters[1:4],
July = 0,
August = c(1, -1, NA, 2),
September = c(3, 2, 3, NA)
)
library(ggplot2)
library(tidyr)
library(dplyr, warn.conflicts = FALSE)
growth_df <- Growthplot %>%
pivot_longer(-Site, names_to = "Month", values_to = "Length") %>%
mutate(Month = factor(Month, levels = c("July", "August", "September"))) %>%
drop_na()
ggplot(growth_df, aes(x = Month, y = Length, colour = Site, group = Site)) +
geom_point() +
geom_line()+
labs(color = "Site", x = "Month", y = "Growth in cm") +
theme(axis.line = element_line(colour = "black", size = 0.24))
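To see what drop_na changes, the same reshaping can be sketched in base R. As far as I understand geom_line, a missing value in the middle of a group breaks its path, so site c (missing August) is left with two isolated points. Once the NA row is removed, July and September are adjacent rows for that site and get joined. The variable names long and kept are only for illustration.

```r
Growthplot <- data.frame(
  Site = letters[1:4],
  July = 0,
  August = c(1, -1, NA, 2),
  September = c(3, 2, 3, NA)
)

months <- c("July", "August", "September")

# Base R equivalent of the pivot_longer() step: one row per site and month.
long <- data.frame(
  Site   = rep(Growthplot$Site, times = length(months)),
  Month  = factor(rep(months, each = nrow(Growthplot)), levels = months),
  Length = unlist(Growthplot[months], use.names = FALSE)
)

# Base R equivalent of drop_na() for this data: drop rows with a missing Length.
kept <- long[!is.na(long$Length), ]

# Number of remaining points per site.
print(table(kept$Site))
```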

Ordering axes and making data more presentable

I am trying to order the time and date axes on my scatter plot into epochs/time periods, for example times between 12pm-7:59pm and 9pm-11:59pm. I want to do something similar for the dates.
I am fairly new to R, so I am just looking for suggestions, to be told whether this is even possible, and maybe some alternatives :)
This is my code so far:
library(tidyverse)
accident <- read.csv("accidents.csv", header = TRUE)
# pipe the data into ggplot() once; alpha is a fixed value, so it goes outside aes()
accident %>%
ggplot() +
geom_point(mapping = aes(x = Time, y = Date), alpha = 0.5)
Thank you!
Welcome to R! Here is one set of options.
library(tidyverse)
library(lubridate)
First, simulate a dataset
accident <-
rnorm(n = 1000, mean = 1500000000, sd = 1000000) %>%
tibble(date_time = .) %>%
mutate(date_time = as.POSIXct(date_time, origin = "1970-01-01")) %>%
separate(date_time, into = c("date", "time"), sep = " ", remove = F)
Original plot:
accident %>%
ggplot()+
geom_point(aes(x=time, y=date), alpha=0.5)
Step 1: Collapse the x axis into smaller number of groups
accidents_per_trihour <-
accident %>%
mutate(hour = floor_date(date_time, unit = "hour"),
hour = as.numeric(str_sub(hour, 12,13)),
tri_hour = cut(hour, c(0, 3, 6, 9, 12, 15, 18, 21, 24), include.lowest = T)) %>%
group_by(date, tri_hour) %>%
count()
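One detail of the cut() call above: with the default right = TRUE the intervals are closed on the right, so hour 3 falls into the first bin [0,3], which then covers four hours, while the last bin (21,24] covers only hours 22 and 23. If equal three-hour bins are wanted, passing right = FALSE is one option. A minimal base R sketch, using made-up hours 0 to 23:

```r
# Every hour of the day once, as a stand-in for the extracted hour column.
hours <- 0:23
breaks <- seq(0, 24, by = 3)

# Default: right-closed intervals, [0,3], (3,6], ..., (21,24]
bins_default <- cut(hours, breaks, include.lowest = TRUE)

# Left-closed intervals, [0,3), [3,6), ..., [21,24]
bins_left <- cut(hours, breaks, right = FALSE, include.lowest = TRUE)

# Compare how many distinct hours land in each bin.
print(table(bins_default))
print(table(bins_left))
```

With right = FALSE each bin holds exactly three of the 24 hours, so dot sizes in the later plots are compared across equally wide time windows.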
Then scale dot size by number of accidents
accidents_per_trihour %>%
ggplot()+
geom_point(aes(x=tri_hour, y=date, size = n), alpha=0.5) +
labs(x = "\nTime (in three-hour groups)", y = "Day\n", size = "Accidents count")
Still not great because the y axis is too expansive. So:
Step 2: Collapse the y axis into smaller number of groups
(For your data you may need to group into months for things to start to look reasonable)
accidents_per_trihour_per_week <-
accident %>%
mutate(hour = floor_date(date_time, unit = "hour"),
hour = as.numeric(str_sub(hour, 12,13)),
tri_hour = cut(hour, c(0, 3, 6, 9, 12, 15, 18, 21, 24), include.lowest = T)) %>%
mutate(week_start = floor_date(as.Date(date), unit = "weeks"),
week = format.Date(week_start, "%Y, week %W")) %>%
group_by(week, tri_hour) %>%
count()
Should be much more readable now
We’ll improve the theme as well, just because.
if (!require(ggthemr)) devtools::install_github('cttobin/ggthemr')
ggthemr::ggthemr("flat") ## helps with pretty theming
accidents_per_trihour_per_week %>%
ggplot()+
geom_point(aes(x=tri_hour, y=week, size = n), alpha = 0.9) +
labs(x = "\nTime (in three-hour groups)", y = "Week\n", size = "Accidents count")
Could also do a tile plot
accidents_per_trihour_per_week %>%
ggplot() +
geom_tile(aes(x = tri_hour, y = week, fill = n)) +
geom_label(aes(x = tri_hour, y = week, label = n), alpha = 0.4, size = 2.5, fontface = "bold") +
labs(x = "\nTime (in three-hour groups)", y = "Week\n", fill = "Accidents count")
Created on 2021-11-24 by the reprex package (v2.0.1)

Reorder ggplot barplot x-axis by facet_wrap

Let's say I have an example data frame:
frame <-
data.frame(group = c(rep(1, 3), rep(2, 3)),
idea = c(1, 2, 3, 1, 2, 4),
value = c(10000, 5000, 50, 5000, 7500, 100),
level = sample(c("rough", "detailed"), 6, TRUE))
I'd like a barplot of values with each idea within a group ordered by its value. I can get close like this:
library(dplyr)
library(ggplot2)
top_ideas <-
frame %>%
group_by(group) %>%
arrange(group, desc(value))
frame %>%
group_by(group) %>%
mutate(idea = idea %>% factor(levels = top_ideas$idea)) %>%
ggplot(aes_string(x = "idea", y = "value", fill = "level")) +
geom_bar(stat = "identity") +
theme(legend.position = "bottom",
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
facet_wrap(~group, scales = "free")
The mutate in the final dplyr line is setting the factor levels according to their ordering in the top_ideas dataframe above that. Unfortunately, because the idea nos 1 and 2 are shared by both groups 1 and 2, the ordering is set by the first group.
What I'd like to have is the ordering of ideas in both facets independent of each group. How can I do that in a dplyr string? Am I missing something simple?
I should note that this is example data. The actual data is much larger and encompasses more groups and more ideas that are shared.
Here is a workaround:
Data:
# setting seed to make solution reproducible
set.seed(123)
frame <-
data.frame(group = c(rep(1, 3), rep(2, 3)),
idea = c(1, 2, 3, 1, 2, 4),
value = c(10000, 5000, 50, 5000, 7500, 100),
level = sample(c("rough", "detailed"), 6, TRUE))
Code:
library(dplyr)
library(tidyr)
library(ggplot2)
top_ideas <-
frame %>%
group_by(group) %>%
arrange(group, desc(value)) %>%
unite("grp_idea", group, idea, sep = "_", remove = FALSE) %>%
data.frame() %>%
mutate(grp_idea = factor(grp_idea, levels = grp_idea))
top_ideas %>%
group_by(group) %>%
ggplot(aes(x = grp_idea, y = value, fill = level)) +
geom_bar(stat = "identity") +
theme(legend.position = "bottom",
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
facet_wrap(~group, scales = "free") +
xlab("idea") +
scale_x_discrete(breaks = top_ideas$grp_idea,
labels = top_ideas$idea)
Results:
> top_ideas
  grp_idea group idea value    level
1      1_1     1    1 10000    rough
2      1_2     1    2  5000 detailed
3      1_3     1    3    50    rough
4      2_2     2    2  7500 detailed
5      2_1     2    1  5000 detailed
6      2_4     2    4   100    rough
Note:
Basically what I'm doing is to paste together group and idea variables, convert the new variable grp_idea to a factor with the desired levels, and use that as the x-axis instead of the original idea column. This ensures that the ordering of levels in each facet will not be affected by other facets since they no longer share the same levels. It is then easy enough to relabel the x-axis title and tick labels with xlab and scale_x_discrete.
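The core of the workaround does not depend on dplyr or tidyr. As a rough sketch in base R (the name ordered and the lookup vector tick_labels are made up for illustration), the combined key gets one factor level per group/idea pair, in the order the bars should appear, and a named vector maps each key back to the idea number shown on the axis:

```r
frame <- data.frame(
  group = c(1, 1, 1, 2, 2, 2),
  idea  = c(1, 2, 3, 1, 2, 4),
  value = c(10000, 5000, 50, 5000, 7500, 100)
)

# Sort by group, then by descending value within each group.
ordered <- frame[order(frame$group, -frame$value), ]

# Combined key: unique per group/idea pair, so levels are never shared
# between facets.
key <- paste(ordered$group, ordered$idea, sep = "_")
ordered$grp_idea <- factor(key, levels = key)

# Lookup from key to the label that should appear on the axis.
tick_labels <- setNames(ordered$idea, key)

print(levels(ordered$grp_idea))
print(tick_labels)
```

The tidytext package also provides reorder_within() and scale_x_reordered(), which package up a similar idea; whether that is preferable to the manual key probably depends on how many plots need it.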

Project R - Barplot of occurrences of levels

I am struggling with some plots. I have a really big data.frame with some entries. To get an overview I will work with some test data.
Let's assume the following data:
Sender <- c("ARD", "ZDF", "ARD", "ARD", "ZDF", "ZDF", "ARD")
Akz <- as.factor(c(0, 1, 1, 0, 0, 1, 1))
NAkz <- as.factor(c(1, 1, 1, 0, 0, 0, 0))
data <- data.frame(Sender, Akz, NAkz)
I want to get a (stacked) barplot grouped by the column "Person". For each person I want to show the occurrences in the columns "A" and "NA". That is, one bar represents the column "A" with 3 "0"s and 4 "1"s, and next to it another bar represents the column "NA" with 4 "0"s and 3 "1"s. It would be great to also have a legend and the total count of each level.
Thanks and all the best
Peter
PS: I found a picture that illustrates a cool barplot, but I am not able to recreate it since it works with integers and total amounts.
Your data is a bit messed up, I trust this is what you wanted to post:
data:
Person <- c("ARD", "ZDF", "ARD", "ARD", "ZDF", "ZDF", "ARD")
Akzept <- as.factor(c(0, 1, 1, 0, 0, 1, 1))
NAkzept <- as.factor(c(1, 1, 1, 0, 0, 0, 0))
df <- data.frame(Person, Akzept, NAkzept)
The key to plotting in ggplot2 is to arrange the data in long format, here achieved with the function gather:
library(tidyverse)
df %>%
gather(var, val, Akzept:NAkzept) %>%
ggplot()+
geom_bar(aes(x = interaction(var, Person), fill = val))
or perhaps:
df %>%
gather(var, val, Akzept:NAkzept) %>%
ggplot()+
geom_bar(aes(x = Person, fill = val))+
facet_wrap(~var)
with text:
df %>%
gather(var, val, Akzept:NAkzept) %>%
ggplot()+
geom_bar(aes(x = Person, fill = val))+
geom_text(stat = "count", aes(label = ..count.. , x = Person, group = val), position = "stack", vjust = 2, hjust = 0.5)+
facet_wrap(~var)
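To check the numbers that geom_bar and the stat = "count" labels display, the long format and the counts can also be sketched in base R. The names long and tab are made up for this illustration. In recent tidyr versions pivot_longer() is the recommended successor to gather(), though gather() still works.

```r
Person  <- c("ARD", "ZDF", "ARD", "ARD", "ZDF", "ZDF", "ARD")
Akzept  <- as.factor(c(0, 1, 1, 0, 0, 1, 1))
NAkzept <- as.factor(c(1, 1, 1, 0, 0, 0, 0))
df <- data.frame(Person, Akzept, NAkzept)

# Long format: one row per person and variable, as gather() would produce.
long <- data.frame(
  Person = rep(df$Person, times = 2),
  var    = rep(c("Akzept", "NAkzept"), each = nrow(df)),
  val    = c(as.character(df$Akzept), as.character(df$NAkzept))
)

# Counts of each level per person and variable: these are the bar
# segment heights in the faceted plot.
tab <- table(Person = long$Person, var = long$var, val = long$val)
print(ftable(tab))
```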
