I am trying to plot several line graphs from one dataset, each with a different y-axis variable taken from the list var. Inside the for loop every plot comes out empty, but the same code works when I run it without the loop. Can I use a for loop to produce all the plots at once instead of plotting each one individually? Am I missing something?
var = list("CAH", 'CTLT',"CI","DVA","HSIC","HOLX","HUM","IDXX","INCY","MRK","REGN","VTRS")
for(v in var){
p<-closing_price %>%
mutate(date = as.Date(Date,"%m/%d/%y")) %>%
ggplot(aes(x = date, y = v, group=1)) +
geom_line() +
labs(x = "Date", y = "Stock Price ($)", title = glue("{v} price over time")) +
scale_x_date(date_minor_breaks = "2 day")
print(p)
}
This is the outcome I am looking for:
closing_price %>%
mutate(date = as.Date(Date,"%m/%d/%y")) %>%
select(date,CAH) %>%
ggplot(aes(x = date, y = CAH, group=1)) +
geom_line() +
labs(x = "Date", y = "Stock Price ($)", title = "CAH price over time") +
scale_x_date(date_minor_breaks = "2 day")
As noted by stefan, ggplot treats v as a literal character string rather than a column name, which is why you're getting "CAH" on the y-axis. You can refer to the column whose name is stored in v with .data[[v]].
var = list("CAH", 'CTLT',"CI","DVA","HSIC","HOLX","HUM","IDXX","INCY","MRK","REGN","VTRS")
for(v in var){
p<-closing_price %>%
mutate(date = as.Date(Date,"%m/%d/%y")) %>%
ggplot(aes(x = date, y = .data[[v]], group=1)) +
geom_line() +
labs(x = "Date", y = "Stock Price ($)", title = glue("{v} price over time")) +
scale_x_date(date_minor_breaks = "2 day")
print(p)
}
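To see the difference concretely, here is a small sketch (the prices data frame and its values are made up for illustration, they are not the asker's data). It builds one plot with y = v and one with y = .data[[v]] and inspects the y values ggplot actually computes for the layer:

```r
library(ggplot2)

set.seed(42)  # illustrative data only
prices <- data.frame(
  date = seq(as.Date("2020-01-01"), by = "month", length.out = 12),
  CAH  = rnorm(12, mean = 10, sd = 2)
)
v <- "CAH"

# y = v maps every row to the same constant string "CAH"
p_const <- ggplot(prices, aes(x = date, y = v, group = 1)) + geom_line()

# y = .data[[v]] looks up the column whose name is stored in v
p_col <- ggplot(prices, aes(x = date, y = .data[[v]], group = 1)) + geom_line()

# layer_data() returns the data ggplot computed for a layer
y_const <- layer_data(p_const)$y
y_col   <- layer_data(p_col)$y

length(unique(y_const))  # a single distinct value: the line is flat
length(unique(y_col))    # one value per row of the CAH column
```

With the constant mapping every point sits at the same height, so the line carries no information even though the plot itself is drawn.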
Since I don't have access to your data, here's a reproducible example:
library(tidyverse)
library(zoo)
closing_price <-
tibble(
dt = as.Date(seq(as.yearmon("2020-01-31"), as.yearmon("2020-12-31"), by = 1 / 12), frac = 1),
CAH = rnorm(12, mean = 10, sd = 2),
CTLT = rnorm(12, mean = 50, sd = 5),
CI = rnorm(12, mean = 25, sd = 2)
)
vars <- c("CAH", "CTLT","CI")
for(v in vars){
p <-
ggplot(data = closing_price, aes(x = dt, y = .data[[v]])) +
geom_line() +
labs(x = "Date", y = "Stock Price ($)", title = paste(v, "price over time")) +
scale_x_date(date_minor_breaks = "2 day")
print(p)
}
You could also use lapply for this by wrapping your plot in a function instead. The function may be useful depending on how often you make these plots or to make it easier to fix and share.
closing_price_plot <- function(data, var){
p <-
data %>%
ggplot(aes(x = dt, y = .data[[var]], group=1)) +
geom_line() +
labs(x = "Date", y = "Stock Price ($)", title = paste(var, "price over time")) +
scale_x_date(date_minor_breaks = "2 day")
return(p)
}
## prints out the plots as in the loop
lapply(vars, FUN = closing_price_plot, data = closing_price)
## save them to a list instead
p <- list()
p <- lapply(vars, FUN = closing_price_plot, data = closing_price)
## view the first plot
p[[1]]
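If you keep the plots in a list, it can be handy to name the elements so you can pull out a plot by ticker instead of by position. A minimal sketch, assuming the same closing_price_plot() function as above and a made-up closing_price table; lapply() keeps the names of its input, so naming vars is enough:

```r
library(ggplot2)

set.seed(1)  # illustrative data only
closing_price <- data.frame(
  dt   = seq(as.Date("2020-01-01"), by = "month", length.out = 12),
  CAH  = rnorm(12, mean = 10, sd = 2),
  CTLT = rnorm(12, mean = 50, sd = 5),
  CI   = rnorm(12, mean = 25, sd = 2)
)
vars <- c("CAH", "CTLT", "CI")

closing_price_plot <- function(data, var) {
  ggplot(data, aes(x = dt, y = .data[[var]], group = 1)) +
    geom_line() +
    labs(x = "Date", y = "Stock Price ($)", title = paste(var, "price over time"))
}

# setNames(vars, vars) gives each element its own name,
# and lapply() carries those names over to the result
plots <- lapply(setNames(vars, vars), closing_price_plot, data = closing_price)

names(plots)
plots[["CTLT"]]  # look a plot up by ticker
```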
I have the following data:
library(ggplot2)
library(gganimate)
library(tidyverse)
createData<- function(vintage, id){#create data
# Generate a sequence of dates from 2010-01-01 to 2025-12-31 with a quarterly frequency
Dates <- seq(from = as.Date("2010-01-01"), to = as.Date("2025-12-31"), by = "quarter")
RLG<- cumsum(sample(c(-1, 1), 64, TRUE))
df<- data.frame( Dates,RLG, vintage,id)
return(df)
}
#createData
df<- createData("2018-01-01",1) %>%
rbind(createData("2019-01-01",2))%>%
rbind(createData("2020-01-01",3)) %>%
rbind(createData("2021-01-01",4))%>%
rbind(createData("2022-01-01",5))%>%
rbind(createData("2023-01-01",6))%>%
rbind(createData("2024-01-01",7))%>%
rbind(createData("2025-01-01",8))
Which I use to make the following chart:
options(gganimate.nframes = 8*length(unique(df$vintage)), gganimate.res = 30)
p<- ggplot(df) +
aes(x = Dates, y = RLG, group = as.Date(vintage), colour = "RLG") +
geom_line()+
scale_y_continuous(labels = \(x) paste0(x, "%"))+
theme(axis.title = element_blank(),legend.position="none")+
transition_time(id)+
exit_fade(alpha = 0.5)+
shadow_mark(alpha = 0.2)
animate(p, end_pause = 30)
I would like to add a geom_rect which goes from vintage to max(Dates). At each frame, vintage will increase, so the geom_rect will shrink slightly. How can I do this without interfering with the shadow_mark and exit_fades which I am applying to the lines?
If you mean something like a progress bar you could do it like so:
Create a data frame for the geom that is a subset of the original, keeping the last row of each id:
df_geom <- df |>
mutate(vintage = as.Date(vintage)) |>
group_by(id) |>
slice(n())
Use geom_segment with the DF from above.
If you want to leave shadow_mark in you can do shadow_mark(exclude_layer = 2).
p <- ggplot(df) +
aes(x = Dates, y = RLG, group = as.Date(vintage), colour = RLG) +
geom_line()+
scale_y_continuous(labels = \(x) paste0(x, "%"))+
theme(axis.title = element_blank(),legend.position="none") +
geom_segment(
data = df_geom,
mapping = aes(x=vintage, xend=Dates,
y = 18, yend = 18),
size = 10, alpha =.4, color ='lightblue'
) +
transition_time(id)+
exit_fade(alpha = 0.5)
# shadow_mark(alpha = 0.2)
animate(p)
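As a quick sanity check on the helper data frame, here is a sketch (with only three vintages and made-up values, to keep it short) showing that slice(n()) leaves one row per id and that this row carries the last date, so the segment runs from vintage to max(Dates):

```r
library(dplyr)

set.seed(1)  # illustrative data only
dates <- seq(as.Date("2010-01-01"), as.Date("2025-12-31"), by = "quarter")

# Same shape as the data in the question, but with three vintages
df <- do.call(rbind, lapply(1:3, function(i) {
  data.frame(
    Dates   = dates,
    RLG     = cumsum(sample(c(-1, 1), length(dates), TRUE)),
    vintage = sprintf("%d-01-01", 2017 + i),
    id      = i
  )
}))

df_geom <- df |>
  mutate(vintage = as.Date(vintage)) |>
  group_by(id) |>
  slice(n()) |>   # last row of each id, i.e. the latest date
  ungroup()

df_geom[, c("id", "vintage", "Dates")]
```

Each frame of the animation then has exactly one segment to draw, starting at that frame's vintage.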
I have been trying to plot time series data in R. I consulted several different sources online and I am still having problems creating these plots. I have simulated some data below that represents daily information a fictitious company received from 2014 to 2016:
#create data
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
car_damages_in_dollars <- rnorm(731,105,8)
other_damages_in_dollars <- rnorm(731,104,9)
location <- c("canada","usa")
location <- sample(location, 731, replace=TRUE, prob=c(0.3, 0.7))
type_of_house <- c("single","townhome", "rental" )
type_of_house<- sample(type_of_house , 731, replace=TRUE, prob=c(0.5, 0.3, 0.2))
response_variable <- c("claim_approved","claim_rejected")
response_variable<- sample(response_variable, 731, replace=TRUE, prob=c(0.4, 0.6))
final_dataset <- cbind(date_decision_made, property_damages_in_dollars, car_damages_in_dollars, other_damages_in_dollars, location, type_of_house, response_variable)
final_dataset <- as.data.frame(final_dataset)
final_dataset$other_damages_in_dollars = as.numeric(final_dataset$other_damages_in_dollars)
final_dataset$property_damages_in_dollars = as.numeric(final_dataset$property_damages_in_dollars)
final_dataset$car_damages_in_dollars = as.numeric(final_dataset$car_damages_in_dollars)
prop_damage <-subset(final_dataset, select = c(date_decision_made, property_damages_in_dollars))
car_damage <-subset(final_dataset, select = c(date_decision_made, car_damages_in_dollars))
other_damage <-subset(final_dataset, select = c(date_decision_made, other_damages_in_dollars))
new <-subset(final_dataset, select = c(date_decision_made, property_damages_in_dollars, car_damages_in_dollars, other_damages_in_dollars))
Based on this data, I have tried to plot this data as a time series in R. I tried several methods and all of them are producing errors. I have tried to resolve these problems but I just can't seem to figure them out. Could someone please help me?
#first way (error)
library(ggplot2)
library(reshape2)
library(dplyr)
ggplot() + geom_line(data = prop_damage, aes(x = date_decision_made, y = property_damages_in_dollars, group = 1), color = "red")
+ scale_x_date(date_breaks = "days" , date_labels = "%b %d %a")+
geom_line(data = car_damage, aes(x = date_decision_made, y = car_damages_in_dollars, group =1 ), color = "blue")
+ geom_line(data = other_damage, aes(x = date_decision_made, y = other_damages_in_dollars, group =1), color = "green")
+ xlab('data_date') + ylab('percent.change')
#second way (error)
ggplot(data = new, aes(x = date_decision_made)) +
geom_line(aes(y = property_damages_in_dollars, colour = "property_damages_in_dollars")) +
geom_line(aes(y = car_damages_in_dollars, colour = "car_damages_in_dollars")) +
geom_line(aes(y =other_damages_in_dollars, colour = "other_damages_in_dollars")) +
scale_colour_manual("",
breaks = c("property_damages_in_dollars", "car_damages_in_dollars", "other_damages_in_dollars"),
values = c("red", "green", "blue")) +
xlab(" ") +
scale_y_continuous("Dollars", limits = c(0,10000)) +
labs(title="demo graph")
#3rd way error
##Subset the necessary columns
dd_sub = new[,c(1,2,3,4)]
##Then rearrange your data frame
library(reshape2)
dd = melt(dd_sub, id=c("date_decision_made"))
ggplot(dd) + geom_line(aes(x=date_decision_made, y=value, colour=variable, group=1)) + scale_x_date(date_breaks = "days" , date_labels = "%b %d %a")+ scale_colour_manual(values=c("red","green","blue"))
#4th error
mymts = ts(new,
frequency = 1,
start = c(2014, 1))
autoplot(mymts) +
ggtitle("Time Series Plot") +
theme(plot.title = element_text(hjust = 0.5))
#5th Method error
x1 = ts(prop_damage, frequency = 1, start = c(2014,1))
x2 = ts(other_damage, frequency = 1, start = c(2014,1))
ts.plot(x, y, gpars = list(col = c("black", "red")))
ts.plot(date_decision_made,gpars= list(col=rainbow(10)))
#6th method error
##Subset the necessary columns
dd_sub = new[,c(1,2,3,4)]
##Then rearrange your data frame
library(reshape2)
dd = melt(dd_sub, id=c("date_decision_made"))
qplot(date_decision_made,value,data=dd,geom='line',color=variable)
#7th way error
x1 = ts(prop_damage, frequency = 1, start = c(2014,1))
x2 = ts(other_damage, frequency = 1, start = c(2014,1))
comb_ts <- cbind(x1, x2)
plot.ts(comb_ts, plot.type = "single")
Could someone please show me what I am doing wrong in these codes? Thanks
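Try this approach, and be careful with the dates: convert date_decision_made to a Date before using scale_x_date. Because the data span several years, daily breaks would produce far too many axis labels, so use monthly breaks instead: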
#Data
##Then rearrange your data frame
library(reshape2)
dd = melt(dd_sub, id=c("date_decision_made"))
dd$date_decision_made <- as.Date(as.character(dd$date_decision_made),'%Y/%m/%d')
#Plot
ggplot(dd) +
geom_line(aes(x=date_decision_made, y=value, colour=variable, group=variable))+
scale_x_date(date_breaks = "months", date_labels = "%b %d %a")+
scale_colour_manual(values=c("red","green","blue"))+
theme(axis.text.x = element_text(angle=90))
Another option is to use pivot_longer from tidyr
library(dplyr)
library(tidyr)
library(ggplot2)
library(lubridate)
dd_sub %>%
pivot_longer(cols = -date_decision_made) %>%
mutate(date_decision_made = ymd(date_decision_made)) %>%
ggplot() +
geom_line(aes(x = date_decision_made, y=value, colour = name, group = name)) +
scale_x_date(date_breaks = "months", date_labels = "%b %d %a")+
scale_colour_manual(values=c("red","green","blue"))+
theme(axis.text.x = element_text(angle = 90))
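Both answers convert date_decision_made back to a Date. The reason this is needed is that cbind() on vectors builds a matrix, and a matrix can hold only one type, so every column becomes character. A short sketch with made-up values (the column names date and value are just for illustration) comparing that with building the data frame directly:

```r
set.seed(1)  # illustrative data only
d <- seq(as.Date("2014-01-01"), by = "day", length.out = 5)
x <- rnorm(5, mean = 100, sd = 10)

# cbind() on vectors returns a matrix; mixing text and numbers makes it character
m   <- cbind(date = format(d, "%Y/%m/%d"), value = x)
bad <- as.data.frame(m)
str(bad)   # neither column is numeric or Date any more

# data.frame() keeps each column's own type
good <- data.frame(date = d, value = x)
str(good)  # date is a Date, value is numeric
```

With a data frame built like good, scale_x_date works without any conversion step.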
I am trying to create a plot to track results over days for multiple factors. Ideally the x-axis would be Day, with the day number centered under the reps for that particular day, the y-axis would be Result, and the facets would be the Lot (1-4). I am having difficulty centering the day label along the bottom in a repeatable way, as the number of reps may vary.
I was using ideas shown in this post: Multi-row x-axis labels in ggplot line chart but have been unable to make any progress.
Here is some code I have been using and the plot that I have so far. The x axis is far too busy and I am trying to consolidate it.
data <- data.frame(System = rep(c("A", "B"), each = 120), Lot = rep(1:4, each = 30),
Day = rep(1:5, each = 6), Rep = rep(1:6, 40), Result = rnorm(240))
library(ggplot2)
ggplot(data, aes(x = interaction(Day, Rep, lex.order = TRUE), y = Result, color = System, group = System)) +
geom_point() +
geom_line() +
theme(legend.position = "bottom") +
facet_wrap(~Lot, ncol = 1) +
geom_vline(xintercept = (which(data$Rep == 1 & data$Day != 1)), color = "gray60")
I'm not 100% sure if this is exactly what you are after but this will center the day on the x-axis.
library(dplyr)
library(tidyr)
library(ggplot2)
df <- data.frame(System = rep(c("A", "B"), each = 120), Lot = rep(1:4, each = 30),
Day = rep(1:5, each = 6), Rep = rep(1:6, 40), Result = rnorm(240))
df <- df %>%
unite(Day_Rep, Day, Rep, sep = ".", remove = F) %>%
mutate(Day_Rep = as.numeric(Day_Rep))
ggplot(df, aes(x = Day_Rep, y = Result, color = System, group = System)) +
geom_point() +
geom_line() +
theme(legend.position = "bottom") +
facet_wrap(~Lot, ncol = 1) +
scale_x_continuous(labels = unique(df$Day), breaks = unique(df$Day) + 0.5)+
geom_vline(xintercept = setdiff(unique(df$Day), 1))
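Since the number of reps may vary, one option is to compute each day's label position from the data instead of hard-coding an offset. This is only a sketch: the centers table and its mid column are names I made up, and it assumes fewer than 10 reps per day (with 10 or more, "1.10" would collapse onto "1.1" when converted to numeric).

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # illustrative data only
df <- data.frame(
  System = rep(c("A", "B"), each = 30),
  Day    = rep(1:5, each = 6),
  Rep    = rep(1:6, 10),
  Result = rnorm(60)
)

# Same Day.Rep encoding as above: day 1, rep 3 becomes 1.3
df <- df %>% mutate(Day_Rep = as.numeric(paste(Day, Rep, sep = ".")))

# One label position per day: the mean of that day's x positions
centers <- df %>%
  group_by(Day) %>%
  summarise(mid = mean(Day_Rep), .groups = "drop")

p <- ggplot(df, aes(x = Day_Rep, y = Result, color = System, group = System)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = centers$mid, labels = centers$Day) +
  theme(legend.position = "bottom")
p
```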
I am looking for an R implementation (possibly an HTML widget built on JavaScript) of a stacked bar chart in ribbon style, which lets you see how the ranking of each category changes over time.
It looks like the ribbon chart in Power BI Desktop.
Searching rseek.org gave no results.
First off: Not a fan of that ribbon-styled stacked bar chart at all; while colourful and stylish, it's difficult to synthesise the relevant information. But that's just my opinion.
You could try building a similar plot in ggplot2 using geom_ribbon. See below for a minimal example:
# Sample data
set.seed(2017);
one <- sample(5:15, 10);
two <- rev(one);
df <- cbind.data.frame(
x = rep(1:10, 2),
y = c(one, two),
l = c(one - 1, two - 1),
h = c(one + 1, two + 1),
id = rep(c("one", "two"), each = 10));
require(ggplot2);
ggplot(df, aes(x = x, y = y)) +
geom_ribbon(aes(ymin = l, ymax = h, fill = id), alpha = 0.4) +
scale_fill_manual(values = c("#E69F00", "#56B4E9"));
If you need interactivity, you could wrap it inside plotly::ggplotly.
Another option is the ggsankey package.
In the following, you can use the smooth argument of geom_sankey_bump to control the look and feel of the chart, as in the ribbon chart of Power BI.
df <- data.frame (model = c("A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J","A","B","C","D","E","F","G","H","I","J"),
Year = c(2015,2015,2015,2015,2015,2015,2015,2015,2015,2015,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2018,2018,2018,2018,2018,2018,2018,2018,2018,2018),
sales = c(450,678,456,344,984,456,234,244,655,789,234,567,234,567,232,900,1005,1900,450,345,567,235,456,345,144,333,555,777,111,444,222,223,445,776,331,788,980,1003,456,434))
#install.packages("remotes")
#remotes::install_github("davidsjoberg/ggsankey")
library(ggsankey)
library(tidyverse)
ggplot(df, aes(x = Year,
node = model,
fill = model,
value = sales)) +
geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 15) +
scale_fill_viridis_d(option = "A", alpha = .8) +
theme_sankey_bump(base_size = 16) +
labs(x = NULL,
y = "Sales ($ ths)",
fill = "Model",
color = NULL) +
theme(legend.position = "bottom") +
labs(title = "Sales per model per year")
Following a suggestion in the comments, I tried replicating some of the features of the Power BI chart.
# Prepare some data
set.seed(1)
df <- data.frame(
occupation = rep(c("Clerical", "Management", "Manual", "Professional", "Skilled"), 12),
Month = factor(rep(month.abb, 5), levels = month.abb, ordered = TRUE),
Sales = sample(200:1000, 60, replace = TRUE)
)
df %>%
group_by(Month) %>%
mutate(Max = sum(Sales)) %>%
ungroup() %>%
mutate(Max = max(Sales)) %>%
ggplot(aes(x = Month,
node = occupation,
fill = occupation,
value = Sales)) +
geom_col(aes(x = Month, y = Max/1.2),
alpha = 0.5,
fill = 'grey',
width = 0.4) +
geom_sankey_bump(space = 15,
type = "alluvial",
color = "transparent",
smooth = 8,
alpha = 0.8) +
scale_fill_brewer(palette = "Set3") +
theme_minimal() +
labs(x = NULL,
y = "Sales ($ ths)",
fill = "Occupation",
color = NULL) +
theme(legend.position = "top") +
labs(title = "Sales per occupation per month")
Created on 2022-07-07 by the reprex package (v2.0.1)
You may find what you need in the ggalluvial package.
https://cran.r-project.org/web/packages/ggalluvial/vignettes/ggalluvial.html
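As a rough starting point, here is a sketch along the lines of the time-series example in that vignette. The sales data frame is made up, and the choice of alpha and decreasing = FALSE is mine, so treat it as an assumption about what looks closest to a ribbon chart rather than a recipe:

```r
library(ggplot2)
library(ggalluvial)

# Illustrative data only: every model has a value in every year
sales <- data.frame(
  model = rep(c("A", "B", "C"), times = 4),
  Year  = rep(2015:2018, each = 3),
  sales = c(450, 678, 456, 234, 567, 900, 567, 235, 456, 222, 776, 445)
)

p <- ggplot(sales, aes(x = Year, y = sales, alluvium = model)) +
  # one ribbon per model; decreasing = FALSE orders ribbons by value at each year
  geom_alluvium(aes(fill = model), alpha = 0.75, decreasing = FALSE) +
  labs(x = NULL, y = "Sales", fill = "Model") +
  theme_minimal()
p
```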
I have plotted two facets one on top of the other with two different ys (a percentage and a cost) and the same x (Years). I took most of the ideas from this post and some variations of the same.
I'd like to show the labels of the y axis as percentages for the rate and as £ for the costs, but I have been unable to change each y label format independently.
Below is a reproducible example using facet_grid (I managed to create a similar thing with facet_wrap but I get stuck with the same problem).
I considered using grid.arrange() from the gridExtra package, but it seemed that would bring other issues with the legend.
library(plyr)
library(tidyr)
library(dplyr)
library(ggplot2)
library(scales)
set.seed(12345)
my_labels <- function(variable, value){
names_li <- list("percentage", "cost in pounds")
return(names_li[value])
}
df <- data.frame(
rate = runif(10, 0, 1),
cost = rnorm(10, 100, 40),
years = seq(from = 2001, to = 2010)
)
df %>%
gather(type_of_var,
value,
rate:cost) ->
df2
df2 %>%
ggplot(aes(x = years,
y = value,
ymin = 0,
ymax = .1)) +
facet_grid(type_of_var ~ .,
scales = 'free_y',
labeller = my_labels) +
labs(x = "Year",
y = "") +
geom_point(subset = . (type_of_var == "rate")) +
geom_line(subset = . (type_of_var == "rate"),
colour = "grey") +
## the following two lines don't work
# scale_y_continuous(subset = . (type_of_var == "rate"),
# labels = percent) +
geom_bar(subset = . (type_of_var == "cost"),
stat = "identity") +
theme_bw() +
theme(strip.text.y = element_text(size = 15,
colour = "black"),
plot.title = element_text(lineheight = 0.8,
face = "bold")) +
scale_x_continuous(breaks = seq(2001, 2010, 1)) +
labs(title = "free_y y axis labels")
Thanks
As a fragile workaround, you could use a label function that picks the format from the range of the break values:
label_fun <- function (x) {
if(max(x, na.rm=TRUE) > 1) dollar(x, prefix = "£") else percent(x)
}
(assuming you only deal with big money and small percentages)
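For completeness, here is a sketch of how such a function could be plugged in. It assumes a current ggplot2, where the subset = .() syntax from the question is replaced by passing a function to data, and it uses prefix = "\u00a3" (the pound sign) so the cost panel is labelled in pounds. With scales = "free_y" each panel gets its own breaks, and the label function is called on each set separately:

```r
library(ggplot2)
library(scales)

label_fun <- function(x) {
  # breaks above 1 are treated as money, everything else as a rate
  if (max(x, na.rm = TRUE) > 1) dollar(x, prefix = "\u00a3") else percent(x)
}

set.seed(12345)  # illustrative data only
df2 <- data.frame(
  years       = rep(2001:2010, times = 2),
  type_of_var = rep(c("rate", "cost"), each = 10),
  value       = c(runif(10, 0, 1), rnorm(10, 100, 40))
)

p <- ggplot(df2, aes(x = years, y = value)) +
  # a function passed to data filters the plot data for that layer
  geom_point(data = function(d) subset(d, type_of_var == "rate")) +
  geom_line(data = function(d) subset(d, type_of_var == "rate"), colour = "grey") +
  geom_col(data = function(d) subset(d, type_of_var == "cost")) +
  facet_grid(type_of_var ~ ., scales = "free_y") +
  scale_y_continuous(labels = label_fun) +
  scale_x_continuous(breaks = 2001:2010) +
  theme_bw()
p
```

The same caveat applies: if every cost break were 1 or below, that panel would be labelled as a percentage.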