Issues with multiple y axis ranges in plotly in R - r

I am attempting to plot multiple lines on the same graph in plotly. The problem is that for every variable being plotted, plotly creates a new set of y-axis values. Can this be solved? I want the same y axis for all the line plots that I create. Following is the code and the plot generated.
p1 <- plot_ly(data = st_data, x = ~Date) %>%
  add_lines(y = ~Close, name = "Close") %>%
  add_lines(y = ~Bollinger, name = "Bollinger")
In the graph the y axis has values ranging once from 61.85 to 65.90 and again from 62.15 to 65.49.
Ideally I am looking for the y axis values to be between 61.85 and 65.90 and the two lines plotted on the same axis.
Adding the input data:
Date Close Bollinger
1/30/2015 9:34 65.55 NA
1/30/2015 9:34 65.43 NA
1/30/2015 9:35 65.52 NA
1/30/2015 9:35 65.37 NA
1/30/2015 9:36 65.68 65.184
1/30/2015 9:36 65.4 65.303
1/30/2015 9:36 65.51 65.4155
1/30/2015 9:36 65.8 65.499
1/30/2015 9:36 65.6 65.548

Yes, your code should already work. I think sjakw is correct in that you have some other code that creates a problem. Try opening a new script, and paste the following code. You should get a plot with a single y axis.
library(data.table)
library(plotly)
st_data <- fread('Date , Close, Bollinger
1/30/2015 9:34, 65.55, NA
1/30/2015 9:34, 65.43, NA
1/30/2015 9:35, 65.52, NA
1/30/2015 9:35, 65.37, NA
1/30/2015 9:36, 65.68, 65.184
1/30/2015 9:36, 65.4 , 65.303
1/30/2015 9:36, 65.51, 65.4155
1/30/2015 9:36, 65.8 , 65.499
1/30/2015 9:36, 65.6 , 65.548 ')
p1 <- plot_ly(data = st_data, x = ~Date) %>% add_lines(y = ~Close,name = "Close") %>%
add_lines(y=~Bollinger,name="Bollinger")
p1
I like the following approach better.
p2 <- plot_ly()
p2 <- add_lines(p2, data = st_data, x = ~Date, y = ~Close, name = "Close")
p2 <- add_lines(p2, data = st_data, x = ~Date, y = ~Bollinger, name = "Bollinger")
p2
Your data is in "wide" format. You can use similar code to the R Plotly Book if you melt your data into "long" format:
# variable.factor = TRUE makes PriceType a factor; Price stays numeric
st_data_long <- melt(st_data, id.vars = "Date", measure.vars = c("Close", "Bollinger"),
variable.factor = TRUE, variable.name = "PriceType", value.name = "Price")
p3 <- plot_ly(st_data_long, x = ~Date, y = ~Price) %>%
add_lines(color = ~PriceType)
p3
I also tried it with a sample dataset included in R:
# First make Date one column
airquality <- data.table(airquality)
airquality[, Date := do.call(paste, .SD), .SDcols = c("Month", "Day")]
p4 <- plot_ly()
p4 <- add_lines(p4, data = airquality, x = ~Date, y = ~Ozone, name = "Ozone")
p4 <- add_lines(p4, data = airquality, x = ~Date, y = ~Temp, name = "Temp")
p4
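If the code above works in a fresh script but your own plot still shows the y values listed twice, one thing worth checking is the column types. This is only a guess, since the question does not show how st_data was read in: when Close and Bollinger are stored as character or factor, plotly treats each value as a category rather than a number, which could produce the repeated axis values described. A minimal sketch with made-up text columns:

```r
# Hypothetical data: numbers stored as text, as can happen after an import
st_data <- data.frame(
  Date      = c("1/30/2015 9:36", "1/30/2015 9:37", "1/30/2015 9:38"),
  Close     = c("65.68", "65.4", "65.51"),
  Bollinger = c("65.184", "65.303", "65.4155"),
  stringsAsFactors = FALSE
)

# Inspect the types; "chr" or "Factor" here would mean a categorical axis
str(st_data)

# Convert to numeric; as.character() first so factors convert by label
num_cols <- c("Close", "Bollinger")
st_data[num_cols] <- lapply(st_data[num_cols],
                            function(x) as.numeric(as.character(x)))
str(st_data)
```

After the conversion, both traces should share one continuous y axis.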

Related

Plotting/Mutating Data on R

I've been trying to plot data that has been mutated from nominal levels into quarterly growth rates.
i.e. the original dataset was
Date GDP Level
2010Q1 457
2010Q2 487
2010Q3 538
2010Q4 589
2011Q1 627
2011Q2 672.2
2011Q3 716.4
2011Q4 760.6
2012Q1 804.8
2012Q2 849
2012Q3 893.2
2012Q4 937.4
This was in an Excel file, which I imported using
dataset <- read_excel("xx")
Then I did the following to mutate it to quarter-on-quarter growth ("QoQ Growth"):
dataset %>%
mutate(`QoQ Growth` = `GDP Level` / lag(`GDP Level`, n = 1) - 1)
I would now like to plot this % growth across time, but I'm not sure what the geom_line code is for a mutated variable. Any help would be truly appreciated! I'm quite new to R and really trying to learn, thanks!
Something like this?
library(tidyverse)
df %>%
mutate(QoQGrowth = (GDPLevel) / lag(GDPLevel, n=1) - 1) %>%
ggplot(aes(factor(Date), QoQGrowth, group=1)) +
geom_line()
Output
Data
df <- structure(list(Date = c("2010Q1", "2010Q2", "2010Q3", "2010Q4",
"2011Q1", "2011Q2", "2011Q3", "2011Q4", "2012Q1", "2012Q2", "2012Q3",
"2012Q4"), GDPLevel = c(457, 487, 538, 589, 627, 672.2, 716.4,
760.6, 804.8, 849, 893.2, 937.4)), class = "data.frame", row.names = c(NA,
-12L))
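To make the growth formula concrete, here is a small base R sketch using only the first four GDP levels from the question. It does the same arithmetic as the mutate() call, without dplyr, and shows why the first quarter has no growth value (there is no previous quarter to divide by).

```r
gdp <- c(457, 487, 538, 589)  # first four GDP levels from the question

# Growth = current level / previous level - 1.
# The first quarter has no previous value, so it is NA, which is also
# what lag() yields for the first row.
growth <- c(NA, gdp[-1] / gdp[-length(gdp)] - 1)

print(round(growth, 4))
```

That leading NA is the one point geom_line() cannot draw.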
Package zoo defines an S3 class "yearqtr" and has a function to handle quarterly dates, as.yearqtr. Combined with ggplot2's scale_x_date, the formatting of quarterly axis labels becomes easier.
dataset <- read.table(text = "
Date 'GDP Level'
2010Q1 457
2010Q2 487
2010Q3 538
2010Q4 589
2011Q1 627
2011Q2 672.2
2011Q3 716.4
2011Q4 760.6
2012Q1 804.8
2012Q2 849
2012Q3 893.2
2012Q4 937.4
", header = TRUE, check.names = FALSE)
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(zoo))
library(ggplot2)
dataset %>%
mutate(Date = as.yearqtr(Date, format= "%Y Q%q"),
Date = as.Date(Date)) %>%
mutate(`QoQ Growth` = `GDP Level` / lag(`GDP Level`, n = 1) - 1) %>%
ggplot(aes(Date, `QoQ Growth`)) +
geom_line() +
scale_x_date(date_breaks = "3 months", labels = as.yearqtr) +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
#> Warning: Removed 1 row(s) containing missing values (geom_path).
Created on 2022-03-08 by the reprex package (v2.0.1)
Convert dataset to a zoo object z, use diff.zoo to get the growth, QoQ Growth, and then use autoplot.zoo with scale_x_yearqtr.
library(zoo)
library(ggplot2)
z <- read.zoo(dataset, FUN = as.yearqtr)
`QoQ Growth` <- diff(z, arith = FALSE) - 1
autoplot(`QoQ Growth`) +
scale_x_yearqtr(format = "%YQ%q", n = length(`QoQ Growth`)) +
xlab("")

Plotting time series with time of day on the horizontal axis

Hi there, I have some working code that selects data from a single station and plots it as a time series. The data contains a date-time in the format:
28 11AC068 2018-08-30T02:15:00-06:00
29 11AC068 2018-08-30T02:20:00-06:00
file = "http://dd.weather.gc.ca/hydrometric/csv/SK/hourly/SK_hourly_hydrometric.csv"
skdat <- read.csv(file, head=T, sep=",", dec=".")
skdate <- skdat
colnames(skdat) <- c("ID", "Date", "Water.Level", "Grade.1", "Symbol.1",
"QA/QC-1", "Discharge/Debit", "Grade.2", "Symbol.2",
"QA/QC-2")
#There are 151 Factors of ID
str(skdat$ID)
skdat$Date <- as.Date(skdat$Date, "%h/%m")
#"05AH050","05EF001"#,..: 151 151 151 151 151 151 151 151 151 151 ...
plot.ts(subset(skdat, skdat$ID=='05EF001')$Water.Level, main="Plot TS of ID = 05EF001")
axis.Date(1, at=seq(min(skdat$Date), max(skdat$Date), by="hour"), format="%h-%m")
In the subset the date-time is filtered out. Is there any way to keep that column in the data and use it to plot the horizontal axis as just hour and minute?
You could try something like this.
library(tidyverse)
file = "http://dd.weather.gc.ca/hydrometric/csv/SK/hourly/SK_hourly_hydrometric.csv"
skdat <- read.csv(file, head=T, sep=",", dec=".", stringsAsFactors = F)
colnames(skdat) <- c("ID", "Date", "Water.Level", "Grade.1", "Symbol.1",
"QA/QC-1", "Discharge/Debit", "Grade.2", "Symbol.2",
"QA/QC-2")
skdat %>% filter(ID=='05EF001') %>%
mutate(Date = gsub("-06:00$", "", Date) %>% lubridate::parse_date_time(., orders = "ymd HMS")) %>%
ggplot(aes(Date, Water.Level))+
geom_line()+
scale_x_datetime(breaks = "4 hours", date_labels = "%H:%M")
Created on 2018-09-01 by the reprex package (v0.2.0).
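If you would rather not depend on lubridate, the same timestamps can probably be handled in base R as well. The sketch below uses the two example timestamps from the question. It relies on strptime-style parsing stopping at the end of the format string, so the trailing "-06:00" is dropped rather than applied. The tz = "UTC" argument is an assumption made only to keep the clock times unchanged; it is not the station's real time zone.

```r
# Example timestamps in the format shown in the question
stamps <- c("2018-08-30T02:15:00-06:00", "2018-08-30T02:20:00-06:00")

# Parse date and clock time; characters after the seconds are ignored.
# tz = "UTC" is just a label here (assumption), so no shift is applied.
parsed <- as.POSIXct(stamps, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Hour:minute labels, e.g. for an axis
labels <- format(parsed, "%H:%M")
print(labels)
```

The parsed column can then be kept in the data frame and used as the x variable.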

Multi-Color Line Plot in R Plotly based on Y Value

I would like to create a Plotly graph in R that is colored green when it is positive and red when it is negative.
I attempted to do this using two separate traces, producing the first plot below, which is discontinuous. I then attempted to create a colored trace using the color column which I created by the code below. These are the only implementations that I can think of using the current version of plotly.
> str(results)
'data.frame': 804 obs. of 7 variables:
$ date : Date, format: "2014-03-06" "2014-03-07" "2014-03-10" ...
$ 5yr : num 32.9 32.5 32.9 32.8 32.8 ...
$ 3y5 : num 32.4 32.1 32.5 32.4 32.4 ...
$ spread: num -0.488 -0.431 -0.438 -0.388 -0.452 ...
$ pos : num NA NA NA NA NA NA NA NA NA NA ...
$ neg : num -0.488 -0.431 -0.438 -0.388 -0.452 ...
$ color : chr "red" "red" "red" "red" ...
results$spread <- results[,3] - results[,2]
results$neg <- ifelse(results$spread < 0 , results$spread, NA)
results$pos <- ifelse(results$spread >= 0 , results$spread, NA)
plot_ly(results,
x = ~dates,
y = ~pos,
type = 'scatter',
mode = 'lines',
line = list(color = 'green')) %>%
add_trace(results,
x = ~dates,
y = ~neg,
type = 'scatter',
mode = 'lines',
line = list(color = 'red')) %>%
layout(xaxis = list(title = 'Date'),
yaxis = list(title = 'Price'))
plot_ly(results,
x = ~dates,
y = ~spread,
type = 'scatter',
mode = 'lines',
color = ~color) %>%
layout(xaxis = list(title = 'Date'),
yaxis = list(title = 'Price'))
This was an interesting one. After a while I realized you can get what you want by inserting a zero value at every zero crossing of your plot.
The code below (with some faked data) should be self-explanatory given the comments:
library(plotly)
#fake up some data
set.seed(123)
n <- 100
sdate <- as.Date("2014-03-06")
dt <- seq.Date(sdate,by="days",length.out=n)
results <- data.frame(dates=dt,v1=rnorm(n,32.6,0.2),v2=rnorm(n,32.6,0.2))
results$spread <- results[,3] - results[,2]
# find all the zero crossings
spd <- results$spread
lagspd <- c(spd[1],spd[1:(length(spd)-1)])
crs <- sign(spd)!=sign(lagspd)
results$crs <- crs
# now insert a zero row where there is a crossing
insertZeroRow <- function(df,i){
n <- nrow(df)
ndf1 <- df[1:i,] # note these overlap by 1
ndf2 <- df[i:n,] # that is the row we insert
ndf1$spread[i] <- 0
ndf <- rbind(ndf1,ndf2)
}
i <- 1
while(i<nrow(results)){
if (results$crs[i]){
results <- insertZeroRow(results,i)
i <- i+1
}
i <- i+1
}
# plot it now
results$neg <- ifelse(results$spread <= 0 , results$spread, NA)
results$pos <- ifelse(results$spread >= 0 , results$spread, NA)
plot_ly(results,
x = ~dates,
y = ~pos,
type = 'scatter',
mode = 'lines',
line = list(color = 'green')) %>%
add_trace(results,
x = ~dates,
y = ~neg,
type = 'scatter',
mode = 'lines',
line = list(color = 'red')) %>%
layout(xaxis = list(title = 'Date'),
yaxis = list(title = 'Price'))
And here is the result:
Note you could make it better by interpolating the dates and spread value to get the correct x-axis crossing point, but I think it would not make a huge difference in most cases. If you did that you would need a date type that can represent hours of the day too (like as.POSIXct), in order to be able to specify the correct x-axis value.
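For anyone who does want the interpolated crossing point, a rough sketch follows. It is not the code used for the plots in this answer: insert_crossings is a hypothetical helper, and it assumes the dates column is POSIXct (so it can hold a time of day) and that spread has no NA values. The crossing time comes from straight-line interpolation between the two points on either side of the sign change.

```r
# Hypothetical helper: add an interpolated zero row at each sign change.
# Assumes `dates` is POSIXct and `spread` is numeric with no NA values.
insert_crossings <- function(df) {
  s <- df$spread
  t <- as.numeric(df$dates)          # seconds since 1970-01-01
  n <- length(s)
  k <- which(s[-n] * s[-1] < 0)      # segment k -> k+1 changes sign
  if (length(k) == 0) return(df[, c("dates", "spread")])
  # Straight-line interpolation for the time at which spread equals zero
  t0 <- t[k] - s[k] * (t[k + 1] - t[k]) / (s[k + 1] - s[k])
  zeros <- data.frame(
    dates  = as.POSIXct(t0, origin = "1970-01-01", tz = "UTC"),
    spread = 0
  )
  out <- rbind(df[, c("dates", "spread")], zeros)
  out <- out[order(out$dates), ]
  rownames(out) <- NULL
  out
}

# Small made-up example: the spread crosses zero between day 1 and day 2
results <- data.frame(
  dates  = as.POSIXct(c("2014-03-06", "2014-03-07", "2014-03-08"), tz = "UTC"),
  spread = c(-1, 1, 3)
)
results <- insert_crossings(results)

# Zero rows belong to both traces, so the two lines meet on the axis
results$neg <- ifelse(results$spread <= 0, results$spread, NA)
results$pos <- ifelse(results$spread >= 0, results$spread, NA)
print(results)
```

The resulting neg and pos columns can be passed to the same plot_ly/add_trace calls as before.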
Update:
Just to clear up any confusion, adding the zero rows is necessary. If you comment out the insertZeroRow call, you get this:
Basically, you can change this part of the code in your first implementation:
results$spread <- results[,3] - results[,2]
results$neg <- ifelse(results$spread < 0 , results$spread, NA)
results$pos <- ifelse(results$spread >= 0 , results$spread, NA)
by adding = to the comparison in the second line:
results$spread <- results[,3] - results[,2]
results$neg <- ifelse(results$spread <= 0 , results$spread, NA)
results$pos <- ifelse(results$spread >= 0 , results$spread, NA)
Try it; it should remove the discontinuities.

Plotly - Plot 2 Y Axes With Time Series

I am trying to use R plotly to plot a chart that has the following features:
Date objects as X-variable
2 line plots in one chart with 2 Y-axes: one on the left, the other on the right
Date Amount1 Amount2
2/1/2017 19251130 21698.94
2/2/2017 26429396 10687.37
2/5/2017 669252 0.00
2/6/2017 25944054 11885.10
2/7/2017 27895562 14570.39
2/8/2017 20842279 20080.56
2/9/2017 25485527 9570.51
2/10/2017 17008478 14847.49
2/11/2017 172562 0.00
2/12/2017 379397 900.00
2/13/2017 25362794 18390.80
2/14/2017 26740881 11490.94
2/15/2017 20539413 22358.26
2/16/2017 22589808 12450.45
2/17/2017 18290862 3023.45
2/19/2017 1047087 775.00
2/20/2017 4159070 4100.00
2/21/2017 28488401 22750.35
and the code I use is:
ay <- list(
#tickfont = list(color = "red"),
overlaying = "y",
side = "right"
)
p <- plot_ly() %>%
add_lines(x = df$Date, y = df$Amount1, name = "Amount1",type = "scatter", mode = "lines") %>%
add_lines(x = df$Date, y = df$Amount2, name = "Amount2", yaxis = "y2",type = "scatter", mode = "lines") %>%
layout(
title = "Chart Summary", yaxis2 = ay,
xaxis = list(title="Date")
)
The output chart looks fine, but the date intervals on the X-axis look bad. I am wondering what the solution to this is and, if I want to have 2 histograms in one chart using the data above, what the optimal way to do it is.
Thank you for help!
Is your Date column a string or date?
If it is a string, convert it to date and let Plotly take care of it.
df$Date <- as.Date(df$Date , "%m/%d/%Y")
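As a quick illustration of what that conversion does, here is a sketch on three made-up strings in the same month/day/year layout as the question. As text they are just labels, but once converted they are real dates that sort and space correctly.

```r
# Made-up strings in the question's month/day/year layout
dates_chr <- c("2/10/2017", "2/1/2017", "2/2/2017")
class(dates_chr)

# %m = month, %d = day, %Y = four-digit year
dates <- as.Date(dates_chr, format = "%m/%d/%Y")
class(dates)

# Dates sort chronologically
print(sort(dates))
```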
Full code
library('plotly')
txt <- "Date Amount1 Amount2
2/1/2017 19251130 21698.94
2/2/2017 26429396 10687.37
2/5/2017 669252 0
2/6/2017 25944054 11885.1
2/7/2017 27895562 14570.39
2/8/2017 20842279 20080.56
2/9/2017 25485527 9570.51
2/10/2017 17008478 14847.49
2/11/2017 172562 0
2/12/2017 379397 900
2/13/2017 25362794 18390.8
2/14/2017 26740881 11490.94
2/15/2017 20539413 22358.26
2/16/2017 22589808 12450.45
2/17/2017 18290862 3023.45
2/19/2017 1047087 775
2/20/2017 4159070 4100
2/21/2017 28488401 22750.35"
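# Parse the text block above into a data frame
df <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
df$Date <- as.Date(df$Date , "%m/%d/%Y")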
ay <- list(
#tickfont = list(color = "red"),
overlaying = "y",
side = "right"
)
p <- plot_ly() %>%
add_lines(x = df$Date, y = df$Amount1, name = "Amount1",type = "scatter", mode = "lines") %>%
add_lines(x = df$Date, y = df$Amount2, name = "Amount2", yaxis = "y2",type = "scatter", mode = "lines") %>%
layout(
title = "Chart Summary", yaxis2 = ay,
xaxis = list(title="Date")
)
p

Faceting a Dataset

This is a beginner question. I have spent most of the day trying to work out how to facet my data, but all of the examples of faceting that I have come across seem unsuited to my dataset.
Here are the first five rows from my data:
Date Germany.Yield Italy.Yield Greece.Yield Italy_v_Germany.Spread Greece_v_Germany.Spread
2020-04-19 -0.472 1.820 2.287 2.292 2.759
2020-04-12 -0.472 1.790 2.112 2.262 2.584
2020-04-05 -0.345 1.599 1.829 1.944 2.174
2020-03-29 -0.441 1.542 1.972 1.983 2.413
2020-03-22 -0.475 1.334 1.585 1.809 2.060
I simply want to create two line charts. On both charts the x-axis will be the date. On the first chart, the y-axis should be Italy_v_Germany.Spread and on the second, the y-axis should be Greece_v_Germany.Spread.
The first chart looks like this:
So I want the two charts to appear alongside each other, like this:
The one on the left should be Italy_v_Germany.Spread, and the one on the right should be Greece_v_Germany.Spread.
I really have no idea where to start with this. Hoping that someone can point me in the right direction.
In the interest of making the example reproducible, I will share a link to the CSV files which I'm using: https://1drv.ms/u/s!AvGKDeEV3LOsmmlHkzO6YVQTRiOX?e=mukBVy. Unfortunately these files convert into Excel format when shared via this link, so you may have to export the files to CSVs so that the code works.
Here is the code that I have so far:
library(ggplot2)
library(scales)
library(extrafont)
library(dplyr)
library(tidyr)
work_dir <- "D:\\OneDrive\\Documents\\Economic Data\\Historical Yields\\Eurozone"
setwd(work_dir)
# Germany
#---------------------------------------
germany_yields <- read.csv(file = "Germany 10-Year Yield Weekly (2007-2020).csv", stringsAsFactors = F)
germany_yields <- germany_yields[, -(3:6)]
colnames(germany_yields)[1] <- "Date"
colnames(germany_yields)[2] <- "Germany.Yield"
#---------------------------------------
# Italy
#---------------------------------------
italy_yields <- read.csv(file = "Italy 10-Year Yield Weekly (2007-2020).csv", stringsAsFactors = F)
italy_yields <- italy_yields[, -(3:6)]
colnames(italy_yields)[1] <- "Date"
colnames(italy_yields)[2] <- "Italy.Yield"
#---------------------------------------
# Greece
#---------------------------------------
greece_yields <- read.csv(file = "Greece 10-Year Yield Weekly (2007-2020).csv", stringsAsFactors = F)
greece_yields <- greece_yields[, -(3:6)]
colnames(greece_yields)[1] <- "Date"
colnames(greece_yields)[2] <- "Greece.Yield"
#---------------------------------------
# Join data
#---------------------------------------
combined <- merge(merge(germany_yields, italy_yields, by = "Date", sort = F),
greece_yields, by = "Date", sort = F)
combined <- na.omit(combined)
combined$Date <- as.Date(combined$Date,format = "%B %d, %Y")
combined["Italy_v_Germany.Spread"] <- combined$Italy.Yield - combined$Germany.Yield
combined["Greece_v_Germany.Spread"] <- combined$Greece.Yield - combined$Germany.Yield
#--------------------------------------------------------------------
fl_dates <- c(tail(combined$Date, n=1), head(combined$Date, n=1))
ggplot(data=combined, aes(x = Date, y = Italy_v_Germany.Spread)) + geom_line() +
scale_x_date(limits = fl_dates,
breaks = seq(as.Date("2008-01-01"), as.Date("2020-01-01"), by="2 years"),
expand = c(0, 0),
date_labels = "%Y")
You need to get your data into a long format, for example by using pivot_longer. Then it should work.
library(dplyr)
library(tidyr)
library(ggplot2)
data <- tribble(~Date, ~Germany.Yield, ~Italy.Yield, ~Greece.Yield, ~Italy_v_Germany.Spread, ~Greece_v_Germany.Spread,
"2020-04-19", -0.472, 1.820, 2.287, 2.292, 2.759,
"2020-04-12", -0.472, 1.790, 2.112, 2.262, 2.584,
"2020-04-05", -0.345, 1.599, 1.829, 1.944, 2.174,
"2020-03-29", -0.441, 1.542, 1.972, 1.983, 2.413,
"2020-03-22", -0.475, 1.334, 1.585, 1.809, 2.060
)
data %>%
mutate(Date = as.Date(Date)) %>%
pivot_longer(
cols = ends_with("Spread"),
names_to = "country",
values_to = "Spread_v_Germany",
values_drop_na = TRUE
) %>%
ggplot(., aes(x = Date, y = Spread_v_Germany, group = 1)) +
geom_line() +
facet_wrap(. ~ country)
