I have code that splits hourly values into quarter-hours.
Unfortunately, after the split, all four quarter-hours within an hour carry the same value.
I would like the quarter-hour values to vary between the original hourly values, so that the graph is smooth rather than stepped. How should I do this: by averaging or by interpolating?
df <- data.frame(
  h = 0:23,
  x = c(22, 11, 5, 8, 22, 88, 77, 7, 11, 5, 8, 22, 88, 77, 11, 5, 8, 22, 88, 77, 11, 5, 8, 22))
library(dplyr)
library(stringr)
library(ggplot2)
df_q <- df %>%  # store the result so the plot below can find `qu`
  data.frame(h = rep(df$h, each = 4),  # quadruplicate rows
             x = rep(df$x, each = 4)) %>%  # quadruplicate rows
  mutate(h.1 = str_pad(h.1, width = 2, side = "left", pad = "0"),  # add leading '0'
         qu = paste0(h.1, c(":00", ":15", ":30", ":45"))) %>%  # create quarters
  select(-c(h, x)) %>%  # deselect obsolete cols
  rename(c("h" = "h.1", "x" = "x.1"))
df_q %>%
  ggplot() +
  geom_point(aes(qu, x), color = "red", size = 2) +
  labs(x = "", y = "",
       title = "Example")
Here I create a "decimal hour" variable to simplify the calculations. We can also use hms::hms() to turn it into a time that ggplot2 understands. I use stats::approx() to interpolate linearly between the hourly points.
df2 <- df %>%
  tidyr::uncount(4) %>%  # make 4 copies of each row
  mutate(h_dec = h + (0:3)/4,
         h_time = hms::hms(hours = h_dec),
         x = x * c(1, NA, NA, NA),  # this is to make non-hourly into NA,
                                    # so that approx only uses hourly
         x_interp = approx(x = h, y = x, xout = h_dec)$y)
df2 %>%
  ggplot() +
  geom_point(aes(h_time, x_interp), color = "red", size = 2) +
  labs(x = "", y = "",
       title = "Example")
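If only the interpolated values are needed, the same idea can be sketched directly on a quarter-hour grid with base R. The names grid, linear, smooth and quarters are illustrative, and the natural cubic spline is an alternative suggested here for a rounder curve, not part of the answer above.

```r
# Hourly data from the question
df <- data.frame(
  h = 0:23,
  x = c(22, 11, 5, 8, 22, 88, 77, 7, 11, 5, 8, 22,
        88, 77, 11, 5, 8, 22, 88, 77, 11, 5, 8, 22)
)

# Quarter-hour grid from hour 0 to hour 23 (illustrative name)
grid <- seq(min(df$h), max(df$h), by = 0.25)

# Linear interpolation: straight segments between the hourly points
linear <- approx(x = df$h, y = df$x, xout = grid)$y

# Natural cubic spline: a smoother curve through the same hourly points
smooth <- spline(x = df$h, y = df$x, xout = grid, method = "natural")$y

quarters <- data.frame(h_dec = grid, x_linear = linear, x_smooth = smooth)
head(quarters)
```

A spline can overshoot between points, so it is worth checking that the smoothed values stay in a range that makes sense for the data.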
I have a data structure that I got as a result of the problem stated here.
Code:
df <- tibble::tribble(
  ~person, ~age, ~height,
  "John",  1,    20,
  "Mike",  3,    50,
  "Maria", 3,    52,
  "Elena", 6,    90,
  "Biden", 9,    120)
df %>%
  mutate(
    age_c = cut(
      age,
      breaks = c(-Inf, 5, 10),
      labels = c("0-5", "5-10"),
      right = TRUE
    ),
    height_c = cut(
      height,
      breaks = c(-Inf, 50, 100, 200),
      labels = c("0-50", "50-100", "100-200"),
      right = TRUE
    )
  ) %>%
  count(age_c, height_c, .drop = FALSE)
# A tibble: 6 x 3
  age_c height_c     n
  <fct> <fct>    <int>
1 0-5   0-50         2
2 0-5   50-100       1
3 0-5   100-200      0
4 5-10  0-50         0
5 5-10  50-100       1
6 5-10  100-200      1
Now I am trying to create a scatter plot, but the axis categories are repeated instead of being shown once. I expect the x-axis to have two values, 0-5 and 5-10 (what I get is 0-5, 0-5, 0-5, 5-10, 5-10, 5-10), and the y-axis to have three values, 0-50, 50-100 and 100-200 (instead I get that series twice).
The code I use to plot:
ggplot(df, aes(x=age_c, y=height_c))
Expected plot (where the size of circles would be based on the value of N):
If you store the counts in a data frame and plot that, it should work. Zero counts are filtered out and n is mapped to point size:
countdf <- df %>%
  mutate(
    age_c = cut(
      age,
      breaks = c(-Inf, 5, 10),
      labels = c("0-5", "5-10"),
      right = TRUE
    ),
    height_c = cut(
      height,
      breaks = c(-Inf, 50, 100, 200),
      labels = c("0-50", "50-100", "100-200"),
      right = TRUE
    )
  ) %>%
  count(age_c, height_c, .drop = FALSE)
countdf %>%
  filter(n > 0) %>%
  ggplot(aes(x = age_c, y = height_c, size = n)) +
  geom_point() +
  scale_size_continuous(range = c(5, 10), breaks = c(1, 2))
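The counts depend on how cut() treats values that sit exactly on a break. With right = TRUE each interval is closed on the right, so a height of exactly 50 lands in "0-50". A minimal check with the heights from the question (the vector name height is only for illustration):

```r
# Heights from the question's data
height <- c(20, 50, 52, 90, 120)

# Same breaks and labels as in the answer above
height_c <- cut(
  height,
  breaks = c(-Inf, 50, 100, 200),
  labels = c("0-50", "50-100", "100-200"),
  right = TRUE   # intervals are (lower, upper]
)

as.character(height_c)
table(height_c)
```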
I'm trying to make a semicircle donut chart with the highcharter library, but I only know how to make a pie chart. I know that in JS you can do it by setting "startAngle" and "endAngle", but I want to know how to do it in R:
library(highcharter)
A <- c("a", "b", "c", "d")
B <- c(4, 6, 9, 2)
C <- c(23, 26, 13, 15)
df <- data.frame(A, B, C)
highchart() %>%
  hc_chart(type = "pie") %>%
  hc_add_series_labels_values(labels = df$A, values = df$B) %>%
  hc_tooltip(crosshairs = TRUE, borderWidth = 5, sort = TRUE, shared = TRUE, table = TRUE,
             pointFormat = paste('<b>{point.percentage:.1f}%</b>')
  ) %>%
  hc_title(text = "ABC",
           margin = 20,
           style = list(color = "#144746", useHTML = TRUE))
Thank you!
You can do something like this, though it uses ggforce rather than the Highcharts library.
library(tidyverse)
library(ggforce)
library(scales)
library(ggplot2)
# -------------------------------------------------------------------------
A <- c("a", "b", "c", "d")
B <- c(4, 6, 9, 2)
C <- c(23, 26, 13, 15)
df <- data.frame(A, B, C)
# Ensure A is a factor (we'll be using it to fill the pie)
df$A <- factor(df$A)
# compute each slice's share of the total, in this case using var C
df$prop <- df$C/sum(df$C)
# cumulative share: where each slice ends (between 0 and 1)
df$p_end <- cumsum(df$prop)
# where each slice starts: the end of the previous slice (0 for the first)
df$p_start <- c(0, head(df$p_end, -1))
# -------------------------------------------------------------------------
# plot: rescale start/end from [0, 1] to [-pi/2, pi/2] to get a half circle
df %>%
  mutate_at(c("p_start", "p_end"), rescale, to = pi*c(-.5, .5), from = 0:1) %>%
  ggplot +
  geom_arc_bar(aes(x0 = 0, y0 = 0, r0 = .5, r = 1, start = p_start, end = p_end, fill = A)) +
  coord_fixed() + xlab("X_label") + ylab("Y_label") + guides(fill = guide_legend(title = "Legend Title"))
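The rescaling step is just a linear map from cumulative shares in [0, 1] to angles in [-pi/2, pi/2]. A base R sketch of that arithmetic, where the helper to_angle is a hypothetical name and not part of scales or ggforce:

```r
C <- c(23, 26, 13, 15)

prop    <- C / sum(C)              # share of each slice
p_end   <- cumsum(prop)            # cumulative share at the end of each slice
p_start <- c(0, head(p_end, -1))   # each slice starts where the previous one ends

# Hypothetical helper: map a share in [0, 1] onto an angle in [-pi/2, pi/2]
to_angle <- function(p) -pi / 2 + p * pi

angles <- data.frame(start = to_angle(p_start), end = to_angle(p_end))
angles
```

The first slice starts at -pi/2 and the last ends at pi/2, so the slices together span half a circle.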
Hope that helps.
Try adding startAngle = -90 and endAngle = 90 inside hc_add_series_labels_values.
Note that, as the warning says, hc_add_series_labels_values is deprecated, so I suggest using hc_add_series instead:
highchart() %>%
  hc_add_series(type = "pie", data = df, hcaes(x = A, y = B), startAngle = -90, endAngle = 90) %>%
  hc_tooltip(pointFormat = '<b>{point.percentage:.1f}%</b>') %>%
  hc_title(text = "ABC",
           margin = 20,
           style = list(color = "#144746", useHTML = TRUE))
I'm trying to load data from Quandl with collapse = "monthly".
Some of the values are only available yearly or half-yearly.
Other values are only available for certain periods of time.
This leaves me with a lot of inhomogeneous data.
How can I fill the yearly and half-yearly data in a "last observation carried forward" fashion and set the remaining NAs to 0?
Here is a sketch of the data I have and the data I want to end up with:
library(tibble)
set.seed(4711)
# How do I get from:
#
df.start <- tibble(  # tibble() replaces the deprecated data_frame()
Date = seq.Date(as.Date("1990-01-01"), as.Date("1999-12-01"), "1 month"),
B = rep(NA, 120),
C = c(rep(NA, 50), rnorm(120 - 50)),
D = rep(c(rnorm(1), rep(NA, 11)), 10),
E = c(rep(NA, 24), rep(c(rnorm(1), rep(NA, 11)), 8)),
F = c(rep(NA, 45), rnorm(50), rep(NA, 25)),
G = c(rep(NA, 24), rep(c(rnorm(1), rep(NA, 11)), 6), rep(NA, 24)),
H = c(rep(NA, 10), rnorm(20), rep(NA, 16), rnorm(37), rep(NA, 37)),
I = rep(c(rnorm(1), rep(NA, 5)), 20)
)
#
# To:
#
df.end <- tibble(  # tibble() replaces the deprecated data_frame()
Date = seq.Date(as.Date("1990-01-01"), as.Date("1999-12-01"), "1 month"),
B = rep(0, 120),
C = c(rep(0, 50), rnorm(120 - 50)),
D = rep(rnorm(10), each = 12),
E = c(rep(0, 24), rep(rnorm(8), each = 12)),
F = c(rep(0, 45), rnorm(50), rep(0, 25)),
G = c(rep(0, 24), rep(rnorm(6), each = 12), rep(0, 24)),
H = c(rep(0, 10), rnorm(20), rep(0, 16), rnorm(37), rep(0, 37)),
I = rep(rnorm(20), each = 6)
)
#
# Automatically?
#
You can use fill() to replace the NAs with the last non-missing value in every column except Date, and then replace the remaining NAs with 0. We do these operations grouped by year, so a value is never carried forward into the next year.
library(tidyverse)
library(lubridate)
df.end <- df.start %>%
  mutate(year = year(Date)) %>%
  group_by(year) %>%
  fill(., colnames(df.start[-1])) %>%
  replace(., is.na(.), 0) %>%
  ungroup() %>%
  select(-year)
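To make the effect of the grouping concrete, here is a base R sketch of the same idea on a single toy column. The helper locf and the toy vectors are hypothetical and only illustrate "carry forward within a year, then zero-fill"; they are not part of tidyr.

```r
# Hypothetical helper: fill NAs with the last non-missing value;
# leading NAs (nothing to carry forward yet) stay NA
locf <- function(v) {
  idx <- cummax(seq_along(v) * !is.na(v))  # position of last non-NA so far, 0 if none
  c(NA, v)[idx + 1]
}

# Toy column: two "years" of six observations each
year <- rep(c(1990, 1991), each = 6)
v    <- c(NA, 1.5, NA, NA, NA, NA,
          NA, NA, 2.5, NA, NA, NA)

filled <- ave(v, year, FUN = locf)  # carry forward within each year only
filled[is.na(filled)] <- 0          # remaining NAs become 0
filled
# 0.0 1.5 1.5 1.5 1.5 1.5 0.0 0.0 2.5 2.5 2.5 2.5
```

The value 1.5 from 1990 is not carried into 1991, which is what grouping by year achieves in the pipeline above.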
My data look like this:
set.seed(123)
library(tidyverse)
library(reshape2)
Year <- c(2017, 2017, 2017, 2018, 2018, 2018)
Month <- c(10, 11, 12, 1, 2, 3)
alpha_test <- runif(n = 6, min = 0.2, max = 0.25)
alpha_control <- runif(n = 6, min = 0.17, max = 0.22)
beta_test <- runif(n = 6, min = 0.01, max = 0.1)
beta_control <- runif(n = 6, min = 0.03, max = 0.05)
df <- tibble(Year, Month, alpha_test, alpha_control, beta_test, beta_control)
df
What I want is two geom_path charts (one for alpha, one for beta) that compare the test and the control. Here's an example from Excel for a similar test:
I assume I will need to melt the data in some way to get what I want. However, the command
rawMelt <- melt(df, id.vars = c(Year, Month))
gives the error Error: id variables not found in data: 2017, 2018, October, November, December, January, February, March. How would you melt these data so that I can make the graph I want?
This is what I eventually went with, should anyone else have this problem. Note that id.vars takes the column names as quoted strings:
rawMelt <- melt(df, id.vars = c("Year", "Month")) %>%
  mutate(
    theSource = ifelse(grepl("test", variable), "test", "control"),
    metric = ifelse(grepl("alpha", variable), "alpha", "beta"),
    monthText = paste0(Year, "_", ifelse(Month < 10, "0", ""), Month)
  ) %>%
  select(-variable)
g_maker <- function(theMetric) {
  theChart <- rawMelt %>%
    filter(metric == theMetric)
  g <- ggplot(theChart, aes(x = as.factor(monthText), y = value, group = theSource)) +
    geom_path(aes(color = theSource)) +
    scale_color_manual(values = c("red", "black")) +
    theme_minimal() +
    xlab(NULL) +
    theme(axis.text.x = element_text(angle = 75, hjust = 1))
  return(g)
}
alpha_graph <- g_maker("alpha")
beta_graph <- g_maker("beta")
alpha_graph
beta_graph
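As an alternative sketch, tidyr::pivot_longer() can reshape the data and split the column names in one step, assuming every measure column follows the metric_source naming pattern used in the question. The output column names metric, theSource and value are chosen here to match the code above.

```r
library(tidyr)

# Same data as in the question, built as a plain data frame
set.seed(123)
df <- data.frame(
  Year          = c(2017, 2017, 2017, 2018, 2018, 2018),
  Month         = c(10, 11, 12, 1, 2, 3),
  alpha_test    = runif(n = 6, min = 0.2,  max = 0.25),
  alpha_control = runif(n = 6, min = 0.17, max = 0.22),
  beta_test     = runif(n = 6, min = 0.01, max = 0.1),
  beta_control  = runif(n = 6, min = 0.03, max = 0.05)
)

# Split names such as "alpha_test" at the underscore into two columns
long <- pivot_longer(
  df,
  cols      = -c(Year, Month),
  names_to  = c("metric", "theSource"),
  names_sep = "_",
  values_to = "value"
)
head(long)
```

The result has one row per Year, Month, metric and source, so it can be filtered by metric and plotted in the same way as rawMelt.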
Is there a way to move the plot down so that there is some space between the legend and the plot area? Ideally, the plot area would be spaced below the legend automatically.
library(echarts4r)
df <- data.frame(
  x = seq(50),
  y = rnorm(50, 10, 3),
  z = rnorm(50, 11, 2),
  w = rnorm(50, 9, 2)
)
df %>%
  e_charts(x) %>%
  e_line(w) %>%
  e_line(y) %>%
  e_line(z) %>%
  e_legend(orient = 'vertical', left = 0, top = 0)
Use the e_grid function to adjust the "grid" on which the graph is plotted.
library(echarts4r)
df <- data.frame(
  x = seq(50),
  y = rnorm(50, 10, 3),
  z = rnorm(50, 11, 2),
  w = rnorm(50, 9, 2)
)
df %>%
  e_charts(x) %>%
  e_line(w) %>%
  e_line(y) %>%
  e_line(z) %>%
  e_legend(
    orient = 'vertical',
    left = 0,
    top = 0,
    selectedMode = "single" # might be of use
  ) %>%
  e_grid(left = 100, top = 5)
Plenty more options for the grid can be found here.