How to add values between hours - r

I have code that breaks hourly values down into quarter-hour steps.
Unfortunately, each quarter hour simply repeats the value of its hour.
Besides adding the quarter hours, I would also like to fill in values between the original hourly points so that the graph is smooth rather than jagged. How should I do this: average, interpolate?
df <- data.frame(
h = 0:23,
x = c(22, 11, 5, 8 , 22, 88, 77, 7, 11, 5, 8 , 22, 88, 77, 11, 5, 8 , 22, 88, 77, 11, 5, 8 , 22))
library(dplyr)
library(stringr)
library(ggplot2)
# Store the quarter-hour version separately so the hourly `df` stays intact
df_q <- df %>%
  data.frame(h = rep(df$h, each = 4),      # quadruplicate rows
             x = rep(df$x, each = 4)) %>%  # quadruplicate rows
  mutate(h.1 = str_pad(h.1, width = 2, side = "left", pad = "0"),  # add leading '0'
         qu = paste0(h.1, c(":00", ":15", ":30", ":45"))) %>%      # create quarters
  select(-c(h, x)) %>%                     # deselect obsolete cols
  rename(c("h" = "h.1", "x" = "x.1"))
df_q %>%
  ggplot() +
  geom_point(aes(qu, x), color = "red", size = 2) +
  labs(x = "", y = "",
       title = "Example")

Here I create a "decimal hour" variable to simplify the calculations. hms::hms() then turns it into a time value that ggplot2 understands. I use stats::approx() to interpolate linearly between the hourly points.
library(dplyr)
library(ggplot2)
df2 <- df %>%
  tidyr::uncount(4) %>%                 # make 4 copies of each row
  mutate(h_dec = h + (0:3)/4,           # 0, 0.25, 0.5, 0.75, 1, ...
         h_time = hms::hms(hours = h_dec),
         x = x * c(1, NA, NA, NA),      # set non-hourly rows to NA so that
                                        # approx() only uses the hourly values
         x_interp = approx(x = h, y = x, xout = h_dec)$y)
# The three quarter hours after 23:00 lie outside the hourly data and stay NA
df2 %>%
  ggplot() +
  geom_point(aes(h_time, x_interp), color = "red", size = 2) +
  labs(x = "", y = "",
       title = "Example")
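As a rough base-R sketch of the same idea, the interpolation can also be run directly on the hourly vectors, without duplicating rows first. The name h_dec is reused from the answer above; the spline() variant is an extra option that the answer does not use, and since a cubic spline can overshoot between points, treat it as an assumption to check against your own data.

```r
# Hourly values from the question
h <- 0:23
x <- c(22, 11, 5, 8, 22, 88, 77, 7, 11, 5, 8, 22,
       88, 77, 11, 5, 8, 22, 88, 77, 11, 5, 8, 22)

# Quarter-hour grid from 0:00 to 23:00 (inside the range of the data)
h_dec <- seq(0, 23, by = 0.25)

# Linear interpolation between the hourly points
x_lin <- approx(x = h, y = x, xout = h_dec)$y

# Cubic spline through the same points: a rounder curve,
# but it may overshoot near sharp peaks
x_spl <- spline(x = h, y = x, xout = h_dec)$y

head(data.frame(h_dec, x_lin, x_spl), 5)
```

Both results pass through the original values at whole hours; only the points in between differ.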

Related

Plotting a scatter plot in R

I have a data structure that I got as a result of the problem stated here.
Code:
df <- tibble::tribble(~person, ~age, ~height,
                      "John",  1,    20,
                      "Mike",  3,    50,
                      "Maria", 3,    52,
                      "Elena", 6,    90,
                      "Biden", 9,    120)
df %>%
  mutate(
    age_c = cut(
      age,
      breaks = c(-Inf, 5, 10),
      labels = c("0-5", "5-10"),
      right = TRUE
    ),
    height_c = cut(
      height,
      breaks = c(-Inf, 50, 100, 200),
      labels = c("0-50", "50-100", "100-200"),
      right = TRUE
    )
  ) %>%
  count(age_c, height_c, .drop = FALSE)
# A tibble: 6 x 3
  age_c height_c     n
  <fct> <fct>    <int>
1 0-5   0-50         2
2 0-5   50-100       1
3 0-5   100-200      0
4 5-10  0-50         0
5 5-10  50-100       1
6 5-10  100-200      1
Now I am trying to create a scatter plot, but the axis categories are duplicated instead of being shared. I expect the x-axis to have two values, 0-5 and 5-10 (what I get is 0-5, 0-5, 0-5, 5-10, 5-10, 5-10), and the y-axis three values, 0-50, 50-100 and 100-200 (instead I get two series of them).
The code I use to plot:
ggplot(df, aes(x=age_c, y=height_c))
Expected plot (where the size of circles would be based on the value of N):
If you store the counts in a data frame and plot that, it should work:
library(dplyr)
library(ggplot2)
countdf <- df %>%
  mutate(
    age_c = cut(
      age,
      breaks = c(-Inf, 5, 10),
      labels = c("0-5", "5-10"),
      right = TRUE
    ),
    height_c = cut(
      height,
      breaks = c(-Inf, 50, 100, 200),
      labels = c("0-50", "50-100", "100-200"),
      right = TRUE
    )
  ) %>%
  count(age_c, height_c, .drop = FALSE)
countdf %>%
  filter(n > 0) %>%   # drop empty cells so no point is drawn for them
  ggplot(aes(x = age_c, y = height_c, size = n)) +
  geom_point() +
  scale_size_continuous(range = c(5, 10), breaks = c(1, 2))
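To see what the binning and counting steps produce, here is a small base-R sketch using the same breaks and labels. It swaps dplyr::count() for table(), which is not what the answer uses, so take it as an illustration of the logic rather than a replacement.

```r
# Ages and heights from the question
age    <- c(1, 3, 3, 6, 9)
height <- c(20, 50, 52, 90, 120)

# right = TRUE means intervals are closed on the right,
# so a height of exactly 50 falls into "0-50"
age_c <- cut(age, breaks = c(-Inf, 5, 10),
             labels = c("0-5", "5-10"), right = TRUE)
height_c <- cut(height, breaks = c(-Inf, 50, 100, 200),
                labels = c("0-50", "50-100", "100-200"), right = TRUE)

# table() keeps empty combinations, like count(..., .drop = FALSE)
counts <- as.data.frame(table(age_c, height_c), responseName = "n")
counts
```

Each factor has its levels exactly once, which is why plotting the counted data gives two x categories and three y categories.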

How to do a semi circle donut with highcharter library?

I'm trying to do a semi circle donut with highcharter library but I only know how to do a pie chart. I know that with JS you can do it by adding "startAngle" and "endAngle" but I want to know how to do it with R:
A <- c("a", "b", "c", "d")
B <- c(4, 6, 9, 2)
C <- c(23, 26, 13, 15)
df <- data.frame(A, B, C)
library(highcharter)
highchart() %>%
  hc_chart(type = "pie") %>%
  hc_add_series_labels_values(labels = df$A, values = df$B) %>%
  hc_tooltip(crosshairs = TRUE, borderWidth = 5, sort = TRUE, shared = TRUE, table = TRUE,
             pointFormat = paste('<b>{point.percentage:.1f}%</b>')
  ) %>%
  hc_title(text = "ABC",
           margin = 20,
           style = list(color = "#144746", useHTML = TRUE))
Thank you!
You can do something like this, though it does not use the highcharter library.
library(tidyverse)
library(ggforce)
library(scales)
library(ggplot2)
# -------------------------------------------------------------------------
A <- c("a", "b", "c", "d")
B <- c(4, 6, 9, 2)
C <- c(23, 26, 13, 15)
df <- data.frame(A, B, C)
# Ensure A is a factor (we'll be using it to fill the pie)
df$A <- factor(df$A)
# compute the individual proportion in this case using var C
df$prop <- df$C/sum(df$C)
# compute the cumulative proportion and use that to plot ymax
df$p_end <- cumsum(df$prop)
# each slice starts where the previous one ended (0 for the first slice)
df$p_start <- c(0, head(df$p_end ,-1))
# -------------------------------------------------------------------------
# plot
df %>%
  mutate_at(c("p_start", "p_end"), rescale, to = pi * c(-.5, .5), from = 0:1) %>%
  ggplot() +
  geom_arc_bar(aes(x0 = 0, y0 = 0, r0 = .5, r = 1, start = p_start, end = p_end, fill = A)) +
  coord_fixed() +
  xlab("X_label") + ylab("Y_label") +
  guides(fill = guide_legend(title = "Legend Title"))
Output
Hope that helps.
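The arithmetic behind the half circle can be checked in base R alone. This is only a sketch of the rescaling step: to_angle() is a hypothetical helper that stands in for scales::rescale(), mapping the cumulative proportions from [0, 1] onto angles from -pi/2 to pi/2.

```r
# Values used for the slice sizes in the answer
C <- c(23, 26, 13, 15)

prop    <- C / sum(C)               # share of each slice
p_end   <- cumsum(prop)             # cumulative share where each slice ends
p_start <- c(0, head(p_end, -1))    # each slice starts where the previous ended

# Hypothetical helper: map [0, 1] linearly onto [-pi/2, pi/2]
to_angle <- function(p) (p - 0.5) * pi

start <- to_angle(p_start)
end   <- to_angle(p_end)
round(cbind(start, end), 3)
```

The first slice starts at -pi/2 and the last one ends at pi/2, so the arcs together cover exactly half a circle.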
Try adding startAngle = -90, endAngle = 90 inside hc_add_series_labels_values.
Note that, per the warning, hc_add_series_labels_values is deprecated, so I suggest using hc_add_series instead:
library(highcharter)
highchart() %>%
  hc_add_series(type = "pie", data = df, hcaes(x = A, y = B),
                startAngle = -90, endAngle = 90) %>%   # draw only the upper half
  hc_tooltip(pointFormat = '<b>{point.percentage:.1f}%</b>') %>%
  hc_title(text = "ABC",
           margin = 20,
           style = list(color = "#144746", useHTML = TRUE))

R: How to fill yearly data within monthly data?

I'm trying to load data from Quandl with collapse = "monthly".
Some of the values are only available on a yearly or half-yearly basis.
Other values are only available for certain periods of time.
This leaves me with a lot of inhomogeneous data.
How can I fill the yearly and half-yearly data in a "last observation carried forward" fashion and set the remaining NAs to 0?
Here is my idea of the data I got and the one I want to have at the end:
library(tibble)
set.seed(4711)
# How do I get from:
#
df.start <- tibble(   # tibble() replaces the deprecated data_frame()
  Date = seq.Date(as.Date("1990-01-01"), as.Date("1999-12-01"), "1 month"),
  B = rep(NA, 120),
  C = c(rep(NA, 50), rnorm(120 - 50)),
  D = rep(c(rnorm(1), rep(NA, 11)), 10),
  E = c(rep(NA, 24), rep(c(rnorm(1), rep(NA, 11)), 8)),
  F = c(rep(NA, 45), rnorm(50), rep(NA, 25)),
  G = c(rep(NA, 24), rep(c(rnorm(1), rep(NA, 11)), 6), rep(NA, 24)),
  H = c(rep(NA, 10), rnorm(20), rep(NA, 16), rnorm(37), rep(NA, 37)),
  I = rep(c(rnorm(1), rep(NA, 5)), 20)
)
#
# To:
#
df.end <- tibble(   # tibble() replaces the deprecated data_frame()
  Date = seq.Date(as.Date("1990-01-01"), as.Date("1999-12-01"), "1 month"),
  B = rep(0, 120),
  C = c(rep(0, 50), rnorm(120 - 50)),
  D = rep(rnorm(10), each = 12),
  E = c(rep(0, 24), rep(rnorm(8), each = 12)),
  F = c(rep(0, 45), rnorm(50), rep(0, 25)),
  G = c(rep(0, 24), rep(rnorm(6), each = 12), rep(0, 24)),
  H = c(rep(0, 10), rnorm(20), rep(0, 16), rnorm(37), rep(0, 37)),
  I = rep(rnorm(20), each = 6)
)
#
# Automatically?
#
You can use fill() to fill the NAs with the last non-missing value (in every column except Date), and then replace the remaining NAs with 0. The fill is done grouped by year, so a value is never carried into a following year that has no observation of its own.
library(tidyverse)
library(lubridate)
df.end <- df.start %>%
  mutate(year = year(Date)) %>%
  group_by(year) %>%
  fill(., colnames(df.start[-1])) %>%
  replace(., is.na(.), 0) %>%
  ungroup() %>%
  select(-year)
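The two steps (carry the last observation forward within a year, then zero out what is left) can be sketched in base R on a single column. locf() is a hypothetical helper written for this illustration, not a function from tidyr or any other package, and the toy data below are made up.

```r
# Hypothetical helper: last observation carried forward within one vector
locf <- function(v) {
  idx <- cumsum(!is.na(v))       # how many non-NA values have been seen so far
  c(NA, v[!is.na(v)])[idx + 1]   # idx 0 (nothing seen yet) maps to NA
}

# Toy example: one yearly value in 1990, nothing at all in 1991
year <- rep(1990:1991, each = 4)
v    <- c(5, NA, NA, NA, NA, NA, NA, NA)

# Fill within each year only, then set the remaining NAs to 0
filled <- ave(v, year, FUN = locf)
filled[is.na(filled)] <- 0
filled
```

Without the grouping by year, the 5 from 1990 would be carried into 1991 instead of leaving that year at 0.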

Melt function (R/reshape) delivering an error

My data look like this:
set.seed(123)
library(tidyverse)
library(reshape2)
Year <- c(2017, 2017, 2017, 2018, 2018, 2018)
Month <- c(10, 11, 12, 1, 2, 3)
alpha_test <- runif(n = 6, min = 0.2, max = 0.25)
alpha_control <- runif(n = 6, min = 0.17, max = 0.22)
beta_test <- runif(n = 6, min = 0.01, max = 0.1)
beta_control <- runif(n = 6, min = 0.03, max = 0.05)
df <- tibble(Year, Month, alpha_test, alpha_control, beta_test, beta_control)
df
What I want is, two geom_path charts (one chart for alpha, one for beta) which compare the test and the control. Here's an example from Excel for a similar test:
I assume I will need to melt the data in some way to get what I want. But, the command
rawMelt <- melt(df, id.vars = c(Year, Month))
gives the error Error: id variables not found in data: 2017, 2018, October, November, December, January, February, March. How would you melt these data so that I can make the graph I want?
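The id variables have to be passed as strings: unquoted, c(Year, Month) evaluates to the values stored in those vectors rather than to column names. This is what I eventually went with, should anyone else have this problem: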
rawMelt <- melt(df, id.vars = c("Year", "Month")) %>%
  mutate(
    theSource = ifelse(grepl("test", variable), "test", "control"),
    metric = ifelse(grepl("alpha", variable), "alpha", "beta"),
    monthText = paste0(Year, "_", ifelse(Month < 10, "0", ""), Month)
  ) %>%
  select(-variable)
g_maker <- function(theMetric) {
  theChart <- rawMelt %>%
    filter(metric == theMetric)
  g <- ggplot(theChart, aes(x = as.factor(monthText), y = value, group = theSource)) +
    geom_path(aes(color = theSource)) +
    scale_color_manual(values = c("red", "black")) +
    theme_minimal() +
    xlab(NULL) +
    theme(axis.text.x = element_text(angle = 75, hjust = 1))
  return(g)
}
alpha_graph <- g_maker("alpha")
beta_graph <- g_maker("beta")
alpha_graph
beta_graph
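For readers who want to see what the reshaping does, here is a base-R sketch that builds the same long layout by hand. It does not call reshape2::melt(); the names wide, long and measure_cols are made up for this illustration, and the first lines show why the unquoted id.vars failed.

```r
set.seed(123)
Year  <- c(2017, 2017, 2017, 2018, 2018, 2018)
Month <- c(10, 11, 12, 1, 2, 3)
wide <- data.frame(
  Year, Month,
  alpha_test    = runif(6, 0.20, 0.25),
  alpha_control = runif(6, 0.17, 0.22),
  beta_test     = runif(6, 0.01, 0.10),
  beta_control  = runif(6, 0.03, 0.05)
)

# Unquoted, c(Year, Month) is just the numbers 2017, ..., 3, not column names
id_bad  <- c(Year, Month)
id_good <- c("Year", "Month")

# Stack the measure columns on top of each other, repeating the id columns
measure_cols <- setdiff(names(wide), id_good)
long <- data.frame(
  Year     = rep(wide$Year, times = length(measure_cols)),
  Month    = rep(wide$Month, times = length(measure_cols)),
  variable = rep(measure_cols, each = nrow(wide)),
  value    = unlist(wide[measure_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Split "alpha_test" into metric = "alpha" and theSource = "test"
parts <- strsplit(long$variable, "_", fixed = TRUE)
long$metric    <- vapply(parts, `[`, character(1), 1)
long$theSource <- vapply(parts, `[`, character(1), 2)
head(long)
```

Splitting the column name on the underscore gives the same metric and theSource columns that the grepl() calls above derive.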

echarts4r moving the plot down to make room for the legend

Is there a way to move the plot down so that there is some space between the legend and the plot area? Ideally, the chart area would automatically be placed below the legend.
df <- data.frame(
x = seq(50),
y = rnorm(50, 10, 3),
z = rnorm(50, 11, 2),
w = rnorm(50, 9, 2)
)
df %>%
  e_charts(x) %>%
  e_line(w) %>%
  e_line(y) %>%
  e_line(z) %>%
  e_legend(orient = 'vertical', left = 0, top = 0)
Use the e_grid function to adjust the "grid" on which the graph is plotted.
library(echarts4r)
df <- data.frame(
x = seq(50),
y = rnorm(50, 10, 3),
z = rnorm(50, 11, 2),
w = rnorm(50, 9, 2)
)
df %>%
  e_charts(x) %>%
  e_line(w) %>%
  e_line(y) %>%
  e_line(z) %>%
  e_legend(
    orient = 'vertical',
    left = 0,
    top = 0,
    selectedMode = "single" # might be of use
  ) %>%
  e_grid(left = 100, top = 5)
Plenty more options in the grid can be found here
