I'm trying to plot chart. I filtered the original dataset Datengf to get the median income for each year (MULTYEAR) and each value of the variable Schulbildung. Now chart looks like this: chart. I want to plot chart using ggplot and geom_line, with MULTYEAR on the x-axis and medianincome on the y-axis, and with a separate line and color for each value of Schulbildung.
Chart code:
chart <- Datengf %>%
  filter(SEX == 1) %>%
  group_by(MULTYEAR, Schulbildung) %>%
  summarise(medianincome = median(INCWAGE)) %>%
  ungroup() %>%
  mutate(Schulbildung = ifelse(Schulbildung < 12, "others", Schulbildung)) %>%
  group_by(Schulbildung, MULTYEAR) %>%
  summarise(medianincome = sum(medianincome))
I tried using
chartplot <- chart %>%
  ggplot(aes(x = MULTYEAR, y = medianincome)) +
  geom_line()
but the chart is a complete mess.
Specify color in the aes function:
chartplot <- chart %>%
  ggplot(aes(x = MULTYEAR, y = medianincome, color = Schulbildung)) +
  geom_line()
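Without a color or group mapping, geom_line() treats all rows as one series and joins every education level at each year into a single zig-zag, which is why the first attempt looks messy. Here is a minimal sketch of the fix on made-up numbers (chart_demo and its values are hypothetical, not the asker's data). Wrapping the variable in factor() is an extra precaution in case Schulbildung is stored as a number, which would otherwise give a continuous color scale and still only one line:

```r
library(ggplot2)

# Hypothetical stand-in for `chart`: one median income per year and education level
chart_demo <- data.frame(
  MULTYEAR     = rep(2015:2018, times = 3),
  Schulbildung = rep(c("others", "12", "16"), each = 4),
  medianincome = c(20000, 21000, 21500, 22000,
                   30000, 31000, 32500, 33000,
                   45000, 46000, 48000, 50000)
)

# color gives each level its own color, group gives each level its own line;
# factor() keeps the color scale discrete even if the column is numeric
chartplot_demo <- ggplot(chart_demo,
                         aes(x = MULTYEAR, y = medianincome,
                             color = factor(Schulbildung),
                             group = Schulbildung)) +
  geom_line() +
  labs(color = "Schulbildung")

chartplot_demo
```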
Related
I have the following bar plot created with ggplot in R. How do I set the distances between the bars dynamically from the 'distance' column of the same data frame?
library(tidyverse)
data.frame(name = c("A","B","C","D","E"),
value = c(34,45,23,45,75),
distance = c(3,4,1,2,5)) %>%
ggplot(aes(x = name, y = value)) +
geom_col()
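One possible approach, sketched under the assumption that each 'distance' value is the gap between the centre of a bar and the centre of the previous one (the question does not say this explicitly): put the bars on a continuous x-axis at cumulative positions and relabel the ticks with the names. The names bars and xpos are made up for this illustration.

```r
library(ggplot2)

bars <- data.frame(name = c("A", "B", "C", "D", "E"),
                   value = c(34, 45, 23, 45, 75),
                   distance = c(3, 4, 1, 2, 5))

# Assumption: distance = spacing from the previous bar, so the running
# total gives each bar's x position (3, 7, 8, 10, 15)
bars$xpos <- cumsum(bars$distance)

p <- ggplot(bars, aes(x = xpos, y = value)) +
  geom_col(width = 0.8) +                      # width < smallest gap, so bars do not overlap
  scale_x_continuous(breaks = bars$xpos,       # one tick per bar ...
                     labels = bars$name) +     # ... labelled with the bar's name
  labs(x = "name")

p
```

Changing the values in 'distance' and rebuilding the plot moves the bars accordingly.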
I am trying to make a scatterplot with plotly for R where the dots are connected in order using geom_path.
Now I would like to add a rangeslider where the user can select a date range.
Here is something similar using Python: Youtube Code
...but I don't need any recalculation of means or something like that, I just want to filter based on column i.
Unfortunately, I am having trouble doing that in R plotly.
This is my attempt, but I don't know how to tell plotly to subset the data using the i column:
library(tidyverse)
library(plotly)
p <- mtcars %>%
mutate(i = 1:nrow(mtcars)) %>%
ggplot(aes(x = mpg, y = wt))+
geom_path(size = 0.5) +
geom_point(aes(color = i), size = 3)
ggplotly(p) %>%
layout(
xaxis = list(rangeslider = list())
)
My data consists of a date variable and four numeric variables. I want to plot two of the numeric variables as a stacked bar chart and the other two as line charts.
Is it possible to combine two line charts and a stacked bar chart in a single plot using ggplot?
My data looks as follows:
data <- tibble(Month = 1:12,Brands = c(1,1,1,1,1,1,2,2,2,2,2,2),Generics = Brands + 1,Metric1 = c(5,5,5,5,5,5,6,6,7,8,9,10),Metric2 = c(10,10,11,11,12,13,14,15,16,17,18,19))
I want months on the x-axis, Brands and Generics as stacked bars, and Metric1 and Metric2 as lines, all on the same chart if possible.
Something like this?
library(tidyverse)
data <- tibble(Month = 1:12,Brands = c(1,1,1,1,1,1,2,2,2,2,2,2),Generics = Brands + 1,Metric1 = c(5,5,5,5,5,5,6,6,7,8,9,10),Metric2 = c(10,10,11,11,12,13,14,15,16,17,18,19))
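# Pivot the bar and line columns separately: chaining both pivots on one
# data frame repeats every bar row once per metric and doubles the bar heights
bar_data <- data %>%
  pivot_longer(cols = c(Brands, Generics))
line_data <- data %>%
  pivot_longer(cols = c(Metric1, Metric2),
               names_to = "metric_name", values_to = "metric_value")
ggplot(mapping = aes(Month)) +
  geom_col(data = bar_data, aes(y = value, fill = name)) +
  geom_line(data = line_data, aes(y = metric_value, col = metric_name), size = 1.25) +
  scale_x_continuous(breaks = 1:12) +
  scale_color_manual(values = c("black", "purple"))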
I need to convert a simple ggplot, a stacked bar with uncertainty, into a widget.
The data:
world.tot <- data.frame('country'='world', 'GHG'=c('CH4', 'CO2','N2O'),
'emi'=c(6e+6, 3e+6, 1e+6),
'unc.min'=8561406, 'unc.max'=14027350)
and the ggplot:
p2 <- ggplot(world.tot) +
geom_bar(aes(x=country,y=emi,fill=GHG), stat='identity', position='stack' ) +
geom_errorbar(aes(x=country, ymin=unc.min, ymax=unc.max), width=0.2) +
theme(axis.title.x=element_blank(), axis.title.y=element_blank()) +
theme(legend.position='none')
When I try: ggplotly(p2) only the stacked bars are converted, not the error bar. Any advice?
Alternatively, I could use plot_ly to create the plot, but cannot manage to add the error bar:
plot_ly(world.tot, x=~country, y=~emi, color=~GHG, type='bar',
error_y=~list(array=c(unc.min, unc.max))) %>%
layout(barmode='stack')
This puts error bars on every segment of the stacked bar, while I need only one error bar at the top of the stack.
Any help is appreciated
You can prepare a data.frame that has only one error size per group
library(dplyr)
world.err <- world.tot %>%
group_by(country) %>%
summarise(emi = sum(emi), unc.min = 8561406, unc.max = 14027350)
And plot the errors as a separate trace
plot_ly(world.tot) %>%
add_bars(x = ~country, y = ~emi, color = ~GHG, type='bar') %>%
add_trace(x = ~country, y = ~emi, data = world.err,
showlegend = F, mode='none', type='scatter',
error_y = ~list(array = c(unc.min, unc.max), color = '#000000')) %>%
layout(barmode='stack')
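Note that plotly reads error_y$array as the length of the error bar, not as its end point. If unc.min and unc.max are meant as the absolute lower and upper bounds of the total, a variant with asymmetric bars may be closer to the intent. This is a sketch under that assumption, and it also assumes the third emission value is 1e+6; the columns total, err.up and err.down are names introduced here for illustration.

```r
library(dplyr)
library(plotly)

# Data redefined here so the sketch is self-contained (third value assumed to be 1e+6)
world.tot <- data.frame(country = "world", GHG = c("CH4", "CO2", "N2O"),
                        emi = c(6e+6, 3e+6, 1e+6),
                        unc.min = 8561406, unc.max = 14027350)

# One row per country: stack total plus the bar lengths above and below it
world.err <- world.tot %>%
  group_by(country) %>%
  summarise(total = sum(emi),
            unc.min = first(unc.min),
            unc.max = first(unc.max)) %>%
  mutate(err.up = unc.max - total,     # length of the upper whisker
         err.down = total - unc.min)   # length of the lower whisker

fig <- plot_ly(world.tot) %>%
  add_bars(x = ~country, y = ~emi, color = ~GHG) %>%
  add_trace(data = world.err, x = ~country, y = ~total,
            type = "scatter", mode = "none", showlegend = FALSE,
            error_y = ~list(type = "data", symmetric = FALSE,
                            array = err.up, arrayminus = err.down,
                            color = "#000000")) %>%
  layout(barmode = "stack")

fig
```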
I want to draw a boxplot using only summary statistics (INPUT holds the summary statistics for each ID).
plot1 is what I want, but when I convert it to a plotly object something goes wrong (i.e., plotly flips the boxplot).
However, if I draw the boxplot the usual way (without stat = "identity"), everything works fine.
Question: Why does plotly "flip" a summarised ggplot2 boxplot, and how can I avoid this?
library(broom)
library(plotly)
library(tidyverse)
# Generate random data
# Calculate statistics
INPUT <- rnorm(100) %>%
matrix(10) %>%
apply(2, function(x) tidy(summary(x))) %>%
bind_rows() %>%
mutate(ID = letters[1:10])
# Plot boxplot using statistics
plot1 <- ggplot(INPUT, aes(ID)) +
geom_boxplot(stat = "identity", aes(
lower = q1,
upper = q3,
middle = median,
ymin = minimum,
ymax = maximum))
# Only ggplot2 produces the right result
plot1; ggplotly(plot1)
# Plot boxplot usual way
plot2 <- INPUT %>%
gather(variable, value, -ID) %>%
ggplot(aes(ID, value)) +
geom_boxplot()
# Both ggplot2 and plotly produce the right result
plot2; ggplotly(plot2)