How to improve this graph with multiple lines in ggplot2? - r

Example dataframe: datafame.RData
I would like to create the chart below so that it updates automatically. For example, when the variable is changed, the averages should be recalculated and the graph redrawn. Note that some countries contain NA values (for example, country 'n').
Here is an example of the expected chart as ggplot2 output.
What I managed to do in R was this:
mydata %>%
  dplyr::filter(Region %in% 'World median') %>%
  dplyr::select(year, value) %>%
  ggplot() +
  aes(year, value, group = 1, color = "World median") +
  geom_line() +
  geom_line(data = mydata %>%
              dplyr::filter(Country %in% 'Canada') %>%
              dplyr::select(year, value),
            aes(year, value, group = 1, color = "Canada")) +
  geom_line(data = mydata %>%
              dplyr::filter(Country %in% 'Brazil') %>%
              dplyr::select(year, value),
            aes(year, value, group = 1, color = "Brazil"))
The result is shown below. If you have any suggestions on how to do this better with ggplot2, I would appreciate them.
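One possible simplification, as a sketch only: instead of adding one geom_line() per country, filter all the series of interest into a single data frame, label each row with a series name, and map that name to colour. The data frame below is a hypothetical stand-in for mydata (the original file is not reproduced here); only the column names Region, Country, year and value are taken from the code above, and the series column is an illustrative addition.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `mydata`; values are invented for illustration.
mydata <- data.frame(
  Country = rep(c("Canada", "Brazil", NA), each = 3),
  Region  = rep(c("North America", "South America", "World median"), each = 3),
  year    = rep(2000:2002, times = 3),
  value   = c(1.0, 1.2, 1.1, 0.8, NA, 0.9, 0.9, 1.0, 1.0),
  stringsAsFactors = FALSE
)

# Keep the world median plus the chosen countries, and give every row
# a single label (`series`) that ggplot can map to colour.
countries <- c("Canada", "Brazil")
plot_data <- mydata %>%
  filter(Region == "World median" | Country %in% countries) %>%
  mutate(series = if_else(Region == "World median", "World median", Country))

# One geom_line() draws all series; the legend is built from `series`.
p <- ggplot(plot_data, aes(year, value, colour = series, group = series)) +
  geom_line()
```

With this layout, changing the countries vector is enough to change which lines are drawn. Rows with NA in value simply leave a gap in the corresponding line.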

Related

How to make animation of items for a specific subject's responses over time?

Data
Here is the simulated data for my question. It consists of subjects, items (stimuli), and a T/F response to each item:
#### Load Tidyverse ####
library(tidyverse)
library(gganimate)
#### Create Tibble ####
set.seed(123)
subject <- factor(rep(1:5,100))
score <- rbinom(n=500,
size=1,
prob=.5)
tib <- tibble(subject,
score) %>%
group_by(subject) %>%
mutate(item = row_number())
tib
Problem
I'm trying to figure out how to animate the responses of either a single subject or many subjects over time. If I plot the change over time in this way:
#### Plot Change Over Items ####
tib %>%
ggplot(aes(x=item,
y=score,
color=subject))+
geom_point()+
geom_smooth(se=F)
I can at least see generally speaking where the trends lie. However, I would like to have something animated which shows the progression of responses as they happen. I tried using gganimate, but it wouldn't use geom_smooth and the points alone are lacking a lot of useful information:
#### Plot Change Over Items ####
tib %>%
ggplot(aes(x=item,
y=score,
color=subject))+
geom_point()+
transition_manual(item)
I tried a cumulative sum plot as well:
#### Plot Cumulative Sum ####
tib %>%
mutate(cum_score = cumsum(score)) %>%
ggplot(aes(x=item,
y=cum_score,
color=subject))+
geom_line()
But animating it still comes out poor:
#### Plot Cumulative Sum ####
tib %>%
mutate(cum_score = cumsum(score)) %>%
ggplot(aes(x=item,
y=cum_score,
color=subject))+
geom_line()+
transition_manual(cum_score)
Am I messing up the arguments here? Is there a better alternative?
I figured it out. I was trying to figure out how to use the cumulative argument and I realized it was a logical argument:
#### Plot Cumulative Sum ####
tib %>%
mutate(cum_score = cumsum(score)) %>%
ggplot(aes(x=item,
y=cum_score,
color=subject))+
geom_line()+
geom_point()+
transition_manual(cumulative = TRUE,
frames = cum_score)
Which gives me a nice gif:
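As a possible alternative (not what the answer above used), gganimate also provides transition_reveal(), which draws a line progressively along a numeric variable. The sketch below rebuilds the simulated data from the question and reveals each subject's cumulative score along item. It only constructs the animation object; rendering is left commented out because it needs a renderer to be installed.

```r
library(dplyr)
library(ggplot2)
library(gganimate)

# Same simulated data as in the question, with the cumulative score
# computed per subject before ungrouping.
set.seed(123)
tib <- tibble(subject = factor(rep(1:5, 100)),
              score = rbinom(n = 500, size = 1, prob = 0.5)) %>%
  group_by(subject) %>%
  mutate(item = row_number(),
         cum_score = cumsum(score)) %>%
  ungroup()

# Reveal the lines item by item; `item` is the numeric variable that
# drives the animation.
anim <- ggplot(tib, aes(x = item, y = cum_score, colour = subject)) +
  geom_line() +
  geom_point() +
  transition_reveal(item)

# animate(anim)  # uncomment to render (requires a renderer, e.g. gifski)
```

Here the frames follow item rather than cum_score, so every frame corresponds to one more response per subject.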

Making ggplot geom_boxplot

Boxplot in ggplot
df %>%
mutate(Bezettingsgraad = Bezetting_gem / Capaciteit *100 ) %>%
group_by(Stadion)
Code for the boxplot
df %>%
mutate(Bezettingsgraad = Bezetting_gem / Capaciteit *100 ) %>%
group_by(Provincie) %>%
ggplot(Provincie, aes(x=Provincie, y=Bezetting_gem, color=dose)) +
geom_boxplot()
In the image you see in yellow the rows that are being used
Error
The first argument of ggplot() is the data, but you have put the variable Provincie there, before the aesthetic mapping. You are also already piping your data into the ggplot() call via the %>% operator, so no data argument is needed.
Try deleting Provincie from the ggplot() call.
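A minimal sketch of the corrected call, under some assumptions: the data frame below is hypothetical (only the column names come from the question), the computed Bezettingsgraad is assumed to be the value of interest for the y axis, and colour is mapped to Provincie because a dose column does not appear among the columns used in the question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stadium data; values are invented for illustration.
df <- data.frame(
  Stadion       = paste("Stadion", 1:6),
  Provincie     = rep(c("Utrecht", "Limburg"), each = 3),
  Bezetting_gem = c(18000, 12000, 9000, 15000, 7000, 11000),
  Capaciteit    = c(20000, 15000, 10000, 20000, 10000, 12000)
)

# Occupancy rate in percent, as in the question.
plot_df <- df %>%
  mutate(Bezettingsgraad = Bezetting_gem / Capaciteit * 100)

# The data arrives through the pipe, so ggplot() only gets the mapping.
p <- plot_df %>%
  ggplot(aes(x = Provincie, y = Bezettingsgraad, colour = Provincie)) +
  geom_boxplot()
```

The group_by() step is left out because geom_boxplot() already groups by the discrete x variable.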

Boxplots of four variables in the same plot

I would like to make four boxplots side-by-side using ggplot2, but I am struggling to find an explanation that suits my purposes.
I am using the well-known Iris dataset, and I simply want to make a chart that has boxplots of the values for sepal.length, sepal.width, petal.length, and petal.width all next to one another. These are all numerical values.
I feel like this should be really straightforward but I am struggling to figure this one out.
Any help would be appreciated.
Try this. The approach is to select the numeric variables and reshape them to long format with tidyverse functions, which gives the structure needed for the plot. You can use facet_wrap() to get one panel per variable, or leave it out to draw everything in a single plot. Here is the code (two options):
library(tidyverse)
#Data
data("iris")
#Code
iris %>% select(-Species) %>%
pivot_longer(everything()) %>%
ggplot(aes(x=name,y=value,fill=name))+
geom_boxplot()+
facet_wrap(.~name,scales='free')
Output:
Or if you want all the data in one plot, you can avoid the facet_wrap() and use this:
#Code 2
iris %>% select(-Species) %>%
pivot_longer(everything()) %>%
ggplot(aes(x=name,y=value,fill=name))+
geom_boxplot()
Output:
This is a one-liner using reshape2::melt
ggplot(reshape2::melt(iris), aes(variable, value, fill = variable)) + geom_boxplot()
In base R, it can be done more easily in a one-liner
boxplot(iris[-5])
Or using ggboxplot from ggpubr
library(ggpubr)
library(dplyr)
library(tidyr)
iris %>%
select(-Species) %>%
pivot_longer(everything()) %>%
ggboxplot(x = 'name', fill = "name", y = 'value',
palette = c("#00AFBB", "#E7B800", "#FC4E07", "#00FABA"))

How to create ggplot with facet_grid() for each column of a matrix

I want to create a plot in R with ggplot() to visualise the data included in variable matrix that looks like this:
matrix <- matrix(c(time =c(1,2,3,4,5),v1=rnorm(5),v2=c(NA,1,0.5,0,0.1)),nrow=5)
colnames(matrix) <- c("time","v1","v2")
df <-data.frame(
time=rep(matrix[,1],2),
values=c(matrix[,2],matrix[,3]),
names=rep(c("v1","v2"), each=length(matrix[,1]))
)
ggplot(df, aes(x=time,y=values,color=names)) +
geom_point()+
facet_grid(names~.)
Is there a faster way than transforming the data into a data.frame like I do? This way seems very laborious.
I would appreciate any help. Thanks in advance.
A tidyverse approach:
This will produce the data structure you need to use in ggplot
library(tidyverse)
matrix %>%
as_data_frame() %>%
gather(., names, value, -time)
This will generate data structure and plot all at once
matrix %>%
as_data_frame() %>%
gather(., names, value, -time) %>%
ggplot(., aes(x=time,y=value,color=names)) +
geom_point()+
facet_grid(names~.)
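In current versions of the tidyverse, as_data_frame() is deprecated and gather() has been superseded by pivot_longer(). A sketch of the same idea with the newer functions follows. The matrix is named m here (an illustrative choice, to avoid masking the base function matrix()), and the seed is only there to make the example reproducible.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same structure as the matrix in the question.
set.seed(1)
m <- cbind(time = 1:5, v1 = rnorm(5), v2 = c(NA, 1, 0.5, 0, 0.1))

# Reshape every column except `time` into name/value pairs.
long <- as.data.frame(m) %>%
  pivot_longer(-time, names_to = "names", values_to = "value")

# One facet row per original matrix column.
p <- ggplot(long, aes(x = time, y = value, colour = names)) +
  geom_point() +
  facet_grid(names ~ .)
```

The NA in v2 is kept in the long data and is dropped with a warning when the points are drawn.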

Plotting Average/Median of each column in data frame grouped by factors

I am trying to make a grouped barplot and I am running into trouble. For example, using the mtcars dataset, I want to group everything by the 'vs' column (col #8), find the average of all remaining columns, and then plot them by group.
Below is a very poor example of what I am trying to do and I know it is incorrect.
Ideally, mpg for vs=1 & vs=0 would be side by side, followed by cyl's means side by side, etc. I don't care if aggregate is skipped for dplyr, if ggplot is used, or even if the aggregate step is not needed. I am just looking for a way to do this, since it is driving me crazy.
df = mtcars
agg = aggregate(df[,-8], by=list(df$vs), FUN=mean)
agg
barplot(t(agg), beside=TRUE, col=df$vs)
Try
library(ggplot2)
library(dplyr)
library(tidyr)
df %>%
group_by(vs=factor(vs)) %>%
summarise_each(funs(mean)) %>%
gather(Var, Val, -vs) %>%
ggplot(., aes(x=Var, y=Val, fill=vs))+
geom_bar(stat='identity', position='dodge')
Or using base R
m1 <- as.matrix(agg[-1])
row.names(m1) <- agg[,1]
barplot(m1, beside=TRUE, col=c('red', 'blue'), legend=row.names(m1))
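The dplyr pipeline in the first option relies on summarise_each() and funs(), which are deprecated in current dplyr releases. A sketch of an equivalent pipeline with across() and pivot_longer() is given below; the names means_long, Var and Val are illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Mean of every remaining column within each level of `vs`,
# then reshaped so each variable/group pair is one row.
means_long <- mtcars %>%
  group_by(vs = factor(vs)) %>%
  summarise(across(everything(), mean), .groups = "drop") %>%
  pivot_longer(-vs, names_to = "Var", values_to = "Val")

# Dodged bars put vs = 0 and vs = 1 side by side for each variable.
p <- ggplot(means_long, aes(x = Var, y = Val, fill = vs)) +
  geom_col(position = "dodge")
```

Because the variables are on very different scales (for example disp versus wt), adding facet_wrap(~ Var, scales = "free") may make the small bars easier to read.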

Resources