How to apply a code for multiple columns? [duplicate] - r

This question already has answers here:
Plotting two variables as lines using ggplot2 on the same graph
(5 answers)
Closed 8 months ago.
I am new to R and have the following example code that I wish to apply to every column in my data.
data(economics, package="ggplot2")
economics$index <- 1:nrow(economics)
loessMod10 <- loess(uempmed ~ index, data=economics, span=0.10)
smoothed10 <- predict(loessMod10)
plot(economics$uempmed, x=economics$date, type="l", main="Loess Smoothing and Prediction", xlab="Date", ylab="Unemployment (Median)")
lines(smoothed10, x=economics$date, col="red")
Could someone please suggest how this would be possible?
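One direct way to generalise the code above, offered as a minimal sketch rather than the accepted approach, is to loop over the column names and fit one loess model per column. The names `cols` and `smoothed` and the 2-by-3 panel layout are choices made for this illustration only.

```r
data(economics, package = "ggplot2")
economics$index <- seq_len(nrow(economics))

# Columns to smooth: everything except the date and the helper index.
cols <- setdiff(names(economics), c("date", "index"))

# Fit one loess model per column and keep the fitted values.
smoothed <- lapply(cols, function(col) {
  fit <- loess(reformulate("index", response = col), data = economics, span = 0.10)
  predict(fit)
})
names(smoothed) <- cols

# Draw one base-graphics panel per column (layout is an assumption).
op <- par(mfrow = c(2, 3))
for (col in cols) {
  plot(economics$date, economics[[col]], type = "l",
       main = col, xlab = "Date", ylab = col)
  lines(economics$date, smoothed[[col]], col = "red")
}
par(op)
```

`reformulate()` builds the formula `col ~ index` from strings, which avoids pasting and parsing formula text by hand.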

It's possible to perform loess smoothing within ggplot.
library(data.table)
library(ggplot2)
df <- economics
# reshape to long format: one row per (date, KPI) pair
gg.melt <- setDT(df) |> melt(id='date', variable.name = 'KPI')
ggplot(gg.melt, aes(x=date, y=value))+
geom_line()+
stat_smooth(method=loess, color='red', size=0.5, se=FALSE, method.args = list(span=0.1))+
facet_wrap(~KPI, scales = 'free_y')
Regarding combining everything on one plot, I'm not seeing how you would do that, as the y-scales are so different. If the point is to see how the peaks line up, etc., you could do this:
ggplot(gg.melt, aes(x=date, y=value))+
geom_line()+
stat_smooth(method=loess, color='red', size=0.5, se=FALSE, method.args = list(span=0.1))+
facet_grid(KPI~., scales = 'free_y')
There is also the dygraphs package which allows creation of dynamic graphics that can be saved to html:
gg.melt[, scaled:=scale(value, center = FALSE, scale=diff(range(value))), by=.(KPI)]
gg.melt[, pred:=predict(loess(scaled~as.integer(date), .SD, span=0.1)), by=.(KPI)]
gg.dt <- dcast(gg.melt, date~KPI, value.var = list('scaled', 'pred'))
library(dygraphs)
dygraph(gg.dt) |>
dyCrosshair(direction = 'vertical') |>
dyRangeSelector()
It's possible to create a dygraph(...) version of the second plot, where the different KPI are in different facets, but you have to use RMarkdown for that.

You can reshape your data from wide to long format, keeping date as the id column, and then use facet_wrap. Maybe you want something like this:
library(ggplot2)
library(reshape2)
library(dplyr)
economics %>%
melt(., "date") %>%
ggplot(., aes(date, value)) +
geom_line() +
facet_wrap(~variable, scales = "free")
Comment: All plots in one graph
If you mean all plots in one graph, you can give the variables a color like this:
economics %>%
melt(., "date") %>%
ggplot(., aes(date, value, color = variable)) +
geom_line() +
scale_y_log10()
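As an alternative to the log scale, and only as a sketch under my own assumptions, each variable can be rescaled to the 0 to 1 range within its own group before plotting everything in one panel. This swaps reshape2::melt for tidyr::pivot_longer; the names `economics_long` and `rescaled` are made up for this example.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Wide to long: one row per (date, variable) pair.
economics_long <- economics %>%
  pivot_longer(cols = -date, names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  # Min-max rescaling within each variable, so all series share a 0-1 axis.
  mutate(rescaled = (value - min(value)) / (max(value) - min(value))) %>%
  ungroup()

p <- ggplot(economics_long, aes(date, rescaled, color = variable)) +
  geom_line()
print(p)
```

The rescaled axis no longer shows the original units, so this is only useful for comparing the shape and timing of the series.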

Related

Combine scale_x_upset with scale_y_break

I made an upset plot using the ggupset package and added a break to the y axis with scale_y_break from the ggbreak package.
However, when I add scale_y_break, the combination matrix under the bar plot disappears.
Is there a way to combine the combination matrix of the plot made without scale_y_break with the bar plot portion of a plot made with scale_y_break? I can't seem to be able to access the grobs of these plots or use any other workaround. If anyone could help, I would greatly appreciate it!
Example with scale_x_upset and scale_y_break:
df = tidy_movies %>% distinct(title, year, length, .keep_all=TRUE)
ggplot(df, aes(x=Genres)) + geom_bar() + scale_x_upset(n_intersections = 20)+ scale_y_break(breaks = c(750,1000))
I would like to combine the barplot portion of the plot created with:
df = tidy_movies %>% distinct(title, year, length, .keep_all=TRUE)
ggplot(df, aes(x=Genres)) + geom_bar() + scale_x_upset(n_intersections = 20)+ scale_y_break(breaks = c(750,1000))
with the combination matrix portion of the plot made with:
df = tidy_movies %>% distinct(title, year, length, .keep_all=TRUE)
ggplot(df, aes(x=Genres)) + geom_bar() + scale_x_upset(n_intersections = 20)
Thanks!

Is there a way to get multi-paneled plots in ggplot with different color axes for each plot? [duplicate]

Say I make a ggplot2 plot like the following with several facets:
ggplot(iris) +
geom_tile(aes(x = Petal.Width, fill = Sepal.Width, y = Petal.Length)) +
facet_wrap(~Species)
Note that there is one colourbar for all three plots, but each facet could potentially have very different values. Is it possible to have a separate colourbar for each facet?
I agree with Alex's answer, but against my better scientific and design judgment, I took a stab at it.
require(gridExtra)
require(dplyr)
iris %>% group_by(Species) %>%
do(gg = {ggplot(., aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
geom_tile() + facet_grid(~Species) +
guides(fill = guide_colourbar(title.position = "top")) +
theme(legend.position = "top")}) %>%
.$gg %>% arrangeGrob(grobs = ., nrow = 1) %>% grid.arrange()
Of course, then you're duplicating lots of labels, which is annoying. Additionally, you lose the x and y scale information by plotting each species as a separate plot, instead of facets of a single plot. You could fix the axes by adding ... + coord_cartesian(xlim = range(iris$Petal.Width), ylim = range(iris$Petal.Length)) + ... within that ggplot call.
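To make the axis fix concrete, here is a minimal sketch of the same idea written with lapply instead of do(), with coord_cartesian applied to every per-species plot. The names `xlim_all`, `ylim_all`, and `plots` are introduced here for illustration only.

```r
library(ggplot2)
library(gridExtra)

# Shared axis limits computed from the full data set.
xlim_all <- range(iris$Petal.Width)
ylim_all <- range(iris$Petal.Length)

# One plot per species, each with its own colourbar but common x/y limits.
plots <- lapply(levels(iris$Species), function(sp) {
  d <- iris[iris$Species == sp, ]
  ggplot(d, aes(Petal.Width, Petal.Length, fill = Sepal.Width)) +
    geom_tile() +
    facet_grid(~Species) +
    coord_cartesian(xlim = xlim_all, ylim = ylim_all) +
    guides(fill = guide_colourbar(title.position = "top")) +
    theme(legend.position = "top")
})

# Arrange the separate plots side by side.
grid.arrange(grobs = plots, nrow = 1)
```

Because the limits come from the whole data set, positions stay comparable across panels even though each panel is a separate plot.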
To be honest, the only way this makes sense at all is if it's comparing two different variables for the fill, which is why you don't care about comparing their true value between plots. A good alternative would be rescaling them to percentiles within a facet using dplyr::group_by() and dplyr::percent_rank.
Edited to update:
In the two-different-variables case, you have to first "melt" the data, which I assume you've already done. Here I'm repeating it with the iris data. Then you can look at the relative values by examining the percentiles, rather than the absolute values of the two variables.
iris %>%
tidyr::gather(key = Sepal.measurement,
value = value,
Sepal.Length, Sepal.Width) %>%
group_by(Sepal.measurement) %>%
mutate(percentilevalue = percent_rank(value)) %>%
ggplot(aes(Petal.Length, Petal.Width)) +
geom_tile(aes(fill = percentilevalue)) +
facet_grid(Sepal.measurement ~ Species) +
scale_fill_continuous(limits = c(0,1), labels = scales::percent)
This has been asked before; see "Separate palettes for facets in ggplot facet_grid". That is the best solution I have seen so far; however, I think having a common palette is better from a visualization standpoint.
If this is what you want then there is a simple hack to it.
tf1 <- iris
tf1$COL <- rep(1:50, each=3)
ggplot(tf1) +
geom_tile(aes(x = Petal.Width, fill = interaction(Petal.Length,COL), y = Petal.Length)) +
facet_wrap(~Species, scales = "free") + theme(legend.position="none")

ggplot for each column in a data

I am missing some basics in R.
How do I make a plot for each column in a data frame?
I have tried making plots for each column separately. I was wondering if there is an easier way?
library(dplyr)
library(ggplot2)
data(economics)
#scatter plots
ggplot(economics,aes(x=pop,y=pce))+
geom_point()
ggplot(economics,aes(x=pop,y=psavert))+
geom_point()
ggplot(economics,aes(x=pop,y=uempmed))+
geom_point()
ggplot(economics,aes(x=pop,y=unemploy))+
geom_point()
#boxplots
ggplot(economics,aes(y=pce))+
geom_boxplot()
ggplot(economics,aes(y=pop))+
geom_boxplot()
ggplot(economics,aes(y=psavert))+
geom_boxplot()
ggplot(economics,aes(y=uempmed))+
geom_boxplot()
ggplot(economics,aes(y=unemploy))+
geom_boxplot()
All I'm looking for is one 2*2 grid of box plots and one 2*2 grid of scatter plots with ggplot2. I understand there is facet_grid, but I have failed to understand how to implement it. (I believe this can be achieved easily with par(mfrow()) and base R plots.) I also saw a suggestion somewhere else about widening the data, which I didn't understand.
In cases like this the solution is almost always to reshape the data from wide to long format.
economics %>%
select(-date) %>%
tidyr::gather(variable, value, -pop) %>%
ggplot(aes(x = pop, y = value)) +
geom_point(size = 0.5) +
facet_wrap(~ variable, scales = "free_y")
economics %>%
tidyr::gather(variable, value, -date) %>%
ggplot(aes(y = value)) +
geom_boxplot() +
facet_wrap(~ variable, scales = "free_y")
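If separate plot objects are wanted instead of facets, one possible sketch is to loop over the column names and refer to each column through the `.data` pronoun inside aes(). The names `y_cols` and `scatter_plots` are hypothetical and exist only for this example.

```r
library(ggplot2)

# Columns to plot against pop (chosen to match the scatter plots in the question).
y_cols <- c("pce", "psavert", "uempmed", "unemploy")

# Build one scatter plot per column; .data[[col]] looks the column up by name.
scatter_plots <- lapply(y_cols, function(col) {
  ggplot(economics, aes(x = pop, y = .data[[col]])) +
    geom_point(size = 0.5) +
    labs(y = col)
})
names(scatter_plots) <- y_cols

# Show a single plot from the list.
print(scatter_plots[["pce"]])
```

The resulting list can be passed to a layout helper such as gridExtra::grid.arrange(grobs = scatter_plots, ncol = 2) to get a 2*2 arrangement.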

ggplot boxplot: position_dodge does not work

I have made a relatively simple boxplot with ggplot
ggplot(l8tc.df_17_18,aes(x=landcover,y= tcw_17, group=landcover))+
geom_boxplot()+
geom_boxplot(aes(y= tcw_18),position_dodge(1))
I want the different boxplots to be next to each other and not in one vertical line. I have looked through all related questions and tried out a couple of options, however I could not find a solution so far.
I am still a ggplot beginner though.
Any ideas?
In this case you should use a different data format: melt the data to long form.
require(reshape2)
require(tidyverse)
# format data
melted_data <- l8tc.df_17_18 %>%
select(landcover, tcw_17, tcw_18) %>%
melt('landcover', variable.name = 'tcw')
# plot
ggplot(melted_data, aes(x = as.factor(landcover), y = value)) + geom_boxplot(aes(fill = tcw))
A dodge should be automatic, but if you want to experiment, use geom_boxplot(aes(fill = tcw), position = position_dodge()).
https://ggplot2.tidyverse.org/reference/position_dodge.html
You can also write it as a single pipeline without creating an intermediate object:
l8tc.df_17_18 %>%
select(landcover, tcw_17, tcw_18) %>%
melt('landcover', variable.name = 'tcw') %>%
ggplot(aes(x = as.factor(landcover), y = value)) + geom_boxplot(aes(fill = tcw))
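Since l8tc.df_17_18 is not available, the following is a self-contained sketch of the same approach on made-up data. The data frame `toy`, its values, and the dodge width are assumptions for illustration; tidyr::pivot_longer is used here in place of melt.

```r
library(ggplot2)
library(tidyr)

# Made-up stand-in for the original data: three landcover classes, two years.
set.seed(1)
toy <- data.frame(
  landcover = rep(c(1, 2, 3), each = 20),
  tcw_17 = rnorm(60),
  tcw_18 = rnorm(60, mean = 0.5)
)

# Long format: the year column names become a single 'tcw' grouping variable.
toy_long <- pivot_longer(toy, cols = c(tcw_17, tcw_18),
                         names_to = "tcw", values_to = "value")

# Mapping fill to 'tcw' places the two years side by side within each class.
p <- ggplot(toy_long, aes(x = factor(landcover), y = value, fill = tcw)) +
  geom_boxplot(position = position_dodge(width = 0.9))
print(p)
```

The key point is that dodging acts on groups within a single layer, so both years must live in one column and one geom_boxplot call.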

