I am new to R and have run into a problem.
The dataset looks like this: stats.stock
I want to produce one graph per variable, with x = Date and y = the variable.
This is my code:
stat.variables <- names(stats.stock)
stat.variables <- stat.variables[-1]
for (i in stat.variables) {
png(filename =paste0("C:\\Users\\", i, "stock.jpg"), width=2400, height=1800, res=300)
print(ggplot(stats.stock,
mapping = aes_string(
x = "Date",
y = i)) + geom_point())
dev.off()
}
However, the results turn out like this:
[image: results]
I am fairly sure there is nothing wrong with my data. If I plot only one variable, the result looks fine:
[image: one]
How can I fix this? Thanks in advance!
Consider using aes() instead of aes_string(), which is deprecated, and use the .data pronoun to select the column by name:
for (i in stat.variables) {
png(filename =paste0("C:\\Users\\", i, "stock.png"),
width=2400, height=1800, res=300)
print(ggplot(stats.stock,
mapping = aes(
x = Date,
y = .data[[i]])) +
geom_point())
dev.off()
}
Using a small reproducible example
wd <- getwd()
data(iris)
iris$Date <- seq(Sys.Date(), length.out = nrow(iris), by = '1 day')
stat.variables <- names(iris)[1:4]
for (i in stat.variables) {
png(filename = file.path(wd, paste0(i, "_stock.png")),
width=2400, height=1800, res=300)
print(ggplot(iris,
mapping = aes(
x = Date,
y = .data[[i]])) +
geom_point())
dev.off()
}
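As a variation on the loop above, here is a minimal sketch that builds the plots into a named list with lapply() before anything is written to disk. The data frame stats.demo and its columns are made up for illustration; they only stand in for stats.stock.

```r
library(ggplot2)

# Hypothetical stand-in for stats.stock: a Date column plus numeric columns
stats.demo <- data.frame(
  Date  = as.Date("2020-01-01") + 0:9,
  open  = c(10, 11, 12, 11, 13, 14, 13, 15, 16, 15),
  close = c(11, 12, 11, 13, 14, 13, 15, 16, 15, 17)
)
stat.variables <- setdiff(names(stats.demo), "Date")

# One plot per column; .data[[v]] looks the column up by its string name
plots <- lapply(stat.variables, function(v) {
  ggplot(stats.demo, aes(x = Date, y = .data[[v]])) +
    geom_point() +
    labs(y = v)  # label the axis with the column name
})
names(plots) <- stat.variables
```

Each element of plots can then be printed between png() and dev.off() exactly as in the loop, which keeps plot construction separate from file output.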
I am making a shiny application where the user specifies the independent variables and shiny displays a time series plot with plotly, where hovering over each point shows the selected parameters.
If I know the exact number of variables that the user selects, I am able to construct the time series plot without a problem. Let's say there are 3 parameters chosen:
ggp <- ggplot(data = data.depend(), aes(x = Datum, y = y, tmp1 = .data[[input$Coockpit.Dependencies.Undependables[1]]], tmp2 = .data[[input$Coockpit.Dependencies.Undependables[2]]], tmp3 = .data[[input$Coockpit.Dependencies.Undependables[3]]])) +
geom_point()
ggplotly(ggp)
where data.depend() looks like
and the selected parameters are stored in a character vector
So the problem is that for each parameter I want to include in the tooltip, I have to hard-code it in the aes function as tmpi = .data[[input$Coockpit.Dependencies.Undependables[i]]]. I would, however, like to write a generic function that handles any number of selected parameters. Any comments or suggestions are welcome.
EDIT:
Below a minimal working example:
data.dummy <- data.frame(Charge = c(1,2,3,4,5), Datum = c(as.Date("2020-01-01"),as.Date("2020-01-02"),as.Date("2020-01-03"),as.Date("2020-01-04"),as.Date("2020-01-05")), y = c(4,5,6,4,5), ZuluftTemperatur = c(52,51,54,58,49), Durchflussgeschwindigkeit = c(690, 716,722,710,801), ZuluftFeuchtigkeit= c(3.9,4.1,3.8,3.0,4.9))
ChosenParams <- c("ZuluftTemperatur", "ZuluftFeuchtigkeit", "Durchflussgeschwindigkeit")
ggp <- ggplot(data = data.dummy, aes(x = Datum, y = y, tmp1 = .data[[ChosenParams[1]]], tmp2 = .data[[ChosenParams[2]]], tmp3 = .data[[ChosenParams[3]]])) + geom_point()
ggplotly(ggp)
Result:
So this works, at the "cost" of me knowing in advance that the user chooses three parameters, so that I can write tmpi = .data[[ChosenParams[i]]] in aes for i = 1:3. I am interested in a solution with the same result but where I don't have to write tmpi = .data[[ChosenParams[i]]] once per parameter.
Thank you!
One solution is to use eval(parse(...)) to create the code for you:
library(ggplot2)
library(plotly)
data.dummy <- data.frame(Charge = c(1,2,3,4,5), Datum = c(as.Date("2020-01-01"),as.Date("2020-01-02"),as.Date("2020-01-03"),as.Date("2020-01-04"),as.Date("2020-01-05")), y = c(4,5,6,4,5), ZuluftTemperatur = c(52,51,54,58,49), Durchflussgeschwindigkeit = c(690, 716,722,710,801), ZuluftFeuchtigkeit= c(3.9,4.1,3.8,3.0,4.9))
ChosenParams <- c("ZuluftTemperatur", "ZuluftFeuchtigkeit", "Durchflussgeschwindigkeit")
ggp <- eval(parse(text = paste0("ggplot(data = data.dummy, aes(x = Datum, y = y, ",
paste0("tmp", seq_along(ChosenParams), " = .data[[ChosenParams[", seq_along(ChosenParams), "]]]", collapse = ", "),
")) + geom_point()"
)
))
ggplotly(ggp)
Just note that this is not very efficient and in some cases it is not advised to use it (see What specifically are the dangers of eval(parse(...))?). There might also be a way to use quasiquotation in aes(), but I am not really familiar with it.
EDIT: Added a way to do it with quasiquotation.
I had a closer look at quasiquotation in aes() and found a nicer way to do it using syms() and !!!:
data.dummy <- data.frame(Charge = c(1,2,3,4,5), Datum = c(as.Date("2020-01-01"),as.Date("2020-01-02"),as.Date("2020-01-03"),as.Date("2020-01-04"),as.Date("2020-01-05")), y = c(4,5,6,4,5), ZuluftTemperatur = c(52,51,54,58,49), Durchflussgeschwindigkeit = c(690, 716,722,710,801), ZuluftFeuchtigkeit= c(3.9,4.1,3.8,3.0,4.9))
ChosenParams <- c("ZuluftTemperatur", "ZuluftFeuchtigkeit", "Durchflussgeschwindigkeit")
names(ChosenParams) <- paste0("tmp", seq_along(ChosenParams))
ChosenParams <- syms(ChosenParams)
ggp <- ggplot(data = data.dummy, aes(x = Datum, y = y, !!!ChosenParams)) + geom_point()
ggplotly(ggp)
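The same idea can be wrapped in a small function so that it accepts any number of selected columns. This is only a sketch: tooltip_aes is a hypothetical helper name, and it assumes the data has columns named Datum and y, as in the example above.

```r
library(ggplot2)
library(rlang)

# Hypothetical helper: build an aes() mapping with one extra "tmpN"
# aesthetic per selected column name
tooltip_aes <- function(params) {
  extra <- syms(params)                            # column names -> symbols
  names(extra) <- paste0("tmp", seq_along(extra))  # tmp1, tmp2, ...
  aes(x = Datum, y = y, !!!extra)                  # splice them into aes()
}

data.dummy <- data.frame(
  Datum = as.Date("2020-01-01") + 0:4,
  y = c(4, 5, 6, 4, 5),
  ZuluftTemperatur = c(52, 51, 54, 58, 49),
  ZuluftFeuchtigkeit = c(3.9, 4.1, 3.8, 3.0, 4.9)
)

mapping <- tooltip_aes(c("ZuluftTemperatur", "ZuluftFeuchtigkeit"))
ggp <- ggplot(data.dummy, mapping) + geom_point()
```

Passing ggp to ggplotly() should then show the extra aesthetics in the tooltip, as in the answer above. In a shiny app the character vector would come from the input instead of being typed in.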
I am trying to use lapply to loop through a list of vectors containing plot labels. The list contains vectors. The first element of the vector is the plot title, then the x axis label, and then the y axis label. I want to avoid writing any for loops. I can't figure out how to reference the. The code below is the closest I could come up with, but it just outputs NULL 3 times.
library(tidyverse)
plot.labels <- list(c("Title1","Xlab1","ylab1"),c("Title2","Xlab2","ylab1"),c("Title3","Xlab3","ylab1"))
plotter <- function(plotdata=d, xvar=cond,lab = NULL){
ggplot(data = plotdata, aes( x = xvar , y = scale(vowmeanf0) ))+
geom_point()+
labs(title = lab[1],
x = lab[2],
y = lab[3])
}
lapply(plot.labels, function(x){
for(df in 1:length(plot.labels)){
plotter(x[df])
}
} )
lapply should look more like this. You should never loop within lapply because lapply is essentially a wrapper for a for loop to begin with. lapply(plot.labels, function(x) plotter(plotdata = d, xvar = cond, lab = x)) – Mako212
That worked
library(tidyverse)
plot.labels <- list(c("Title1","Xlab1","Ylab1"),c("Title2","Xlab2","Ylab2"),c("Title3","Xlab3","Ylab3"))
plotter <- function(plotdata = NULL, xvar = NULL ,lab = NULL){
ggplot(data = plotdata, aes( x = .data[[xvar]] , y = scale(vowmeanf0) ))+
geom_point()+
labs(title = lab[1],
x = lab[2],
y = lab[3])
}
lapply(plot.labels, function(x) plotter(plotdata = d, xvar = "cond", lab = x))
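For reference, a self-contained sketch of the same pattern. The data frame d below is made up, since the original data is not shown; only the column names cond and vowmeanf0 are taken from the question, and the y column is plotted unscaled to keep the example short.

```r
library(ggplot2)

# Made-up data with the two columns the question refers to
d <- data.frame(
  cond = rep(c("A", "B", "C"), each = 4),
  vowmeanf0 = c(110, 120, 115, 118, 130, 128, 135, 132, 100, 105, 102, 108)
)

plot.labels <- list(c("Title1", "Xlab1", "Ylab1"),
                    c("Title2", "Xlab2", "Ylab2"),
                    c("Title3", "Xlab3", "Ylab3"))

# lab is a character vector: title, x axis label, y axis label
plotter <- function(plotdata, xvar, lab) {
  ggplot(plotdata, aes(x = .data[[xvar]], y = vowmeanf0)) +
    geom_point() +
    labs(title = lab[1], x = lab[2], y = lab[3])
}

# lapply() calls plotter once per label vector and collects the plots
plots <- lapply(plot.labels, function(x) plotter(d, "cond", x))
```

The result is a list of three plot objects. At the console they print automatically; inside a script or function each one needs an explicit print().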
I am writing a function that plots a heat map for users. In the following example, it plots the change of grade over time for each gender.
However, this is a special case. The "Gender" column may have another name, such as "Class".
I want to let users pass in their own column name and have ggplot put the right label on each axis.
How do I modify my function "heatmap()" to do this?
sampledata <- matrix(c(1:60,1:60,rep(0:1,each=60),sample(1:3,120,replace = T)),ncol=3)
colnames(sampledata) <- c("Time","Gender","Grade")
sampledata <- data.frame(sampledata)
heatmap=function(sampledata,Gender)
{
sampledata$Time <- factor(sampledata$Time)
sampledata$Grade <- factor(sampledata$Grade)
sampledata$Gender <- factor(sampledata$Gender)
color_palette <- colorRampPalette(c("#31a354","#2c7fb8", "#fcbfb8","#f03b20"))(length((levels(factor(sampledata$Grade)))))
ggplot(data = sampledata) + geom_tile( aes(x = Time, y = Gender, fill = Grade))+scale_x_discrete(breaks = c("10","20","30","40","50"))+scale_fill_manual(values =color_palette,labels=c("0-1","1-2","2-3","3-4","4-5","5-6",">6"))+ theme_bw()+scale_y_discrete(labels=c("Female","Male"))
}
The easiest solution is redefining the function using aes_string.
When the function is called, you need to pass it the name of the column
you want to use as a string.
heatmap=function(sampledata,y)
{
sampledata$Time <- factor(sampledata$Time)
sampledata$Grade <- factor(sampledata$Grade)
sampledata$new_var <- factor(sampledata[,y])
color_palette <- colorRampPalette(c("#31a354","#2c7fb8", "#fcbfb8","#f03b20"))(length((levels(factor(sampledata$Grade)))))
ggplot(data = sampledata) + geom_tile( aes_string(x = "Time", y = "new_var", fill = "Grade"))+scale_x_discrete(breaks = c("10","20","30","40","50"))+scale_fill_manual(values =color_palette,labels=c("0-1","1-2","2-3","3-4","4-5","5-6",">6"))+ theme_bw()+scale_y_discrete(labels=c("Female","Male")) + ylab(y)
}
# Below an example of how you call the newly defined function
heatmap(sampledata, "Gender")
Alternatively if you want to retain the quote free syntax, there is a slightly more complex solution:
heatmap=function(sampledata,y)
{
arguments <- as.list(match.call())
axis_label <- deparse(substitute(y))
y = eval(arguments$y, sampledata)
sampledata$Time <- factor(sampledata$Time)
sampledata$Grade <- factor(sampledata$Grade)
sampledata$y <- factor(y)
color_palette <- colorRampPalette(c("#31a354","#2c7fb8", "#fcbfb8","#f03b20"))(length((levels(factor(sampledata$Grade)))))
ggplot(data = sampledata) + geom_tile( aes(x = Time, y = y, fill = Grade))+scale_x_discrete(breaks = c("10","20","30","40","50"))+scale_fill_manual(values =color_palette,labels=c("0-1","1-2","2-3","3-4","4-5","5-6",">6"))+ theme_bw()+scale_y_discrete(labels=c("Female","Male")) + ylab(axis_label)
}
# Below an example of how you call the newly defined function
heatmap(sampledata, Gender)
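A third option, sketched here under the assumption of a reasonably recent ggplot2 and rlang, keeps the quote-free syntax by using the embrace operator {{ }} instead of match.call() and eval(). The function name tile_plot is made up for this example, and the fixed scale labels from the original function are left out to keep it short.

```r
library(ggplot2)
library(rlang)

set.seed(1)
sampledata <- data.frame(
  Time   = rep(1:60, 2),
  Gender = rep(0:1, each = 60),
  Grade  = sample(1:3, 120, replace = TRUE)
)

# Hypothetical variant of heatmap(): yvar is passed unquoted
tile_plot <- function(data, yvar) {
  axis_label <- as_label(enquo(yvar))  # text of the argument, e.g. "Gender"
  ggplot(data) +
    geom_tile(aes(x = factor(Time), y = factor({{ yvar }}),
                  fill = factor(Grade))) +
    labs(x = "Time", y = axis_label, fill = "Grade") +
    theme_bw()
}

p <- tile_plot(sampledata, Gender)
```

Here {{ yvar }} forwards the column the caller named into aes(), and as_label(enquo(yvar)) recovers its name for the axis title.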
I have an issue trying to create a function that creates a plot using ggplot. Here is some code:
y1<- sample(1:30,45,replace = T)
x1 <- rep(rep(c("a1","a2","a3","a4","a5"),3),each=3)
x2 <- rep(rep(c("b1","b2","b3","b4","b5"),3),each=3)
df <- data.frame(y1,x1,x2)
library(Rmisc)
dfsum <- summarySE(data=df, measurevar="y1",groupvars=c("x1","x2"))
myplot <- function(d,v, w,g) {
pd <- position_dodge(.1)
localenv <- environment()
ggplot(data=d, aes(x=v,y=w,group=g),environment = localenv) +
geom_errorbar(data=d,aes(ymin=d$w-d$se, ymax=d$w+d$se,col=d$g), width=.4, position=pd,environment = localenv) +
geom_line(position=pd,linetype="dotted") +
geom_point(data=d,position=pd,aes(col=g))
}
myplot(dfsum,x1,y1,x2)
As I was looking for similar questions, I found that specifying the local environment should solve the issue. However it did not help in my case.
Thank you
Preliminary Note
When looking at your data.frame, the group variable does not make any sense, as it is perfectly confounded with the x variable. Hence I adapted your data a bit, to show a full example:
Data
library(Rmisc)
library(ggplot2)
d <- expand.grid(x1 = paste0("a", 1:5),
x2 = paste0("b", 1:5))
d <- d[rep(1:NROW(d), each = 3), ]
d$y1 <- rnorm(NROW(d))
dfsum <- summarySE(d, measurevar = "y1", groupvars = paste0("x", 1:2))
Plot Function
myplot <- function(mydat, xvar, yvar, grpvar) {
mydat$ymin <- mydat[[yvar]] - mydat$se
mydat$ymax <- mydat[[yvar]] + mydat$se
pd <- position_dodge(width = .5)
ggplot(mydat, aes_string(x = xvar, y = yvar, group = grpvar,
ymin = "ymin", ymax = "ymax", color = grpvar)) +
geom_errorbar(width = .4, position = pd) +
geom_point(position = pd) +
geom_line(position = pd, linetype = "dashed")
}
myplot(dfsum, "x1", "y1", "x2")
Explanation
Your problem occurs because the scope of x1, x2 and y1 was ambiguous. As you also defined these variables in the top-level environment, R did not complain in the first place. If you had added rm(x1, x2, y1) to your original code right after creating your data.frame, you would have seen the problem earlier.
ggplot looks in the data.frame you provide for all the variables you want to map to aesthetics. If you want to create a function where you specify the names of the variables as arguments, you should use aes_string instead of aes, as the former expects a string giving the name of the variable rather than the variable itself.
With this approach, however, you cannot do calculations on the spot, so you need to create the variables ymin and ymax beforehand in your data.frame. Furthermore, you do not need to provide the data argument for each geom if it is the same as the one provided to ggplot.
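In newer versions of ggplot2, where aes_string is deprecated, a comparable function can be sketched with aes() and the .data pronoun. This is an assumption-laden variant rather than the code above: the summary table is rebuilt with base R aggregate() as a stand-in for Rmisc::summarySE, so only the mean and the standard error columns are reproduced.

```r
library(ggplot2)

set.seed(1)
d <- expand.grid(x1 = paste0("a", 1:5), x2 = paste0("b", 1:5))
d <- d[rep(seq_len(nrow(d)), each = 3), ]
d$y1 <- rnorm(nrow(d))

# Base-R stand-in for summarySE: mean and standard error per x1/x2 cell
dfsum <- aggregate(y1 ~ x1 + x2, data = d, FUN = mean)
se <- aggregate(y1 ~ x1 + x2, data = d,
                FUN = function(v) sd(v) / sqrt(length(v)))
dfsum$se <- se$y1  # both calls return the cells in the same order

myplot <- function(mydat, xvar, yvar, grpvar) {
  pd <- position_dodge(width = 0.5)
  ggplot(mydat, aes(x = .data[[xvar]], y = .data[[yvar]],
                    group = .data[[grpvar]], colour = .data[[grpvar]],
                    ymin = .data[[yvar]] - se,
                    ymax = .data[[yvar]] + se)) +
    geom_errorbar(width = 0.4, position = pd) +
    geom_point(position = pd) +
    geom_line(position = pd, linetype = "dashed") +
    labs(x = xvar, y = yvar, colour = grpvar)
}

p <- myplot(dfsum, "x1", "y1", "x2")
```

With .data[[yvar]] the error-bar limits can be computed inside aes(), so the extra ymin and ymax columns are no longer needed.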
I've got it plotting something, let me know if this isn't the expected output.
The changes I've made to the code to get it working are:
Load the ggplot2 library
Remove the d$ from the geom_errorbar call to w and g, as these are function arguments rather than columns in d.
I've also removed the data=d calls from all layers except the main ggplot one as these aren't necessary.
library(ggplot2)
myplot <- function(d,v, w,g) {
pd <- position_dodge(.1)
localenv <- environment()
ggplot(data=d, aes(x=v,y=w,group=g),environment = localenv) +
geom_errorbar(aes(ymin=w-se, ymax=w+se,col=g), width=.4,
position=pd,environment = localenv) +
geom_line(position=pd,linetype="dotted") +
geom_point(position=pd,aes(col=g))
}
myplot(dfsum,x1,y1,x2)