How to `dput` a `ggplot` object? - r

I am looking for a way to save some ggplot objects for later use. The dput function creates a string that, when passed to dget(), fails with several "unexpected <" errors:
The first one is here: .internal.selfref = <. This is easily solved by setting .internal.selfref to NULL.
The remaining seven are spread across different attributes whose values are <environment>. I tried changing each <environment> to something like NULL or environment(), but neither works: the environment is not set correctly and an "object not found" error is returned.
Some searches led me to the function ggedit::dput.ggedit. But it gives me the error:
# Error in sprintf("%s = %s", item, y) :
# invalid type of argument[2]: 'symbol'
As I see it, I either need to set the environments correctly when using dput, or figure out why ggedit::dput.ggedit does not work.
Any ideas?

Not using dput(), but to save your ggplot objects for later use, you could save them as .rds files (just like any R objects).
Example:
library(ggplot2)
my_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
saveRDS(my_plot, "my_plot.rds")
To restore the object in another session or another script:
my_plot <- readRDS("my_plot.rds")
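As a rough illustration of why the dput() route fails while the .rds route works, here is a base R sketch. It uses a small list holding an environment as a stand-in for a ggplot object (an assumption made so the example runs without ggplot2); the names obj and restored are invented for the example.

```r
# A small object that carries an environment, standing in for a ggplot object
obj <- list(label = "demo", env = new.env())
assign("x", 42, envir = obj$env)

# dput() cannot deparse an environment; it writes a placeholder instead
deparsed <- paste(capture.output(dput(obj)), collapse = "\n")
has_placeholder <- grepl("<environment>", deparsed, fixed = TRUE)

# Parsing that text back fails on the "<", which is what dget() runs into
parse_result <- tryCatch(
  parse(text = deparsed),
  error = function(e) "parse error"
)

# saveRDS()/readRDS() serialise the environment together with the object
path <- tempfile(fileext = ".rds")
saveRDS(obj, path)
restored <- readRDS(path)
restored_x <- get("x", envir = restored$env)
```

The restored object gets its own copy of the environment, so the value stored in it is still reachable after reading the file back.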

You can try a tidyverse approach.
Save the plot beside the data in a tibble using nest and map.
library(tidyverse)
res <- mtcars %>%
  as_tibble() %>%
  nest(data = everything()) %>%
  mutate(res = map(data, ~ ggplot(., aes(mpg, disp)) + geom_point()))
Then save the data.frame using save or saveRDS.
Finally, call the plot:
res$res
The size is 4kb for tibble(mtcars) vs. 21kb with plot.

Related

error argument "df1" is missing, with no default

A friend of mine is working with the R language and asked me what she did wrong, but I can't seem to find the problem. Does someone know what it is?
The code she sent me:
# 10*. Pipe that to a ggplot command and create a histogram with 4 bins.
# Hint: you will NOT write ggplot(df, aes(...)) because the df is already piped in.
# Instead, just write: ggplot(aes(...)) etc.
# Title the histogram, "Distribution of Sunday tips for bills over $20"
# Feel free to style the plot (not required; this would be a typical exploratory
# analysis where only you will see it, so it doesn't have to be perfect).
df %>%
  filter(total_bill > 20 & day == "Sun") %>%
  ggplot(aes(x=total_bill, fill=size)) +
  geom_histogram(bins=4) +
  ggtitle("Distribution of Sunday tips for bills over $20")
The error:
Error in df(.) : argument "df1" is missing, with no default
Type ?df in your console, and you will see that df is a function with the following arguments.
df(x, df1, df2, ncp, log = FALSE)
where df1 is an argument. So the error message is saying that R cannot find the df1 argument for the df function.
It seems like in this code example, your friend is trying to pipe a data frame called df into the filter function from the dplyr package and the ggplot function from the ggplot2 package to create a plot.
So my guess is that your friend needs to define df as a data frame first. Otherwise, R will treat df as the function and keep throwing this error.
By the way, since df is already a function in R, it is not a good name for a data frame, even though people use it as one all the time. Next time, try a different name, such as dat.
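A minimal base R sketch of the name clash, assuming no object called df has been created in the session. The tips data frame and its values are made up for illustration.

```r
# With no data frame of that name, `df` resolves to the F-distribution
# density function from the stats package
df_is_function <- is.function(stats::df)

# Calling it with a single argument reproduces the reported error
err <- tryCatch(stats::df(1), error = function(e) conditionMessage(e))

# Giving the data a distinct name avoids the clash (hypothetical data)
tips <- data.frame(
  total_bill = c(25.3, 18.0, 31.2),
  day = c("Sun", "Sat", "Sun")
)
sunday_over_20 <- tips[tips$total_bill > 20 & tips$day == "Sun", ]
```

Here err holds the same "df1 is missing" message as in the question, while the subset of tips works because tips really is a data frame.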

How can I save a list of scatterplots

I want to save a list of scatterplots with their filenames.
Therefore I give each plot a name, which worked perfectly:
names(scatterplot_standardcurve) <-
sub("\\.xlsx$",
".png",
names(standardcurve_concentration))
> print(scatterplot_standardcurve)
$K_20210722
$A_20210722
$c_20210722
$d_20210722
$t_20210722
$v_20210722
Then I want to save them in a specific folder, but I always get an error:
lapply(names(scatterplot_standardcurve),
function(nm) print(scatterplot_standardcurve[[nm]]) +
ggsave(filename = file.path("Z:/output/scatterplot_standardcurve/",
nm )))
Error: Unknown graphics device ''
Using imap, you'll be able to iterate over each ggplot object as well as its name. Try:
library(purrr)
library(ggplot2)
imap(scatterplot_standardcurve,
~ggsave(sprintf("Z:/output/scatterplot_standardcurve/%s.png", .y), .x))
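A likely reason for the original error is that ggsave() chooses the graphics device from the file extension when no device is given, and the printed plot names carry no extension. The base R sketch below checks this for two of the names from the question and builds file names with an explicit .png suffix; out_dir is a placeholder folder chosen for the example.

```r
# Two of the plot names shown in the question
plot_names <- c("K_20210722", "A_20210722")

# No extension is found, so there is nothing to pick a device from
extensions <- tools::file_ext(plot_names)

# Build output paths with an explicit extension (placeholder folder)
out_dir <- tempdir()
out_files <- file.path(out_dir, paste0(plot_names, ".png"))
new_extensions <- tools::file_ext(out_files)
```

This is also why the sprintf() call above appends ".png" to each name before passing it to ggsave().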

R for loop overwriting variable data

I am trying to use a for loop to create a ggplot for each column in a dataframe. I am pretty new to this so my approach may be very wrong here.
I have written a function to create the ggplot:
create_scatter <- function(df, x, y) {
ggplot(df, aes(x, y)) +
geom_point() +
xlab(name) +
ylab("quality")
}
And a for loop to iterate through the Dataframe columns by name (to get the name of the column for use later) then get the contents of the column for the plotting function.
for (name in names(whiteWines)) {
for (column in whiteWines[name]) {
assign(paste0(name, "_scatter"),
create_scatter(whiteWines, column, whiteWines$quality))
}
}
Using assign() I am able to create a variable name from the column name on the fly and assign the results of ggplot to it.
I am then using grid.arrange to arrange the resulting plots in a 3 x 4 grid.
grid.arrange(fixed.acidity_scatter,
volatile.acidity_scatter,
citric.acid_scatter,
residual.sugar_scatter,
chlorides_scatter,
free.sulfur.dioxide_scatter,
total.sulfur.dioxide_scatter,
density_scatter,
pH_scatter,
sulphates_scatter,
alcohol_scatter,
layout_matrix = rbind(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12)))
When executed all scatter plots are created, however they all contain the data from the last scatter plot in the loop.
Undesired Results
If I wrap the assign statement in a print() statement then I do get the desired outcome in the grid, but each individual plot gets printed as well.
Desired Results
Dataset
You're probably looking for something more like this:
library(readr)
library(tidyr)
library(dplyr)
library(ggplot2)
ww <- read_delim(file = "~/Downloads/winequality-white.csv",delim = ";")
ww_long <- ww %>%
gather(key = measure,value = value,`fixed acidity`:`alcohol`)
ggplot(data = ww_long,aes(x = quality,y = value)) +
facet_wrap(~measure,scales = "free_y") +
geom_point()
R has some tools that can be very tempting for beginners as they think through solving a problem. Among them are assign(), get() and eval(parse(text = )). A solution built on them usually causes more problems than it solves; there is typically a better way, but finding it requires digging a little deeper into the "normal" way of doing things in R.
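If separate plot objects are still wanted, one alternative to assign() is to collect them in a named list. The sketch below is only an illustration under assumptions: wines is a tiny made-up data frame with syntactic column names, and the .data pronoun is used to look up each column by its name.

```r
library(ggplot2)

# Hypothetical stand-in for the wine data
wines <- data.frame(
  alcohol = c(8.8, 9.5, 10.1, 9.9),
  pH = c(3.0, 3.3, 3.26, 3.19),
  quality = c(6, 6, 6, 6)
)

# Every column except the response gets its own scatter plot
predictors <- setdiff(names(wines), "quality")

# One plot per column, kept in a named list instead of separate variables
scatter_plots <- lapply(predictors, function(nm) {
  ggplot(wines, aes(x = .data[[nm]], y = quality)) +
    geom_point() +
    labs(x = nm, y = "quality")
})
names(scatter_plots) <- predictors
```

A list like this can be handed to gridExtra::grid.arrange() through its grobs argument, for example grid.arrange(grobs = scatter_plots, ncol = 3), without typing out each plot name.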
The following are the variables in the data:
"fixed acidity";"volatile acidity";"citric acid";"residual sugar";"chlorides";"free sulfur dioxide";"total sulfur dioxide";"density";"pH";"sulphates";"alcohol";"quality"
The following are sample rows:
7;0.27;0.36;20.7;0.045;45;170;1.001;3;0.45;8.8;6
6.3;0.3;0.34;1.6;0.049;14;132;0.994;3.3;0.49;9.5;6
8.1;0.28;0.4;6.9;0.05;30;97;0.9951;3.26;0.44;10.1;6
7.2;0.23;0.32;8.5;0.058;47;186;0.9956;3.19;0.4;9.9;6
7.2;0.23;0.32;8.5;0.058;47;186;0.9956;3.19;0.4;9.9;6
8.1;0.28;0.4;6.9;0.05;30;97;0.9951;3.26;0.44;10.1;6
6.2;0.32;0.16;7;0.045;30;136;0.9949;3.18;0.47;9.6;6
7;0.27;0.36;20.7;0.045;45;170;1.001;3;0.45;8.8;6
6.3;0.3;0.34;1.6;0.049;14;132;0.994;3.3;0.49;9.5;6
8.1;0.22;0.43;1.5;0.044;28;129;0.9938;3.22;0.45;11;6
All from the Excel sheet.

Rshiny output function breaks when using $ operator and filter

I'm trying to build a shiny app very closely based on a previous one I have (which works), albeit with a different (though similarly structured) data object underneath. After changing all the variable and object names, I got the following error:
Warning: Error in filter_impl: Evaluation error: $ operator is invalid
for atomic vectors.
I found a couple of seemingly relevant pages in my searches online:
$ operator is invalid for atomic vectors :: R shiny
https://github.com/rstudio/shiny/issues/1823
Unfortunately the $ operator error seems to be quite a general error message, and neither of these seemed to specifically address my problem. After some tinkering and paring back various elements, I found I could render the plots and tables in the app, provided I didn't attempt to filter on any of the input fields.
For instance, the following output worked fine, including the switch that allowed me to turn the whole plot on and off and alter the heading of the graph with a text input called filter1.
output$emoPlot <- renderPlotly({
if(input$prefilterswitch == "OFF"){
df <- dtm_EMO %>%
clusterer(input$clusters)
plot <- df %>%
left_join(EMO_ALL, by = "Work_Order") %>%
ggplot(aes(x = date, y = as.factor(cluster), col = MILL, shape = as.factor(PART), text = Equipment_Description_Line_1, text2 = Work_Order))+
geom_point()+
guides(col = "none")+
ggtitle(label = input$filter1)
ggplotly(plot, tooltip = c("x", "y", "col", "text", "text2", "size"))
}
})
However, if I add a line that seeks to filter on one of those inputs, like this:
output$emoPlot <- renderPlotly({
if(input$prefilterswitch == "OFF"){
df <- dtm_EMO %>%
filter(str_detect(combined, input$prefilterswitch %>% tolower()) ==T) %>%
clusterer(input$clusters)
plot <- df %>%
left_join(EMO_ALL, by = "Work_Order") %>%
ggplot(aes(x = date, y = as.factor(cluster), col = MILL, shape = as.factor(PART), text = Equipment_Description_Line_1, text2 = Work_Order))+
geom_point()+
guides(col = "none")+
ggtitle(label = input$filter1)
ggplotly(plot, tooltip = c("x", "y", "col", "text", "text2", "size"))
}
})
Then I get my error, and no plot:
Warning: Error in filter_impl: Evaluation error: $ operator is invalid
for atomic vectors.
I've had a play with rendering a table output using renderTable as well, and get the same problem. I can use inputs, but not with functions like filter(), and mutate() also has the same problem.
I suspected that perhaps the issue was to do with the inputs, but all of the switches and text fields render fine in the app, and they seem to work, just not with those functions.
That's about as much as I've been able to narrow it down. It's a little frustrating since the ability to apply multiple filters is quite important to the purpose of the app. Any help would be appreciated!
I am pretty sure that the problem is the use of non-standard evaluation in filter. In
filter(str_detect(combined, input$prefilterswitch %>% tolower()))
you have input$prefilterswitch, which filter would normally treat as invalid because it is not simply a column name from the data (even though we know what it actually refers to). I think there is probably a check in filter that automatically throws an error if there is a $. My usual solution is to create the object before starting the pipe, so something like
prefilterswitch <- input$prefilterswitch
and then reference that in the filter statement. Also, with a Boolean you don't need == T.
Alternatively, you can look into rlang's tidy evaluation tools.
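A minimal sketch of the "create the object first" idea outside of Shiny. A plain list stands in for Shiny's input object, and the work_orders data, the filter1 value and the pattern variable are all invented for the example; whether this removes the error in the original app is not something the sketch can confirm.

```r
library(dplyr)
library(stringr)

# Plain list standing in for Shiny's `input` (assumption for illustration)
input <- list(filter1 = "PUMP")

# Made-up data with a text column to search
work_orders <- tibble(
  Work_Order = 1:3,
  combined = c("replace pump seal", "inspect conveyor", "pump bearing noise")
)

# Resolve the input to an ordinary character value before the pipeline starts
pattern <- tolower(input$filter1)

# filter() now only sees a column name and a local variable
filtered <- work_orders %>%
  filter(str_detect(combined, fixed(pattern)))
```

Inside a render function the same pattern applies: read the value from input at the top, then use the local variable in filter() or mutate().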

ggplot2 - importing and plotting multiple .csv files

I have tried batch importing, but I think ggplot2 requires data frames and I have only been able to make a list of elements. I have set up a simple code in ggplot2 that imports data from multiple csv files and overlays their trendlines. All of the .csv files are in the same folder and have the same format. Is there a way to import all of the .csv files from the folder and plot all of them in ggplot without copying this code hundreds of times?
Thank you for your help!
library(ggplot2)
points1 <- read.csv("http://drive.google.com")[1:10,1:2]
points2 <- read.csv("http://drive.google.com")[1:10,1:2]
g <- (ggplot(points1, aes(x=ALPHA, y=BETA))
+labs(title="Model Run", subtitle="run4", y="LabelY", x="LabelX", caption="run4")
+ coord_cartesian(xlim=c(0,10), ylim=c(0,11))
#+ geom_point(data = points1)#
+geom_smooth(method="loess", span=.8, data = points1, se=FALSE)
#+ geom_point(data = points2)#
+geom_smooth(method="loess", span=.8, data = points2, se=FALSE))
plot(g)
This is a fun one. I am using some packages from the tidyverse (ggplot2, purrr, readr) to make it more consistent.
Since you want to plot all the data in one plot, it makes sense to put all of it into one data frame. The function purrr::map_df is perfect for this.
library(tidyverse)
# the pattern argument is a regular expression, not a glob
files <- list.files("data/", pattern = "\\.csv$", full.names = TRUE)
names(files) <- basename(files)
df <- map_df(files, ~read_csv(.), .id = "origin")
df %>% ggplot() +
  aes(x, y, color = origin) +
  geom_point()
A few explanations
The first two lines create a named vector whose elements are the full paths to the csv files and whose names are the filenames. This makes it easier to use the .id argument of map_df, which creates an additional column named "origin" from the filenames. The notation inside map might seem a little weird at first. You could also supply a function written elsewhere to apply to each element, but the ~ symbol is pretty handy: it creates a function right there, and this function always takes the argument . as the current element of the vector or list you are iterating over.
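To make the ~ shorthand concrete, here is a small sketch comparing it with an explicit anonymous function. It writes two made-up CSV files to a temporary folder so that it runs on its own, and it uses base read.csv and purrr::map_dfr (which also needs dplyr installed) instead of read_csv and map_df; the file names run1.csv and run2.csv are invented.

```r
library(purrr)

# Write two tiny CSV files into a temporary folder (made-up data)
data_dir <- file.path(tempdir(), "csv_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3, y = c(2, 4, 6)),
          file.path(data_dir, "run1.csv"), row.names = FALSE)
write.csv(data.frame(x = 1:3, y = c(1, 3, 5)),
          file.path(data_dir, "run2.csv"), row.names = FALSE)

# Named vector: full paths as values, file names as names
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
names(files) <- basename(files)

# Formula shorthand: `.` is the current file path
by_formula <- map_dfr(files, ~ read.csv(.), .id = "origin")

# The same call written with an explicit anonymous function
by_function <- map_dfr(files, function(path) read.csv(path), .id = "origin")
```

Both calls return one combined data frame with an origin column taken from the names of files, so either form can feed the ggplot call above.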

Resources