Efficient way to review formulas that generate named objects in R - r

If I have a named object (in my case a named plot) in R, is there an efficient way to double check the formula that generated it? As of now I am scrolling back through the console, but I'm hoping that there is a more efficient way.
For example, at the start of my project I input
Boxplot <- ggplot(plotting input) + geom_boxplot(plotting input)
Now I can call Boxplot by name to plot it, but I want to be able to efficiently review my ggplot input. Is there a tool to do this?

For your example, you can see the elements of Boxplot using:
names(Boxplot)
So you can see, for example, the input data using:
Boxplot$data
Or the parameters and type of the plot using:
Boxplot$layers
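As a minimal sketch of the above, here is a plot built from the built-in mtcars data and then inspected by name. The dataset and aesthetics are assumptions for illustration, since the original plotting input is not shown:

```r
library(ggplot2)

# Example plot; mtcars and these aesthetics are stand-ins for the real input.
Boxplot <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

Boxplot$mapping      # aesthetics passed to ggplot(), here x and y
Boxplot$layers       # one entry per layer, with its geom, stat and parameters
head(Boxplot$data)   # the data frame the plot was built from
```

This shows what the plot was built from, but it does not reproduce the exact text of the original call. For that, keeping the code in a script file is probably the more reliable habit.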

Related

Any way to access the plot object generated by DescTools::Desc()?

I am using Desc() from DescTools to describe some variables in an rmarkdown PDF document. The problem is that it generates 3 plots that are kept inline when I knit the document, so the images get clipped.
Example:
dates <- sample(seq(as.Date('1999/01/01'), as.Date('2021/01/01'), by="day"), 1000)
results <- DescTools::Desc(dates)
results
The output contains 3 plots. I can find the individual responses using the list in results[[1]], but I can't find the plot objects, which I think could be a way to put them one below the other.
Any thoughts?
There are no plot objects in results.
Instead, when you type results in your console, it invokes the S3 generic print, which in turn dispatches the print.Desc method. By default, print.Desc will call a plotting function based on the "class" member of results, which in your example is "Date". If you type DescTools:::plot.Desc.Date in your console, you will see the function that actually generates the plot every time you print results.
So there are no plot objects. There is data to create a plot, and whenever you print results to the console, the plots are created by a call to a plotting function.
The Desc plotting functions seem to have very few options available to allow modifications, so the best option would probably be to use the data inside results to create your own plots. If you wish to see the contents of results without the plots, simply type:
print(results, plotit = FALSE)
And if you want the three plots one at a time, you can do:
DescTools:::plot.Desc.Date(results[[1]], type = 1)
DescTools:::plot.Desc.Date(results[[1]], type = 2)
DescTools:::plot.Desc.Date(results[[1]], type = 3)
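To illustrate the print-dispatch idea in general terms, here is a toy S3 class whose print method draws a plot as a side effect. The class name toy_desc and the helper make_toy_desc are hypothetical and are not part of DescTools. It is only a sketch of the mechanism described above:

```r
# Hypothetical toy class; it stores data only, never a plot object.
make_toy_desc <- function(x) {
  structure(list(summary = summary(x), values = x), class = "toy_desc")
}

# The print method prints the summary and, unless told otherwise,
# draws the plot fresh every time the object is printed.
print.toy_desc <- function(x, plotit = TRUE, ...) {
  print(x$summary)
  if (plotit) hist(x$values, main = "Drawn at print time")
  invisible(x)
}

toy <- make_toy_desc(rnorm(100))
names(toy)                   # "summary" "values"
print(toy, plotit = FALSE)   # text output only, no plot
```

Typing toy on its own at the console would call print.toy_desc with plotit = TRUE and draw the histogram again, which is the same pattern as printing results.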

How to save the plot output and object output of a function call in R

I am making a function call in R. The function returns a list and also contains a call to plot(). From this one call, I need to record the plot as an object and store the list as a separate object. I need to store the plot because I later give it to ggarrange() along with other plots. I must store both outputs in one call of the function because the function runs permutations, so it produces a slightly different output each time. Therefore, for the data in the list to match the plot, the call can only be made once.
The line of code below is what I am currently using that successfully stores the plot as ggplot object. It does not store the list.
my_plot <- as.ggplot(~(my_function(input1,input2, permutations=1000)))
The code below will return the list, but not save the plot.
my_list <- my_function(input1,input2, permutations=1000)
Does anyone know a way of accomplishing what I am trying to do?

Plot data from SparkR DataFrame

I have an avro file which I am reading as follows:
avroFile <-read.df(sqlContext, "avro", "com.databricks.spark.avro")
This file has lat/lon columns but I am not able to plot them like a regular dataframe.
Neither am I able to access the column using the '$' operator.
ex.
avroFile$latitude
Any help regarding avro files and operations on them using R is appreciated.
If you want to use ggplot2 for plotting, try ggplot2.SparkR. This package allows you to take SparkR DataFrame directly as input for ggplot() function call.
https://github.com/SKKU-SKT/ggplot2.SparkR
And you won't be able to plot it directly. A SparkR DataFrame is not compatible with functions which expect a data.frame as input. It is not even a data structure in a strict sense, but simply a recipe for how to process the input data. It is materialized only when you execute an action.
If you want to plot it, you'll have to collect it first. Beware that this fetches all the data to the local machine, so it is typically something you want to avoid on the full data set.
As zero323 mentioned, you cannot currently run R visualizations on distributed SparkR DataFrames. You can run them on local data.frames. Here is one way you could make a new DataFrame with just the columns you want to plot, and then collect a random sample of them to a local data.frame which you can plot from:
latlong <- select(avroFile, avroFile$latitude, avroFile$longitude)
latlongsample <- collect(sample(latlong, FALSE, .1))
plot(latlongsample)
The signature of the sample method is:
sample(x, withReplacement, fraction, seed)

Incorporating user input in factor function in R

I am trying to figure out how to take a column name supplied by a user in response to a prompt in R and use that in the function factor. The idea is to create a script using ggplot2 that will allow users to easily select which variable from a table they would like coded by color and which by shape.
The line of code requesting user input would be:
> Color_Factor<-readline("What is the Column Heading of the Variable you would like separated by Color? ")
What is the Column Heading of the Variable you would like separated by Color? Reach
My problem is that I can't figure out how to use this input to call a particular column for graphing purposes. The code below creates a graph with one color and the single variable "Reach".
> qplot(d13C, d15N, data=InputFile, col=factor(Color_Factor), shape=factor(Functional_Group))
All my attempts at calling a function as the initial argument in factor() have met with complete failure. I'm specifically interested in this for graphing purposes but am also wondering if there is a way to use the value of a variable rather than the variable name to specify a column in this type of function in general. I'm totally new to R so maybe there's an obvious solution but I haven't been able to find an answer online so far.
Thanks
This isn't very elegant but the non-standard evaluation of arguments to ggplot2 has always confused the heck out of me:
> Color_Factor<-readline("What is the Column Heading of the Variable you would like separated by Color? ")
What is the Column Heading of the Variable you would like separated by Color? mpg
qplot( x=mtcars[, Color_Factor], wt, data=mtcars)
I've tried (and failed with) a variety of language-level incantations using as.name, substitute, and eval to supply the x-argument with a language element that satisfied it. The strategy above uses the capacity of [.data.frame to evaluate Color_Factor and match it to a column name in mtcars. This alternative also succeeds (since it pretty much duplicates what the first one does):
qplot( x=eval(as.name(Color_Factor), mtcars), wt, data=mtcars)
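For the more general question of using the value of a variable to pick a column, here is a hedged sketch. It assumes a reasonably recent ggplot2 that supports the .data pronoun inside aes(), and it uses mtcars with a hard-coded string standing in for the readline() answer:

```r
library(ggplot2)

# Stand-in for the value returned by readline(); hypothetical user input.
Color_Factor <- "cyl"

# Base R: [[ ]] looks a column up by the string stored in a variable.
color_values <- factor(mtcars[[Color_Factor]])

# ggplot2: the .data pronoun does the same lookup inside aes().
p <- ggplot(mtcars, aes(x = wt, y = mpg,
                        colour = factor(.data[[Color_Factor]]))) +
  geom_point() +
  labs(colour = Color_Factor)   # use the column name as the legend title
```

Printing p draws the scatterplot with one colour per level of the chosen column. The original attempt, factor(Color_Factor), builds a factor from the single string "Reach" rather than from the column, which is why everything came out in one colour.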

Loop in R to create and save series of ggplot2 plots with specified names

I have a data frame in R with a POSIXct variable sessionstarttime. Each row is identified by an integer ID variable for a specific location. The number of rows is different for each location. I plot the overall graph simply by:
myplot <- ggplot(bigMAC, aes(x = sessionstarttime)) + geom_freqpoly()
Is it possible to create a loop that will create and save such plot for each location separately?
Preferably with a file name the same as value of ID variable?
And preferably with the same time scale for each plot?
Not entirely sure what you're asking but you can do one of two things.
a) You can save each individual plot in a loop with a unique name based on ID like so:
ggsave(myplot,filename=paste("myplot",ID,".png",sep="")) # ID is the unique identifier; change the extension from .png to whatever you like (eps, pdf etc).
b) Just assign each plot to an element of a list. Then write that list to disk using save.
That would make it very easy to load and access any individual plot at a later time.
I am not sure I understand what you want to do. My guess is that you could write a simple function that saves the plot and then use lapply(yourdata, yourfunction, ...). Since lapply works on lists, the number of rows does not need to be equal.
HTH
use something like this in your function:
ggsave(filename,scale=1.5)
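Putting these suggestions together, one possible sketch looks like this. The bigMAC data here is simulated, and the output directory, plot size and bin count are assumptions for illustration:

```r
library(ggplot2)

# Simulated stand-in for the real data: three locations with different row counts.
set.seed(1)
bigMAC <- data.frame(
  ID = rep(1:3, times = c(20, 35, 50)),
  sessionstarttime = as.POSIXct("2020-01-01", tz = "UTC") +
    runif(105, min = 0, max = 7 * 24 * 3600)
)

# Shared x-axis limits so every plot uses the same time scale.
time_range <- range(bigMAC$sessionstarttime)
out_dir <- tempdir()   # assumption: write the files to a temporary directory

# split() gives one data frame per ID; lapply() builds and saves one plot each.
plots <- lapply(split(bigMAC, bigMAC$ID), function(d) {
  p <- ggplot(d, aes(x = sessionstarttime)) +
    geom_freqpoly(bins = 30) +
    scale_x_datetime(limits = time_range)
  ggsave(file.path(out_dir, paste0(d$ID[1], ".png")),
         plot = p, width = 6, height = 4)
  p
})
```

The returned list plots is named by ID, so plots[["2"]] retrieves the plot for location 2, and the whole list can be written to disk with save() as suggested in (b).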
