R function list with undetermined length as an argument


I am using the multiplot function of this website
This function takes an indefinite number of plots as arguments and plots them together. With the code below I can, for example, plot 4 plots together in 2 columns.
library(ggplot2)
n = 4
p = lapply(1:n, function(i) {
ggplot(data.frame(x = 1:10,y = rep(i,10)), aes(x = x, y = y)) +
geom_point()
})
multiplot(p[[1]],p[[2]],p[[3]],p[[4]],cols = 2)
What can I do if I have an undetermined number of plots n?
I already tried things like
multiplot(p,cols = 2)
do.call(multiplot, list(p,cols = 2))
but it doesn't give the result that I want

With the second option you were also close:
do.call(multiplot, c(p, cols = 2))
does the job since
length(list(p, cols = 2))
# [1] 2
length(c(p, cols = 2))
# [1] 5
That is, list(p, cols = 2) makes a list of two components: a list p and the number 2, while what you want is to extend p by adding cols = 2, and we need c for that. (Perhaps not immediately obvious, but the title of ?c does indeed say Combine Values into a Vector or List.)
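The difference can be seen without any plotting at all. The sketch below uses a hypothetical stand-in for multiplot, called count_plots here (it is not part of any package), that only reports how many arguments arrived through `...`:

```r
# Hypothetical stand-in for multiplot(): counts the arguments captured by `...`
count_plots <- function(..., cols = 1) {
  list(n_plots = length(list(...)), cols = cols)
}

p <- list("plot 1", "plot 2", "plot 3")  # placeholders for ggplot objects

# list() nests p, so the function sees one unnamed argument (the whole list)
nested <- do.call(count_plots, list(p, cols = 2))

# c() appends cols = 2 to p, so every element becomes its own argument
flat <- do.call(count_plots, c(p, cols = 2))

nested$n_plots  # 1
flat$n_plots    # 3
```

In both calls cols = 2 is matched by name; only the number of plot arguments differs.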

Maybe multiplot(plotlist = p)?

Related

Missing ticks in custom axis transformation

When plotting the ratio between two variables, their relative order is often of no concern, yet depending on which variable is in the numerator, its relative size is constrained either to (0,1) or (1, Inf), which is somewhat unintuitive and breaks symmetry. I want to plot ratios "symmetrically", without resorting to symmetric log-scale, by having a y-axis that goes like 1/4, 1/3, 1/2, 1, 2, 3, 4 or, equivalently, 4^-1, 3^-1, 2^-1, 1, 2, 3, 4 in regular intervals. I've come up with the following:
symmult <- function(x){
isf <- is.finite(x) & (x>0)
xf <- x[isf]
xf <- ifelse(xf>=1,
xf-1,
1-(1/xf))
x[isf] <- xf
x[!isf] <- NA
x[!is.finite(x)] <- NA
return(x)
}
symmultinv <- function(x){
isf <- is.finite(x)
xf <- x[isf]
xf <- ifelse(x[isf]>=0,
x[isf]+1,
-1/(x[isf]-1))
x[isf] <- xf
x[!isf] <- NA
x[!is.finite(x)] <- NA
return(x)
}
library(ggplot2)
library(scales)  # provides trans_new()
sym_mult_trans = function(){trans_new("sym_mult", symmult, symmultinv )}
x <- c(-4:-2, 1:4)
x[x<1] <- 1/abs(x[x<1])
ggplot() +
geom_point(aes(x=x, y=x)) +
scale_y_continuous(trans="sym_mult")
The transformation works, but I cannot get the axis labels etc. to work for any 0<x<1, without setting them manually. Any help would be greatly appreciated.
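As a sanity check on the transformation itself, the following sketch restates the two functions in compact form and confirms that the target ratios land at regular intervals. The names symmult_sketch and symmultinv_sketch are hypothetical, and the NA handling of the original functions is left out, so the input is assumed to be positive and finite:

```r
# Compact restatement of the transformation pair (positive, finite input only)
symmult_sketch    <- function(x) ifelse(x >= 1, x - 1, 1 - 1 / x)
symmultinv_sketch <- function(y) ifelse(y >= 0, y + 1, -1 / (y - 1))

ratios <- c(1/4, 1/3, 1/2, 1, 2, 3, 4)
transformed <- symmult_sketch(ratios)

# Evenly spaced on the transformed scale: -3, -2, -1, 0, 1, 2, 3 (up to rounding)
all.equal(transformed, as.numeric(-3:3))

# The inverse recovers the original ratios
all.equal(symmultinv_sketch(transformed), ratios)
```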

You can create bespoke 'breaks' and 'format' functions that you can use inside trans_new (or pass to scale_y_continuous directly via its breaks and labels parameters).
For the breaks function, remember it will take as input a length-two numeric vector representing the range of the y axis. You must then convert this to a number of appropriate breaks. Here, if the minimum of the range is less than one, we take its reciprocal, find the pretty breaks between one and that number, then take the reciprocal of the output. We concatenate that onto pretty breaks between 1 and our range maximum:
# Define breaks function
symmult_breaks <- function(x) {
c(1 / extended_breaks(5)(c(1/x[x < 1], 1)),
extended_breaks(5)(c(1, x[x >= 1])))
}
For the labelling function, remember, it needs to take as input the vector of numbers produced by our breaks function. We can paste a 1/ in front of the reciprocal of numbers less than one, but leave numbers of 1 or more unaltered:
# Define labelling function
symmult_labs <- function(x) {
labs <- character(length(x))
labs[x >= 1] <- as.character(x[x >= 1])
labs[x < 1] <- paste("1", as.character(1/x[x < 1]), sep = "/")
labs
}
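The labelling function needs only base R, so it can be tried on its own. A small check, with the function copied from above so that the snippet runs by itself:

```r
# Labelling function, copied unchanged from the answer
symmult_labs <- function(x) {
  labs <- character(length(x))
  labs[x >= 1] <- as.character(x[x >= 1])
  labs[x < 1]  <- paste("1", as.character(1 / x[x < 1]), sep = "/")
  labs
}

symmult_labs(c(0.25, 0.5, 1, 2, 4))
# [1] "1/4" "1/2" "1"   "2"   "4"
```

Note that a break which is not the reciprocal of a whole number (0.3, say) would get a long decimal denominator, so this relies on the breaks function returning reciprocals of "pretty" numbers.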
So your full new transformation becomes:
# Use our four functions to define the whole transformation:
sym_mult_trans <- function() {
trans_new(name = "sym_mult",
transform = symmult,
inverse = symmultinv,
breaks = symmult_breaks,
format = symmult_labs)
}
And your plot becomes:
ggplot() +
geom_point(aes(x = x, y = x)) +
scale_y_continuous(trans = "sym_mult")

Plot: Looping over nested data

I'm currently trying to nest simulated data for an efficient frontier inside a tibble containing all 250 simulations. The tibble has one column named "sim" that holds the number of the simulation, i.e. its rows run from 1 to 250. The other column should contain the nested simulation data, which is a 3x123 tibble for each simulation. With help from a kind soul here, I've managed to create this tibble containing the efficient frontiers. Now I need a loop that runs through this tibble and plots all 250 efficient frontiers in one plot.
I've tried to replicate the problem such that you don't need all of the previous code and data to see the issue. In this simple and reproducible example I have a table which is a 5x2 Tibble where the column 'sim' lists simulations (1:5) and 'obs' holds an individual 5x3 tibble with some coordinates:
library(tidyverse)
library(ggplot2)
counter = 0
table <- tibble(sim = 1:5, obs = NA)
for(i in (1:5)){
counter = counter + 1
tibble <- tibble(a = NA, b = 1:5, x = c(counter + 1), y = c(counter*2-1))
tibble$a <- counter
nested_tibble <- tibble %>% nest(data = -a) %>% select(-a)
table$obs[i] <- nested_tibble[[1]]
}
for (i in (1:5)){
print(ggplot()+
geom_point( data = (table %>% filter(sim == i) %>% .$obs)[[1]],
aes(x = x, y = y),
color = "red",
size = 4))
}
As mentioned I wish for it to plot all of the 5 coordinates in one plot such that I can replicate this to plot 250 efficient frontiers. However, when I run the code it only returns the last coordinate.
I hope my formulation makes sense. If you need any additional documentation please let me know.

I am not sure, but this should do the job. I think using lists is way better to store nested structure. So, the code below returns a list called table_out.
Please, have a look if this is what you want.
library(tibble)
library(data.table)
library(ggplot2)
N_sim <- 5
table_out <- vector("list", N_sim)
for ( i in seq_len(N_sim) ) {
current_table <- tibble(a = i, b = 1L:N_sim, x = i + 1, y = i*2 - 1)
table_out[[ i ]] <- current_table
}
# this creates a data.table (like a data.frame) from a list
final <- rbindlist( table_out )
ggplot(final, aes(x, y)) +
geom_point(color = "red", size = 4)
Created on 2021-03-03 by the reprex package (v1.0.0)
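If data.table is not available, the same stacking step can be sketched in base R with do.call(rbind, ...). The object names (sims, final_df) are made up for this illustration, and the toy values mirror the ones above:

```r
# Build one small data frame per simulation (same toy values as above)
n_sim <- 5
sims <- lapply(seq_len(n_sim), function(i) {
  data.frame(sim = i, b = 1:5, x = i + 1, y = i * 2 - 1)
})

# Stack the list into one long data frame; `sim` identifies each simulation
final_df <- do.call(rbind, sims)

dim(final_df)  # 25 rows, 4 columns

# With ggplot2 loaded, all simulations could then go into a single plot, e.g.
# ggplot(final_df, aes(x, y, colour = factor(sim))) + geom_point(size = 4)
```

Keeping the sim column makes it possible to colour or group the points by simulation in that one plot.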

using mapply with ggplot

Continuing on my quest to work with functions and ggplot:
I sorted out basic ways on how to use lapply and ggplot to cycle through a list of y_columns to make some individual plots:
require(ggplot2)
# using lapply with ggplot
df <- data.frame(x=c("a", "b", "c"), col1=c(1, 2, 3), col2=c(3, 2, 1), col3=c(4, 2, 3))
cols <- colnames(df[2:4])
myplots <- vector('list', 3)
plot_function <- function(y_column, data) {
ggplot(data, aes_string(x="x", y=y_column, fill = "x")) +
geom_col() +
labs(title=paste("lapply:", y_column))
}
myplots <- lapply(cols, plot_function, df)
myplots[[3]]
I now want to bring in a second variable that I will use to select rows. In my minimal example I am skipping the selection and just reusing the same plots and dfs as before; I simply add 3 iterations. So I would like to generate the same three plots as above, but now labelled as iteration A, B, and C.
It took me a while to sort out the syntax, but I now get that mapply needs two vectors of identical length that get passed on to the function as matched pairs. So I am using expand.grid to generate all pairs of variable 1 and variable 2 as a dataframe, and then pass its first and second columns on via mapply. The next problem to sort out was that I need to pass the dataframe as a list via MoreArgs =. So it seems like everything should be good to go. I am using the same syntax for aes_string() as above in my lapply example.
However, for some reason it is now not evaluating y_column properly: it simply takes it as a value to plot, not as an indicator to plot the values contained in df$col1.
HELP!
require(ggplot2)
# using mapply with ggplot
df <- data.frame(x=c("a", "b", "c"), col1=c(1, 2, 3), col2=c(3, 2, 1), col3=c(4, 2, 3))
cols <- colnames(df[2:4])
iteration <- c("Iteration A", "Iteration B", "Iteration C")
multi_plot_function <- function(y_column, iteration, data) {
plot <- ggplot(data, aes_string(x="x", y=y_column, fill = "x")) +
geom_col() +
labs(title=paste("mapply:", y_column, "___", iteration))
}
# mapply call
combo <- expand.grid(cols=cols, iteration=iteration)
myplots <- mapply(multi_plot_function, combo[[1]], combo[[2]], MoreArgs = list(df), SIMPLIFY = F)
myplots[[3]]

We may need to use rowwise here
out <- lapply(asplit(combo, 1), function(x)
multi_plot_function(x[1], x[2], df))
In the OP's code, the only issue is that the columns are factor for 'combo', so it is not parsed correctly. If we change it to character, it works
out2 <- mapply(multi_plot_function, as.character(combo[[1]]),
as.character(combo[[2]]), MoreArgs = list(df), SIMPLIFY = FALSE)
# testing
out2[[1]]
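The conversion to factor happens inside expand.grid, whose stringsAsFactors argument defaults to TRUE. A short sketch of avoiding it at the source (combo_fct and combo_chr are names chosen for this illustration):

```r
cols <- c("col1", "col2", "col3")
iteration <- c("Iteration A", "Iteration B", "Iteration C")

# Default: character inputs are converted to factors
combo_fct <- expand.grid(cols = cols, iteration = iteration)
class(combo_fct$cols)  # "factor"

# stringsAsFactors = FALSE keeps the columns as character vectors
combo_chr <- expand.grid(cols = cols, iteration = iteration,
                         stringsAsFactors = FALSE)
class(combo_chr$cols)  # "character"
```

With a grid built this way, the original mapply call should receive plain strings without the as.character() wrappers.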

How to avoid gaps due to missing values in matplot in R?

I have a function that uses matplot to plot some data. Data structure is like this:
test = data.frame(x = 1:10, a = 1:10, b = 11:20)
matplot(test[,-1])
matlines(test[,1], test[,-1])
So far so good. However, if there are missing values in the data set, then there are gaps in the resulting plot, and I would like to avoid those by connecting the edges of the gaps.
test$a[3:4] = NA
test$b[7] = NA
matplot(test[,-1])
matlines(test[,1], test[,-1])
In the real situation this is inside a function, the dimension of the matrix is bigger, and the number of rows and columns and the positions of the non-overlapping missing values may change between calls, so I'd like a solution that handles this flexibly. I also need to use matlines.
I was thinking of filling in the gaps with interpolated data, but maybe there is a better solution.

I came across this exact situation today, but I didn't want to interpolate values - I just wanted the lines to "span the gaps", so to speak. I came up with a solution that, in my opinion, is more elegant than interpolating, so I thought I'd post it even though the question is rather old.
The problem causing the gaps is that there are NAs between consecutive values. So my solution is to 'shift' the column values so that there are no NA gaps. For example, a column consisting of c(1,2,NA,NA,5) would become c(1,2,5,NA,NA). I do this with a function called shift_vec_na() in an apply() loop. The x values also need to be adjusted, so we can make the x values into a matrix using the same principle, but using the columns of the y matrix to determine which values to shift.
Here's the code for the functions:
# x -> vector
# bool -> boolean vector; must be same length as x. The values of x where bool
# is TRUE will be 'shifted' to the front of the vector, and the back of the
# vector will be all NA (i.e. the number of NAs in the resulting vector is
# sum(!bool))
# returns the 'shifted' vector (will be the same length as x)
shift_vec_na <- function(x, bool){
n <- sum(bool)
if(n < length(x)){
x[seq_len(n)] <- x[bool]  # seq_len() also handles n == 0
x[(n + 1):length(x)] <- NA
}
return(x)
}
# x -> vector
# y -> matrix, where nrow(y) == length(x)
# returns a list of two elements ('x' and 'y') that contain the 'adjusted'
# values that can be used with 'matplot()'
adj_data_matplot <- function(x, y){
y2 <- apply(y, 2, function(col_i){
return(shift_vec_na(col_i, !is.na(col_i)))
})
x2 <- apply(y, 2, function(col_i){
return(shift_vec_na(x, !is.na(col_i)))
})
return(list(x = x2, y = y2))
}
Then, using the sample data:
test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA
lst <- adj_data_matplot(test[,1], test[,-1])
matplot(lst$x, lst$y, type = "b")
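To make the shifting idea concrete on a single column, here is a minimal sketch using the c(1,2,NA,NA,5) example from the explanation. The helper shift_na_to_end is hypothetical and simpler than shift_vec_na() above, since it derives the mask from the vector itself:

```r
# Hypothetical helper: move non-NA values to the front, pad the tail with NA
shift_na_to_end <- function(x) {
  keep <- !is.na(x)
  c(x[keep], rep(NA, sum(!keep)))
}

y <- c(1, 2, NA, NA, 5)
x <- seq_along(y)

y_shifted <- shift_na_to_end(y)
# Blank out the x positions where y is missing, then shift them the same way
x_shifted <- shift_na_to_end(replace(x, is.na(y), NA))

y_shifted  # 1 2 5 NA NA
x_shifted  # 1 2 5 NA NA
```

Because x and y are shifted with the same mask, each remaining y value keeps its original x position, and the trailing NAs sit at the end where they no longer break the line.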

You could use the na.interpolation function from the imputeTS package (renamed na_interpolation in imputeTS 3.0 and later):
test = data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] = NA
test$b[7] = NA
matplot(test[,-1])
matlines(test[,1], test[,-1])
library('imputeTS')
test <- na.interpolation(test, option = "linear")
matplot(test[,-1])
matlines(test[,1], test[,-1])

I had the same issue today. In my context I was not permitted to interpolate. Here is a minimal but sufficiently general working example of what I did. I hope it helps someone:
mymatplot <- function(data, main=NULL, xlab=NULL, ylab=NULL,...){
#graphical set up of the window
plot.new()
plot.window(xlim=c(1,ncol(data)), ylim=range(data, na.rm=TRUE))
mtext(text = xlab,side = 1, line = 3)
mtext(text = ylab,side = 2, line = 3)
mtext(text = main,side = 3, line = 0)
axis(1L)
axis(2L)
#plot the data
for(i in 1:nrow(data)){
nin.na <- !is.na(data[i,])
lines(x=which(nin.na), y=data[i,nin.na], col = i,...)
}
}
The core 'trick' is in x=which(nin.na). It aligns the data points of the line consistently with the indices of the x axis.
The lines
plot.new()
plot.window(xlim=c(1,ncol(data)), ylim=range(data, na.rm=TRUE))
mtext(text = xlab,side = 1, line = 3)
mtext(text = ylab,side = 2, line = 3)
mtext(text = main,side = 3, line = 0)
axis(1L)
axis(2L)
set up the plotting window.
range(data, na.rm=TRUE) sizes the plot so that it includes all data points.
mtext(...) labels the axes and adds the main title. The axes themselves are drawn by the axis(...) calls.
The for-loop that follows plots the data.
The signature of mymatplot includes the ... argument so that typical plot parameters such as lty, lwd, cex etc. can optionally be passed in. They are forwarded to lines().
A last word on the choice of colors: they are up to your taste.
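The which() trick can be looked at in isolation. A tiny sketch with made-up values for a single series:

```r
y <- c(10, 20, NA, NA, 50)

ok <- !is.na(y)
xs <- which(ok)   # positions of the observed values: 1 2 5
ys <- y[ok]       # the observed values themselves: 10 20 50

# lines(xs, ys) would now draw one unbroken path, joining x = 2 directly to x = 5
```

Since the NA entries are dropped from both the x and y vectors, lines() never sees a missing value and so has no reason to break the path.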

passing arguments to function with a view to plotting with ggplot stat_function

I have a function and a list of arguments.
F <- function(a,b,...) {a^b+b/a}
L <- list("a" = 5, "b" = 2, "c" = 0)
I want to replace one of the arguments ("a", "b" or "c") with an unknown x (or "x") and plot with ggplot's stat_function.
These computations are part of a shiny app, where the user will 1) select a parameter from a drop-down list, say "a", to be the unknown, and 2) use sliders to select values of the other parameters. The numbers 5, 2, 0 in L are the default parameter values, to be used before user interaction. There are several such functions. Here the list of parameters L has an element not used in F.
I've been stuck on this for so long that I can't think straight anymore. Of the many things I've tried, here's one:
# select a parameter to vary:
Y <- "a"
f1 <- function(f = F, l = L, y = Y, x, ...){
l[y] <- x # replace "a" with x
do.call(f, l, ...)
}
# make a stat_function ggplot:
library("ggplot2")
df <- data.frame(x = c(0,10))
p <- ggplot(df, aes(x))
p <- p + stat_function(fun = f1)
print(p)
This returns the following error:
Error in (function (f = F, l = L, y = Y, x, ...) :
argument "x" is missing, with no default
Error in as.environment(where) : 'where' is missing
I have tried several variants, including: setting l[y] <- "x" and using aes_string instead of aes. I have also tried backquotes around x. I have read through the documentation about environments, so I've tried defining an environment, wrapping x and just about everything around eval or quote. I've even tried voodoo. I've lost count of how many hours I've spent on this. A suggestion to read the manual or a hint without an explanation will kill me. 8-) If my question is unclear, please let me know and I will clarify. Thanks!

If I understand correctly, you have a multi-parameter function and want to derive a partial function from it, in which one parameter varies and the others are fixed. Try this, for example:
F <- function(a,b,...) {a^b+b/a}
L <- list("a" = 5, "b" = 2, "c" = 0)
f.partial <- function( var = "a",params=L){
params[[var]]=as.name("x")
function(x)do.call(F,params)
}
We can test this for example:
## vary a
> f.partial("a")(1)
[1] 3
> F(1,b=L$b)
[1] 3
## vary b
> f.partial("b")(1)
[1] 5.2
> F(1,a=L$a)
[1] 5.2
Testing with ggplot2:
library("ggplot2")
df <- data.frame(x = seq(0,1,0.1))
ggplot(df, aes(x)) +
stat_function(fun = f.partial("a"),col='blue') +
stat_function(fun = f.partial("b"),col='red')
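A variant of the same idea, sketched here as an alternative rather than as the approach above, avoids as.name() by overwriting the chosen parameter inside the closure. The helper make_partial is a hypothetical name; F and L are the objects from the question:

```r
F <- function(a, b, ...) a^b + b / a
L <- list(a = 5, b = 2, c = 0)

# Hypothetical helper: returns a one-argument function in which `var` varies
# and all other parameters keep the values stored in `params`
make_partial <- function(f, params, var) {
  function(x) {
    params[[var]] <- x   # overwrite the chosen parameter with the input values
    do.call(f, params)
  }
}

vary_a <- make_partial(F, L, "a")
vary_b <- make_partial(F, L, "b")

vary_a(1)        # same as F(a = 1, b = 2): 3
vary_b(1)        # same as F(a = 5, b = 1): 5.2
vary_a(c(1, 2))  # vectorised over x: 3 5
```

Because the returned function accepts a numeric vector and returns one of the same length, it should be usable as the fun argument of stat_function in the same way as f.partial.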