How can I output several plots and values from an R function?

I have two questions:
1. In an R function, return() can output only one plot or value. I want the function to output every plot and vector I need. How can I achieve that, and which code should I use?
2. I have a series of variables, Game1 to Game10, and I built a loop to analyze each of them, constructing their names with
"paste("Game",i, sep="")",
But the result is a character string, not a variable, so something like
"sort(eval(paste("Game",i, sep="")))"
fails. How can I make R recognize the character string as a variable name?

To return more than one value from a function, use a data structure that can hold several values and return that, e.g. a vector, a list, or a data frame:
...
vector_1 <- 1:10
vector_2 <- 11:20
return( list(vec_1=vector_1, vec_2=vector_2) )
To output more than one plot, simply use a loop within the function, e.g.
for(i in 5:10) plot(1:i)
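Both suggestions can be combined in a single function. The following is only a minimal sketch: the function name summarise_and_plot and the choice of plots and returned values are assumptions made for illustration, not something from the question.

```r
# Hypothetical function: draws several plots and returns several values.
summarise_and_plot <- function(x) {
  # Every call to a plotting function draws a new plot as a side effect.
  hist(x, main = "Histogram")
  boxplot(x, main = "Boxplot")

  # Bundle everything that should be returned into one named list.
  list(sorted = sort(x), mean = mean(x), range = range(x))
}

res <- summarise_and_plot(c(4, 2, 9, 7))
res$sorted  # access one returned element by name
res$mean
```

The plots are produced while the function runs, so only the values need to go through the return value.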
Your second question is not clear to me. What are you trying to do?
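If the second question is about turning a string such as "Game1" into the object of that name, one possible reading can be sketched as follows. The Game1 to Game3 vectors here are made up for illustration.

```r
# Hypothetical variables standing in for Game1..Game10.
Game1 <- c(3, 1, 2)
Game2 <- c(9, 7, 8)
Game3 <- c(6, 5, 4)

# paste0() only builds a string; get() fetches the object with that name.
for (i in 1:3) {
  print(sort(get(paste0("Game", i))))
}

# Often cleaner: collect the vectors in a named list and work on the list.
games  <- mget(paste0("Game", 1:3))
sorted <- lapply(games, sort)
```

Keeping the data in a list from the start would avoid the name lookup altogether.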

Related

Creating a vector of colors inside a loop in R

I want to create a script that automatically draws several hclust plots. The problem I am facing is that I am able to create a vector and assign values to it inside a loop, but I am not able to set names on the vector with setNames().
For example:
col_metadata <- paste0("col_lis",1:ncol(metadata))
#> "col_lis1" "col_lis2" "col_lis3"
for (i in seq_len(ncol(metadata))){
m <- length(levels(metadata[,i]))
assign(paste0("col_lis",i),brewer.pal(n=m,sets[i]))
names()
#paste0("col_lis",i) <- setNames(names(eval(parse(text=paste0("col_lis",i)))),nm = c("sdf","dsfd","dfsf"))
}
I am not able to set names inside the loop, but I am able to assign values to the vector col_lis1.
Is there another way to do this?
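One possible approach is to keep the colour vectors in a single list instead of creating col_lis1, col_lis2, ... with assign(). This is a sketch under assumptions: the metadata data frame below is invented, and base R's rainbow() stands in for brewer.pal() so that the example needs no extra package.

```r
# Hypothetical metadata with two factor columns (an assumption for illustration).
metadata <- data.frame(
  tissue = factor(c("liver", "lung", "liver", "brain")),
  batch  = factor(c("a", "b", "a", "b"))
)

# One list element per column; each element is a colour vector
# named by the factor levels of that column.
col_lis <- lapply(metadata, function(column) {
  lev <- levels(column)
  setNames(rainbow(length(lev)), lev)
})

names(col_lis$tissue)  # the levels of the tissue column
```

Because the names are set when each vector is created, nothing has to be looked up by a pasted name afterwards.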

Output of a function in R change with the number of inputs

I am trying to download data from the USGS website using the dataRetrieval package in R and a function I created called getstreamflow. The code is the following:
siteNumber <- c("094985005","09498501","09489500","09489499","09498502")
Streamflow <- sapply(siteNumber, function(siteNumber) tryCatch(getstreamflow(siteNumber), error = function(e) message(paste("Error in station ", siteNumber))))
Streamflow <- Filter(NROW,Streamflow) #to delete empty data frames
I got the output I wanted, which is shown in the image below:
However, when I ran the same code with more stations in the input siteNumber,
the output changed: instead of producing several data frames inside a list, it generates a list for each data frame.
Does anyone know why this happens? It is the same function; only the number of stations in siteNumber changes.
Based on the image of the new output, each element in the list is itself nested in a list. We can extract the inner element (of length 1) with [[1]] and then apply Filter:
out <- Filter(NROW, lapply(Streamflow, function(x) x[[1]]))
Because we used NROW, the test also passed for the nested list elements: for a list, NROW returns its length, which is 1 here, so every element satisfied the condition. Also, in the previous step the OP uses sapply, which sometimes simplifies its output depending on the results. Use lapply instead (or specify simplify = FALSE).
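Both effects can be seen on toy data. This is an illustrative sketch only; it does not reproduce getstreamflow or the USGS data.

```r
# sapply() simplifies to a matrix when every result has the same length ...
same_len <- sapply(1:3, function(i) c(i, i^2))
is.matrix(same_len)   # TRUE

# ... but falls back to a list when the lengths differ.
diff_len <- sapply(1:3, function(i) seq_len(i))
is.list(diff_len)     # TRUE

# With simplify = FALSE (or lapply) the result is always a list.
always_list <- sapply(1:3, function(i) c(i, i^2), simplify = FALSE)
is.list(always_list)  # TRUE

# NROW() of an empty data frame is 0, but wrapped in a list it is 1,
# which is why Filter(NROW, ...) kept the nested elements.
n_empty  <- NROW(data.frame())        # 0
n_nested <- NROW(list(data.frame()))  # 1
```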

get() not working for column in a data frame in a list in R (phew)

I have a list of data frames. I want to use lapply on a specific column of each of those data frames, but I keep getting errors when I try methods from similar answers.
The setup is something like this:
a <- list(*a series of data frames that each have a column named DIM*)
dim_loc <- lapply(1:length(a), function(x){paste0("a[[", x, "]]$DIM")})
Eventually, I'll want to write something like results <- lapply(dim_loc, *some function on the DIMs*)
However, when I try get(dim_loc[[1]]), say, I get an error: Error in get(dim_loc[[1]]) : object 'a[[1]]$DIM' not found
But I can return values from function(a[[1]]$DIM) all day long. It's there.
I've tried working around this by using as.name() in the dim_loc assignment, but that doesn't seem to do the trick either.
I'm curious 1. what's up with get(), and 2. if there's a better solution. I'm constraining myself to the apply family of functions because I want to try to get out of the for-loop habit, and this name-as-list method seems to be preferred based on something like R- how to dynamically name data frames?, but I'd be interested in other, more elegant solutions, too.
I'd say that if you want to modify an object in place, you are better off using a for loop, since lapply would require the <<- assignment operator (<- doesn't work inside lapply). Like so:
set.seed(1)
aList <- list(cars = mtcars, iris = iris)
for(i in seq_along(aList)){
aList[[i]][["newcol"]] <- runif(nrow(aList[[i]]))
}
As opposed to...
invisible(
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <<- runif(nrow(aList[[x]]))
})
)
You have to use invisible(), otherwise lapply would print its output to the console. The <<- assigns the vector runif(...) to the newly created column.
If you want to produce another set of data frames using lapply, then you do:
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <- runif(nrow(aList[[x]]))
return(aList[[x]])
})
Also, may I suggest using seq_along(list) in lapply and for loops instead of 1:length(list), since it avoids unexpected behavior such as:
# zero-length list
seq_along(list()) # prints integer(0)
1:length(list()) # prints 1 0.
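As for why get() fails: it looks up an object by its name, and "a[[1]]$DIM" is an expression, not a name. A sketch of extracting the column directly follows. The two small data frames and the use of mean() are placeholders chosen for illustration.

```r
# Hypothetical list of data frames, each with a DIM column.
a <- list(
  data.frame(DIM = c(1, 2, 3)),
  data.frame(DIM = c(10, 20))
)

# lapply() already hands each data frame to the function,
# so the column can be pulled out without building strings.
dims <- lapply(a, function(df) df$DIM)

# Apply some function to each DIM column; mean() is only a placeholder.
results <- lapply(a, function(df) mean(df$DIM))
```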

How to check if a column has numeric or categorical levels in R?

I am trying to plot 9 barplots in a 3x3 grid in R using base R inside a for loop. (I am working on a workhorse solution for visualizing every column before I begin manipulating the data.) Below is the code:
library(ISLR);
library(ggplot2);
# load wage data
data(Wage)
par(mfrow=c(3,3))
for(i in 1:(dim(Wage)[2]-2)){
plot(Wage[,i],main = paste0(names(Wage)[i]),las = 2)
}
Unfortunately, this does not work properly for the first 2 columns, because they are numeric and actually need a histogram. I understand that I need to fit an if-else condition somewhere inside the for() statement, but that gives me errors. Below is the output, where the first 2 columns are plotted wrong. (Age and year are actually numeric, and I may need to put them on the x-axis instead of defaulting them to y.)
Could you suggest an edit or hack? I also learnt that I can't use par() when I am wrapping ggplot inside for, so I had to use base R; otherwise ggplot would have been great aesthetically.
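One way to fit the if-else into the loop is to test each column with is.numeric(). This is a sketch rather than a tested fix for the Wage data: the built-in iris data set stands in for Wage so that no extra package is needed.

```r
# Built-in iris stands in for Wage (an assumption, to stay self-contained).
df <- iris

# TRUE for numeric columns, FALSE for factors and other types.
is_num <- vapply(df, is.numeric, logical(1))

par(mfrow = c(2, 3))
for (i in seq_along(df)) {
  column <- df[[i]]
  if (is_num[i]) {
    # Numeric column: histogram with the values on the x-axis.
    hist(column, main = names(df)[i], xlab = names(df)[i])
  } else {
    # Categorical column: bar plot of the level counts.
    barplot(table(column), main = names(df)[i], las = 2)
  }
}
```

The same test should carry over to Wage by replacing df and adjusting the mfrow layout.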

R: Call a pasted variable name and use it as position argument

I am trying to replace all values of r for which r<=10 with the value of the 1st observation in x (which is 1). This is just a very simplified example of what I am trying to do, so please do not question why I'm trying to do this in a complicated way because the full code is more complicated. The only thing I need help with is figuring out how to use the vector I created (p1) to replace r[p1] or equivalently r[c(1,2,3,4)] with x[ 1 ] (which is equal to 1). I can not write p1 explicitly because it will be generated in a loop (not shown in code).
x=c(1,2,3)
r=c(1,3,7,10,15)
assign(paste0("p", x[1]), which(r<=10))
p1
r[paste0("p", x[1])]=x[1]
In the code above, I tried using r[paste0("p", x[1])]=x[1], but this is the output I end up with
whereas I would like to see this output
Basically, I need to figure out a way to refer to p1 in r[??]=x[1] without explicitly typing p1.
I have included the full code I am attempting below in case context is needed.
##Creates a function to generate discrete random values from a specified pmf
##n is the number of random values you wish to generate
##x is a vector of discrete values (e.g. c(1,2,3))
##pmf is the associated pmf for the discrete values (e.g. c(.3,.2,.5))
r.dscrt.pmf=function(n,x,pmf){
set.seed(1)
##Generate n uniform random values from 0 to 1
r=runif(n)
high=0
low=0
for (i in 1:length(x)){
##High will establish the appropriate upper bound to consider
high=high+pmf[i]
if (i==1){
##Creates the variable p1 which contains the positions of all
##r less than or equal to the first value of pmf
assign(paste0("p", x[i]), which(r<=pmf[i]))
} else {
##Creates the variable p2,p3,p4,etc. which contains the positions of all
##r between the appropriate interval of high and low
assign(paste0("p", x[i]), which(r>low & r<=high))
}
##Low will establish the appropriate lower bound to consider
low=low+pmf[i]
}
for (i in 1:length(x)){
##Will loops to replace the values of r at the positions specified at
##p1,p2,p3,etc. with x[1],x[2],x[3],etc. respectively.
r[paste0("p", x[i])]=x[i]
}
##Returns the new r
r
}
##Example call of the function
r.dscrt.pmf(10,c(0,1,3),c(.3,.2,.5))
get is the counterpart of assign: it lets you refer to a variable by a string instead of by name.
r[get(paste0("p", x[1]))]=x[1]
But using get is usually a red flag that the code could be written in a much clearer and safer way.
Would this suit your needs?
ifelse(r<11, x[1], r)
[1] 1 1 1 1 15
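If the positions really do have to be stored under generated names, a named list is one alternative to assign() and get(). A minimal sketch, where the list name pos is an assumption introduced for illustration:

```r
x <- c(1, 2, 3)
r <- c(1, 3, 7, 10, 15)

# Keep the positions in a named list instead of separate p1, p2, ... variables.
pos <- list()
pos[[paste0("p", x[1])]] <- which(r <= 10)

# Index the list by the same constructed name; no get() or assign() needed.
r[pos[[paste0("p", x[1])]]] <- x[1]
r
```

In the full function, the same list could be filled inside the first loop and read back inside the second.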
