Avoid getting the output of sapply() in R Markdown - r

I made a function systemFail(x), where x is one of the columns in my data frame (df$location).
Now I want to create a new column in my data frame (df$outcome) with the result of this function (Pass/Fail result). I used the line of code below which made the extra column as I wanted. However, annoyingly the result (a long column of Passes and Fails) also shows up in my R Markdown document.
How can I get the extra column in my data frame without getting the result also in my R Markdown document?
df$outcome <- sapply(df$location, systemFail)

It is difficult to answer without knowing details of systemFail function. However, from your previous question it seems you are printing the values in the function. Instead of printing use return to return the result.
Look at this simple example -
systemFail <- function(x) {
print(x)
}
res <- systemFail('out')
#[1] "out"
and when we use return nothing is printed and the result is available in res.
systemFail <- function(x) {
return(x)
}
res <- systemFail('out')

Related

Convert R list to Pythonic list and output as a txt file

I'm trying to convert these lists like Python's list. I've used these codes
library(GenomicRanges)
library(data.table)
library(Repitools)
pcs_by_tile<-lapply(as.list(1:length(tiled_chr)) , function(x){
obj<-tileSplit[[as.character(x)]]
if(is.null(obj)){
return(0)
} else {
runs<-filtered_identical_seqs.gr[obj]
df <- annoGR2DF(runs)
score = split(df[,c("start","end")], 1:nrow(df[,c("start","end")]))
#print(score)
return(score)
}
})
dt_text <- unlist(lapply(tiled_chr$score, paste, collapse=","))
writeLines(tiled_chr, paste0("x.txt"))
The following line of code iterates through each row of the DataFrame (only 2 columns) and splits them into the list. However, its output is different from what I desired.
score = split(df[,c("start","end")], 1:nrow(df[,c("start","end")]))
But I wanted the following kinda output:
[20350, 20355], [20357, 20359], [20361, 20362], ........
If I understand your question correctly, using as.tuple from the package 'sets' might help. Here's what the code might look like
library(sets)
score = split(df[,c("start","end")], 1:nrow(df[,c("start","end")]))
....
df_text = unlist(lapply(score, as.tuple),recursive = F)
This will return a list of tuples (and zeroes) that look more like what you are looking for. You can filter out the zeroes by checking the type of each element in the resulting list and removing the ones that match the type. For example, you could do something like this
df_text_trimmed <- df_text[!lapply(df_text, is.double)]
to get rid of all your zeroes
Edit: Now that I think about it, you probably don't even need to convert your dataframes to tuples if you don't want to. You just need to make sure to include the 'recursive = F' option when you unlist things to get a list of 0s and dataframes containing the numbers you want.

get() not working for column in a data frame in a list in R (phew)

I have a list of data frames. I want to use lapply on a specific column for each of those data frames, but I keep throwing errors when I tried methods from similar answers:
The setup is something like this:
a <- list(*a series of data frames that each have a column named DIM*)
dim_loc <- lapply(1:length(a), function(x){paste0("a[[", x, "]]$DIM")}
Eventually, I'll want to write something like results <- lapply(dim_loc, *some function on the DIMs*)
However, when I try get(dim_loc[[1]]), say, I get an error: Error in get(dim_loc[[1]]) : object 'a[[1]]$DIM' not found
But I can return values from function(a[[1]]$DIM) all day long. It's there.
I've tried working around this by using as.name() in the dim_loc assignment, but that doesn't seem to do the trick either.
I'm curious 1. what's up with get(), and 2. if there's a better solution. I'm constraining myself to the apply family of functions because I want to try to get out of the for-loop habit, and this name-as-list method seems to be preferred based on something like R- how to dynamically name data frames?, but I'd be interested in other, more elegant solutions, too.
I'd say that if you want to modify an object in place you are better off using a for loop since lapply would require the <<- assignment symbol (<- doesn't work on lapply`). Like so:
set.seed(1)
aList <- list(cars = mtcars, iris = iris)
for(i in seq_along(aList)){
aList[[i]][["newcol"]] <- runif(nrow(aList[[i]]))
}
As opposed to...
invisible(
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <<- runif(nrow(aList[[x]]))
})
)
You have to use invisible() otherwise lapply would print the output on the console. The <<- assigns the vector runif(...) to the new created column.
If you want to produce another set of data.frames using lapply then you do:
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <- runif(nrow(aList[[x]]))
return(aList[[x]])
})
Also, may I suggest the use of seq_along(list) in lapply and for loops as opposed to 1:length(list) since it avoids unexpected behavior such as:
# no length list
seq_along(list()) # prints integer(0)
1:length(list()) # prints 1 0.

Print command used to create results in loop

Let's say I have the following loop:
test <- mtcars
target_var <- c("mpg", "wt")
group_var <- c("gear", "carb")
library(tidyverse)
for (i in target_var) {
for (j in group_var) {
print(prop.table(table(test[, i], test[, j]), 2))
}
}
When I run it, I see four prop.tables reported, each with a heading of [1] through [[4]]. I can visually match up which prop.table goes with which variables by looking at the code and seeing the order of variables assigned to target_var and group_var.
But let's say I'm looping through a dozen variables or more. Obviously, it's a problem to match up one particular prop.table (listed as, say, "[[14]]") with the actual variables in that table.
Is there a way to print the command R used to generate each particular table?
I am not looking for a progress bar but, instead, something that would list the code above each table printed in the console. For example, the following code would be printed right above the appropriate prop.table:
prop.table(table(test$mpg, test$gear), 2)
# actual prop.table result from the first time through the loop here
prop.table(table(test$mpg, test$carb), 2)
# actual prop.table result from the second time through the loop here
This would help me when doing exploratory data analysis.
If I understand well you want the results of your function to be printed "as you go" (i.e. at each iteration of the loop, not at the end)?
You can use cat instead of print (be sure to print a new line \n after you call cat to avoid a messy console):
for(i in 1:3) {
results <- mean(rnorm(10))
cat(results)
cat("\n")
}

Writing a loop in R

I have written a loop in R. The code is expected to go through a list of variables defined in a list and then for each of the variables perform a function.
Problem 1 - I cannot loop through the list of variables
Problem 2 - I need to insert each output from the values into Mongo DB
Here is an example of the list:
121715771201463_626656620831011
121715771201463_1149346125105084
Based on this value - I am running a code and i want this output to be inserted into MongoDB. Right now only the first value and its corresponding output is inserted
test_list <-
C("121715771201463_626656620831011","121715771201463_1149346125105084","121715771201463_1149346125105999")
for (i in test_list)
{ //myfunction//
mongo.insert(mongo, DBNS, i)
}
I am able to only pick the values for the first value and not all from the list
Any help is appreciated.
Try this example, which prints the final characters
myfunction <- function(x){ print( substr(x, 27, nchar(x)) ) }
test_list <- c("121715771201463_626656620831011",
"121715771201463_1149346125105084",
"121715771201463_1149346125105999")
for (i in test_list){ myfunction(i) }
for (j in 1:length(test_list)){ myfunction(test_list[j]) }
The final two lines should each produce
[1] "31011"
[1] "105084"
[1] "105999"
It is not clear whether "variable" is the same as "value" here.
If what you mean by variable is actually an element in the list you construct, then I think Ilyas comment above may solve the issue.
If "variable" is instead an object in the workspace, and elements in the list are the names of the objects you want to process, then you need to make sure that you use get. Like this:
for(i in ls()){
cat(paste(mode(get(i)),"\n") )
}
ls() returns a list of names of objects. The loop above goes through them all, uses get on them to get the proper object. From there, you can do the processing you want to do (in the example above, I just printed the mode of the object).
Hope this helps somehow.

R store output from lapply with multiple functions

I'm using lapply to loop through a list of dataframes and apply the same set of functions. This works fine when lapply has just one function, but I'm struggling to see how I store/print the output from multiple functions - in that case, I seem to only get output from one 'loop'.
So this:
output <- lapply(dflis,function(lismember) vss(ISEQData,n=9,rotate="oblimin",diagonal=F,fm="ml"))
works, while the following doesn't:
output <- lapply(dflis,function(lismember){
outputvss <- vss(lismember,n=9,rotate="oblimin",diagonal=F,fm="ml")
nefa <- (EFA.Comp.Data(Data=lismember, F.Max=9, Graph=T))
})
I think this dummy example is an analogue, so in other words:
nbs <- list(1==1,2==2,3==3,4==4)
nbsout <- lapply(nbs,function(x) length(x))
Gives me something I can access, while I can't see how to store output using the below (e.g. the attempt to use nbsout[[x]][2]):
nbs <- list(1==1,2==2,3==3,4==4)
nbsout <- lapply(nbs,function(x){
nbsout[[x]][1]<-typeof(x)
nbsout[[x]][2]<-length(x)
}
)
I'm using RStudio and will then be printing outputs/knitting html (where it makes sense to display the results from each dataset together, rather than each function-output for each dataset sequentially).
You should return a structure that include all your outputs. Better to return a named list. You can also return a data.frame if your outputs have all the same dimensions.
otutput <- lapply(dflis,function(lismember){
outputvss <- vss(lismember,n=9,rotate="oblimin",diagonal=F,fm="ml")
nefa <- (EFA.Comp.Data(Data=lismember, F.Max=9, Graph=T))
list(outputvss=outputvss,nefa=nefa)
## or data.frame(outputvss=outputvss,nefa=nefa)
})
When you return a data.frame you can use sapply that simply outputs the final result to a big data.frame. Or you can use the classical:
do.call(rbind,output)
to aggregate the result.
A function should always have an explicit return value, e.g.
output <- lapply(dflis,function(lismember){
outputvss <- vss(lismember,n=9,rotate="oblimin",diagonal=F,fm="ml")
nefa <- (EFA.Comp.Data(Data=lismember, F.Max=9, Graph=T))
#return value:
list(outputvss, nefa)
})
output is then a list of lists.

Resources