‘l’ is no list of MALDIquant::MassPeaks objects

I'm following a tutorial on the MALDIquant package, but I get an error when executing this line:
peaks <- binPeaks(peaks, tolerance=0.002)
The error is:
Error: binPeaks(peaks, tolerance = 0.002) : ‘l’ is no list of MALDIquant::MassPeaks objects!
When I run class(peaks):
> class(peaks)
[1] "MassPeaks"
attr(,"package")
[1] "MALDIquant"

binPeaks() only works on a list of MassPeaks objects, and your class() output shows that you have a single MassPeaks object. Wrap it in a list first, e.g. binPeaks(list(peaks), tolerance=0.002).
## load package
library("MALDIquant")
## create two MassPeaks objects
p <- list(createMassPeaks(mass=seq(100, 500, 100), intensity=1:5),
          createMassPeaks(mass=c(seq(100.2, 300.2, 100), 395), intensity=1:4))
binnedPeaks <- binPeaks(p, tolerance=0.002)
## but not:
binPeaks(p[[1]])
# Error: binPeaks(peaks, tolerance = 0.002) : ‘l’ is no list of MALDIquant::MassPeaks objects!
Please look at ?binPeaks for details, or write me an email (I am the maintainer; you will find my email address on CRAN).

Extraction of specific result in R outputs

I want to extract the values of "b1p" and "b2p" from the output of the mardia command and save them in bskew.
For this I used the "psych" package; my R version is 4.0.3. I tried several commands for extraction, but all failed:
bskew <- mardia$b1p
bskew <- mardia["b1p"]
bskew <- mardia[["b1p"]]
For these I got the error "object of type 'closure' is not subsettable".
Using names() I got only the component names, and class() returned "psych" "mardia".
Using summary() I got the message "Warning message:
In summary.psych(mardia(x)) :
I am sorry, I do not have a summary function for this object", and then I tried mna$coefficients,
which returned "NULL".
I saved the result of my mardia command in mna.
A minimum working example:
library(psych)
library(mvtnorm)
n0 <- 5
p0 <- 2
m0 <- rep(0, p0)
s0 <- diag(1, p0)
x <- rmvnorm(n0, mean = m0, sigma = s0)
mna <- mardia(x)
class(mna)
names(mna)
summary(mardia(x))
summary(mna)
mna$coefficients
## failed attempts (these subset the function itself, not a result):
mardia$"b1p"
bskew <- mardia["b1p"]
bskew <- mardia[["b1p"]]
bkurt <- mardia[["b2p"]]
sk1 <- mna$coefficients[[3]]
The error occurs because you're trying to subset the function mardia itself, which always throws the error you saw; you should subset the mna object (the result of calling mardia(x)) instead of the function.
> mna$b1p
[1] 1.95888
> mna["b1p"]
$b1p
[1] 1.95888
> mna[["b1p"]]
[1] 1.95888
> mardia(x)$b1p
[1] 1.95888
> mardia$b1p
Error in mardia$b1p : object of type 'closure' is not subsettable
> mardia<-mardia(x)
> mardia$b1p
[1] 1.95888
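The closure error is not specific to psych; it appears whenever you apply $ or [[ to a function instead of to its return value. A minimal base-R sketch (the function f and its b1p component are invented for illustration):

```r
## a toy function that, like psych::mardia(), returns a list-like object
f <- function(x) list(b1p = x + 1)

## subsetting the function itself fails with the familiar closure error
res <- try(f$b1p, silent = TRUE)
inherits(res, "try-error")
#> [1] TRUE

## subsetting the *result of calling it* works
out <- f(1)
out$b1p
#> [1] 2
```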

How to reuse sparklyr context with mclapply?

I have R code that does some distributed data preprocessing in sparklyr and then collects the data into a local R data frame to finally save the result as CSV. Everything works as expected, and now I plan to re-use the Spark context across the processing of multiple input files.
My code looks similar to this reproducible example:
library(dplyr)
library(sparklyr)
sc <- spark_connect(master = "local")
# Generate random input
matrix(rbinom(1000, 1, .5), ncol=1) %>% write.csv('/tmp/input/df0.csv')
matrix(rbinom(1000, 1, .5), ncol=1) %>% write.csv('/tmp/input/df1.csv')
# Multi-job input
input <- list(
  list(name="df0", path="/tmp/input/df0.csv"),
  list(name="df1", path="/tmp/input/df1.csv")
)
global_parallelism = 2
results_dir = "/tmp/results2"
# Function executed on each file
f <- function(job) {
  spark_df <- spark_read_csv(sc, "df_tbl", job$path)
  local_df <- spark_df %>%
    group_by(V1) %>%
    summarise(n = n()) %>%
    sdf_collect()
  output_path <- paste(results_dir, "/", job$name, ".csv", sep="")
  local_df %>% write.csv(output_path)
  return(output_path)
}
If I execute the function over the job inputs sequentially with lapply, everything works as expected:
> lapply(input, f)
[[1]]
[1] "/tmp/results2/df0.csv"
[[2]]
[1] "/tmp/results2/df1.csv"
However, if I run it in parallel to maximize usage of the Spark context (while the local R session is still post-processing df0, Spark could already be working on df1):
> library(parallel)
> library(MASS)
> mclapply(input, f, mc.cores = global_parallelism)
*** caught segfault ***
address 0x560b2c134003, cause 'memory not mapped'
[[1]]
[1] "Error in as.vector(x, \"list\") : \n cannot coerce type 'environment' to vector of type 'list'\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in as.vector(x, "list"): cannot coerce type 'environment' to vector of type 'list'>
[[2]]
NULL
Warning messages:
1: In mclapply(input, f, mc.cores = global_parallelism) :
scheduled core 2 did not deliver a result, all values of the job will be affected
2: In mclapply(input, f, mc.cores = global_parallelism) :
scheduled core 1 encountered error in user code, all values of the job will be affected
When I do something similar in Python with ThreadPoolExecutor, the Spark context is shared across threads, and the same holds for Scala and Java.
Is it possible to reuse the sparklyr context in parallel execution in R?
Yeah, unfortunately, the sc object, which is of class spark_connection, cannot be exported to another R process (even when forked processing is used). If you use the future.apply package, part of the future ecosystem, you can see this with:
library(future.apply)
plan(multicore)
## Look for non-exportable objects and give an error if found
options(future.globals.onReference = "error")
y <- future_lapply(input, f)
That will throw:
Error: Detected a non-exportable reference (‘externalptr’) in one of the
globals (‘sc’ of class ‘spark_connection’) used in the future expression

Change default argument(s) of S3 Methods in R

Is it possible to change default argument(s) of S3 Methods in R?
It's easy enough to change arguments using formals ...
# return default arguments of table
> args(table)
function (..., exclude = if (useNA == "no") c(NA, NaN), useNA = c("no",
"ifany", "always"), dnn = list.names(...), deparse.level = 1)
# Update an argument
> formals(table)$useNA <- "always"
# Check change
> args(table)
function (..., exclude = if (useNA == "no") c(NA, NaN), useNA = "always",
dnn = list.names(...), deparse.level = 1)
But not for S3 methods ...
# View default argument of S3 method
> formals(utils:::str.default)$list.len
[1] 99
# Attempt to change
> formals(utils:::str.default)$list.len <- 99
Error in formals(utils:::str.default)$list.len <- 99 :
object 'utils' not found
At @nicola's generous prompting, here is an answer version of the comments:
You can edit S3 methods and other non-exported functions using assignInNamespace(). This lets you replace a function in a given namespace with a new user-defined function (fixInNamespace() will open the target function in an editor to let you make a change).
# Take a look at what we are going to change
formals(utils:::str.default)$list.len
#> [1] 99
# extract the whole function from utils namespace
f_to_edit <- utils:::str.default
# make the necessary alterations
formals(f_to_edit)$list.len <- 900
# Now we substitute our new improved version of str.default inside
# the utils namespace
assignInNamespace("str.default", f_to_edit, ns = "utils")
# and check the result
formals(utils:::str.default)$list.len
#> [1] 900
If you restart your R session you'll recover the defaults (or you can put them back manually in the current session).
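For S3 methods you define yourself (in the global environment rather than a locked namespace), plain formals()<- works without any namespace tricks; print.myclass and its verbose argument below are invented for illustration:

```r
## a toy S3 method with a default argument
print.myclass <- function(x, verbose = FALSE, ...) {
  if (verbose) cat("class:", class(x), "\n")
  cat("value:", unclass(x), "\n")
  invisible(x)
}

## change the default directly -- no assignInNamespace() needed
formals(print.myclass)$verbose <- TRUE
formals(print.myclass)$verbose
#> [1] TRUE
```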

R stm package error: "vectorized sources must have a positive length entry"

I think I'm making a pretty simple mistake, but I'm a rookie at R and am having a hard time figuring it out. I'm trying to use the stm package in R to do some topic modeling on a dataset of tweets I scraped.
The dataset is formatted in two columns: one with the name of the tweet sender (column header "meta"), and the other with the text of the tweet (column header "vocab"). After running the script below, I get the following errors:
Error: is.Source(s) is not TRUE
In addition: Warning message:
In is.Source(s) : vectorized sources must have a positive length entry
library(stm)
library(igraph)
setwd("c:/Users/Adam/Desktop/RTwitter")
data <-read.csv("TweetDataSTM.csv")
processed <- textProcessor(data$documents, metadata = data)
out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
docs <- out$documents
vocab <- out$vocab
meta <-out$meta
> library(stm)
> library(igraph)
> setwd("c:/Users/Adam/Desktop/RTwitter")
>
> rm(list=ls())
>
> data <-read.csv("TweetDataSTM.csv")
> processed <- textProcessor(data$documents, metadata = data)
Building corpus...
Error: is.Source(s) is not TRUE
In addition: Warning message:
In is.Source(s) : vectorized sources must have a positive length entry
> out <- prepDocuments(processed$documents, processed$vocab, processed$meta)
Error in prepDocuments(processed$documents, processed$vocab, processed$meta) :
object 'processed' not found
> docs <- out$documents
Error: object 'out' not found
> vocab <- out$vocab
Error: object 'out' not found
> meta <-out$meta
Error: object 'out' not found
(Any advice would be greatly appreciated!)
-Adam
I think your mistake occurs because your columns are named vocab and meta. But here
processed <- textProcessor(data$documents, metadata = data)
you are trying to access a column documents that, as far as I can see, does not exist in your data.frame. Try changing the code to:
processed <- textProcessor(data$vocab, metadata = data)
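You can see the root cause without stm at all: asking a data.frame for a column it does not have returns NULL, which is the zero-length "source" the warning complains about. The toy data below mirrors the question's column names:

```r
## two columns named "meta" and "vocab", as in the question
data <- data.frame(meta  = c("alice", "bob"),
                   vocab = c("first tweet", "second tweet"),
                   stringsAsFactors = FALSE)

data$documents           # no such column
#> NULL
length(data$documents)   # hence "must have a positive length entry"
#> [1] 0
data$vocab               # the column that actually holds the text
#> [1] "first tweet"  "second tweet"
```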

Using geterrmessage() in a loop - R

My objective here is to capture the error that R throws and store it in an object.
Here is some dummy code (given the output below, a <- 1:10 and b <- -8:1):
a <- 1:10
b <- -8:1
for (i in seq_along(a)) {
  try(
    if (i == 4) print(a[i] / "b") else print(a[i] / b[i])
  )
}
[1] -0.125
[1] -0.2857143
[1] -0.5
Error in a[i]/"b" : non-numeric argument to binary operator
[1] -1.25
[1] -2
[1] -3.5
[1] -8
[1] Inf
[1] 10
So I want to capture that on the 4th iteration the error was: Error in a[i]/"b" : non-numeric argument to binary operator into an object say:
error <- c()
iferror(error[i] <- geterrmessage())
I am aware that iferror is not an actual function in R; I'm just sketching the idea, because geterrmessage() captures only the last error it sees.
So for the example I want error[1:3] and error[5:10] to be NA, because there is no error, but
error[4] <- "Error in a[i]/"b" : non-numeric argument to binary operator"
so that later I can check the error object and understand where and what errors happened.
If you can help me write such code, that would be excellent and highly appreciated.
I hope the following function helps:
a <- c(0:6)
b <- c(-3:3)
create_log <- function(logfile_name, save_path) {
  warning("Error messages not visible. Use closeAllConnections() at the end of the script")
  if (file.exists(paste0(save_path, logfile_name))) {
    file.remove(paste0(save_path, logfile_name))
  }
  fid <- file(paste0(save_path, logfile_name), open = "wt")
  sink(fid, type = "message", split = FALSE)           # warnings are NOT displayed; split = TRUE is not possible for messages
  sink(fid, append = TRUE, type = "output", split = TRUE)  # print, cat
  return(NULL)
}
create_log("test.csv", "C:/Test/")
for (i in seq_along(a)) {
  try(
    if (i == 4) print(a[i] / "b") else print(a[i] / b[i])
  )
}
closeAllConnections()
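As an alternative to sink-based logging, the per-iteration error messages can be stored directly in a character vector with tryCatch(), which is closer to the error[i] <- geterrmessage() idea in the question. A sketch using the same a, b, and deliberate error on the 4th iteration:

```r
a <- 1:10
b <- -8:1

errors <- rep(NA_character_, length(a))   # NA where no error occurred
for (i in seq_along(a)) {
  tryCatch(
    if (i == 4) print(a[i] / "b") else print(a[i] / b[i]),
    error = function(e) errors[i] <<- conditionMessage(e)
  )
}

errors[4]
#> [1] "non-numeric argument to binary operator"
which(!is.na(errors))
#> [1] 4
```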
