How to print the error of a system call in R

I'm using the system function in R to run a program that I expect to print an error message in some cases, and I want to handle that case with tryCatch.
system(command, intern = TRUE) only returns the values echoed by the program I'm running; it does not return the error.
In R, how can I get the error message produced by the system call?
My code:
test <- tryCatch({
cmd <- paste0("../Scripts/Plink2/plink --file ../InputData/",prefix," --bmerge ",
"../InputData/fs --missing --out ../InputData/",prefix)
print(cmd)
system(cmd)
} , error = function(e) {
# error handler picks up where error was generated
print("EZEL")
print(paste("MY_ERROR: ",e))
}, finally = {
print("something")
})
[1] "../Scripts/Plink2/plink --file ../InputData/GS80Kdata --bmerge ../InputData/fs --missing --out ../InputData/GS80Kdata"
PLINK v1.90b3.37 64-bit (16 May 2016) https://www.cog-genomics.org/plink2
#....
#skipping some lines here to reduce size
#....
Of these, 1414410 are new, while 2462 are present in the base dataset.
Error: 1 variant with 3+ alleles present.
* If you believe this is due to strand inconsistency, try --flip with
# Skipping some more lines here
[1] "something"
However, using intern=TRUE and assigning the result of system() to a variable does not capture the error in the vector either; it is still printed in the R console.
Edit: Here is the output of the vector (using gsub to reduce its size)
> gsub(pattern="\b\\d.*", replacement = "", x = tst)
[1] "PLINK v1.90b3.37 64-bit (16 May 2016) https://www.cog-genomics.org/plink2"
[2] "(C) 2005-2016 Shaun Purcell, Christopher Chang GNU General Public License v3"
[3] "Logging to ../InputData/GS80Kdata.log."
[4] "Options in effect:"
[5] " --bmerge ../InputData/fs"
[6] " --file ../InputData/GS80Kdata"
[7] " --missing"
[8] " --out ../InputData/GS80Kdata"
[9] ""
[10] "64381 MB RAM detected; reserving 32190 MB for main workspace."
[11] "Scanning .ped file... 0%\b"
[12] "2%\b\b"
[13] "%\b\b"
[14] "\b\b"
[15] "\b"
[16] ""
[17] "58%\b\b"
[18] "7%\b\b"
[19] "%\b\b"
[20] "\b\b"
[21] "\b"
[22] "Performing single-pass .bed write (42884 variants, 14978 people)."
[23] "0%\b"
[24] "../InputData/GS80Kdata-temporary.bim + ../InputData/GS80Kdata-temporary.fam"
[25] "written."
[26] "14978 people loaded from ../InputData/GS80Kdata-temporary.fam."
[27] "144 people to be merged from ../InputData/fs.fam."
[28] "Of these, 140 are new, while 4 are present in the base dataset."
[29] "42884 markers loaded from ../InputData/GS80Kdata-temporary.bim."
[30] "1416872 markers to be merged from ../InputData/fs.bim."
[31] "Of these, 1414410 are new, while 2462 are present in the base dataset."
attr(,"status")
[1] 3
>
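One possible explanation, stated as an assumption rather than something the output above proves: PLINK writes its "Error:" line to stderr, while system(cmd, intern = TRUE) captures only stdout. A non-zero exit status is also not an R error, so the tryCatch handler never fires. A minimal sketch using system2(), which can capture both streams. The Rscript one-liner is a hypothetical stand-in for the plink command, and run_cmd is an illustrative helper name, not a base R function:

```r
# Hypothetical stand-in for plink: prints one line to stdout, one to stderr,
# then exits with status 3 (the status shown in the question).
rscript <- file.path(R.home("bin"), "Rscript")
expr <- 'cat("normal output\\n"); message("Error: 1 variant with 3+ alleles present."); quit(save = "no", status = 3)'

# Illustrative helper (not part of base R): run a command, capture stdout and
# stderr together, and turn a non-zero exit status into an R error.
run_cmd <- function(command, args) {
  out <- suppressWarnings(
    system2(command, args, stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")
  if (!is.null(status) && status != 0) {
    stop(paste(c(paste("command exited with status", status), out),
               collapse = "\n"), call. = FALSE)
  }
  out
}

res <- tryCatch(
  run_cmd(rscript, c("-e", shQuote(expr))),
  error = function(e) {
    # The handler now runs, and the message includes the captured stderr text
    paste("MY_ERROR:", conditionMessage(e))
  }
)
print(res)
```

For the real command, the same helper would be called with the plink path as command and the flags as a character vector of args. Checking attr(out, "status") directly, without stop(), is an equally reasonable design if an R error is not wanted.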

Related

List.files based on numbers

I am trying to create a list of files on which I want to run a function. I created a pattern which matches 35 files which I want to use.
mypattern <- paste0("NBS_NLoans_since2009_", seq(1, 35),".xls")
[1] "NBS_NLoans_since2009_1.xls" "NBS_NLoans_since2009_2.xls" "NBS_NLoans_since2009_3.xls" "NBS_NLoans_since2009_4.xls"
[5] "NBS_NLoans_since2009_5.xls" "NBS_NLoans_since2009_6.xls" "NBS_NLoans_since2009_7.xls" "NBS_NLoans_since2009_8.xls"
[9] "NBS_NLoans_since2009_9.xls" "NBS_NLoans_since2009_10.xls" "NBS_NLoans_since2009_11.xls" "NBS_NLoans_since2009_12.xls"
[13] "NBS_NLoans_since2009_13.xls" "NBS_NLoans_since2009_14.xls" "NBS_NLoans_since2009_15.xls" "NBS_NLoans_since2009_16.xls"
[17] "NBS_NLoans_since2009_17.xls" "NBS_NLoans_since2009_18.xls" "NBS_NLoans_since2009_19.xls" "NBS_NLoans_since2009_20.xls"
[21] "NBS_NLoans_since2009_21.xls" "NBS_NLoans_since2009_22.xls" "NBS_NLoans_since2009_23.xls" "NBS_NLoans_since2009_24.xls"
[25] "NBS_NLoans_since2009_25.xls" "NBS_NLoans_since2009_26.xls" "NBS_NLoans_since2009_27.xls" "NBS_NLoans_since2009_28.xls"
[29] "NBS_NLoans_since2009_29.xls" "NBS_NLoans_since2009_30.xls" "NBS_NLoans_since2009_31.xls" "NBS_NLoans_since2009_32.xls"
[33] "NBS_NLoans_since2009_33.xls" "NBS_NLoans_since2009_34.xls" "NBS_NLoans_since2009_35.xls"
Then I used the pattern to get those files from my directory. I got only one file. I have tried different patterns but either I got one file or more than 35 files. Thanks for any suggestion.
list.files(pattern = mypattern)
[1] "NBS_NLoans_since2009_1.xls"
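The output above suggests that list.files used only the first element of mypattern: pattern is meant to be a single regular expression, not a vector of file names. Two possible ways around this, sketched in a temporary directory with made-up files (demo_dir, extra and the file set are assumptions for illustration only):

```r
# Hypothetical demo directory so the sketch is self-contained
demo_dir <- tempfile("nbs_demo")
dir.create(demo_dir)
wanted <- paste0("NBS_NLoans_since2009_", seq(1, 35), ".xls")
extra  <- c("NBS_NLoans_since2009_36.xls", "NBS_NLoans_since2009_100.xls")
invisible(file.create(file.path(demo_dir, c(wanted, extra))))

# Option 1: a single regular expression covering the numbers 1 to 35
rx <- "^NBS_NLoans_since2009_([1-9]|[12][0-9]|3[0-5])\\.xls$"
by_regex <- list.files(demo_dir, pattern = rx)

# Option 2: list everything, then keep only the exact names that are wanted
by_name <- intersect(wanted, list.files(demo_dir))

length(by_regex)
length(by_name)
```

Option 2 avoids writing a numeric range as a regular expression, which tends to be the more readable choice when the wanted names are already available as a vector.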

list files pattern select date

Hello, I have a set of daily meteorological data. Using the expression:
f <- list.files(getwd(), include.dirs=TRUE, recursive=TRUE, pattern= "PREC")
I select only the precipitation files.
How can I select only the files for January, i.e. those whose date has the form yyyy01dd (for example 20170103, in yyyymmdd format)?
The files are named like this: "PREC_20010120.grd".
Try pattern='PREC_\\d{4}01\\d[2].*'.
PREC_ literally
\\d{4} four digits
01 "01" literally
\\d{2} two digits
.* any character repeatedly
Thank you, but I retrieved only 35 items instead of 31 days * 10 years. What's wrong?
[1] "20100102/PREC_20100102.tif" "20100112/PREC_20100112.tif"
[3] "20100122/PREC_20100122.tif" "20110102/PREC_20110102.tif"
[5] "20110112/PREC_20110112.tif" "20110122/PREC_20110122.tif"
[7] "20120102/PREC_20120102.tif" "20120112/PREC_20120112.tif"
[9] "20120122/PREC_20120122.tif" "20130102/PREC_20130102.tif"
[11] "20130112/PREC_20130112.tif" "20130122/PREC_20130122.tif"
[13] "20140102/PREC_20140102.tif" "20140112/PREC_20140112.tif"
[15] "20140122/PREC_20140122.tif" "20150102/PREC_20150102.tif"
[17] "20150112/PREC_20150112.tif" "20150122/PREC_20150122.tif"
[19] "20160102/PREC_20160102.tif" "20160112/PREC_20160112.tif"
[21] "20160122/PREC_20160122.tif" "20170102/PREC_20170102.tif"
[23] "20170112/PREC_20170112.tif" "20170122/PREC_20170122.tif"
[25] "20180102/PREC_20180102.tif" "20180112/PREC_20180112.tif"
[27] "20180122/PREC_20180122.tif" "20190102/PREC_20190102.tif"
[29] "20190112/PREC_20190112.tif" "20190122/PREC_20190122.tif"
[31] "20200102/PREC_20200102.tif" "20200112/PREC_20200112.tif"
[33] "20200122/PREC_20200122.tif" "20210102/PREC_20210102.tif"
[35] "20210112/PREC_20210112.tif" "20210122/PREC_20210122.tif"
Resolved with:
f <- list.files(getwd(), include.dirs=TRUE, recursive=TRUE, pattern='PREC_\\d{4}01.*')
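A likely reason the first suggestion returned only days 02, 12 and 22: in a regular expression, [2] is a character class that matches a literal 2, whereas {2} is a repetition count. A small check with grepl on made-up file names (the names below are assumptions for illustration):

```r
# Hypothetical file names: three January days and one February day
files <- c("PREC_20100102.tif", "PREC_20100115.tif",
           "PREC_20100131.tif", "PREC_20100202.tif")

# "\\d[2]" means: one digit followed by a literal "2"
hits_class <- grepl("PREC_\\d{4}01\\d[2].*", files)

# "\\d{2}" means: exactly two digits
hits_count <- grepl("PREC_\\d{4}01\\d{2}", files)

data.frame(files, hits_class, hits_count)
```

With the {2} form all three January files match and the February file does not, which is the same selection the "Resolved with" pattern produces.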

When using sapply, I get Error in str2lang(x) : <text>:1:31: unexpected symbol

When I run this code, I get an error:
genes<-colnames(survdata)[-c(1:3)]
univ_formulas<-sapply(genes,function(x)as.formula(paste('Surv(OS,status)~',x)))
Error in str2lang(x) : <text>:1:31: unexpected symbol
1: Surv(OS,status)~ ABC7-42389800N19.1
^
If I remove the element and run the code again, a similar error appears again:
univ_formulas<-sapply(genes,function(x)as.formula(paste('Surv(OS,status)~',x)))
Error in str2lang(x) : <text>:1:26: unexpected symbol
1: Surv(OS,status)~ CITF22-1A6.3
^
I don't know what is wrong.
Example of the data:
head(genes,n = 50)
[1] "A1BG" "A1BG-AS1" "A2M"
[4] "A2M-AS1" "A2ML1" "A2MP1"
[7] "A3GALT2" "A4GALT" "AAAS"
[10] "AACS" "AACSP1" "AADAT"
[13] "AAED1" "AAGAB" "AAK1"
[16] "AAMDC" "AAMP" "AANAT"
[19] "AAR2" "AARD" "AARS"
[22] "AARS2" "AARSD1" "AASDH"
[25] "AASDHPPT" "AASS" "AATF"
[28] "AATK" "AATK-AS1" "ABAT"
[31] "ABC7-42389800N19.1" "ABCA1" "ABCA10"
[34] "ABCA11P" "ABCA12" "ABCA13"
[37] "ABCA17P" "ABCA2" "ABCA3"
[40] "ABCA4" "ABCA5" "ABCA6"
[43] "ABCA7" "ABCA8" "ABCA9"
[46] "ABCB1" "ABCB10" "ABCB4"
[49] "ABCB6" "ABCB7"
This is because the gene names contain -, which base::str2lang parses as the subtraction operator rather than as part of a name. We can fix this as follows:
"Clean" the gene names by converting - to _, and document this somewhere.
We then have:
genes <- c("ABC7-42389800N19.1", "AATK-AS1")
sapply(genes, function(x) as.formula(paste('Surv(OS,status)~',
gsub("-", "_", x, fixed = TRUE))))  # gsub also handles names with several hyphens
$`ABC7-42389800N19.1`
Surv(OS, status) ~ ABC7_42389800N19.1
<environment: 0x000002ad508b58e8>
$`AATK-AS1`
Surv(OS, status) ~ AATK_AS1
<environment: 0x000002ad508b3c30>
This is an illustration of why that is the case:
A <- 4; B<- 20
str2lang("A-B")
A - B
eval(str2lang("A-B"))
[1] -16
str2lang is essentially similar to the dreaded eval-parse framework. From the docs, this is what it does:
str2expression(s) and str2lang(s) return special versions of parse(text=s, keep.source=FALSE) and can therefore be regarded as transforming character strings s to expressions, calls, etc.
NOTE
Since this is to be used in modeling, it is probably better to perform the sub at the colnames stage such that the input data to the model has the names we expect:
# not tested but you get the idea
colnames(survdata)[-c(1:3)] <- gsub("-", "_", colnames(survdata)[-c(1:3)], fixed = TRUE)
It is important, for biological/research purposes, to document why gene names were cleaned as suggested in this answer.
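If the original gene names have to be kept, one alternative is to quote each name in backticks so the parser treats it as a single symbol. This is only a sketch: it assumes the columns of survdata still carry the hyphenated names, and it only builds the formulas, so Surv is never evaluated here.

```r
genes <- c("ABC7-42389800N19.1", "AATK-AS1")

# Wrap each name in backticks so the parser reads it as one symbol
univ_formulas <- lapply(genes, function(x) {
  as.formula(paste0("Surv(OS, status) ~ `", x, "`"))
})
names(univ_formulas) <- genes

# The hyphenated name survives as a single variable
vars_quoted <- all.vars(univ_formulas[["AATK-AS1"]])

# Without backticks this name parses without error, but as two variables
vars_unquoted <- all.vars(as.formula("Surv(OS, status) ~ AATK-AS1"))

vars_quoted
vars_unquoted
```

The second result is the reason some names in the question raised no error at all: a name such as A1BG-AS1 is valid R syntax for a subtraction, so the formula is built silently but refers to the wrong variables.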

R H2o object not found H2OKeyNotFoundArgumentException

R Version: R version 3.5.1 (2018-07-02)
H2O cluster version: 3.20.0.2
The dataset used here is available on Kaggle (Home credit risk). Prior to using h2o automl, the necessary treatment of missing values and selection of relevant categorical variables has already been carried out. Can you assist me in figuring out what is the underlying cause for this error?
Thanks
Code:
h2o.init()
h2o.no_progress()
# y_train_processed_tbl is the target variable
# x_train_processed_tbl is the remaining data post dealing with Missing
# values
data_h2o <- as.h2o(bind_cols(y_train_processed_tbl, x_train_processed_tbl))
splits_h2o <- h2o.splitFrame(data_h2o, ratios = c(0.7, 0.15), seed = 1234)
train_h2o <- splits_h2o[[1]]
valid_h2o <- splits_h2o[[2]]
test_h2o <- splits_h2o[[3]]
y <- "TARGET"
x <- setdiff(names(train_h2o), y)
automl_models_h2o <- h2o.automl(x = x,y = y,
training_frame = train_h2o, validation_frame = valid_h2o,
leaderboard_frame = test_h2o,
max_runtime_secs = 90
)
automl_leader <- automl_models_h2o@leader
# Error in performance_h2o
performance_h2o <- h2o.performance(automl_leader, newdata = test_h2o)
ERROR: Unexpected HTTP Status code: 404 Not Found
water.exceptions.H2OKeyNotFoundArgumentException
[1] "water.exceptions.H2OKeyNotFoundArgumentException: Object 'dummy' not
found in function: predict for argument: model"
[2] " water.api.ModelMetricsHandler.score(ModelMetricsHandler.java:235)"
[3] " sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)"
[4] " sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)"
[5] " sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)"
[6] " java.lang.reflect.Method.invoke(Unknown Source)"
[7] " water.api.Handler.handle(Handler.java:63)"
[8] " water.api.RequestServer.serve(RequestServer.java:451)"
[9] " water.api.RequestServer.doGeneric(RequestServer.java:296)"
[10] " water.api.RequestServer.doPost(RequestServer.java:222)"
[11] " javax.servlet.http.HttpServlet.service(HttpServlet.java:755)"
[12] " javax.servlet.http.HttpServlet.service(HttpServlet.java:848)"
[13] " org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)"
[14] " org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)"
[15] " org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)"
[16] " org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)"
[17] " org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)"
[18] " org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)"
[19] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"
[20] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"
[21] " water.JettyHTTPD$LoginHandler.handle(JettyHTTPD.java:197)"
[22] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"
[23] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"
[24] " org.eclipse.jetty.server.Server.handle(Server.java:370)"
[25] " org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)"
[26] " org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)"
[27] " org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)"
[28] " org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)"
[29] " org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)"
[30] " org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)"
[31] " org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)"
[32] " org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)"
[33] " org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)"
[34] " org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)"
[35] " java.lang.Thread.run(Unknown Source)"
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix =
page, :
ERROR MESSAGE:
Object 'dummy' not found in function: predict for argument: model
The issue here is that you only gave AutoML 90 seconds to run, so it did not have time to train even one model. In the next stable release of H2O, the error message will be gone and instead you will simply get a Leaderboard with no rows (we are fixing this so that it's handled more gracefully).
Rather than using max_runtime_secs = 90, you could increase that to something much larger (the default is 3600 secs, or 1 hour). Alternatively you can specify the number of models you want instead by setting max_models = 20, for example.
If you do use max_models, I'd recommend setting max_runtime_secs to something large (e.g. 999999999) so that you don't run out of time. The AutoML process will stop when it reaches the first of max_models or max_runtime_secs.
I posted a similar answer here.
My code was working fine, then I tweaked it and got the same error.
To fix it, instead of using automl_models_h2o@leader to save the leader for predictions/performance, save the leader using h2o.getModel().
Change your automl_leader initialization:
...
# get model name from list
automl_models_h2o@leaderboard
# change MODEL_NAME_HERE to a model name from your leaderboard list.
automl_leader <- h2o.getModel("MODEL_NAME_HERE")
performance_h2o <- h2o.performance(automl_leader, newdata = test_h2o)
...

Read file into R keeping end of lines

Probably a simple question. I have looked at the many options in scan but haven't got what I want yet.
A simple example would be
require(httr)
example <- content(GET("http://www.r-project.org"), as = 'text')
write(example, 'text.txt')
input <- readLines('text.txt')
> example
[1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n<html>\n<head>\n<title>The R Project for Statistical Computing</title>\n<link rel=\"icon\" href=\"favicon.ico\" type=\"image/x-icon\">\n<link rel=\"shortcut icon\" href=\"favicon.ico\" type=\"image/x-icon\">\n<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">\n</head>\n\n<FRAMESET cols=\"1*, 4*\" border=0>\n<FRAMESET rows=\"120, 1*\">\n<FRAME src=\"logo.html\" name=\"logo\" frameborder=0>\n<FRAME src=\"navbar.html\" name=\"contents\" frameborder=0>\n</FRAMESET>\n<FRAME src=\"main.shtml\" name=\"banner\" frameborder=0>\n<noframes>\n<h1>The R Project for Statistical Computing</h1>\n\nYour browser seems not to support frames,\nhere is the contents page of the R Project's\nwebsite.\n</noframes>\n</FRAMESET>\n\n\n\n"
input
[1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"
[2] "<html>"
[3] "<head>"
[4] "<title>The R Project for Statistical Computing</title>"
[5] "<link rel=\"icon\" href=\"favicon.ico\" type=\"image/x-icon\">"
[6] "<link rel=\"shortcut icon\" href=\"favicon.ico\" type=\"image/x-icon\">"
[7] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"
[8] "</head>"
[9] ""
[10] "<FRAMESET cols=\"1*, 4*\" border=0>"
[11] "<FRAMESET rows=\"120, 1*\">"
[12] "<FRAME src=\"logo.html\" name=\"logo\" frameborder=0>"
[13] "<FRAME src=\"navbar.html\" name=\"contents\" frameborder=0>"
[14] "</FRAMESET>"
[15] "<FRAME src=\"main.shtml\" name=\"banner\" frameborder=0>"
[16] "<noframes>"
[17] "<h1>The R Project for Statistical Computing</h1>"
[18] ""
[19] "Your browser seems not to support frames,"
[20] "here is the contents page of the R Project's"
[21] "website."
[22] "</noframes>"
[23] "</FRAMESET>"
[24] ""
[25] ""
[26] ""
[27] ""
The motivation for this is that I want to store various files in PostgreSQL, and I am passing them in the format given by example as opposed to input. Apologies if I haven't explained this very well.
@Hong Ooi gave a nice answer using readChar. I have encoding issues, so I have had to wrap it as
iconv(readChar(file, nchars=file.info(file)["size"], TRUE), from = "latin1", to = "UTF-8")
to stop the database complaining.
If you want all those strings concatenated into a single string:
paste(input, collapse="\n")
Alternatively, if you're reading from a file and want to avoid splitting the input into bits and putting them back together:
f <- readChar(file, nchars=file.info(file)["size"], TRUE)
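The two approaches are not quite equivalent, which a small round trip can show. This is a sketch under stated assumptions: path and original are made-up names, the file is a temporary one, and file.size() is used here as a shorter way to get the size than file.info().

```r
# Hypothetical temporary file with a known content, written byte for byte
path <- tempfile(fileext = ".txt")
original <- "<html>\n<head>\n</head>\n\n"
writeBin(charToRaw(original), path)

# readChar returns the whole file as one string, line endings included
whole <- readChar(path, nchars = file.size(path), useBytes = TRUE)

# readLines drops the line endings; paste restores all but the last one
joined <- paste(readLines(path), collapse = "\n")

identical(whole, original)
identical(paste0(joined, "\n"), original)
```

So if the trailing newline matters for what is stored in the database, readChar is the safer of the two; otherwise paste(input, collapse = "\n") gives the same text up to that final character.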
