I'm using R on the university HPC cluster, and now that they have upgraded to R/3.3.0 I'm looking to install my code and supporting packages to match my local setup. However, I'm having problems using install_github and install_bitbucket to install my code (and another package I use).
These are the commands and errors from the R session:
> library(devtools)
> install_github("config-i1/smooth")
Downloading GitHub repo config-i1/smooth#master
from URL https://api.github.com/repos/config-i1/smooth/zipball/master
Installing smooth
*** caught segfault ***
address 0x4000000079, cause 'memory not mapped'
Traceback:
1: .Call(digest_impl, object, as.integer(algoint), as.integer(length), as.integer(skip), as.integer(raw), as.integer(seed))
2: `_digest`(c(list(repos, type), lapply(`_additional`, function(x) eval(x[[2L]], environment(x)))), algo = "sha512")
3: available_packages(repos, type)
4: package_deps(deps, repos = repos, type = type)
5: dev_package_deps(pkg, repos = repos, dependencies = dependencies, type = type, force_deps = force_deps, quiet = quiet)
6: install_deps(pkg, dependencies = dependencies, upgrade = upgrade_dependencies, threads = threads, force_deps = force_deps, quiet = quiet, ...)
7: install(source, ..., quiet = quiet, metadata = metadata)
8: FUN(X[[i]], ...)
9: vapply(remotes, install_remote, ..., FUN.VALUE = logical(1))
10: install_remotes(remotes, quiet = quiet, ...)
11: install_github("config-i1/smooth")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
I receive a similar error with: install_bitbucket("wellermatt/forecastR")
> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux release 6.7 (Carbon)
locale:
[1] LC_CTYPE=C LC_NUMERIC=C
[3] LC_TIME=en_US.iso88591 LC_COLLATE=C
[5] LC_MONETARY=en_US.iso88591 LC_MESSAGES=en_US.iso88591
[7] LC_PAPER=en_US.iso88591 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.iso88591 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] devtools_1.11.1
loaded via a namespace (and not attached):
[1] withr_1.0.1 memoise_1.0.0 digest_0.6.9
I have tried upgrading RCurl and devtools, but neither has helped. Would anybody be able to suggest a potential solution? I'm not able to find one on the web at the moment. I guess I could download the whole repository from GitHub and install from source that way, but this doesn't fit my workflow, which currently pulls the latest version automatically.
---- EDIT -----
The cluster manager at Lancaster has tried the install also and has looked into the traceback. He reports as follows:
Let me know how that goes. I can replicate the crash myself (which may in itself be odd, as I expected your codebase to require authentication), but I’m none the wiser. Here are the last few lines of strace (which logs system calls only) leading up to the segfault:
read(3, "Package: forecastR\nType: Package"..., 16384) = 492
read(3, "", 12288) = 0
lseek(3, 0, SEEK_CUR) = 492
read(3, "", 4096) = 0
close(3) = 0
munmap(0x7f276cdfe000, 4096) = 0
stat("/tmp/RtmpjNlvQb/devtools2d0a1128ed80/wellermatt-forecastr-d5b58631b3a1/src", 0x7fff1a81bdb0) = -1 ENOENT (No such file or directory)
write(2, "Installing forecastR\n", 21) = 21
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
The failed stat() call may or may not be an issue – were you expecting your package to have src dir? If not, this may simply be an expected test.
I am trying to run an LDA topic analysis in RStudio (R 3.3.0). I am at the following step but keep getting this error:
Error in gzfile(file, "wb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "wb") :
cannot open compressed file 'results/Gibbs_5_1.rda', probable reason 'No such file or directory'
There is a problem while saving.
D <- nrow(data)
folding <- sample(rep(seq_len(10), ceiling(D))[seq_len(D)])
for (k in topics)
{
  for (chain in seq_len(10))
  {
    FILE <- paste("Gibbs_", k, "_", chain, ".rda", sep = "")
    training <- LDA(data[folding != chain,], k = k,
                    control = list(seed = SEED,
                                   burnin = BURNIN, thin = THIN, iter = ITER, best = BEST),
                    method = "Gibbs")
    best_training <- training@fitted[[which.max(logLik(training))]]
    testing <- LDA(data[folding == chain,], model = best_training,
                   control = list(estimate.beta = FALSE, seed = SEED,
                                  burnin = BURNIN,
                                  thin = THIN, iter = ITER, best = BEST))
    save(training, testing, file = file.path("results", FILE))
  }
}
There is enough workspace on my computer, and I tried restarting R several times. And yes, I looked at the other questions, but none of the solutions seem to work.
> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] topicmodels_0.2-4 wordcloud_2.5 RColorBrewer_1.1-2 slam_0.1-35 SnowballC_0.5.1
[6] tm_0.6-2 NLP_0.1-9
loaded via a namespace (and not attached):
[1] modeltools_0.2-21 parallel_3.3.0 tools_3.3.0 Rcpp_0.12.5 stats4_3.3.0
I am a beginner in R and I am following a book to conduct the analysis for my master's thesis.
Thanks!
The error message says it can't save the file. What is it trying to save? Looking at the code, it's trying to save into a folder called "results". Does this folder exist? If it doesn't, that is exactly the error I get when I try to save something to a non-existent folder:
> save(iris, file=file.path("results","foo.rda"))
Error in gzfile(file, "wb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "wb") :
cannot open compressed file 'results/foo.rda', probable reason 'No such file or directory'
If I create the folder then it works:
> dir.create("results")
> save(iris, file=file.path("results","foo.rda"))
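To avoid depending on the folder already being there, the check can go into the script itself. This is a minimal sketch, assuming a hypothetical output location under tempdir() (the names out_dir and out_file are illustrative and not part of the original code); in the loop above the same check would run once before the first save() call.

```r
# Hypothetical output folder; the question's code uses "results" relative
# to the current working directory instead.
out_dir <- file.path(tempdir(), "results")

# Create the folder only if it is missing. recursive = TRUE also creates
# any missing parent directories.
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

# save() can now open the file, because its parent directory exists.
out_file <- file.path(out_dir, "foo.rda")
save(iris, file = out_file)
```

It is also worth checking getwd(), since a relative path such as "results" is resolved against the current working directory, which may not be the folder you expect.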
I've been getting WFA to run on the full set of intraday GBPUSD 30min data, and have come across a couple of things that need addressing. The first is I believe the save function needs changing to remove the time from the string (as shown here as a pull request on the R-Finance/quantstrat repo on github). The walk.forward function throws this error:
Error in gzfile(file, "wb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "wb") :
cannot open compressed file 'wfa.GBPUSD.2002-10-21 00:30:00.2002-10-23 23:30:00.RData', probable reason 'Invalid argument'
The second is a rare scenario where it ends up calling runSum on a data set with fewer rows than the period you are testing (n). This is the traceback():
8: stop("Invalid 'n'")
7: runSum(x, n)
6: runMean(x, n)
5: (function (x, n = 10, ...)
{
ma <- runMean(x, n)
if (!is.null(dim(ma))) {
colnames(ma) <- "SMA"
}
return(ma)
})(x = Cl(mktdata)[, 1], n = 25)
4: do.call(indFun, .formals)
3: applyIndicators(strategy = strategy, mktdata = mktdata, parameters = parameters,
...)
2: applyStrategy(strategy, portfolios = portfolio.st, mktdata = symbol[testing.timespan]) at custom.walk.forward.R#122
1: walk.forward(strategy.st, paramset.label = "WFA", portfolio.st = portfolio.st,
account.st = account.st, period = "days", k.training = 3,
k.testing = 1, obj.func = my.obj.func, obj.args = list(x = quote(result$apply.paramset)),
audit.prefix = "wfa", anchored = FALSE, verbose = TRUE)
The extended GBPUSD data used in the creation of the Luxor Demo includes an erroneous date (2002/10/27) with only 1 observation which causes this problem. I can also foresee this being an issue when testing longer signal periods on instruments like Crude where they have only a few trading hours on Sunday evenings (UTC).
Given that I have purely been following the Luxor demo with the same (extended) intra-day data set, are these genuine issues or have they been caused by package updates etc?
What is the preferred way for these things to be reported to the authors of QS, and find out if/when fixes are likely to be made?
SessionInfo():
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252 LC_MONETARY=English_Australia.1252 LC_NUMERIC=C LC_TIME=English_Australia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] quantstrat_0.9.1739 foreach_1.4.3 blotter_0.9.1741 PerformanceAnalytics_1.4.4000 FinancialInstrument_1.2.0 quantmod_0.4-5 TTR_0.23-1
[8] xts_0.9.874 zoo_1.7-13
loaded via a namespace (and not attached):
[1] compiler_3.3.0 tools_3.3.0 codetools_0.2-14 grid_3.3.0 iterators_1.0.8 lattice_0.20-33
quantstrat is on github here:
https://github.com/braverock/quantstrat
Issues and patches should be reported via github issues.
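On the first problem: the file name in the error contains colons and spaces from the timestamps, and Windows does not allow colons in file names. The following is only a sketch of one way to build a safe name in base R; it is not the change made in the pull request mentioned above, and stamp() and file_name are hypothetical names used for illustration.

```r
# Timestamps taken from the file name in the error message.
start <- as.POSIXct("2002-10-21 00:30:00", tz = "UTC")
end   <- as.POSIXct("2002-10-23 23:30:00", tz = "UTC")

# Hypothetical helper: format a timestamp without ':' or spaces so that
# the result is a valid file name component on Windows.
stamp <- function(t) format(t, "%Y%m%d-%H%M%S")

file_name <- paste("wfa", "GBPUSD", stamp(start), stamp(end), "RData", sep = ".")
print(file_name)

# Saving under this name no longer fails with 'Invalid argument'.
save(start, end, file = file.path(tempdir(), file_name))
```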
I am trying to make a file connection within a cluster (using parallel).
While it works correctly in the global environment, it gives me an error message when used within the members of the cluster (See the script below).
Did I miss something?
Any suggestions?
Thanks,
# This part works
#----------------
cat("This is a test file" , file={f <- tempfile()})
con <- file(f, "rt")
# Doing what I think is the same thing gives an error message when executed in parallel
#--------------------------------------------------------------------------------------
library(parallel)
cl <- makeCluster(2)
## Exporting the object f into the cluster
clusterExport(cl, "f")
clusterEvalQ(cl[1], con <- file(f[[1]], "rt"))
#Error in checkForRemoteErrors(lapply(cl, recvResult)) :
# one node produced an error: cannot open the connection
## Creating the object f into the cluster
clusterEvalQ(cl[1],cat("This is a test file" , file={f <- tempfile()}))
clusterEvalQ(cl[1],con <- file(f, "rt"))
#Error in checkForRemoteErrors(lapply(cl, recvResult)) :
# one node produced an error: cannot open the connection
############ Here is my sessionInfo() ###################
# R version 3.3.0 (2016-05-03)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 7 x64 (build 7601) Service Pack 1
#
# locale:
# [1] LC_COLLATE=French_Canada.1252 LC_CTYPE=French_Canada.1252
# [3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C
# [5] LC_TIME=French_Canada.1252
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
Try changing the code to return a NULL rather than the created connection object:
clusterEvalQ(cl[1], {con <- file(f[[1]], "rt"); NULL})
Connection objects can't be safely sent between the master and workers, but this method avoids that.
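A minimal sketch of the same idea, assuming a single local worker: the file is created, opened, read and closed entirely on the worker, and only plain data (a character vector) is returned to the master. The variable names are illustrative.

```r
library(parallel)

cl <- makeCluster(1)

# Everything involving the connection happens on the worker. The last
# expression in the block is an ordinary character vector, which can be
# serialized back to the master without problems.
result <- clusterEvalQ(cl, {
  f <- tempfile()
  cat("This is a test file\n", file = f)
  con <- file(f, "rt")
  txt <- readLines(con)
  close(con)
  txt
})

stopCluster(cl)

# clusterEvalQ returns a list with one element per worker.
print(result[[1]])
```

If the connection has to stay open between calls, it can be left in the worker's global environment, as long as each call ends with NULL or some other plain value.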
Consider the following usage:
tryCatch(log("a"), error = function(e) NULL)
#NULL
Now I'm trying to do essentially the same, but in a more complicated fashion. I have two network repositories, and I'd like to install packages from the second if the first is not available for some reason. Here's how I do it:
pkg_location <- c("file://main_repo", "file://extra_repo")
lapply(pkg_location, function(repo)
{
tryCatch(install.packages("my-cool-package",
contriburl = repo, dependencies = TRUE),
error = function(e) NULL)
})
And I'm expecting a list of NULLs. However, the error is not suppressed:
Installing package into ‘...’
(as ‘lib’ is unspecified)
Warning in install.packages :
cannot open compressed file '//extra_repo/PACKAGES',
probable reason 'No such file or directory'
Error in install.packages : cannot open the connection
[[1]]
NULL
[[2]]
NULL
It seems like install.packages somehow ignores the mechanism. How is that possible, why is that happening and how can I approach the problem?
Here's sessionInfo, probably worth noting I'm running RStudio 0.98.977.
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.1.2
RStudio does not execute the normal install.packages but instead does its own thing.
Look at the code in RStudio:
> install.packages
function (...)
.rs.callAs(name, hook, original, ...)
<environment: 0x3e4b478>
> .rs.callAs
function (name, f, ...)
{
withCallingHandlers(tryCatch(f(...), error = function(e) {
cat("Error in ", name, " : ", e$message, "\n", sep = "")
}), warning = function(w) {
cat("Warning in ", name, " :\n ", w$message, "\n", sep = "")
invokeRestart("muffleWarning")
})
}
<environment: 0x3bafa38>
Weird code; it calls itself ...
I was expecting a .Primitive() somewhere, as in:
> sum
function (..., na.rm = FALSE) .Primitive("sum")
But it is an ugly RStudio hack. If you look at install.packages in normal R, you get:
head(install.packages) # it is really long :P
1 function (pkgs, lib, repos = getOption("repos"), contriburl = contrib.url(repos,
2 type), method, available = NULL, destdir = NULL, dependencies = NA,
3 type = getOption("pkgType"), configure.args = getOption("configure.args"),
4 configure.vars = getOption("configure.vars"), clean = FALSE,
5 Ncpus = getOption("Ncpus", 1L), verbose = getOption("verbose"),
6 libs_only = FALSE, INSTALL_opts, quiet = FALSE, keep_outputs = FALSE,
....
I'm going to suggest closing as off-topic because this is an RStudio problem. Basically, tryCatch is catching the error, but RStudio's error handler prints the error anyway. Thus the reason you're getting a return value:
[[1]]
NULL
[[2]]
NULL
This means tryCatch works. RStudio just prints caught errors weirdly.
Use the namespaced invocation:
utils::install.packages()
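For the original goal of falling back to a second repository, a loop that stops at the first success may be easier to reason about than lapply. The sketch below is hedged: try_in_order() and fake_install() are hypothetical helpers, and fake_install() only simulates a repository that cannot be opened so that the example runs without a network. Warnings are treated as failures too, on the assumption that an unavailable package may be reported as a warning rather than an error.

```r
# Hypothetical helper: call action() on each candidate in turn and return
# the first candidate for which it completes without error or warning.
try_in_order <- function(candidates, action) {
  for (candidate in candidates) {
    ok <- tryCatch({
      action(candidate)
      TRUE
    },
    warning = function(w) FALSE,
    error = function(e) FALSE)
    if (ok) return(candidate)
  }
  NULL  # every candidate failed
}

# Stand-in for the real installation step: only the second repo "works".
fake_install <- function(repo) {
  if (repo != "file://extra_repo") stop("cannot open the connection")
  invisible(NULL)
}

pkg_location <- c("file://main_repo", "file://extra_repo")
used <- try_in_order(pkg_location, fake_install)
print(used)
```

In real use, the action would be something like function(repo) utils::install.packages("my-cool-package", contriburl = repo, dependencies = TRUE), using the namespaced call so that RStudio's wrapper is bypassed.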
I am getting a caught segfault error every time I try to run any plotting functions from the ggplot2 package (1.0.0). I have tried this with qplot, geom_dotplot, geom_histogram, etc. Data from the package (e.g. diamonds or economics) work just fine.
I am operating on Mac OS 10.9.4 (the latest version) and on R 3.1.1 (also the latest version). I get the same error with the standard R GUI, RStudio, and when using R from the command line. The command brings up the default graphic device (Quartz for R GUI and command line), but also the terminal error.
library(ggplot2)
qplot(1:10)
gives me the error:
*** caught segfault ***
address 0x18, cause 'memory not mapped'
Traceback:
1: .Call("plyr_split_indices", PACKAGE = "plyr", group, n)
2: split_indices(scale_id, n)
3: scale_apply(layer_data, x_vars, scale_train, SCALE_X, panel$x_scales)
4: train_position(panel, data, scale_x(), scale_y())
5: ggplot_build(x)
6: print.ggplot(list(data = list(), layers = list(<environment>), scales = <S4 object of class "Scales">, mapping = list(x = 1:3), theme = list(), coordinates = list(limits = list(x = NULL, y = NULL)), facet = list(shrink = TRUE), plot_env = <environment>, labels = list(x = "1:3", y = "count")))
7: print(list(data = list(), layers = list(<environment>), scales = <S4 object of class "Scales">, mapping = list(x = 1:3), theme = list(), coordinates = list( limits = list(x = NULL, y = NULL)), facet = list(shrink = TRUE), plot_env = <environment>, labels = list(x = "1:3", y = "count")))
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Here is my session info:
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] graphics grDevices utils datasets stats methods base
other attached packages:
[1] ggplot2_1.0.0 marelac_2.1.3 seacarb_3.0 shape_1.4.1 beepr_1.1 birk_1.1
loaded via a namespace (and not attached):
[1] audio_0.1-5 colorspace_1.2-4 digest_0.6.4 grid_3.1.1 gtable_0.1.2
[6] MASS_7.3-34 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 Rcpp_0.11.2
[11] reshape2_1.4 scales_0.2.4 stringr_0.6.2 tools_3.1.1
I've gathered from others that this is a memory issue of some sort, but this error occurs even when I have over 2 GB of free RAM. I know this is a widely used package, so of course this doesn't happen for everyone, but why is it happening for me? Does anyone know what I can do to fix this problem?
In case anyone else has this problem or similar in the future, I sent a bug report to the package maintainer and he recommended uninstalling all installed packages and starting over. I took his advice and it worked!
I followed advice from this posting: http://r.789695.n4.nabble.com/Reset-R-s-library-to-base-packages-only-remove-all-installed-contributed-packages-td3596151.html
ip <- installed.packages()
pkgs.to.remove <- ip[!(ip[,"Priority"] %in% c("base", "recommended")), 1]
sapply(pkgs.to.remove, remove.packages)
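Before removing everything, it may help to record which contributed packages were installed so that they can be reinstalled afterwards. This is a sketch under the assumption that packages with no "Priority" entry are the contributed ones; list_file is a hypothetical location, and nothing is removed or installed by this code.

```r
ip <- installed.packages()

# Contributed packages have NA in the "Priority" column; base and
# recommended packages do not.
contributed <- unname(ip[is.na(ip[, "Priority"]), "Package"])

# Write the names to a file so the list survives the clean-up.
list_file <- file.path(tempdir(), "contributed-packages.txt")
writeLines(contributed, list_file)

# After removal, the packages could be reinstalled with:
# install.packages(readLines(list_file))
```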
This is not an answer to this question but it might be helpful for someone. (Inspired by user1310503. Thanks!)
I am working on a data.frame df with three cols: col1, col2, col3.
Initially,
df =data.frame(col1=character(),col2=numeric(),col3=numeric(),stringsAsFactors = F)
During processing, rbind is called many times, like this:
aList<-list(col1="aaa", col2 = "123", col3 = "234")
dfNew <- as.data.frame(aList)
df <- rbind(df, dfNew)
Finally, df is written to a file via data.table::fwrite:
data.table::fwrite(x = df, file = fileDF, append = FALSE, row.names = F, quote = F, showProgress = T)
df has 5973 rows and 3 cols. The "caught segfault" always occurs:
address 0x1, cause 'memory not mapped'.
The solution to this problem is:
aList<-list(col1=as.character("aaa"), col2 = as.numeric("123"), col3 = as.numeric("234"))
dfNew <- as.data.frame(aList)
dfNew$col1 <- as.character(dfNew$col1)
dfNew$col2 <- as.numeric(dfNew$col2)
dfNew$col3 <- as.numeric(dfNew$col3)
df <- rbind(df, dfNew)
With this change the problem went away. A possible reason is that the column classes of the new rows differed from those of df.
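A cheap way to catch this kind of mismatch early is to compare the column classes before each rbind. The sketch below assumes the three-column layout described above; new_row is an illustrative name, and the check does not explain the segfault itself, it only guards against mixed column types.

```r
df <- data.frame(col1 = character(), col2 = numeric(), col3 = numeric(),
                 stringsAsFactors = FALSE)

# Build the new row with the intended types from the start instead of
# converting the columns afterwards.
new_row <- data.frame(col1 = "aaa",
                      col2 = as.numeric("123"),
                      col3 = as.numeric("234"),
                      stringsAsFactors = FALSE)

# Stop immediately if any column class differs between the two frames.
stopifnot(identical(sapply(df, class), sapply(new_row, class)))

df <- rbind(df, new_row)
str(df)
```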
This is not an answer to this question but it might be useful for someone. I had segfaults when I did pdf to create a PDF graphics device and then used plot. This happened with R 2.15.3, 3.2.4, and one or two other versions, running on Scientific Linux release 6.7. I tried many different things, but the only ways I could get it to work were (a) using png or tiff instead of pdf, or (b) saving large .RData files and then using a completely separate R program to create the graphics.