Why does running the tutorial example fail with RStan?

I installed RStan successfully, and the library loads. I am trying to run the tutorial example (https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started#example-1-eight-schools). After applying the stan function
fit1 <- stan(
file = "schools.stan", # Stan program
data = schools_data, # named list of data
chains = 4, # number of Markov chains
warmup = 1000, # number of warmup iterations per chain
iter = 2000, # total number of iterations per chain
cores = 1, # number of cores (could use one per chain)
refresh = 0 # no progress shown
)
I get the following error:
Error in compileCode(f, code, language = language, verbose = verbose) :
C:\rtools42\x86_64-w64-mingw32.static.posix\bin/ld.exe: file1c34165764a4.o:file1c34165764a4.cpp:(.text$_ZN3tbb8internal26task_scheduler_observer_v3D0Ev[_ZN3tbb8internal26task_scheduler_observer_v3D0Ev]+0x1d): undefined reference to `tbb::internal::task_scheduler_observer_v3::observe(bool)'C:\rtools42\x86_64-w64-mingw32.static.posix\bin/ld.exe: file1c34165764a4.o:file1c34165764a4.cpp:(.text$_ZN3tbb10interface623task_scheduler_observerD1Ev[_ZN3tbb10interface623task_scheduler_observerD1Ev]+0x1d): undefined reference to `tbb::internal::task_scheduler_observer_v3::observe(bool)'C:\rtools42\x86_64-w64-mingw32.static.posix\bin/ld.exe: file1c34165764a4.o:file1c34165764a4.cpp:(.text$_ZN3tbb10interface623task_scheduler_observerD1Ev[_ZN3tbb10interface623task_scheduler_observerD1Ev]+0x3a): undefined reference to `tbb::internal::task_scheduler_observer_v3::observe(bool)'C:\rtools42\x86_64-w64-mingw32.static.posix\bin/ld.exe: file1c34165764a4.o:file1c34165764a4.cpp:(.text$_ZN3tbb10interface
Error in sink(type = "output") : invalid connection
Simply running example(stan_model, run.dontrun=T) gives the same error.
What does this error mean?
Is Rtools wrongly installed? My PATH seems to contain the correct folder C:\\rtools42\\x86_64-w64-mingw32.static.posix\\bin;. Is something wrong with the version of my RStan package? I am struggling to interpret this error.
What can I try to solve it?

Apparently Rtools42 is incompatible with the current version of RStan on CRAN. The solution that worked for me is described here: https://github.com/stan-dev/rstan/wiki/Configuring-C---Toolchain-for-Windows#r-42
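For reference, the fix on that wiki page amounts to replacing the CRAN builds with ones from Stan's own repository. A minimal sketch (the repository URL is taken from the linked instructions and may change):

```r
# Remove the CRAN builds of rstan/StanHeaders that fail to link
# against the TBB shipped with Rtools42.
remove.packages(c("rstan", "StanHeaders"))

# Reinstall both packages from the Stan R-universe repository,
# which provides builds compatible with R 4.2 / Rtools42.
install.packages("rstan",
                 repos = c("https://mc-stan.org/r-packages/",
                           getOption("repos")))
```

After restarting R, example(stan_model, run.dontrun = T) should compile cleanly.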

Related

NLP textEmbed function

I am trying to run the textEmbed function in R.
Set up needed:
require(quanteda)
require(quanteda.textstats)
require(udpipe)
require(reticulate)
#udpipe_download_model(language = "english")
ud_eng <- udpipe_load_model(here::here('english-ewt-ud-2.5-191206.udpipe'))
virtualenv_list()
reticulate::import('torch')
reticulate::import('numpy')
reticulate::import('transformers')
reticulate::import('nltk')
reticulate::import('tokenizers')
require(text)
The following code runs fine:
tmp1 <- textEmbed(x = 'sofa help',
model = 'roberta-base',
layers = 11)
tmp1$x
However, the following code does not run:
tmp1 <- textEmbed(x = 'sofa help',
model = 'roberta-base',
layers = 11)
tmp1$x
It gives me the following error
Error in x[[1]] : subscript out of bounds
In addition: Warning message:
Unknown or uninitialised column: `words`.
Any suggestions would be highly appreciated.
I believe that this error has been fixed in a newer version of the text package (version 0.9.50 and above).
(I cannot see any difference between the two code parts, but I think this error is related to submitting only one token/word to textEmbed, which now works.)
Also, see updated instructions for how to install the text-package http://r-text.org/articles/Extended_Installation_Guide.html
library(text)
library(reticulate)
# Install text required python packages in a conda environment (with defaults).
text::textrpp_install()
# Show available conda environments.
reticulate::conda_list()
# Initialize the installed conda environment.
# save_profile = TRUE saves the settings so that you don't have to run textrpp_initialize() after restarting R.
text::textrpp_initialize(save_profile = TRUE)
# Test so that the text package work.
textEmbed("hello")
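If you are unsure which version you have, a quick check (a minimal sketch; 0.9.50 is the version mentioned above):

```r
# Update the text package from CRAN if the installed
# version predates the fix, then print the version.
if (packageVersion("text") < "0.9.50") {
  install.packages("text")
}
packageVersion("text")
```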

Goodness of fit test 'ncores' argument produces error if included or if left to default

I (a novice) am trying to run a MacKenzie-Bailey goodness-of-fit test on occupancy models using the package AICcmodavg. No matter what I do, I get an error regarding the argument ncores, which specifies how many of my computational cores to use when running the bootstraps. The package documentation says that if left blank it defaults to one less than the available cores. My computer has 4 cores. If I specify any number of cores (I've tried 0-4), I get an error that the argument is unused. If I do not specify ncores, I get an error that it is missing. Code follows; any suggestions appreciated :)
mb.gof.test(CamMod1, nsim = 1000, ncores = 3)
Error in statistic(object, ...) : unused argument (ncores = ncores)
mb.gof.test(CamMod1, nsim = 1000)
Error in parboot(mod, statistic = function(i) mb.chisq(i)$chi.square, : argument "ncores" is missing, with no default
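The error at the parboot() call suggests mb.gof.test() is handing ncores through to unmarked::parboot(), so one possible cause (an assumption, not a confirmed diagnosis) is a version mismatch in which the installed unmarked predates parboot() gaining an ncores argument. A first thing to try:

```r
# Update both packages so that mb.gof.test() and parboot()
# agree on their argument lists, then check the versions.
install.packages(c("unmarked", "AICcmodavg"))
packageVersion("unmarked")
packageVersion("AICcmodavg")
```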

JAGS not recognizing values stored in R's global environment

I'm running JAGS with R v3.6.1 and RStudio v1.2 on Windows 10, using the package R2jags. JAGS does not appear to find the stored values that I've created in R for MCMC settings such as n.iter, n.burnin, etc., because my code:
out2 <- jags.parallel (data = data, inits = inits, parameters.to.save = parameters,
model.file = "C:\\Users\\jj1995\\Desktop\\VIP\\Thanakorn\\DLM\\district model.txt", n.chains = n.chain,
n.thin = n.thin, n.iter = n.iter, n.burnin = n.burn)
Produces the error
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
3 nodes produced errors; first error: object 'n.burn' not found
If I replace a stored value's name with a number (n.burnin = 10000), it returns the same error, but for a different MCMC setting. I assume JAGS will likewise be unable to find the lists and data frames I've stored in the global environment. I thought my system's firewall was preventing JAGS from accessing the global environment, so I disabled my antivirus measures and ran both R and RStudio as an administrator. This did not solve the problem.
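This is likely not a firewall issue: jags.parallel() runs each chain in a separate worker process, and those workers do not automatically inherit objects from your session. R2jags provides an export_obj_names argument for exactly this. A sketch using the names from the question (assuming those objects exist in the global environment):

```r
library(R2jags)

# Export the MCMC settings to the worker processes so that
# "object 'n.burn' not found" errors do not occur there.
out2 <- jags.parallel(data = data, inits = inits,
                      parameters.to.save = parameters,
                      model.file = "C:\\Users\\jj1995\\Desktop\\VIP\\Thanakorn\\DLM\\district model.txt",
                      n.chains = n.chain, n.thin = n.thin,
                      n.iter = n.iter, n.burnin = n.burn,
                      export_obj_names = c("n.chain", "n.thin",
                                           "n.iter", "n.burn"))
```

Any lists or data frames referenced only inside the model call can be exported the same way.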

BoundsError in Julia MXNet when using small batch size

I'm trying to reproduce some Python MXNet code in Julia 0.6.0, and I'm getting a BoundsError if I try to use a batch size that is smaller than the dimension of the output. If I use a larger batch size in a toy example, things work properly and the network converges to the correct solution, but in my application the output dimension is large so this isn't practical.
Here's a linear regression example that gives this error:
using MXNet
net = mx.Variable(:data)
net = mx.FullyConnected(net, name=:fc0, num_hidden=5)
net = mx.LinearRegressionOutput(net, name=:output)
mod = mx.FeedForward(net, context=mx.cpu(0))
batch_size = 4 # works for batch_size > 4
A = randn(5,100)
train_in = randn(100,1000)
train_out = A*train_in + .1*randn(5,1000)
train_provider = mx.ArrayDataProvider(:data=>train_in,
:output_label=>train_out,
shuffle=true,
batch_size=batch_size)
optimizer = mx.SGD(lr=0.001, momentum=0.9, weight_decay=0.00001)
mx.fit(mod, optimizer, train_provider)
This produces
INFO: Start training on MXNet.mx.Context[CPU0]
INFO: Initializing parameters...
INFO: Creating KVStore...
INFO: TempSpace: Total 0 MB allocated on CPU0
INFO: Start training...
ERROR: LoadError: BoundsError: attempt to access 5×4 Array{Float32,2} at index [Base.Slice(Base.OneTo(5)), 5]
If I increase the batch size to 5 or greater, it works as expected. What am I missing?
You can track the resolution of this bug here:
https://github.com/dmlc/MXNet.jl/issues/264
I tested it two weeks ago and unfortunately it is still happening.

Error occurring in caret when running on a cluster

I am running the train function from caret on a cluster via doRedis. For the most part it works, but every so often I get errors of this nature at the very end:
error calling combine function:
<simpleError: obj$state$numResults <= obj$state$numValues is not TRUE>
and
Error in names(resamples) <- gsub("^\\.", "", names(resamples)) :
attempt to set an attribute on NULL
when I run traceback() I get:
5: nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method,
ppOpts = preProcess, ctrl = trControl, lev = classLevels,
...)
4: train.default(x, y, weights = w, ...)
3: train(x, y, weights = w, ...)
2: train.formula(couple ~ ., training.balanced, method = "nnet",
preProcess = "range", tuneGrid = nnetGrid, MaxNWts = 2200)
1: caret::train(couple ~ ., training.balanced, method = "nnet",
preProcess = "range", tuneGrid = nnetGrid, MaxNWts = 2200)
These errors are not easily reproducible (i.e. they happen sometimes, but not consistently) and only occur at the end of the run. The stdout on the cluster shows all tasks running and completed, so I am a bit flummoxed.
Has anyone encountered these errors? If so, do you understand the cause and, even better, have a fix?
I imagine you've already solved this problem, but I ran into the same issue on my cluster, which consists of Linux and Windows systems. I was running the server on Ubuntu 14.04 and had noticed warnings, when starting the server service, about having 'transparent huge pages' enabled in the Linux kernel. I ignored that message and began running training exercises where most of the machines were maxed out with workers. I received the same error at the end of the run:
error calling combine function:
<simpleError: obj$state$numResults <= obj$state$numValues is not TRUE>
After a lot of head scratching and useless tinkering, I decided to address that warning by following the instructions here: http://ubuntuforums.org/showthread.php?t=2255151
Essentially, I installed hugeadm using:
sudo apt-get install hugeadm
Then disabled the transparent pages using:
hugeadm --thp-never
Note that this change will be undone on restart of the computer.
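You can verify the current state at any time (the kernel marks the active setting with brackets), and re-apply the change after a reboot; making it permanent, e.g. via a boot script, is one option suggested in threads like the one linked above:

```shell
# Show whether transparent huge pages are currently enabled;
# the active setting appears in [brackets].
cat /sys/kernel/mm/transparent_hugepage/enabled

# Re-disable them after a reboot (same command as before).
hugeadm --thp-never
```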
When I re-ran my training process it ran without any errors.
Hope that helps.
Cheers,
Eric