Julia DifferentialEquations package SLOW and HEAVY?

I am very new to Julia, coming from a Python background, and I am just testing the DifferentialEquations package.
I run a simple .jl script from the command line, and the problem is that it takes about a minute to run code that benchmarks show needs only a few milliseconds to execute, and it also takes about 1 GB of RAM. Am I doing something wrong, or is this normal in Julia?
This is the simple script I got from the Tutorial:
import DifferentialEquations
import Plots
pl = Plots
df = DifferentialEquations
f(u,p,t) = 0.98u
u0 = 1.0
tspan = (0.0, 1.0)
prob = df.ODEProblem(f, u0, tspan)
sol = df.solve(prob)
I am using Ubuntu 18.04 and Julia 1.4.

It sounds like what you're seeing is mainly compilation time: Julia compiles native code for any method upon its first invocation, so yes, it is normal to see longer runtimes and higher memory usage on the first run. The times reported in benchmarks are usually obtained with the BenchmarkTools package, which runs a function multiple times to give a more accurate picture of its actual runtime, discarding the compilation time (similar to Python's %timeit functionality).
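As a rough illustration (a minimal sketch assuming BenchmarkTools is installed; exact timings depend on your machine and package versions), you can see the effect by timing the same call twice in one session and then benchmarking it:
import DifferentialEquations
using BenchmarkTools
df = DifferentialEquations
f(u, p, t) = 0.98u
prob = df.ODEProblem(f, 1.0, (0.0, 1.0))
@time df.solve(prob)    # first call: includes compilation, takes seconds
@time df.solve(prob)    # second call: already compiled, takes milliseconds
@btime df.solve($prob)  # runs the call many times and excludes the one-off compilation cost
Note that running the script as a fresh julia script.jl invocation starts a new session, so it pays the compilation cost again each time; keeping a session open (e.g. in the REPL) avoids that.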

Related

Histogram in Julia: Distributed.ProcessExitedException

I'm running Julia 1.7.2 in a Pluto notebook on an M1 chip.
I have a vector of integers between 1 and 9, and would like to make a histogram showing which integers are most frequent.
I thought I had done this successfully by using Plots and then calling histogram on the appropriate vector. At one point I had even generated a plot that I liked by doing this.
I then tried to replicate this procedure (i.e., using Plots and then calling histogram) in a different Pluto notebook I had written.
I don't know if it was conflicting with other packages, but I began to get Distributed.ProcessExitedException errors in this other notebook when I ran histogram.
In another failure mode, I created a cell that contained only the following code
begin
using Plots
histogram(vector)
end
When I run the notebook with this code, all the other cells evaluate, but then this last cell lags forever and does not evaluate.
Frustrated, I went back to the first notebook where I had gotten the plotting to work, but now I get Distributed.ProcessExitedException error there too!
I am just posting to see if anyone has any ideas as to what might be going on.
In particular,
is there a link between Plots and Distributed?
is there anything that would cause an error in one notebook to cause a different notebook that had previously been working to fail?
The packages I was using in the first notebook (the one that worked, and then didn't) were
begin
using DataFrames
using BSON
using Revise
using FileIO
using CSV, HDF5, NRRD
# Statistics
using StatsBase: mean, nquantile, percentile
using LinearAlgebra: norm, mul!
using HypothesisTests
using Plots
using Random
# Tensors
using Tullio: @tullio
# Algorithms
import Flux
import Flux: update!
using Flux: RMSProp
end
The packages I was using in the second notebook that never worked were
begin
using DataFrames
using FileIO
using CSV, HDF5, NRRD
# Statistics
using StatsBase: mean, nquantile, percentile
using LinearAlgebra: norm, mul!
using HypothesisTests
# Tensors
using Tullio: @tullio
# Algorithms
import Flux
import Flux: update!
using Flux: RMSProp
end

I want to print the current simulation number while running my code in R

I am using mclapply(1:nsim, f, mc.cores = 4) in RStudio for parallel computing on Linux. The function f is a function I defined myself. The number nsim = 500 is the number of simulations, so this line takes a long time to run, since my function involves modeling 100,000 data points. Is there any way to see the current simulation number on my screen/console (which is shown when I run it on Windows with mc.cores = 1)?
If there is any solution, please let me know; it will help me check whether the program is still running or the system has crashed.
Thanks in advance

Restricting loess' multicore usage in R

I'm trying to fit roughly 70,000 values as a function of two variables using the loess() function, several times. I want to use this fit to de-trend the data. My problem is that once I start the loess function, the R session takes up all available cores on the system, and that would be inconsiderate towards other users on the same computing cluster.
The relevant code would be analogous to the following:
# Approximation of the data
df <- data.frame(y = rpois(70000, rnorm(70000, 10, 2)),  # y is count data
                 x = 50000 - rpois(70000, 100),
                 z = runif(70000))
# The problematic operation
fit <- loess(y ~ x + z, data = df)
When I run this example on my local machine, it only takes up 1 core, but on the cluster it takes as many cores as it can get (up to 48). Ideally, I would like loess() to run on only 1 core.
I've tried to trace any multicore parameters in the code of loess, but I couldn't find any. I know that loess calls stats:::simpleLoess, which in turn calls C code, which in turn calls Fortran code. I have no experience with C or Fortran, and I haven't been able to figure out how to restrict the CPU usage of this function.
Does anyone have any suggestions on how I can limit the CPU usage of the loess function?
I am not knowledgeable enough to comment on the specifics of how all of this works, but I know that the C++ and Fortran code behind R packages is usually built using the OpenMP framework for multi-threaded programming. Empirically, I do know that your issue can be resolved if you set the OMP_NUM_THREADS environment variable before you launch R, or if you set it within an R session.
Let's say you wanted to use 2 threads for the loess function. Before you launch R, you would do this ($ to signify typing this in a shell session):
$ OMP_NUM_THREADS=2 R [whatever other options you use to launch R]
Here's how to do it from within R (> to indicate an interactive R session):
> Sys.setenv("OMP_NUM_THREADS" = 2)
If you ever need to check the variable from within R, you can do the following (this will return a character vector with the number):
> Sys.getenv("OMP_NUM_THREADS")
# The result in our example will be "2"
For completeness, be sure to use ?Sys.setenv or ?Sys.getenv if you wish to get more information about those functions, and check out this site for details about OMP_NUM_THREADS.
Hope that helps!
So McG led me down a path that eventually gave me the ability to control the number of cores, which I'll post as another answer.
There were a few details I foolishly neglected to mention, namely that I was working on an RStudio server. For all other purposes, I indeed think that McG's answer would be excellent.
That answer helped me get the correct terms to google, and while strolling through the search results I stumbled upon this thread, which suggested that the RhpcBLASctl package has a function to set the number of threads, as follows:
blas_set_num_threads(2)
Setting this in an R Markdown document before running loess kept my CPU usage at 200% while running the loess call that had been problematic before.
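For reference, a minimal sketch of what that could look like (assuming the RhpcBLASctl package is installed; the omp_set_num_threads call is an additional assumption on my part, in case the Fortran code under loess is threaded via OpenMP rather than BLAS):
library(RhpcBLASctl)
blas_set_num_threads(2)  # limit the BLAS library to 2 threads
omp_set_num_threads(2)   # assumption: also cap OpenMP threads, in case loess uses them
fit <- loess(y ~ x + z, data = df)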

Could someone explain what "compiling" in R is, and why it would speed up this code?

I am working on some R code written by a previous student. The code is extremely computationally intensive, and for that reason he appears to have gone to great lengths to minimise the time it takes in any way possible.
One example is the following section:
# Now lets compile these functions, for a modest speed boost.
Sa <- cmpfun(Sa)
Sm <- cmpfun(Sm)
sa <- cmpfun(sa)
sm <- cmpfun(sm)
h <- cmpfun(h)
li <- cmpfun(lli)
ll <- cmpfun(ll)
He appears to have used the compiler package to do this.
I have never heard of compiling in R, and I am interested in what it does and why it would help speed up the code. I am having trouble finding material that would explain it for a novice like me.
The compiler package has been part of R since version 2.13.0. Compiling R functions results in a byte-code version that may run faster. There are a number of ways of compiling. All base R functions are compiled by default.
You can compile individual functions via cmpfun. Alternatively, you can call enableJIT(3) once and R code is then compiled automatically.
I've found that compiling R code gives a modest, cost-free speed boost; see Efficient R Programming for a timed example.
It appears that byte compiling will be turned on by default in R 3.4.X
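To illustrate the idea, here is a minimal sketch with a made-up toy function (on R 3.4 and later, functions are byte-compiled automatically, so the difference may be negligible there):
library(compiler)
# A deliberately slow, loop-heavy toy function
f <- function(n) {
  s <- 0
  for (i in 1:n) s <- s + i / 2
  s
}
f_cmp <- cmpfun(f)       # byte-compiled version of the same function
system.time(f(1e7))      # interpreted (on older R versions)
system.time(f_cmp(1e7))  # byte-compiled, typically somewhat faster
# Alternatively, compile everything automatically from now on:
# enableJIT(3)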

How can I get R to use more CPU usage?

I noticed that R doesn't use all of my CPU, and I want to increase that tremendously (up to 100%). I don't want it to just parallelize a few functions; I want R to use more of my CPU resources. I am trying to solve a pure IP (integer programming) set-packing problem using the lp() function. Currently, I run Windows and I have 4 cores on my computer.
I have tried to experiment with snow, doParallel, and foreach (though I do not really know what I am doing with them).
In my code I have this...
library(foreach)
library(doParallel)
library(snowfall)
cl <- makeCluster(4)
registerDoParallel(cl)
sfInit(parallel = TRUE, cpus = 4)
#code that is taking a while to run but does not involve simulations/iterations
lp (......, all.int = TRUE)
sfStop()
R gets stuck and runs lp() for a very long time. My CPU usage is around 25%; how can I increase that?
If you are trying to run 4 different LPs in parallel, here's how to do it in snowfall.
sfInit(parallel=TRUE, cpus=4)
sfSource("code.R") # if you have your function in a separate file
sfExport(list=c("variable1","variable2",
"functionname1")) #export your variables and function to cluster
results<-sfClusterApplyLB(parameters, functionname) #this starts the function on the clusters
E.g., the function passed to sfClusterApplyLB could contain your LP, as in the sketch below.
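A minimal sketch of that pattern (assuming the lpSolve package provides lp(), and a hypothetical list problem_list of constraint matrices, one per subproblem; neither is from the original answer):
library(snowfall)
library(lpSolve)
# Hypothetical wrapper: solve one set-packing problem given its constraint matrix A
solve_one <- function(A) {
  lp(direction = "max",
     objective.in = rep(1, ncol(A)),
     const.mat = A,
     const.dir = rep("<=", nrow(A)),
     const.rhs = rep(1, nrow(A)),
     all.int = TRUE)
}
sfInit(parallel = TRUE, cpus = 4)
sfLibrary(lpSolve)        # load lp() on the worker processes
sfExport("solve_one")     # export the wrapper to the workers
results <- sfClusterApplyLB(problem_list, solve_one)
sfStop()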
Otherwise, see the comments on your question.
Posting this as an answer because there's not enough space in a comment.
This is not a direct answer to your question, but more about performance in general.
R uses slow BLAS (linear algebra) libraries by default, which can also only use a single core. Improved libraries are OpenBLAS/ATLAS. These, however, can be a pain to install.
Personally I eventually got it working using this guide.
I ended up using Revolution R Open (RRO) + MKL, which has both improved BLAS libraries and multi-CPU support. It is an alternative R distribution that is supposed to have up to 20x the speed of regular R (I cannot confirm this, but it is a lot faster).
Furthermore, you could check the CRAN HPC packages to see if there are any improved packages that support the lp function.
There are also packages for exploring multi-CPU usage.
This answer by Gavin, as well as @user3293236's answer above, shows several possibilities for packages allowing multi-CPU usage.
