Is R a compiled language? - r

I can't find it anywhere on the web (and I don't want to install it). Is the R language a compiled language? How fast does it run a pre-written script? Does it do any kind of compilation, or just execute instructions line by line?

In most cases R is an interpreted language that runs in a read-evaluate-print loop. There are numerous extensions to R that are written in other languages like C and Fortran where speed or interfacing with native libraries is helpful.

I've often rewritten R code in C++ and made it run 100x faster. Looping is especially inefficient in R.

R is generally an interpreted language. However, the compiler package offers bytecode compilation that can improve performance. You can also call compiled code from R.
How fast it runs depends on what you are trying to do and how you do it. Some looping operations can be very slow. However, in many cases, with well-written code, the performance of an R script is determined by the speed of the underlying C-based internal libraries and by system memory read-write speeds, so R is about as fast as anything else.
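As a minimal sketch of what "calling compiled code from R" can look like, here is a C function written in the style expected by R's .C() interface, where every argument arrives as a pointer and results are written back through an output argument. The function name sum_squares is hypothetical, chosen only for illustration.

```c
/*
 * Sum of squares in the style expected by R's .C() interface:
 * all arguments are pointers, the function returns void, and the
 * result is written into the caller-supplied output argument.
 * The name sum_squares is hypothetical.
 */
void sum_squares(double *x, int *n, double *out)
{
    double total = 0.0;

    /* This loop runs as native code, not in the R interpreter. */
    for (int i = 0; i < *n; i++) {
        total += x[i] * x[i];
    }

    *out = total;
}
```

Assuming the file is saved as sum_squares.c, it could be built with R CMD SHLIB sum_squares.c, loaded with dyn.load(), and called from R as .C("sum_squares", as.double(x), as.integer(length(x)), out = double(1))$out.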

Related

Using Openblas with R in Reproducible R container

I am using R for reproducible scientific machine learning and hyperparameter optimization. I stumbled upon the fact that other BLAS implementations such as OpenBLAS, ATLAS, or MKL can speed up this costly optimization. But the results differ slightly with each BLAS: even when the optimization is forced onto a single thread, the results deviate from those of default R.
So I want to try using Docker to contain the experiment. I have multiple questions.
Is it good to compile from source instead of using binaries?
If I compile from source, will it lead to the same configuration as the Debian binaries?
Since the results are different for each BLAS, there is a tool called ReproBLAS from Berkeley; is it a good idea to use it with R?
When you compile R using "--with-blas=-lopenblas", is OpenBLAS single-threaded or multithreaded in this case?

R: write a high-quality GUI for an R-project

I have written a piece of R code that performs a numerical computation. Now I want to wrap it in a nice GUI. I know that there are some R packages that allow you to create GUIs from within R (e.g. gWidgets, RGtk2, ...). However, they seem to be rather limited in their capabilities and complicated to build. So I thought about going the other way round and writing a windowed program that incorporates my R code.
Is it possible to write a nice GUI (for example in Visual Basic .NET or Java) that gathers some user inputs, calls the R computations, and displays the results?
I ask about Visual Basic because there is this new R-Open that comes along with Visual Studio, which makes me think the two must offer natural ways of working with each other. I also hope that I would be able to compile an exe with it in the end.
Thank you very much for your help!
Bernd
You can embed R in C++ code. There are examples in the R source code and documentation.
Very briefly, you'll need to build a shared DLL version of R (i.e. with the --enable-R-shlib option) from the source code, using the Windows tools. This is how GUIs like RStudio function.
The R Admin manuals have detailed instructions. The RInside package might make this a bit easier.
With the shared DLL you could probably embed R in other languages (it works for R in Python).

Multi-threading in NLOPTR (The R package interface to NLOPT)

I'm trying to run a set of fairly large non-linear optimizations through an R interface. (Yes, it must be R -- other languages don't work) I've been groveling around in the code, and I confess that I can't figure out whether NLOPT supports multithreaded operation, and, if so, how it gets kicked in.
Does it?

Rust on grid computing

I'm looking to create Rust implementations of some small bioinformatics programs for my research. One of my main considerations is performance, and while I know that I could schedule the Rust program to run on a grid with qsub - the cluster I have access to uses Oracle's GridEngine - I'm worried that the fact that I'm not calling MPI directly will cause performance issues with the Rust program.
Will scheduling the program without using an MPI library hinder performance greatly? Should I use an MPI library in Rust, and if so, are there any known MPI libraries for Rust? I've looked for one but I haven't found anything.
I have used several supercomputing facilities (I'm an astrophysicist) and have often faced the same problem: I know C/C++ very well but prefer to work with other languages.
In general, any approach other than MPI will do, but consider that often such supercomputers have heavily optimised MPI libraries, often tailored for the specific hardware integrated in the cluster. It is difficult to tell how much the performance of your Rust programs will be affected if you do not use MPI, but the safest bet is to stay with the MPI implementation provided on the cluster.
There is no performance penalty in using a Rust wrapper around a C library like an MPI library, as the bottleneck is the time needed to transfer data (e.g. via an MPI_Send) between nodes, not the negligible cost of an additional function call. (Moreover, this is not the case for Rust: there is no additional function call, as already stated above.)
However, despite the very good FFI provided by Rust, it is not going to be easy to create MPI bindings. The problem lies in the fact that MPI is not a library, but a specification. Popular MPI libraries are OpenMPI (http://www.open-mpi.org) and MPICH (http://www.mpich.org). Each of them differs slightly in the way they implement the standard, and they usually cover such differences using C preprocessor macros. Very few FFIs are able to deal with complex macros; I don't know how Rust scores here.
For instance, I am implementing an MPI program in Free Pascal, but I am not able to use the existing MPICH bindings (http://wiki.lazarus.freepascal.org/MPICH), as the cluster I am using provides its own MPI library and I prefer to use that one for the reason stated above. I was unable to reuse the MPICH bindings because they assume that constants like MPI_BYTE are hardcoded integer constants. But in my case they are pointers to opaque structures that seem to be created when MPI_Init is called.
Julia bindings to MPI (https://github.com/lcw/MPI.jl) solve this problem by running C and Fortran programs during the installation that generate Julia code with the correct values for such constants. See e.g. https://github.com/lcw/MPI.jl/blob/master/deps/make_f_const.f
In my case I preferred to implement a middleware, i.e., a small C library which wraps MPI calls with a more "predictable" interface. (This is more or less what the Python and OCaml bindings do too, see https://forge.ocamlcore.org/projects/ocamlmpi/ and http://mpi4py.scipy.org.) Things are running smoothly so far, and I haven't had any problems.
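The middleware idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual library: to stay self-contained it does not include <mpi.h>, so struct impl_datatype is a stand-in for the implementation's opaque MPI_Datatype handles, and every mw_ name is hypothetical. The point is that foreign-language bindings see only fixed integer codes, while the translation to whatever the MPI implementation uses happens inside the C layer.

```c
#include <stddef.h>

/* Stand-in for an MPI implementation's opaque datatype object.
 * With a real MPI library this role is played by MPI_Datatype. */
struct impl_datatype {
    size_t size; /* size in bytes of one element */
};

static const struct impl_datatype impl_byte   = { sizeof(char)   };
static const struct impl_datatype impl_int    = { sizeof(int)    };
static const struct impl_datatype impl_double = { sizeof(double) };

/* Stable integer codes: the only thing the bindings need to know.
 * These names and values are hypothetical. */
enum mw_type { MW_BYTE = 0, MW_INT = 1, MW_DOUBLE = 2 };

/* Translate a stable code into the implementation's handle.
 * Returns NULL for an unknown code. */
static const struct impl_datatype *mw_resolve(int code)
{
    switch (code) {
    case MW_BYTE:   return &impl_byte;
    case MW_INT:    return &impl_int;
    case MW_DOUBLE: return &impl_double;
    default:        return NULL;
    }
}

/* Hypothetical wrapper entry point with a "predictable" signature
 * (plain ints only). It reports how many bytes a message of `count`
 * elements would occupy, or -1 on an unknown type code or a negative
 * count. A real wrapper would call MPI_Send with the resolved handle
 * at this point instead. */
long mw_message_bytes(int type_code, int count)
{
    const struct impl_datatype *t = mw_resolve(type_code);

    if (t == NULL || count < 0) {
        return -1;
    }
    return (long)t->size * count;
}
```

Because the codes are ordinary integers fixed by the wrapper, bindings written in Pascal, Rust, or any other language with a C FFI would not need to know how a particular MPI library represents MPI_BYTE and similar constants.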
Will scheduling the program without using an MPI library hinder performance greatly?
There are lots of ways to carry out parallel computing. MPI is one, and as comments to your question indicate you can call MPI from Rust with a bit of gymnastics.
But there are other approaches, like the PGAS family (Chapel, OpenSHMEM, Co-array Fortran), or alternative messaging like what Charm++ uses.
MPI is "simply" providing a (very useful, highly portable, aggressively optimized) messaging abstraction, but as long as you have some way to manage the parallelism, you can run anything on a cluster.

Calling one R Installation from Another

I seem to be unable to compile RPostgreSQL for Windows x64, and after extensive searching, I've not been able to find a precompiled binary. To get on with my work, I've installed a 32-bit version of PostgreSQL and have been using 32-bit R for all database ops.
I need to do much of my work in 64 bit R, so switching back and forth has become a bit painful, especially since this requires a save() and load() operation each time I need to run a query.
I'm wondering whether it is possible to call one R installation directly from another? For example, could I simply pass queries to my 32 bit R installation and retrieve the result? I think there are other times when the ability to call another R installation would be useful as well.
All I've come up with is using a system() call, either directly to pgsql or to 32-bit R, but this doesn't allow for very efficient transfer of data.
I'd very sincerely appreciate any advice or assistance!
P.S. I'd rather ask how to compile RPostgreSQL for x64, but as I understand the rules here, such a question would be inappropriate since it's not a general question (e.g. I'd need step-by-step instructions since I don't have the requisite skills).
http://wiki.postgresql.org/wiki/64bit_Windows_port
