I am running RStudio (64-bit) on a Windows 10 laptop with an Nvidia GPU in it; however, when I run code, specifically R Shiny apps, it takes a long time. The laptop has a GPU, but Task Manager shows the GPU is not being utilized. Would the GPU make my program run faster? I do not know much about hardware, so forgive my ignorance regarding this.
In answer to your question: the GPU would have no impact on the speed of your code as it is currently written.
By default, most R code is single-threaded, meaning that it will only use one CPU core. There are various ways to do parallel processing (using more than one core) in R, and there are also packages that can make use of GPUs. However, it sounds like you are not using either of these.
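For example, here is a minimal sketch of spreading work across several cores with the base parallel package (the slow_task function and its inputs are made-up placeholders, not anything from your app):

    library(parallel)

    # Stand-in for whatever per-item work your app does
    slow_task <- function(x) { Sys.sleep(0.1); x^2 }

    cl <- makeCluster(detectCores() - 1)        # socket cluster, works on Windows too
    results <- parLapply(cl, 1:100, slow_task)  # run the task across the workers
    stopCluster(cl)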
There are various ways you could code your application to make it more efficient; however, how you would go about this is specific to your code. I would suggest you ask a separate question about that.
Also, Hadley's excellent book, Advanced R, has techniques for profiling and benchmarking your code to increase performance: http://adv-r.had.co.nz/Profiling.html
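As a starting point, base R's sampling profiler will show where the time is actually going (the sorting loop below is just a stand-in for your app's slow code):

    Rprof("profile.out")                        # start the sampling profiler
    result <- replicate(50, sort(runif(1e5)))   # stand-in workload
    Rprof(NULL)                                 # stop profiling
    summaryRprof("profile.out")$by.self         # time spent per function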
On the right-hand side is the result from an EC2 instance with 36 cores and 64GB RAM, while on the left is my laptop with 8 cores and 8GB RAM.
I'm new to running R on an AWS EC2 instance, so I probably need to configure my R installation in order to make use of the EC2 instance's raw compute power.
Could someone please advise how, or is there anything that I have missed here?
Thanks!
I found a better answer here.
Quite disappointed by the unnecessary downvotes.
https://github.com/csantill/RPerformanceWBLAS/blob/master/RPerformanceBLAS.md
We frequently hear that R is slow; if your code makes heavy use of vector/matrix operations, switching BLAS libraries can bring significant performance improvements.
The precompiled R distribution downloaded from CRAN uses the reference BLAS/LAPACK implementation for linear algebra operations. These implementations are built to be stable and cross-platform compatible, but they are not optimized for performance. Most R programmers use these default libraries and are unaware that highly optimized alternatives are available, and that switching to them can yield a significant performance improvement.
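To see which libraries your build is linked against, and to get a feel for the difference an optimized BLAS makes, you can check sessionInfo() and time a matrix-heavy operation (the matrix size here is arbitrary):

    sessionInfo()          # the "BLAS:" / "LAPACK:" lines show the linked libraries (R >= 3.4)
    m <- matrix(rnorm(2000 * 2000), nrow = 2000)
    system.time(crossprod(m))   # typically far faster with OpenBLAS or MKL than with reference BLAS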
I've been using R 3.1.2 on an early-2014 13" MacBook Air with 8GB of RAM and a 1.7GHz Intel Core i7, running OS X Mavericks.
Recently, I've started to work with substantially larger data frames (2+ million rows and 500+ columns) and I am running into performance issues. In Activity Monitor, I'm seeing virtual memory sizes of 64GB, 32GB paging files, etc. and the "memory pressure" indicator is red.
Can I "throw more hardware" at this problem? Since the MacBook Air tops out at 8GB of physical memory, I was thinking about buying a Mac Pro with 64GB of memory. Before I spend the $5K+, I wanted to ask whether there are any inherent limitations in R other than the ones I've read about here: R Memory Limits, or whether anyone who has a Mac Pro has experienced any issues running R/RStudio on it. I've searched using Google and haven't come up with anything specific about running R on a Mac Pro.
Note that I realize I'll still be using 1 CPU core unless I rewrite my code. I'm just trying to solve the memory problem first.
Several thoughts:
1) It's a lot more cost-effective to use a cloud service like https://www.dominodatalab.com (not affiliated). Amazon AWS would also work; the benefit of Domino is that it takes the work out of managing the environment, so you can focus on the data science.
2) You might want to redesign your processing pipeline so that not all of your data needs to be loaded in memory at the same time (soon you will find you need 128GB, and then what?). Read up on memory mapping, using databases, and separating your pipeline into several steps that can be executed independently of each other (a quick search turned up http://user2007.org/program/presentations/adler.pdf); a chunked-processing sketch is shown below. Running out of memory is a common problem when working with real-life datasets, and throwing more hardware at the problem is not always your best option (though sometimes it really can't be avoided).
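For instance, here is one way to process a large CSV in fixed-size chunks so the whole file never has to sit in memory at once (the file name, chunk size, and the value column are made-up placeholders):

    con <- file("big_data.csv", open = "r")               # hypothetical file
    header <- strsplit(readLines(con, n = 1), ",")[[1]]   # column names from the header row
    chunk_size <- 100000
    total <- 0
    repeat {
      chunk <- tryCatch(
        read.csv(con, nrows = chunk_size, header = FALSE, col.names = header),
        error = function(e) NULL                          # read.csv errors at end of file
      )
      if (is.null(chunk) || nrow(chunk) == 0) break
      total <- total + sum(chunk$value)                   # keep only a small summary per chunk
    }
    close(con)
    total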
Has anyone tried to compile glibc with -march=corei7 to see if there's any performance improvement over the version that comes by default with any Linux x86_64 distribution? GCC is compiled with -march=i686. I think (but am not sure) that the mathematical library is also compiled the same way. Can anybody confirm this?
Most Linux distributions for x86 compile using only i686 instructions, but schedule them for later processors. I haven't really followed later developments.
A long while back, different versions of system libraries for different processor lines were common, but the performance differences were soon deemed too small to be worth the cost. And machines have become more uniform in performance since then.
One thing that always has to be remembered is that today's machines are memory-bound: a memory access takes a few hundred times longer than an instruction, and the gap is growing. Not to mention that this machine (an oldish laptop, top-of-the-line some two years back) has 4 cores (8 threads), all battling to get data and instructions from memory. Making the code run a tiny bit faster, so the CPU can wait longer for RAM, isn't very productive.
Does anyone have a suggestion as to whether it would be possible to run R scripts on a Synology DS214 NAS? If yes, further information or links would be much appreciated.
The underlying OS on the DiskStation is Linux-based, so in principle it may be possible, but I highly doubt it will be in practice. First of all, you'd need to compile R for the Marvell Armada XP CPU, which is part of the ARM family of CPUs. To compile R on this chip you'll need all the related software that R depends on, as described in the R Installation and Administration manual.
Finally, a DiskStation is not at all designed to crunch numbers. Even if you do manage to overcome the potentially insurmountable problems of compiling the software you need on the DS214, the execution of code isn't going to be, well, snappy. Also, the 512MB of on-board RAM will make all but small data analysis jobs impossible, or impossibly slow.
I have a script in R that uses loess to find a curve of best fit for some data, integrates that curve, and then does some simple arithmetic from there to come up with a number. It is for a handheld testing unit, and therefore I need to run it on a microcomputer. There are ones out there with ARM processors that will run Linux (520MHz low-power ARM processor, 128MB of RAM).
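In outline, the calculation is something like the following (heavily simplified, with made-up data standing in for the real measurements):

    set.seed(1)
    x <- seq(0, 10, length.out = 100)
    y <- sin(x) + rnorm(100, sd = 0.1)                     # placeholder data

    fit <- loess(y ~ x)                                    # curve of best fit
    f <- function(t) predict(fit, newdata = data.frame(x = t))
    area <- integrate(f, lower = min(x), upper = max(x))   # integrate the fitted curve
    area$value / diff(range(x))                            # simple arithmetic on the result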
My question is: will this have enough power to do the calculations, and will R even run on a system like this?
Any suggestions or insight would be greatly appreciated!
Thanks.
You need compatible C and FORTRAN compilers, as well as the GNU OS and Linux kernel, to build R from source. The lack of compatible compilers is likely to doom this strategy. I suspect you will have better success using the code from loess and simpleLoess as a template for a compiled version in C.
As to your second question, R can be made to run on ARM processors. See, for example, http://maemo.org/packages/view/r-base-core/, which is a port to Maemo, an operating system for handheld devices with ARM processors. It can also be compiled for Android: http://rwiki.sciviews.org/doku.php?id=getting-started:installation:android.
As @Carl said, the first question can't be answered without knowing more details.
I stumbled on this post describing an open-source HW/SW R-based graphing calculator built on this motherboard!
HIH!