Can I use R without R studio? - r

We are confused about the difference between R and RStudio. We do the majority of our work in RStudio, but we were required to download R as well. Is regular R necessary for RStudio to work?

Indeed, R is the real technology you are using. RStudio is an IDE which makes it easier and nicer. Still, it's just working on top of R.
You should be comparing RStudio to a regular text editor. You can use R without RStudio; for instance, you might use a text editor plus a terminal window.

As mentioned by @NewUser, RStudio is simply an IDE, and alternatives do exist. Check the answer to another question here for a long list of alternatives.
RStudio is, however, the most popular IDE, and it comes with quite a few benefits: auto-completion of code, an interactive window for HTML applications, an interactive graphics window, easy connection to various databases with automatic connection string completion, etc. However, some prefer alternative IDEs, and you could even set up Notepad++ to write your code and execute it through the terminal.
The most obvious alternative is likely the R GUI, the minimalistic IDE that comes with the installation. It has some benefits while being restrictive in other ways. The most obvious benefit is far lower memory usage for each window. My thesis supervisor is a die-hard fan of the standard IDE, while a friend of mine uses it only when he needs to View(...) very large data and for some reason can't live with a summary output.
That said, this question basically has nothing to do with programming and is technically considered "off-topic" on Stack Overflow, as it is asking for recommendations. Other websites in the SO family are simply better for this type of question.

RStudio can be considered a "skin" over base R that makes it more user friendly. However, base R can certainly be used without RStudio.
The main difference you will experience at a beginner level is that you will need to use functions such as View(), rather than Ctrl-clicking a data frame, etc.
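As a rough illustration of what that looks like in practice, the sketch below inspects a data frame using only functions that ship with base R. It uses the built-in mtcars data set as a stand-in for your own data (an assumption for the example), and it can be typed at a plain R console or saved to a file and run from a terminal with Rscript.

```r
# Built-in example data; any data frame would do.
df <- mtcars

head(df, 5)      # first five rows
str(df)          # column types and a preview of the values
summary(df$mpg)  # numeric summary of one column
dim(df)          # number of rows and columns

# View() opens a spreadsheet-style viewer, but only makes sense interactively.
if (interactive()) View(df)
```

None of this depends on RStudio; the IDE only adds point-and-click shortcuts for the same functions.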

Related

Line by line analysis and plotting on multiple monitors during presentation

I am preparing a presentation on data analysis, and I am provided with a 2-3 monitor and projector setup. I would like to use one monitor (+projector) for code, one monitor (+projector) for console display and one monitor (+projector) for plots. Monitors are for me, projectors for the audience.
I would also like to run the code line by line (similar to the Ctrl-Enter feature of RStudio); copy-pasting code won't work. I want to use interactive graphics, analysis and plotting on the fly, so any pre-done analysis won't work.
Is there any way to achieve this? Although RStudio is a fantastic tool, a rather basic (and one might say easy) feature like panel detachment is not being developed although frequently requested. This would probably be the best solution to what I want.
UPDATE: Any OS (Win, Mac, Linux) will do.
You should be able to use the vanilla R GUI. Within that you have separate panels/windows for code, console, and plots (with as many plot windows as you want by calling a new device like quartz()). You can evaluate a line of code from the script using Cmd-Enter (Mac) or Ctrl-Enter (PC), and the default settings highlight the line of interest. You could also use emacs in the same way, which I find much more powerful and fun.
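A minimal sketch of the multiple-plot-window idea follows. It swaps quartz(), which is Mac-only, for dev.new(), which opens the platform's default graphics device; the plotted data are made up for the example.

```r
# Each dev.new() call opens a separate graphics device (a new window
# when run interactively).
dev.new()
first <- dev.cur()
plot(1:10, main = "Window 1")

dev.new()
second <- dev.cur()
hist(c(2, 3, 3, 5, 8), main = "Window 2")

# Switch back to the first device and add to the existing plot.
dev.set(first)
abline(h = 5, lty = 2)

# Devices that are currently open.
open_devices <- dev.list()
open_devices
```

Each window can then be dragged to a different monitor, and dev.set() decides which one receives the next plotting command.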

Does R have an add-on to colorize scripts?

I am new to the R language, but not to the programming world. I have been using excellent code editors such as Notepad++ and Eclipse and, therefore, am used to colored code.
Is there anything that can be done to colorize the scripts inside R?
I know I can use Notepad++; however, this will require going back and forth between the two programs, which is not convenient.
Check out RStudio.

R and SPSS difference

I will be analysing a vast amount of network traffic related data shortly, and will pre-process the data in order to analyse it. I have found that R and SPSS are among the most popular tools for statistical analysis. I will also be generating quite a lot of graphs and charts. Therefore, I was wondering what the basic difference is between these two software packages.
I am not asking which one is better, but just wanted to know what the differences are in terms of workflow between the two (besides the fact that SPSS has a GUI). I will be mostly working with scripts in either case anyway, so I wanted to know about the other differences.
Here is something that I posted to the R-help mailing list a while back, but I think that it gives a good high level overview of the general difference in R and SPSS:
When talking about user friendliness of computer software I like the analogy of cars vs. busses:
Busses are very easy to use, you just need to know which bus to get on, where to get on, and where to get off (and you need to pay your fare). Cars on the other hand require much more work: you need to have some type of map or directions (even if the map is in your head), you need to put gas in every now and then, you need to know the rules of the road (have some type of driver's licence). The big advantage of the car is that it can take you a bunch of places that the bus does not go and it is quicker for some trips that would require transferring between busses.
Using this analogy programs like SPSS are busses, easy to use for the standard things, but very frustrating if you want to do something that is not already preprogrammed.
R is a 4-wheel drive SUV (though environmentally friendly) with a bike on the back, a kayak on top, good walking and running shoes in the passenger seat, and mountain climbing and spelunking gear in the back. R can take you anywhere you want to go if you take time to learn how to use the equipment, but that is going to take longer than learning where the bus stops are in SPSS.
There are GUIs for R that make it a bit easier to use, but they also limit the functionality that can be used that easily. SPSS does have scripting, which takes it beyond being a mere bus, but the general philosophy of SPSS steers people towards the GUI rather than the scripts.
I work at a company that uses SPSS for the majority of our data analysis, and for a variety of reasons - I have started trying to use R for more and more of my own analysis. Some of the biggest differences I have run into include:
Output of tables - SPSS has basic tables, general tables, custom tables, etc. that are all output to that nifty data viewer or whatever they call it. These can relatively easily be transported to Word documents or Excel sheets for further analysis / presentation. The equivalent function in R involves learning LaTeX or using odfWeave or LyX or something of that nature.
Labeling of data - SPSS does a pretty good job with the variable labels and value labels. I haven't found a robust solution for R to accomplish this same task.
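For readers wondering what labeling can look like in base R, here is a minimal sketch. It is not the answerer's method and not a full equivalent of SPSS labels: value labels are approximated with a factor, and the variable label is stored in an attribute named "label", which is only a convention. The survey column is invented for the example.

```r
# Hypothetical survey column coded 1/2/3.
satisfaction_raw <- c(1, 3, 2, 3, 1)

# Value labels: map each numeric code to a label via a factor.
satisfaction <- factor(satisfaction_raw,
                       levels = c(1, 2, 3),
                       labels = c("Low", "Medium", "High"))

# Variable label: a plain attribute; base R does not enforce or use it.
attr(satisfaction, "label") <- "Overall satisfaction"

table(satisfaction)
attr(satisfaction, "label")
```

Because the attribute is only a convention, it may not survive every transformation, which is part of why this does not feel as robust as the SPSS approach.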
You mention that you are going to be scripting most of your work, and personally I find SPSS's scripting syntax absolutely horrendous, to the point that I've stopped working with SPSS whenever possible. R syntax seems much more logical and follows programming standards more closely AND there is a very active community to rely on should you run into trouble (SO for instance). I haven't found a good SPSS community to ask questions of when I run into problems.
Others have pointed out some of the big differences in terms of cost and functionality of the programs. If you have to collaborate with others, their comfort level with SPSS or R should play a factor as you don't want to be the only one in your group that can work on or edit a script that you wrote in the future.
If you are going to be learning R, this post on the stats exchange website has a bunch of great resources for learning R: https://stats.stackexchange.com/questions/138/resources-for-learning-r
The initial workflow for SPSS involves justifying writing a big fat cheque. R is freely available.
R has a single language for 'scripting', but don't think of it like that: R is really a programming language with great data manipulation, statistics, and graphics functionality built in. SPSS has 'Syntax', 'Scripts' and is also scriptable in Python.
Another biggie is that SPSS squeezes its data into a spreadsheety table structure. Dealing with other data structures is probably very hard, but comes naturally to R. I wouldn't know where to start handling network graph type data in SPSS, but there's a package to do it for R.
Also, with R you can integrate your workflow with your reporting by using Sweave - you write a document with embedded bits of R code that generate plots or tables, run the file through the system and out comes the report as a PDF. Great for when you want to do a weekly report, or you do a body of work and then the boss gives you an updated data set. Re-run, read it over, it's done.
But you know, your call...
Well, are you a decent programmer? If you are, then it's worthwhile to learn R. You can do more with your data, both in terms of manipulation and statistical modeling, than you can with SPSS, and your graphs will likely be better too. On the other hand, if you've never really programmed before, or find the idea of spending several months becoming a programmer intimidating, you'll probably get more value out of SPSS. The level of stuff that you can do with R without diving into its power as a full-fledged programming language probably doesn't justify the effort.
There's another option -- collaborate. Do you know someone you can work with on your project (you don't say whether it's academic or industry, but either way...), who knows R well?
There's an interesting (and reasonably fair) comparison between a number of stats tools here
http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/
I work with both in a company and can say the following:
If you have a large team of different people (not all data scientists), SPSS is useful because it is relatively easy to understand. For example, if users are going to run a model to get an output (sales estimates, etc.), SPSS is clear and easy to use.
That said, I find R better in almost every other sense:
R is faster (although, sometimes debatable)
As stated previously, the syntax in SPSS is awful (I can't stress this enough). On the other hand, R can be painful to learn, but there are tons of resources online and in the end it pays off much more because of the different things you can do.
Again, like everyone else says, the sky is the limit with R. Tons of packages, resources and, more importantly, independence to do as you please. In my organization we have some very high-level functions that get a lot done. The hard part is creating them once, but then they perform complicated tasks that SPSS would tangle in a never-ending web of canvas. This is especially true for things like loops.
It is often overlooked, but R also has plenty of features to cooperate between teams (github integration with RStudio, and easy package building with devtools).
Actually, if everyone in your organization knows R, all you need is to maintain a basic package on GitHub to share everything. This of course is not the norm, which is why I think SPSS, although a worse product, still has a market.
I have no data for it, but from my experience I can tell you one thing:
SPSS is a lot slower than R. (And with a lot, I really mean a lot)
The magnitude of the difference is probably as big as the one between C++ and R.
For example, I never have to wait longer than a couple of seconds in R. Using SPSS and similar data, I had calculations that took longer than 10 minutes.
As an unrelated side note: In my eyes, in the recent discussion on the speed of R, this point was somehow overlooked (i.e., the comparison with SPSS). Furthermore, I am astonished how this discussion popped up for a while and silently disappeared again.
There are some great responses above, but I will try to provide my 2 cents. My department completely relies on SPSS for our work, but in recent months, I have been making a conscious effort to learn R; in part, for some of the reasons itemized above (speed, vast data structures, available packages, etc.)
That said, here are a few things I have picked up along the way:
Unless you have some experience programming, I think creating summary tables in CTABLES destroys any available option in R. To date, I am unaware of a package that can replicate what can be created using Custom Tables.
SPSS does appear to be slower when scripting, and yes, SPSS syntax is terrible. That said, I have found that scripts in SPSS can always be improved by using the EXECUTE command sparingly.
SPSS and R can interface with each other, although it appears that it's one way (only when using R inside of SPSS, not the other way around). That said, I have found this to be of little use other than if I want to use ggplot2 or for some other advanced data management techniques. (I despise SPSS macros).
I have long felt that "reporting" work created in SPSS is far inferior to other solutions. As mentioned above, if you can leverage LaTeX and Sweave, you will be very happy with your efficient workflows.
I have been able to do some advanced analysis by leveraging OMS in SPSS. Almost everything can be routed to a new dataset, but I have found that most SPSS users don't use this functionality. Also, when looking at examples in R, it just feels "easier" than using OMS.
In short, I find myself using SPSS when I can't figure it out quickly in R, but I sincerely have every intention of getting away from SPSS and using R entirely at some point in the near future.
SPSS provides a GUI to easily integrate existing R programs or develop new ones. For more info, see the SPSS Community on IBM Developer Works.
@Henrik, I did the same task you mentioned (C++ and R) in SPSS, and it turned out that SPSS is faster than R on this one. In my case SPSS is approx. 7 times faster. I am surprised about it.
Here is the code I used in SPSS.
data list free
/x (f8.3).
begin data
1
end data.
comp n = 1e6.
comp t1 = $time.
loop #rep = 1 to 10.
comp x = 1.
loop #i=1 to n.
comp x = 1/(1+x).
end loop.
end loop.
comp t2 = $time.
comp elipsed = t2 - t1.
form elipsed (f8.2).
exe.
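For comparison, the same computation could be written in R roughly as below. This is a sketch of an equivalent loop, not the code used in the benchmark being discussed; the function name iterate and its arguments are made up for the example, and timings will differ by machine and R version.

```r
# Repeat x <- 1 / (1 + x) n times, starting from x = 1, for `reps` rounds.
iterate <- function(n = 1e6, reps = 10) {
  x <- 1
  for (r in seq_len(reps)) {
    x <- 1
    for (i in seq_len(n)) {
      x <- 1 / (1 + x)
    }
  }
  x
}

timing <- system.time(result <- iterate())
timing["elapsed"]  # wall-clock seconds
result             # converges towards (sqrt(5) - 1) / 2
```

An explicit loop like this is close to a worst case for R, so a single benchmark of this kind says little about typical vectorised workloads.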
Check out this video on why it is good to combine SPSS and R:
http://bluemixanalytics.wordpress.com/2014/08/29/7-good-reasons-to-combine-ibm-spss-analytics-and-r/
If you have a compatible copy of R installed, you can connect to it from IBM SPSS Modeler and carry out model building and model scoring using custom R algorithms that can be deployed in IBM SPSS Modeler. You must also have a copy of IBM SPSS Modeler - Essentials for R installed. IBM SPSS Modeler - Essentials for R provides you with tools you need to start developing custom R applications for use with IBM SPSS Modeler.
The truth is: both packages are useful if you do data analysis professionally. Sure, R / RStudio has more statistical methods implemented than SPSS. But SPSS is much easier to use and gives more information per button click. Therefore, it is faster to use whenever a particular analysis is implemented in both R and SPSS.
In the modern age, neither CPU nor memory is the most valuable resource. Researcher's time is the most valuable resource. Also, tables in SPSS are more visually pleasing, in my opinion.
In summary, R and SPSS complement each other well.

R text editors for introductory statistics courses [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicates:
Best IDE / TextEditor for R
Recommendations for Windows text editor for R
Dear All,
I teach a large introductory R course (about 100 students), and would like to recommend suitable text editors for R. The students who attend this course are first year mathematics undergraduates doing their very first course in R. They have never programmed in any language before.
For the vast majority of them, it wouldn't be beneficial to learn to use a 'complex editor' - by this I mean emacs and vi.
What I would like to do is recommend simple text editors that:
are free
can be easily installed on their laptops by users with little computer knowledge
have R syntax highlighting
are available for Windows or Mac.
For windows I've found:
TINN-R
Notepad++ with the R plugin
Are there any others that I've missed for Windows?
There are a few threads that deal with R text editors:
Best IDE / TextEditor for R
Recommendations for Windows text editor for R
Which IDE for R in Linux?
but these are a bit too complicated for my purpose.
Edits
Following comments from Shane and others I've reworded the question.
Given that you don't have any major specific requirements (like an object browser), it's probably best to use what you're already using as much as possible. Something like Textpad is very simple and can do syntax highlighting.
Here are a few more pointers:
First of all, the R console that ships with Windows has its own script editor. Just go to File > New Script. It's very easy to use and you can execute code by highlighting it. If you just want something simple, I would stick with that.
I use Eclipse (with StatET) on Windows, and I have used it on a Mac too. It's great if you want an extensive IDE (syntax highlighting, integrated console, SVN, etc.) with a small learning curve.
JGR is also very good and platform independent.
Sciviews (which has Tinn-R) has several other options, including SciViews-K which is an R extension for Komodo.
Two others worth mentioning are Rattle and Rkward.
Emacs and VIM have a bigger learning curve, but they're also very powerful, especially if you're already using them for something else.
I see, this question is distinguished from prior ones by asking for a recommendation specific to "Intro to R" students. For the Mac portion of your question, I would suggest TextMate, for two reasons. First, the default answer, "just use the Aqua R.app GUI" that R ships with, has minimal syntax highlighting and doesn't allow you to save and insert R commands (not that I'm aware of, at least). Both of those things make learning a new language less painful and more efficient. But that might not justify the overhead of learning an editor while learning a new language at the same time.
No doubt others here will recommend TM, but they might not mention TextMate's tiered learning curve, i.e., someone who has never seen TM before can, after a 45-minute tutorial, launch an interactive R session from it and use it to save/retrieve R command "snippets". TM is not free, but it's around $50 with the academic discount, I believe. I would recommend three bundles for R use in TM: (i) R.app; (ii) R.daemon; and (iii) R, all of which are in the TM svn repository.
As always, emacs is an option: R in Emacs
This may not be the best option because of the learning curve with emacs though.
I haven't used it for R but TextMate on the Mac is awesome and they have an R bundle.
I haven't used it myself, but there is an Eclipse plug-in for R (which should work on Windows and Mac).
Because someone already mentioned Emacs, of course there's VIM with R plugin, don't know how many of those there are, but I found at least one with a quick google. VIM might have an even steeper learning curve than Emacs though.
That said. I think Emacs and VIM will both handle pretty much any language out there, so let the flame war begin!
I use Vim myself but I'm quite certain that both Vim and Emacs would be a bad choice for a student course.

What useful R package doesn't currently exist? [closed]

Closed 11 years ago.
I have been working on a few R packages for some general tools that aren't currently available in R: blogging, report delivery, logging, and scheduling. This led me to wonder: what are the most important things that people wish existed in R that currently aren't available?
My hope is that we can use this to pinpoint some gaps, and possibly work on them collaboratively.
I'm a former Mathematica junkie, and one thing that I really miss is the notebook-style interface. When I did my research with notebooks, papers would almost write themselves as I did my analysis. But now that I'm using R, I find documenting my work to be quite tedious.
For people who are not so familiar with Mathematica, you have documents called "notebooks" that can contain code, text, equations, and the results from executed code (which can be equations, text, graphics, or interactive tools). Everything can be neatly organized into styled subsections or sections that are collapsible. You can have multiple open documents that integrate with a single shared kernel.
While I don't think a full-blown Mathematica style interface is entirely necessary, some interactive document system that would support text (for description), code, code output, and embedded image output would be a real boon to researchers.
A Real-Time R package would be my choice, using C Streaming perhaps.
Also I'd like a more robust web development package. Nothing as extensive as Ruby on Rails but something a bit better than Sweave combined with R2HTML, that can run on RApache. I think this needs to be a huge area of emphasis for R in general.
I realize LaTeX is better markup for certain academia but in general I think HTML should be the markup language of choice. More needs to be done in terms of R Web Apps, so applications can be hosted on huge RAM remotely and R can start being used for SaaS data applications and other graphics choices.
Interfaces to any of the new-fangled 'Web 2.0' databases that use key-value pairs rather than the standard RDBMS. A non-exhaustive list (in alphabetical order) would be
Cassandra Project
CouchDB
MongoDB
Project Voldemort
Redis
Tokyo Cabinet
and it would of course be nice if we had a DBI-alike abstraction on top of this. Jeff has started with RBerkeley, but that uses the older-school Oracle BerkeleyDB backend rather than one of those new things.
An output device which produces Javascript code, perhaps using the protovis library.
As a programmer and writer of libraries for colleagues, I was definitely missing a logging package. I googled and asked around, here too, then wrote one myself. It is on r-forge, here, and it's called "logging" :)
I use it and I'm obviously still developing it.
There are few libraries for interfacing with databases in general, and there is no ORM library.
RMySQL is useful, but you have to write the SQL queries manually and there is no way to generate them as in an ORM. Moreover, it is specific to MySQL.
Another thing that R still doesn't have, for me, is a good system for reading command line arguments: there is R getopt, but it is nothing like, for example, argparse in Python.
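To show what is available without extra packages, here is a minimal sketch built on base R's commandArgs(). The helper parse_args and the --name=value convention are assumptions made up for this example; they are not part of base R or of getopt, and they fall well short of what argparse offers (no types, defaults, or help text).

```r
# Parse arguments of the form --name=value into a named list.
# `parse_args` is a hypothetical helper written for this example.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  opts <- list()
  for (arg in args) {
    if (grepl("^--[^=]+=", arg)) {
      name <- sub("^--([^=]+)=.*$", "\\1", arg)
      value <- sub("^--[^=]+=", "", arg)
      opts[[name]] <- value
    }
  }
  opts
}

# Simulates: Rscript script.R --input=data.csv --n=10
opts <- parse_args(c("--input=data.csv", "--n=10"))
opts$input
as.integer(opts$n)
```

All values come back as character strings, so any conversion and validation is left to the script, which is exactly the kind of work a fuller argument parser would take over.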
A natural interface to the .NET framework would be awesome, though I suspect that that might be a lot of work.
EDIT:
Syntax highlighting from within RGui would also be wonderful.
ANOTHER EDIT:
R.NET now exists to integrate R with .NET.
A FRAQ package for FRequently Asked Questions, a la fortune(). R-help would be so much fun: "Try this: library(FRAQ); faq("lattice won't print")", etc.
See also.
A wiki package that adds wiki-like documentation to R packages. You'd have an inst/wiki subdirectory with plain text files in markdown, asciidoc, or textile, with embedded R code. With the right incantation, these files would be executed (think brew and/or asciidoc packages), and the relevant output uploaded to a given repository online (github, googlecode, etc.). Another function could take care of synchronizing the changes made online, typically via svn or git.
Suddenly you have a wiki documentation for your package with reproducible examples (could even be hooked to R CMD check).
EDIT 2012:
... and now the knitr package would make this process even easier and neater
I would like to see a more straightforward way for users to embed another programming language within R. As an example, in some Common Lisp implementations one can write a function with embedded C code like this:
(defun sample (n1 n2)
  (ffi:c-inline (n1 n2) (:int :int) (values :int :int) "{
    int n1 = #0, n2 = #1, out1 = 0, out2 = 1;
    while (n1 <= n2) {
      out1 += n1;
      out2 *= n1;
      n1++;
    }
    @(return 0)= out1;
    @(return 1)= out2;
    }"
    :side-effects nil))
It would be good if one could write an R function with embedded C or lisp code (more interested in the latter) in a similar way.
A native .NET interface to RGUI. R(D)COM is based on COM, and it only allows exchanging matrices, not more complex structures.
I would very much like a line profiler. This exists in Matlab and Python, and is very useful for finding bits of code that take a lot of time or are executed more (or less) than expected. A lot of my code involves function optimizations and how many times something iterates may not be known in advance (though most iterations are constrained or specified).
The call stack is useful if all of your code is in R and is very simple, but, as I recently posted, it takes a painstaking effort if your code is complex.
It's quite easy to develop a line profiler for a given bit of code. A naive way is to index every line (or just pre-specified sections) and insert a call to log proc.time() at that line. In a loop, I simply enumerate sections of code and store in a 2-dimensional list the proc.time values for section i in iteration k. [See update below: this isn't actually a way to do a line profiler for all kinds of code.]
One can use such a tool to find hotspots, anomalies (e.g. code that should be O(n) but is really O(n^2)), code that may benefit from memoization (a line profiler doesn't tell you this, but it lets you know where to look), code that is mistakenly inside a loop, and more.
Update 1: Inserting a timing line between every function line is slightly erroneous: the definition of a line of code is not simply code separated by whitespace. Being able to parse the code into an AST is necessary for knowing where operations begin and end. As discussed in some of the answers to this question, there are some tools (namely, showTree and walkCode in the codetools package) for doing this. Simply applying a regular expression to source code would be a very bad thing to do.
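The naive section-timing approach described above can be sketched as follows. This is only an illustration under stated assumptions: the two sections (simulating and sorting random numbers) are invented, and a matrix is used in place of the 2-dimensional list mentioned in the answer.

```r
# elapsed[i, k] holds the wall-clock seconds spent in section i
# during iteration k.
n_iter <- 5
elapsed <- matrix(0, nrow = 2, ncol = n_iter,
                  dimnames = list(c("simulate", "sort"), NULL))

for (k in seq_len(n_iter)) {
  t0 <- proc.time()[["elapsed"]]
  x <- runif(1e5)                  # section 1
  t1 <- proc.time()[["elapsed"]]
  y <- sort(x)                     # section 2
  t2 <- proc.time()[["elapsed"]]

  elapsed["simulate", k] <- t1 - t0
  elapsed["sort", k] <- t2 - t1
}

rowSums(elapsed)  # total time per section across all iterations
```

The timing calls are placed by hand at statement boundaries, which is why this works for a known piece of code but does not generalise into an automatic line profiler.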

Resources