Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Related question
how to search for r materials
Part of the reason why the R community has been attracted to a tag-based service like StackOverflow, I think, is that information on R is fundamentally difficult to find online. Services like RSeek have made this slightly less painful; however, I often find the search results scattered.
Specifically, I am often curious whether R packages exist to meet a specific need I am facing. RSeek is useful for finding package documentation, but not for discovering new packages, and the R package manager is even less useful. As such, what are some best practices for searching for packages? That is, when I realize I have a need that my current set of R packages will not meet, I would like to search for a package that meets it before creating the functionality myself. What is the best way to proceed?
First, use help.search() or the shorthand ??. This will search the help files of installed packages. I often find I have a package installed that does what I want; I just haven't used it before.
Next, use the findFn function in the sos package. This function searches the help pages of packages covered by the RSiteSearch archives (which includes all packages on CRAN). These are ordered based on a relevance score, so the top few packages on the list are probably the most useful.
To look even further afield, use RSiteSearch() which will send your search to R site search. As well as CRAN packages, this covers the R-help mailing list archives, help pages, vignettes and task views.
Still no luck? Try Rseek.org. It covers more sites.
Finally, if all else fails, ask here on StackOverflow or send your question to the R-help mailing list.
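The escalation above can be sketched in R roughly as follows. The search term "regression" is only an illustration, the sos call assumes that package is installed, and the steps that need a network connection or open a browser are kept behind interactive().

```r
# Step 1: search the help files of packages that are already installed.
# help.search() is what the ?? shorthand calls, and it works offline.
topic <- "regression"                       # illustrative search term
local_hits <- help.search(topic)
print(unique(local_hits$matches[, "Package"]))

# Steps 2 and 3 query remote indexes and open a browser, so they are
# only attempted in an interactive session.
if (interactive()) {
  # Step 2: findFn() from the sos package (assumed to be installed).
  if (requireNamespace("sos", quietly = TRUE)) {
    remote_hits <- sos::findFn(topic)
    print(nrow(remote_hits))                # number of matching help pages
  }
  # Step 3: send the same query to R site search.
  RSiteSearch(topic)
}
```

Starting with the offline step keeps the search cheap; the remote steps are only worth running when the installed packages turn up nothing suitable.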
I believe crantastic.org is hoping to help people discover and collaboratively rate/discuss packages. It might be of use once it gets more traffic.
A new CRAN package is extremely helpful for this: check out the "sos" package.
CRAN task views (BioC uses them as well): http://cran.r-project.org/web/views/
This works well as long as you think of a package in the same way as the person writing the DESCRIPTION file.
http://versioneye.com is a cross-platform search engine for software libraries. R packages are in the index, too. You can rate and comment on the packages. But the coolest feature is that you can follow your packages, and as soon as the next version is released you will get notified via e-mail.
There is a "Lucky" button, too, similar to Google's. It lets you discover new packages. More features for discovering and comparing packages are coming soon.
By the way, I am the CEO at VersionEye. I am always looking for feedback to improve the service.
There is a new service for searching through the documentation of all packages hosted on Inside-R: http://www.inside-r.org/packages.
Examples:
Neural networks
Violin plot
Via Revolutions Blog
I think you might already know this one (never assume!), but I use http://www.rseek.org/ quite a lot for this kind of issue. Generally I'll try to pick out some unique keywords for my task and search there.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 years ago.
Is there a way to share an R package anonymously that will work on Unix, Mac, and/or Windows (preferably all 3 and with the ease of having it on CRAN)?
Having an R package on CRAN so that analyses can be reproduced and methodology can be demonstrated and shared can be a big boost to the review of a manuscript submitted to a scientific/statistical journal (in my opinion and experience).
If that journal requires blinded reviews, how can I share the R package in a way that keeps the review blinded (traditionally, the DESCRIPTION file lists my name and email address, which would unblind the review)?
I have thought of the following options, all of which have drawbacks:
1) Go through the whole CRAN submission process with a pseudonym (fake name and throwaway email account) without using github (my github username is my last name). After the review is unblinded / paper is accepted, change the throwaway information to the correct information. I'm uncertain of the etiquette of this or how CRAN's policy would deem this practice.
2) Zip up the R package with no involvement of CRAN or Github and trust that the reviewer is interested and capable enough to install it from source on unix. There's a big difference between this and being able to type install.packages() and library() on the system the reviewer is familiar with, and manually creating and including zips for all platforms is tedious.
3) Don't make a package, just send code snippets and data and state in the manuscript that an R package is forthcoming (which is a weaker statement than, "here's the R package that is already on CRAN"; another drawback is that listed in item 2).
I've mentioned CRAN and Github because I'm most familiar with these repos. I'm open to other solutions.
There’s no need at all to have the package on CRAN [1], and there’s no way to submit packages anonymously to CRAN. Such a submission would be a big problem for CRAN in terms of maintainability. CRAN is simply not the correct platform for this.
Github has similar issues but in principle you could just create a separate Github account without providing identifying information.
However, this just sidesteps a bigger issue: How non-identifiable is your code really? More generally, the whole idea of double-blinded peer review is dogged by issues of identifiability of the research. I don’t think there’s a good solution (especially involving code review, but even in general) where the research is submitted anonymously. As such, I don’t think it’s worthwhile spending energy trying to make code submissions anonymous, to the detriment of software (maintenance) quality.
In cases where double-blind anonymous peer review is desired, the currently best option is to submit the code to a service that allows anonymous archival, such as Figshare, or submit an archive as supplementary material to the journal. It should not be a stretch to expect the reviewer to perform a simple
install.packages(path_to_file, repos = NULL, type="source")
… otherwise they may not be qualified to review the code anyway.
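Before archiving a package for blinded review, it may be worth checking which DESCRIPTION fields would give the author away. The sketch below is only an illustration: the package name, the placeholder author, and the helper identifying_fields() are hypothetical, and the list of fields it checks is an assumption rather than an exhaustive rule.

```r
# Hypothetical helper: report which potentially identifying fields are
# present in a DESCRIPTION file. The field list is an assumption.
identifying_fields <- function(description_path) {
  desc <- read.dcf(description_path)
  candidates <- c("Author", "Maintainer", "Authors@R", "URL", "BugReports")
  intersect(candidates, colnames(desc))
}

# Build a throwaway DESCRIPTION so the example is self-contained.
pkg_dir <- file.path(tempdir(), "blindedpkg")    # hypothetical package name
dir.create(pkg_dir, showWarnings = FALSE)
desc_path <- file.path(pkg_dir, "DESCRIPTION")
write.dcf(
  data.frame(
    Package = "blindedpkg",
    Version = "0.0.1",
    Title = "Methods for a Manuscript Under Review",
    Author = "Anonymous",
    Maintainer = "Anonymous <anonymous@example.org>",
    stringsAsFactors = FALSE
  ),
  file = desc_path
)

# Fields a reviewer would see; each one should hold only placeholder values.
found <- identifying_fields(desc_path)
print(found)

# Once R CMD build has produced a tarball, the reviewer-side step from the
# answer would look like this (placeholder path, so it is left commented out):
# install.packages("blindedpkg_0.0.1.tar.gz", repos = NULL, type = "source")
```

This does nothing about identifying details inside the code itself, which is the larger concern raised above.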
[1] In fact this isn’t even desirable (on the contrary, I find the cluttering of CRAN quite counter-productive; though “CRAN” has “comprehensive” in its name, ideally all its contents should be in the form of properly usable packages; in other words: quality, not quantity).
I am working on a project where I am fetching bulk data from Bloomberg, such as the stock of the 1000 highest valued US companies, and then computing summary statistics on them.
I would like to use R for the procedure and I am wondering which package would suit the task better, RBloomberg or Rblpapi.
This is what I think are the pros and cons of the packages:
RBloomberg
+ Has good Manual from 2010 and more SO questions
+ May be more stable since it's been around for longer
- May not work on new version of R, Requires Java
- Will likely not receive new functions and support
Rblpapi
+ Faster, does not require Java
+ Will likely receive new functions
- If the package is updated significantly, I may have to rewrite my code
In addition, is the functionality of the two packages equivalent?
Thank you for your input.
These opinion based questions are not always the best fit for Stack Overflow but this may help you:
1) This debate may be of use: in 2014 Whit, one of the writers of Rblpapi, said to go with Rbbg until the functionality is more developed.
2) @Dirk Eddelbuettel's write-up explains the history of these packages. Dirk explains how the collaborators are linked, from Dirk to Ana to John to Whit, so there is a lot of idea sharing between the two packages.
3) Only the binaries of Rbbg, not the source, are available, which can be a problem for non-Windows users (please see @GSee's comments). Also, packages like packrat for sandboxing do not like the lack of src files for Rbbg. (Others might comment on a workaround for this.)
Disclaimer: I do not use Rblpapi yet so I cannot judge it.
Occasionally I see small ways I could improve either R (recently the IQR command) or the R documentation (just this week, perhaps elaborating differences among and better interconnecting aggregate, tapply, and by). But I don't see a way to really make that contribution back. I looked into the developer site and it seems that my options are either to attempt to become a full-fledged developer or to create packages, neither of which fits what I wish to accomplish.
I did propose IQR changes on the R mailing list but got no response so I figure that's going nowhere.
And to clarify, I'm talking about base-R. Additional packages are another matter.
Any tips?
Send (or CC) to r-devel. Traffic is quite high on r-help, and things can be overlooked there.
File a bug under the wishlist category detailing the improvement you would like to see.
Having filed the bug, try to provide a patch against the R code and/or documentation as appropriate. I've done this before where there was a problem or infelicity in R, supplied a patch and a fix to the help files/manual and had the changes accepted (after suitable modification) by R Core.
If it is an addition to the R code base, you are going to have to show that there is a real pressing need for the addition. Basically you are asking R Core to maintain your code in perpetuity, and they are unlikely to do that unless you can demonstrate a need.
If it is an addition, look for a popular R package that does similar/related things and suggest to the package maintainer that they include your function. That way you don't need to start a whole package for something simple but can still contribute your code. There are several popular *misc packages on CRAN, for example.
If you want to contribute fixes to the R documentation and/or manuals, provide patches to the sources. You can find the sources at svn.r-project.org/R
Hopefully that gives you some ideas. Patches and code always help!
How about patches to existing packages?
How about open bug reports on packages? R-Forge projects don't seem to use the issue trackers much, but some folks on the RPostgreSQL team I'm on enabled it (where it is hosted on Google Code), and it has been helpful -- see here. And we had a really useful inflow of fresh blood with a rocking new developer from Japan, probably in part because of the visibility of the project there.
In essence, try to find a project / group / team to become acquainted with and join. In that sense, this is just like any other Open Source project. The r-devel list (gmane view) is a good place for R development in general.
The R Core team, on the other hand, is a little more closed and per invitation only and unlikely to change. So be it, for better or worse. It has worked so far, and hence I am not among those who bemoan this loudly.
I've been using R for a little over a year now and it's been a successful venture. But all too often, I find that there is something that I can't figure out for lack of knowing how to find it or an example of it.
Stackoverflow,
Could you recommend a pathway for learning R in a manner that provides one with a toolset at their disposal to solve problems of a statistical nature?
There's a wealth of knowledge on the internet, between the r-project website and the mailing lists, but it seems to be "everywhere" and nowhere when you're actually looking for it.
For example, when I first started using R, I went through "Intro to R". Then I read the language definition (which obviously hasn't sunk in). But every time I ask a question on Stackoverflow I'm presented with some new badass function that is the solution to all my problems in the short term. My question is, how did you know these functions existed in the first place? And how does one go about finding them? Presumably, you read something or found some resources that detoured your learning to the exponential part of the curve. What was it?
Obviously, R's functionality as a statistical tool is broad. For my own purposes I work mostly with economic or financial data. Hence, answers with this in mind would be most helpful.
Completely biased response: learn plyr, reshape2 and ggplot2. They will cover 90% of your data manipulation and visualisation needs. All three packages have a consistent philosophy of data (which the ggplot2 book touches upon), and are designed to be consistent and easier to learn.
Rather than learning many specialised functions, I really encourage you to learn about simple functions that can be flexibly composed to solve a wide range of problems. This is what plyr strives to do for data manipulation, and what ggplot2 strives to do for visualisation. It does mean you need to invest more time up front to learn a little about the underlying theory, but it's my belief that it will pay off handsomely in the long run.
Here is how I learned R.
R resources:
To learn R, the most important resource is Google. Search for: “TOPIC r-project”, “TOPIC filetype:r”, or “TOPIC site:nabble.com”.
Second, look at the example code provided with most packages. Go to “http://bm2.genes.nig.ac.jp/”, search for a topic, and look at the example code. Run it and adapt it; this way you can often solve part of your problem.
Third: the r-help mailing list. Read the posts; the basic questions get asked over and over again. If you have a problem and you are completely stuck, ask a question on the mailing list.
Finally, look at the source code of the R packages. That’s the hardest part. If you can alter the code to your needs, you have mastered R ;-)
Some Tips:
R has a steep learning curve. That’s a feature ;-) It is designed to solve advanced problems, and in the end you are faster than when using an alternative to R.
Know every single R package and function that is relevant to your problem. The strength of R is that there are so many packages available (around 2000, I think). Usually there is a package that’s more suited or that already solves your problem. (Some help pages are badly written and hard to understand; I got used to it.)
R books are not helpful in learning R. Yes, that’s true. If you are an expert programmer and expert statistician, you don’t need any book on R (the only exception is Hadley Wickham’s ggplot2 book). If you are not, learn programming in general and/or advanced statistics.
Some R packages have known bugs that nobody will fix (package owner left university, etc.). Just a warning: this can be tricky if you are looking for a bug in your code and the bug is in an R package.
I'll start with this:
My question is, how did you know these functions existed in the first place?
Simple - we tried to solve a similar problem and came across that function. It either suited or didn't suit our needs but we now know it's there. I haven't used R much personally but what you're describing is the learning curve for every programming language ever. Firstly, you learn the "grammar" i.e. what you can do. Then you try to do something. You find you can't.
At that stage a programmer has a number of options. What do I do personally? Depends. I'll try and look up that package/header/library/whatever's member functions to see if something suits my needs. I might Google it, because unless you're really pushing the boundaries someone somewhere has probably tried and failed to do it before and had their question answered. If you are pushing the boundaries, someone somewhere has probably tried and failed before, but got no answer. I might try a forum or two to see what happens. I personally don't use IRC much, but that's another option, as are mailing lists depending on how specialised the problem is.
I also have a folder on my computer full of books which I search through depending on the problem and a small library of books I look through/learnt from, which often contain practical, not-quite-there-but-adaptable examples.
My only comment would be attempting to read the language specification is unlikely to be massively useful to you as a beginner. You won't fully understand what it means because you haven't pushed the bounds and tried things yet. For example, a novice in C might try this:
char c = '7';
int x = (int) c;
to convert the character '7' into an integer form. It's not a bad thought process until you understand how characters and ASCII work, then you see why the above doesn't give you what you want.
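The same trap can be illustrated in R, which keeps the two ideas in separate base functions. This is only a side-by-side sketch; the variable names are illustrative.

```r
ch <- "7"

# Code point of the character, the analogue of the C cast above.
code_point <- utf8ToInt(ch)

# Numeric value the character spells out, which is what was wanted.
digit <- as.integer(ch)

# The usual C-style fix, subtracting the code of "0", gives the same digit.
digit_by_offset <- utf8ToInt(ch) - utf8ToInt("0")

print(c(code_point = code_point, digit = digit, digit_by_offset = digit_by_offset))
```

The cast-style call returns the character's position in the ASCII table, while the parsing call returns the number the character represents.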
In short, I think this is going to be part of the learning process and I don't think you can cut it any shorter. The consolation is like any research, the more you do it the more you'll know where to look and what questions to ask on various communities.
One of the things I do is follow the RSS feed of R questions on SO (https://stackoverflow.com/feeds/tag/r). Then I can browse what other people have asked/answered.
Often I will favourite a particular question/answer if I think I'll use it, or jot down the salient points in my notebook software (OneNote). Occasionally I'll even try the question/answer out myself.
EDIT:
I'd also recommend Patrick Burns's book The R Inferno. It's not so much a training book as a description of all the gotchas and oooh moments Patrick has found (so far).
There's a free book you might be interested in: Introduction to Probability and Statistics Using R
Here is a good list of resources for learning R:
https://stats.stackexchange.com/questions/138/resources-for-learning-r
Also, that website in general is a good resource.
In general I would say that following a mailing list, or a help list is the best way I have found for learning new things. (That and the "R magazine": http://www.r-bloggers.com )
Learning the RODBC package to interact directly with Oracle data made a big impact at my job. My boss was amazed when I pulled Oracle data directly into R and cranked out a plot in only a few lines of code. Try doing that in Excel!
Moral of the story, learn how to pull in data and manipulate it within R. Then move to some of the cooler stuff like ggplot.
I can recommend Penn University's Introductory Course on R.
The ggplot chapter alone is worth reading - I found ggplot very confusing but this is a great explanation.
The book that helped my learning the most was The Art of R Programming. A lot of programming books can be dry. Since R is commonly an entry point to programming, it's important for the voice of materials to resonate with the student. That book did just that with me. The voice felt very casual and I liked that.
Some interesting links:
Intro, links and examples: http://manuals.bioinformatics.ucr.edu/home/programming-in-r
A lot of documentation: https://en.wikibooks.org/wiki/R_Programming
R forum: http://r.789695.n4.nabble.com/
The [R] tag FAQ, right here on Stackoverflow, https://stackoverflow.com/questions/tagged/r?sort=frequent provides numerous reproducible examples that one can use to "learn by doing".
Most of the problems are very common and will eventually be something that you will have to look up as a beginner. The FAQ also provides highly literate (and experienced) examples of usage for a diverse range of functions and useful packages.
If you're new to R, and you prefer a more hands on approach to learning, the FAQ should not be overlooked as a potential resource for learning. Many of the questions also provide useful discussion surrounding paradigms of the language itself (vectorization, workflow, debugging are just a few examples).
Nearly every question in the FAQ is worth studying as a new user as it touches on elements that, speaking for myself, I wish I had been pointed to when I asked this question originally.
Just a few examples:
How to make a great R reproducible example
Grouping functions (tapply, by, aggregate) and the *apply family
Workflow for statistical analysis and report writing
How to sort a dataframe by multiple column(s)?
What is your favorite R debugging trick?
I am a devoted R (r-project.org) user, and love infographics.
I just came across this article:
http://www.noupe.com/design/fantastic-information-architecture-resources.html
Giving a long list of resources for information designers.
And it raised in me the desire to do more beautiful (not just informative) R plots.
Do you have any suggestion/resources on how to make this leap?
What books/software/skills do I need to have/develop in order to be able to make beautiful infographics?
Here's a list of resources that I would suggest:
Tufte's books are really excellent, although my favorite is actually his second book: Envisioning Information. Separately, I always found the periodic table of visualization methods to be entertaining. Ross Ihaka also taught a course on this subject in the past.
For R, learn ggplot2. The learnr.wordpress.com blog is an excellent resource for this. You might consider the ggplot book and the original Grammar of Graphics book.
Here's another useful article from the same site that you linked in your question: Data Visualization: Modern Approaches.
Some good blogs on the subject:
http://www.informationisbeautiful.net/
http://flowingdata.com/
In some cases, you might want to do your data manipulation in R, but create the visualization with another tool (see, for instance, this list). Here are some of the best tools that I have found over the years:
Processing
Prefuse
Protovis
Lastly, an interesting open visualization platform was Many Eyes.
You might want to look into using R to create your underlying graphics and then saving them in an editable format (like svg). Then use a more art-focused application (like Inkscape) to edit your svg and make it beautiful. See my previous question for an example using Cairo. I'd also +1 the learn ggplot2 from Shane.
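A minimal sketch of that workflow, using only the svg() device from grDevices. It assumes R was built with cairo support (checked with capabilities()); the file name and the choice of the built-in cars data set are illustrative.

```r
out_file <- file.path(tempdir(), "cars-plot.svg")    # illustrative file name

if (capabilities("cairo")) {
  # Open an SVG device; width and height are given in inches.
  svg(out_file, width = 6, height = 4)
  plot(cars$speed, cars$dist,
       xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
       main = "Built-in cars data")
  dev.off()                        # close the device so the file is written
  print(file.exists(out_file))
} else {
  message("This R build has no cairo support, so svg() is unavailable.")
}
```

The resulting file is plain vector markup, so text, colours and line weights can be adjusted afterwards in an editor such as Inkscape.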
If the R side of your skills is pretty good, then you'll definitely want to start reading Edward Tufte's books, particularly The Visual Display of Quantitative Information and Beautiful Evidence, both of which provide excellent insights into how to present data effectively and efficiently.
You should be somewhat forewarned that everyone has a different idea of "beautiful," however. Tufte is a big believer in maximizing a quantity he calls the "data-ink ratio": how much of the page's ink is dedicated to data instead of what he calls "chartjunk". This causes his work to have a sleek, minimalist oeuvre that certainly makes it easier to digest everything but that some people may find too utilitarian. But for Tufte, function and form are pretty close to one thing: the more it helps you, the more beautiful and elegant it is.
The tikz manual contains a few dozen pages on how to make good graphics (using tikz). So it's not entirely specific to R, but the ideas are interesting and worked out with examples.
I don't know what's the ultimate goal of your visualizations, but if LaTeX is involved (and when is it not involved for beautiful typography? ;)) it can be a good idea to rework your graphics in a vector graphics language, as Shane suggested.
I am using Tableau Public; it is free software for creating charts and maps. The charts are not stunning, but I like the maps; I do not know of better free software for creating info maps.
A good option is to make your own Flash components for Visual Studio and work with them in ASP.NET.