I'd like to create some GIS plots, and I'm wondering if R can be used for this. Here are some examples, similar in concept to the plots I'd like to make:
A temperature plot (or contour plot) of the United States, with color (or height) determined by state GDP. Thus, state boundaries would give discontinuities in the resulting plot.
A temperature plot of the United States where altitude is used for data. In this case, the resulting plot should vary smoothly across state boundaries.
The sum of the above 2 plots (with some scaling applied).
I'm just starting to learn R and want to know whether it would be the right tool for this kind of job. Looking at coord_map in ggplot2, it looks like superimposing data onto a map of the US is possible, but getting the data to respect state boundaries could be very difficult.
Any advice?
First, there are the maps, mapproj and maptools packages, which give you a wide variety of map functions, projections, and so on, to create just about any map you can think of.
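For instance, your first example (a state-level choropleth with hard discontinuities at the borders) takes only a few lines with ggplot2 plus the maps package. A minimal sketch, with made-up GDP values purely for illustration:

library(ggplot2)
library(maps)

states <- map_data("state")                       # state polygon outlines
gdp <- data.frame(region = unique(states$region),
                  gdp = runif(length(unique(states$region))))  # fake GDP values

choro <- merge(states, gdp, by = "region")
choro <- choro[order(choro$order), ]              # restore polygon drawing order after merge

ggplot(choro, aes(long, lat, group = group, fill = gdp)) +
  geom_polygon(colour = "white") +
  coord_map()   # fill is constant within each state, so boundaries appear as discontinuities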
Then there is the sp package, which, among other things, allows you to plot any kind of data you load from the GADM database.
But most of all, there is R's spatial projects page, which gives you a whole lot more information, including links to mailing lists, to get going with R and spatial data. And if that's not enough, there is the CRAN Task View for spatial data, listing 100+ packages to do what you want to do.
Think you've got it all now? There is more! Both books for sale and free blogs can help you find out how to do what you want to do. And if you have a specific question, you can always come to Stack Overflow, or use any of the mailing lists to get some more help.
So you see: This is R. There is no if. Only how. (Simon Blomberg)
powered by googling.
It is still at the alpha stage, but the Rgis project (composed of the R packages terrain, RemoteSensing, gdistance, ...) looks very promising. You can test the packages on R-Forge.
For handling raster data (DEM, altitude, ...) there is the excellent raster package, and for other tasks like polygon clipping and more complicated operations you can use rgeos (a binding to the GEOS library), maptools (for format exchange) or PBSmapping. And of course there is the sp package; its companion book, Applied Spatial Data Analysis with R (Bivand, Pebesma and Gómez-Rubio 2008), is a must.
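For example, reading and contouring a DEM with raster takes just a few lines (the file name here is a placeholder for any GDAL-readable altitude grid):

library(raster)

dem <- raster("altitude.tif")   # placeholder file name
plot(dem)                       # colours vary smoothly, ignoring state boundaries
contour(dem, add = TRUE)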
Alternatively, you can also link R to GIS software like GRASS (spgrass6) or SAGA (RSAGA), and even QGIS and ArcGIS, but I don't use them.
Finally, you should take a look at the CRAN Spatial task view: http://cran.r-project.org/web/views/Spatial.html
You might also want to look at this.
Integrating External programs with Modelbuilder
Using R in ArcGIS 10
and a thread from Roger Bivand with useful links, advice, and some code for raster import.
I am building an interactive function that will repeatedly build and plot reasonably complicated ggplot2 plots.
Users provide input (rotation angles for a PCA loadings matrix, actually), and I'd like to show them rotated results asap.
Unfortunately, plotting the plots with ggplot2 is quite sluggish.
Note:
There is emphatically not a lot of data (<100 data points or so), so pre-processing won't help (that's the issue on this and a lot of other SO ggplot2 performance posts).
I have to stick with ggplot2 for now. (I know, I know, ggobi etc. ...).
I do know the range of possible inputs in advance (0-360): that's a very finite number.
I have cached the ggplot-generating functions with memoise, but that doesn't seem to help much; the problem seems to be the actual plotting on the graphics device.
(I also noticed that the internal graphics device of RStudio is particularly sluggish).
So I thought maybe it'd be an idea to somehow pre-render all the necessary plots, perhaps by saving the output of the svg() graphics device to files, and then to display those cached versions as needed.
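Something like this, where make_rotated_plot() is a stand-in for my real (memoised) plot-building function:

dir.create("cache", showWarnings = FALSE)
for (angle in 0:359) {
  svg(sprintf("cache/plot_%03d.svg", angle))
  print(make_rotated_plot(angle))   # stand-in for the real ggplot2 call
  dev.off()
}
# ...then display cache/plot_042.svg (etc.) on demand instead of re-plotting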
On a scale of 1-10, how stupid of an idea is that?
Any better ideas?
Will this even speed up the plotting, or will the graphics device still be the bottleneck?
Why can't we have hardware acceleration in R :(.
Update
This is not hosted software (for now); I'm working locally, and it should work from any number of clients and on any number of platforms.
I am aware of (the much faster) ggvis and ggobi, but these are not an option for now (development bandwidth is too small).
There are actually several, relatively complicated, nested (grid.arranged) plotting functions, and those were memoised at some point – with no noticeable speed increase.
Opening pre-rendered files in an external file viewer seems to endanger cross-platform appeal – correct?
I have a river network in a shapefile (class: "SpatialLinesDataFrame"), with some points on it (see picture below).
I would like to compute the distances between points, but along the rivers. I have been searching a lot and I am not able to find any function that allows directly that.
The closest thing I have found is the networkdistance function in the secrlinear package; however, I haven't managed to transform my shapefile into the format required to use it (a "linearmask" object).
Any help with this would be extremely appreciated.
Thanks in advance,
Tina.
I know this is an old thread, but just in case someone runs across this in the future: I just released an R package (riverdist) that deals with this issue and also provides some tools for network editing, data summaries, and visualization. It was written with fisheries work in mind, but it could probably be applied to what you're working on, or at least that's the hope!
https://cran.r-project.org/web/packages/riverdist/vignettes/riverdist_vignette.html
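A rough sketch of the workflow, following the vignette (the layer name and point data are placeholders, and the shapefile needs to be in a projected coordinate system):

library(riverdist)

rivnet <- line2network(path = ".", layer = "rivers_utm")            # import the river shapefile
pts <- xy2segvert(x = mypoints$x, y = mypoints$y, rivers = rivnet)  # snap points to the network

# network distance between the first two points
riverdistance(startseg = pts$seg[1], startvert = pts$vert[1],
              endseg = pts$seg[2], endvert = pts$vert[2],
              rivers = rivnet)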
Sorry this wasn't more timely -
I think we resolved this problem offline: the geographic coordinates (lat/long) of the shapefile needed to be projected before they could be used in secrlinear. That package approximates the linear network and uses igraph functions for distances.
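For reference, the projection step looks something like this (the EPSG code is only an example; pick one appropriate for your region):

library(sp)
library(rgdal)

# rivers_ll is the SpatialLinesDataFrame read from the shapefile
proj4string(rivers_ll) <- CRS("+proj=longlat +datum=WGS84")    # declare the lat/long CRS if missing
rivers_utm <- spTransform(rivers_ll, CRS("+init=epsg:32633"))  # e.g. UTM zone 33N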
Before any of you rush to cast a closing vote, let me say that I understand this question may be subjective, and the expected answer may begin with "it depends". Nevertheless, it is a genuinely relevant problem I run into, as I am creating more and more graphs without necessarily knowing exactly how I am going to use them, or without the time to test the final use case immediately.
So I am leveraging the experience of SO R users to get good reasons to choose among jpeg(), bmp(), png(), tiff() and pdf(), and possibly with which options. I don't have enough experience in R, or knowledge of the different formats, to choose wisely.
Potential use cases:
quick look after or during run time of algorithms
presentations (.ppt mainly)
reports (word or latex)
publication (internet)
storage (without too much loss and to transform it later for a specific use)
anything relevant I forgot
Thanks! I'm happy to make the question clearer.
To expand a little on my comment: there is no real easy answer, but here are my suggestions:
My first totally flexible choice would be to simply store the final raw data used in the plot(s) and a bit of R code for generating the plot(s). That way you could easily enough send the output to whatever device that suits your particular purpose. It would not be that arduous a task to set yourself up a couple of basic templates based on png()/pdf() that you could call upon.
Use the svg() device. As noted by @gung, storing the output using pdf(), svg(), cairo_ps() or cairo_pdf() are your only real options for retaining scalable vector images. I would lean towards svg() rather than pdf() because of the greater editing options available in programs like Inkscape. It is also becoming a widely supported format for internet publication (see http://caniuse.com/svg).
If, on the other hand, you're a LaTeX user, most headaches seem to be solved by going straight to pdf(); you can usually import and convert PDF files using Inkscape or command-line utilities like ImageMagick if you have to shift formats.
For Word/Powerpoint interaction, if you are running R on Windows, you can also export directly using win.metafile() which will give you scalable/component emf images which you can import into Word or Powerpoint directly. I have heard of people running R through Wine or using intermediary steps on Linux to get emf files out for later use. For Mac, there are roundabout pathways as well.
So, to summarise, in order of preference:
Don't store images at all; store the code (and data) needed to generate the images (see the sketch below).
Use svg/pdf and convert formats as required.
As a backup, export directly with win.metafile for those cases where you can't escape Word/PowerPoint and are primarily based on Windows systems.
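To illustrate the first two points, something along these lines (the file names are arbitrary, and the stored object could equally be raw data plus a plotting script):

library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
saveRDS(p, "fig1.rds")    # store the plot object rather than an image

p <- readRDS("fig1.rds")  # later: render to whatever device the use case demands
svg("fig1.svg", width = 6, height = 4); print(p); dev.off()
pdf("fig1.pdf", width = 6, height = 4); print(p); dev.off()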
So far the answers for this question have all recommended outputting plots in vector based formats. This will give you the best output, allowing you to resize your image as you need for whatever medium your image will end up in (whether that be a webpage, document, or presentation), but this comes at a computational cost.
For my own work, I often find it is much more convenient to save my plots in a raster format of sufficient resolution. You probably want to do this whenever your data takes a non-trivial amount of time to plot.
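In practice this just means fixing the device dimensions up front; for example (the size and resolution below are one common print-quality choice, and the plotted object is illustrative):

png("manhattan.png", width = 10, height = 5, units = "in", res = 300)
plot(pvals_by_position)   # your expensive plot goes here
dev.off()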
Some examples of where I find a raster format is more convenient:
Manhattan plots: plots showing p-value significance for hundreds of thousands to millions of DNA markers across a genome.
Large Heatmaps: Clustering the top 5000 differentially expressed genes between two groups of people, one with a disease, and one healthy.
Network Rendering: When drawing a large number of nodes connected to each other by edges, redrawing the edges (as vectors) can slow down your computer.
Ultimately it comes down to a trade-off in your own sanity. What annoys you more: your computer grinding to a halt trying to redraw an image, or figuring out the exact dimensions to render a raster image so it doesn't look awful in your final publishing medium?
The most basic distinction to bear in mind here is raster graphics versus vector graphics. In general, vector graphics will preserve options for you later. Of the options you listed, jpeg, bmp, tiff, and png are raster formats; only pdf will give you vector graphics. Thus, that is probably the best default of your listed options.
The discussion in this question is the direct cause of me asking this one. The more general reason is that I often have to explain R to people who are only familiar with SPSS. I know most of the basics of SPSS, as we still use it in the introductory statistics course, but as I'm more of an R guy, it's difficult to know how SPSS users experience their first encounter with R.
I know there is the book R for SAS and SPSS Users, which already contains some information. Still, I would like to know what the more difficult parts are when you switch from SPSS to R.
Or, in other words: if you had to explain R in one day to SPSS users, which topics would you focus on? This is not a hypothetical question, by the way (yeah, I know, just because one gets paid for it doesn't mean it always makes sense...).
Firstly, data manipulation has been the most challenging thing to learn coming from SPSS/SAS to R. I've found, personally, that getting the data in the right shape for an analysis is usually much more difficult than the analysis itself. Secondly, a true understanding of how to deal with categorical values through the use of factors. Lastly, summary statistics and descriptives can sometimes be challenging to get into a format that is transferable to PPT or Excel, which are what (my) clients generally expect/demand for reporting.
I would focus on:
1 Data manipulation
Understanding data structures. Import/export. Then in-depth training on the use of packages like plyr and reshape, with a particular focus on how to effectively use cast with formulas and melt with ids, and how to apply functions within a data.frame using ddply.
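A tiny end-to-end illustration of melt/cast/ddply (the data are made up):

library(reshape)
library(plyr)

df <- data.frame(id = 1:4, group = c("a", "a", "b", "b"),
                 x = rnorm(4), y = rnorm(4))

long <- melt(df, id = c("id", "group"))           # wide -> long
cast(long, group ~ variable, mean)                # long -> summary table
ddply(df, "group", summarise, mean_x = mean(x))   # group-wise summary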
2 Factoring Data
In general, an explanation of recoding with epicalc or a user-defined function, plus an explanation of the significance of factors, levels, and labels.
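For example:

x <- factor(c(1, 2, 2, 3), levels = 1:3, labels = c("low", "mid", "high"))
levels(x)   # the categories
table(x)    # counts per category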
3 Descriptives
Take a few minutes to introduce xtabs(), table() and prop.table(), using cast() from reshape to create columnar tables that can be more reasonably exported to Excel.
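For instance, with a built-in dataset:

tab <- xtabs(~ gear + cyl, data = mtcars)
tab
prop.table(tab, margin = 1)   # row proportions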
Graphics are optional: if you've done a good job of the above, they should be able to get the data they need to create graphs in whatever software they are most comfortable with.
4 Graphics
If you've done a good job teaching the data manipulation, getting data into the shape needed for graphing should be pretty straightforward (or at least reproducible) at this point. ggplot2 is complicated and requires a day just by itself to be played with. But it is possible to give a quick overview of it. Alternatively, base graphics are simple to understand and the help is much more clear on what things do and how the syntax works.
Note: I left out statistical analysis. However, an overview of lm(), and perhaps anova() or cor(), would be a helpful starting point. But this should be explained at the same time as data manipulation.
Although I "wrote the book" on R to SPSS migration, that was aimed at programmers and most SPSS users that I know prefer to "point-and-click" instead. A graphical user interface like Deducer (or R Commander) can help them feel at home while teaching them how R programming code works if they want to see it. Deducer's Plot Builder also does a nice job letting you create complex plots easily, and if you want to learn to ggplot2 code, it will show you that as well. Ian did a great job with it!
However, while the SPSS graphical user interface covers 98% of what SPSS can do, Deducer covers perhaps 1% of what R can do. That's probably still 75% of what your average researcher needs, but R is so broad that to get the most out of it people will need to learn to program. The free version of my book, "R for SAS and SPSS Users" is only 80 pages & covers the areas of programming that I think are most likely to confuse beginners. It's at http://r4stats.com.
Just recently I had a student who was somewhat versed in statistics and had done some analysis beforehand in SPSS. I then showed him how to do the exact same thing in R. We went through the code and plotting, explaining and debating each line. He realized how easy and convenient it is to do in R. Thus, the R community grew by 1. :)
The biggest issue that the researchers I've dealt with have is the lack of point-and-click GUI. While there are a number of efforts out there in the R community, none of them have reached the ease-of-use/power level that SPSS has.
Since coding is second nature to R users, sometimes we forget that the majority of users of statistical software can't program (and would avoid it like the plague), even though they may have a strong practical understanding of statistics.
If I had one day to bring an SPSS user into R, I'd start them on Deducer. Deducer is an R GUI project (Self promotion note: I'm the author) that should feel very familiar to a user coming from SPSS. As they find themselves needing more advanced functions, they will naturally move to the command line to fulfill their needs.
Lately I have seen some cool examples of mapping in R and wanted to give this a shot. I currently have ArcView at work, but my spatial join is not working correctly (most likely user error).
Objective: I need a list of countries and the World Region each belongs to. I have two layers (one with country detail, the other with region detail) and wanted to join the world region assignment onto each country. The join isn't working, so I figured I would come to the R community.
What are my options? This is my first attempt at doing any mapping in R and maybe there is an easier/better solution. Eventually I want to take lat/long data and map it as well.
Any insight will be much appreciated.
Brock
See the Spatial task view on CRAN, and packages like maps/mapdata, sp, rgdal, raster, blighty, rworldmap, RgoogleMaps, etc.
Do you have shapefiles you want to read? First get rgdal installed, or look at other options like maptools and shapefiles if that is difficult on your platform. Read functions in these packages will provide Spatial*DataFrame objects.
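As a sketch of your region-onto-country join (file and attribute names are placeholders, and both layers are assumed to share the same CRS):

library(rgdal)
library(sp)

countries <- readOGR(dsn = ".", layer = "countries")     # countries.shp
regions <- readOGR(dsn = ".", layer = "world_regions")   # world_regions.shp

# point-in-polygon overlay of country label points on the region polygons
pts <- SpatialPoints(coordinates(countries),
                     proj4string = CRS(proj4string(countries)))
countries$region <- over(pts, regions)$REGION   # REGION is a hypothetical attribute column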
For information on the Spatial classes:
library(sp)
vignette("sp")
spatstat also has a lot of support for spatial data, and another vignette for converting to/from sp:
library(spatstat)
vignette("shapefiles")
The PBSmapping package is another good place to start. They have pretty extensive documentation and a great reference manual as well.