Manage circle size in a plot using symbols - R

I am using the symbols function in R to draw circles on a map, which has been imported as a plot.
According to the function's documentation, circle radii are scaled based on the maximum value in the data set.
I am plotting the same map for different time periods (different data sets) and I want the maps to be comparable, meaning that a given circle radius refers to the same value in all of the maps. Is there a way I can manage the circle scaling?
Thanks
This is my code:
# for the first map, 2010
plot(my_map)
symbols(data2010$Lon, data2010$Lat, circles = data2010$number, inches = 0.25, add = TRUE)
# then the map for 2011
plot(my_map)
symbols(data2011$Lon, data2011$Lat, circles = data2011$number, inches = 0.25, add = TRUE)

The manual page suggests that setting inches=FALSE will accomplish what you want. Since you did not provide a sample of your data, we have to use data already available. This data set is used in the Examples on the manual page for the symbols() function:
data(trees)
str(trees)
# 'data.frame': 31 obs. of 3 variables:
# $ Girth : num 8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
# $ Height: num 70 65 63 72 81 83 66 75 80 75 ...
# $ Volume: num 10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...
Since we only have one sample, we can plot the symbols with and without the 31st row, which has the largest girth.
with(trees, symbols(Height, Volume, circles = Girth/24, inches = FALSE))
Now add the data without row 31:
with(trees[-31, ], symbols(Height, Volume, circles = Girth/24, fg="red", inches = FALSE, add=TRUE))
We can tell that the scaling is the same because the red circles match the black circles even though the largest girth is missing from the second plot. For this to work you will have to specify the same values for xlim= and ylim= in each plot.
Run this code again replacing inches=FALSE with inches=.5 to see the difference.
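Applied to the maps in the question, that approach might look like the sketch below. It assumes my_map, data2010 and data2011 exist as in your code; the divisor of 2 is an arbitrary placeholder you would tune once and then reuse unchanged for every year, so that a given value always maps to the same radius in map units:
# one fixed divisor shared by all years
scale_div <- 2  # assumed value; tune once, keep identical for every map
plot(my_map)
symbols(data2010$Lon, data2010$Lat, circles = data2010$number / scale_div,
        inches = FALSE, add = TRUE)
plot(my_map)
symbols(data2011$Lon, data2011$Lat, circles = data2011$number / scale_div,
        inches = FALSE, add = TRUE)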

Related

How to sort vector into bins in R?

I have a vector that consists of numbers that can take on any value between 1 and 100.
I want to sort that vector into bins of a certain size.
My logic:
1.) Divide the range (in this case, 1:100) into the number of bins you want (let's say 10 for this example)
Result: (1,10.9], (10.9,20.8], (20.8,30.7], (30.7,40.6], (40.6,50.5], (50.5,60.4], (60.4,70.3], (70.3,80.2], (80.2,90.1], (90.1,100]
2.) Then sort my vector
I found a handy function that almost does all this in one fell swoop: cut(). Here is my code:
> table(cut(vector, breaks = 10))
(0.959,10.9] (10.9,20.8] (20.8,30.7] (30.7,40.5] (40.5,50.4] (50.4,60.3] (60.3,70.1] (70.1,80] (80,89.9] (89.9,99.8]
175 171 117 103 82 67 54 46 39 31
Unfortunately, the intervals are different from the bins we calculated for the possible range (1:100). So I tried fixing this by adding that range into the vector:
> table(cut(c(1,100,vector), breaks = 10))
(0.901,10.9] (10.9,20.8] (20.8,30.7] (30.7,40.6] (40.6,50.5] (50.5,60.4] (60.4,70.3] (70.3,80.2] (80.2,90.1] (90.1,100]
176 171 117 104 82 66 54 48 38 31
This almost worked perfectly, except for the left-most interval, which starts from 0.901 for some reason.
My questions:
1.) Is there a way to do this (using cut or another function/package) without having to insert artificial data points to get the specified bin ranges?
2.) If not, why does the lower bin start from 0.901 and not 1?
Based on your response to #Allan Cameron, I understand that you want to divide your vector into 10 bins of the same size. But when you only give cut() the number of breaks, the interval widths it calculates can differ across the groups. As #akrun said, this happens because of the way the function computes the breaks when only the number of breaks is specified.
I do not know if there is a way to avoid this within the function. But I think it will be easier if you define the bins yourself, as #Gregor Thomas suggested. Here is an example of how I would approach it:
vec <- sample(1:100, size = 500, replace = T)
# Here I suppose that you want to divide the data in
# intervals of the same length
breaks <- seq(min(vec), max(vec), by = 9.9)
cut(vec, breaks = breaks)
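One caveat with breaks that start exactly at min(vec): by default cut() uses intervals that are open on the left, so the minimum value itself comes back as NA unless you also set include.lowest = TRUE. A quick check, assuming vec and breaks as above:
table(cut(vec, breaks = breaks, include.lowest = TRUE))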
Another option would be the cut_interval() function from the ggplot2 package, which cuts the vector into n groups of the same length.
library(ggplot2)
cut_interval(vec, n = 10)
why does the lower bin start from 0.901 and not 1?
The answer is the first bit of the Details section of the ?cut help page:
When breaks is specified as a single number, the range of the data is divided into breaks pieces of equal length, and then the outer limits are moved away by 0.1% of the range to ensure that the extreme values both fall within the break intervals.
That 0.1% adjustment is the reason your lower bound is 0.901. The upper limit is nudged outward as well (to 100.099), but with the default dig.lab = 3 significant digits it prints as 100, and because the last interval is closed on the right (]) the value 100 still falls inside it.
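You can check the numbers directly; with 1 and 100 added, the range is 99 and 0.1% of 99 is 0.099:
1 - 0.001 * (100 - 1)    # 0.901, the adjusted lower limit
100 + 0.001 * (100 - 1)  # 100.099, which prints as 100 with 3 significant digits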
If you'd like to use other breaks, you can specify exact breaks however you want. Perhaps this:
my_breaks = seq(1, 100, length.out = 11) ## for n bins, you need n+1 breaks
my_breaks
# [1] 1.0 10.9 20.8 30.7 40.6 50.5 60.4 70.3 80.2 90.1 100.0
cut(vector, breaks = my_breaks, include.lowest = TRUE)
But I actually think Allan's suggestion of 0:10 * 10 might be what you really want. I wouldn't dismiss it too quickly:
table(cut(1:100, breaks = 0:10*10))
# (0,10] (10,20] (20,30] (30,40] (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
# 10 10 10 10 10 10 10 10 10 10

Area estimation using image in R

So I'm looking for a way to estimate the area of a region using only the image of a map. The reason I'm doing this is that I want to calculate the area that would be lost with a certain increase in sea level, and I can't find any metadata for that, only the maps (in image formats). Here is the link to such a map:
(source: cresis.ku.edu)
So what I have in mind is to convert this image to a grayscale image using the EBImage package and then use the pixel intensity as a criterion to count the number of pixels that represent potentially threatened area.
My question: is this possible? How can we use pixel intensity as a criterion? And is there another approach to solve this issue?
Also, if there is a way to gain access to the metadata used to plot such a map that I'm not aware of, please tell me.
Thank you, everyone.
Edit:
Thanks to hrbrmstr I was able to read the grid data into R using the rgdal package. In order to calculate the area I tried to use the rgeos package, but the data set from CRESIS doesn't include a shapefile. So how can we define the polygon and calculate the area?
Sorry if this question seems silly. This is the first time I've ever dealt with spatial data and analysis.
The data is in the ESRI grid files:
library(rgdal)
# open the ArcInfo binary grid (the w001001.adf file) as a GDAL dataset
grid_file_1m <- new("GDALReadOnlyDataset", "/full/path/to/inund1/w001001.adf")
# convert it to a SpatialGridDataFrame, resampled to 1000 x 1000 cells
grid_1m <- asSGDF_GROD(grid_file_1m, output.dim = c(1000, 1000))
plot(grid_1m, bg = "black")
grid_1m_df <- as.data.frame(grid_1m)
str(grid_1m_df)
## 'data.frame': 2081 obs. of 3 variables:
## $ band1: int 1 1 1 1 1 1 1 1 1 1 ...
## $ x : num -77.2 -76.9 -76.5 -76.1 -75.8 ...
## $ y : num 83.1 83.1 83.1 83.1 83.1 ...
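From there, one rough way to get an area (just a sketch; it assumes grid_1m and grid_1m_df from above, coordinates in longitude/latitude, and that band1 == 1 flags the inundated cells) is to approximate the area of each grid cell at its latitude and sum over the flagged cells:
library(sp)  # class of grid_1m (SpatialGridDataFrame)
cs <- grid_1m@grid@cellsize  # cell size in degrees: c(dx, dy)
# approximate km^2 per cell: one degree of latitude is ~111.32 km,
# and a degree of longitude shrinks by cos(latitude)
cell_km2 <- (cs[1] * 111.32 * cos(grid_1m_df$y * pi / 180)) * (cs[2] * 111.32)
# total area of the cells flagged as inundated (assumption: band1 == 1)
sum(cell_km2[grid_1m_df$band1 == 1])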

Plot a thin plate spline using scatterplot3d

Splines are still fairly new to me.
I am trying to figure out how to create a three-dimensional plot of a thin plate spline, similar to the visualizations that appear on pages 24-25 of Introduction to Statistical Learning (http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Sixth%20Printing.pdf). I'm working in scatterplot3d, and for the sake of easily reproducible data, let's use the 'trees' dataset in lieu of my actual data.
Setting the initial plot is trivial:
data(trees)
attach(trees)
s3d <- scatterplot3d(Girth, Height, Volume,
                     type = "n", grid = FALSE, angle = 70,
                     zlab = 'volume',
                     xlab = 'girth',
                     ylab = 'height',
                     main = "TREES") # blank 3d plot
I use the Tps function from the fields library to create the spline:
my.spline <- Tps(cbind(Girth, Height), Volume)
And I can begin to represent the spline visually:
for (i in nrow(my.spline$x):1)  # for every girth . . .
  s3d$points3d(my.spline$x[,1], rep(my.spline$x[i,2], times = nrow(my.spline$x)),  # repeat every height . . .
               my.spline$y, type = 'l')  # and match these values to a predicted volume
But when I try to complete the spline by cross-hatching lines along the height axis, the results become problematic:
for (i in nrow(my.spline$x):1)  # for every height . . .
  s3d$points3d(rep(my.spline$x[i,1], times = nrow(my.spline$x)), my.spline$x[,2],  # repeat every girth . . .
               my.spline$y, type = 'l')  # and match these values to a predicted volume
And the more that I look at the resulting plot, the less certain I am that I'm even using the right data from my.spline.
Please note that this project uses scatterplot3d for other visualizations, so I am wedded to this package as the result of preexisting team choices. Any help will be greatly appreciated.
I don't think you are getting the predicted Tps surface. That requires using predict.Tps:
require(fields)
require(scatterplot3d)
data(trees)
attach(trees) # this worries me. I generally use data in dataframe form.
s3d <- scatterplot3d(Girth, Height, Volume,
                     type = "n", grid = FALSE, angle = 70,
                     zlab = 'volume',
                     xlab = 'girth',
                     ylab = 'height',
                     main = "TREES") # blank 3d plot
my.spline <- Tps(cbind(Girth, Height), Volume)  # the Tps fit from the question
grid <- make.surface.grid(list(girth = seq(8, 22), height = seq(60, 90)))
surf <- predict(my.spline, grid)
str(surf)
# num [1:465, 1] 5.07 8.67 12.16 15.6 19.1 ...
str(grid)
# int [1:465, 1:2] 8 9 10 11 12 13 14 15 16 17 ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:2] "girth" "height"
# - attr(*, "grid.list")=List of 2
#  ..$ girth : int [1:15] 8 9 10 11 12 13 14 15 16 17 ...
#  ..$ height: int [1:31] 60 61 62 63 64 65 66 67 68 69 ...
s3d$points3d(grid[,1], grid[,2], surf, cex = .2, col = "blue")
You can add back the predicted points. This gives a better idea of x-y regions where there is "support" for the estimated surface:
s3d$points3d(my.spline$x[,1], my.spline$x[,2],
             predict(my.spline), col = "red")
There is no surface3d function in the scatterplot3d package. (And I just searched the R-help archives to see if I was missing something, but the graphics experts have always said that you would need to use lattice::wireframe, graphics::persp, or the 'rgl' package functions.) Since you have made a commitment to scatterplot3d, I think the easiest transition would not be to those but to the much more capable base-graphics-based package named plot3D. It is capable of many variations and makes quite beautiful surfaces with its surf3D function.
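In the meantime, the predicted surface built above can also be drawn without extra packages using base graphics persp(). This is only a sketch, assuming grid and surf exist exactly as constructed earlier (with girth varying fastest in the grid):
girth_vals  <- unique(grid[, "girth"])   # 15 values: 8..22
height_vals <- unique(grid[, "height"])  # 31 values: 60..90
# matrix() fills column-wise, matching the girth-fastest ordering of the grid
zmat <- matrix(surf, nrow = length(girth_vals))
persp(girth_vals, height_vals, zmat,
      theta = 40, phi = 25, expand = 0.6,
      xlab = "girth", ylab = "height", zlab = "volume")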

Plotting degrees in IDL

I'm using IDL 8.2
I have a list of positions (RA and Dec) of stars and I want to plot them on a figure, e.g.
37.9 ~ 37° 54' 0"
37.7 ~ 37° 42' 0"
I read in the positions (degrees) in as strings and extract the degrees, minutes and seconds into separate arrays. These are then used to convert the values to decimal degrees for plotting here.
I would like to also have the alternate axis labelled with degrees, i.e.
37.9 ~ 37° 54' 0"
37.7 ~ 37° 42' 0"
Is there a way to do this other than using something like PowerPoint?
Also, is there a better way to force the plot to be square using the plot procedure, other than scaling the axes the same?
A good solution was posted here:
https://groups.google.com/forum/#!starred/comp.lang.idl-pvwave/EsbGiqZnhRw
Effectively, it involves writing a function to generate user-defined tick marks.

R barplot label size of each sample

I would like to label each of the boxes in a barplot by its size (i.e. the number of observations in the data frame that fall in that group).
For example, if the first variable has 3 levels and the second variable has 4 levels, I would like 12 labels.
(Also, is it possible to control the size or position of these labels?)
Thank you for any help.
Here's one way to do it, using the data VADeaths as an example (it will be in your R workspace by default, or if not, use library(datasets)).
bar <- barplot(VADeaths)
text(rep(bar, each = nrow(VADeaths)), as.vector(apply(VADeaths, 2, cumsum)),
     labels = as.vector(apply(VADeaths, 2, cumsum)), pos = 3)
It looks like this:
To modify the size of the font you can use text(..., cex = 2) to make the labels twice the size they were, for example.
Now, let's explain this code so you know how to do it yourself!
First, let's look at VADeaths: it's a table of death rates (per 1000) in each age group, broken down by population group:
> VADeaths
Rural Male Rural Female Urban Male Urban Female
50-54 11.7 8.7 15.4 8.4
55-59 18.1 11.7 24.3 13.6
60-64 26.9 20.3 37.0 19.3
65-69 41.0 30.9 54.6 35.1
70-74 66.0 54.3 71.1 50.0
Now, to draw the text on the barplot, we basically draw the barplot and then draw the text on top using the R command text() (see ?text).
text() requires x, y coordinates and corresponding pieces of text to draw on the bar plot. We will give it the coordinates of each line in the bar plot to draw the text at.
To do this, see the "Value" section of ?barplot. This function not only plots your bar plot, but also returns the x coordinate of each bar. Score!
> bar <- barplot(VADeaths)
> bar
[1] 0.7 1.9 3.1 4.3
Now all we need is y coordinates to go with our x coordinates.
Well, a stacked bar plot just tallies up the frequencies in VADeaths as you go along.
For example, in the 'Rural Male' group, the first line is drawn at 11.7, and the second is drawn at 11.7 + 18.1 = 29.8, the third at 11.7 + 18.1 + 26.9 = 56.7, and so on (see the values in VADeaths).
So, our y coordinates need to be cumulative sums going down the columns.
To calculate these for each column, we can use cumsum. For example
> cumsum(c(1,2,3,4,5))
[1] 1 3 6 10 15
Since we want to do this for each column in VADeaths, we have to use the function apply.
> apply(VADeaths,2,cumsum)
Rural Male Rural Female Urban Male Urban Female
50-54 11.7 8.7 15.4 8.4
55-59 29.8 20.4 39.7 22.0
60-64 56.7 40.7 76.7 41.3
65-69 97.7 71.6 131.3 76.4
70-74 163.7 125.9 202.4 126.4
apply(VADeaths,2,cumsum) means: "For each column in VADeaths, calculate the cumsum of that".
This gives us the y values for each line of the bar plot.
Let's save these yvalues for further use:
> yvals <- as.vector(apply(VADeaths,2,cumsum))
The reason I use as.vector is just to flatten the matrix into a vector of values -- it makes the plotting easier.
One last thing -- my x values (that I stored in bar) only have one value per bar, but I need to expand it out so there's one x value per line on each bar. To do this:
> xvals <- rep(bar,each=nrow(VADeaths))
This turns my previous x1,x2,x3,x4 into x1,x1,x1,x1,x1, x2,x2,x2,x2,x2, ..., x4,x4,x4,x4,x4.
Now my xvals match my yvals.
After this it's simply a case of using text.
> text( xvals, yvals, labels=yvals, pos=3 )
The labels arguments tells text what text to put at the x/y positions.
The pos=3 means "draw each bit of text just above my specified x/y value". Otherwise, the numbers would be drawn over the lines of the barplot which would be hard to read.
Now, there are many options for customising the position and size of text, and I suggest you read ?text to see them.
All this code condenses down to the two-liner I gave at the beginning of the answer, but this version might be a little more understandable:
bar <- barplot(VADeaths)
xvals <- rep(bar, each = nrow(VADeaths))
yvals <- as.vector(apply(VADeaths, 2, cumsum))
text(xvals, yvals, labels = yvals, pos = 3)
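Closer to the original question (two factors, each segment labelled with its number of observations), the same pattern works on a table of counts. This is a hypothetical sketch with made-up factors f1 and f2; cex controls the label size and pos = 1 tucks each count just below the top of its segment:
set.seed(1)
df  <- data.frame(f1 = sample(letters[1:3], 200, replace = TRUE),
                  f2 = sample(LETTERS[1:4], 200, replace = TRUE))
tab <- table(df$f1, df$f2)   # 3 x 4 matrix of counts, one per combination
bar   <- barplot(tab)        # stacked bars; x position of each bar
xvals <- rep(bar, each = nrow(tab))
yvals <- as.vector(apply(tab, 2, cumsum))
text(xvals, yvals, labels = as.vector(tab), pos = 1, cex = 0.8)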
