R Raster .tif File: selecting raster@data@attributes - r

I'm using R's raster package to access the USDA National Crop Data Layer.
This is a .tif file. Within the Raster object are a number of @data@attributes
that sit in a data frame. These are like bands, but they are not bands.
Here's the Crop Data Layer for Colorado:
CDL <- raster("cdl_30m_r_co_2015_albers.tif")
And here's a snippet of the attributes I want to select from:
> CDL@data@attributes
[[1]]
  ID   COUNT Class.Names Opacity
1  0       0  Background       0
2  1 5133840        Corn     255
3  2       0      Cotton     255
4  3       0        Rice     255
I'd like to be able to select some of these attributes in a new raster to plot or do calculations on them. (I think the COUNT is counts of pixels in the raster.) There are 256 attributes.
How might I do this?
So I have the answer, duh.
It is as simple as just doing a comparison on the raster for the value I want.
Corn <- CDL == 1
(A tif file in the raster sense, I think, is just a normal tif image with georeferencing. It is just an x rows by y cols bitmap, and in this case the value
of each pixel is one of the values in the ID column of the raster@data@attributes data frame.)
It helps to know that the file format contains the answer to the question.
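The same comparison idea extends to several classes at once. Below is a minimal sketch on a toy in-memory raster rather than the real CDL file; the object names and the cell values (including ID 5 as a second class of interest) are made up for illustration.

```r
library(raster)

# Toy stand-in for the CDL: a 2 x 3 raster whose cell values are class IDs
# (made-up values, not real CDL data).
r <- raster(nrows = 2, ncols = 3)
values(r) <- c(0, 1, 1, 5, 1, 24)

# One class: cells equal to ID 1 become 1 (TRUE), all other cells 0 (FALSE).
corn <- r == 1

# Several classes at once: %in% also works on Raster objects.
corn_or_other <- r %in% c(1, 5)

# Count the selected cells in each mask.
n_corn <- cellStats(corn, stat = "sum")
n_both <- cellStats(corn_or_other, stat = "sum")
```

On the real layer the same two expressions would be applied to CDL instead of r, with the IDs looked up in the attribute table.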
Thanks

Related

How to troubleshoot mislabeling of provinces in my shapefile in r?

I have a shapefile of the Philippines that has the correct labels for each province. After removing some of the provinces I won't be using, aggregating the data into a single data frame, and then attaching my covariates to the shapefile, I run into trouble. When I use tmap to create some maps, the provinces are mislabeled and therefore the wrong data is applied to the wrong provinces. I am doing a spatial-temporal analysis with this data, so it's important the provinces are in the correct locations.
I have tried reprojecting the shapefile, but it doesn't seem to work.
#reading in shapefile
shp <- readOGR(".","province.csv")
#removing provinces not in data from shapefile
myshp82=shp
shp@data$prov=as.character(shp@data$prov)
ind=shp@data$prov%in% mydata$prov
shp.subset=shp[ind,]
#attaching covariates to shapefile for plotting, myagg is my data frame.
#The shape files are divided in four different time periods.
myagg_time1=myagg[myagg$period==1,]
myagg_time2=myagg[myagg$period==2,]
myagg_time3=myagg[myagg$period==3,]
myagg_time4=myagg[myagg$period==4,]
myshptime1=myshptime2=myshptime3=myshptime4=shp
myshptime1@data=merge(myshptime1@data, myagg_time1, by='prov',all.x=TRUE)
myshptime2@data=merge(myshptime2@data, myagg_time2, by='prov',all.x=TRUE)
myshptime3@data=merge(myshptime3@data, myagg_time3, by='prov',all.x=TRUE)
myshptime4@data=merge(myshptime4@data, myagg_time4, by='prov',all.x=TRUE)
#desc maps. Here's the code I've been using for one of the maps.
Per1= tm_shape(myshptime1)+
tm_polygons(c('total_incomeMed','IRA_depMean','pov'), title=c('Total Income', 'IRA', 'Poverty (%)'))+
tm_facets(sync = TRUE, ncol=3)
#sample data from my data sheet "myagg". First column is provinces.
                   period counts total_income_MED IRA_depMean
Agusan del Norte.1      1      2        119.33052   0.8939136
Agusan del Norte.2      2      0        280.96928   0.8939136
Agusan del Norte.3      3      1        368.30082   0.8939136
Agusan del Norte.4      4      0        368.30082   0.8950379
Aklan.5                 1      0        129.63132   0.8716863
Aklan.6                 2      3        282.95535   0.8716863
Aklan.7                 3      3        460.29969   0.8716863
Aklan.8                 4      0        460.29969   0.8437920
Albay.9                 1      0        280.12221   0.8696165
Albay.10                2      3        453.05098   0.8696165
Albay.11                3      1        720.40732   0.8696165
Albay.12                4      0        720.40732   0.8254676
Essentially the above tmap code creates three maps for this time period side-by-side for each of the different covariates ('total_incomeMed','IRA_depMean','pov'). This is happening, but the provinces are mislabeled and the data is tied to the name of the province. I just need the provinces properly labeled!
Sorry if this doesn't make sense. Happy to clarify more if needed.
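One possible cause worth checking (an assumption, since the thread has no confirmed answer): merge() sorts its result by the key column by default, so assigning its output back to the data slot can change the row order, while the polygons stay in their original order. The sketch below uses plain data frames with made-up values to show the reordering, and match() as one way to keep the rows aligned; shp_data, covars and the pov numbers are hypothetical stand-ins.

```r
# Toy stand-in for the shapefile's attribute table: its row order must stay
# aligned with the polygon order. All values here are made up.
shp_data <- data.frame(prov = c("Cebu", "Aklan", "Albay"),
                       stringsAsFactors = FALSE)
covars <- data.frame(prov = c("Albay", "Aklan"),
                     pov  = c(30, 20),
                     stringsAsFactors = FALSE)

# merge() sorts on the key by default, so the rows come back reordered.
merged <- merge(shp_data, covars, by = "prov", all.x = TRUE)
merged$prov  # "Aklan" "Albay" "Cebu"

# match() looks up each province in place, so the original order is kept.
idx <- match(shp_data$prov, covars$prov)
aligned <- cbind(shp_data, pov = covars$pov[idx])
aligned$prov  # "Cebu" "Aklan" "Albay"
```

If this is what is happening, attaching the covariates with match() (or merging on the Spatial object itself rather than on its data slot) should keep each province's data on the right polygon.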

Spatstat: Creating a pixel image object from a database I can't transform into a matrix

Hey people of Stackoverflow!
I'm trying to see which factors increase the incidence of fires caused by lightning, but I'm having problems creating a pixel image object using the im() function of the spatstat library.
The thing is that my data has the shape of the study area, not a rectangle or square, so I can't transform the data into a matrix.
I tried to create a window with the function owin() and the poly argument, but I have ALL the points of the area (border and interior), so I can't get the polygon of the area.
So I need ideas for how to a) create a pixel image object directly from my database, or b) add points to fill out a rectangle, so I can transform my data into a matrix and create the pixel image object from it.
I hope you can help me and if you need more info, please let me know.
Edit: Sorry for not putting an example of the data before.
So my data looks like this:
no. lon lat elev exp slope veg
1 700.5380 984.4786 548 -1 0 1
2 704.0483 984.4786 518 135 0 1
3 707.5586 984.4786 548 -1 0 1
4 711.0689 984.4786 569 254 4 1
5 714.5791 984.4786 590 178 5 1
6 697.0277 981.9342 518 -1 0 1
You can see a plot of the data here.
The other data file I have only contains the lightning data.
Hope you can help me and thanks for everything!
Also, I can't use im() because I can't turn my data into a matrix. I also tried owin(poly=data), but that gives me a shape made of lines drawn through the data points. After reading more, I don't think owin was the solution I needed either...
Right now I'm reading about other libraries to see if I can build the raster with one of them instead of spatstat.
I think this recipe should work for you to create a SpatialPointsDataFrame and an image (in the example, with the raster package) from your data:
https://gis.stackexchange.com/questions/20018/how-can-i-convert-data-in-the-form-of-lat-lon-value-into-a-raster-file-using-r
Then you can convert the spatial df to spatstat with:
library(maptools)
library(spatstat)
spatialPointsDF<-as(spatialPointsDataFrame, "ppp")
plot(spatialPointsDF)
See the maptools functions:
https://www.rdocumentation.org/packages/maptools/versions/0.9-2/topics/as.ppp
To create an im object from raster, see again the maptools spatstat API (as.im.RasterLayer):
https://www.rdocumentation.org/packages/maptools/versions/0.9-2/topics/as.ppp

How to create an interval file defined by values from another file - for circos imaging of WGS data

I am trying to depict the whole-genome sequence (WGS) data of my parasite, using the circos software.
One of the elements I would like to depict is the areas of the reference genome for which I do not have sequencing data from my parasite.
In order to do this, I have used Samtools to create an mpileup file, from which I have extracted the positions where the sequence depth = 0. I therefore have a file that looks like this:
$chromosome_name $chromosome_position $depth
chr_1 1 0
chr_1 2 0
chr_1 3 0
chr_2 67 0
chr_2 68 0
chr_2 1099 0
chr_2 1100 0
chr_2 1101 0
This means that there are 3 positions in chromosome 1 with no sequence data (depth = 0): namely positions 1, 2 and 3. For chromosome 2, the positions with no data are positions 67, 68, 1099, 1100 and 1101.
Because my files are enormous (up to 3 million lines) and a lot of the unsequenced positions come in runs of consecutive positions, I would like to create an interval file from the above data. Also, circos requires such an interval file in order to create tiles. I therefore need to create a new file from the above that looks like this:
$chromosome_name $start_pos $end_pos
chr_1 1 3
chr_2 67 68
chr_2 1099 1101
I have searched a bunch, but I have only found questions pertaining to grouping data by pre-defined intervals (e.g. group purchases occurring over a period of 6 months, patients by age etc).
So if anybody can help me out, I will be extremely happy!
Sidsel
Consider using bedtools. Specifically the bedtools merge sub-command:
http://bedtools.readthedocs.io/en/latest/content/tools/merge.html
From this page, it would seem to do what you want:
bedtools merge combines overlapping or “book-ended” features in an
interval file into a single feature which spans all of the combined
features.
Moreover, you can use the -d option to specify the maximum distance between features to merge:
-d Maximum distance between features allowed for features to be merged. Default is 0. That is, overlapping and/or book-ended features
are merged.
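If the positions are already loaded into R, the same collapsing can be sketched in base R without bedtools. This is only an illustration under stated assumptions: the table fits in memory, it is sorted by chromosome and then position, and the names depth0, chrom and pos are hypothetical. The values are the ones from the question.

```r
# Zero-depth positions from the question, assumed sorted by chromosome
# and then by position.
depth0 <- data.frame(
  chrom = c("chr_1", "chr_1", "chr_1", "chr_2", "chr_2", "chr_2", "chr_2", "chr_2"),
  pos   = c(1, 2, 3, 67, 68, 1099, 1100, 1101),
  stringsAsFactors = FALSE
)

n <- nrow(depth0)

# A new run starts at the first row, whenever the chromosome changes,
# or whenever the position is not exactly one more than the previous one.
new_run <- c(TRUE,
             depth0$chrom[-1] != depth0$chrom[-n] | diff(depth0$pos) != 1)
run_id <- cumsum(new_run)

# One row per run: chromosome, first position, last position.
intervals <- data.frame(
  chrom = depth0$chrom[new_run],
  start = depth0$pos[new_run],
  end   = as.vector(tapply(depth0$pos, run_id, max)),
  stringsAsFactors = FALSE
)
intervals
```

The result can then be written out with write.table() in whatever column layout circos expects.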

Merge spatial point dataset with Spatial grid dataset using R. (Master dataset is in SP Points format)

I am working on spatial datasets using R.
Data Description
My master dataset is in SpatialPointsDataFrame format and has surface temperature data (column names - "ruralLSTday", "ruralLSTnight") for every month. Data snippet is shown below:
Master Data - (in SpatialPointsDataFrame format)
TOWN_ID ruralLSTday ruralLSTnight year month
2920006.11 2920006 303.6800 289.6400 2001 0
2920019.11 2920019 302.6071 289.0357 2001 0
2920015.11 2920015 303.4167 290.2083 2001 0
3214002.11 3214002 274.9762 293.5325 2001 0
3214003.11 3214003 216.0267 293.8704 2001 0
3207010.11 3207010 232.6923 295.5429 2001 0
Coordinates:
longitude latitude
2802003.11 78.10401 18.66295
2802001.11 77.89019 18.66485
2803003.11 79.14883 18.42483
2809002.11 79.55173 18.00016
2820004.11 78.86179 14.47118
I want to add columns in the above data about rainfall and air temperature - This data is present in SpatialGridDataFrame in the table "secondary_data" for every month. Snippet of "secondary_data" is shown below:
Secondary Data - (in SpatialGridDataFrame format)
month meant.69_73 rainfall.69_73
1 1 25.40968 0.6283871
2 2 26.19570 0.4580542
3 3 27.48942 1.0800000
4 4 28.21407 4.9440000
5 5 27.98987 9.3780645
Coordinates:
longitude latitude
[1,] 76.5 8.5
[2,] 76.5 8.5
[3,] 76.5 8.5
[4,] 76.5 8.5
[5,] 76.5 8.5
Question
How do I add the columns from the secondary data to my master data by matching on latitude, longitude and month? Currently the latitude/longitude information in the two tables above will not match exactly, as the master data is a set of points and the secondary data is a grid.
Is there a way to find the square of the grid on the "Secondary Data" that the lat/long of my master data falls into, and interpolate?
If your SpatialPointsDataFrame object is called x, and your SpatialGridDataFrame is called y, then
x <- cbind(x, over(x, y))
will add the attributes (grid cell values) of y matching to the locations of x, to the attributes of x. Match is done by point-in-grid cell.
Interpolation is a different question; a simple way would be inverse distance with the four nearest neighbours, e.g. by
library(gstat)
x = idw(meant.69_73~1, y, x, nmax = 4)
Whether you want one or the other really depends on what your grid cells mean: do they refer to (i) the point value at the grid cell center, (ii) a value that is constant throughout the grid cell, or (iii) an average value over the whole grid cell? First case: interpolate; second: use over; third: use area-to-point interpolation (not explained here).
R package raster will offer similar functionality, but use different names.
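To make the point-in-grid-cell match concrete, here is a small sketch with sp on toy objects. The 2 x 2 grid, the two points and the rain column are made up, and the sketch does not handle the per-month matching from the question.

```r
library(sp)

# Toy 2 x 2 grid covering [0, 2] x [0, 2]; cell values are stored row-wise
# starting from the top-left cell. All values are made up.
grd <- SpatialGridDataFrame(
  GridTopology(cellcentre.offset = c(0.5, 0.5),
               cellsize = c(1, 1),
               cells.dim = c(2, 2)),
  data = data.frame(rain = 1:4)
)

# Two toy points: one in the top-left cell, one in the bottom-right cell.
pts <- SpatialPointsDataFrame(
  coords = cbind(c(0.2, 1.7), c(1.6, 0.3)),
  data = data.frame(id = 1:2)
)

# over() returns one row of grid attributes per point, in point order.
hit <- over(pts, grd)
pts$rain <- hit$rain
```

Each point simply receives the value of the cell it falls in; nothing is interpolated at this step.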

convert Zip3 and zip5 level shapefiles to x-y coordinates

I am not sure how to start this, as my GIS playing in R has been to plot things using ggplot2 and other packages using latlong coordinates. What I need to do now, is to use a visualization component in Microstrategy that uses a shapefile in the form of an HTML file containing x-y coordinates for the plot (ie. top left is 0,0). An example of a state level file is:
<HTML><HEAD><TITLE>untitled</TITLE></HEAD><BODY>
<IMG SRC="" USEMAP="#myMap" WIDTH="812" HEIGHT="713" BORDER="0" />
<MAP NAME="myMap">
<AREA SHAPE="POLY" HREF="#" ALT="Texas" COORDS="299,363,299,360,....." />
</MAP></BODY></HTML>
The points listed in 'coords' are the X and Y points with respect to a 812 by 713 'image' that is plotted and colored on the fly.
I have shp, shx, dbf files for Zip3 and Zip5 from http://www.vdstech.com/usa-data.aspx but am unsure of where to even start the conversion! I don't mind doing the grunt work of formatting the HTML file by hand, it is the X-Y conversion that I am stuck at (rusty, not touched GIS for quite a while):
The following code imports the shapefile into R
library(rgdal)
zip3 <- readOGR(dsn = '/Users/adempsey/Downloads/zip3', layer = 'zip3')
After which I am stuck and currently hunting for tutorial of how to extract zip3 + x-y coordinates into a dataframe that I can then use to create my final file with
update 2
Using the following, I can convert to a data frame, but I am unable to pull across the associated zip3 code, which appears to be stored in the associated dbf file
Row long lat order hole piece group id
1 -151.0604 70.41873 1 FALSE 1 0.1 0
2 -150.7620 70.49722 2 FALSE 1 0.1 0
Yes, this is beyond my current rusty R
update3
This code dumps the zip codes into a data frame
zip3.codes <- as.data.frame(zip3)
Which should be combinable with something like
zip3.df <- fortify(zip3@polygons[[1000]])
Where the 1000 would be replaced with all the rows of zip3.codes associated with a particular zip3
You can use fastshp package to load the data:
install.packages("fastshp",,"http://rforge.net")
library(fastshp)
s <- read.shp("zip5.shp", format="polygon")
s is now a list of all ZIP shapes. You're interested in the x and y components
- for example to plot the first ZIP simply use something like
plot(s[[1]]$x, s[[1]]$y, asp=1.25)
polygon(s[[1]]$x, s[[1]]$y, col="#eeeeee")
To match the names, use read.dbf from foreign:
library(foreign)
d <- read.dbf("zip5.dbf", as.is=TRUE)
names(s) <- d$ZIP5
See ?read.shp for more details on the available formats. The "polygon" one uses NA to separate individual polygons, "list" uses indexing to give you the parts.
BTW the dataset is somewhat dubious, you may want to look into TIGER/Line census ZCTA5 data (most recent is 2010).
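For the x-y conversion itself, one simple approach is a linear rescale of longitude/latitude into the pixel frame, with the y axis flipped so that 0,0 is the top left. The sketch below is only an illustration: to_pixels is a hypothetical helper, the coordinates and limits are made up, and it ignores map projection and aspect ratio.

```r
# Hypothetical helper: rescale lon/lat vectors into a width x height pixel
# frame whose origin (0,0) is the top-left corner, and return them as an
# "x1,y1,x2,y2,..." string like the COORDS attribute of an <AREA> tag.
to_pixels <- function(x, y, xlim, ylim, width = 812, height = 713) {
  px <- round((x - xlim[1]) / diff(xlim) * (width - 1))
  py <- round((ylim[2] - y) / diff(ylim) * (height - 1))  # flip the y axis
  paste(rbind(px, py), collapse = ",")
}

# Made-up triangle and bounding box, only to show the mapping.
coords <- to_pixels(x = c(-100, -90, -90),
                    y = c(30, 30, 25),
                    xlim = c(-100, -90),
                    ylim = c(25, 30))
coords  # "0,0,811,0,811,712"
```

With shapes read in the "polygon" format, the NA separators between individual polygons would need to be dropped or used to split the vectors before calling a helper like this, and xlim/ylim would be the bounding box of the whole map rather than of one shape.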
