R - Plotting netcdf climate data

I have been trying to plot the following gridded netcdf file, "air.1999.nc", found at the following website:
http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.html
I have tried the code below based on answers I have found here and elsewhere, but no luck.
library(ncdf);
temp.nc <- open.ncdf("air.1999.nc");
temp <- get.var.ncdf(temp.nc,"air");
temp.nc$dim$lon$vals -> lon
temp.nc$dim$lat$vals -> lat
lat <- rev(lat)
temp <- temp[nrow(temp):1,]
temp[temp==-32767] <- NA
temp <- t(temp)
image(lon,lat,temp)
library(maptools)
data(wrld_simpl)
plot(wrld_simpl, add = TRUE)
This code was modified from the one found here: The variable from a netcdf file comes out flipped
Does anyone have any ideas or experience with using this type of netcdf file? Thanks

In the question you linked, the whole part from lat <- rev(lat) to temp <- t(temp) was specific to that OP's dataset and does not apply in general.
temp.nc <- open.ncdf("~/Downloads/air.1999.nc")
temp.nc
[1] "file ~/Downloads/air.1999.nc has 4 dimensions:"
[1] "lon Size: 144"
[1] "lat Size: 73"
[1] "level Size: 12"
[1] "time Size: 365"
[1] "------------------------"
[1] "file ~/Downloads/air.1999.nc has 2 variables:"
[1] "short air[lon,lat,level,time] Longname:Air temperature Missval:32767"
[1] "short head[level,time] Longname:Missing Missval:NA"
As you can see from this output, missing values in your file are represented by the value 32767, so the following should be your first step:
temp <- get.var.ncdf(temp.nc,"air")
temp[temp == 32767] <- NA
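As a minimal sketch of that first step, the snippet below applies the same replacement to a small made-up array. The values are invented for illustration and are not read from air.1999.nc; only the sentinel 32767 comes from the file header shown above.

```r
# Synthetic stand-in for the array returned by get.var.ncdf();
# the numbers are invented, only the sentinel 32767 comes from the header.
temp_demo <- array(c(250, 32767, 260, 270, 32767, 280), dim = c(3, 2))

# Replace the missing-value sentinel with NA (numeric comparison)
temp_demo[temp_demo == 32767] <- NA

n_missing <- sum(is.na(temp_demo))           # cells flagged as missing
mean_temp <- mean(temp_demo, na.rm = TRUE)   # the sentinel no longer skews statistics
print(n_missing)
print(mean_temp)
```

Comparing against the number rather than the string "32767" avoids an implicit conversion of the whole array to character.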
Additionally, in your case you have 4 dimensions to your data, not just 2: longitude, latitude, level (which I'm assuming represents the height) and time.
temp.nc$dim$lon$vals -> lon
temp.nc$dim$lat$vals -> lat
temp.nc$dim$time$vals -> time
temp.nc$dim$level$vals -> lev
If you have a look at lat you will see that the values are in decreasing order (image expects increasing coordinates), so let's reverse them:
lat <- rev(lat)
temp <- temp[, ncol(temp):1, , ] #lat being our dimension number 2
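The reversal can be checked on a toy array before applying it to the real data. This is only a sketch: the sizes and values below are invented, and the names lat_demo and temp_demo are placeholders, not objects from the file.

```r
# Toy 4-D array with dims lon x lat x level x time (sizes are arbitrary)
lat_demo  <- c(90, 0, -90)                      # decreasing, as in the file
temp_demo <- array(seq_len(2 * 3 * 2 * 2), dim = c(2, 3, 2, 2))

# Value stored at lon index 1, lat = 90 (lat index 1), level 1, time 1
before <- temp_demo[1, 1, 1, 1]

# Reverse the latitude vector and the matching (second) array dimension
lat_rev  <- rev(lat_demo)
temp_rev <- temp_demo[, dim(temp_demo)[2]:1, , , drop = FALSE]

# lat = 90 now sits at the last index and still carries the same value
after <- temp_rev[1, which(lat_rev == 90), 1, 1]
stopifnot(identical(before, after))
```

The point of the check is that the coordinate vector and the data dimension must always be reversed together.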
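Then the longitude is expressed from 0 to 360, which does not match the -180 to 180 range used by the world map below. Simply subtracting 180 would relabel the columns without moving the data, so convert the values and reorder the array along its first dimension to match:
temp <- temp[order((lon + 180) %% 360), , , ] #lon being our dimension number 1
lon <- sort((lon + 180) %% 360 - 180)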
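One caveat when moving from a 0 to 360 grid to a -180 to 180 grid: the data have to be reordered along with the longitude labels, otherwise the field ends up shifted by half a globe. A minimal sketch on made-up values (lon_demo and val_demo are invented for illustration):

```r
# Made-up longitudes on a 0..360 grid and one value per longitude
lon_demo <- c(0, 90, 180, 270)
val_demo <- matrix(c(10, 20, 30, 40), nrow = 4, ncol = 1)  # rows follow lon_demo

# Wrap longitudes into [-180, 180) and find the order that sorts them
lon_new <- (lon_demo + 180) %% 360 - 180
ord     <- order(lon_new)

lon_sorted <- lon_new[ord]
val_sorted <- val_demo[ord, , drop = FALSE]

# The value that belonged to longitude 0 is still attached to longitude 0
stopifnot(val_sorted[lon_sorted == 0, 1] == 10)
print(lon_sorted)
```

The same index vector (ord) is applied to both the coordinates and the data, which is what keeps each value attached to its original location.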
So now let's plot the data for a level of 1000 (i.e. the first one) and the first date:
temp11 <- temp[ , , 1, 1] #Level is the third dimension and time the fourth.
image(lon,lat,temp11)
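Instead of relying on the position of the 1000 level, the index can be looked up from the level vector. The sketch below uses an invented level vector and array; in the real file the values would come from temp.nc$dim$level$vals.

```r
# Invented level values and a toy lon x lat x level x time array
lev_demo  <- c(1000, 925, 850)
temp_demo <- array(seq_len(4 * 3 * 3 * 2), dim = c(4, 3, 3, 2))

# Look up the index of the wanted level instead of hard-coding it
i_lev <- which(lev_demo == 850)

# 2-D slice (lon x lat) for that level and the first time step
slice <- temp_demo[, , i_lev, 1]
print(dim(slice))
```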
And then let's superimpose a world map:
library(maptools)
data(wrld_simpl)
plot(wrld_simpl,add=TRUE)

Related

Error in number of decimal places with read.xlsx

I am trying to read in a dataset of coordinates in the British National grid system, using the read.xlsx command.
This is the data:
NORTHING EASTING TOC ELEVATION WELL ID
1194228.31 2254272.83 117.30 AA-1
1194227.81 2254193.90 114.91 AA-2
1194228.41 2254116.26 114.76 AA-3
1194229.37 2254039.57 112.81 AA-4
1194227.09 2253960.17 112.10 AA-5
and this is my code:
coordinates <- read.xlsx2("Coordinates.xlsx",sheetName = "Sheet1",
startRow = 1,endRow = 111, colIndex = c(1:4),
colClasses = c("character","character","numeric","character"))
The problem is, my output looks like this:
NORTHING EASTING TOC.ELEVATION WELL.ID
1 1194228 2254273 117.30 AA-1
2 1194228 2254194 114.91 AA-2
3 1194228 2254116 114.76 AA-3
4 1194229 2254040 112.81 AA-4
5 1194227 2253960 112.10 AA-5
6 1194227 2253880 110.98 AA-6
The command appears to round the northing and easting coordinates, and while this is not a big issue, I'd like to be as exact as possible. Is there a workaround for this? I could not find anything relevant in the documentation for the colClasses argument either.
This is an issue of how R is printing out the data (it is generally convenient not to give the full representation of floating-point data); you didn't actually lose any precision.
Illustrating with read.table rather than read.xlsx (we're going to end up in the same place). (If I read the data with colClasses specifying "character", I do get all of the digits displayed, but I also end up with a rather useless data frame if I want to do anything sensible with the northings and eastings variables ...)
dat <- read.table(header=TRUE,
text="
NORTHING EASTING TOC.ELEVATION WELL.ID
1194228.31 2254272.83 117.30 AA-1
1194227.81 2254193.90 114.91 AA-2
1194228.41 2254116.26 114.76 AA-3
1194229.37 2254039.57 112.81 AA-4
1194227.09 2253960.17 112.10 AA-5")
This is how R prints the data frame:
# NORTHING EASTING TOC.ELEVATION WELL.ID
# 1 1194228 2254273 117.30 AA-1
# 2 1194228 2254194 114.91 AA-2
# 3 1194228 2254116 114.76 AA-3
# 4 1194229 2254040 112.81 AA-4
# 5 1194227 2253960 112.10 AA-5
But it's still possible to see that all of the precision is still there ...
print(dat$NORTHING,digits=12)
## [1] 1194228.31 1194227.81 1194228.41 1194229.37 1194227.09
You could also print(dat,digits=12) or set options(digits=12) globally ...
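If the goal is only to display or export the decimals, formatting the values is another option. A small sketch using two of the northings from the question (the variable name northing is just for illustration):

```r
northing <- c(1194228.31, 1194227.81)

# Display only: the stored values are not changed by formatting
shown_default <- format(northing)              # uses the default number of significant digits
shown_fixed   <- format(northing, nsmall = 2)  # at least two decimals
shown_sprintf <- sprintf("%.2f", northing)     # exactly two decimals

print(shown_default)
print(shown_fixed)
print(shown_sprintf)

# The stored numbers still differ by what the hidden decimals imply
stopifnot(isTRUE(all.equal(northing[1] - northing[2], 0.5)))
```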

Why does R add an "x" when renaming raster stack layers

I have a raster stack/brick in R containing 84 layers and I am trying to name them according to year and month from 199911 to 200610 (November 1999 to October 2006). However for some reason R keeps adding an "X" onto the beginning of any names I give my layers.
Does anyone know why this is happening and how to fix it? Here are some of the ways I've tried:
# Import raster brick
rast <- brick("rast.tif")
names(rast)[1:3]
[1] "MonthlyRainfall.1" "MonthlyRainfall.2" "MonthlyRainfall.3"
## Method 1
names(rast) <- paste0(rep(1999:2006, each=12), 1:12)[11:94]
names(rast)[1:3]
[1] "X199911" "X199912" "X20001"
## Method 2
# Create a vector of dates
dates <- format(seq(as.Date('1999/11/1'), as.Date('2006/10/1'), by='month'), '%Y%m')
dates[1:3]
[1] "199911" "199912" "200001"
# Set names
rast <- setNames(rast, dates)
names(rast)[1:3]
[1] "X199911" "X199912" "X200001"
## Method 3
names(rast) <- paste0("", dates)
names(rast)[1:3]
[1] "X199911" "X199912" "X200001"
## Method 4
substr(names(rast), 2, 7)[1:3]
[1] "199911" "199912" "200001"
names(rast) <- substr(names(rast), 2, 7)
names(rast)[1:3]
[1] "X199911" "X199912" "X200001"
To some extent I have been able to work around the problem by adding "X" to the beginning of some of my other data, but now it's reached the point where I can't do that any more. Any help would be greatly appreciated!
R won't allow a layer name to begin with a numeral (such a name is not syntactically valid), so an "X" is prepended to get around that restriction.
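The behaviour can be reproduced in base R with make.names(), assuming the raster package applies an equivalent validity check to layer names (the output in the question is consistent with that, but this is an assumption). The sketch also shows two possible workarounds; the "m" prefix is an arbitrary choice.

```r
dates <- c("199911", "199912", "200001")

# Names starting with a digit are not syntactically valid in R;
# make.names() repairs them by prepending "X"
valid <- make.names(dates)
print(valid)

# Option 1: choose your own prefix so the result is predictable
own <- paste0("m", dates)
stopifnot(identical(make.names(own), own))   # already valid, left untouched

# Option 2: keep the "X" names on the object and strip the prefix for labels
labels <- sub("^X", "", valid)
stopifnot(identical(labels, dates))
```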

Change RasterBrick dimensions from TXY to XYT

I am trying to import a netCDF file into a RasterBrick in R. The netCDF file has 3 dimensions.
library(ncdf)
nc <- open.ncdf("fm100_2003.nc");
print(nc)
[1] "file fm100_2003.nc has 3 dimensions:"
[1] "lon Size: 1386"
[1] "lat Size: 585"
[1] "day Size: 365"
[1] "------------------------"
[1] "file fm100_2003.nc has 1 variables:"
[1] "short dead_fuel_moisture_100hr[day,lon,lat] Longname:dead_fuel_moisture_100hr Missval:-9999"
The size of the day dimension corresponds to daily fuel moisture for one year (365 days). I'd like to import these into a RasterBrick for additional analysis, which is pretty straightforward with,
r <- "fm100_2003.nc"
b <- brick(r,varname="dead_fuel_moisture_100hr")
However, the issue is that the ncol and nlayers in the RasterBrick are switched, which results in an incorrect rasterLayer for each layer in the brick. The dimensions of the RasterBrick should read 1386, 585, 505890, 365 instead of the dimensions below:
class : RasterBrick
dimensions : 1386, 365, 505890, 585 (nrow, ncol, ncell, nlayers)
resolution : 1, 0.04166667 (x, y)
extent : 37619.5, 37984.5, -124.793, -67.043 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : fm100_2003.nc
names : X49.3960227966309, X49.3543561299642, X49.3126894632975, X49.2710227966309, X49.2293561299642, X49.1876894632975, X49.1460227966309, X49.1043561299642, X49.0626894632975, X49.0210227966309, X48.9793561299642, X48.9376894632975, X48.8960227966309, X48.8543561299642, X48.8126894632975, ...
degrees_north: 25.0626894632975, 49.3960227966309 (min, max)
varname : dead_fuel_moisture_100hr
I am wondering if there is any way to specify the dimensions when creating the RasterBrick to avoid this problem?
It's rather strange that the dimensions are not read correctly. You could explore your netCDF file with the commands below:
# Inspect the dimensions of the opened file (nc from open.ncdf above)
# after typing '$' you can use tab completion to see the available fields (RStudio)
nc$dim
# Read the values of each dimension
lon = get.var.ncdf(nc, varid='lon')
lat = get.var.ncdf(nc, varid='lat')
time = get.var.ncdf(nc, varid='day')
I was able to figure out a solution (or perhaps a workaround) to the above problem.
I first import the netCDF file into an array in R.
dname <- "dead_fuel_moisture_100hr"
array1 <- get.var.ncdf(nc, dname)
dim(array1)
[1] 365 1386 585
The dimensions of array1 are: days, columns, rows. However, I can change the dimensions of the array:
array2<-aperm(array1, c(3, 2, 1))
dim(array2)
[1] 585 1386 365
Now the array is organized properly: rows, columns, days. At this point I can access the depth range I need (days 1 through 365) as a matrix:
fm.day.001<-array2[,,1]
...
fm.day.365<-array2[,,365]
The matrix can be converted to a raster too:
r2<-raster(nrow=585,ncol=1386,vals=fm.day.001, xmn=-124.7722, xmx=-67.06383, ymn=25.06269, ymx=49.39602)
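The aperm() reordering above can be checked on a small synthetic array laid out like the file (day, lon, lat). The sizes and values here are invented; only the permutation c(3, 2, 1) is taken from the steps above.

```r
# Toy array laid out like the file: day x lon x lat (sizes invented)
n_day <- 2; n_lon <- 4; n_lat <- 3
a1 <- array(seq_len(n_day * n_lon * n_lat), dim = c(n_day, n_lon, n_lat))

# Reorder to lat x lon x day (rows, columns, layers)
a2 <- aperm(a1, c(3, 2, 1))
print(dim(a2))

# Element (day d, lon i, lat j) of a1 equals element (j, i, d) of a2
stopifnot(a1[2, 4, 3] == a2[3, 4, 2])

# One day as a rows-by-columns matrix
day1 <- a2[, , 1]
stopifnot(all(dim(day1) == c(n_lat, n_lon)))
```

Note that aperm() only reorders the axes; whether the rows run north to south as a raster expects still depends on the order of the latitude values in the file.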
You can try the dims argument. Something like:
b <- brick("fm100_2003.nc", varname="dead_fuel_moisture_100hr", dims=3:1)
This is experimental. Even if it creates the correct object, the object may not work for subsequent operations.

r read NetCDF and export as shapefile

I need to read a NetCDF file with R and export each time step as a smoothed polygon shapefile.
I have two problems: smoothing the raster and exporting to shapefile with proper projection from the NC file.
The output is a regular grid and is not projected.
Here is a sample code:
NCFileName = 'MyncFile.nc'
NCFile = open.ncdf(NCFileName)
NCFile
[1] "file CF_OUTPUT.nc has 6 dimensions:"
[1] "time Size: 61"
[1] "height Size: 8"
[1] "lat Size: 185"
[1] "lon Size: 64"
[1] "Time Size: 61"
[1] "DateStrLen Size: 19"
[1] "------------------------"
[1] "file CF_OUTPUT.nc has 20 variables:"
[1] "float temp[lon,lat,height,time] Longname:Temperature Missval:1e+30"
[1] "float relh[lon,lat,height,time] Longname:Relative Humidity Missval:1e+30"
[1] "float airm[lon,lat,height,time] Longname:Air density Missval:1e+30"
[1] "float z[lon,lat,height,time] Longname:Layer top altitude Missval:1e+30"
[1] "float ZH[lon,lat,height,time] Longname:Layer top altitude Missval:1e+30"
[1] "float hlay[lon,lat,height,time] Longname:Layer top altitude Missval:1e+30"
[1] "float PM10ant[lon,lat,height,time] Longname:PM10ant Concentration Missval:1e+30"
[1] "float PM10bio[lon,lat,height,time] Longname:PM10bio Concentration Missval:1e+30"
[1] "float PM10[lon,lat,height,time] Longname:PM10 Concentration Missval:1e+30"
[1] "float PM25ant[lon,lat,height,time] Longname:PM25ant Concentration Missval:1e+30"
[1] "float PM25bio[lon,lat,height,time] Longname:PM25bio Concentration Missval:1e+30"
[1] "float PM25[lon,lat,height,time] Longname:PM25 Concentration Missval:1e+30"
[1] "float C2H4[lon,lat,height,time] Longname:C2H4 Concentration Missval:1e+30"
[1] "float CO[lon,lat,height,time] Longname:CO Concentration Missval:1e+30"
[1] "float SO2[lon,lat,height,time] Longname:SO2 Concentration Missval:1e+30"
[1] "float NO[lon,lat,height,time] Longname:NO Concentration Missval:1e+30"
[1] "float NO2[lon,lat,height,time] Longname:NO2 Concentration Missval:1e+30"
[1] "float O3[lon,lat,height,time] Longname:O3 Concentration Missval:1e+30"
[1] "char Times[DateStrLen,Time] Longname:Times Missval:NA"
[1] "float HGT[lon,lat,time] Longname:Topography Missval:1e+30"
nc.a=get.var.ncdf(NCFile , varid = 'NO2', start=c(1,1,1,1), count=c(-1,-1,1,1))
Pol <- rasterToPolygons(raster(nc.a),dissolve = TRUE)
Pol
class : SpatialPolygonsDataFrame
features : 11829
extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
coord. ref. : NA
variables : 1
names : layer
min values : 0.219758316874504
max values : 0.84041428565979
writeOGR(Pol, dsn = getwd(), layer = 'testPol', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
What I get, however, are gridded polygons that are not projected.
UPDATE:
Following @kakk11's and @RobertH's answers, I was able to solve part of the problem. I still get grid-like polygons that are not smoothed. Here is what I did so far:
I couldn't extract the variable directly to a raster as @RobertH suggested, so I used 'get.var.ncdf' and then 'raster':
NCFileName = 'MyncFile.nc'
NCFile = open.ncdf(NCFileName)
nc.a = get.var.ncdf(NCFile, varid = 'NO2', start=c(1,1,1,13), count=c(-1,-1,1,1))
nc.a = raster(nc.a)
# put in correct extent:
lat = NCFile$dim$lat$vals
lon = NCFile$dim$lon$vals
ExtentLat = range(lat)
ExtentLon = range(lon)
rm(lat,lon)
nc.a = flip(t(nc.a), direction='y')
# Give it lat/lon coords
extent(nc.a) = c(ExtentLon,ExtentLat)
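One detail worth checking at this step: range(lat) and range(lon) give the coordinates of the outermost cell centres, while a raster extent describes the outer cell edges. Assuming a regular grid, the edges lie half a cell beyond the centres. A base R sketch with invented coordinates (edges() is a hypothetical helper, not a raster function):

```r
# Invented cell-centre coordinates on a regular grid
lon_c <- seq(10, 13, by = 1)
lat_c <- seq(40, 42, by = 0.5)

# Hypothetical helper: outer cell edges from regularly spaced cell centres
edges <- function(centres) {
  res <- diff(range(centres)) / (length(centres) - 1)  # cell size
  range(centres) + c(-0.5, 0.5) * res
}

# xmin, xmax, ymin, ymax, in the order used for an extent
ext <- c(edges(lon_c), edges(lat_c))
print(ext)
```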
The 'cut' command returned a vector, so I used 'raster::reclassify' instead:
cuts = c(0,5,15,30,50)
classes <- cbind(head(cuts, -1), tail(cuts, -1), tail(cuts, -1)) # from, to, new value
nc.class <- reclassify(nc.a, classes)
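To see what such a from/to/new-value table does, the same classification can be reproduced on a plain numeric vector in base R. This is only a sketch: the vals vector is invented, and base cut() is used here as a stand-in for the raster step.

```r
cuts <- c(0, 5, 15, 30, 50)

# One row per class: lower bound, upper bound, value assigned to the class
classes <- cbind(from    = head(cuts, -1),
                 to      = tail(cuts, -1),
                 becomes = tail(cuts, -1))

# Invented values to classify; cut() uses intervals open on the left
# and closed on the right, so 5 falls in the first class
vals <- c(2, 5, 7, 20, 49)
idx  <- cut(vals, breaks = cuts, labels = FALSE)   # class index per value

reclassified <- classes[idx, "becomes"]
print(reclassified)
```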
I then used the 'rasterToPolygons' with 'dissolve=TRUE' to create the polygons:
pol <- rasterToPolygons(nc.class, dissolve = TRUE)
# set geographic (WGS84 lon/lat) coordinate reference system:
WGS84_Projection = "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
proj4string(pol) <- CRS(WGS84_Projection)
writeOGR(pol, dsn = getwd(), layer = 'file' , driver = 'ESRI Shapefile', overwrite_layer = TRUE)
Still, all this creates a polygon shapefile whose polygons are not smooth, which is the main challenge.
Could use some help with this.
Ilik
You first need to create a RasterLayer correctly, like this:
r <- raster('MyncFile.nc', varname='NO2')
# or, to get all time steps at once
# brick('MyncFile.nc', varname='NO2')
You could then generalize (classify) the values using reclassify or cut. For example
cuts <- seq(0.2, 0.9, 0.1)
rc <- cut(r, cuts)
Make polygons and save to shapefile
pol <- rasterToPolygons(rc, dissolve = TRUE)
shapefile(pol, 'file.shp')

postcode distances using google

I have two lists of postcodes (in R)...one of children's addresses with their academic score and one of schools...
I would like to be able to get the closest school for each child...so presumably a calculation of distance would be needed between postcodes by converting to long and lat values?
And then I would like to be able to plot on a Google map all the children per school...and see if the children who live closer to school get better grades...perhaps plotting schools in a different colour to kids, and the kids having a gradient of colour according to their score?
Perhaps something using the googleVis package?
so for example...
if we have the data for 3 kids and 2 schools...
student.data <- cbind(post.codes=c("KA12 6QE", "SW1A 0AA", "WC1X 9NT"),score=c(23,58,88))
school.postcodes <- c("SL4 6DW", "SW13 9JT")
(N.B. My actual data is obviously significantly larger than the one given so scalability would be useful...)
what should be done with googleVis or any other package for that matter to be able to complete the above?
I would start by something like this to get the lat/long
Get lat/long for each post code
library(XML)
school.postcodes <- c("KA12 6QE", "SW1A 0AA", "WC1X 9NT")
ll <- lapply(school.postcodes,
function(str){
u <- paste0('http://maps.google.com/maps/api/geocode/xml?sensor=false&address=', URLencode(str)) # paste0 avoids a stray space; URLencode escapes the space in the postcode
doc <- xmlTreeParse(u, useInternal=TRUE)
lat=xpathApply(doc,'/GeocodeResponse/result/geometry/location/lat',xmlValue)[[1]]
lng=xpathApply(doc,'/GeocodeResponse/result/geometry/location/lng',xmlValue)[[1]]
c(code = str,lat = lat, lng = lng)
})
# get long/lat for the students
ll.students <- lapply(student.data[, "post.codes"], # student.data is a matrix (built with cbind), so use [, ] rather than $
function(str){
u <- paste0('http://maps.google.com/maps/api/geocode/xml?sensor=false&address=', URLencode(str))
doc <- xmlTreeParse(u, useInternal=TRUE)
lat=xpathApply(doc,'/GeocodeResponse/result/geometry/location/lat',xmlValue)[[1]]
lng=xpathApply(doc,'/GeocodeResponse/result/geometry/location/lng',xmlValue)[[1]]
c(code = str,lat = lat, lng = lng)
})
ll <- do.call(rbind,ll)
ll.students <- do.call(rbind,ll.students)
ll
code lat lng
[1,] "KA12%206QE" "55.6188429" "-4.6766226"
[2,] "SW1A%200AA" "51.5004864" "-0.1254664"
[3,] "WC1X%209NT" "51.5287992" "-0.1181098"
get the distance matrix
library(RJSONIO)
dist.list <- lapply(seq(nrow(ll)),
function(id){
url <- paste("http://maps.googleapis.com/maps/api/distancematrix/json?origins=",
ll[id,2],",",ll[id,3],
"&destinations=",
paste( ll.students[,2],ll.students[,3],sep=',',collapse='|'),
"&sensor=false",sep ='')
res <- fromJSON(url)
hh <- sapply(res$rows[[1]]$elements,function(dest){
c(distance= as.numeric(dest$distance$value),
duration = dest$duration$text)
})
hh <- rbind(hh,destination = ll.students[,1])
})
names(dist.list) <- ll[,1]
dist.list
$`SL4 6DW`
[,1] [,2] [,3]
distance "664698" "36583" "41967"
duration "6 hours 30 mins" "43 mins" "49 mins"
destination "1" "2" "3"
$`SW13 9JT`
[,1] [,2] [,3]
distance "682210" "9476" "13125"
duration "6 hours 39 mins" "22 mins" "27 mins"
destination "1" "2" "3"
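For the "closest school" part of the question, a straight-line alternative that needs no web service is to compute great-circle distances from the coordinates. This is a different technique from the Distance Matrix request above (which returns road distances), and the sketch makes several assumptions: haversine_km is a hypothetical helper, the student coordinates are copied from the geocoding output above, and the school coordinates are invented placeholders.

```r
# Great-circle (haversine) distance in km between points in decimal degrees
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Student coordinates taken from the geocoding output above
students <- data.frame(
  code = c("KA12 6QE", "SW1A 0AA", "WC1X 9NT"),
  lat  = c(55.6188429, 51.5004864, 51.5287992),
  lng  = c(-4.6766226, -0.1254664, -0.1181098),
  stringsAsFactors = FALSE
)

# School coordinates are hypothetical placeholders, not geocoded values
schools <- data.frame(
  code = c("school_A", "school_B"),
  lat  = c(51.49, 55.60),
  lng  = c(-0.60, -4.50),
  stringsAsFactors = FALSE
)

# Distance matrix: rows = students, columns = schools
d <- outer(seq_len(nrow(students)), seq_len(nrow(schools)),
           function(i, j) haversine_km(students$lat[i], students$lng[i],
                                       schools$lat[j], schools$lng[j]))
dimnames(d) <- list(students$code, schools$code)

# Nearest school and its distance for each student
students$nearest <- schools$code[apply(d, 1, which.min)]
students$dist_km <- apply(d, 1, min)
print(students)
```

Because the full student-by-school matrix is built in one vectorised call, this scales reasonably to a few thousand rows; the resulting dist_km column can then be compared against the scores.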
