How to subset NetCDF CRU v4.00 data in R using lat-lon?

I want to subset a region from CRU global data. The "cmsaf" package's box_mergetime function can subset CMIP5 and CORDEX .nc data, but with CRU .nc data it gives the following error.
>library(cmsaf)
>wd<-getwd()
>box_mergetime("tmp", wd, "cru_ts4.00.1901.1910", "output", 67,98,8,38)
get file information
[1] "vobjtovarid4: error #F: I could not find the requsted var (or dimvar) in the file!"
[1] "var (or dimvar) name: longitude"
[1] "file name: C:/Users/Deepak/Documents/eg/ip/cru_ts4.00.1901.1910.tmp.nc"
Error in vobjtovarid4(nc, varid, verbose = verbose, allowdimvar = TRUE) :
Variable not found
link of data file.

The reason for this error message was a missing standard name for longitude and latitude in the CRU NetCDF data. I fixed this issue, and with cmsaf version 1.8.0 it should work now. The cmsaf package offers functions similar to cdo (e.g., sellonlatbox, timmean, and fldmean are included).

It is easy to do this from the command line with CDO before reading the data into R.
Select a box:
cdo sellonlatbox,lon1,lon2,lat1,lat2 in.nc out.nc
Do a time mean:
cdo timmean in.nc out.nc
Do a space mean:
cdo fldmean in.nc out.nc
If you are using Ubuntu you can install cdo easily with
sudo apt-get install cdo
and under Windows you can install it under Cygwin. (Update: rather than Cygwin, these days it is far better to simply install the Windows Subsystem for Linux under Windows 10; it is very easy to do, and then you have Ubuntu straight out of the box.)

How to put Timestamp at the start of the Month in CDO?

Currently I am working with monthly precipitation .nc files from CMIP6 models. I want the dates to start from the 1st day of each month.
Here is part of the output of cdo sinfo for the input file:
1950-01-16 12:00:00 1950-02-15 00:00:00 1950-03-16 12:00:00 1950-04-16 00:00:00
As you can see, the dates start on the 16th or 15th. Even after applying cdo --timestat_date first monmean to my input file, there was no change in the dates of my output file. I tried other models' files too, but in vain.
My CDO ver:
CDI library version : 2.0.5
cgribex library version : 2.0.1
ecCodes library version : 2.26.0
NetCDF library version : 4.8.1 of Apr 25 2022 17:43:42
hdf5 library version : 1.12.1 threadsafe
exse library version : 1.4.2
FILE library version : 1.9.1
The easiest way to do this in CDO is probably to reset the time axis or set the day. In your case the dataset appears to start on 1st January 1950 and contains every month since then. So, one of the following ought to work.
cdo settaxis,1950-01-01,12:00:00,1mon infile outfile
cdo setday,1 infile outfile
Note that the command line option you used, --timestat_date first, assigns the first time available within each month to the output file when calculating the monthly mean. Thus it would have worked as desired if you had used daily data as input. However, as your input is already a monthly mean with a single timestep available, it will simply return the time from the original dataset in this case.
cdo --timestat_date first monmean infile outfile

How do I read a .tar.xz file?

I downloaded the Gwern Branwen dataset here: https://www.gwern.net/DNM-archives
I'm trying to read the dataset in R and I'm having a lot of trouble. I tried to open one of the files in the dataset called "1776.tar.xz" and I think I "unzipped" it with untar() but I'm not getting anything past that.
untar("C:/User/user/Downloads/dnmarchives/1776.tar.xz",
files = NULL,
list = FALSE, exdir = ".",
compressed = "xz", extras = NULL, verbose = FALSE, restore_times = TRUE,
tar = Sys.getenv("TAR"))
Edit: Thanks for all of the comments so far! The code is in base R. I have multiple datasets that I downloaded from Gwern's website. I'm just trying to open one to explore.
Base R includes the function untar(). On my Ubuntu 19.10 running R 3.6.2, default installation, the following was enough.
fls <- list.files(pattern = "\\.xz")
untar(fls[1], verbose = TRUE)
Note.
In the question, "dataset" is singular, but there were several datasets (plural) on that website. To download the files I used:
cmd <- "rsync"
args <- "--verbose rsync://78.46.86.149:873/dnmarchives/grams.tar.xz rsync://78.46.86.149:873/dnmarchives/grams-20150714-20160417.tar.xz ./"
od <- getwd()  # remember the original working directory
setwd('~/tmp')
system2(cmd, args)
setwd(od)      # restore it afterwards
Thanks everyone! I am not sure what was wrong with R for a while, but I reinstalled it. I ended up unzipping manually and loading the files.
I find that base R's untar() is a bit unreliable and/or slow on Windows.
What worked very well for me (on all platforms) was
library(archive)
archive_extract("C:/User/user/Downloads/dnmarchives/1776.tar.xz",
dir="C:/User/user/Downloads/dnmarchives")
It supports 'tar', 'ZIP', '7-zip', 'RAR', 'CAB', 'gzip', 'bzip2', 'compress', 'lzma' and 'xz' formats.
And one can also use it to read a CSV file directly from within an archive, without having to extract it first (read_csv() here comes from the readr package):
read_csv(archive_read("C:/User/user/Downloads/dnmarchives/1776.tar.xz", file = 1), col_types = cols())
On Debian or Ubuntu, first install the package xz-utils
$ sudo apt-get install xz-utils
Extract a .tar.xz the same way you would extract any tar.__ file.
$ tar -xf file.tar.xz
Done.
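For reference, a minimal command-line round trip looks like the following (the file and directory names here are made up for illustration):

```shell
# Create a small directory, pack it as .tar.xz, then extract it again.
mkdir -p demo
echo "hello" > demo/data.csv

tar -cJf demo.tar.xz demo   # -J selects xz compression
rm -r demo                  # remove the original so the extraction is visible

tar -xf demo.tar.xz         # -x extracts; modern tar detects xz automatically
cat demo/data.csv           # prints: hello
```

The same auto-detection is why `tar -xf file.tar.xz` works without any compression flag.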

How to import the Chinese version of SAS into R Markdown?

I already used the SASmarkdown package to run sas.exe from R;
however, my edm.sas7bdat contains Chinese words (necessary), so it needs to be run in the Chinese version of SAS.
Does anyone know how to use engine.path="sas(chinese)"?
I tried to configure "C:/Program Files/SASHome/SASFoundation/9.4/sas.exe" -CONFIG "C:/Program Files/SASHome/SASFoundation/9.4/nls/zh/sasv9.cfg"
but it didn't work.
require(SASmarkdown)
##library
saspath <- "C:/Program Files/SASHome/SASFoundation/9.4/sas.exe"
sasopts <- "-nosplash -ls 75"
knitr::opts_chunk$set(engine='sas', engine.path=saspath,
engine.opts=sasopts, comment="")
data new;
set "c:\SAS\Analysis1201\dataset\edm.sas7bdat";
run;
proc print data=new;
run;
You might need to use double backslashes (and note that -ls must use a plain ASCII hyphen, not a dash):
sasopts <- "-nosplash -ls 75 -config 'C:\\Program Files\\SASHome\\SASFoundation\\9.4\\nls\\zh\\sasv9.cfg'"

Error with importShapefile with PBSmapping package in R

I am receiving a sporadic error message with importShapefile in PBSmapping (version 2.63.37) in RStudio (0.97.318), running R version 2.15.2, platform: i386-w64-mingw32/i386 (32-bit). I also received the error while running previous versions of R and RStudio.
> ST6 = importShapefile("Data/pvi_stat_2002_utm.shp", projection="UTM", readDBF = TRUE)
Error in 1:nrow(dbf) : argument of length 0
> traceback()
2: cbind(1:nrow(dbf), dbf)
1: importShapefile("Data/pvi_stat_2002_utm.shp", projection = "UTM", readDBF = TRUE)
I only receive this error occasionally, perhaps 1 in every 10 times that I run the code. But once the error occurs in a session, it occurs repeatedly, and the command will not run successfully until I have closed R completely and reopened it. On one occasion I had to reboot the computer for it to work, as successively reopening R did not help.
I thought it might be a memory issue but sometimes I will get the error when no objects are in the workspace. And usually the code runs fine even if I have large objects loaded. In response to the error I have removed all objects from the workspace and even followed with gc(), but to no avail.
This is the only shapefile with which I have received the error, but as it is the only one that I use regularly, and since I cannot predict when the error will occur, my efforts with other shapefiles are inconclusive. I am not sure about uploading a shapefile to Stack Overflow; the zipped file is about 9 MB.
Have a look in the folder where your shapefile is. Is there actually a .dbf file? If there is, it sounds like it is empty, corrupted, or misnamed. Are you expecting your shapefile to have polygons with attributes? Can you try importShapefile(..., readDBF = FALSE)? Maybe you can make your data available through a Dropbox link or something?
Alternatively, have you tried rgdal::readOGR() or, my personal favourite, maptools::readShapePoly()? I find readShapePoly() to be extremely robust, and there are methods for coercing a SpatialPolygonsDataFrame from sp to a PolySet from PBSmapping.
If you really must use PBS have you tried...
require( maptools )
require( sp )
myshp <- readShapePoly("Data/pvi_stat_2002_utm")
myshpPBS <- SpatialPolygons2PolySet( myshp )
I am assuming that there is a .prj file with your shapefile, describing the projection information?
I'm using R-3.0.1 and PBS Mapping 2.66.53 with the NAVO Divisions shapefile from http://www.nafo.int/about/overview/gis/Divisions.zip. On Windows 7 x86_64 and OS X Snow Leopard (using macports R built for x86_64), the .dbf is being read properly, but it
sometimes fails using RHEL 5.9:
> library("PBSmapping", lib.loc="/home/gwhite/R/x86_64-unknown-linux-gnu-library/3.0")
-----------------------------------------------------------
PBS Mapping 2.66.53 -- Copyright (C) 2003-2013 Fisheries and Oceans Canada
[...]
-----------------------------------------------------------
> library("rgeos", lib.loc="/home/gwhite/R/x86_64-unknown-linux-gnu-library/3.0")
rgeos version: 0.2-19, (SVN revision 394)
GEOS runtime version: 3.3.8-CAPI-1.7.8
Polygon checking: TRUE
> layer='Divisions'
> divs = importShapefile(layer, projection='LL')
Error in 1:nrow(dbf) : argument of length 0
Using readDBF=F does allow the shapefile data to be read:
> divs = importShapefile(layer, projection='LL', readDBF=F)
So far, importShapefile() has been working in a freshly started R session.

Files in Collate field missing from package after build due to incorrect .Rbuildignore file

One of the functions in my package refuses to be added to the package source when built, and the package then fails R CMD check.
My package is located on github here. The file, calculate_latitude_and_longitude.R, certainly exists in the R directory:
$ ls R
calculate_latitude_and_longitude.R clean_coordinates_XBLOCK.R clean_crime_data.R
load_crime_data_by_ward.R clean_coordinates.R
clean_coordinates_YBLOCK.R dccrimedata-package.R
I am able to build the package, but the build does not include the file calculate_latitude_and_longitude.R for some reason. I verified that this file is skipped by browsing the R directory in the tarball.
Upon installing or running R CMD check dccrimedata_0.1.tar.gz I get the following error in the 00install.log file:
Error in .install_package_code_files(".", instdir) :
files in 'Collate' field missing from '/Users/erikshilts/workspace/dc_crime_data/dccrimedata.Rcheck/00_pkg_src/dccrimedata/R':
calculate_latitude_and_longitude.R
ERROR: unable to collate and parse R files for package ‘dccrimedata’
I've tried renaming the function, creating a new file, commenting out lines, removing roxygen tags, etc., but none of it helps get that function into the package.
Any idea what's going wrong?
The full code for the function is here:
#' Calculate Latitude and Longitude
#'
#' Calculates latitude and longitude from XBLOCK AND YBLOCK coordinates.
#' The coordinates are given in the NAD 83 projection, Maryland state plane,
#' with units in meters. Documentation for this calculation can be found in the
#' README file.
#'
#' @param crime_data data.frame of crime records
#' @return data.frame with two additional columns, latitude and longitude, with units in the standard GPS format
#' @export
calculate_latitude_and_longitude <- function(crime_data) {
xy_coords <- crime_data[, c('XBLOCK', 'YBLOCK')]
coordinates(xy_coords) <- c('XBLOCK', 'YBLOCK')
# NAD83, maryland state plane, units in meters
proj4string(xy_coords) <- CRS("+init=esri:102285")
# Transform to latitude and longitude for GPS
xy_coords <- spTransform(xy_coords, CRS("+init=epsg:4326"))
xy_coords <- as.data.frame(xy_coords)
names(xy_coords) <- c('longitude', 'latitude')
crime_data <- cbind(crime_data, xy_coords)
crime_data
}
My DESCRIPTION file looks like:
Package: dccrimedata
Title: An R package containing DC crime data.
Description: Crime data from DC from 2006 to mid-2012
Version: 0.1
License: GPL-3
Author: erik.shilts
Maintainer: <erik.shilts@opower.com>
Depends:
rgdal,sp
Collate:
'calculate_latitude_and_longitude.R'
'clean_coordinates_XBLOCK.R'
'clean_coordinates_YBLOCK.R'
'clean_coordinates.R'
'clean_crime_data.R'
'load_crime_data_by_ward.R'
'dccrimedata-package.R'
Update:
I isolated the change to any file with "longitude" in the name ("latitude" works fine). My .Rbuildignore file in this repo looks like this:
.git
.Rhistory
.Rcheck
\.tar\.gz$
out
You'll notice that I don't escape the period in .git, which caused it to ignore any file matching "Xgit" (X being any character), hence causing it to ignore my file calculate_latitude_and_longitude.R (because of the "git" in "longitude").
The .Rbuildignore file fails to escape the period in .git. Here's the .Rbuildignore file:
.git
.Rhistory
.Rcheck
\.tar\.gz$
out
The unescaped period in .git caused files containing "git" to be ignored, including anything named with the word "longitude", because of the "git" in the middle of the word.
The .Rbuildignore file should look like:
\.git
\.Rhistory
\.Rcheck
\.tar\.gz$
out
Fun with Regular Expressions!
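The effect of the missing escape can be reproduced with grep, using the file name from the question:

```shell
# In a regex, an unescaped "." matches ANY character, so the pattern .git
# also matches the "ngit" inside "longitude" (count of matching lines is 1):
echo "calculate_latitude_and_longitude.R" | grep -cE '.git'    # prints: 1

# With the dot escaped, only a literal ".git" matches, so the count is 0
# (grep also exits non-zero when nothing matches):
echo "calculate_latitude_and_longitude.R" | grep -cE '\.git'   # prints: 0
```

R CMD build applies the same Perl-style regex semantics to each .Rbuildignore pattern, which is why the escaped form `\.git` is needed.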