I have just downloaded some climate data in grib format. I want to use "R" to convert it to NetCDF format.
Furthermore, as the file consists of different variables, I would like to extract one variable at a time into individual files.
It's hard to answer this without your specific file. You should look into producing reproducible examples, especially if you're posting to the R board.
For R, check out library(raster) and library(ncdf4). I just grabbed the first grib1 file I saw, and put together a quick example.
library(raster)
library(ncdf4)
download.file(url = 'ftp://ftp.hpc.ncep.noaa.gov/grib/20130815/p06m_2013081500f030.grb', destfile = 'test.grb')
(r <- raster('test.grb'))
n <- writeRaster(r, filename = 'netcdf_in_youR_comp.nc', overwrite = TRUE)
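For the second part of the question (one output file per variable), a rough sketch along the same lines is to read the file as a multi-layer brick and write each layer separately. The helper name `split_layers_to_nc()` and the output naming scheme are made up for illustration, and it assumes the raster and ncdf4 packages are installed. Note that a layer read from GRIB is one GRIB message, so a variable stored at several levels or time steps will appear as several layers.

```r
# Hypothetical helper: write every layer of a multi-layer raster object
# to its own NetCDF file (requires the raster and ncdf4 packages).
split_layers_to_nc <- function(x, out_dir = ".", prefix = "layer") {
  n <- raster::nlayers(x)
  out_files <- character(n)
  for (i in seq_len(n)) {
    out_files[i] <- file.path(out_dir, sprintf("%s_%02d.nc", prefix, i))
    # x[[i]] extracts layer i; format = "CDF" selects NetCDF output
    raster::writeRaster(x[[i]], filename = out_files[i],
                        format = "CDF", overwrite = TRUE)
  }
  out_files
}

# Usage (not run): split_layers_to_nc(raster::brick("test.grb"))
```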
1. rNOMADS
The rNOMADS package has a function, ReadGrib, that wraps external tools so that GRIB files can be read from R.
2. converting to netcdf
If the GRIB data is on a regular lat-lon grid, it is probably easier to convert it to NetCDF first, since support for reading NetCDF is more developed (and you are probably already used to it).
You can convert GRIB in several ways; two of the easiest are:
CDO:
cdo -f nc copy test.grb test.nc
Use "-f nc4" if you want the output in NetCDF-4 format.
ECCODES (on a Mac, install with brew install eccodes):
grib_to_netcdf -o test.nc test.grb
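You can also use NCL, if it is installed on your computer:
library(ncdf)
# Run the converter as a shell command; intern = TRUE captures its console output
system("ncl_convert2nc xxxx.grb", intern = TRUE)
# Replace "result.nc" with the name of the file the converter produced
my.nc <- open.ncdf("result.nc")
print(my.nc)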
I am trying to make my current project reproducible, and so am creating a master document (eventually a .rmd file) that will be used to call and execute several other documents. This way, other investigators and I only need to open and run one file.
There are three layers to the current setup: master file, 2 read-in files, 2 databases. The master file calls the read-in files using source(), and the read-in files parse the .csv databases and apply labels.
The read-in files and the databases are generated automatically with the data management software I'm currently using (REDCap) each time I download the updated data.
However, the read-in files have a line of code that removes all of the objects in my environment. I would like to edit the read-in files directly from the master file so that I do not have to open the read-in files individually each time I run my report. Specifically, since all the read-in files are the same, I would like to remove line #2 in each.
I've tried searching Google, and tried file.edit(), but have been unable to find anything. Not even sure it is possible, but figured I would ask. Let me know if I can improve this question or if you need any additional code to answer it. Thanks!
Current relevant master code (edited for generality):
source("read-in1")
source("read-in2")
Current relevant read-in file code (same in each file, except for the database name):
#Clear existing data and graphics
rm(list=ls())
graphics.off()
#Load Hmisc library
library(Hmisc)
#Read Data
data=read.csv('database.csv')
#Setting Labels
[read-in code truncated]
Additional details:
OS: Windows 7 Professional x86
R version: 3.1.3
R Studio version: 0.99.441
You might try readLines() and something like the following (which was simplified greatly by a suggestion from @Hong Ooi below):
eval(parse(text = readLines("read-in1.R")[-2]))
My original solution which was much more pedantic:
f <- file("read-in1.R", open="r")
t <- readLines(f)
close(f)
for (l in t[-2]) { eval(parse(text=l)) }
The for() loop just parses and evaluates each line from the text file except for the second one (that's what the -2 index value does). Because each line is parsed on its own, this only works if no expression spans several lines. If you're reading and writing longer files, the following will be much faster than the second option, though still less preferable than @Hong Ooi's:
f <- file("read-in1.R", open="r")
t <- readLines(f)
close(f)
f <- file("out.R", open="w")
o <- writeLines(t[-2], f)
close(f)
source("out.R")
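If the position of the offending line might change between REDCap exports, a variant is to drop lines by content rather than by index. A minimal sketch in base R, where `source_without()` is a hypothetical helper name and the default pattern is the exact text from the read-in file shown in the question:

```r
# Hypothetical helper: evaluate a script while skipping every line that
# contains a given literal string (here the environment-clearing call).
source_without <- function(path, drop = "rm(list=ls())", envir = parent.frame()) {
  lines <- readLines(path, warn = FALSE)
  # fixed = TRUE matches the text literally, not as a regular expression;
  # a differently spaced call such as rm(list = ls()) would not be caught
  keep <- !grepl(drop, lines, fixed = TRUE)
  # Parse the remaining lines together so multi-line expressions stay intact
  eval(parse(text = lines[keep]), envir = envir)
  invisible(sum(!keep))  # number of lines that were skipped
}

# Usage (not run): source_without("read-in1.R")
```

Nothing is written to disk, so the generated read-in files stay exactly as REDCap produced them.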
Sorry I'm so late in noticing this question, but you may want to investigate getting access to the REDCap API and using either the redcapAPI package or the REDCapR package. Both of those packages will allow you to export the data from REDCap directly into R without having to use the download scripts. redcapAPI will even apply all the formats and dates (REDCapR might do this now too. It was in the plan, but I haven't used it in a while).
You could try this. It just calls some shell commands: (1) renames the file, then (2) copies all lines not containing rm(list=ls()) to a new file with the same name as the original file, then (3) removes the copy.
files_to_change <- c("read-in1.R", "read-in2.R")
for (f in files_to_change) {
old <- paste0(f, ".old")
system(paste("cmd.exe /c ren", f, old))
system(paste('cmd.exe /c findstr /v "rm(list=ls())"', old, ">", f))
system(paste("cmd.exe /c del", old))
}
After calling this loop you should have
#Clear existing data and graphics
graphics.off()
#Load Hmisc library
library(Hmisc)
#Read Data
data=read.csv('database.csv')
#Setting Labels
in your read-in*.R files. You could put this in a batch script
@echo off
ren "%~f1" "%~nx1.old"
findstr /v "rm(list=ls())" "%~f1.old" > "%~f1"
del "%~f1.old"
say, "example.bat", and call that in the same way using system.
I am trying to read a .sas7bdat file in R. When I use the command
library(sas7bdat)
read.sas7bdat("filename")
I get the following error:
Error in read.sas7bdat("county2.sas7bdat") : file contains compressed data
I do not have experience with SAS, so any help will be highly appreciated.
Thanks!
According to the sas7bdat vignette [vignette('sas7bdat')], COMPRESS=BINARY (or COMPRESS=YES) is not currently supported as of 2013 (and this was the vignette active on 6/16/2014 when I wrote this). COMPRESS=CHAR is supported.
These are basically internal compression routines, intended to make filesizes smaller. They're not as good as gz or similar (not nearly as good), but they're supported by SAS transparently while writing SAS programs. Obviously they change the file format significantly, hence the lack of implementation yet.
If you have SAS, you need to write these to an uncompressed dataset.
options compress=no;
libname lib '//drive/path/to/files';
data lib.want;
set lib.have;
run;
That's the simplest way (of many), assuming you have a libname defined as lib as above and change have and want to names that are correct (have should be the filename without extension of the file, in most cases; want can be changed to anything logical with A-Z or underscore only, and 32 or fewer characters).
If you don't have SAS, you'll have to ask your data provider to make the data available uncompressed, or in a different format. If you're getting this from a PUDS somewhere on the web, you might post where you're getting it from and there might be a way to help you identify an uncompressed source.
This admittedly is not a pure R solution, but in many situations (e.g. if you aren't on a pc and don't have the ability to write the SAS file yourself) the other solutions posted are not workable.
Fortunately, Python has a module (https://pypi.python.org/pypi/sas7bdat) which supports reading compressed SAS data sets - it's certainly better using this than needing to acquire SAS if you don't already have it. Once you extract the file and save it to text via Python, you can then access it in R.
from sas7bdat import SAS7BDAT
import pandas as pd
InFileName = "myfile.sas7bdat"
OutFileName = "myfile.txt"
with SAS7BDAT(InFileName) as f:
    df = f.to_data_frame()
df.to_csv(path_or_buf = OutFileName, sep = "\t", encoding = 'utf-8', index = False)
The haven package can read compressed SAS-files:
library(haven)
df <- read_sas("sasfile.sas7bdat")
However, it reads only SAS files compressed with compress=char, not those compressed with compress=binary.
So haven will be able to read this SAS-file:
data output.compressed_data_char (compress=char);
set inputdata;
run;
But not this SAS-file:
data output.compressed_data_binary (compress=binary);
set inputdata;
run;
https://cran.r-project.org/package=haven
http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001002773.htm
"RevoScaleR" is a good package for reading SAS data sets (compressed or uncompressed). You can use the rxImport function of this package. Here is an example:
# Importing library
library(RevoScaleR)
# Reading data
R_df_name <- rxImport("fake_path/file_name.sas7bdat")
The speed of this function is far better than haven/sas7bdat/sas7bdat.parso. I hope this helps anyone who struggles to read SAS data sets in R.
Cheers!
I found R to be the easiest for this kind of challenge, especially with compressed sas7bdat files. It takes three simple lines:
library(haven)
data <- read_sas("yourfile.sas7bdat")
and then transform it to csv
write.csv(data,"data.csv")
How can I read and write the following file using R?
https://www.dropbox.com/s/vlnrlxjs7f977zz/3B42_daily.2012.11.23.7.nc
In other words, I would like to read the "3B42_daily.2012.11.23.7.nc" file and write it back out with the same structure.
Best regards
The ncdf package has functions to do this. You should also read other Q&A on this site tagged with netcdf and r.
Basically to read a netcdf file:
library(ncdf)
a <- open.ncdf('your/path/to/your/file.nc') #that opens a connection to the file
Then function get.var.ncdf helps you extract the data, variable by variable.
The process to write one is described in this Q&A.
The idea is to create dimensions first using dim.def.ncdf then the variables with var.def.ncdf and finally the file itself using create.ncdf.
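The same three steps can be sketched with ncdf4, the successor to ncdf, where the functions are named ncdim_def, ncvar_def and nc_create (and nc_open / ncvar_get for reading). This is only an illustration: the helper names, the dimension names and the variable name "precip" are assumptions, not taken from the 3B42 file.

```r
# Hypothetical helpers; they require the ncdf4 package to be installed.
write_grid_nc <- function(path, lon, lat, values,
                          var_name = "precip", units = "mm") {
  # 1. define the dimensions
  lon_dim <- ncdf4::ncdim_def("lon", "degrees_east", lon)
  lat_dim <- ncdf4::ncdim_def("lat", "degrees_north", lat)
  # 2. define the variable on those dimensions
  var_def <- ncdf4::ncvar_def(var_name, units, list(lon_dim, lat_dim),
                              missval = -9999, prec = "double")
  # 3. create the file, then write the data into it
  nc <- ncdf4::nc_create(path, var_def)
  on.exit(ncdf4::nc_close(nc))
  ncdf4::ncvar_put(nc, var_def, values)
  invisible(path)
}

read_grid_nc <- function(path, var_name) {
  nc <- ncdf4::nc_open(path)
  on.exit(ncdf4::nc_close(nc))
  ncdf4::ncvar_get(nc, var_name)
}
```

To reproduce an existing file's structure, the dimension values, units and variable names would be read from the opened file first and passed on to the definitions.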
I am using R to work with meteorological data. I proceed in two steps:
convert GRIB to NetCDF using the command-line tool ncl_convert2nc from the NCAR Command Language (NCL)
use the ncdf package in R to import the NetCDF data.
I still have one problem:
2- For some particular GRIB files, the conversion with the NCAR tool does not work. Are there other ways or tricks (other than conversion to NetCDF) to read GRIB files in R?
Problem answered by Dirk: 1- I would like to process many files automatically within R. Can I call ncl_convert2nc from within R? (answered by Dirk Eddelbuettel below)
Regarding question 1, the answer is 'Yes' -- see help(system) and the intern = TRUE option if you want to capture results.
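For batch processing, a minimal sketch might loop over the files and keep whatever the converter prints. `convert_gribs()` is a hypothetical helper name, and it uses system2(), the sibling of system(), with stdout = TRUE to capture output. It assumes the tool accepts a single GRIB file name as its argument, as in the command-line usage above.

```r
# Hypothetical helper: run an external converter once per GRIB file
# and collect the text it prints.
convert_gribs <- function(grib_files, tool = "ncl_convert2nc") {
  # Fail early if the executable cannot be found on the PATH
  if (!nzchar(Sys.which(tool))) {
    stop("converter not found on PATH: ", tool)
  }
  logs <- lapply(grib_files, function(f) {
    # stdout/stderr = TRUE return the printed lines as a character vector
    system2(tool, args = shQuote(f), stdout = TRUE, stderr = TRUE)
  })
  names(logs) <- grib_files
  logs
}

# Usage (not run): logs <- convert_gribs(list.files(pattern = "\\.grb$"))
```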
rgdal can also do it, but it is less flexible and requires more care and detail than ncdf or RNetCDF, and it depends on your GDAL/rgdal build including the GRIB driver.
ncl_convert2nc seems to be the best solution. However, if the structure of the data is a little more complicated, I use GrADS to convert the GRIB file to ASCII (e.g. .csv), and then it is possible to create a NetCDF file using the ncdf4 package for R. GrADS also supports rewriting GRIB to NetCDF, but it is limited to one variable at a time.
Instead of calling ncl_convert2nc from R, I can suggest two alternatives:
1. CDO conversion
Another quick and easy command line solution is to use cdo to convert to netcdf to read in:
cdo -f nc copy file.grb file.nc
If you want to output a netcdf4 file you specify "-f nc4".
One potential glitch with this approach is if your grib file has more than one time axis (e.g. for multiple seasonal forecasts) which can cause issues with the conversion.
2. ECCODES conversion
Alternatively, ecCodes offers a GRIB converter that is very robust and can handle the cases of multiple time axes that usually cause CDO- and NCL-based conversions to fail.
The command is called grib_to_netcdf:
grib_to_netcdf -o output.nc input_grib.grb
So far, grib_to_netcdf has been able to handle every grib file I have thrown at it without problems.
Another solution is to use the wgrib/wgrib2 software (http://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/) and dump your GRIB-1/GRIB-2 file directly to CSV format, e.g.:
/path/to/your/wgrib2 input_file.grb -csv output_file.csv
Then it may be read directly in R...
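A small sketch of that last step in base R. It assumes the CSV has no header row and seven columns in the order reference time, valid time, variable, level, longitude, latitude, value; check a line of your own output before relying on this. The helper and column names are made up for illustration.

```r
# Hypothetical helper: read a header-less CSV dump into a data frame
# with descriptive column names (the column order is an assumption).
read_wgrib2_csv <- function(path) {
  read.csv(path, header = FALSE, stringsAsFactors = FALSE,
           col.names = c("ref_time", "valid_time", "variable",
                         "level", "lon", "lat", "value"))
}

# Split the rows into one data frame per variable name
split_by_variable <- function(df) {
  split(df, df$variable)
}

# Usage (not run):
# fields <- split_by_variable(read_wgrib2_csv("output_file.csv"))
```

Each element of the resulting list can then be handled, or written out, one variable at a time.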