Python: xarray open_dataset unable to read the second, third, or more .nc files

I tried to read several different NetCDF files with xarray, but only the first file was read correctly: the second, third, and subsequent files were read back as the first file, without any error. Does anyone know how to solve this problem?
### read files
## VI
VI_terra = xr.open_dataset(data_vi+"MOD13A1.006_500m_aid0001.nc")
VI_aqua = xr.open_dataset(data_vi+"MYD13A1.006_500m_aid0001.nc")
## LAI
LAI = xr.open_dataset(data_lai+"MCD15A2H.006_500m_aid0001.nc")
## ET
ET_terra = xr.open_dataset(data_et+"MOD16A2GF.006_500m_aid0001.nc")
ET_aqua = xr.open_dataset(data_et+"MYD16A2GF.006_500m_aid0001.nc")
## Surface temperature
Tsurf_terra = xr.open_dataset(data_tsurf+"MOD11A2.006_1km_aid0001.nc")
Tsurf_aqua = xr.open_dataset(data_tsurf+"MYD11A2.006_1km_aid0001.nc")
But LAI is misread as VI_terra. Yet when I use ncdump to check LAI, the file itself has no problem (it differs from VI).

With xr.open_dataset, the .nc files remain open after reading.
The easiest solution is to use xr.load_dataset, which closes the .nc file automatically after reading.
If you need to stick with xr.open_dataset, you can place it in a with statement, or call .close() when finished.

Related

Reading multiple NetCDF files

I am trying to read multiple nc4 files in R. Below is the code I am using to execute this task:
library(ncdf4)
OSND_gpmr.df <- NULL
GPM_R.files <- list.files(path, pattern = '*.nc4', full.names = TRUE)
for (i in seq_along(GPM_R.files)) {
  nc_data <- nc_open(GPM_R.files[i])
  GPM_Prec <- ncvar_get(nc_data, 'IRprecipitation')
  x <- dim(GPM_Prec)
  ### note: start = c(42, 28) is the index in the image corresponding to the real coordinates of interest
  ## R reads images as lat, long.
  OSND_gpmr.spec <- ncvar_get(nc_data, 'IRprecipitation', start = c(42, 28), count = c(1, 1))
  OSND_gpmr.df <- rbind(OSND_gpmr.df, data.frame(OSND_gpmr.spec))
  nc_close(nc_data)
}
but I consistently get this error:
Error in R_nc4_open: No such file or directory.
But the list of files is correctly recognised as chr [1:1440], as shown under Values in the Global Environment.
Can someone please help me with what I am doing wrong?
Your working directory might be different from the files' location. Your GPM_R.files list may store only the file names from the given location, without the file paths, while nc_open() expects file names with the complete path.
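A minimal sketch of that fix: ask list.files() for complete paths (or join directory and file names yourself with file.path()), so each entry can be opened from any working directory:
# return full paths directly ...
GPM_R.files <- list.files(path, pattern = "\\.nc4$", full.names = TRUE)
# ... or join directory and file names yourself
GPM_R.files <- file.path(path, list.files(path, pattern = "\\.nc4$"))
# either way, each entry is now openable regardless of getwd()
nc_data <- nc_open(GPM_R.files[1])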

Save an object as a .R file within R, keeping formatting

I am writing an R script that reads in a template .R file and a list of dates, then creates a set of folders corresponding to the dates, each containing a copy of the .R file in which text substitution has been performed in R to customize the script for the given date.
I'm stuck on the part where I write out the .R file though, because the formatting and/or character representation keeps getting screwed up.
Here's a minimal, reproducible example:
RMapsDemo <- readLines("https://raw.githubusercontent.com/hack-r/RMapsDemo/master/RMapsDemo.R")
RMapsDemo <- gsub("## File: RMapsDemo.R", "## File: RMapsDemo.R ####", RMapsDemo)
save(RMapsDemo, file = "RMapsDemo.R") # Doesn't work right
save(RMapsDemo, file = "RMapsDemo.R", ascii = T) # Doesn't work right
dput(RMapsDemo, file = "RMapsDemo.R") # Close, but no cigar
dput(RMapsDemo, file = "RMapsDemo.R", control = c("keepNA", "keepInteger")) # Close, but no cigar
Ricardo Saporta pointed out the solution in the comments -- use writeLines.
I feel stupid for not thinking of this myself. It works beautifully.
writeLines(RMapsDemo, con = "RMapsDemo.R")
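For anyone wondering why the other calls fail, a minimal sketch with a stand-in character vector: save() serializes the object in RData format regardless of the .R extension, dput() writes an R expression that would rebuild the vector, and only writeLines() writes each string verbatim, which is what a script file needs.
lines <- c("## File: RMapsDemo.R ####", "library(maps)")
save(lines, file = "out1.R")       # RData serialization, not a runnable script
dput(lines, file = "out2.R")       # c("## File: ...", ...), an R expression
writeLines(lines, con = "out3.R")  # the original text, line by line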

Package compilation and relative path

I must be very confused. I have looked around but cannot find a suitable answer, and I have a feeling I am doing something wrong.
Here is a minimalist example:
My function test imports a file from a folder and does subsequent analysis on that file. I have dozens of compressed files in the folder specified by path = "inst/extdata/input_data":
test = structure(function(path, letter) {
  file = paste0(path, "/file_", letter, ".tsv.gz")
  data = read.csv(file, sep = "\t", header = F, quote = "\"", stringsAsFactors = F)
  return(mean(data$var1))
}, ex = function() {
  path = "inst/extdata/input_data"
  m1 = test(path, "A")
})
I am building a package with the function in the folder R/ of the package directory.
When I set the working directory to the package parent and run the example line by line, everything goes fine. However, when I check the package with R CMD check, it gives me the following:
cannot open file 'inst/extdata/input_data/file_A.tsv.gz': No such file or directory
Error in file(file, "rt") : cannot open the connection
I thought that in checking and building the package, the working directory is automatically set to the parent directory of the package (in my case "C:/Users/yuhu/R/Projects/ABCDpackage"), but it seems not to be the case.
What is the best practice in this case? I would rather avoid converting all the data to .rda format and putting it in the data folder, as there are too many files. Is there a way to build the package and have the function example use a path relative to wherever the package is located? This would also be helpful when the package is distributed (so it should not be my own path).
Many thanks for your help.
When R CMD check (or the user later for that matter) runs the example, you need to provide the full path to the file! You can build that path easily with the system.file or the path.package command.
If your package is called foo, the following should do the trick:
}, ex = function() {
  path = paste0(system.file(package = "foo"), "/extdata/input_data")
  m1 = test(path, "A")
})
You might want to add a file.path command somewhere to be OS independent.
Since read.csv is just a wrapper for read.table, I would not expect any fundamental difference with respect to reading compressed files.
Comment: R removes the "inst/" part of the directory when it builds and installs the package. This thread has a discussion on the inst directory.
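Putting those two comments together, a sketch of what the example path could look like (still assuming the package is called foo): system.file() accepts the subdirectory components directly, which keeps the path OS independent and already accounts for the stripped "inst/".
# "inst/" is stripped on install, so "extdata" sits at the installed package root
path <- system.file("extdata", "input_data", package = "foo")
m1 <- test(path, "A")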
I think you might just want to go with read.table... At any rate give this a try.
fopen <- file(paste0(path, "/file_", letter, ".tsv.gz"), open = "rt")
data <- read.table(fopen, sep = "\t", header = F, quote = "\"", stringsAsFactors = F)
close(fopen)  # close the connection once the data are read
Refinement:
At the end of the day, I think your problem is mainly that you are using read.csv instead of read.table, which can open gzipped files directly. So, just to be sure, here is a little experiment I did.
Experiment:
# gzip a .csv file (in this case example_A.csv) that exists in my working directory
system("gzip example_A.csv")
# just wanted to pass the path as a variable, like you did
path <- getwd()
file <- paste0(path, "/example_", "A", ".csv.gz")
# I think these are the only options you need; stringsAsFactors = FALSE is a good one
data <- read.table(file, sep = ",", header = FALSE, stringsAsFactors = FALSE)
data <- data[1:5, 1:7]  # a subset of the data
  V1       V2     V3      V4     V5      V6     V7
1 id Scenario Region    Fuel  X2005   X2010  X2015
2  1 BSE9VOG4     R1 Biomass      0  2.2986 0.8306
3  2 BSE9VOG4     R1    Coal 7.4339 13.3548 9.2918
4  3 BSE9VOG4     R1     Gas 1.9918  2.4623 2.5558
5  4 BSE9VOG4     R1     LFG 0.2111  0.2111 0.2111
At the end of the day (I say that too much), you can be certain that the problem lies either in the method you used to read the gzipped files or in the text string you've constructed for the file names (I haven't looked into the latter). At any rate, best of luck with the package; I hope this turns the tide.

DEM to Raster for multiple files

I'm trying to design a program to help me convert 1000+ DEM files into USGS raster files, using the method arcpy.DEMToRaster_conversion in ArcGIS. My idea is to use an OpenFileDialog to allow multiple selection of these files, then use an array to save the names, use each name as the inDEM, and save each outRaster in .tif format.
file_path = tkFileDialog.askopenfilename(filetypes=(("DEM", "*.dem"),),multiple=1)
This is how I open multiple files in the dialog, but I'm not sure how to store the names so as to carry out the following steps. Can someone help me?
This code will find all DEMs in a folder, apply the conversion function, and save the output TIFFs to another folder:
#START USER INPUT
datadir = "Y:/input_rasters/"     # directory where the DEM files are located
outputdir = "Y:/output_rasters/"  # existing directory where the output TIFFs are to be saved
#END USER INPUT
import os
import arcpy  # needed for the environment settings and the conversion call

arcpy.env.overwriteOutput = True
arcpy.env.workspace = datadir
arcpy.env.compression = "LZW"
DEMList = arcpy.ListFiles("*.dem")
for f in DEMList:
    print "starting %s" % f
    rastername = os.path.join(datadir, f)
    outrastername = os.path.join(outputdir, f[:-4] + ".tif")
    arcpy.DEMToRaster_conversion(rastername, outrastername)

How to download a gzipped file in R without saving it to the computer

I am a new user of R.
I have some txt.gz files on the web of approximate size 9x500000.
I'm trying to uncompress a file and read it straight to R with read.table().
I have used this code (url censored):
LoadData <- function() {
  con <- gzcon(url("http://"))
  raw <- textConnection(readLines(con, n = 25000))
  close(con)
  dat <- read.table(raw, skip = 2, na.strings = "99.9")
  close(raw)
  return(dat)
}
The problem is that if I read more lines with readLines, the program takes much more time to do what it should.
How can I do this in reasonable time?
You can make a temporary file like this:
tmpfile <- tempfile(tmpdir = getwd())
file.create(tmpfile)
download.file(url, tmpfile)
# do your stuff
file.remove(tmpfile)  # delete the tmpfile
Don't do this.
Each time you want to access the file, you'll have to re-download it, which is both time consuming for you and costly for the file hoster.
It is better practice to download the file (see download.file) and then read in a local copy in a separate step.
You can decompress the file with untar(..., compressed = "gzip").
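A minimal sketch of that workflow, assuming the files are plain gzipped text rather than tar archives (in which case gzfile() can stand in for untar()) and using a placeholder URL:
tmp <- tempfile(fileext = ".txt.gz")
download.file("http://example.com/data.txt.gz", tmp, mode = "wb")  # placeholder URL
dat <- read.table(gzfile(tmp), skip = 2, na.strings = "99.9")  # decompresses on the fly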
