I have a rainfall NetCDF file and a temperature NetCDF file. I have no previous experience with R and don't really understand it, so I tried this script and got errors:
library(ncdf4)
library(data.table)
library(raster)
library(metR)
library(rgdal)
tmax2 <- nc_open("E:/SKRIPSI/prec-tmin-tmax-sumut/tmax2006-2022.nc")
> names(tmax2$var)
[1] "TASMAX"
> names(tmax2$dim)
[1] "NTIME1" "XAXIS23_301" "YAXIS26_132" "M2"
> info.file <- GlanceNetCDF(tmaxsumut)
Error in GlanceNetCDF(tmaxsumut) : could not find function "GlanceNetCDF"
>
> # select location & time
> lat <- 0:4
> lon <- 98:100
> wkt <- seq(from = as.Date("2017-01-01"),
+ to = as.Date("2020-12-31"),
+ by = "days")
>
> tmax2 <- ReadNetCDF(tmaxsumut, vars="TASMAX",
+ subset=list(XAXIS23_301=lon, YAXIS26_132= lat, NTIME1=wkt))
Error in ReadNetCDF(tmaxsumut, vars = "TASMAX", subset = list(XAXIS23_301 = lon, :
could not find function "ReadNetCDF"
You are not describing what you want to achieve, making it very difficult to help. Feel free to edit your question to clarify your goals (do not use the comments for that).
I am guessing that you want to extract values from the ncdf file for point (long/lat) locations. If so, similar questions have been asked many times on this site, so you could probably do some more searches.
With standards-compliant NetCDF files you can simply do:
library(terra)
tmax2 <- rast("E:/SKRIPSI/prec-tmin-tmax-sumut/tmax2006-2022.nc", "TASMAX")
lat <- 1:3
lon <- 98:100
points <- vect(cbind(lon, lat))
e <- extract(tmax2, points)
This only works if the ncdf file has regular raster data. That is not guaranteed, but you provide no information about the file, nor do you provide the file.
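To illustrate what "regular raster data" means here, the following is a minimal base-R sketch of a nearest-cell lookup on evenly spaced lon/lat axes. It is only a conceptual illustration, not what terra does internally. The axes `grid_lon` and `grid_lat`, the array `field`, and the helper `nearest_index` are hypothetical stand-ins and are not read from the actual file.

```r
# Hypothetical regular grid standing in for the file's axes (not read from the real file)
grid_lon <- seq(95, 101, by = 0.25)
grid_lat <- seq(-1, 5, by = 0.25)

# Fake field: each value encodes its own coordinates so the lookup is easy to check
field <- outer(grid_lon, grid_lat, function(x, y) x * 100 + y)

# Index of the axis value closest to the requested coordinate
nearest_index <- function(axis, value) which.min(abs(axis - value))

# Point locations, paired as (98, 1), (99, 2), (100, 3)
pts <- data.frame(lon = 98:100, lat = 1:3)
pts$value <- mapply(function(x, y) {
  field[nearest_index(grid_lon, x), nearest_index(grid_lat, y)]
}, pts$lon, pts$lat)

print(pts)
```

If the axes in the file are not evenly spaced or are two-dimensional (a curvilinear grid), this kind of simple index lookup no longer applies, which is the caveat mentioned above.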
Every time I run the script it gives me this error: Error in { : task 1 failed - "could not find function "%>%""
I have already checked the related posts on this forum and tried to apply their suggestions, but none of them works.
Please advise on a solution.
Please note: I have only 2 cores on my PC.
My code is as follows:
library(dplyr) # For basic data manipulation
library(ncdf4) # For creating NetCDF files
library(tidync) # For easily dealing with NetCDF data
library(ggplot2) # For visualising data
library(doParallel) # For parallel processing
MHW_res_grid <- readRDS("C:/Users/SUDHANSHU KUMAR/Desktop/MTech Project/R/MHW_result.Rds")
# Function for creating arrays from data.frames
df_acast <- function(df, lon_lat){
# Force grid
res <- df %>%
right_join(lon_lat, by = c("lon", "lat")) %>%
arrange(lon, lat)
# Convert date values to integers if they are present
if(lubridate::is.Date(res[1,4])) res[,4] <- as.integer(res[,4])
# Create array
res_array <- base::array(res[,4], dim = c(length(unique(lon_lat$lon)), length(unique(lon_lat$lat))))
dimnames(res_array) <- list(lon = unique(lon_lat$lon),
lat = unique(lon_lat$lat))
return(res_array)
}
# Wrapper function for last step before data are entered into NetCDF files
df_proc <- function(df, col_choice){
# Determine the correct array dimensions
lon_step <- mean(diff(sort(unique(df$lon))))
lat_step <- mean(diff(sort(unique(df$lat))))
lon <- seq(min(df$lon), max(df$lon), by = lon_step)
lat <- seq(min(df$lat), max(df$lat), by = lat_step)
# Create full lon/lat grid
lon_lat <- expand.grid(lon = lon, lat = lat) %>%
data.frame()
# Acast only the desired column
dfa <- plyr::daply(df[c("lon", "lat", "event_no", col_choice)],
c("event_no"), df_acast, .parallel = T, lon_lat = lon_lat)
return(dfa)
}
# We must now run this function on each column of data we want to add to the NetCDF file
doParallel::registerDoParallel(cores = 2)
prep_dur <- df_proc(MHW_res_grid, "duration")
prep_max_int <- df_proc(MHW_res_grid, "intensity_max")
prep_cum_int <- df_proc(MHW_res_grid, "intensity_cumulative")
prep_peak <- df_proc(MHW_res_grid, "date_peak")
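A possible explanation, offered as an assumption since the worker setup is not shown: on Windows, registerDoParallel(cores = 2) starts separate R worker sessions, and packages attached with library() in the main session are not attached in those workers, so %>% is missing when df_acast runs there. The sketch below uses only the base parallel package to show the effect, with the tools package standing in for dplyr.

```r
library(parallel)

# Start two fresh worker sessions; they do not inherit packages attached in this session
cl <- makeCluster(2)

# 'tools' ships with R but is not attached by default, so it stands in for dplyr here
attached_before <- unlist(clusterEvalQ(cl, "package:tools" %in% search()))

# Attach the package on every worker, as one would need to do for dplyr
invisible(clusterEvalQ(cl, library(tools)))
attached_after <- unlist(clusterEvalQ(cl, "package:tools" %in% search()))

stopCluster(cl)

print(attached_before)
print(attached_after)
```

For the plyr::daply call itself, the documented .paropts argument forwards options to foreach, so something like .paropts = list(.packages = "dplyr") may be enough. Writing dplyr::right_join and dplyr::arrange explicitly, together with a library(dplyr) call at the top of df_acast, is another option.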
I have a bunch of NZ Map Grid coordinates, which I want to convert to lat/long. Based on this question, here is what I tried.
library(sp)
options(digits = 11) # to display to greater d.p.
Attempt 1:
proj4string <- "+proj=nzmg +lat_0=-41.0 +lon_0=173.0 +x_0=2510000.0
+y_0=6023150.0 +ellps=intl +units=m"
p <- proj4::project(c(2373200, 5718800), proj = proj4string, inverse=T)
Attempt 2
dat <- data.frame(id = c(1), x = c(2373200) , y = c(5718800))
sp::coordinates(dat) = ~x+y
sp::proj4string(dat) = CRS('+init=epsg:27200')
data_wgs84 <- spTransform(dat, CRS('+init=epsg:4326'))
print(data_wgs84)
If I run my coordinates through the LINZ coordinate conversion tool I get a slightly different result, which is the "true" result.
Results:
171.30179199, -43.72743909 # attempt 1 - ~200 m off LINZ
171.30190004, -43.72577765 # attempt 2 - a few metres off LINZ
171.30189464, -43.72576664 # LINZ
Based on Mike T's answer I should be using a "distortion grid transformation method" and he links to a "nzgd2kgrid0005.gsb grid shift file".
My Question: Is it possible to do this conversion using R without downloading additional files (nzgd2kgrid0005.gsb)? I want to share my code with others without them having to download any additional files.
Any advice much appreciated.
It turns out to be pretty simple: if you have the rgdal package installed, the required nzgd2kgrid0005.gsb file is included, so you don't need to download anything extra.
You just need to use the full PROJ.4 string as outlined in Mike T's answer.
dat <- data.frame(id = c(1), x = c(2373200) , y = c(5718800))
sp::coordinates(dat) = ~x+y
proj4string <- "+proj=nzmg +lat_0=-41 +lon_0=173 +x_0=2510000 +y_0=6023150
+ellps=intl +datum=nzgd49 +units=m +towgs84=59.47,-5.04,187.44,0.47,-0.1,1.024,-4.5993
+nadgrids=nzgd2kgrid0005.gsb +no_defs"
sp::proj4string(dat) = sp::CRS(proj4string)
data_wgs84 <- sp::spTransform(dat, sp::CRS('+init=epsg:4326'))
as.data.frame(data_wgs84)
id x y
1 171.3018946 -43.72576664
Which is the same as the output from the LINZ coordinate conversion tool. Hopefully this saves someone else a bit of time.
I'm trying to plot precipitation data on a 2.5 x 2.5 degree grid with the country contour on top. The data is available at this link: https://www.esrl.noaa.gov/psd/data/gridded/data.cmap.html ("Mean (Enhanced Monthly)").
I was using the answer from: R - Plotting netcdf climate data. However I get an error.
This is what I have done:
library(ncdf4)
ncpath <- "C:/Users/"
ncname <- "precip.mon.mean"
ncfname <- paste(ncpath,ncname,".nc",sep="")
ncin <- nc_open(ncfname)
lon <- ncvar_get(ncin, "lon")
nlon <- dim(lon)
lat <- ncvar_get(ncin, "lat")
nlat <- dim(lat)
dname <-"precip"
ppt_array <- ncvar_get(ncin,dname)
dim(ppt_array)
pres <- ppt_array[ , ,25:444]
precip <- array(pres, , dim=c(nlon, nlat, 12, ano))
prec <- precip[97:115,21:34, ,1:ano] #I just want a piece of the map
Here is where I have the problem:
latlat <- rev(lat)
precipit <- prec[ , ,1,1] %Just to see if it works
lonlon <- lon-180
image(lonlon,latlat,precipit)
library(maptools)
data(wrld_simpl)
#however I don't know if this will work to plot just a portion of the map
plot(wrld_simpl,add=TRUE)
I get several errors, could someone please help?
EDIT:
The errors I got were these:
> image(lonlon,latlat,precipit)
Error in image.default(lonlon, latlat, precipit) :
increasing 'x' and 'y' values expected
> library(maptools)
> data(wrld_simpl)
> plot(wrld_simpl,add=TRUE)
Error in polypath(x = mcrds[, 1], y = mcrds[, 2], border = border, col = col, :
plot.new has not been called yet
There are several things that need to be fixed:
1) ano does not seem to be defined anywhere. Perhaps it was defined interactively?
precip <- array(pres, , dim=c(nlon, nlat, 12, ano))
2) It appears you intended to add a comment but used an infix operator instead - replace this with a #, like so:
precipit <- prec[ , ,1,1] # Just to see if it works
3) If you only want part of the map, you can either ensure that both the lat and lon arrays match the area you want to show (essentially cropping the world map), or set the values outside the region you want to highlight to NA (which will appear similar to the map here).
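A minimal sketch of the first option in point 3, assuming a 2.5-degree grid whose latitude axis is stored north to south. The arrays `lon`, `lat` and `z` below are synthetic stand-ins, not the CMAP data. The idea is to subset the axes with the same indices as the data, then reorder axes and data together so that image() receives increasing x and y values.

```r
# Synthetic stand-in for a gridded field: 144 longitudes x 72 latitudes,
# with latitude stored north to south (decreasing)
lon <- seq(1.25, 358.75, by = 2.5)
lat <- seq(88.75, -88.75, by = -2.5)
z <- outer(lon, lat, function(x, y) cos(y * pi / 180) * (1 + sin(x * pi / 180)))

# Select the region by index, using the SAME indices for the axes and the data
ix <- 97:115
iy <- 21:34
x_sub <- lon[ix]
y_sub <- lat[iy]
z_sub <- z[ix, iy]

# image() requires strictly increasing x and y: reorder axes and data together
ox <- order(x_sub)
oy <- order(y_sub)
x_plot <- x_sub[ox]
y_plot <- y_sub[oy]
z_plot <- z_sub[ox, oy]

image(x_plot, y_plot, z_plot, xlab = "lon", ylab = "lat")
```

Reversing only the latitude vector, without reordering the matching dimension of the data, would draw the field upside down. The second error in the question ("plot.new has not been called yet") follows from the first: the overlay with add=TRUE has nothing to draw on until image() succeeds.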
I have several directories, each with 700+ binary-encoded rasters, and I take the average of the output rasters per directory. Currently I create the rasters one by one in a for loop, then load the newly created rasters back into R and take the sum to obtain the monthly rainfall total.
However, since I don't need the individual rasters, only the average raster, I have a hunch that I could do all of this within one loop and save only the output average raster, but I am coming up short on how to program this in R.
setwd("~/Desktop/CMORPH/Levant-Clip/200001")
dir.output <- '~/Desktop/CMORPH/Levant-Clip/200001' ### change as needed to give output location
path <- list.files("~/Desktop/CMORPH/MonthlyCMORPH/200001",pattern="*.bz2", full.names=T, recursive=T)
for (i in 1:length(path)) {
files = bzfile(path[i], "rb")
data <- readBin(files,what="double",endian = "little", n = 4948*1649, size=4) #Mode of the vector to be read
data[data == -999] <- NA #convert missing data from -999 (CMORPH notation) to NAs
y<-matrix((data=data), ncol=1649, nrow=4948)
r <- raster(y)
e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
tr <- t(r) #transpose
re <- setExtent(tr,extent(e)) ### set the extent to the raster
ry <- flip(re, direction = 'y')
projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
C_Lev <- crop(ry, Levant) ### Clip to Levant
M_C_Lev<-mask(C_Lev, Levant)
writeRaster(M_C_Lev, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
}
#
raspath <- list.files ('~/Desktop/CMORPH/Levant-Clip/200001',pattern="*.tif", full.names=T, recursive=T)
rasstk <- stack(raspath)
sum200001<-sum(rasstk)
writeRaster(avg200001, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
Currently, this code takes about 75 minutes to execute, and I have about 120 more directories to go, so I am looking for faster solutions.
Thank you for any comments and input. Best, Evan
Elaborating on my previous comment, you could try:
library(raster) # provides raster(), extent(), crop(), mask(), stack()
setwd("~/Desktop/CMORPH/Levant-Clip/200001")
dir.output <- '~/Desktop/CMORPH/Levant-Clip/200001' ### change as needed to give output location
path <- list.files("~/Desktop/CMORPH/MonthlyCMORPH/200001",pattern="*.bz2", full.names=T, recursive=T)
raster_list = list()
for (i in 1:length(path)) {
files = bzfile(path[i], "rb")
data <- readBin(files,what="double",endian = "little", n = 4948*1649, size=4) #Mode of the vector to be read
close(files) # release the connection before moving to the next file
data[data == -999] <- NA #convert missing data from -999 (CMORPH notation) to NAs
y<-matrix((data=data), ncol=1649, nrow=4948)
r <- raster(y)
if (i == 1) {
e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
}
tr <- t(r) #transpose
re <- setExtent(tr,extent(e)) ### set the extent to the raster
ry <- flip(re, direction = 'y')
projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
C_Lev <- crop(ry, Levant) ### Clip to Levant
M_C_Lev<-mask(C_Lev, Levant)
raster_list[[i]] = M_C_Lev
}
#
rasstk <- stack(raster_list, quick = TRUE) # OR rasstk <- brick(raster_list, quick = TRUE)
avg200001<-mean(rasstk)
writeRaster(avg200001, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
Using the "quick" option in stack should speed things up, in particular if you have many rasters.
Another possibility is to first compute the average, and then perform the "spatial processing". For example:
for (i in 1:length(path)) {
files = bzfile(path[i], "rb")
data <- readBin(files,what="double",endian = "little", n = 4948*1649, size=4) #Mode of the vector to be read
close(files) # release the connection before moving to the next file
data[data == -999] <- NA #convert missing data from -999 (CMORPH notation) to NAs
if (i == 1) {
totdata <- data
num_nonNA <- as.numeric(!is.na(data))
} else {
totdata = rowSums(cbind(totdata,data), na.rm = TRUE)
# We have to count the number of "valid" entries so that the average is correct !
num_nonNA = rowSums(cbind(num_nonNA,as.numeric(!is.na(data))),na.rm = TRUE)
}
}
avg_data = totdata/num_nonNA # Compute the average
# Now do the "spatial" processing
y<-matrix(avg_data, ncol=1649, nrow=4948)
r <- raster(y)
e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
tr <- t(r) #transpose
re <- setExtent(tr,extent(e)) ### set the extent to the raster
ry <- flip(re, direction = 'y')
projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
C_Lev <- crop(ry, Levant) ### Clip to Levant (crop the raster, not the raw vector)
M_C_Lev<-mask(C_Lev, Levant)
writeRaster(M_C_Lev, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
This could be faster or slower, depending on how much you are cropping the original data.
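The running-average idea can be checked on a tiny example. The sketch below is a simplified variant, not the code above: it uses three short made-up vectors in place of the files, and the names `fake_files`, `total` and `n_valid` are illustrative only. It keeps a running sum and a per-cell count of valid values, so cells that are missing in every file end up as NA.

```r
# Three fake "files", each a numeric vector using -999 as the missing-value flag
fake_files <- list(
  c(1, -999,    3, -999),
  c(2,    4, -999, -999),
  c(3,    6,    9, -999)
)

total <- NULL    # running sum per cell
n_valid <- NULL  # number of non-missing values seen per cell

for (values in fake_files) {
  values[values == -999] <- NA
  valid <- !is.na(values)
  if (is.null(total)) {
    total <- numeric(length(values))
    n_valid <- integer(length(values))
  }
  # Add only the valid entries, and count them
  total[valid] <- total[valid] + values[valid]
  n_valid <- n_valid + valid
}

# Cells with no valid value in any file stay NA instead of becoming 0
avg <- ifelse(n_valid > 0, total / n_valid, NA_real_)
print(avg)
```

Only two vectors of the grid's length (plus the one being read) are held in memory at any time, which is what makes this approach workable for hundreds of files.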
HTH,
Lorenzo
I'm adding another answer to clarify and simplify things a bit, also in relation to the comments in chat. The code below should do what you ask: that is, cycle over files, read the "data", compute the sum over all files and convert it to a raster with specified dimensions.
Note that for testing purposes I substituted your loop over file names with a simple loop from 1 to 720, and replaced file reading with the creation of arrays of the same length as yours, filled with values from 1 to 4 and some NAs.
totdata <- array(dim = 4948*1649) # Define Dummy array
for (i in 1:720) {
message("Working on file: ", i)
data <- array(rep(c(1,2,3,4),4948*1649/4), dim = 4948*1649) # Create a "fake" 4948*1649 array each time to simulate data reading
data[1:1000] <- -999 # Set some values to NA
data[data == -999] <- NA #convert missing data from -999
totdata <- rowSums(cbind(totdata, data), na.rm = T) # Let's sum the current array with the cumulative sum so far
}
# Now reshape to a matrix and convert to a raster, etc.
library(raster)
y <- matrix(totdata, ncol=1649, nrow=4948)
r <- raster(y)
e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
tr <- t(r) #transpose
re <- setExtent(tr,e) ### set the extent to the raster
ry <- flip(re, direction = 'y')
projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
This generates a "proper" raster:
> ry
class : RasterLayer
dimensions : 1649, 4948, 8159252 (nrow, ncol, ncell)
resolution : 0.07275667, 0.1052902 (x, y)
extent : -180, 180, -90, 83.6236 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : 0, 2880 (min, max)
containing the sum of the different arrays: you can see that the max value is 720 * 4 = 2880. (Only caveat: if you have cells that are always NA, you will get 0 instead of NA.)
On my laptop, this runs in about 5 minutes !
In practice:
1) To avoid memory problems, I am not reading all the data into memory. Each of your arrays is roughly 64 MB, so I cannot load them all and then do the sum (unless I have 50 GB of RAM to throw away, and even in that case it would be slow). Instead, I make use of the associative property of summation by computing a "cumulative" sum at each cycle. In this way you are only working with two 8-million-element arrays at a time: the one you read from file "i", and the one that contains the current sum.
2) To avoid unnecessary computations, I am summing directly the 1-dimensional arrays I get from reading the binary files. You don't need to reshape the arrays to a matrix inside the cycle, because you can do that once, on the final "summed" array.
I hope this will work for you and that I am not missing something obvious !
As far as I can understand, if this approach is still slow, the problem lies elsewhere (for example in data reading: on 720 files, 3 seconds spent reading each file means roughly 35 minutes of processing).
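To check whether reading is the bottleneck, the read step can be timed on its own. The sketch below is only an illustration: it writes a tiny made-up file of 4-byte little-endian floats to a temporary .bz2 file (an assumption about the format, taken from the readBin call in the question) and times reading it back. With a real file, only the part inside system.time() would be needed.

```r
# Write a small fake file: 4-byte little-endian floats, bzip2-compressed
tmp <- tempfile(fileext = ".bz2")
fake <- c(0.5, 1.5, -999, 2.25)
con <- bzfile(tmp, "wb")
writeBin(fake, con, size = 4, endian = "little")
close(con)

# Time the read step in isolation, closing the connection afterwards
timing <- system.time({
  con <- bzfile(tmp, "rb")
  values <- readBin(con, what = "double", n = length(fake),
                    size = 4, endian = "little")
  close(con)
})

values[values == -999] <- NA  # convert the missing-value flag to NA
print(timing["elapsed"])
print(values)

unlink(tmp)  # remove the temporary file
```

Multiplying the elapsed time for one real file by the number of files gives a rough lower bound for the whole loop, whatever is done with the data afterwards.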
HTH,
Lorenzo