How to convert a Sentinel-3 .nc file into a .tiff file in R?

Regarding the conversion of .nc files into .tiff files, I encounter the problem of losing the geoinformation of my pixels. I know that other users have experienced the same problem and tried to solve it, but failed; see here for one such approach: https://gis.stackexchange.com/questions/259700/converting-sentinel-3-data-netcdf-to-geotiff. I would prefer a solution using R.
I downloaded freely available Sentinel-3 data from ESA (https://scihub.copernicus.eu/dhus/#/home). Unfortunately, this data comes in the .nc format, so I want to convert it into the .tiff format. I have already tried various approaches, but failed. What I have tried so far:
data_source <- 'D:/user_1/01_test_data/S3A_SL_1_RBT____20180708T093240_20180708T093540_20180709T141944_0179_033_150_2880_LN2_O_NT_003.SEN3/F1_BT_in.nc'
# define path to .nc-file
data_output <- 'D:/user_1/01_test_data/S3A_SL_1_RBT____20180708T093240_20180708T093540_20180709T141944_0179_033_150_2880_LN2_O_NT_003.SEN3/test.tif'
# define path of output .tiff-file
###################################################
# 1.) use gdal_translate via Windows cmd-line in R
# see here: https://stackoverflow.com/questions/52046282/convert-netcdf-nc-to-geotiff
system(command = paste('gdal_translate -of GTiff -sds -a_srs epsg:4326', data_source, data_output))
# hand over character string to Windows cmd-line to use gdal_translate
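# (Optional sketch: a .nc file can hold several subdatasets, so it can help to
# list them before translating; this assumes the GDAL command-line tools are on
# the PATH, which the gdal_translate call above already requires.)
system(command = paste('gdalinfo', data_source))
# gdalinfo prints the NETCDF:"...":variable subdataset names that -sds iterates over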
###################################################
# 2.) use the raster-package
# see here: https://www.researchgate.net/post/How_to_convert_a_NetCDF4_file_to_GeoTIFF_using_R2
epsg4326 <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
# proj4-code
# https://spatialreference.org/ref/epsg/wgs-84/proj4/
library(raster)
# raster(), crs() and writeRaster() below come from the raster package
specific_band <- raster(data_source)
crs(specific_band) <- epsg4326
writeRaster(specific_band, filename = data_output)
# Both approaches "work": I can convert the files from .nc to .tiff, but I lose the geoinformation for the pixels and just get pixel coordinates instead of long/lat values.
I really appreciate any solutions that keep the geoinformation for the pixels!
Thanks a lot in advance, ExploreR

As @j08lue points out,
The product format for Sentinel 3 products is horrible. Yes, the data
values are stored in netCDF, but the coordinate axes are in separate
files and it is all just a bunch of files and metadata.
I did not find any documentation (I assume it must exist), but it seems you can get the data like this:
library(ncdf4)
# coordinates
nc <- nc_open("geodetic_in.nc")
lon <- ncvar_get(nc, "longitude_in")
lat <- ncvar_get(nc, "latitude_in")
# including elevation for sanity check only
elv <- ncvar_get(nc, "elevation_in")
nc_close(nc)
# the values of interest
nc <- nc_open("F1_BT_in.nc")
F1_BT <- ncvar_get(nc, "F1_BT_in")
nc_close(nc)
# combine
d <- cbind(as.vector(lon), as.vector(lat), as.vector(elv), as.vector(F1_BT))
Plot a sample of the locations. Note that the raster is rotated
plot(d[sample(nrow(d), 25000),1:2], cex=.1)
I would need to investigate a bit more to see how to write a rotated raster.
For now, a not recommended shortcut could be to rasterize to a non-rotated raster
library(raster) # extent(), raster() and rasterize() below come from the raster package
e <- extent(as.vector(apply(d[,1:2], 2, range))) + 1/120
r <- raster(ext=e, res=1/30)
#elev <- rasterize(d[,1:2], r, d[,3], mean)
F1_BT <- rasterize(d[,1:2], r, d[,4], mean, filename="")
plot(F1_BT)

So that's what I have done so far. Unfortunately, the raster is not simply rotated by 180 degrees, but distorted in some other way...
# (1.) first part of the code adapted to Robert Hijmans approach (see code of answer provided above)
nc_geodetic <- nc_open(paste0(wd, "/01_test_data/sentinel_3/geodetic_in.nc"))
nc_geodetic_lon <- ncvar_get(nc_geodetic, "longitude_in")
nc_geodetic_lat <- ncvar_get(nc_geodetic, "latitude_in")
nc_geodetic_elv <- ncvar_get(nc_geodetic, "elevation_in")
nc_close(nc_geodetic)
# to get the longitude, latitude and elevation information
F1_BT_in_vars <- nc_open(paste0(wd, "/01_test_data/sentinel_3/F1_BT_in.nc"))
F1_BT_in <- ncvar_get(F1_BT_in_vars, "F1_BT_in")
nc_close(F1_BT_in_vars)
# extract the band information
###############################################################################
# (2.) the following part of the code is adapted from Matthew Lundberg's rotation code, see https://stackoverflow.com/questions/16496210/rotate-a-matrix-in-r
rotate_fkt <- function(x) t(apply(x, 2, rev))
# create rotation-function
F1_BT_in_rot180 <- rotate_fkt(rotate_fkt(F1_BT_in))
# rotate the raster by 180 degrees
test_F1_BT_in <- raster(F1_BT_in_rot180)
# convert matrix to raster
###############################################################################
# (3.) extract corner coordinates and transform with gdal
writeRaster(test_F1_BT_in, filename = paste0(wd, "/01_test_data/sentinel_3/test_flip.tif"), overwrite = TRUE)
# write the raster layer
data_source_flip <- '"D:/unknown_user/X_processing/01_test_data/sentinel_3/test_flip.tif"'
data_tmp_flip <- '"D:/unknown_user/X_processing/01_test_data/temp/test_flip.tif"'
data_out_flip <- '"D:/unknown_user/X_processing/01_test_data/sentinel_3/test_flip_ref.tif"'
# define input, temporary output and output for gdal-transformation
nrow_nc_mtx <- nrow(nc_geodetic_lon)
ncol_nc_mtx <- ncol(nc_geodetic_lon)
# investigate on matrix size of the image
xy_coord_char1 <- as.character(paste("1", "1", nc_geodetic_lon[1, 1], nc_geodetic_lat[1, 1]))
xy_coord_char2 <- as.character(paste(nrow_nc_mtx, "1", nc_geodetic_lon[nrow_nc_mtx, 1], nc_geodetic_lat[nrow_nc_mtx, 1]))
xy_coord_char3 <- as.character(paste(nrow_nc_mtx, ncol_nc_mtx, nc_geodetic_lon[nrow_nc_mtx, ncol_nc_mtx], nc_geodetic_lat[nrow_nc_mtx, ncol_nc_mtx]))
xy_coord_char4 <- as.character(paste("1", ncol_nc_mtx, nc_geodetic_lon[1, ncol_nc_mtx], nc_geodetic_lat[1, ncol_nc_mtx]))
# extract the corner coordinates from the image
system(command = paste('gdal_translate -of GTiff -gcp ', xy_coord_char1, ' -gcp ', xy_coord_char2, ' -gcp ', xy_coord_char3, ' -gcp ', xy_coord_char4, data_source_flip, data_tmp_flip))
system(command = paste('gdalwarp -r near -order 1 -co COMPRESS=NONE ', data_tmp_flip, data_out_flip))
# run gdal-transformation
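A likely refinement: a first-order polynomial fitted to just four corner GCPs cannot represent the curved SLSTR swath geometry, which would explain the remaining distortion. Below is a sketch (not tested against this product) that builds a denser GCP grid from the geolocation matrices and warps with thin-plate splines; note that GDAL expects pixel (column) before line (row) in -gcp, so check that the index order matches how the rotated matrix was written to the GeoTIFF.
rows <- round(seq(1, nrow_nc_mtx, length.out = 10))
cols <- round(seq(1, ncol_nc_mtx, length.out = 10))
gcp_args <- character(0)
for (ri in rows) {
  for (ci in cols) {
    gcp_args <- c(gcp_args, '-gcp', ci, ri,
                  nc_geodetic_lon[ri, ci], nc_geodetic_lat[ri, ci])
  }
}
# assign the GCPs (declared in EPSG:4326), then warp with thin-plate splines
system(command = paste('gdal_translate -of GTiff -a_srs epsg:4326',
                       paste(gcp_args, collapse = ' '),
                       data_source_flip, data_tmp_flip))
system(command = paste('gdalwarp -r near -tps -co COMPRESS=NONE',
                       data_tmp_flip, data_out_flip))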

Related

Stuck with extracting and converting nc file

I have a rainfall .nc file and a temperature .nc file. I don't really understand R (no prior experience), so I'm trying this script and getting errors:
library(ncdf4)
library(data.table)
library(raster)
library(metR)
library(rgdal)
tmax2 <- nc_open("E:/SKRIPSI/prec-tmin-tmax-sumut/tmax2006-2022.nc")
> names(tmax2$var)
[1] "TASMAX"
> names(tmax2$dim)
[1] "NTIME1" "XAXIS23_301" "YAXIS26_132" "M2"
> info.file <- GlanceNetCDF(tmaxsumut)
Error in GlanceNetCDF(tmaxsumut) : could not find function "GlanceNetCDF"
>
> #pemilihan lokasi & waktu
> lat <- 0:4
> lon <- 98:100
> wkt <- seq(from = as.Date("2017-01-01"),
+ to = as.Date("2020-12-31"),
+ by = "days")
>
> tmax2 <- ReadNetCDF(tmaxsumut, vars="TASMAX",
+ subset=list(XAXIS23_301=lon, YAXIS26_132= lat, NTIME1=wkt))
Error in ReadNetCDF(tmaxsumut, vars = "TASMAX", subset = list(XAXIS23_301 = lon, :
could not find function "ReadNetCDF"
You are not describing what you want to achieve, making it very difficult to help. Feel free to edit your question to clarify your goals (do not use the comments for that).
I am guessing that you want to extract values from the ncdf file for point (long/lat) locations. If so, similar questions have been asked many times on this site, so you could probably do some more searches.
With standard compliant ncdf files you can simply do:
library(terra)
tmax2 <- rast("E:/SKRIPSI/prec-tmin-tmax-sumut/tmax2006-2022.nc", "TASMAX")
lat <- 1:3
lon <- 98:100
points <- vect(cbind(lon, lat))
e <- extract(tmax2, points)
This only works if the ncdf file has regular raster data. That is not guaranteed, but you provide no information about the file, nor do you provide the file.
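If in doubt, a quick way to inspect what the file actually contains is terra's gdalinfo wrapper (a sketch, reusing the file path from the question):
library(terra)
# sds = TRUE lists the subdatasets; irregular (curvilinear) grids typically show
# geolocation arrays rather than a simple geotransform
describe("E:/SKRIPSI/prec-tmin-tmax-sumut/tmax2006-2022.nc", sds = TRUE)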

Error in do.ply(i) : task 1 failed - "could not find function "%>%"" in R parallel programming

Every time I run the script, it gives me the error: Error in { : task 1 failed - "could not find function "%>%""
I have already checked every post on this forum and tried to apply them, but none works.
Please advise a solution.
Please note: I have only 2 cores on my PC.
My code is as follows:
library(dplyr) # For basic data manipulation
library(ncdf4) # For creating NetCDF files
library(tidync) # For easily dealing with NetCDF data
library(ggplot2) # For visualising data
library(doParallel) # For parallel processing
MHW_res_grid <- readRDS("C:/Users/SUDHANSHU KUMAR/Desktop/MTech Project/R/MHW_result.Rds")
# Function for creating arrays from data.frames
df_acast <- function(df, lon_lat){
  # Force grid
  res <- df %>%
    right_join(lon_lat, by = c("lon", "lat")) %>%
    arrange(lon, lat)
  # Convert date values to integers if they are present
  if(lubridate::is.Date(res[1,4])) res[,4] <- as.integer(res[,4])
  # Create array
  res_array <- base::array(res[,4], dim = c(length(unique(lon_lat$lon)), length(unique(lon_lat$lat))))
  dimnames(res_array) <- list(lon = unique(lon_lat$lon),
                              lat = unique(lon_lat$lat))
  return(res_array)
}
# Wrapper function for last step before data are entered into NetCDF files
df_proc <- function(df, col_choice){
  # Determine the correct array dimensions
  lon_step <- mean(diff(sort(unique(df$lon))))
  lat_step <- mean(diff(sort(unique(df$lat))))
  lon <- seq(min(df$lon), max(df$lon), by = lon_step)
  lat <- seq(min(df$lat), max(df$lat), by = lat_step)
  # Create full lon/lat grid
  lon_lat <- expand.grid(lon = lon, lat = lat) %>%
    data.frame()
  # Acast only the desired column
  dfa <- plyr::daply(df[c("lon", "lat", "event_no", col_choice)],
                     c("event_no"), df_acast, .parallel = T, lon_lat = lon_lat)
  return(dfa)
}
# We must now run this function on each column of data we want to add to the NetCDF file
doParallel::registerDoParallel(cores = 2)
prep_dur <- df_proc(MHW_res_grid, "duration")
prep_max_int <- df_proc(MHW_res_grid, "intensity_max")
prep_cum_int <- df_proc(MHW_res_grid, "intensity_cumulative")
prep_peak <- df_proc(MHW_res_grid, "date_peak")
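The error itself usually means the parallel workers never attached dplyr: with .parallel = TRUE, plyr::daply hands df_acast to foreach workers, and %>% only exists there if dplyr is loaded on each worker. A possible fix (a sketch, using plyr's documented .paropts pass-through to foreach) is to change the daply call inside df_proc:
# tell the foreach workers to load the packages df_acast relies on
dfa <- plyr::daply(df[c("lon", "lat", "event_no", col_choice)],
                   c("event_no"), df_acast, .parallel = TRUE,
                   .paropts = list(.packages = c("dplyr", "lubridate")),
                   lon_lat = lon_lat)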

Convert NZMG coordinates to lat/long

I have a bunch of NZ Map Grid coordinates, which I want convert to lat/long. Based on this question, here is what I tried.
library(sp)
options(digits = 11) # to display to greater d.p.
Attempt 1:
proj4string <- "+proj=nzmg +lat_0=-41.0 +lon_0=173.0 +x_0=2510000.0
+y_0=6023150.0 +ellps=intl +units=m"
p <- proj4::project(c(2373200, 5718800), proj = proj4string, inverse=T)
Attempt 2:
dat <- data.frame(id = c(1), x = c(2373200) , y = c(5718800))
sp::coordinates(dat) = ~x+y
sp::proj4string(dat) = CRS('+init=epsg:27200')
data_wgs84 <- spTransform(dat, CRS('+init=epsg:4326'))
print(data_wgs84)
If I run my coordinates through the LINZ coordinate conversion tool, I get a slightly different result, which is the "true" result.
Results:
171.30179199, -43.72743909 # attempt 1 - ~200 m off LINZ
171.30190004, -43.72577765 # attempt 2 - a few metres off LINZ
171.30189464, -43.72576664 # LINZ
Based on Mike T's answer I should be using a "distortion grid transformation method" and he links to a "nzgd2kgrid0005.gsb grid shift file".
My Question: Is it possible to do this conversion using R without downloading additional files (nzgd2kgrid0005.gsb)? I want to share my code with others without them having to download any additional files.
Any advice much appreciated.
Turns out it is pretty simple: if you have the rgdal package installed, the required nzgd2kgrid0005.gsb file is included and you don't need to download anything extra.
You just need to use the full PROJ.4 string as outlined in Mike T's answer.
dat <- data.frame(id = c(1), x = c(2373200) , y = c(5718800))
sp::coordinates(dat) = ~x+y
proj4string <- "+proj=nzmg +lat_0=-41 +lon_0=173 +x_0=2510000 +y_0=6023150
+ellps=intl +datum=nzgd49 +units=m +towgs84=59.47,-5.04,187.44,0.47,-0.1,1.024,-4.5993
+nadgrids=nzgd2kgrid0005.gsb +no_defs"
sp::proj4string(dat) = sp::CRS(proj4string)
data_wgs84 <- sp::spTransform(dat, sp::CRS('+init=epsg:4326'))
as.data.frame(data_wgs84)
id x y
1 171.3018946 -43.72576664
Which is the same as the output from the LINZ coordinate conversion tool. Hopefully this saves someone else a bit of time.
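Note that rgdal has since been retired; a present-day equivalent with sf (a sketch, assuming a PROJ installation whose datum-grid files cover NZGD49) would be:
library(sf)
dat <- st_as_sf(data.frame(id = 1, x = 2373200, y = 5718800),
                coords = c("x", "y"), crs = 27200) # EPSG:27200 = NZGD49 / New Zealand Map Grid
st_coordinates(st_transform(dat, 4326))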

R - plotting 2.5° grid netcdf data with country contours

I'm trying to plot precipitation data that comes on a 2.5° x 2.5° grid, with the country contours on top. The data is available at this link: https://www.esrl.noaa.gov/psd/data/gridded/data.cmap.html ("Mean (Enhanced Monthly)").
I was using the answer from: R - Plotting netcdf climate data. However, I get an error.
This is what I have done:
library(ncdf4)
ncpath <- "C:/Users/"
ncname <- "precip.mon.mean"
ncfname <- paste(ncpath,ncname,".nc",sep="")
ncin <- nc_open(ncfname)
lon <- ncvar_get(ncin, "lon")
nlon <- dim(lon)
lat <- ncvar_get(ncin, "lat")
nlat <- dim(lat)
dname <-"precip"
ppt_array <- ncvar_get(ncin,dname)
dim(ppt_array)
pres <- ppt_array[ , ,25:444]
precip <- array(pres, , dim=c(nlon, nlat, 12, ano))
prec <- precip[97:115,21:34, ,1:ano] #I just want a piece of the map
Here is where I have the problem:
latlat <- rev(lat)
precipit <- prec[ , ,1,1] %Just to see if it works
lonlon <- lon-180
image(lonlon,latlat,precipit)
library(maptools)
data(wrld_simpl)
#however I don't know if this will work to plot just a portion of the map
plot(wrld_simpl,add=TRUE)
I get several errors, could someone please help?
EDIT:
The errors I got were these:
> image(lonlon,latlat,precipit)
Error in image.default(lonlon, latlat, precipit) :
increasing 'x' and 'y' values expected
> library(maptools)
> data(wrld_simpl)
> plot(wrld_simpl,add=TRUE)
Error in polypath(x = mcrds[, 1], y = mcrds[, 2], border = border, col = col, :
plot.new has not been called yet
There are several things that need to be fixed:
1) ano does not seem to be defined anywhere. Perhaps it was defined interactively?
precip <- array(pres, , dim=c(nlon, nlat, 12, ano))
2) It appears you intended to add a comment but used an infix operator instead - replace the % with a #, like so:
precipit <- prec[ , ,1,1] # Just to see if it works
3) If you only want part of the map, you can either make sure that the lat and lon arrays match the area you want to show (essentially cropping the world map), or set NAs outside the region you want to highlight (which will appear similar to the map here); a minimal sketch of the cropping route follows.
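For instance, a sketch of option 3, assuming the prec subset from the question (with ano defined) and that the CMAP longitudes run from 0 to 360 starting at 1.25; note that image() also requires strictly increasing x and y values, which is what the first error message was about:
lon_sub <- lon[97:115] # longitudes matching the prec subset (all above 180 here)
lon_sub <- lon_sub - 360 # shift to the -180..180 convention so wrld_simpl lines up
lat_sub <- rev(lat[21:34]) # reverse so the latitudes increase
slice <- prec[ , , 1, 1]
slice <- slice[ , ncol(slice):1] # flip the columns to match the reversed latitudes
image(lon_sub, lat_sub, slice)
library(maptools)
data(wrld_simpl)
plot(wrld_simpl, add = TRUE) # only the part inside the plot region is drawn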

How to average rasters within the for-loop that creates them in R?

I have several directories, each with 700+ binary encoded rasters, for which I take the average of the output rasters per directory. Currently I create the rasters one by one in a for loop, then load the newly created rasters back into R to take the sum and obtain the monthly rainfall total.
However, since I don't need the individual rasters, only the average raster, I have a hunch that I could do this all within one loop and save just the output average raster, but I am coming up short on how to program this in R.
setwd("~/Desktop/CMORPH/Levant-Clip/200001")
dir.output <- '~/Desktop/CMORPH/Levant-Clip/200001' ### change as needed to give output location
path <- list.files("~/Desktop/CMORPH/MonthlyCMORPH/200001",pattern="*.bz2", full.names=T, recursive=T)
for (i in 1:length(path)) {
  files = bzfile(path[i], "rb")
  data <- readBin(files, what="double", endian = "little", n = 4948*1649, size=4) # mode of the vector to be read
  data[data == -999] <- NA # convert missing data from -999 (CMORPH notation) to NAs
  y <- matrix(data, ncol=1649, nrow=4948)
  r <- raster(y)
  e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
  tr <- t(r) # transpose
  re <- setExtent(tr, e) ### set the extent to the raster
  ry <- flip(re, direction = 'y')
  projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
  C_Lev <- crop(ry, Levant) ### clip to Levant (Levant is a polygon loaded elsewhere)
  M_C_Lev <- mask(C_Lev, Levant)
  writeRaster(M_C_Lev, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ### the basename allows the file to be named the same as the original
}
#
raspath <- list.files('~/Desktop/CMORPH/Levant-Clip/200001', pattern="*.tif", full.names=T, recursive=T)
rasstk <- stack(raspath)
sum200001 <- sum(rasstk)
writeRaster(sum200001, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ### the basename allows the file to be named the same as the original
Currently, this code takes about 75 minutes to execute, and I have about 120 more directories to go, so I am looking for faster solutions.
Thank you for any and all comments and input. Best, Evan
Elaborating on my previous comment, you could try:
setwd("~/Desktop/CMORPH/Levant-Clip/200001")
dir.output <- '~/Desktop/CMORPH/Levant-Clip/200001' ### change as needed to give output location
path <- list.files("~/Desktop/CMORPH/MonthlyCMORPH/200001",pattern="*.bz2", full.names=T, recursive=T)
raster_list = list()
for (i in 1:length(path)) {
  files = bzfile(path[i], "rb")
  data <- readBin(files, what="double", endian = "little", n = 4948*1649, size=4) # mode of the vector to be read
  data[data == -999] <- NA # convert missing data from -999 (CMORPH notation) to NAs
  y <- matrix(data, ncol=1649, nrow=4948)
  r <- raster(y)
  if (i == 1) {
    e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
  }
  tr <- t(r) # transpose
  re <- setExtent(tr, e) ### set the extent to the raster
  ry <- flip(re, direction = 'y')
  projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
  C_Lev <- crop(ry, Levant) ### clip to Levant
  M_C_Lev <- mask(C_Lev, Levant)
  raster_list[[i]] = M_C_Lev
}
#
rasstk <- stack(raster_list, quick = TRUE) # OR rasstk <- brick(raster_list, quick = TRUE)
avg200001<-mean(rasstk)
writeRaster(avg200001, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ###the basename allows the file to be named the same as the original
Using the "quick" options in stack should definitely speed-up things, in particular if you have many rasters.
Another possibility is to first compute the average, and then perform the "spatial proceesing". For example:
for (i in 1:length(path)) {
  files = bzfile(path[i], "rb")
  data <- readBin(files, what="double", endian = "little", n = 4948*1649, size=4) # mode of the vector to be read
  data[data == -999] <- NA # convert missing data from -999 (CMORPH notation) to NAs
  if (i == 1) {
    totdata <- data
    num_nonNA <- as.numeric(!is.na(data))
  } else {
    totdata = rowSums(cbind(totdata, data), na.rm = TRUE)
    # we have to count the number of "valid" entries so that the average is correct!
    num_nonNA = rowSums(cbind(num_nonNA, as.numeric(!is.na(data))), na.rm = TRUE)
  }
}
avg_data = totdata/num_nonNA # compute the average
# now do the "spatial" processing
y <- matrix(avg_data, ncol=1649, nrow=4948)
r <- raster(y)
e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
tr <- t(r) # transpose
re <- setExtent(tr, e) ### set the extent to the raster
ry <- flip(re, direction = 'y')
projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
C_Lev <- crop(ry, Levant) ### clip to Levant (note: crop the raster ry, not the avg_data vector)
M_C_Lev <- mask(C_Lev, Levant)
writeRaster(M_C_Lev, paste(dir.output, basename(path[i]), sep = ''), format = 'GTiff', overwrite = T) ### the basename allows the file to be named the same as the original
This could be faster or slower, depending on how much you are cropping the original data.
HTH,
Lorenzo
I'm adding another answer to clarify and simplify things a bit, also in relation to the comments in chat. The code below should do what you ask: that is, cycle over the files, read the "data", compute the sum over all files, and convert it to a raster with the specified dimensions.
Note that for testing purposes here I substituted your cycle over file names with a simple 1 to 720 cycle, and file reading with the creation of arrays of the same length as yours, filled with values from 1 to 4 and some NAs!
totdata <- array(dim = 4948*1649) # define dummy array
for (i in 1:720) {
  message("Working on file: ", i)
  data <- array(rep(c(1,2,3,4), 4948*1649/4), dim = 4948*1649) # create a "fake" 4948*1649 array each time to simulate data reading
  data[1:1000] <- -999 # set some values to missing
  data[data == -999] <- NA # convert missing data from -999
  totdata <- rowSums(cbind(totdata, data), na.rm = T) # sum the current array with the cumulative sum so far
}
# Now reshape to a matrix and convert to a raster, etc.
y <- matrix(totdata, ncol=1649, nrow=4948)
r <- raster(y)
e <- extent(-180, 180, -90, 83.6236) ### choose the extent based on the netcdf file info
tr <- t(r) #transpose
re <- setExtent(tr,e) ### set the extent to the raster
ry <- flip(re, direction = 'y')
projection(ry) <- "+proj=longlat +datum=WGS84 +ellps=WGS84"
This generates a "proper" raster:
> ry
class : RasterLayer
dimensions : 1649, 4948, 8159252 (nrow, ncol, ncell)
resolution : 0.07275667, 0.1052902 (x, y)
extent : -180, 180, -90, 83.6236 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : 0, 2880 (min, max)
containing the sum of the different arrays: you can see that the max value is 720 * 4 = 2880. (Only caveat: if you have cells which are always NA, you will get 0 instead of NA; a possible fix is sketched below.)
On my laptop, this runs in about 5 minutes!
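If the always-NA caveat matters for your data, a possible guard (a sketch mirroring the num_nonNA counter from the earlier answer) is to count the valid values per cell and blank out cells that never received data:
num_nonNA <- array(0, dim = 4948*1649) # initialise alongside totdata
# inside the loop, right after data[data == -999] <- NA :
# num_nonNA <- num_nonNA + !is.na(data)
# after the loop, before reshaping to a matrix:
totdata[num_nonNA == 0] <- NA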
In practice:
- To avoid memory problems, I am not reading all the data into memory. Each of your arrays is more or less 64 MB, so I cannot load them all and then do the sum (unless I had 50 GB of RAM to throw away, and even in that case it would be slow). Instead, I make use of the associative property of summation by computing a "cumulative" sum at each cycle. In this way you are only working with two 8-million-element arrays at a time: the one you read from file "i", and the one that contains the current sum.
- To avoid unnecessary computations, I sum the 1-dimensional arrays I get from reading the binary files directly. You don't need to reshape the arrays to matrices within the cycle, because you can do that on the final "summed" array, which you can then convert to matrix form.
I hope this will work for you and that I am not missing something obvious!
As far as I can tell, if this approach is still slow then you have problems elsewhere, for example in data reading: over 720 files, 3 seconds spent on reading each file means roughly 35 minutes of processing.
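A quick way to measure the reading cost on a single file (a sketch, reusing the path vector from the question):
con <- bzfile(path[1], "rb")
system.time(
  data <- readBin(con, what = "double", endian = "little", n = 4948*1649, size = 4)
)
close(con)
# multiply the elapsed time by the number of files to estimate the total read cost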
HTH,
Lorenzo
