Why is mosaicking rasters not working for some files? - r

I want to create a new raster from 2 tiles. For the date "247" everything works fine. But for the next date, "248", the raster I get back contains only the first tile (h12v12); the adjacent tile (h13v12) is not merged in.
This is just an example. I've worked with more files and had the same problem with some seemingly random files, while others gave the desired result.
The files:
https://drive.google.com/drive/folders/1FSUAg-H8ePP9jeZjCTatlGFsDsKCq3cN?usp=sharing
#Open files for a day
files <- c("AOD2022248.h12v12.tif", "AOD2022248.h13v12.tif")
files_2 <- c("AOD2022247.h12v12.tif", "AOD2022247.h13v12.tif")
#create stack for each one
test_1 <- stack(files[1])
test_2 <- stack(files[2])
test_3 <- stack(files_2[1])
test_4 <- stack(files_2[2])
The first try was with raster::mosaic. For the 248 files, I got an error. For the 247 files, it worked.
joint <- mosaic(test_1, test_2, fun=mean, filename = "joint.tif", overwrite=TRUE)
Error in v[cells, i] <- as.vector(getValues(x[[i]])) :
number of items to replace is not a multiple of replacement length
joint2 <- mosaic(test_3, test_4, fun=mean, filename = "joint2.tif", overwrite=TRUE)
I don't understand this error. The rasters do not have the same extent, because they are supposed to sit side by side, not on top of each other.
So the second try was with gdalUtils::mosaic_rasters. Although I didn't get any errors here, when I open the new tif in QGIS, the 248 output has only the first tile, while the 247 output has both.
mosaic_rasters(files, dst_dataset= "files.tif")
[1] "AOD2022248.h12v12.tif" "AOD2022248.h13v12.tif"
NULL
mosaic_rasters(files_2, dst_dataset= "files2.tif")
[1] "AOD2022247.h12v12.tif" "AOD2022247.h13v12.tif"
NULL
When I use verbose = TRUE in mosaic_rasters...
I've got for 247
Input file size is 2400, 12000...10...20...30...40...50...60...70...80...90...100 - done.
and for 248
Input file size is 1200, 12000...10...20...30...40...50...60...70...80...90...100 - done.
I also compared the rasters to see if there is any difference in h12v12 or h13v12 between days but they are the same.
> compareRaster(test_1, test_3)
[1] TRUE
> compareRaster(test_2, test_4)
[1] TRUE
If the files are the same, why does mosaic/merge only work correctly for some of them?

The raster stacks need to have the same number of layers:
nlayers(test_1)
# [1] 4
nlayers(test_2)
# [1] 3
If we add another dummy layer filled with NA, then the mosaic works. I assumed that the missing layer is the fourth, but you may need to figure out which one is missing in your specific case.
test_2a = stack(test_2, init(test_2[[1]], fun=function(x) rep(NA,x)))
joint <- mosaic(test_1, test_2a, fun=mean, filename = "joint.tif", overwrite=TRUE)
Or, we can use package terra instead of raster
library(terra)
test_1 <- rast(files[1])
test_2 <- rast(files[2])
joint <- mosaic(test_1, test_2)
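A quick way to catch this before mosaicking is to compare layer counts and pad the shorter raster. The sketch below is only an illustration: it uses small in-memory terra rasters as stand-ins for the two tiles (the dimensions, extents and values are made up), and it assumes, as above, that the missing layers are the last ones.

```r
library(terra)

# Two adjacent stand-in tiles: the first has 4 layers, the second only 3
r1 <- rast(nrows = 10, ncols = 10, xmin = 0,  xmax = 10, ymin = 0, ymax = 10,
           nlyrs = 4, vals = 1)
r2 <- rast(nrows = 10, ncols = 10, xmin = 10, xmax = 20, ymin = 0, ymax = 10,
           nlyrs = 3, vals = 2)

# An all-NA layer with the same geometry as the second tile
na_layer <- r2[[1]]
values(na_layer) <- NA

# Append NA layers until both tiles have the same number of layers
while (nlyr(r2) < nlyr(r1)) {
  r2 <- c(r2, na_layer)
}

# With matching layer counts the tiles can be mosaicked side by side
m <- mosaic(r1, r2, fun = "mean")
```

With real files, r1 and r2 would come from rast(files[1]) and rast(files[2]), and it is worth checking names() on both to see which layer is actually missing.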

Related

Finding two directories (which are in ten min bins) based on a time. A diabolical directory disaster

I have looked all round and can't find a working solution. A bit of background:
I am using R to find raw images based on a validated image name (this part works). The issue is that there are at least 30 date directories, each containing a large number of time directories divided into 10-minute bins. Searching every bin, or the whole parent directory, is too expensive computationally. An example of the bin format would be
R_Experiments\RawImageFinder\Raw\2016-10-08\1536
R_Experiments\RawImageFinder\Raw\2016-10-08\1546
It is important to note that the bins do not start on consistent minutes; the starting minute varies, and herein lies the problem.
I know what time the image was taken from the file name using the following bit of code
SingleImage <- "Pia1.2016-10-08.1103+N2353_hc.tif"
TimeDir <- sub('.*?\\.\\d{4}-\\d{2}-\\d{2}\\.(\\d{2})(\\d{2}).*', '\\1:\\2', SingleImage)
TimeDir <- sub(':','', TimeDir)
#
> print(TimeDir)
[1] "1103"
So the image could belong in any of the following bins:
\1053,\1054,\1055,..you get the idea...,\1112,\1113
It just depends on when the bin was started. So I want the "finder" code to look in all possible bins within ten minutes either side (as per the example above); obviously some of them will not exist.
I thought about doing:
TimeDir1 <- as.numeric(TimeDir)+1
TimeDir2 <- as.numeric(TimeDir)+2
but the issue arises if we get to 59 mins, because there is no such thing as 61 mins in the hour (haha).
I then use the following to tell which directories to search, although I am a bit stuck also on how to tell it to look in multiple directories.
Directorytosearch <- ParentDirectory
#this has the \ in it, same for time, it works
Directorytosearch <- sub('$',paste(DateDir), Directorytosearch)
Directorytosearch <- sub('$',paste(TimeDir), Directorytosearch)
IMAGEtocopy <- list.files(
path = c(Directorytosearch),
recursive = TRUE,
include.dirs = FALSE,
full.names = FALSE,
pattern = SingleImagePattern)
Any help really would be great!
Could the strptime function be used for this?
Many thanks
Jim
Update for @Nya
test <- strptime("1546", format = "%H%M")
dirs[select.image.dir(test, dirs.time)]
> dirs[select.image.dir(test, dirs.time)]
[1] "test/1546"
To list directories, you are looking for the list.dirs() function. Let's assume that the following example was obtained from such a search through all the directories.
# directories possibly obtained with list.dirs
dirs <- c("test/1536", "test/1546", "test/1556", "test/1606")
A good practice then would be to extract both the date and time components from the directories and image file names. Here, I will only use the time since that was the original request.
# convert times
dirs.time <- sub(".*/(\\d+)$", "\\1", dirs)
dirs.time <- strptime(dirs.time, format="%H%M")
# test data, in your case from image file names
test <- strptime(c("1538", "1559", "1502"), format="%H%M")
The following function selects the desired directories by checking whether the directory time falls within 10 minutes either side of the image time. It returns the indices of the directories where the image could be located.
select.image.dir <- function(i, dt){
res <- NULL
# adding and subtracting 10 minutes converted to seconds
ik <- c(i - 600, i + 600)
condition <- c(ik[1] <= dt & ik[2] >= dt)
if(any(condition)){
res <- which(condition)
} else { res <- NA }
res
}
Note that the updated function accepts a single image file time to test in each round. The indices can then be used to extract the path to the image directory. The last time is outside the range of the directories and thus the function returns NA.
dirs[select.image.dir(test[1], dirs.time)]
# [1] "test/1536" "test/1546"
dirs[select.image.dir(test[2], dirs.time)]
# [1] "test/1556" "test/1606"
dirs[select.image.dir(test[3], dirs.time)]
# [1] NA NA NA NA
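If listing all directories first is too slow, the candidate bin names can also be generated directly from the image time. The sketch below is one possible way to do that, not part of the answer above: candidate_bins() is a hypothetical helper, and the date "2000-01-01" is an arbitrary placeholder used only so that the minute arithmetic rolls over into the next hour correctly.

```r
# Build every HHMM bin name within `window` minutes either side of an image time.
# The arithmetic is done on POSIXct (seconds), so 1059 + 2 min becomes 1101, not 1061.
candidate_bins <- function(hhmm, window = 10) {
  t0 <- as.POSIXct(strptime(paste("2000-01-01", hhmm),
                            format = "%Y-%m-%d %H%M", tz = "UTC"))
  offsets <- seq(-window, window) * 60  # minutes converted to seconds
  format(t0 + offsets, "%H%M")
}

bins <- candidate_bins("1103")

# Keep only the bins that actually exist under a (hypothetical) date directory:
# existing <- bins[dir.exists(file.path("Raw", "2016-10-08", bins))]
```

Since list.files() accepts a vector of paths, the existing directories can be passed to it in a single call. Note that times close to midnight would wrap around to the previous or next day, which this sketch does not handle.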

Using a test sample file with MaxEnt in R

I have worked a lot with MaxEnt in R recently (dismo package), but only used cross-validation to validate my model of bird habitats (a single species). Now I want to use a self-created test sample file. I had to pick these validation points by hand and can't use random test points.
So my R-script looks like this:
library(raster)
library(dismo)
setwd("H:/MaxEnt")
memory.limit(size = 400000)
punkteVG <- read.csv("Validierung_FL_XY_2016.csv", header=T, sep=";", dec=",")
punkteTG <- read.csv("Training_FL_XY_2016.csv", header=T, sep=";", dec=",")
punkteVG$X <- as.numeric(punkteVG$X)
punkteVG$Y <- as.numeric(punkteVG$Y)
punkteTG$X <- as.numeric(punkteTG$X)
punkteTG$Y <- as.numeric(punkteTG$Y)
##### mask NA ######
mask <- raster("final_merge_8class+le_bb_mask.img")
dataframe_VG <- extract(mask, punkteVG)
dataframe_VG[dataframe_VG == 0] <- NA
dataframe_TG <- extract(mask, punkteTG)
dataframe_TG[dataframe_TG == 0] <- NA
punkteVG <- punkteVG*dataframe_VG
punkteTG <- punkteTG*dataframe_TG
#### add the raster dataset ####
habitat_all <- stack("blockstats_stack_8class+le+area_8bit.img")
#### MODEL FITTING #####
library(rJava)
system.file(package = "dismo")
options(java.parameters = "-Xmx1g" )
setwd("H:/MaxEnt/results_8class_LE_AREA")
### backgroundpoints ###
set.seed(0)
backgrVMmax <- randomPoints(habitat_all, 100000, tryf=30)
backgrVM <- randomPoints(habitat_all, 1000, tryf=30)
### Renner (2015) PPM modelfitting Maxent ###
maxentVMmax_Renner<-maxent(habitat_all,punkteTG,backgrVMmax, path=paste('H:/MaxEnt/Ergebnisse_8class_LE_AREA/maxVMmax_Renner',sep=""),
args=c("-P",
"noautofeature",
"nothreshold",
"noproduct",
"maximumbackground=400000",
"noaddsamplestobackground",
"noremoveduplicates",
"replicates=10",
"replicatetype=subsample",
"randomtestpoints=20",
"randomseed=true",
"testsamplesfile=H:/MaxEnt/Validierung_FL_XY_2016_swd_NA"))
After the maxent() command I ran into multiple errors. First I got an error stating that it needs more than 0 (the default) "randomtestpoints". So I added "randomtestpoints=20" (which hopefully doesn't stop the program from using the file). Then I got:
Error: Test samples need to be in SWD format when background data is in SWD format
Error in file(file, "rt") : cannot open the connection
The thing is, when I ran the script with the default crossvalidation like this:
maxentVMmax_Renner<-maxent(habitat_all,punkteTG,backgrVMmax, path=paste('H:/MaxEnt/Ergebnisse_8class_LE_AREA/maxVMmax_Renner',sep=""),
args=c("-P",
"noautofeature",
"nothreshold",
"noproduct",
"maximumbackground=400000",
"noaddsamplestobackground",
"noremoveduplicates",
"replicates=10"))
...all works fine.
I also tried multiple things to get my CSV validation data into the correct format: two columns (labeled X and Y), three columns (labeled species, X and Y), and other variations. I would rather use the "punkteVG" object (the validation data) that I created with read.csv, but it seems MaxEnt wants its own file.
I can't imagine my problem is so uncommon. Someone must have used the argument "testsamplesfile" before.
I found out what the problem was. So here it is, for others to enjoy:
The correct maxent-command for a Subsample-file looks like this:
maxentVMmax_Renner<-maxent(habitat_all, punkteTG, backgrVMmax, path=paste('H:/MaxEnt',sep=""),
args=c("-P",
"noautofeature",
"nothreshold",
"noproduct",
"maximumbackground=400000",
"noaddsamplestobackground",
"noremoveduplicates",
"replicates=1",
"replicatetype=Subsample",
"testsamplesfile=H:/MaxEnt/swd.csv"))
Of course, there cannot be multiple replicates, since you have only one subsample.
Most importantly, the "swd.csv" subsample file has to include:
the X and Y coordinates
the values at the respective points (e.g. from "extract(habitat_all, punkteVG)")
a first column with the header "Species" in which every row contains the word "species" (since MaxEnt uses the default "species" if you don't define one in the occurrence data)
So the last point was the issue here. Basically, if you don't define the species column in the subsample file, MaxEnt will not know how to assign the data.
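The layout described above can be sketched in base R as follows. All values here are made up: pts stands in for punkteVG, env stands in for the result of extract(habitat_all, punkteVG), and the predictor column names layer1 and layer2 are placeholders that are assumed to match the layer names of the raster stack.

```r
# Hypothetical validation coordinates (stand-ins for punkteVG)
pts <- data.frame(X = c(10.5, 11.2, 12.8), Y = c(47.1, 47.9, 48.3))

# Hypothetical predictor values at those points; in practice these would
# come from extract(habitat_all, punkteVG)
env <- data.frame(layer1 = c(3, 5, 2), layer2 = c(0.4, 0.7, 0.1))

# SWD layout: species label first, then coordinates, then one column per predictor
swd <- data.frame(Species = "species", pts, env)

# Write the file that "testsamplesfile=" would point to (temp dir used here)
swd_file <- file.path(tempdir(), "swd.csv")
write.csv(swd, swd_file, row.names = FALSE)
```

The single string "species" is recycled to every row, which gives the constant first column that was missing in the earlier attempts.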

R: Crop GeoTiff Raster using packages "rgdal" and "raster"

I'd like to crop GeoTIFF raster files using the two packages mentioned, "rgdal" and "raster". Everything works fine, except that the resulting output tif is of very poor quality and in greyscale rather than colour. The original data are high-quality raster maps from the Swiss Federal Office of Topography; example files can be downloaded here.
This is my code:
## install.packages("rgdal")
## install.packages("raster")
library("rgdal")
library("raster")
tobecroped <- raster("C:/files/krel_1129_2012_254dpi_LZW.tif")
ex <- raster(xmn=648000, xmx=649000, ymn=224000, ymx=225000)
projection(ex) <- proj4string(tobecroped)
output <- "c:/files/output.tif"
crop(x = tobecroped, y = ex, filename = output)
To reproduce this example, download the sample data and extract it to the folder "c:/files/". Oddly enough, with the sample data the quality of the cropped image is alright, but it is still greyscale.
I played around with the options "datatype" and "format", but didn't get anywhere with that. Can anybody point out a solution? Should I supply more information on the input data?
EDIT:
Josh's example works superbly with the sample data 2. Unfortunately, the data I have seems to be older and somewhat different. Can you tell me which option to choose, given the following GDALinfo output:
# packages same as above
OldInFile = "C:/files/krel1111.tif"
dataType(raster(OldInFile))
[1] "INT1U"
GDALinfo(OldInFile)
rows 4800
columns 7000
bands 1
lower left origin.x 672500
lower left origin.y 230000
res.x 2.5
res.y 2.5
ysign -1
oblique.x 0
oblique.y 0
driver GTiff
projection +proj=somerc +lat_0=46.95240555555556 +lon_0=7.439583333333333 +k_0=1 +x_0=600000 +y_0=200000 +ellps=bessel +units=m +no_defs
file C:/files/krel1111.tif
apparent band summary:
GDType hasNoDataValue NoDataValue blockSize1 blockSize2
1 Byte FALSE 0 1 7000
apparent band statistics:
Bmin Bmax Bmean Bsd
1 0 255 NA NA
Metadata:
AREA_OR_POINT=Area
TIFFTAG_RESOLUTIONUNIT=2 (pixels/inch)
TIFFTAG_XRESOLUTION=254
TIFFTAG_YRESOLUTION=254
Warning message:
statistics not supported by this driver
Edit (2015-03-10):
If one simply wants to crop out a subset of an existing GeoTIFF and save the cropped part to a new *.tif file, using gdalUtils::gdal_translate() may be the most straightforward solution:
library(raster) # For extent(), xmin(), ymax(), et al.
library(gdalUtils) # For gdal_translate()
inFile <- "C:/files/krel_1129_2012_254dpi_LZW.tif"
outFile <- "subset.tif"
ex <- extent(c(686040.1, 689715.9, 238156.3, 241774.2))
gdal_translate(inFile, outFile,
projwin=c(xmin(ex), ymax(ex), xmax(ex), ymin(ex)))
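The order of the projwin values is easy to get wrong: GDAL's -projwin option expects upper-left x, upper-left y, lower-right x, lower-right y, which is not the xmin, xmax, ymin, ymax order used by extent(). A small sketch of a helper that does the reordering (projwin_from_bounds() is a hypothetical name, not part of gdalUtils):

```r
# Reorder bounding-box values into the order expected by -projwin:
# upper-left x, upper-left y, lower-right x, lower-right y
projwin_from_bounds <- function(xmin, xmax, ymin, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  c(ulx = xmin, uly = ymax, lrx = xmax, lry = ymin)
}

pw <- projwin_from_bounds(686040.1, 689715.9, 238156.3, 241774.2)
```

The names are only there for readability; unname(pw) gives the plain numeric vector to pass as projwin.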
Looks like you need to change two details.
First, the *.tif file you're reading in has three bands, so should be read in using stack(). (Using raster() on it will only read in a single band (the first one, by default) producing a monochromatic or 'greyscale' output).
Second (for reasons mentioned here) writeRaster() will by default write out the values as real numbers (Float64 on my machine). To explicitly tell it you instead want to use bytes, give it the argument datatype='INT1U'.
library("rgdal")
library("raster")
inFile <- "C:/files/krel_1129_2012_254dpi_LZW.tif"
outFile <- "out.tif"
## Have a look at the format of your input file to:
## (1) Learn that it contains three bands (so should be read in as a RasterStack)
## (2) Contains values written as Bytes (so you should write output with datatype='INT1U')
GDALinfo(inFile)
## Read in as three separate layers (red, green, blue)
s <- stack(inFile)
## Crop the RasterStack to the desired extent
ex <- raster(xmn=648000, xmx=649000, ymn=224000, ymx=225000)
projection(ex) <- proj4string(s)
s2 <- crop(s, ex)
## Write it out as a GTiff, using Bytes
writeRaster(s2, outFile, format="GTiff", datatype='INT1U', overwrite=TRUE)
All of which outputs the following tiff file:

Merging multiple rasters in R

I've been trying to find a time-efficient way to merge multiple raster images in R. These are adjacent ASTER scenes from the southern Kilimanjaro region, and my target is to put them together to obtain one large image.
This is what I have so far (the object 'ast14dmo.sd' is a list of RasterLayer objects):
# Loop through single ASTER scenes
for (i in seq(ast14dmo.sd)) {
if (i == 1) {
# Merge current with subsequent scene
ast14dmo.sd.mrg <- merge(ast14dmo.sd[[i]], ast14dmo.sd[[i+1]], tolerance = 1)
} else if (i > 1 && i < length(ast14dmo.sd)) {
tmp.mrg <- merge(ast14dmo.sd[[i]], ast14dmo.sd[[i+1]], tolerance = 1)
ast14dmo.sd.mrg <- merge(ast14dmo.sd.mrg, tmp.mrg, tolerance = 1)
} else {
# Save merged image
writeRaster(ast14dmo.sd.mrg, paste(path.mrg, "/AST14DMO_sd_", z, "m_mrg", sep = ""), format = "GTiff", overwrite = TRUE)
}
}
As you can surely guess, the code works. However, merging takes quite long, considering that each raster object is some 70 MB in size. I also tried Reduce and do.call, but that failed since I couldn't pass the argument 'tolerance', which works around the different origins of the raster files.
Anybody got an idea of how to speed things up?
You can use do.call
ast14dmo.sd$tolerance <- 1
ast14dmo.sd$filename <- paste(path.mrg, "/AST14DMO_sd_", z, "m_mrg.tif", sep = "")
ast14dmo.sd$overwrite <- TRUE
mm <- do.call(merge, ast14dmo.sd)
Here with some data, from the example in raster::merge
r1 <- raster(xmx=-150, ymn=60, ncols=30, nrows=30)
r1[] <- 1:ncell(r1)
r2 <- raster(xmn=-100, xmx=-50, ymx=50, ymn=30)
res(r2) <- c(xres(r1), yres(r1))
r2[] <- 1:ncell(r2)
x <- list(r1, r2)
names(x) <- c("x", "y")
x$filename <- 'test.tif'
x$overwrite <- TRUE
m <- do.call(merge, x)
The merge function from the raster package is a little slow. For large projects, a faster option is to work with GDAL commands in R.
library(gdalUtils)
library(rgdal)
library(raster) # for extent(), raster(), projection() and writeRaster() below
Build list of all raster files you want to join (in your current working directory).
all_my_rasts <- c('r1.tif', 'r2.tif', 'r3.tif')
Make a template raster file to build onto. Think of this as a big blank canvas to add tiles to.
e <- extent(-131, -124, 49, 53)
template <- raster(e)
projection(template) <- '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs'
writeRaster(template, file="MyBigNastyRasty.tif", format="GTiff")
Merge all raster tiles into one big raster.
mosaic_rasters(gdalfile=all_my_rasts,dst_dataset="MyBigNastyRasty.tif",of="GTiff")
gdalinfo("MyBigNastyRasty.tif")
This should work pretty well for speed (faster than merge in the raster package), but if you have thousands of tiles you might even want to look into building a vrt first.
You can use Reduce like this for example :
Reduce(function(...)merge(...,tolerance=1),ast14dmo.sd)
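The two patterns used in these answers (fixing an extra argument inside a wrapper for Reduce, and passing it as a named list element to do.call) are not specific to rasters. As an illustration only, here they are with base R's merge() for data frames and paste(), using made-up data so that no raster files are needed:

```r
# Three small tables sharing an "id" column (made-up data)
tables <- list(
  data.frame(id = 1:3, a = c(10, 20, 30)),
  data.frame(id = 1:3, b = c("x", "y", "z")),
  data.frame(id = 1:3, c = c(TRUE, FALSE, TRUE))
)

# Reduce() only passes two arguments at a time, so the extra argument
# (here `by`) is fixed inside a wrapper function
merged <- Reduce(function(x, y) merge(x, y, by = "id"), tables)

# do.call() reaches the same kind of extra argument through a named list element
pasted <- do.call(paste, c(list("a", "b", "c"), sep = "-"))
```

For the rasters in the question, `by = "id"` corresponds to `tolerance = 1`, and the named element `sep` corresponds to `tolerance`, `filename` and `overwrite`.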
SAGA GIS mosaicking tool (http://www.saga-gis.org/saga_tool_doc/7.3.0/grid_tools_3.html) gives you maximum flexibility for merging numeric layers, and it runs in parallel by default! You only have to translate all rasters/images to SAGA .sgrd format first, then run the command line saga_cmd.
I have tested the solution using gdalUtils as proposed by Matthew Bayly. It works well and is fast (I have about 1000 images to merge). However, after checking the documentation of the mosaic_rasters function here, I found that it works without creating a template raster before mosaicking the images. I have pasted the example code from the documentation below:
outdir <- tempdir()
gdal_setInstallation()
valid_install <- !is.null(getOption("gdalUtils_gdalPath"))
if(require(raster) && require(rgdal) && valid_install)
{
layer1 <- system.file("external/tahoe_lidar_bareearth.tif", package="gdalUtils")
layer2 <- system.file("external/tahoe_lidar_highesthit.tif", package="gdalUtils")
mosaic_rasters(gdalfile=c(layer1,layer2),dst_dataset=file.path(outdir,"test_mosaic.envi"),
separate=TRUE,of="ENVI",verbose=TRUE)
gdalinfo("test_mosaic.envi")
}
I was faced with this same problem and I used
#Read desired files into R
data_name1<-'file_name1.tif'
r1=raster(data_name1)
data_name2<-'file_name2.tif'
r2=raster(data_name2)
#Merge files
new_data <- raster::merge(r1, r2)
Although it did not write a new merged raster file to disk, the result was stored in the R environment and produced a merged map when plotted.
I ran into the following problem when trying to mosaic several rasters on top of each other
In vv[is.na(vv)] <- getValues(x[[i]])[is.na(vv)] :
number of items to replace is not a multiple of replacement length
As @Robert Hijmans pointed out, it was likely because of misaligned rasters. To work around this, I had to resample the rasters first:
library(raster)
x <- raster("Base_raster.tif")
r1 <- raster("Top1_raster.tif")
r2 <- raster("Top2_raster.tif")
# Resample
x1 <- resample(r1, crop(x, r1))
x2 <- resample(r2, crop(x, r2))
# Merge rasters. Make sure to use the right order
m <- merge(merge(x1, x2), x)
# Write output
writeRaster(m,
filename = file.path("Mosaic_raster.tif"),
format = "GTiff",
overwrite = TRUE)

Manipulate variables in netcdf files and write them again

I have several netCDF files. Each nc file has several variables. I am only interested in two of them, "Soil_Moisture" and "Soil_Moisture_Dqx".
I would like to filter "Soil_Moisture" based on "Soil_Moisture_Dqx": I want to replace values in "Soil_Moisture" with NA wherever the corresponding "Soil_Moisture_Dqx" pixels have values greater than 0.04.
Here are the files to download:
1- I tried this loop, but when I typed f[1] or f[2] I got something weird, which suggests that my loop is incorrect. I would be grateful for any help getting my loop corrected.
a<-list.files("C:\\3 nc files", "*.DBL", full.names = TRUE)
for(i in 1:length(a)){
f=open.ncdf(a[i])
A1 = get.var.ncdf(nc=f,varid="Soil_Moisture",verbose=TRUE)
A1* -0.000030518509475997 ## scale factor
A2 = get.var.ncdf(nc=f,varid="Soil_Moisture_Dqx",verbose=TRUE)
A2*-0.0000152592547379985## scale factor
A1[A2>0.04]=NA ## here is main calculation I need
}
2- Can anybody tell me how to write them out again?
Missing values are special values in netCDF files that are taken to indicate that the data is "missing". So you need to use set.missval.ncdf to set this value.
a<-list.files("C:\\3 nc files", "*.DBL", full.names = TRUE)
SM_NAME <- "Soil_Moisture"
SM_SDX_NAME <- "Soil_Moisture_Dqx"
library(ncdf)
lapply(a, function(filename){
nc <- open.ncdf( filename,write=TRUE )
SM <- get.var.ncdf(nc=nc,varid=SM_NAME)
SM_dqx <- get.var.ncdf(nc=nc,varid=SM_SDX_NAME)
SM[SM_dqx > 0.4] <- NA
newMissVal <- 999.9
set.missval.ncdf( nc, SM_NAME, newMissVal )
put.var.ncdf( nc, SM_NAME, SM )
close.ncdf(nc)
})
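The masking step itself is plain logical indexing and can be tried without any NetCDF file. The sketch below uses made-up 2 x 3 grids in place of the two variables; the threshold is kept in a variable because the question asks for 0.04 while the code above uses 0.4.

```r
# Made-up 2 x 3 grids standing in for the two NetCDF variables
sm  <- matrix(c(0.10, 0.25, 0.30, 0.18, 0.22, 0.40), nrow = 2)
dqx <- matrix(c(0.01, 0.05, 0.03, 0.08, 0.02, 0.04), nrow = 2)

threshold <- 0.04  # value stated in the question

# Every soil-moisture cell whose Dqx exceeds the threshold becomes NA
sm_masked <- sm
sm_masked[dqx > threshold] <- NA

# Count how many cells were masked
n_masked <- sum(is.na(sm_masked))
```

Cells where Dqx is exactly equal to the threshold are kept, because the comparison is a strict "greater than".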
EDIT: add some checks
It is interesting here to count how many points will be tagged as missing.
Without applying the odd scale factor we have:
lapply(a, function(filename){
nc <- open.ncdf( filename,write=TRUE )
SM_dqx <- get.var.ncdf(nc=nc,varid=SM_SDX_NAME)
table(SM_dqx > 0.4)
})
[[1]]
[1] 810347 91
[[2]]
[1] 810286 152
[[3]]
[1] 810287 151
[[4]]
[1] 810355 83
This can also be accomplished from the command line using CDO.
As I understand it, both variables are contained in your input file (which I will call "datafile.nc"; you will presumably want to do the following in a loop over the list of files). So first of all we extract those two variables into two separate files:
cdo selvar,Soil_Moisture datafile.nc soil_moisture.nc
cdo selvar,Soil_Moisture_Dqx datafile.nc dqx.nc
Now we will define a mask file that contains 1 when dqx<0.04 but contains NAN when dqx>=0.04
cdo setctomiss,0 -ltc,0.04 dqx.nc mask.nc
The ltc operator is "less than constant" (you may instead want lec for <=); setctomiss replaces all the zeros with NAN.
Now we multiply these together with CDO - NAN*C=NAN and 1*C=C, so this gives you a netcdf with your desired field:
cdo mul mask.nc soil_moisture.nc masked_soil_moisture.nc
you can actually combine those last two lines together if you like, and avoid the I/O of writing the mask file:
cdo mul -setctomiss,0 -ltc,0.04 dqx.nc soil_moisture.nc masked_soil_moisture.nc
But it is easier to explain the steps separately :-)
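For readers more at home in R, the logic of the three CDO steps can be mimicked on plain vectors. This is only an illustration of the arithmetic with made-up numbers, not a replacement for the CDO commands:

```r
# Made-up values standing in for the contents of soil_moisture.nc and dqx.nc
soil_moisture <- c(0.10, 0.25, 0.30, 0.18)
dqx           <- c(0.01, 0.05, 0.03, 0.08)

# ltc,0.04 : 1 where dqx < 0.04, otherwise 0
mask <- as.numeric(dqx < 0.04)

# setctomiss,0 : turn the zeros into missing values
mask[mask == 0] <- NA

# mul : missing * value stays missing, 1 * value is unchanged
masked <- mask * soil_moisture
```

The second and fourth values end up missing because their dqx is not below 0.04; the others pass through unchanged.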
You can put the whole thing in a loop over files easily in bash.
