NCO: Masking a netCDF file using another netCDF mask file with 0/1 values

I have two .nc files, data.nc and mask.nc.
data.nc contains an unmasked variable called temp, while mask.nc contains the mask in a variable called tmask with values of 0 and 1.
Using NCO, how can I apply the mask to data.nc so that cells where the mask is zero are set to missing and cells where the mask is one are left unchanged?

It's unclear what you wish to do with the mask. A common procedure is to use the mask to replace the actual values with missing values:
ncks -A -v tmask mask.nc data.nc
ncap2 -s 'where(tmask == 0) temp=temp.get_miss()' data.nc out.nc
Documentation for where and get_miss is in the manual.
If temp has more records than tmask, make the where() condition operate on a copy of tmask that has been broadcast to the size of temp:
ncap2 -s '*big_mask=0*temp+tmask;where(big_mask == 0) temp=temp.get_miss()' data.nc out.nc
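As a rough illustration of what the two ncap2 commands above do to the numbers, here is a minimal plain-Python sketch. It is not NCO code: the MISSING value, the apply_mask helper and the tiny arrays are all hypothetical, and a single 2-D tmask is simply reused for every time record to mimic the broadcast to big_mask.

```python
# Plain-Python sketch of where(tmask == 0) temp = missing, with the mask
# broadcast over time. All names and values here are hypothetical.
MISSING = -9999.0  # stands in for the missing value of temp


def apply_mask(temp, tmask, missing=MISSING):
    """Return a copy of temp (time, y, x) with cells set to `missing`
    wherever the 2-D tmask (y, x) is zero; other cells are unchanged."""
    return [
        [
            [missing if m == 0 else value for value, m in zip(row, mask_row)]
            for row, mask_row in zip(record, tmask)
        ]
        for record in temp  # the same mask is reused for every record
    ]


temp = [
    [[1.0, 2.0], [3.0, 4.0]],  # time step 0
    [[5.0, 6.0], [7.0, 8.0]],  # time step 1
]
tmask = [[1, 0], [0, 1]]

masked = apply_mask(temp, tmask)
```

Cells where tmask is 1 keep their original value in every record, which is the behaviour asked for in the question.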

To do the same operation in cdo you could try this, which sets zero to missing in the mask first before taking the product:
cdo setctomiss,0 mask.nc maskm.nc
cdo mul data.nc maskm.nc masked_data.nc
cdo automatically repeats the mask in time so that it matches the length of the data file, a step known as "broadcasting".
I have a YouTube video on masking here for further guidance, and other material on temporal broadcasting.
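The logic of the two cdo steps (turn zeros in the mask into missing, then multiply) can be sketched in plain Python. This is only an illustration under stated assumptions: None stands in for a missing value, the helper names are hypothetical, and the rule that a product involving a missing value is itself missing is written out explicitly.

```python
# Sketch of "set 0 to missing in the mask, then multiply". Not cdo code;
# MISSING and both helper functions are hypothetical.
MISSING = None


def set_const_to_missing(mask, const=0):
    """Replace every cell equal to `const` with MISSING."""
    return [[MISSING if m == const else m for m in row] for row in mask]


def multiply(data, mask):
    """Multiply each time record of data (time, y, x) by the 2-D mask.
    Any product involving MISSING is MISSING."""
    return [
        [
            [
                MISSING if (d is MISSING or m is MISSING) else d * m
                for d, m in zip(data_row, mask_row)
            ]
            for data_row, mask_row in zip(record, mask)
        ]
        for record in data
    ]


data = [[[1.5, 2.5]], [[3.5, 4.5]]]  # two time steps, one row, two columns
mask = [[0, 1]]

mask_with_missing = set_const_to_missing(mask)
masked_data = multiply(data, mask_with_missing)
```

Multiplying by the remaining 1s leaves the retained values unchanged, which is why the product works as a mask.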

Related

CDO : Masking 3D and 4D variables within the same netcdf file

I have a netCDF file called data.nc and a masking file called mask.nc. The data.nc file has a 3D variable A with dimensions (time,long,lat) and a 4D variable B with dimensions (time,depth,long,lat).
The mask.nc file has two masking variables, mask_3d (time,long,lat) and mask_4d (time,depth,long,lat), with values of 0 and 1.
So far, I am masking each variable separately using:
cdo -div -selname,A data.nc -selname,mask_3d mask.nc out.nc
and
cdo -div -selname,B data.nc -selname,mask_4d mask.nc out2.nc
My question is:
How can I mask both variables A and B in data.nc using only one command?
I'm not sure this really qualifies as an answer (perhaps it should be a comment), but I don't think this is possible with cdo. If I understand correctly, you want both masked variables in the same output file (cdo only allows one output file, so your question implies this).
The reason is that you would need to cat the two output files together to get what you want. Because cdo cat accepts a variable number of input files, and even wildcards (e.g. cdo cat *.nc out.nc), it doesn't know how many input files to expect, so cdo does not let you pipe it in combination with other commands, as it can't interpret them safely.
Thus you would need to keep this as three lines (at least for a cdo-based solution, I think, but I stand to be corrected):
cdo -div -selname,A data.nc -selname,mask_3d mask.nc out1.nc
cdo -div -selname,B data.nc -selname,mask_4d mask.nc out2.nc
cdo cat out?.nc out.nc
Sorry about that. That said, once commands start to get long, I think keeping them separate, as here, aids the legibility of the code.

NetCDF spatially merging to global data

Currently I use global precipitation (ppt) and potential evapotranspiration (pet) data to calculate SPEI. As I have limited hardware resources, I divided the global ppt and pet data into 32 parts, each file covering 45x45 degrees and containing 756 monthly time steps from 1958 to 2020 (tile01.nc, tile02.nc, ... tile32.nc).
For example, to do this I use cdo sellonlatbox,-180,-135,45,90 in.nc out.nc or ncks -d lat,45.,90. -d lon,-180.,-135. in.nc -O out.nc
As required by the SPEI script, I reordered and fixed the dimensions from time,lat,lon to lat,lon,time using ncpdq and ncks.
The SPEI output comes back as lat,lon,time, so I reordered the dimensions to time,lat,lon using ncpdq.
Each tile of SPEI output covers 45x45 degrees and contains 756 monthly SPEI values from 1958 to 2020.
Finally, I need to merge all 32 output files into one to get the global SPEI output. I have tried cdo mergegrid, but the result is not what I expected. Is there a cdo or nco command that works like gdal_merge does for GeoTIFF files?
[Figure: example of the SPEI output]
UPDATE
I managed to merge all the data using cdo collgrid, as suggested by Robert below.
[Figure: merged global SPEI result]
I believe you want to use CDO's distgrid and collgrid methods for this.
First, run this:
cdo distgrid,4,8 in.nc obase
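That will split the file into 32 tiles, 4 in the x direction by 8 in the y direction (for 45x45-degree tiles on a global grid, the arguments would be 8,4).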
Then do the post-processing necessary on the files.
Then use collgrid to merge the files:
cdo collgrid obase* out.nc
Potentially you can just use collgrid in place of mergegrid in your present workflow, depending on how you have split the files up.
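To make the split, process, and collect workflow concrete, here is a small plain-Python sketch. It is not how cdo implements distgrid or collgrid: the function names are hypothetical, the grid is assumed to divide evenly into tiles, and each tile is placed back by a stored row/column offset, whereas the real operators work from the grid information in the files.

```python
# Sketch of splitting a 2-D field into tiles, processing each tile, and
# reassembling the result. Function names and data are hypothetical.
def split_tiles(field, nx, ny):
    """Split field (rows x cols) into ny * nx tiles.
    Returns a list of (row_offset, col_offset, tile)."""
    nrows, ncols = len(field), len(field[0])
    tile_h, tile_w = nrows // ny, ncols // nx  # assumes even division
    tiles = []
    for j in range(ny):
        for i in range(nx):
            rows = field[j * tile_h:(j + 1) * tile_h]
            tile = [row[i * tile_w:(i + 1) * tile_w] for row in rows]
            tiles.append((j * tile_h, i * tile_w, tile))
    return tiles


def collect_tiles(tiles, nrows, ncols):
    """Place every tile back at its offset in a full-size field."""
    out = [[None] * ncols for _ in range(nrows)]
    for row0, col0, tile in tiles:
        for dj, row in enumerate(tile):
            out[row0 + dj][col0:col0 + len(row)] = row
    return out


field = [[10 * r + c for c in range(8)] for r in range(4)]
tiles = split_tiles(field, nx=4, ny=2)

# Stand-in for the per-tile post-processing step (here: double each value).
processed = [
    (row0, col0, [[2 * v for v in row] for row in tile])
    for row0, col0, tile in tiles
]
merged = collect_tiles(processed, nrows=4, ncols=8)
```

Because each tile carries its own position, the order in which tiles are collected does not matter.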

Extract specific particle id variable from a netCDF file

I have a netCDF file output from a particle dispersion model (GNOME).
As it is a particle dispersion model, I have every particle identified by a particle id variable:
int id(data) ;
id:description = "particle ID" ;
id:units = "1" ;
I need to extract only some specific particle IDs and their locations. I have tried the cdo and nco operators and get these errors:
ncks -v longitude,latitude -d id,62001. infile.nc outputfile.nc
ncks: ERROR dimension id is not in input file
cdo -select,name=latitude,longitude,id=62968 infile.nc outputfile.nc
cdo select (Abort): Unsupported selection keyword: 'id'!
I hope someone could help me. Thanks
The dimension is actually named "data". I suggest you rename the dimension to "id". Then your command should work:
ncrename -d data,id in.nc
ncks -v longitude,latitude -d id,62001. in.nc out.nc
Or you could leave the names alone, and if the id is really the data index, then this should work:
ncks -v longitude,latitude -d data,62001 in.nc out.nc
NB: no decimal point this time since data is not a coordinate, as explained here.
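The decimal-point distinction can be sketched as follows. This is only an illustration of the idea, not NCO's parser: resolve_hyperslab is a hypothetical helper, the ID values are made up, and picking the nearest coordinate value for a decimal argument is an assumption of the sketch.

```python
# Sketch of how a "-d dim,value" argument might be resolved: a value with a
# decimal point is looked up in the coordinate variable, while an integer is
# used directly as a zero-based index. Hypothetical helper and data.
def resolve_hyperslab(spec, coordinate=None):
    """Return the zero-based index selected by `spec` (a string)."""
    if "." in spec:
        if coordinate is None:
            raise ValueError("dimension has no coordinate variable")
        target = float(spec)
        # Assumption: choose the position whose coordinate value is nearest.
        return min(range(len(coordinate)),
                   key=lambda k: abs(coordinate[k] - target))
    return int(spec)


particle_ids = [62000.0, 62001.0, 62002.0]  # made-up coordinate values

by_value = resolve_hyperslab("62001.", particle_ids)  # position of ID 62001
by_index = resolve_hyperslab("2", particle_ids)       # third element
```

With no coordinate variable on the dimension, only the integer (index) form can be resolved, which matches the note above about dropping the decimal point.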
EDIT (20210921), in response to the comment below: unless I am missing something, the dataset would need to have a variable traj dimensioned traj(time,data) for the suggested commands to give the result you desire. The header of your file shows no such variable.

How to merge 2 separate netcdf files into 1 and add a time dimension

I have two NetCDF files of Greenland ice sheet velocities, one from 2015 and one from 2016. These files contain gridded data where the velocity is given on x,y coordinates. However, no time dimension is included. How can I merge these two files into one, where the final file has a time dimension? So instead of two separate x,y,z grids, I would like to have one x,y,z,t data structure, where time = 2.
Thanks!
If the files contain the same variables and are the same size, try ncecat:
ncecat -u time file1.nc file2.nc out.nc
You can add a time dimension to a file with ncap2:
ncap2 -s 'defdim("time",1);time[time]=74875.0;time#long_name="Time"; etc.etc.etc.' -O ~/nco/data/in.nc ~/foo.nc
I suggest reading this thread for more details: https://sourceforge.net/p/nco/discussion/9830/thread/cee4e1ad/
After you have done that, you can merge them together using either the ncrcat command (see https://linux.die.net/man/1/ncrcat) or cdo:
cdo mergetime file1.nc file2.nc combined_file.nc
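The two steps described above (give each file a time dimension of length one, then join the files along time) can be sketched in plain Python. The dictionaries stand in for files, and the helper names, the variable name velocity and the time values are all hypothetical; the records are sorted by time when merged, as mergetime does.

```python
# Sketch of adding a length-1 time dimension to each 2-D grid and then
# merging the grids along time. All names and values are hypothetical.
def add_time_dimension(grid, time_value):
    """Wrap a 2-D grid (y, x) as a dataset with a single time record."""
    return {"time": [time_value], "velocity": [grid]}


def merge_time(*datasets):
    """Concatenate datasets along time, sorted by their time values."""
    records = [
        (t, grid)
        for ds in datasets
        for t, grid in zip(ds["time"], ds["velocity"])
    ]
    records.sort(key=lambda pair: pair[0])
    return {
        "time": [t for t, _ in records],
        "velocity": [grid for _, grid in records],
    }


grid_2015 = [[1.0, 2.0], [3.0, 4.0]]
grid_2016 = [[1.5, 2.5], [3.5, 4.5]]

file_2016 = add_time_dimension(grid_2016, 2016.0)
file_2015 = add_time_dimension(grid_2015, 2015.0)

combined = merge_time(file_2016, file_2015)  # input order does not matter
```

The result has a time axis of length 2 and one x,y grid per time step.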

How should I use CDO selyear? I get an output file four times larger

CDO had worked fine for me until I ran into this. I have a netCDF file of daily data from year 2101 to 2228, and I want to obtain a file with only the years 2101 to 2227, so I run:
cdo selyear,2101/2227 in.nc out.nc
But the output file is more than four times the size of the input! It seems to have the right number of time steps, and the start and end dates are correct. Latitude and longitude also seem to be the same as in the input, so I wonder why the file is so much larger.
Perhaps try to retain the compression by having cdo write compressed netCDF-4 output:
cdo -f nc4c -z zip_9 selyear,2101/2227 in.nc out.nc
This is the maximum compression; I usually use zip_5 or so, as the files are not much larger than with zip_9 and the command is much faster.
An alternative is to (re?)pack the data to shorts with add_offset and scale_factor, like this:
cdo pack -selyear,2101/2227 in.nc out.nc
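To show what packing to shorts with scale_factor and add_offset means numerically, here is a minimal plain-Python sketch. The formula used to choose the two attributes is one common convention and is an assumption here, not necessarily what cdo pack computes, and the sample values are made up. The last line also shows that a 16-bit short takes a quarter of the bytes of a 64-bit double, which is one possible explanation for a factor of four if the input was packed and the output was written unpacked.

```python
# Sketch of packing floating-point values into 16-bit integers using
# unpacked = packed * scale_factor + add_offset. The choice of scale_factor
# and add_offset below is an assumed convention; the data are hypothetical.
import struct

values = [250.0, 273.15, 310.5]

vmin, vmax = min(values), max(values)
scale_factor = (vmax - vmin) / 65534.0  # maps the range onto -32767..32767
add_offset = (vmax + vmin) / 2.0

packed = [round((v - add_offset) / scale_factor) for v in values]
unpacked = [p * scale_factor + add_offset for p in packed]

# Bytes per value: 2 for a short, 8 for a double.
bytes_ratio = struct.calcsize("<d") // struct.calcsize("<h")
```

Packing is lossy: each unpacked value can differ from the original by up to about half of scale_factor.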
