Is there a convenient way to convert multiple data types with ncap2? - netcdf

I would like to know whether there is a convenient way to convert the data types of multiple variables in a file, e.g. with ncap2.
More specifically, I would like to convert all variables of type double to type float.
I understand that one way to do it is ncap2 -s 'var1=var1.convert(NC_FLOAT);var2=var2.convert(NC_FLOAT)' in.nc out.nc, but this is not convenient when there are a lot of variables.
Is there a smarter way to do this?
Cheers

Good question. This is relatively easy to do with "variable pointers" aka "vpointers" described here. Try this:
ncap2 -s '@all=get_vars_in();*sz=@all.size();for(*idx=0;idx<sz;idx++){@var_nm=@all(idx);*@var_nm=*@var_nm.convert(NC_FLOAT);}' in.nc out.nc
Responding to question in comments below:
Your attempt does not work because convert() makes the change on the RHS. The RHS change is volatile until it is saved in a LHS variable. A small change in your script fixes this:
ncap2 -s '@all=get_vars_in();*sz=@all.size();*idx=0;for(idx=0;idx<sz;idx++){@var_nm=@all(idx);if(*@var_nm.type() == NC_DOUBLE) *@var_nm=*@var_nm.float();}' in.nc out.nc
Also note that ncpdq appears to be the best operator to use for your purposes, because it has a packing map that automagically converts all doubles to floats, and you can enable compression easily at the same time too:
ncpdq -7 -L 1 --pck_map=dbl_flt in.nc out.nc
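For a small, known set of variables, the explicit per-variable form from the question can also be generated rather than typed by hand. The following is only a sketch: build_convert_script is a hypothetical helper, and var1 and var2 stand in for real variable names. It only builds the script string; it does not inspect the file or check which variables are actually doubles.

```shell
# Build an ncap2 script that converts each named variable to NC_FLOAT.
# Variable names are passed as arguments; nothing is executed here.
build_convert_script() {
  script=""
  for var in "$@"; do
    script="${script}${var}=${var}.convert(NC_FLOAT);"
  done
  printf '%s\n' "$script"
}

# Hypothetical variable names, for illustration only
build_convert_script var1 var2
```

The generated string could then be passed to ncap2 as ncap2 -s "$(build_convert_script var1 var2)" in.nc out.nc.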

CDO can also convert to 32-bit float. I should emphasize that this doesn't exactly answer the question, because it converts everything to float, so Charlie's answer is the correct one; this is more for the general information of readers of this question.
cdo -b f32 copy in.nc out.nc

Related

CDO : Masking 3D and 4D variables within the same netcdf file

I have a netcdf file called data.nc and a masking file called mask.nc. The data.nc file has a 3D variable A with dimensions over (time,long,lat), and 4D variable B with dimensions over (time,depth,long,lat).
The mask.nc file has two masking variables mask_3d (time,long,lat) and mask_4d (time,depth,long,lat) with 0 and 1 values.
So far, I am masking each variable separately using:
cdo -div -selname,A data.nc -selname,mask_3d mask.nc out.nc
and
cdo -div -selname,B data.nc -selname,mask_4d mask.nc out2.nc
My question is:
How can I mask both variables A and B in data.nc using only one command?
I'm not sure this really qualifies as an answer (perhaps it should be a comment), but I think this is not possible with cdo. If I understand correctly, you essentially want the output to be in one file (cdo only allows one output file, so by definition your question implies this).
The reason is that you would need to cat the two output files together to get what you want. Because cdo cat accepts a variable number of input files, and even wildcards (e.g. cdo cat *.nc out.nc), it does not know how many input files to expect, so cdo does not let you pipe such commands in combination with other commands, as it cannot interpret them safely.
Thus you would need to keep this as three lines (at least for a cdo-based solution, I think, but I stand to be corrected):
cdo -div -selname,A data.nc -selname,mask_3d mask.nc out1.nc
cdo -div -selname,B data.nc -selname,mask_4d mask.nc out2.nc
cdo cat out?.nc out.nc
Sorry about that. That said, once commands start to get long, I think keeping them separate as here aids the legibility of the code.
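If there are more variable/mask pairs, the same three-step pattern can be written out by a small loop. This is a sketch only: mask_commands is a hypothetical helper that takes VAR:MASK pairs and prints the commands without running them, using the file names from the question. Because the final step relies on the out?.nc wildcard, it assumes fewer than ten pairs.

```shell
# Print one cdo masking command per VAR:MASK pair, then the final cat.
# This only prints commands; nothing is executed here.
mask_commands() {
  i=0
  for pair in "$@"; do
    i=$((i + 1))
    var=${pair%%:*}    # text before the colon
    mask=${pair#*:}    # text after the colon
    printf 'cdo -div -selname,%s data.nc -selname,%s mask.nc out%d.nc\n' "$var" "$mask" "$i"
  done
  printf 'cdo cat out?.nc out.nc\n'
}

mask_commands A:mask_3d B:mask_4d
```

After checking the printed commands, the output can be piped to sh to run them.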

NetCDF spatially merging to global data

Currently I use global precipitation (ppt) and potential evapotranspiration (pet) data to calculate SPEI. As I have limited hardware resources, I divided the global ppt and pet data into 32 parts, each file covering 45x45 deg and containing 756 monthly time steps from 1958-2020 (tile01.nc, tile02.nc, ... tile32.nc).
For example, to do this I use cdo sellonlatbox,-180,-135,45,90 in.nc out.nc or ncks -d lat,45.,90. -d lon,-180.,-135. in.nc -O out.nc
As required by the SPEI script, I reordered and fixed the dimensions from time,lat,lon to lat,lon,time using ncpdq and ncks.
The SPEI output comes in lat,lon,time order, so I reordered the dimensions back to time,lat,lon using ncpdq.
Each tile's SPEI output covers 45x45 deg and contains 756 monthly SPEI fields from 1958-2020.
Finally, I need to merge all 32 output files into one to get the global SPEI output. I have tried cdo mergegrid, but the result is not what I expected. Is there a cdo or nco command that solves this problem, with a function similar to gdal_merge for GeoTIFF files?
Below is the example of the SPEI output
UPDATE
I managed to merge all the data using cdo collgrid as suggested by Robert below. And here's the result:
I believe you want to use CDO's distgrid and collgrid methods for this.
First, run this:
cdo distgrid,4,8 in.nc obase
That will split the files up the way you want them.
Then do the post-processing necessary on the files.
Then use collgrid to merge the files:
cdo collgrid obase* out.nc
Potentially you can just use collgrid in place of mergegrid in your present work flow, depending on how you have split the files up.

Extract specific particle id variable from a netCDF file

I have a netCDF file output from a particle dispersion model (GNOME).
As it is a particle dispersion model, I have every particle identified by a particle id variable:
int id(data) ;
id:description = "particle ID" ;
id:units = "1" ;
I need to extract only some specific particle IDs and their locations. I have tried the cdo and nco operators and I get these errors:
ncks -v longitude,latitude -d id,62001. infile.nc outputfile.nc
ncks: ERROR dimension id is not in input file
cdo -select,name=latitude,longitude,id=62968 infile.nc outputfile.nc
cdo select (Abort): Unsupported selection keyword: 'id'!
I hope someone could help me. Thanks
The dimension is actually named "data". I suggest you rename the dimension to "id". Then your command should work:
ncrename -d data,id in.nc
ncks -v longitude,latitude -d id,62001. in.nc out.nc
or you could leave the names alone, and if the id is really the data index, then this should work:
ncks -v longitude,latitude -d data,62001 in.nc out.nc
NB: no decimal point this time since data is not a coordinate, as explained here.
EDIT: 20210921 in response to comment below, unless I am missing something, the dataset would need to have a variable traj dimensioned traj(time,data) in order for the suggested commands to have the result you desire. The header of your file shows no such variable.

How to add variable from nc1 to nc2 file without deleting/removing the variables in nc2?

I have two kinds of variables in two different nc files. The dimensions and everything else are the same; I just have to add one more variable to the existing nc file. How can I do this (using CDO, R, or anything else)?
I used the command line (cdo selvar,varname in.nc out.nc), but it doesn't help. This command does work, but it deletes the existing variables. Any suggestions on how I can add new variables without deleting the variables already inside the nc file?
Many thanks.
cdo solution
From your comment of clarification, I think the cdo command you need is cat
cdo cat aaa.nc bbb.nc output.nc
This will concatenate the fields in bbb.nc to the ones in aaa.nc and put the result in output.nc
nco solution
As an alternative you can also use ncrcat:
ncrcat aaa.nc bbb.nc output.nc
The NCO solution is
ncks -A -v yyy bbb.nc aaa.nc
as documented here. (Adrian's suggested NCO command would concatenate the files in time, not append one variable to the other file)

How should I use CDO selyear? I get an output file four times larger

CDO seemed to work fine for me until I ran into this. I have a netcdf file of daily data from year 2101 to 2228, and I want to obtain a file with only the years 2101 to 2227, so I run:
cdo selyear,2101/2227 in.nc out.nc
But the output file is more than four times the size of the input! It seems to have the right number of time steps, and the start and end dates are correct. Also, latitude and longitude seem to be the same as in the input, so I wonder why the file is so much larger.
Perhaps try to retain the compression by asking cdo for compressed netCDF-4 output:
cdo -f nc4c -z zip_9 selyear,2101/2227 in.nc out.nc
This is the maximum compression; usually I use zip_5 or so, as the files are not much larger than with zip_9 and the command is much faster.
An alternative is to (re?)pack the data to SHORTS with add_offset and scale_factor like this:
cdo pack -selyear,2101/2227 in.nc out.nc
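To judge whether the growth comes from lost compression or packing, it can help to compare the output size with a rough estimate of the uncompressed data size. The sketch below assumes a single variable on a regular grid; raw_size_bytes is a hypothetical helper, and the grid dimensions and step count are made-up numbers, not taken from the question.

```shell
# Estimate the uncompressed size in bytes of one gridded variable.
# Arguments: time steps, latitudes, longitudes, bytes per value
raw_size_bytes() {
  echo $(( $1 * $2 * $3 * $4 ))
}

# Hypothetical 180 x 360 grid of 4-byte floats with 46000 daily steps
raw_size_bytes 46000 180 360 4
```

If the output file is close to this estimate while the input is much smaller, the input was most likely compressed or packed; ncdump -hs in.nc should show attributes such as _DeflateLevel or scale_factor in that case.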
