I am trying to extract a single variable, kd_490, from a NetCDF file (served over THREDDS) using NCO.
My code is below:
ncks -v kd_490 -d lat,40.0,70.0 -d lon,-20.0,15.0 https://rsg.pml.ac.uk/thredds/dodsC/cci/v4.2-release/geographic/daily/kd/1998/ESACCI-OC-L3S-K_490-MERGED-1D_DAILY_4km_GEO_PML_KD490_Lee-19980102-fv4.2.nc out.nc
However, along with kd_490 it also extracts kd_490_bias and kd_490_rmsd. I know that ncks extracts "associated" variables (see here). However, I am not clear on why NCO identifies these as "associated" variables.
I cannot figure out a way to only select the variable kd_490. Adding "-C" to the code results in the grid being wrong. Does anyone know how?
Note: CDO can solve this problem. However, CDO is less efficient at extracting the spatial subset in the code, so NCO is more appropriate.
NCO identifies kd_490_bias and kd_490_rmsd as associated variables because of the ancillary_variables attribute in kd_490:
float kd_490(time,lat,lon) ;
kd_490:_FillValue = 9.96921e+36f ;
kd_490:long_name = "Downwelling attenuation coefficient at 490nm, derived using Lee 2005 equation and bbw from Zhang 2009 (following the SeaDAS Kd_lee algorithm)" ;
kd_490:units = "m-1" ;
kd_490:ancillary_variables = "kd_490_rmsd kd_490_bias" ;
kd_490:grid_mapping = "crs" ;
kd_490:parameter_vocab_uri = "http://vocab.ndg.nerc.ac.uk/term/P071/19/CFSN0064" ;
kd_490:standard_name = "volume_attenuation_coefficient_of_downwelling_radiative_flux_in_sea_water" ;
kd_490:units_nonstandard = "m^-1" ;
kd_490:_ChunkSizes = 1, 270, 270 ;
as documented here. To extract kd_490 without the ancillary variables, but with the grid variables (which are other "associated" variables), this works for me:
ncks -O -C -v kd_490,crs,lat,lon -d lat,40.0,70.0 -d lon,-20.0,15.0 https://rsg.pml.ac.uk/thredds/dodsC/cci/v4.2-release/geographic/daily/kd/1998/ESACCI-OC-L3S-K_490-MERGED-1D_DAILY_4km_GEO_PML_KD490_Lee-19980102-fv4.2.nc ~/foo2.nc
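If the same extraction has to be scripted over many files, the command can be assembled programmatically. The following is a minimal Python sketch, not part of NCO: the helper name build_ncks_subset_cmd and the file names in.nc and out.nc are hypothetical placeholders. It pairs -C with an explicit variable list, so the grid variables have to be named alongside kd_490.

```python
import shlex


def build_ncks_subset_cmd(src, dst, variables, bounds):
    """Assemble an ncks argument list (hypothetical helper, not an NCO API).

    -O overwrites dst; -C switches off automatic extraction of associated
    variables, so every variable to keep must be named in `variables`.
    """
    cmd = ["ncks", "-O", "-C", "-v", ",".join(variables)]
    for dim, (lower, upper) in bounds.items():
        # float() guarantees a decimal point in the limits, matching the
        # coordinate-value form used in the command above.
        cmd += ["-d", f"{dim},{float(lower)},{float(upper)}"]
    return cmd + [src, dst]


cmd = build_ncks_subset_cmd(
    "in.nc",   # placeholder for the THREDDS URL
    "out.nc",  # placeholder output file
    ["kd_490", "crs", "lat", "lon"],
    {"lat": (40.0, 70.0), "lon": (-20.0, 15.0)},
)
print(shlex.join(cmd))
```

On a machine where ncks is installed, the list could then be handed to subprocess.run(cmd, check=True).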
I have a .nc file with a group structure, one of the groups containing a variable I need to delete.
Using xarray, if I want to delete the variable I can only extract its group as a new .nc file.
import xarray as xr
ds = xr.load_dataset(path_test,group='/data_01/ku')
ds = ds.drop_vars(["ssh"])
ds.to_netcdf(path_test, mode="a", group='/data_01/ku')
Using the ncks command (from NCO) like this:
ncks -x -g data_01/ku -v ssh in.nc out.nc
I get a memory error.
Does anyone know how to delete one specific variable while keeping the complete group structure of the file?
Thanks guys
The ncks command you tried looks correct, and such commands work for me.
Try adding the -C switch just in case:
ncks -O -x -C -g g1/g1g1 -v ppc_dbl ~/nco/data/in_grp.nc ~/foo.nc
Seems like you got unlucky, or possibly are employing an old NCO version?
I am using monthly global potential evapotranspiration data from TerraClimate for 1958-2020 (available as one nc file per year) and plan to concatenate them all into a single nc file.
The data has one variable, pet, with three dimensions: pet(time,lat,lon).
I managed to combine all of the data using cdo mergetime TerraClimate_*.nc, which generated an output file of around 100GB.
For analysis on a Windows machine, I need a single netCDF file with the dimension order lat,lon,time. What I have done is as follows:
1. Reorder the dimensions from time,lat,lon to lat,lon,time using the ncpdq command:
for fl in *.nc; do ncpdq -a lat,lon,time $fl ../pet2/$fl; done
2. Loop over all files in the folder to make time the record dimension (needed for concatenating files) using the ncks command:
for fl in *.nc; do ncks -O --mk_rec_dmn time $fl $fl; done
3. Concatenate all nc files in the folder into one nc file using the ncrcat command:
ncrcat -h TerraClimate_*.nc -O TerraClimate_pet_1958_2020.nc
It worked, but the result is not what I expected: it generated a file of only 458KB, and when I check the result in Panoply every value is -3276.7. See the picture below.
I have checked the files from steps 1 and 2, and everything is correct.
I also tried to concatenate only 2 files, using the 1958 and 1959 data (each file is 103MB), but the result is still not what I expected.
ncrcat -h TerraClimate_pet_1958.nc TerraClimate_pet_1959.nc -O ../TerraClimate_pet_1958_1959.nc
Did I miss something in the code, or is the code wrong? Any suggestions on how to solve the problem?
UPDATE 1 (22 Oct 2021):
Here's the metadata of the original data downloaded from the link above.
UPDATE 2 (23 Oct 2021):
Following the suggestion from Charlie, I unpacked all the data from step 2 above using the command below.
for fl in *.nc4; do ncpdq --unpack $fl ../unpack/$fl; done
Here's example metadata from the unpacked files.
And here is the data visualised using Panoply.
Then I tested the concatenation again using 2 of the unpacked files (1958 and 1959):
ncrcat -h TerraClimate_pet_1958.nc TerraClimate_pet_1959.nc -O ../TerraClimate_pet_1958_1959.nc
Unfortunately the result remains the same: I got an output file of only 1MB. Below is the metadata.
And here is the ncrcat result visualised using Panoply.
Your commands appear to be correct, however I suspect that the data in the input files is packed. As explained in the ncrcat documentation here, the input data should be unpacked (e.g., with ncpdq --unpack) prior to concatenating all the input files (unless they all share the same values of scale_factor and add_offset). If that does not solve the problem, then (1) there is likely an issue with _FillValue and (2) please post the pet metadata from a sample input file.
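To see why concatenating packed files can go wrong, here is a small pure-Python sketch of the packing arithmetic (unpacked = packed * scale_factor + add_offset). All numbers are made up for illustration and are not taken from TerraClimate. The sketch assumes the concatenated file ends up with only the first file's scale_factor and add_offset.

```python
def unpack(packed, scale_factor, add_offset):
    """Apply the usual netCDF packing convention to a list of integers."""
    return [p * scale_factor + add_offset for p in packed]


# Two hypothetical yearly files that store the same physical values
# (10.0 and 25.0) with different packing parameters.
year_a = {"packed": [100, 250], "scale_factor": 0.1, "add_offset": 0.0}
year_b = {"packed": [0, 30], "scale_factor": 0.5, "add_offset": 10.0}

# Unpack first, then concatenate: each file uses its own parameters.
unpack_then_concat = (
    unpack(year_a["packed"], year_a["scale_factor"], year_a["add_offset"])
    + unpack(year_b["packed"], year_b["scale_factor"], year_b["add_offset"])
)

# Concatenate the packed integers and interpret all of them with the
# first file's parameters (the assumption stated in the text above).
concat_then_unpack = unpack(
    year_a["packed"] + year_b["packed"],
    year_a["scale_factor"],
    year_a["add_offset"],
)

print("unpack, then concatenate:", unpack_then_concat)
print("concatenate, then unpack:", concat_then_unpack)
```

The second list misreads the values that came from year_b, which is the failure that unpacking beforehand avoids.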
I have a netCDF file output from a particle dispersion model (GNOME).
As it is a particle dispersion model, I have every particle identified by a particle id variable:
int id(data) ;
id:description = "particle ID" ;
id:units = "1" ;
I need to extract only some specific particle IDs and their locations. I have tried the CDO and NCO operators, and I get these errors:
ncks -v longitude,latitude -d id,62001. infile.nc outputfile.nc
ncks: ERROR dimension id is not in input file
cdo -select,name=latitude,longitude,id=62968 infile.nc outputfile.nc
cdo select (Abort): Unsupported selection keyword: 'id'!
I hope someone could help me. Thanks
The dimension is actually named "data". I suggest you rename the dimension to "id". Then your command should work:
ncrename -d data,id in.nc
ncks -v longitude,latitude -d id,62001. in.nc out.nc
or you could leave the names alone, and if the id is really the data index, then this should work:
ncks -v longitude,latitude -d data,62001 in.nc out.nc
NB: no decimal point this time since data is not a coordinate, as explained here.
EDIT: 20210921 in response to comment below, unless I am missing something, the dataset would need to have a variable traj dimensioned traj(time,data) in order for the suggested commands to have the result you desire. The header of your file shows no such variable.
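The difference between selecting by position and selecting by value can be illustrated outside NCO. The sketch below uses plain Python lists with made-up particle records (the arrays and helper names are hypothetical, not GNOME output). It only mirrors the idea of index versus value selection and does not reproduce how ncks matches coordinate values.

```python
# Made-up particle records along a flat "data" dimension.
ids = [62000, 62001, 62002, 62000, 62001, 62002]
longitude = [-5.0, -4.5, -4.0, -5.1, -4.6, -4.1]
latitude = [50.0, 50.5, 51.0, 50.1, 50.6, 51.1]


def select_by_index(values, index):
    """Pick the element at one position, like an integer hyperslab limit."""
    return values[index]


def select_by_id(values, ids, wanted_id):
    """Pick every element whose particle id equals wanted_id."""
    return [v for v, i in zip(values, ids) if i == wanted_id]


# Position 1 along "data" holds a single record of particle 62001.
print(select_by_index(longitude, 1), select_by_index(latitude, 1))

# Matching on the id value returns every record written for that particle.
print(select_by_id(longitude, ids, 62001), select_by_id(latitude, ids, 62001))
```

If the particle ID is not the same as the position along data, only the value-based selection returns what the question asks for.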
I have two netcdf files:
file_1.nc with variables qty_1 and qty_2 and
file_2.nc with variables qty_3, qty_4 and qty_5.
I want a file with 3 variables qty_3=qty_3*qty_2; qty_4=qty_4+qty_2 and qty_5.
Currently I first copy the variables to file_2 using
ncks -A -v qty_1,qty_2 file_1.nc file_2.nc
and then I do the math operations with
ncap2 -A -s 'qty_3=qty_3*qty_2' -s 'qty_4=qty_4+qty_2' file_2.nc
This works, but it takes some time.
Is there a way I can do this calculation in a single command ?
If you aren't totally dependent on NCO, you could do this with CDO:
cdo -selname,qty_3,qty_4,qty_5 -aexpr,'qty_3=qty_3*qty_2;qty_4=qty_4+qty_2' -merge file_1.nc file_2.nc out.nc
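CDO evaluates a chain of operators from right to left, so the command above merges first, then applies the expressions, then selects the variables to keep. A rough Python sketch of that order follows, using dictionaries of plain lists as stand-ins for the files. The values are invented, and this is not how CDO is implemented.

```python
# Stand-ins for file_1.nc and file_2.nc; values are invented.
file_1 = {"qty_1": [1.0, 2.0], "qty_2": [10.0, 20.0]}
file_2 = {"qty_3": [3.0, 4.0], "qty_4": [5.0, 6.0], "qty_5": [7.0, 8.0]}

# Step 1 (rightmost operator, -merge): combine the variables of both files.
merged = {**file_1, **file_2}

# Step 2 (-aexpr): evaluate the expressions against the merged variables.
computed = dict(merged)
computed["qty_3"] = [a * b for a, b in zip(merged["qty_3"], merged["qty_2"])]
computed["qty_4"] = [a + b for a, b in zip(merged["qty_4"], merged["qty_2"])]

# Step 3 (leftmost operator, -selname): keep only the requested variables.
result = {name: computed[name] for name in ("qty_3", "qty_4", "qty_5")}

print(result)
```

Because everything happens in one chain, no intermediate file has to be written between the merge and the arithmetic.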
I have a large number of NetCDF files from which I would like to extract a small number of variables for one location, and merge them into a new NetCDF file. The dimensions of the files are:
dimensions:
time = 18 ;
level = 65 ;
levelh = 66 ;
domain = 36 ;
I can extract/merge the variables for all domains with something like:
cdo select,name=u,v file1.nc file2.nc out.nc
However, all other operators seem to relate to selections in space (e.g. sellonlatbox) or time (e.g. seltimestep), and I can't find a way to select only one domain from the NetCDF files. Is this possible with CDO or NCO?
Not sure I fully understand the question/intent. NCO treats all dimensions equally. If you want domain #17 then try
ncrcat -v u,v -d domain,17 file1.nc file2.nc out.nc
If file1.nc and file2.nc are not sequential in a record coordinate then try
ncecat -v u,v -d domain,17 file1.nc file2.nc out.nc
ADDED 20180929:
or if you don't like that, and the files do not have a record dimension yet are time-sequential then before using ncrcat turn the temporal dimension into a record coordinate for each file with
ncks -O --mk_rec_dmn time file1.nc file1.nc
ncks -O --mk_rec_dmn time file2.nc file2.nc
...
etc. and proceed as above. That may be the best way forward with NCO.
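The two concatenators differ in which axis they grow: ncrcat extends the existing record dimension, while ncecat stacks the inputs along a new outer dimension. A small Python sketch with nested lists makes the resulting shapes visible. The u(time, domain) values and the chosen index are invented for illustration, and the index is a zero-based position, as NCO indexes by default.

```python
# Made-up u(time, domain) arrays from two files: 2 time steps, 3 domains each.
u_file1 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
u_file2 = [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]

DOMAIN_INDEX = 1  # hypothetical domain to keep (zero-based position)


def pick_domain(u, index):
    """Keep one domain from every time step, like -d domain,index."""
    return [row[index] for row in u]


sel1 = pick_domain(u_file1, DOMAIN_INDEX)
sel2 = pick_domain(u_file2, DOMAIN_INDEX)

# ncrcat-like: extend the existing time (record) axis.
record_concat = sel1 + sel2

# ncecat-like: stack the files along a new outer axis.
ensemble_concat = [sel1, sel2]

print(record_concat)
print(ensemble_concat)
```

The first result is one longer time series, the second keeps one series per input file.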