How can I swap the dimensions of a netCDF file? - netcdf

I have a netCDF file with the following structure:
dimensions:
time = 1000 ;
xr = 100 ;
variables:
int time(time) ;
time:long_name = "time since time reference (instant)" ;
time:units = "hours since 2010-05-01" ;
double v1(xr, time) ;
v1:long_name = "V1" ;
v1:units = "m s-1" ;
double v2(xr, time) ;
v2:long_name = "V2" ;
v2:units = "m s-1" ;
int xr(xr) ;
xr:long_name = "xr index" ;
xr:units = "-" ;
Here, I want to make variable v1 to have dimensions of (time, xr) instead of (xr,time).
I do this using:
ncpdq -v v1 -a time,xr test.nc test2.nc
But of course, this doesn't copy v2. I run this on large files with a large number of variables, and I only want to permute a single variable.
How can I do this?

You can also use the ncap2 permute() method:
ncap2 -s 'v1_prm=v1.permute($time,$xr)' in.nc out.nc
but then your output will have both v1 and v1_prm. Either way it is a two-step process with NCO, because you need a second command to append the remaining variables (ncpdq method) or to remove the extra variable (ncap2 method).
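For example, with the ncpdq route the full sequence could look like the following sketch, where ncks -A is one way to do the append step (file and variable names taken from the question):
ncpdq -O -a time,xr -v v1 test.nc test2.nc   # step 1: write only the permuted v1 to test2.nc
ncks -A -x -v v1 test.nc test2.nc            # step 2: append every other variable unchanged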

Related

`coordinates` attribute of a netcdf variable is missing in xarray

I have a netCDF variable, namely mesh2d_sa1, which contains a coordinates attribute. But when I try to access this attribute with ds.mesh2d_sa1.attrs["coordinates"], it is not found. The following extract of the output of ncdump -h xxxxxxx.nc confirms that the coordinates attribute exists.
double mesh2d_sa1(time, mesh2d_nFaces, mesh2d_nLayers) ;
mesh2d_sa1:mesh = "mesh2d" ;
mesh2d_sa1:location = "face" ;
mesh2d_sa1:coordinates = "mesh2d_face_x mesh2d_face_y" ;
mesh2d_sa1:cell_methods = "mesh2d_nFaces: mean" ;
mesh2d_sa1:standard_name = "sea_water_salinity" ;
mesh2d_sa1:long_name = "Salinity in flow element" ;
mesh2d_sa1:units = "1e-3" ;
mesh2d_sa1:grid_mapping = "wgs84" ;
mesh2d_sa1:_FillValue = -999. ;
The coordinates attribute can be accessed via ds.mesh2d_sa1.encoding['coordinates']. When xarray decodes the file it removes the CF coordinates attribute from .attrs and stores it in .encoding instead.
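A minimal check with xarray (using the placeholder file name from the question) might look like this:
import xarray as xr

ds = xr.open_dataset("xxxxxxx.nc")  # placeholder file name from the question

# xarray consumes the CF 'coordinates' attribute while decoding and keeps it in
# .encoding, so it no longer shows up in .attrs:
print(ds.mesh2d_sa1.encoding["coordinates"])   # "mesh2d_face_x mesh2d_face_y"
print("coordinates" in ds.mesh2d_sa1.attrs)    # False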

95% significance plot with NCL

I want to plot a contour of the PBLH difference between two WRF-Chem simulations. I have the netCDF means (attached files), and I want to draw contours of the 95% significance level, but the script does not work. Can you give your suggestions, please?
"Error: scalar_field: If the input data is 1-dimensional, you must set sfXArray and sfYArray to 1-dimensional arrays of the same length.
warning:create: Bad HLU id passed to create, ignoring it"
I'm expecting a contour plot in which grey shaded areas indicate regions with less than 95% significance.
Here is the code. You can test it with any two WRF netCDF files:
;----------------------------------------------------------------------
; contoursym_1.ncl
;
; Concepts illustrated:
; - Using a symmetric color map
; - Using a blue-red color map
; - Explicitly setting contour levels
;----------------------------------------------------------------------
;
; These files are loaded by default in NCL V6.2.0 and newer
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/wrf/WRF_contributed.ncl"
; This file still has to be loaded manually
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"
begin
;*****************
;-- load data
;*****************
;specify file names (input&output, netCDF)
pathin = "./" ; directory
fin1 = "15-25-omet.nc" ; input file name #1
fin2 = "15-25-wrfda.nc" ; input file name #2
fout = "signif_pblh_omet-wrfda" ; output file name
foutnc = fout+".nc"
f = addfile ("15-25-omet.nc", "r")
; open input files
in1 = addfile(pathin+fin1,"r")
in2 = addfile(pathin+fin2,"r")
; read data
tmp1 = in1->PBLH
tmp2 = in2->PBLH
x = f->PBLH(0,:,:)
diff=tmp1-tmp2
printVarSummary(tmp1)
printVarSummary(tmp2)
;****************************************************
; calculate probabiliites
;****************************************************
;t-test
res1=True
xtmp=tmp1(XTIME|:,south_north|:, west_east|:)
ytmp = tmp2(XTIME|:,south_north|:, west_east|:)
aveX = dim_avg_Wrap(xtmp)
aveY = dim_avg_Wrap(ytmp)
varX = dim_variance(xtmp)
varY = dim_variance(ytmp)
sX = dimsizes(xtmp&XTIME)
sY = dimsizes(ytmp&XTIME)
print(sX)
print(sY)
alphat = 100.*(1. - ttest(aveX,varX,sX, aveY,varY,sY, True, False))
;aveX = where(alphat.lt.95., aveX@_FillValue, aveX)
;print(alphat)
;*********************
;---Start the graphics
;**********************
wks = gsn_open_wks("ps" ,"Bias_gray_F") ; ps,pdf,x11,ncgm,eps
res = True
res@gsnMaximize = True ; uncomment to maximize size
res@gsnSpreadColors = True ; use full range of colormap
res@cnFillOn = True ; color plot desired
res@cnLinesOn = False ; turn off contour lines
res@cnLineLabelsOn = True ; turn on line labels
res@lbOrientation = "Vertical"
res@lbLabelPosition = "Right" ; label position
res@tiMainFontHeightF = 0.025
res@lbBoxEndCapStyle = "TriangleBothEnds" ; triangle label bar
;************************************************
; Use WRF_contributed procedure to set map resources
;************************************************
res = True
WRF_map_c(f, res, 0) ; reads info from file
;************************************************
; if appropriate, set True for native mapping (faster)
; set False otherwise
;************************************************
res@tfDoNDCOverlay = True
;************************************************
; associate the 2-dimensional coordinates to the variable for plotting
; only if non-native plot
;************************************************
if (.not.res@tfDoNDCOverlay) then
x@lat2d = f->XLAT(0,:,:) ; direct assignment
x@lon2d = f->XLONG(0,:,:)
end if
;************************************************
; Turn on lat / lon labeling
;************************************************
res@pmTickMarkDisplayMode = "Always" ; turn on tickmarks
;res@tmXTOn = False ; turn off top labels
;res@tmYROn = False ; turn off right labels
;************************************************
; Loop over all times and levels ( uncomment )
; Demo: one arbitrarily chosen time and level
;************************************************
dimx = dimsizes(x) ; dimensions of x
ntim = dimx(0) ; number of time steps
klev = dimx(1) ; number of "bottom_top" levels
nt = 0 ; arbitrary time
kl = 6 ; " level
opt=True
opts=True
res1=True
res = opts ; Use basic options for this field
opts@MainTitle = "OMET-FBDA"
opts@InitTime = False ; Do not plot time or footers
opts@Footer = False
plot0 = gsn_csm_contour_map(wks,diff(0,:,:),res ) ; define plot 0
pval = gsn_csm_contour(wks,alphat(0,:),res1) ;-- this adds a contour line around the stippling
opt@gsnShadeMid = "gray62"
pval = gsn_contour_shade(pval,0.05,1.00,opt) ;-- this adds the stippling for all pvalues <= 0.05
overlay(plot0,pval)
draw(plot0)
frame(wks)
end

zsh loses precision when multiplying two floats of significantly different magnitude

I'm trying to do some floating-point math in a zsh script. I'm seeing the following behavior:
$ (( a = 1.23456789 * 0.00000001 )); printf "a = %g\n" $a
a = 1.23e-08
$ (( a = 1.23456789 * 0.00000001 )); printf "a = %e\n" $a
a = 1.230000e-08
$ (( a = 1.23456789 * 0.0000001 )); printf "a = %e\n" $a
a = 1.235000e-07
I expect not to lose the first number's mantissa precision when I merely multiply it by a number whose mantissa is 1 (or at least very close to 1, if the true binary representation is considered). In other words, I'd expect to get a = 1.23456789e-08 or maybe some truncated mantissa, but not zeros after 1.23 / 1.235.
I'm running the following version:
$ zsh --version
zsh 5.8 (x86_64-apple-darwin20.0)
Am I missing something? Or is it an issue in zsh? I'm new to zsh, and I don't have a lot of experience in shell programming in general, so any help is appreciated. Thanks!
It appears that (( x = 1.0 )), when x is not defined, will cause Zsh to declare the variable as -F: a double precision floating point which is formatted to fixed-point with 10 decimal digits on output:
% unset x; (( x = 0.12345678901234567 )); declare -p x
typeset -F x=0.1234567890
% unset x; x=$((0.12345678901234567)); declare -p x
typeset x=0.12345678901234566
I don't know why it works this way, but if you manually declare your variable as a string first, this won't happen, and you'll get the full value:
% unset a; typeset a; (( a = 1.23456789 * 0.00000001 )); printf "a = %g\n" $a
a = 1.23457e-08
The difference comes from how you pass the value of a to printf. If you write it as
$ (( a = 1.23456789 * 0.00000001 )); printf "a = %e\n" $((a))
$ (( a = 1.23456789 * 0.0000001 )); printf "a = %e\n" $((a))
the problem does not occur. This is described here, where it says:
floating point numbers can be declared with the float builtin; there are two types, differing only in their output format, as described for the typeset builtin. The output format can be bypassed by using arithmetic substitution instead of the parameter substitution, i.e. ‘${float}’ uses the defined format, but ‘$((float))’ uses a generic floating point format
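For comparison, zsh's other floating-point type, typeset -E, keeps the mantissa visible because it prints in scientific notation. A minimal sketch of the same computation, assuming the default of ten significant digits on output:
% unset a; typeset -E a; (( a = 1.23456789 * 0.00000001 )); declare -p a
typeset -E a=1.234567890e-08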

NCO: can we remove dimension without modifying coordinates attribute?

I have a netCDF file:
dimensions:
y = 453 ;
x = 453 ;
plev = 1 ;
time = UNLIMITED ; // (1460 currently)
variables:
double plev(plev) ;
plev:name = "plev" ;
plev:standard_name = "air_pressure" ;
plev:long_name = "pressure" ;
plev:units = "Pa" ;
plev:axis = "Z" ;
plev:positive = "down" ;
float va925(time, plev, y, x) ;
va925:_FillValue = 1.e+20f ;
va925:missing_value = 1.e+20f ;
va925:coordinates = "lon lat plev" ;
va925:grid_mapping = "Lambert_Conformal" ;
I would like to remove the plev dimension, but keep the plev variable and leave the coordinates attribute of va925 unmodified.
So I would like:
dimensions:
y = 453 ;
x = 453 ;
time = UNLIMITED ; // (1460 currently)
variables:
double plev;
plev:name = "plev" ;
plev:standard_name = "air_pressure" ;
plev:long_name = "pressure" ;
plev:units = "Pa" ;
plev:axis = "Z" ;
plev:positive = "down" ;
float va925(time, y, x) ;
va925:_FillValue = 1.e+20f ;
va925:missing_value = 1.e+20f ;
va925:coordinates = "lon lat plev" ;
va925:grid_mapping = "Lambert_Conformal" ;
I have tried:
ncwa -a plev in.nc out.nc
But it modifies the coordinates attribute of va925 to:
va925:coordinates = "lon lat ";
I can change it back with:
ncatted -h -O -a coordinates,va925,m,c,"lon lat plev" out.nc
But that means I have to loop over the variable names, which takes too long!
Thank you in advance,
Lola
As you have discovered, ncwa automatically removes the averaged dimensions from the coordinates attribute. There is no switch to turn this off. It took a lot of work to include this feature so it is ironic that some users want to disable it :) You have already discovered and rejected the obvious workaround with ncatted. A lengthier workaround would be to rename all the coordinates attributes before using ncwa, then rename back afterwards, e.g.,
ncrename -a .coordinates,impeachment in.nc
ncwa -a lon in.nc out.nc
ncrename -a .impeachment,coordinates out.nc
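Applied to the plev case in the question, the same pattern would be (a sketch; note that ncrename edits in.nc in place, so the last command, which I have added, renames the attribute back in the input file as well):
ncrename -a .coordinates,impeachment in.nc
ncwa -a plev in.nc out.nc
ncrename -a .impeachment,coordinates out.nc
ncrename -a .impeachment,coordinates in.nc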

plot average of n'th rows in gnuplot

I have some data that I want to plot with gnuplot, but I have many y values for the same x value. I will show you so that you understand:
0 0.650765 0.122225 0.013325
0 0.522575 0.001447 0.010718
0 0.576791 0.004277 0.104052
0 0.512327 0.002268 0.005430
0 0.530401 0.000000 0.036541
0 0.518333 0.001128 0.017270
20 0.512864 0.001111 0.005433
20 0.510357 0.005312 0.000000
20 0.526809 0.001089 0.033523
20 0.527076 0.000000 0.034215
20 0.507166 0.001131 0.000000
20 0.513868 0.001306 0.004344
40 0.531742 0.003295 0.0365
In this example, I have 6 values for each x value. So how can I draw the average and the confidence bar (interval)?
Thanks for the help.
To do this, you will need some kind of external processing. One possibility would be to use gawk to calculate the required quantities and then feed this auxiliary output to Gnuplot to plot it. For example:
set terminal png enhanced
set output 'test.png'
fName = 'data.dat'
plotCmd(col_num)=sprintf('< gawk -f analyze.awk -v col_num=%d %s', col_num, fName)
set format y '%0.2f'
set xr [-5:25]
plot \
plotCmd(2) u 1:2:3:4 w yerrorbars pt 3 lc rgb 'dark-red' t 'column 2'
This assumes that the script analyze.awk resides in the same directory from which Gnuplot is launched (otherwise, it would be necessary to modify the path in the -f option of gawk). The script analyze.awk itself reads:
function analyze(x, data){
    n = 0; mean = 0;
    val_min = 0; val_max = 0;
    for(i in data){
        val = data[i] + 0;        # force a numeric value for the comparisons below
        n += 1;
        delta = val - mean;
        mean += delta/n;          # online update of the running mean
        val_min = (n == 1)?val:((val < val_min)?val:val_min);
        val_max = (n == 1)?val:((val > val_max)?val:val_max);
    }
    if(n > 0){
        print x, mean, val_min, val_max;
    }
}
{
    curr = $1;
    yval = $(col_num);
    if(NR==1 || prev != curr){    # a new x value starts: flush the previous group
        analyze(prev, data);
        delete data;
        prev = curr;
    }
    data[NR] = yval;              # key on the record number so duplicate y values are kept
}
END{
    analyze(curr, data);
}
It directly implements the online algorithm to calculate the mean and, for each distinct value of x, prints this mean as well as the min/max values.
In the Gnuplot script, the column of interest is passed to the plotCmd function, which prepares the command to be executed; its output is then plotted with u 1:2:3:4 w yerrorbars. This syntax means that the confidence interval is stored in the 3rd/4th columns, while the value itself (the mean) resides in the second column.
In total, the two scripts above produce the picture below. The confidence interval on the last point is not visible since the example data in your question contain only one record for x=40, thus the min/max values coincide with the mean.
You can easily plot the average in this case:
plot "myfile.dat" using ($1):($2 + $3 + $4)/3
If you want average of only second and fourth column for example, you can write ($2+$4)/2 and so on.
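For instance, with the same parenthesization and the hypothetical myfile.dat from above:
plot "myfile.dat" using 1:(($2 + $4)/2)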
