Can't concatenate netCDF files with ncrcat

I am looping over a model that outputs daily netCDF files. I have a 7-year time series of daily files that, ideally, I would like to append into a single file at the end of each loop, but it seems that, using the NCO tools, the best way to merge the data into one file is to concatenate. Each daily file is called test.t.nc and is renamed to the date of the daily file, e.g. 20070102.nc, except the first one, which I create with
ncks -O --mk_rec_dmn time test.t.nc 2007-01-01.nc
to make time the record dimension for concatenation. If I try to concatenate the first two files such as
ncrcat -O -h 2007-01-01.nc 2007-01-02.nc out.nc
I get the error message
ncrcat: symbol lookup error: /usr/local/lib/libudunits2.so.0: undefined symbol: XML_ParserCreate
I don't understand what this means and, looking at all the help online, ncrcat should be a straightforward process. Does anyone understand what's happening?
Just in case this helps, the ncdump -h for 20070101.nc is
netcdf \20070101 {
dimensions:
time = UNLIMITED ; // (8 currently)
y = 1 ;
x = 1 ;
tile = 9 ;
soil = 4 ;
nt = 2 ;
and 20070102.nc
netcdf \20070102 {
dimensions:
x = 1 ;
y = 1 ;
tile = 9 ;
soil = 4 ;
time = UNLIMITED ; // (8 currently)
nt = 2 ;
This is part of a bigger shell script and I don't have much flexibility over the naming of files - just in case this matters!
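As an aside on the error message itself: a symbol lookup error is a dynamic-linking problem rather than an ncrcat usage problem. libudunits2 expects the function XML_ParserCreate from the expat library (libexpat), and the copy being loaded cannot resolve it, which usually points to a missing or mismatched libexpat rather than to anything wrong with the netCDF files. Assuming ldd is available, a quick sketch for checking which libraries are actually being picked up (the path below is copied from the error message and may differ on your system):
# does the udunits2 library named in the error resolve an expat library at all?
ldd /usr/local/lib/libudunits2.so.0 | grep -i expat
# which shared libraries does ncrcat itself pull in?
ldd $(which ncrcat)
If libexpat is missing, or an old copy under /usr/local/lib shadows the system one, reinstalling expat/udunits2 or adjusting LD_LIBRARY_PATH is the usual kind of fix.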

Related

Extracting dates that satisfy multiple conditions using NCO/CDO or bash

I have a netcdf file containing vorticity (per sec) and winds (m/s). I want to print the dates of gridpoints that satisfy the following conditions:
1). Vorticity > 1x10^-5 per sec and winds >= 5 m/s at a gridpoint.
2). The average of vorticity and winds at the four surrounding gridpoints (north, west, east, south) of the gridpoint found in (1) should also be > 1x10^-5 and 5 m/s, respectively.
I am able to filter just the gridpoints that satisfy (1), using ncap2:
ncap2 -v -O -s 'where(vort > 1e-5 && winds >= 5) vort=vort; elsewhere vort=vort.get_miss();' input_test.nc output_test.nc
How do I get the dates? Also, how can I implement the second condition?
Here's the screenshot of the header of the netcdf file.
I'd appreciate any help on this.
This can be achieved by combining "cdo" and "nco".
The average value of the four surrounding grid points needed for the second condition can be calculated by combining the shiftx/shifty and ensmean operators of "cdo".
cdo selname,vr,wspd input_test.nc vars.nc
cdo -expr,'vr_mean=vr; wspd_mean=wspd' \
-ensmean \
-shiftx,1 vars.nc \
-shiftx,-1 vars.nc \
-shifty,1 vars.nc \
-shifty,-1 vars.nc \
vars_mean.nc
You can then use the merge operator of "cdo" to combine the variables needed to check conditions 1) and 2) into a single netCDF file, and use "ncap2" to check the conditions, as you have already tried.
In the example command below, the "for" loop of "ncap2" is used to scan over time. If at least one grid point satisfies both conditions 1) and 2) at a given time, that time value is printed.
cdo merge vars.nc vars_mean.nc vars_test.nc
ncap2 -s '*flag = (vr > 1e-5 && wspd >= 5) && (vr_mean > 1e-5 && wspd_mean >= 5); *nt=$time.size; for(*i=0;i<nt;i++) { if ( max(flag(i,:,:))==1 ) { print(time(i)); } }' vars_test.nc

How to plot data from a file starting at a line containing a specific string

I am trying to execute command similar to
plot "data.asc" every ::Q::Q+1500 using 2 with lines
But I have a problem with that "Q" number. It's not a known value but the number of the line containing some specific string. Let's say I have a line with the string "SET_10:" and my data to plot comes after this specific line. Is there some way to identify the number of the line with that specific string?
An easy way is to pass the data through GNU sed to print just the wanted lines:
plot "< sed -n <data.asc '/^SET_10:/,+1500{/^SET_10:/d;p}'" using 1:2 with lines
The -n suppresses automatic printing, the a,b address form says between which lines to apply the {...} commands, and those commands delete the trigger line and print (p) the others.
To make sure you have a compatible GNU sed, try the command on its own for a small number of lines, e.g. 5:
sed -n <data.asc '/^SET_10:/,+5{/^SET_10:/d;p}'
If this does not output the first 5 lines of your data, an alternative is to use awk, as it is too difficult in sed to count lines without this GNU-specific syntax. Test the (standard POSIX, not GNU-specific) awk equivalent:
awk <data.asc 'end!=0 && NR<=end{print} /^SET_10:/{end=NR+5}'
and if that is ok, use it in gnuplot as
plot "< awk <data.asc 'end!=0 && NR<=end{print} /^start/{end=NR+1500}'" using 1:2 with lines
Here's a version entirely within gnuplot, with no external commands needed. I tested this on gnuplot 5.0 patchlevel 3, using the following bash commands to create a simple dataset of 20 lines, of which only 5 lines are to be plotted, starting from the line with "start" in column 1. You don't need to do this step.
for i in $(seq 1 20)
do let j=i%2
echo "$i $j"
done >data.asc
sed -i data.asc -e '5a\
start'
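For reference, the resulting data.asc then starts like this, with the marker appended after line 5 (only the 5 lines that follow "start" should end up plotted):
1 1
2 0
3 1
4 0
5 1
start
6 0
7 1
...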
The actual gnuplot script uses a variable endlno, initially set to NaN (not-a-number), and a function f which takes 3 parameters: a boolean start saying whether column 1 has the matching string, the current line number lno, and the current column 1 value val. If the line number is less than or equal to the ending line number (and therefore endlno is not still NaN), f returns val; else, if the start condition is true, the wanted ending line number is stored in endlno and NaN is returned. If we have not yet seen the start, NaN is returned.
gnuplot -persist <<\!
endlno=NaN
f(start,lno,val) = ((lno<=endlno)?val:(start? (endlno=lno+5,NaN) : NaN))
plot "data.asc" using (f(stringcolumn(1)eq "start", $0, $1)):2 with lines
!
Since gnuplot does not plot points with NaN values, we ignore lines up to the start, and again after the wanted number of lines.
In your case you need to change 5 to 1500 and "start" to "SET_10:".
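Making those two substitutions, the same heredoc for the question's data would look like this (a sketch, assuming the file is still called data.asc, the marker SET_10: is in column 1, and columns 1 and 2 of the following lines hold the x and y values to plot):
gnuplot -persist <<\!
endlno=NaN
f(start,lno,val) = ((lno<=endlno)?val:(start? (endlno=lno+1500,NaN) : NaN))
plot "data.asc" using (f(stringcolumn(1)eq "SET_10:", $0, $1)):2 with lines
!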

Call routine in IDL programming language

I am new and learning IDL on a steep curve. I have two PROs; the first one follows:
Pro READ_Netcdf1,infile,temperature,time,print_prompts=print_prompts
COMPILE_OPT IDL2
infile='D:/Rwork/dataset/monthly_mean/version_2C/air.2m.mon.mean.nc'
IF (N_Elements(infile) EQ 0 ) Then STOP,'You are being silly, you must specify infile on call'
print,infile
iid = NCDF_OPEN(infile)
NCDF_VARGET, iid, 'time', time ; Read time
NCDF_VARGET, iid, 'air', temperature ; Read surface average temperature
NCDF_VARGET, iid, 'lat', latitude ; Read Latitude
NCDF_VARGET, iid, 'lon', longitude ; Read Longitude
NCDF_CLOSE, iid ; Close Input File
Centigrade=temperature-273.15
print,'Time'
print,time[0:9]
Print, 'Latitude'
Print, latitude[0:9]
Print, 'Longitude'
Print, longitude[0:9]
print,'Temperature'
print, temperature[0:9]
Print, 'Centigrade'
Print, Centigrade[0:9]
;ENDIF
RETURN
END
This works perfectly. My second PRO is as follows:
PRO Change_Kelvin_to_Cent,Temperature
;+ This programme takes the temperature from the NETCDF file and converts
; to Centigrade
;Output
; The Month Mean Temperature in Centigrade
; Must have read ncdf1 in the directory run first
;
; -
COMPILE_OPT IDL2
infile='D:/Rwork/dataset/monthly_mean/version_2c/air.2m.mon.mean.nc'
read_netcdf1, infile, Temperature
Centigrade = Temperature-273.15
print,'Centigrade'
print,Centigrade[0:9]
RETURN
END
This also works
I am being instructed to get the variable "Temperature" from the first PRO and use it to calculate the temperature in the second PRO, without the line
read_netcdf1, infile, Temperature
I cannot get this to work. Can anybody advise and help me out of this problem, please?
I was misinformed: it cannot be done. You must have the
"read_netcdf1, infile, Temperature" line of code. Temperature can, however, be given any name, because IDL matches arguments by position, not by name: the second argument in the call receives whatever READ_Netcdf1 stores in its second parameter (the temperature array), whatever the caller chooses to call it.
I hope this makes sense.

Count occurrences of unique values in 2nd field using awk

I'm using this syntax to count occurrences of unique values in the 2nd field of the file. Can somebody explain how this works? How is Unix calculating this count? Is it reading each line, or the whole file as one? How is it assigning the count and incrementing it?
Command:
awk -F: '{a[$2]++} END {for ( i in a) { print i,a[i]}}' inputfile
It's not Unix doing the calculating but awk; awk is not Unix or the shell, it's a language. The presented awk program calculates how many times each unique value in the second field ($2, with fields separated by :) occurs, and outputs the values and their related counts.
awk -F: '            # set the field separator to ":"
{                    # awk reads records (lines) in a loop
    a[$2]++          # use the 2nd field as a key in array a and count each occurrence
}
END {                # after all records have been processed
    for (i in a) {   # the array a is looped through in no particular order
        print i, a[i]  # and each value-count pair is output
    }
}' inputfile
If you want to learn more about awk, please read the following quote (* see below) by @EdMorton: The best source of all awk information is the book Effective Awk Programming, 4th Edition, by Arnold Robbins. If you have any other book, throw it away, and if you're trying to learn from a web site - don't, as most of them are full of complete nonsense. Just get the book.
*) Now go read the book.
Edit: How a[$2]++ works:
Sample data (colon-separated, to match the -F: separator) and the resulting value of a[$2]:
1:val1 # a[$2]++ causes: a["val1"] = 1
2:val2 # a[$2]++ causes: a["val2"] = 1
3:val1 # a[$2]++ causes: a["val1"] = 2
4:val1 # a[$2]++ causes: a["val1"] = 3
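As a concrete, made-up illustration: if inputfile held exactly those four colon-separated records (the # parts are annotations, not file contents), the command would print each distinct second field together with its count, in no guaranteed order:
awk -F: '{a[$2]++} END {for ( i in a) { print i,a[i]}}' inputfile
val1 3
val2 1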

Feeding data into octave plot from a command-line pipe

I have a question concerning real-time plotting in Octave. The idea is very simple, but unfortunately I could not find a solution on the internet. In my project, I am using netcat to sample data and awk to filter it, for example:
nc 172.16.254.1 23 | awk ' /point/ '
In this way, I get a new data point (approximately) every 4-10 ms, together with a timestamp.
Now I would like to pipe this data to Octave and plot it in real time. Does anyone have any ideas?
Update
It seems to me that
nc 172.16.254.1 23 | awk ' /point/ ' | octave --silent --persist --eval "sample(stdin)"
pipes the data to my Octave script sample, which does the plotting. But now there is still one problem: the replotting is far too slow, and it slows down further while sampling the data (I get thousands of data points). I have:
function sample(stream)
  t = NaN; r = NaN; k = 1;
  figure(1)
  plot(t,r,'o')                  % create the initial (empty) plot
  hold on
  while(~feof(stream))
    s = fgets(stream);           % read one line from the pipe
    t(k) = str2double(s(1:6));   % timestamp field
    r(k) = str2double(s(8:11));  % sampled value field
    plot(t(k),r(k),'o')          % add the new point to the figure
    drawnow()                    % force an immediate redraw
    k = k + 1;
  end
What should I add/change?
After some research, feedgnuplot seems to satisfy my purpose of real-time plotting:
nc 172.16.254.1 23 |
awk ' /point/ ' |
feedgnuplot --domain --points --stream 0.01
