Removing weekend/non-trading hours gaps in gnuplot - plot

My question is identical to this question except that I'm using conditional coloring for candlesticks and the solution mentioned isn't working.
My data is like:
10/24/2018 23:45,168.25,168.65,168.2,168.4,0
10/24/2018 23:46,168.5,169,168.5,168.95,67577
10/24/2018 23:47,169.35,169.6,169.1,169.1,151630
10/24/2018 23:48,169.05,169.35,168.95,169.2,63418
.
.
10/26/2018 13:48,169.05,169.35,168.95,169.2,63418
10/26/2018 23:47,169.35,169.6,169.1,169.1,151630
10/26/2018 23:48,169.05,169.35,168.95,169.2,63418
Plotting a file like this in gnuplot using the command:
plot ".../ISTFeed.csv" using (timecolumn(1, "%m/%d/%Y %H:%M:%S")) : 2 : 3 : 4 : 5 : ($5 < $2 ? rgb(255, 0, 0) : rgb(0, 255, 0)) linecolor rgb variable notitle with candlesticks
produces a chart with gaps. I want gnuplot to omit the gaps and plot the candlesticks continuously. Is there a way to achieve this?

Have a look at the example below with generated dummy data. The trick is to plot against the row index (pseudocolumn 0) instead of the time and to take the xtic labels from the date column. Although the xtics in the plot look equidistant, their values are not equidistant at all. So if you want to read out a y-value between two xtics, you have no real idea what the actual x-value is; you only know it lies somewhere between the neighbouring xtics. A bit strange... but if you are fine with this precision...
Code:
### remove gaps in timedata
# requires gnuplot >=5.2, because of use of datablocks
reset session
# generate some dummy data
t = strptime("%m/%d/%Y %H:%M", "10/24/2018 23:45")
close = 168.25
set print $Data
print "# date time, open, low, high, close"
do for [i=1:100] {
open = close + (rand(0)*2)-1
low_tmp = open - rand(0)
high_tmp = open + rand(0)
close = open + (rand(0)*2)-1
low = open<low_tmp ? open : close<low_tmp ? close : low_tmp
high = open>high_tmp ? open : close>high_tmp ? close : high_tmp
dt = 3600*rand(0)*48
t = t + dt
print sprintf("%s,%8.2f,%8.2f,%8.2f,%8.2f",strftime("%m/%d/%Y %H:%M",t),open,low,high,close)
}
set print
print $Data
N = 10
Offset = 5
everyNthRow(N) = (int($0)%N==Offset ? word(strcol(1),1) : NaN)
set datafile separator ","
set xtics rotate by 45 right
set style fill solid 1.0
plot $Data u 0:2:3:4:5:($5<$2 ? 0xff0000 : 0x00ff00):xtic(everyNthRow(N)) w candlesticks lc rgb var not
### end of code
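For readers who want to see the row-index trick outside gnuplot, here is a minimal Python sketch of what everyNthRow does: rows are positioned by their index, so time gaps vanish, and only every N-th row gets a date label. The function name index_ticks and its parameters are illustrative assumptions, not part of the answer above.

```python
from datetime import datetime

def index_ticks(timestamps, n=10, offset=5, fmt="%m/%d/%Y"):
    """Return (positions, labels) for every n-th row, starting at row `offset`.

    Positions are row indices (like gnuplot's pseudocolumn 0), so the
    spacing between candles no longer depends on the time gaps.
    """
    positions = [i for i in range(len(timestamps)) if i % n == offset]
    labels = [timestamps[i].strftime(fmt) for i in positions]
    return positions, labels

# Four rows in the question's format: two consecutive minutes, then a long gap.
rows = ["10/24/2018 23:45", "10/24/2018 23:46",
        "10/26/2018 13:48", "10/26/2018 23:47"]
stamps = [datetime.strptime(r, "%m/%d/%Y %H:%M") for r in rows]
positions, labels = index_ticks(stamps, n=2, offset=1)
print(positions, labels)
```

The positions are evenly spaced even though the labelled dates are not, which is exactly the loss of x-precision described above.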

Related

95% significance plot with NCL

I want to plot contours of the PBLH difference between two WRF-Chem simulations. I have the netCDF means (attached files) and want to draw contours of the 95% significance level, but the script does not work. Can you give me your suggestions, please?
"Error: scalar_field: If the input data is 1-dimensional, you must set sfXArray and sfYArray to 1-dimensional arrays of the same length.
warning:create: Bad HLU id passed to create, ignoring it"
I'm expecting a contour plot in which grey shaded areas indicate regions with less than 95% significance.
Here is the code. You can test it with any two WRF netCDF files:
`;----------------------------------------------------------------------
; contoursym_1.ncl
;
; Concepts illustrated:
; - Using a symmetric color map
; - Using a blue-red color map
; - Explicitly setting contour levels
;----------------------------------------------------------------------
;
; These files are loaded by default in NCL V6.2.0 and newer
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/wrf/WRF_contributed.ncl"
; This file still has to be loaded manually
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"
begin
;*****************
;-- load data
;*****************
;specify file names (input&output, netCDF)
pathin = "./" ; directory
fin1 = "15-25-omet.nc" ; input file name #1
fin2 = "15-25-wrfda.nc" ; input file name #2
fout = "signif_pblh_omet-wrfda" ; output file name
foutnc = fout+".nc"
f = addfile ("15-25-omet.nc", "r")
; open input files
in1 = addfile(pathin+fin1,"r")
in2 = addfile(pathin+fin2,"r")
; read data
tmp1 = in1->PBLH
tmp2 = in2->PBLH
x = f->PBLH(0,:,:)
diff=tmp1-tmp2
printVarSummary(tmp1)
printVarSummary(tmp2)
;****************************************************
; calculate probabilities
;****************************************************
;t-test
res1=True
xtmp=tmp1(XTIME|:,south_north|:, west_east|:)
ytmp = tmp2(XTIME|:,south_north|:, west_east|:)
aveX = dim_avg_Wrap(xtmp)
aveY = dim_avg_Wrap(ytmp)
varX = dim_variance(xtmp)
varY = dim_variance(ytmp)
sX = dimsizes(xtmp&XTIME)
sY = dimsizes(ytmp&XTIME)
print(sX)
print(sY)
alphat = 100.*(1. - ttest(aveX,varX,sX, aveY,varY,sY, True, False))
;aveX = where(alphat.lt.95.,aveX@_FillValue, aveX)
;print(alphat)
;*********************
;---Start the graphics
;**********************
wks = gsn_open_wks("ps" ,"Bias_gray_F") ; ps,pdf,x11,ncgm,eps
res = True
res@gsnMaximize = True ; uncomment to maximize size
res@gsnSpreadColors = True ; use full range of colormap
res@cnFillOn = True ; color plot desired
res@cnLinesOn = False ; turn off contour lines
res@cnLineLabelsOn = True ; turn off contour labels
res@cnLineLabelsOn = True ; turn on line labels
res@lbOrientation = "Vertical"
res@lbLabelPosition = "Right" ; label position
res@tiMainFontHeightF = 0.025
res@lbBoxEndCapStyle = "TriangleBothEnds" ; triangle label bar
;************************************************
; Use WRF_contributed procedure to set map resources
;************************************************
res = True
WRF_map_c(f, res, 0) ; reads info from file
;************************************************
; if appropriate, set True for native mapping (faster)
; set False otherwise
;************************************************
res@tfDoNDCOverlay = True
;************************************************
; associate the 2-dimensional coordinates to the variable for plotting
; only if non-native plot
;************************************************
if (.not.res@tfDoNDCOverlay) then
x@lat2d = f->XLAT(0,:,:) ; direct assignment
x@lon2d = f->XLONG(0,:,:)
end if
;************************************************
; Turn on lat / lon labeling
;************************************************
res@pmTickMarkDisplayMode = "Always" ; turn on tickmarks
;res@tmXTOn = False ; turn off top labels
;res@tmYROn = False ; turn off right labels
;************************************************
; Loop over all times and levels ( uncomment )
; Demo: one arbitrarily chosen time and level
;************************************************
dimx = dimsizes(x) ; dimensions of x
ntim = dimx(0) ; number of time steps
klev = dimx(1) ; number of "bottom_top" levels
nt = 0 ; arbitrary time
kl = 6 ; " level
opt=True
opts=True
res1=True
res = opts ; Use basic options for this field
opts@MainTitle = "OMET-FBDA"
opts@InitTime = False ; Do not plot time or footers
opts@Footer = False
plot0 = gsn_csm_contour_map(wks,diff(0,:,:),res ) ; define plot 0
pval = gsn_csm_contour(wks,alphat(0,:),res1) ;-- this adds a contour line around the stippling
opt@gsnShadeMid="gray62"
pval = gsn_contour_shade(pval,0.05,1.00,opt) ;-- this adds the stippling for all pvalues <= 0.05
overlay(plot0,pval)
draw(plot0)
frame(wks)
end`

Multiple Y axes on a single plot Octave

What I am trying to do is create something that looks like this,
which I have created by using the Octave pcolor and barh functions using 3 subplots with the subplot x axes scaled to look as if the figure were one plot. However, this approach is unsatisfactory as I cannot zoom or pan across it as I would be able to if it were actually one plot.
How can I plot one background figure using pcolor and then add multiple y axes at different points along the x axis to plot the horizontal histograms using barh?
For some background to this question, I am trying to create what is called a Market Profile chart, i.e. see example here or here.
I have also cross posted this question on the Octave mailing list here.
I have found a way to do what I want: instead of the barh function, one can use the fill function.
A minimal, reproducible example is given below.
## create y and x axes for chart
y_max = max( high_round ) + max_tick_range * tick_size ;
y_min = min( low_round ) - max_tick_range * tick_size ;
y_ax = ( y_min : tick_size : y_max )' ;
end_x_ax_freespace = 5 ;
x_ax = ( 1 : 1 : numel( open ) + end_x_ax_freespace )' ;
colormap( 'viridis' ) ; pcolor( x_ax , y_ax , vp_z' ) ; shading interp ; axis tight ;
## plot the individual volume profiles
hold on ;
scale_factor = 0.18 ;
fill( vp( 1 , : ) .* scale_factor , y_ax' , [99;99;99]./255 ) ;
fill( vp( 2 , : ) .* scale_factor + 50 , y_ax' , [99;99;99]./255 ) ;
fill( vp( 3 , : ) .* scale_factor + 100 , y_ax' , [99;99;99]./255 ) ;
hold off;
More details on a blog post of mine.
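The idea behind using fill, drawing each horizontal histogram as one polygon shifted to its own x position, can be sketched in Python as follows. This is only an illustration: the function name profile_polygon, its parameters, and the two extra baseline vertices that close the polygon are assumptions and are not taken from the Octave code above.

```python
def profile_polygon(volumes, y_levels, x_offset=0.0, scale=1.0):
    """Return the (xs, ys) vertices of a horizontal histogram drawn as one polygon.

    Each volume is scaled and shifted to start at `x_offset`; a baseline
    vertex is added at both ends so the outline closes along x = x_offset.
    """
    if len(volumes) != len(y_levels):
        raise ValueError("volumes and y_levels must have the same length")
    xs = [x_offset] + [x_offset + v * scale for v in volumes] + [x_offset]
    ys = [y_levels[0]] + list(y_levels) + [y_levels[-1]]
    return xs, ys

# One profile anchored at x = 50, analogous to the second fill() call above.
xs, ys = profile_polygon([0, 4, 2], [1.0, 1.5, 2.0], x_offset=50, scale=0.5)
print(xs)
print(ys)
```

Because every profile is just a list of vertices in the coordinates of the background plot, all profiles live on the same axes, so zooming and panning affect them together. The vertex lists could be handed to any polygon-filling routine, for example matplotlib's Axes.fill.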

plot average of n'th rows in gnuplot

I have some data that I want to plot with gnuplot, but there are many y values for the same x value. Here is a sample to illustrate:
0 0.650765 0.122225 0.013325
0 0.522575 0.001447 0.010718
0 0.576791 0.004277 0.104052
0 0.512327 0.002268 0.005430
0 0.530401 0.000000 0.036541
0 0.518333 0.001128 0.017270
20 0.512864 0.001111 0.005433
20 0.510357 0.005312 0.000000
20 0.526809 0.001089 0.033523
20 0.527076 0.000000 0.034215
20 0.507166 0.001131 0.000000
20 0.513868 0.001306 0.004344
40 0.531742 0.003295 0.0365
In this example, I have 6 values for each x value. So how can I draw the average and the confidence bar (interval)?
Thanks for your help.
To do this, you will need some kind of external processing. One possibility would be to use gawk to calculate the required quantities and then feed this auxiliary output to Gnuplot to plot it. For example:
set terminal png enhanced
set output 'test.png'
fName = 'data.dat'
plotCmd(col_num)=sprintf('< gawk -f analyze.awk -v col_num=%d %s', col_num, fName)
set format y '%0.2f'
set xr [-5:25]
plot \
plotCmd(2) u 1:2:3:4 w yerrorbars pt 3 lc rgb 'dark-red' t 'column 2'
This assumes that the script analyze.awk resides in the same directory from which Gnuplot is launched (otherwise, it would be necessary to modify the path in the -f option of gawk). The script analyze.awk itself reads:
function analyze(x, data){
n = 0;mean = 0;
val_min = 0;val_max = 0;
for(k in data){
val = data[k] + 0; # force a numeric value so the comparisons are numeric
n += 1;
delta = val - mean;
mean += delta/n;
val_min = (n == 1)?val:((val < val_min)?val:val_min);
val_max = (n == 1)?val:((val > val_max)?val:val_max);
}
if(n > 0){
print x, mean, val_min, val_max;
}
}
{
curr = $1;
yval = $(col_num);
if(NR==1 || prev != curr){
analyze(prev, data);
delete data;
cnt = 0;
prev = curr;
}
data[++cnt] = yval; # key by a record counter so repeated y values are kept
}
}
END{
analyze(curr, data);
}
It directly implements the online algorithm to calculate the mean and for each distinct value of x prints this mean as well as the min/max values.
In the Gnuplot script, the column of interest is then passed to the plotCmd function, which prepares the command to be executed and whose output is plotted with u 1:2:3:4 w yerrorbars. This syntax means that the lower and upper ends of the error bar (here the min/max values, used as the interval) are stored in the 3rd/4th columns, while the value itself (the mean) resides in the second column.
In total, the two scripts above produce the picture below. The confidence interval on the last point is not visible since the example data in your question contain only one record for x=40, thus the min/max values coincide with the mean.
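For comparison, the same per-group reduction can be sketched in Python. It applies the online mean update described above to each run of equal x values; the function name group_stats and the 0-based column index are assumptions made for this sketch (the awk script uses 1-based field numbers).

```python
def group_stats(rows, col):
    """Yield (x, mean, minimum, maximum) for each run of equal x values.

    `rows` is an iterable of numeric tuples sorted by x (column 0);
    `col` is the 0-based index of the y column to summarise.
    """
    prev = None
    n = 0
    for row in rows:
        x, y = row[0], row[col]
        if n and x != prev:
            # x changed: emit the finished group and start a new one
            yield prev, mean, lo, hi
            n = 0
        n += 1
        if n == 1:
            mean, lo, hi = y, y, y
        else:
            mean += (y - mean) / n  # online (running) mean update
            lo, hi = min(lo, y), max(hi, y)
        prev = x
    if n:
        yield prev, mean, lo, hi

# A few rows in the layout of the question's data (x followed by three y columns).
sample = [
    (0, 0.650765, 0.122225, 0.013325),
    (0, 0.522575, 0.001447, 0.010718),
    (40, 0.531742, 0.003295, 0.0365),
]
for x, mean, lo, hi in group_stats(sample, col=1):
    print(x, mean, lo, hi)
```

As in the awk version, a group with a single record yields identical mean, minimum and maximum, so its error bar collapses to a point.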
If what you need is the average across the columns of each row, you can plot it directly:
plot "myfile.dat" using 1:(($2 + $3 + $4)/3)
If you want the average of only the second and fourth columns, for example, write (($2+$4)/2), and so on.

"blur" existing single image to mosaic with rmagick?

Trying to mosaic an image with rmagick.
How would one "mosaic blur" an existing image making the picture that it represents mosaic'ed ?
Like:
This is how you do a mosaic using RMagick
#!/home/software/ruby-1.8.5/bin/ruby -w
require 'RMagick'
# Demonstrate the mosaic method
a = Magick::ImageList.new
letter = 'A'
26.times do
# 'M' is not the same size as the other letters.
if letter != 'M'
a.read("images/Button_"+letter+".gif")
end
letter.succ!
end
# Make a copy of "a" with all the images quarter-sized
b = Magick::ImageList.new
page = Magick::Rectangle.new(0,0,0,0)
a.scene = 0
5.times do |i|
5.times do |j|
b << a.scale(0.25)
page.x = j * b.columns
page.y = i * b.rows
b.page = page
(a.scene += 1) rescue a.scene = 0
end
end
# Make a 5x5 mosaic
mosaic = b.mosaic
mosaic.write("mosaic.gif")
# mosaic.display
exit

Plotting labeled intervals in matplotlib/gnuplot

I have a data sample which looks like this:
a 10:15:22 10:15:30 OK
b 10:15:23 10:15:28 OK
c 10:16:00 10:17:10 FAILED
b 10:16:30 10:16:50 OK
What I want is to plot the above data in the following way:
captions ^
         |
       c |          *------*
       b |   *---*      *--*
       a | *--*
         |___________________
                        time >
With the color of lines depending on the OK/FAILED status of the data point. Labels (a/b/c/...) may or may not repeat.
As I've gathered from documentation for gnuplot and matplotlib, this type of a plot should be easier to do in the latter as it's not a standard plot and would require some preprocessing.
The question is:
Is there a standard way to do plots like this in any of the tools?
If not, how should I go about plotting this data (pointers to relevant tools/documentation/functions/examples which do something-kinda-like the thing described here)?
Updated: Now includes handling the data sample and uses mpl dates functionality.
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, MinuteLocator, SecondLocator
import numpy as np
from StringIO import StringIO
import datetime as dt
### The example data
a=StringIO("""a 10:15:22 10:15:30 OK
b 10:15:23 10:15:28 OK
c 10:16:00 10:17:10 FAILED
b 10:16:30 10:16:50 OK
""")
#Converts str into a datetime object.
conv = lambda s: dt.datetime.strptime(s, '%H:%M:%S')
#Use numpy to read the data in.
data = np.genfromtxt(a, converters={1: conv, 2: conv},
names=['caption', 'start', 'stop', 'state'], dtype=None)
cap, start, stop = data['caption'], data['start'], data['stop']
#Check the status, because we paint all lines with the same color
#together
is_ok = (data['state'] == 'OK')
not_ok = np.logical_not(is_ok)
#Get unique captions and their indices and the inverse mapping
captions, unique_idx, caption_inv = np.unique(cap, 1, 1)
#Build y values from the number of unique captions.
y = (caption_inv + 1) / float(len(captions) + 1)
#Plot function
def timelines(y, xstart, xstop, color='b'):
    """Plot timelines at y from xstart to xstop with given color."""
    plt.hlines(y, xstart, xstop, color, lw=4)
    plt.vlines(xstart, y+0.03, y-0.03, color, lw=2)
    plt.vlines(xstop, y+0.03, y-0.03, color, lw=2)
#Plot ok tl black
timelines(y[is_ok], start[is_ok], stop[is_ok], 'k')
#Plot fail tl red
timelines(y[not_ok], start[not_ok], stop[not_ok], 'r')
#Setup the plot
ax = plt.gca()
ax.xaxis_date()
myFmt = DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(myFmt)
ax.xaxis.set_major_locator(SecondLocator(interval=20)) # used to be SecondLocator(0, interval=20)
#To adjust the xlimits a timedelta is needed.
delta = (stop.max() - start.min())/10
plt.yticks(y[unique_idx], captions)
plt.ylim(0,1)
plt.xlim(start.min()-delta, stop.max()+delta)
plt.xlabel('Time')
plt.show()
The answer by @tillsten no longer works in Python 3. I made some modifications; I hope this helps.
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, MinuteLocator, SecondLocator
import numpy as np
import pandas as pd
import datetime as dt
import io
### The example data
a=io.StringIO("""
caption start stop state
a 10:15:22 10:15:30 OK
b 10:15:23 10:15:28 OK
c 10:16:00 10:17:10 FAILED
b 10:16:30 10:16:50 OK""")
data = pd.read_table(a, delimiter=" ")
data["start"] = pd.to_datetime(data["start"])
data["stop"] = pd.to_datetime(data["stop"])
cap, start, stop = data['caption'], data['start'], data['stop']
#Check the status, because we paint all lines with the same color
#together
is_ok = (data['state'] == 'OK')
not_ok = np.logical_not(is_ok)
#Get unique captions and their indices and the inverse mapping
captions, unique_idx, caption_inv = np.unique(cap, 1, 1)
#Build y values from the number of unique captions.
y = (caption_inv + 1) / float(len(captions) + 1)
#Plot function
def timelines(y, xstart, xstop, color='b'):
    """Plot timelines at y from xstart to xstop with given color."""
    plt.hlines(y, xstart, xstop, color, lw=4)
    plt.vlines(xstart, y+0.03, y-0.03, color, lw=2)
    plt.vlines(xstop, y+0.03, y-0.03, color, lw=2)
#Plot ok tl black
timelines(y[is_ok], start[is_ok], stop[is_ok], 'k')
#Plot fail tl red
timelines(y[not_ok], start[not_ok], stop[not_ok], 'r')
#Setup the plot
ax = plt.gca()
ax.xaxis_date()
myFmt = DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(myFmt)
ax.xaxis.set_major_locator(SecondLocator(interval=20)) # used to be SecondLocator(0, interval=20)
#To adjust the xlimits a timedelta is needed.
delta = (stop.max() - start.min())/10
plt.yticks(y[unique_idx], captions)
plt.ylim(0,1)
plt.xlim(start.min()-delta, stop.max()+delta)
plt.xlabel('Time')
plt.show()
gnuplot 5.2 version with creating a unique key list
The main difference to @CiroSantilli's solution is that a list of unique keys is created automatically from column 1, and the index of a key can be looked up via the function Lookup(). The referenced gnuplot demo already uses a list of unique items; in the OP's case, however, there are duplicates.
gnuplot has no built-in function for creating such a list of unique items, so you have to implement it yourself.
The code requires gnuplot >=5.2. It is probably difficult to get a solution which works under gnuplot 4.4 (the time of OP's question) because a few useful features were not implemented at that time: do for-loops, summation, datablocks, ... (a version for gnuplot 4.6 might be possible with some workarounds).
Edit: the earlier version used "with vectors" and linewidth 20 to plot the bars; however, linewidth 20 also extends the bars in the x-direction, which is not desired here. Therefore, "with boxxyerror" is now used.
Yes, it can be done shorter and clearer.
Script:
### Time chart with gnuplot (requires gnuplot>=5.0)
reset session
$Data <<EOD
# category start end status
"event 1" 10:15:22 10:15:30 OK
"event 2" 10:15:23 10:15:28 OK
pause 10:16:00 10:17:10 FAILED
"something else" 10:16:30 10:17:50 OK
unknown 10:17:30 10:18:50 OK
"event 3" 10:18:30 10:19:50 FAILED
pause 10:19:30 10:20:50 OK
"event 1" 10:17:30 10:19:20 FAILED
EOD
# create list of unique items
uniqueList = ''
item(col) = ' "'.strcol(col).'"'
isInList(list,col) = strstrt(list,item(col)) # returns a number >0 if found
addToList(list,col) = list.item(col)
stats $Data u (!isInList(uniqueList,1) ? uniqueList = addToList(uniqueList,1) : 0) nooutput
timeCenter(col1,col2) = (timecolumn(col1,myTimeFmt)+timecolumn(col2,myTimeFmt))*0.5
timeDeltaT(col1,col2) = (timecolumn(col1,myTimeFmt)-timecolumn(col2,myTimeFmt))*0.5
Lookup(col) = int(sum [i=1:words(uniqueList)] (strcol(col) eq word(uniqueList,i)) ? i : 0)
myColor(col) = strcol(col) eq "OK" ? 0x00cc00 : 0xff0000
myBoxWidth = 0.6
myTimeFmt = "%H:%M:%S"
set format x "%M:%S" timedate
set yrange [0.5:words(uniqueList)+0.5]
set grid x,y
plot $Data u (timeCenter(2,3)):(Lookup(1)):(timeDeltaT(2,3)):(0.5*myBoxWidth): \
(myColor(4)):ytic(1) w boxxyerror fill solid 1.0 lc rgb var notitle
### end of script
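The two ingredients of the script, an order-preserving list of unique categories with a 1-based lookup and the conversion of each interval into a box centre and half-width, can be sketched in Python. All function names here are illustrative assumptions. Note that the sketch takes end minus start so that the half-width is positive, whereas timeDeltaT(2,3) in the script subtracts the end from the start.

```python
def unique_in_order(items):
    """Return the distinct items in order of first appearance."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen

def lookup(item, unique):
    """1-based position of `item` in `unique`, or 0 if it is absent."""
    return unique.index(item) + 1 if item in unique else 0

def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return 3600 * h + 60 * m + s

def box(start, end):
    """Return (centre, half_width) in seconds for the interval start..end."""
    a, b = to_seconds(start), to_seconds(end)
    return (a + b) / 2, (b - a) / 2

# Category column of the datablock in the script above.
categories = ["event 1", "event 2", "pause", "something else",
              "unknown", "event 3", "pause", "event 1"]
keys = unique_in_order(categories)
print(keys)
print(lookup("pause", keys), box("10:15:22", "10:15:30"))
```

The lookup value serves as the y position of a bar, so repeated categories such as "pause" end up on the same row.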
gnuplot with vector solution
Minimized from: http://gnuplot.sourceforge.net/demo_5.2/gantt.html
main.gnuplot
#!/usr/bin/env gnuplot
$DATA << EOD
1 1 5
1 11 13
2 3 10
3 4 8
4 7 13
5 6 15
EOD
set terminal png size 512,512
set output "main.png"
set xrange [-1:]
set yrange [0:]
unset key
set border 3
set xtics nomirror
set ytics nomirror
set style arrow 1 nohead linewidth 3
plot $DATA using 2 : 1 : ($3-$2) : (0.0) with vector as 1, \
$DATA using 2 : 1 : 1 with labels right offset -2
You can remove the labels by deleting the second line of the plot command. I added them because they are useful in many applications for identifying the intervals more easily.
The Gantt example I linked to shows how to handle date formats instead of integers.
Tested in gnuplot 5.2 patchlevel 2, Ubuntu 18.04.
