Gnuplot: Trouble creating heat map / reading data - dictionary

(I hope you get the meaning of what I write; English isn't my mother tongue.)
I have trouble creating a heat map showing air humidity using a txt file.
My data looks like this:
26.02.13 10:30:00 MEZ 31.79688 31.0625 32.875 31.8125 31.46875 30.9375 39.0 36.71875 36.1875
26.02.13 10:45:00 MEZ 31.875 31.10938 32.75 31.8125 31.46875 30.9375 39.0 36.71875 36.1875
26.02.13 11:00:00 MEZ 31.82813 31.15625 32.84375 31.8125 31.48438 30.9375 39.0 36.71875 36.1875
...
It's not a matrix, it's:
date1, time1, timezone, value1-room1, value1-room2, value1-room3,...
date2, time2, timezone, value2-room1, value2-room2, value2-room3,...
I inserted a blank line after every 96 values to "separate" the days from each other.
This is what my code looks like so far (I left out labels etc.):
reset
set cbrange [0:100]
set palette defined (0 '#0000BB',0.091 '#0055FF',0.182 '#44BBEE',0.2728 '#DDFFDD',0.273 '#DDFFBB',0.45 '#DDFF44',0.5 '#FFFF00',0.55 '#FFF600',0.61 '#FFEE00',0.66 '#FFDD00',0.727 '#FFBD00',0.7272 '#FFBB00',0.86 '#FA2200',0.92 '#EA0000',1.0 '#880000')
set cblabel "Humidity"
set cbtics 0,20,100
set timefmt '"%d.%m.%y %H:%M:%S"'
set format x '"%d.%m.%y"'
set xrange ['"26.02.13"':'"27.03.13"']
set format y '"%H:%M:%S"'
set xrange ['"00:00:00"':'"23:59:59"']
plot "data.txt" using 1:2:4
My intention was to create a heat map for room 1. If that works I want to create heat maps for the other rooms, but first things first :-)
**the problems i can't solve are:
"Skipping unreadable file "data.txt""
and
"Can't plot with an empty x range"**
Why is my file unreadable? It's in ANSI, and the blank lines should tell gnuplot where to start over again.
Why is the x range empty? Did I specify it wrong?
All files are located in the "bin" directory of gnuplot, the "data.txt" has a length of ca. 2000 lines, and my gnuplot version is 4.6.
thanks in advance
Johannes

Plotting time data in gnuplot is tricky, but there are a few things to remember:
You should set xdata time somewhere in your script to tell gnuplot that you are working with time.
When it comes to time specifiers, be very careful how your time columns are delimited. If your data file does not have quote characters you do not need them in your specifier.
If your time data spans multiple columns and you say so in your format specifier, gnuplot will figure it out. You only have to specify the column where the time data starts (column 1 in your case).
Here is a script that works for me:
#!/usr/bin/env gnuplot
reset
set terminal pngcairo enhanced color rounded dashed size 800,500
set output 'test.png'
set cbrange [0:100]
set palette defined (0 '#0000BB',0.091 '#0055FF',0.182 '#44BBEE',0.2728 '#DDFFDD',0.273 '#DDFFBB',0.45 '#DDFF44',0.5 '#FFFF00',0.55 '#FFF600',0.61 '#FFEE00',0.66 '#FFDD00',0.727 '#FFBD00',0.7272 '#FFBB00',0.86 '#FA2200',0.92 '#EA0000',1.0 '#880000')
set cblabel "Humidity"
set cbtics 0,20,100
# need these three lines for time data
set xdata time
set timefmt '%d.%m.%y %H:%M:%S' # no quotes in specifier because they are not in the file
set format x '%d.%m.%y' # no extra quotes here, otherwise they appear in the output
set xrange ['26.02.13':'27.03.13']
# removed y formatting - not sure what you intended there
plot "data.dat" using 1:4 title 'Room 1'
Which makes the following output:
Turning this into a heatmap is another question entirely..
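As a starting point for the heat map itself, here is a rough sketch rather than a tested solution. It assumes gnuplot 5.x (datablocks, strptime, and the timedate format option), which is newer than the 4.6 mentioned in the question. The helpers myTime(), dayStart() and timeOfDay() are hypothetical names, and the rows for 27.02.13 are made-up values added only so that more than one day shows up. Each sample is drawn as a colored square with the day on x and the time of day on y:

```gnuplot
### sketch: day on x, time of day on y, humidity as color
reset session
# first three rows are from the question, the 27.02.13 rows are invented
$Demo <<EOD
26.02.13 10:30:00 MEZ 31.79688
26.02.13 10:45:00 MEZ 31.875
26.02.13 11:00:00 MEZ 31.82813

27.02.13 10:30:00 MEZ 35.5
27.02.13 10:45:00 MEZ 36.25
27.02.13 11:00:00 MEZ 37.0
EOD
# hypothetical helpers: parse date+time, then split into day and time of day
myTime(c1,c2) = strptime("%d.%m.%y %H:%M:%S", strcol(c1)." ".strcol(c2))
dayStart(t)   = floor(t/86400.0)*86400
timeOfDay(t)  = t - dayStart(t)
set format x "%d.%m.%y" timedate
set format y "%H:%M" timedate
set xtics 86400
set cbrange [0:100]
set cblabel "Humidity"
plot $Demo u (t=myTime(1,2), dayStart(t)):(timeOfDay(t)):4 w p pt 5 ps 3 lc palette ti "Room 1"
### end of sketch
```

For the real file, $Demo would be replaced by the file name and column 4 by the column of the room of interest.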

Related

Trying to make gnuplot use custom colors from the third column of a file

John 10479 228c44
Tom 5780 4fffa7
Willia 5248 773095
Salem 4747 ea1c1s
john 4630 db2f0d
plot "data.txt" using 2:xtic(1):(1):3 with boxes lc rgb variable title "Messages sent"
**2:xtic(1):(1):3**
2 decides height
xtic(1) put the names of the first column on the X tics
(1) should be the boxwidth
3 should be the column gnuplot takes the rgb color from
Instead, I get "skipping data file with no valid points" and nothing shows up.
If I only use the first three, 2:xtic(1):(1),
for some reason gnuplot uses the third part ((1) instead of xtic(1)) as the xtic names and ignores xtic(1).
I have no idea what causes that either.
I just want to save column colors in hex RGB format in files instead of using linestyles.
First of all, column 3 should be integer numbers not strings. So, add 0x in front of column 3 to make them hexadecimal values. Alternatively, you could write a function to convert them to hexadecimal values.
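The conversion function mentioned here could look like the following sketch. hexColor() is a hypothetical helper name, and the approach assumes that your gnuplot build converts strings of the form "0x..." to numbers (this depends on the underlying C library, so treat it as an assumption to verify). Only the first three rows of the question's data are used:

```gnuplot
### sketch: read bare hex strings (no 0x prefix) as colors
reset session
$Raw <<EOD
John   10479 228c44
Tom     5780 4fffa7
Willia  5248 773095
EOD
# hypothetical helper: prepend "0x" to the string column and convert to integer
hexColor(col) = int("0x".strcol(col))
set boxwidth 0.8
set style fill solid 0.3
plot $Raw u 0:2:(hexColor(3)):xtic(1) w boxes lc rgb var ti "Messages sent"
### end of sketch
```

This way the data file can stay as it is, without editing the third column.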
The sequence is essential. xtic() is always last. Variable color is second last. The plotting style with boxes can be used with either two or three columns (check help boxes). However, you can also use it with just one column, then gnuplot takes the pseudocolumn 0 as implicit x-values. Apparently, together with a column for variable color gnuplot gets confused.
So, you have to specify the x-column explicitly.
And you can specify the width of the box separately, e.g. set boxwidth 0.8.
So both will work:
plot $Data u 0:2:(0.8):3:xtic(1) w boxes lc rgb var
and
set boxwidth 0.8
plot $Data u 0:2:3:xtic(1) w boxes lc rgb var
Script:
### with boxes and variable color
reset session
$Data <<EOD
John 10479 0x228c44
Tom 5780 0x4fffa7
Willia 5248 0x773095
Salem 4747 0xea1c1s
john 4630 0xdb2f0d
EOD
# set boxwidth 0.8
set style fill solid 0.3
plot $Data u 0:2:3:xtic(1) w boxes lc rgb var ti "Messages sent"
### end of script
Result:

Multiple Gnuplots within figure giving incorrect colors

I'm trying to plot 3 data sets, each a different color, on one plot, but my coloring code seems to always incorrectly assign the last set's color also to the middle set:
set terminal png
set datafile separator ","
set title "Hours slept"
set xlabel "Date"
set ylabel "Hours"
set output '1.png'
set xdata time
set timefmt "%m/%d/%y"
set xrange ["09/17/22":"11/12/22"]
set format x "%m/%d"
set style line 1 lt 1 linecolor rgb "blue" lw 2 pt 1
set style line 2 lt 2 linecolor rgb "red" lw 2 pt 1
set style line 3 lt 3 linecolor rgb "yellow" lw 2 pt 1
plot "< grep -e '\*' fraction.csv | sed 's/*//'" using 1:($4) title 'weekends' ls 1 with points, \
"< grep -e '^[0-9]' fraction.csv" using 1:($4) title 'weekdays' ls 2 with points, \
"< grep -e '\^' fraction.csv | sed 's/^//'" using 1:($4) title 'fridays' ls 3 with points
There are suddenly no reds (the middle plot).
When I remove just the 3rd friday plot (last line), it looks like this:
So clearly I'm doing the coloring wrong? With three plots, all the weekdays become yellow instead of red.
This weird bug is driving me crazy. I initially did it like this without the explicit styles:
"< grep -e '^[0-9]' fraction.csv" using ($1):($3) title 'weekends' with points lc rgb 'blue'
And the same exact problem happened.
When I run each of the 3 grep calls they are all distinct data sets and there are far more weekday points than the other two.
I don't have your data, so, the following script creates some random test data.
Why do you use grep and sed if you can do it with gnuplot only?
Check help tm_wday which returns a number for the weekdays (0-6) for Sunday to Saturday. Define a function which sets the color accordingly.
For the legend you can use keyentry (check help keyentry).
Addition: more explanations
I didn't have a clue about your gnuplot level, I thought you could adapt the example to your case.
Well, there is for almost every command, function, keyword a help entry in gnuplot. In the gnuplot console type help <keyword>.
myColor(t) = ..., defines a function using the ternary operator which returns a color in the format 0xRRGGBB depending on the weekday, check help ternary, help tm_wday, help colorspec.
set format x "%m/%d" timedate will format the x-axis as time axis, check help time_specifiers.
...(t=timecolumn(1,myTimeFmt))... in gnuplot date/time is handled as seconds passed since 1970-01-01 00:00:00, check help timecolumn.
...lc rgb var, sets the color from the data (or function), check help lc variable.
list your weekday categories in a string and address them by index via word, check help word.
for the keyentry use a loop (check help for and help keyentry) and get the color from the weekday number. 1970-01-01 was a Thursday (=4). So, subtract 1 from the index and multiply by 24*3600 seconds (=1 day) in order to get from the indices 1,2,3 to the weekday numbers 4,5,6 (Thu,Fri,Sat) (=weekdays, fridays, weekends) which will return the colors (red, yellow, blue).
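To see the index-to-weekday mapping from the last point in isolation, a small sketch like this can be pasted into the gnuplot console (gnuplot 5.x assumed). It reuses myColor() and the category string from the script and only prints values, it does not plot anything:

```gnuplot
### sketch: check which weekday number and color each legend index gets
myColor(t) = (d=tm_wday(t), d==5 ? 0xffff00 : d==6 || d==0 ? 0x0000ff : 0xff0000)
myDay(i)   = word("weekdays fridays weekends",i)
do for [i=1:3] {
    t = (i-1)*24*3600     # 0 s = Thursday 1970-01-01, then Friday, then Saturday
    print sprintf("%-8s  tm_wday = %d  color = 0x%06x", myDay(i), tm_wday(t), myColor(t))
}
### end of sketch
```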
Ok, so I modified the code such that you just have to
skip the random data creation section
replace $Data with '<YourFilename>', i.e. in your case 'fraction.csv'.
Script:
### color days of the week differently
reset session
myTimeFmt = "%m/%d/%y"
# create some random test data
set table $Data separator comma
t0 = time(0)
plot '+' u (strftime(myTimeFmt,t0+$0*24*3600)):(invnorm(rand(0))+7) w table
unset table
set datafile separator comma
set key noautotitle
set yrange[0:12]
set format x "%m/%d" timedate
myColor(t) = (d=tm_wday(t), d==5 ? 0xffff00 : d==6 || d==0 ? 0x0000ff : 0xff0000)
myDay(i) = word("weekdays fridays weekends",i)
plot $Data u (t=timecolumn(1,myTimeFmt)):2:(myColor(t)) w p pt 13 lc rgb var, \
for [i=1:3] keyentry w p pt 13 lc rgb myColor((i-1)*24*3600) ti myDay(i)
### end of script
Result:

Plotting multiple sets of information from file with Gnuplot

I have a file that looks like this:
0 0.000000
1 0.357625
2 0.424783
3 0.413295
4 0.417723
5 0.343336
6 0.354370
7 0.349152
8 0.619159
9 0.871003
0.415044
The last line is the mean of the N entries listed right above it. What I want to do is to plot a chart that has each point listed and a line with the mean value. I know it involves replot in some way but I can't read the last value separately.
You can make two passes using the stats command to get the necessary data
stats datafile u 1 nooutput
stats datafile u ($0==(STATS_records-1)?$1:1/0) nooutput
The first pass of stats will summarize the data file. What we are actually interested in is the number of records in the file, which will be saved in the variable STATS_records.
The second pass will compute a column to analyze. If the line number (the value of $0) is equal to one less than the number of records (lines are numbered from 0, so this is the last line), then we get this value, otherwise we get an invalid value. This causes the stats command to only look at this last line. Now the value of the last line is stored in STATS_max (or STATS_min and several other variables).
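To check what the two passes actually capture, here is a small self-contained sketch. It assumes gnuplot 5.x, since it puts the question's data into a datablock named $Data instead of reading a file, and it only prints the extracted value:

```gnuplot
### sketch: extract the last line's value with two stats passes
reset session
$Data <<EOD
0 0.000000
1 0.357625
2 0.424783
3 0.413295
4 0.417723
5 0.343336
6 0.354370
7 0.349152
8 0.619159
9 0.871003
0.415044
EOD
# pass 1: count the records (the last line counts, it has a column 1)
stats $Data u 1 nooutput
# pass 2: keep only the last record, everything else becomes invalid (1/0)
stats $Data u ($0==(STATS_records-1)?$1:1/0) nooutput
print STATS_max
### end of sketch
```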
Now we can create the plot using
plot datafile u 1:2, STATS_max
where we explicitly state columns 1 and 2 to make the first plot specification ignore that last line (actually, if we just do plot datafile it should default to this column selection and automatically ignore that last line, but this makes certain). This produces
An alternative way is to use external programs to filter the data. For example, if we have the linux command tail available, we could do the following (see note 1 below):
ave = system("tail -1 datafile")
plot datafile u 1:2, ave+0
Here, ave will contain the last row of the file as a string. In the plot command we add 0 to it to force it to change to a number (otherwise gnuplot will think it is a filename).
Other external programs can be used to read that last line as well. For example, the following call to Python (using Windows style shell quotes) does the same:
ave = system('python -c "print(open(''datafile'').readlines()[-1])"')
or the following using AWK (again with Windows style shell quotes) has the same result:
ave = system('awk "END{print}" datafile')
or even using Perl (again with Windows shell quotes):
ave = system('perl -lne "END{print $last} $last=$_" datafile')
1 This use of tail uses a now obsolete (according to the GNU manuals) command line option. Using tail -n 1 datafile is the recommended way. However, this shorter way is less to type, and if forward compatibility is not needed (i.e. you are using this script once), there is no reason not to use it.
Gnuplot ignores those lines with missing data (for example, the last line of your datafile has no column 2). Then, you can simply do the following:
stats datafile using 2 nooutput
plot datafile using 1:2, STATS_mean
The result:
There is no need for using external tools or using stats (unless the value hasn't been calculated already, but in your example it has).
During plotting of the data points you can assign the value of the first column, e.g. to the variable mean.
Since the last row doesn't contain a second column, no datapoint will be plotted, but this last value will be held in the variable mean.
If you replace reset session with reset and read the data from a file instead of a datablock, this will work with gnuplot 4.6.0 or even earlier versions.
Minimal solution:
plot FILE u (mean=$1):2, mean
Script: (nicer plot and including data for copy & paste & run)
### plot values as points and last value from column 1 as line
reset session
$Data <<EOD
0 0.000000
1 0.357625
2 0.424783
3 0.413295
4 0.417723
5 0.343336
6 0.354370
7 0.349152
8 0.619159
9 0.871003
0.415044
EOD
set key top center
plot $Data u (mean=$1):2 w p pt 7 lc rgb "blue" ti "Data", \
mean w l lw 2 lc rgb "red"
### end of script
Result:

store value from data file in variable gnuplot using dummy plot

This question is related to this one:
store commented value from data file in gnuplot
I have now formatted every single data file so that it looks like this:
1.0 0.01
0.2 0.0163 0.0000125
0.4 0.0275 0.0001256
Then I tried to read the first line and store it into variables in this way:
set term push
set term unknown
plot dataFile every ::0::0 using (a=$0):(b=$1)
set term pop
But this is not working as it should, why? The rest of the file I plot as follows:
plot dataFile every ::1 using 1:2:3 with errorbars lt 1 linecolor "red",f(a,b)
Column counting starts at 1, the zeroth column is the row number. And you must also restrict to the first block (note the three colons). Try
plot dataFile every :::0::0 using (a=$1):(b=$2)
Alternatively you can use stats in a similar way:
stats dataFile every :::0::0 using 1:2
a = STATS_min_x
b = STATS_min_y
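A self-contained sketch of the plot variant follows. It rests on two assumptions that are not stated in the question: that the parameter line is separated from the data by one blank line (so that it forms block 0), and that gnuplot 5.2 or newer is available (datablocks and plotting with table). $Dummy is a hypothetical throwaway datablock used to avoid drawing anything while the variables are assigned:

```gnuplot
### sketch: read the first block into variables, plot the rest
reset session
# assumption: one blank line after the parameter line
$Data <<EOD
1.0 0.01

0.2 0.0163 0.0000125
0.4 0.0275 0.0001256
EOD
# dummy "plot" into a datablock, only to assign a and b
set table $Dummy
plot $Data every :::0::0 using (a=$1):(b=$2) with table
unset table
print a, b
# the data itself starts in block 1
plot $Data every :::1 using 1:2:3 with yerrorbars lt 1 lc rgb "red"
### end of sketch
```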

How to use GnuPlot to plot a time series chart from a CSV file date and time stored in separate columns?

Let's take this as the data file:
2012-06-01, 01:00, 1
2012-06-01, 02:00, 2
2012-06-01, 03:00, 4
2012-06-01, 04:00, 3
...
2012-06-02, 01:00, 5
2012-06-02, 02:00, 2
2012-06-02, 03:00, 1
2012-06-02, 04:00, 1
...
I know how to set timefmt and xdata to plot time series when date and time are represented with a single field, but how to plot this with GnuPlot when time and date are stored in separate columns?
Not too differently than you would if they were spaces...
set timefmt '%Y-%m-%d, %H:%M'
set xdata time
set datafile sep ','
plot 'test.dat' u 1:3 w lines
I don't know if you've used timefmt with spaces in it before either (for regular space separated datafiles) but in that case, you specify the column where the time-data starts -- gnuplot automatically looks however many columns it needs to fill out the full time format. Of course, you need a full using specification (in this case that means designating that the data is in the 3rd column -- note, not the second as you might expect).
(tested on gnuplot 4.4 -- OS X)
Running Arch Linux
Gnuplot 4.6 patchlevel 3
I couldn't get mgilson's code snippet to work.
I needed to set the xrange before it would stop complaining
all points y value undefined!
I had to
set xrange["2012-06-01, 01:00":"2012-06-02, 05:00"]
and finally got a pretty plot
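If the multi-column time format keeps causing trouble, a different route (not the one used in the answers above) is to join the two columns into one string and parse it explicitly. The sketch below assumes gnuplot 5.x; myTime() is a hypothetical helper name, and the rows are the ones shown in the question without the elided parts:

```gnuplot
### sketch: date and time in separate CSV columns, parsed via strptime
reset session
$Data <<EOD
2012-06-01, 01:00, 1
2012-06-01, 02:00, 2
2012-06-01, 03:00, 4
2012-06-01, 04:00, 3
2012-06-02, 01:00, 5
2012-06-02, 02:00, 2
2012-06-02, 03:00, 1
2012-06-02, 04:00, 1
EOD
set datafile separator ","
# hypothetical helper: join date and time columns, then convert to seconds
myTime(dCol,tCol) = strptime("%Y-%m-%d %H:%M", strcol(dCol)." ".strcol(tCol))
set format x "%m-%d\n%H:%M" timedate
plot $Data using (myTime(1,2)):3 with linespoints title "value"
### end of sketch
```

Because the x values are computed inside using, neither set xdata time nor set timefmt is needed here, and the x range can be left to autoscaling.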

Resources