I have data in a text file with, let's say, 10000 rows and 2 columns. I know that I can plot it easily with plot "filename.txt" using 1:2 with lines. What I want, however, is to plot only the rows from 1000 to 2000, or any other reasonable selection. Is it possible to do that easily? Thank you very much in advance.
It appears that the "every" command in gnuplot is what you're looking for:
plot "filename.txt" every ::1000::2000 using 1:2 with lines
Alternatively, pre-process your file to select the rows in which you are interested. For example, using awk:
awk "NR>=1000 && NR<=2000" filename.txt > processed.txt
Then use the resulting "processed.txt" in your existing gnuplot command/script.
Simpler:
plot "<(sed -n '1000,2000p' filename.txt)" using 1:2 with lines
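Before plotting, it may be worth checking that the awk and sed variants really pick the same rows. The following is only a sketch: the sample file is generated here for illustration (row number and its square) and stands in for the real "filename.txt".

```shell
# Build a hypothetical 10000-row, 2-column sample file (row number, its square).
tmpdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 10000; i++) print i, i * i }' > "$tmpdir/filename.txt"

# Select rows 1000..2000 (1-based, inclusive) once with awk and once with sed.
awk 'NR >= 1000 && NR <= 2000' "$tmpdir/filename.txt" > "$tmpdir/awk.txt"
sed -n '1000,2000p' "$tmpdir/filename.txt" > "$tmpdir/sed.txt"

# Compare the two selections and report how many rows were kept.
cmp -s "$tmpdir/awk.txt" "$tmpdir/sed.txt" && same=yes || same=no
rows=$(wc -l < "$tmpdir/awk.txt" | tr -d ' ')
first=$(head -n 1 "$tmpdir/awk.txt")
echo "identical=$same rows=$rows first='$first'"
```

Both tools count lines from 1, so the selection holds 1001 rows, whereas gnuplot's "every" counts points from 0.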
You can probably cut out the reliance on an external utility (for example, if your system doesn't have one installed) by using pseudo-column 0, which holds the zero-based record number.
see help plot datafile using pseudocolumn
Try something like:
LINEMIN=1000
LINEMAX=2000
#create a function that accepts the line number as its first argument
#and returns the second argument if the line number is in the given range.
InRange(x,y)=((x>=LINEMIN) ? ((x<=LINEMAX) ? y:1/0) : 1/0)
plot "filename.txt" using (InRange($0,$1)):2 with lines
(tested on Gnuplot 4.4.2, Linux)
Gnuplot ignores NaN values. This works for me for a specified range of the x coordinate. Not sure how to specify row range though.
cutoff(c1,c2,xmin,xmax) = (c1>=xmin)*(c1<=xmax) ? c2 : NaN
plot "data.txt" u 1:(cutoff(($1),($2),1000,2000))
I would recommend some command-line tools like sed, grep or bash. In your example
head -n 2000 ./file.data > temp.data
and
tail -n 1001 temp.data > temp2.data
might work (1001 lines, because rows 1000 through 2000 are inclusive). I haven't tested whether head and tail cope with such large line counts, though.
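Whether the head/tail combination yields the intended rows can be checked on a small generated file. This is a sketch on a hypothetical one-column sample. Rows 1000 through 2000 inclusive are 1001 lines, so it uses tail -n 1001, and it pipes the two commands together to avoid the temporary file.

```shell
# Hypothetical sample: one number per line, so each line's value is its row number.
tmpdir=$(mktemp -d)
seq 1 10000 > "$tmpdir/file.data"

# Keep the first 2000 lines, then the last 1001 of those: rows 1000..2000.
picked=$(head -n 2000 "$tmpdir/file.data" | tail -n 1001)

first=$(printf '%s\n' "$picked" | head -n 1)
last=$(printf '%s\n' "$picked" | tail -n 1)
count=$(printf '%s\n' "$picked" | wc -l | tr -d ' ')
echo "first=$first last=$last count=$count"
```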
How can I create multiple columns in just one line of code? For instance, in the picture below, I am trying to turn the six lines of code below into a single line.
A1AFirstBatch4$OOATB1 <- with(A1AFirstBatch4, coalesce(Q1a, Q1b))
A1AFirstBatch4$OOATB2 <- with(A1AFirstBatch4, coalesce(Q2a, Q2b))
A1AFirstBatch4$OOATB3 <- with(A1AFirstBatch4, coalesce(Q3a, Q3b))
A1AFirstBatch4$OOATB4 <- with(A1AFirstBatch4, coalesce(Q4a, Q4b))
A1AFirstBatch4$OOATB5 <- with(A1AFirstBatch4, coalesce(Q5a, Q5b))
A1AFirstBatch4$OOATB6 <- with(A1AFirstBatch4, coalesce(Q6a, Q6b))
Created on 2021-04-07 by the reprex package (v2.0.0)
You could create the dataframe like this. In case you need to append (not create), an option would be to merge a newly created dataframe with the old one afterwards.
If it is really ONE LINE (not one statement) you are looking for, you can omit the linebreaks inserted here to increase the ability to read it.
A1AFirstBatch4 <- data.frame(OOATB1 = coalesce(Q1a, Q1b),
OOATB2 = coalesce(Q2a, Q2b),
... and so forth ... )
Is this what you are looking for? I am not sure what you need to achieve.
In case it is a piece of code for a function, if-statement etc.: use curly brackets:
{statement 1
statement 2
statement 3...
}
(Taking your question literally, one could simply put semicolons between the statements on a single line, as in A... <- ....; A... <- ....; ..., but this is unreadable, and I am not sure about the maximum number of characters per line.)
I would like to plot in R the equivalent of the binscatter command that you can find in Stata.
I have found the statar package that should give the same with the command stat_binmean.
I am having problems setting the bins, though. I want to specify the values of x at which the bins are constructed. So far I have only managed to set the number of bins, leaving it to R to choose the corresponding values of x.
The following is my code:
library(statar)
library(ggplot2)
g<-ggplot( df , aes(x=var_x , y=var_y))
g + stat_binmean(n=0)
The statar documentation says: "Set (n) to zero if you want to use distinct value of x for grouping", but how do I specify the values to group by?
PS: I am also fine with other commands, like stat_summary_bin, but my problem stays the same.
I have a coefficient array bees created in the following way:
gfit = lm(y_data ~ x_data);
bees = coef(gfit);, where bees[1]=0.123, bees[2]=4.56
A plot is created with plot(x_data,y_data). I'd like to add some text to this plot. The text should look like $b_0=0.123, b_1=4.56$ (how to add Latex symbols on StackOverflow?).
I tried the following command: text(3,15,expression(paste("b"[0],"=",bees[1])));, which renders as $b_0=bees_1$, i.e. the variable bees[1] is not evaluated.
How can I display the value of a variable by typing its name?
R doesn't have a LaTeX interpreter. You need to use ?plotmath. Try using bquote to pull in the values of R objects; here I am assuming that (1,1) is within the range of your (undescribed) data. The .() function substitutes values pulled from the working environment into the expression:
text(1,1, bquote( list( b[0] == .(bees[1]) , b[1] == .(bees[2]) ) ) )
See the examples in ?bquote.
Writing formulas is a horrible mess in R. Only regexp is more write-only.
bees=c(0.12, 4.56)
plot(rnorm(100))
text(30,0,bquote(bees[1]== .(bees[1])))
I have a file which I am plotting with gnuplot. My data looks like this:
x,y1,y2
0,0,0
1,0.0,0.1
1,0.1,0.15
1,0.3,0.2
... etc
2 blank lines -> new block
0,0,0
0,0,0 (just example data)
0,0,0
... etc
2 blank lines -> new block
0,0,0
0,0,0
0,0,0
... etc
... etc (more blocks)
If I run the command: plot 'file.csv' using 1:2, then all the blocks appear on the same graph. I have about 1000 blocks, so obviously this produces something unreadable.
How can I plot all the blocks on different graphs? Sort of like a "for each datablock" loop or something?
Possible Partial Answer
I have made progress on this using a gnuplot for loop. This might not actually be a particularly good method, and I am now stuck as I am unable to count the number of "data blocks" in my file.
This is what I have so far:
NMAX=3 # How do I know what this should be?
do for [n=0:NMAX] {
ofname=sprintf("%d.png", n)
set output ofname
plot 'timeseries.csv' index n using 1:2, 'timeseries.csv' index n using 1:3 with lines
}
Perhaps that is useful? At the moment I don't know how to set NMAX automatically.
Further Developments
NMAX can be set using the stats command: stats 'datafile.csv' then NMAX=STATS_blocks.
There may be a better method.
This question helped me: Count number of blocks in datafile
My code:
stats 'timeseries.csv'
NMAX=STATS_blocks
# index is zero-based, so the last block is NMAX-1
do for [n=0:NMAX-1] {
ofname=sprintf("%d.png", n)
set output ofname
plot 'timeseries.csv' index n using 1:2, 'timeseries.csv' index n using 1:3 with lines
}
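If the stats command is not available, the number of blocks could also be counted outside gnuplot. The awk sketch below assumes, as in the question, that blocks are separated by two blank lines; the three-block sample file is made up for illustration.

```shell
# Hypothetical sample: three data sets, each separated by two blank lines.
tmpdir=$(mktemp -d)
printf '0,0,0\n1,0.0,0.1\n\n\n0,0,0\n0,0,0\n\n\n0,0,0\n' > "$tmpdir/timeseries.csv"

# A new data set starts at the first data line of the file,
# or at the first data line following two or more blank lines.
nblocks=$(awk '
    /^[[:space:]]*$/ { blanks++; next }   # count consecutive blank lines
    { if (!seen || blanks >= 2) n++; seen = 1; blanks = 0 }
    END { print n + 0 }
' "$tmpdir/timeseries.csv")

echo "blocks=$nblocks last_index=$((nblocks - 1))"
```

Because gnuplot's index keyword starts at 0, a loop over the blocks would run from 0 to the block count minus one.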
I am struggling with GNUPLOT binary data handling.
I have a binary file, written by the MATLAB fwrite function, which writes in column order.
I am writing an Nx2 array, that is, a collection of points in the xy plane, which I guess is stored as x1..xn y1..yn, as consecutive records in the binary file. Do you agree? Bear in mind that I still don't have a clear idea of what binary storage means. I am used to ASCII files, with nice separators and \n's.
So I want to plot these points with gnuplot. I have been reading the binary general documentation and I ended trying this:
plot 'datafile.bin' binary array=N:N w l
that means that my data file is made by two arrays, each one of N elements. Gnuplot produces one line, first following the values of the first array, then following the values of the second array, both of them on the interval 1:N.
I tried to use the first array as x axis of my plot and the second array as y axis, So I try:
plot 'datafile.bin' binary array=N:N u 1:2 w l
It plots the two arrays again consecutively, not in a xy plot. Where am I wrong?
Many thanks
EDIT: I tried to apply the scan=xy keyword to both of the commands, but gnuplot told me that my file is a unidimensional record. So I guess that u 1:2 makes no sense here.
I don't think gnuplot can handle the data type you describe. It doesn't know about arrays and matrices like matlab does.
Write your data file with pairs of x,y values.
Then you can
plot 'datafile.bin' binary format='%float%float' using 1:2
(if your x,y values are both floats).
The "array" keyword is meant for the case when your file only contains the function values and you want gnuplot to construct the independent variable(s). Totally different.