I have a file which I am plotting with gnuplot. My data looks like this:
x,y1,y2
0,0,0
1,0.0,0.1
1,0.1,0.15
1,0.3,0.2
... etc
2 blank lines -> new block
0,0,0
0,0,0 (just example data)
0,0,0
... etc
2 blank lines -> new block
0,0,0
0,0,0
0,0,0
... etc
... etc (more blocks)
If I run the command: plot 'file.csv' using 1:2, then all the blocks appear on the same graph. I have about 1000 blocks, so obviously this produces something unreadable.
How can I plot all the blocks on different graphs? Sort of like a "for each datablock" loop or something?
Possible Partial Answer
I have made progress on this using a gnuplot for loop. This might not actually be a particularly good method, and I am now stuck as I am unable to count the number of "data blocks" in my file.
This is what I have so far:
NMAX=3 # How do I know what this should be?
do for [n=0:NMAX] {
ofname=sprintf("%d.png", n)
set output ofname
plot 'timeseries.csv' index n using 1:2, 'timeseries.csv' index n using 1:3 with lines
}
Perhaps that is useful? At the moment I don't know how to set NMAX automatically.
Further Developments
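NMAX can be set using the stats command: run stats 'timeseries.csv', then set NMAX=STATS_blocks-1 (index counts blocks from 0, so the last block is number STATS_blocks-1).
There may be a better method.
This question helped me: Count number of blocks in datafile
My code:
set datafile separator "," # the file is comma-separated (skip if already set)
set terminal png # write PNG images (skip if already set)
stats 'timeseries.csv' nooutput # defines STATS_blocks
NMAX=STATS_blocks-1 # index is zero-based
do for [n=0:NMAX] {
ofname=sprintf("%d.png", n)
set output ofname
plot 'timeseries.csv' index n using 1:2, 'timeseries.csv' index n using 1:3 with lines
}
set output # close the last output file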
Related
I'm an R beginner, and I spent almost two days figuring out how to draw two time series in one graph using "ts.plot". This should be a very simple task, but for some reason something always went wrong.
My dataset looks like this:
[image: Data]
I figured it out, and there are several ways to accomplish the task.
This is the most straightforward way: assign the first variable to "x" and the second to "y", then use "ts.plot" to plot the graph:
x <- usa$central_bank_assets_gdp_percent
y <- usa$domestic_credit_private_sector_gdp
ts.plot(ts(x), ts(y), col=1:2)
Attaching the data frame first with "attach()", and then using the real variable names directly in the code:
attach(usa)
ts.plot(ts(central_bank_assets_gdp_percent), ts(domestic_credit_private_sector_gdp), col=1:2)
detach(usa)
Using the "$" sign as an alternative to specifying the location of the data:
ts.plot(ts(usa$central_bank_assets_gdp_percent), ts(usa$domestic_credit_private_sector_gdp), col=1:2)
Using "data.frame()" one can include the variables:
ts.plot(data.frame(usa$central_bank_assets_gdp_percent, usa$financial_system_deposits_gdp_percent), col=1:2)
This is the way specified in the help: using "ts.plot(..., gpars = list())". Here "..." are the time series, and all other graphical parameters go in "gpars = list()":
ts.plot(ts(usa$central_bank_assets_gdp_percent), ts(usa$financial_system_deposits_gdp_percent), gpars = list(col=1:2))
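The variants above all need the "usa" data frame, which is not shown. As a rough, self-contained sketch, the following uses a made-up stand-in (the name usa_demo and all of its values are assumptions, not the original data) and binds both columns into one multivariate series before plotting:

```r
# Hypothetical stand-in for the `usa` data frame; the values are invented.
usa_demo <- data.frame(
  central_bank_assets_gdp_percent    = c(5.1, 5.4, 6.0, 6.8, 7.5),
  domestic_credit_private_sector_gdp = c(48.2, 49.0, 51.3, 50.7, 52.9)
)

# ts() turns the two columns into a single two-column time series.
both <- ts(usa_demo)

pdf(NULL)  # null device, so the sketch also runs without a display
ts.plot(both, col = 1:2)  # one line per column, colours 1 and 2
legend("topleft", legend = colnames(both), col = 1:2, lty = 1)
dev.off()
```

With a real data frame, dropping the pdf(NULL) and dev.off() lines draws the graph in the usual plot window.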
I am trying to create a filled.contour plot using the following matrix.
row1 <- rep(10,100)
row2 <- sample(c(10:30),100,replace=TRUE)
row3 <- rep(30,100)
z1 <- cbind(row1,row2,row3)
col1 <- colorRampPalette(c('red','yellow','deepskyblue'))(20)
filled.contour(z=z1,col=col1,cex.lab=2,cex.main=1.1,nlevels=20,main=('Heat map'))
I get the following plot:
(You won't see the border in the legend because I have modified filled.contour as stated here)
You can see that row2 is placed exactly in the middle (at position 0.5 on the axis). My question is as follows:
Is it possible to place the rows at user-defined locations rather than symmetrically? For example, I need the rows at positions c(0,.33,1) instead of the default c(0,.5,1).
While I waited for a response here, I kept trying, and it turned out to be a very silly problem.
You just need to change the default y argument of filled.contour.
Here is the (only slightly) modified call, which gives the following image:
filled.contour(z=z1,y=c(0,.33,1),col=col1,cex.lab=2,cex.main=1.1,nlevels=20,main=('Heat map with custom y placement'))
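A small sketch of the same idea with a sanity check added: the y vector needs one position per column of z, in increasing order. The name y_pos and the fixed seed are assumptions made for this illustration only.

```r
set.seed(42)  # only so the random middle row is reproducible
z1 <- cbind(row1 = rep(10, 100),
            row2 = sample(10:30, 100, replace = TRUE),
            row3 = rep(30, 100))

# Hypothetical name for the user-defined row positions.
y_pos <- c(0, 0.33, 1)

# One position per column of z1, strictly increasing.
stopifnot(length(y_pos) == ncol(z1), !is.unsorted(y_pos, strictly = TRUE))

pdf(NULL)  # null device, so the sketch also runs without a display
filled.contour(z = z1, y = y_pos, nlevels = 20,
               color.palette = colorRampPalette(c("red", "yellow", "deepskyblue")),
               main = "Heat map with custom y placement")
dev.off()
```

Passing the palette function through color.palette lets filled.contour pick the number of colours to match the levels, instead of hard-coding 20.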
I have a data frame that looks like this:
bin_with_regard_to_strand CLONE3
31 0.14750872
33 0.52735917
28 0.48559060
. .
. .
I want to use this data frame to generate violin plots in such a way that all of the values in CLONE3 corresponding to a given value of bin_with_regard_to_strand will generate one plot.
Further, I want all of the plots to appear in the same graphics device (I'm using RStudio, and I want all of the plots to appear in one plot window).
Theoretically I could do this with:
vioplot(df$CLONE3[which(df$bin_with_regard_to_strand==1)],
df$CLONE3[which(df$bin_with_regard_to_strand==2)]...)
but since bin_with_regard_to_strand has 60 different values, this seems a bit ridiculous.
I tried using tapply:
tapply(df$CLONE3, df$bin_with_regard_to_strand,vioplot)
But that opens 60 different windows (one for each plot).
Or, if I use the add parameter:
tapply(df$CLONE3, df$bin_with_regard_to_strand,vioplot(add=TRUE))
it generates a single plot with the data from all values of bin_with_regard_to_strand (separated by lines).
Is there a way to do this?
You could use par(mfrow=c(rows, columns)) (see ?par for details).
(See also ?layout for more complex arrangements.)
d <- lapply(1:6, function(x)runif(100)) # generate some example data
library("vioplot")
par(mfrow=c(3, 2)) # use a 3x2 (rows x columns) layout
lapply(d, vioplot) # call plot for each list element
par(mfrow=c(1, 1)) # reset layout
An alternative to mfrow is layout, which is very handy for organizing plots: you just create a matrix of plot indices. Since 60 plots is a lot for one device, you may want to spread them over 2 pages.
The code below is written in terms of N (the number of plots):
library(vioplot)
N <- 60
par(mar=rep(2,4))
layout(matrix(1:N,
nrow=10,byrow=TRUE))
dat <- data.frame(bin_with_regard_to_strand=gl(N,10),CLONE3=rnorm(10*N))
with(dat ,
tapply(CLONE3,bin_with_regard_to_strand ,vioplot))
This is an old question, but I thought I would offer a different solution for getting vioplot to draw multiple violin plots on the same graph (i.e. the same axes), rather than in separate panels like the answers above.
Basically, use do.call to apply vioplot to a list of data. Ultimately, vioplot is not very well written (you can't even set the title, axis names, etc.). I usually prefer base R, but this is a case where ggplot2 is probably the way to go.
library(vioplot)
x<-rnorm(1000)
fac<-rep(c(1:10),each=100)
listOfData<-tapply(x,fac,function(x){x},simplify=FALSE)
names(listOfData)[[1]]<-"x" #because vioplot requires an 'x' argument
do.call(vioplot,listOfData)
I currently have a dataset which has a format of: (x, y, type)
I've used the code that is found on the example of plotting with Postgres through R.
My question is: How would I get R to generate multiple graphs for each unique "type" column?
I'm new to R, so my apologies if this is extremely easy and I just lack an understanding of loops in R.
So let's say we have this data:
(1,1,T), (1,2,T), (1,3,T), (1,4,T), (1,5,T), (1,6,T),
(1,1,A), (1,2,B), (1,3,B), (1,4,B), (1,5,A), (1,6,A),
(1,1,B), (1,2,B), (1,3,C), (1,4,C), (1,5,C), (1,6,C),
It would plot 4 individual graphs on the page, one for each of the types T, A, B, and C (plotting x against y).
How would I do that with R when the data coming in may look like the data above?
While the other post has some good info, there's a faster way to do all that. So assuming your data frame or matrix is called DF and is in the form above (where each (1,2,B) or whatever is a row), then:
by(DF, DF[,3], function(x) plot(x[,1], x[,2], main=unique(x[,3])))
And that's it.
If you'd like all four plots on the same page, first change the graphics parameter:
par(mfrow=c(2,2))
Then set it back to the default with par(mfrow=c(1,1)) when you're done.
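Putting these pieces together, here is a minimal, self-contained sketch. The sample data frame is typed in from the question; saving the old settings in old_par (a name chosen for this example) is an alternative to resetting mfrow by hand.

```r
# Sample data in the (x, y, type) form from the question.
DF <- data.frame(
  x    = rep(1, 18),
  y    = rep(1:6, 3),
  type = c("T", "T", "T", "T", "T", "T",
           "A", "B", "B", "B", "A", "A",
           "B", "B", "C", "C", "C", "C"),
  stringsAsFactors = FALSE
)

pdf(NULL)  # null device, so the sketch also runs without a display

# par() returns the previous settings, which makes restoring them easy.
old_par <- par(mfrow = c(2, 2))

# One scatter plot per type; invisible() hides the list that by() returns.
invisible(by(DF, DF$type, function(d) {
  plot(d$x, d$y, main = d$type[1], xlab = "x", ylab = "y")
}))

par(old_par)               # back to the previous layout
restored <- par("mfrow")   # kept only to show that the layout was reset
dev.off()
```

Restoring from old_par also works when the layout was not the default before the call, which a hard-coded par(mfrow=c(1,1)) would overwrite.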
I'm quite fond of the ggplot2 package, which does the same thing that user1717913 suggests, but with slightly different syntax (it does a lot of other things very nicely, which is why I like it.)
test <- data.frame(x=rep(1,18),y=rep(1:6,3),type=c("T","T","T","T","T","T","A","B","B","B","A","A","B","B","C","C","C","C"))
require(ggplot2)
ggplot(test, aes(x=x, y=y)) + #define the data that the plot will use, and which variables go where
geom_point() + #plot it with points
facet_wrap(~type) #facet it by the type variable
R is really cool in that there's a bazillion (that's a technical term) different ways to do most things. The way I would do it is to split the data along the groups, and then plot by group.
To do that, the split command is what you want (I'll assume your data is in an object called data):
data.splitted <- split(data, data$type)
Now the data will have this form (let's assume you have 3 types, A, B, and C):
data.splitted
$A
  x y type
  1 4 A
  3 6 A
$B
  x y type
  3 3 B
  2 1 B
$C
  x y type
  4 5 C
  5 2 C
and so on. You would reference the "4" in the y column of group A like so:
data.splitted$A$y[1] or data.splitted[[1]][[2]][1]
Hopefully seeing them both together makes enough sense.
Now that we have the data split, we're getting closer.
We still need to tell R that we want to plot a bunch of graphs in the same window. This is just one way to go about it; you could also write each graph to an image file, a PDF, or whatever you want.
groups <- names(data.splitted) # Put the different types into a variable for reference later.
par(mfcol=c(length(groups),1))
mfcol fills the grid of graphs column by column, while mfrow fills it row by row. The c() just combines its inputs, and length(groups) returns the total number of groups.
Now we can work on the for-loop.
for(i in 1:length(data.splitted)){ # This tells it what i is iterating from and to.
# It can start and stop wherever, or be a
# sequence, ascending or descending,
# the sky is the limit.
tempx <- data.splitted[[i]][["x"]] # This just saves us
tempy <- data.splitted[[i]][["y"]] # a bunch of typing.
plot(tempx, tempy, main=groups[i]) # Plot it and make the title the type.
rm(tempx, tempy) # Remove our temporary variables for the next run through.
}
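The other option mentioned earlier, writing each graph to its own file, could look roughly like the sketch below. The sample data is typed in from the question; the output directory (tempdir()) and the "type_<name>.pdf" file naming are assumptions made for this example.

```r
# Sample data in the (x, y, type) form from the question.
DF <- data.frame(
  x    = rep(1, 18),
  y    = rep(1:6, 3),
  type = c("T", "T", "T", "T", "T", "T",
           "A", "B", "B", "B", "A", "A",
           "B", "B", "C", "C", "C", "C")
)

pieces <- split(DF, DF$type)  # one data frame per type

# Hypothetical output location and naming scheme.
out_dir   <- tempdir()
out_files <- file.path(out_dir, paste0("type_", names(pieces), ".pdf"))

for (i in seq_along(pieces)) {
  pdf(out_files[i])                       # open one PDF per group
  plot(pieces[[i]]$x, pieces[[i]]$y,
       main = names(pieces)[i], xlab = "x", ylab = "y")
  dev.off()                               # close the file before the next one
}
```

Swapping pdf() for png() gives image files instead, provided the R installation supports that device.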
So you see, it's not too bad when you break it down into its components. You can do pretty much anything this way. I have a project I'm working on right now, where I'm doing this for 18 lidar metrics that I calculated using another for loop.
Commands to read up on:
split, plot, data.frame, "[",
par(mfrow=___) and par(mfcol=___)
Here are a few helpful links to get you started. The most helpful one of all is built right into R, though: a ? followed by a command name brings up the HTML help for that command in your browser.
Good luck!
I have data in some text file which has let's say 10000 rows and 2 columns. I know that I can plot it easily by plot "filename.txt" using 1:2 with lines . What I want is however just plotting let's say the rows from 1000 to 2000 or any other reasonable selection. Is it possible to do that easily? Thank you very much in advance.
It appears that the "every" keyword in gnuplot is what you're looking for (note that it counts data rows from 0):
plot "filename.txt" every ::1000::2000 using 1:2 with lines
Alternatively, pre-process your file to select the rows in which you are interested. For example, using awk:
awk "NR>=1000 && NR<=2000" filename.txt > processed.txt
Then use the resulting "processed.txt" in your existing gnuplot command/script.
Simpler:
plot "<(sed -n '1000,2000p' filename.txt)" using 1:2 with lines
You can probably avoid relying on an external utility (for example, if your system doesn't have one installed) by using pseudo-column 0, which holds the zero-based row number.
See help plot datafile using pseudocolumn.
Try something like:
LINEMIN=1000
LINEMAX=2000
#create a function that accepts the line number as its first arg
#and returns the second arg if the line number is in the given range.
InRange(x,y)=((x>=LINEMIN) ? ((x<=LINEMAX) ? y:1/0) : 1/0)
plot "filename.txt" using (InRange($0,$1)):2 with lines
(tested on Gnuplot 4.4.2, Linux)
Gnuplot ignores NaN values. This works for me for a specified range of the x coordinate. Not sure how to specify row range though.
cutoff(c1,c2,xmin,xmax) = (c1>=xmin)*(c1<=xmax) ? c2 : NaN
plot "data.txt" u 1:(cutoff(($1),($2),1000,2000))
I would recommend some command-line tools like sed, grep or bash. In your example
head -n 2000 ./file.data > temp.data
and
tail -n 1001 temp.data > temp2.data
might work (tail keeps the last 1001 of the first 2000 lines, i.e. rows 1000 to 2000). I haven't tested whether such large numbers work with head and tail, though.