Plotting in gnuplot with a palette while showing only some labels

I am plotting data series with gnuplot with command:
p 'file.txt' u 1:2:3 with labels
and got a graph with so many labels
that it looks messy. So I used a different command:
p 'file.txt' u 1:2:3 with points pt 5 palette
which showed a beautiful graph with a colour spectrum.
But it did not show the labels. Actually, I don't need to show all the labels; I would like to show only the five lowest and the five highest values.
How can I combine these two commands so that the graph shows the colour spectrum together with 10 labels (5 for the lowest values and 5 for the highest)? Thanks.

The labels style accepts a tc palette option.
Thus you can do
plot datafile u 1:2:3:3 with labels tc palette
For example, with the following data
1 1 30
1 2 40
2 2 30
2 1 35
3 3 10
3 4 15
Using plot datafile u 1:2:3:3 with labels tc palette draws each value as a text label coloured according to the palette.
In order to filter to only the top 5 and bottom 5 numbers, you will need to do some pre-processing of your data outside of gnuplot.
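One possible way to do that pre-processing is sketched below in R (any scripting tool would work equally well). It assumes file.txt holds three whitespace-separated numeric columns with no header, as in the question. The helper name extremes and the output file name labels.txt are made up for illustration.

```r
# Sketch: keep only the rows with the n lowest and n highest values of column 3.
# 'extremes' is a hypothetical helper, not part of gnuplot or base R.
extremes <- function(d, n = 5) {
  d <- d[order(d[[3]]), ]          # sort rows by the third column, ascending
  if (nrow(d) <= 2 * n) return(d)  # too few rows: keep everything
  rbind(head(d, n), tail(d, n))    # n lowest rows followed by n highest rows
}

# Demo on made-up data so the snippet runs on its own
demo <- data.frame(x = 1:12, y = 12:1,
                   v = c(30, 40, 35, 10, 15, 50, 5, 45, 20, 25, 60, 1))
print(extremes(demo, n = 2))

# With the real file (uncomment to use); 'labels.txt' is an assumed output name
# d <- read.table("file.txt")
# write.table(extremes(d), "labels.txt", row.names = FALSE, col.names = FALSE)
```

The filtered file can then be drawn on top of the palette-coloured points, for example with plot 'file.txt' u 1:2:3 with points pt 5 palette, 'labels.txt' u 1:2:3 with labels.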

Related

Label first and last data points on an R plot

I have seen many examples using directlabels to place labels on the plot itself. However, all the examples only label the lines or points with the name of the series, i.e. pretty much like a legend.
Is it possible to label the first and last data points with the values of the points? E.g.
1-Jan 2-Jan ... 31-Jan
A 10 3 ... 7
B 8 11 ... 20
If the above data is plotted as line charts, is it possible to place a label on the left of the 2 lines as 10 and 8, and likewise label the right most points as 7 and 20?
Update: Thanks for the comments. Yes, I am using ggplot. I attach a mock-up below just to illustrate my requirement:
With ggplot2 you can pass the subset that you want labelled to the data argument of geom_text.
Add the following to your ggplot object p:
p + geom_point() + geom_text(data = <subset of your data>, ...)
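As a sketch of this idea, the example below labels only the first and last point of each series. The data frame df, its column names, and the values in it are invented for illustration; the only thing taken from the answer is passing a subset to the data argument of geom_text.

```r
library(ggplot2)

# Made-up data in long format: one row per (day, series) pair
set.seed(1)
df <- data.frame(
  day    = rep(1:31, times = 2),
  series = rep(c("A", "B"), each = 31),
  value  = sample(1:20, 62, replace = TRUE)
)

# Keep only the first and the last day: this is the subset to be labelled
ends <- df[df$day %in% range(df$day), ]

p <- ggplot(df, aes(x = day, y = value, colour = series)) +
  geom_line() +
  geom_point() +
  # text is drawn for the subset only; x, y and colour are inherited from ggplot()
  geom_text(data = ends, aes(label = value), vjust = -0.8, show.legend = FALSE)
print(p)
```

The same pattern works for any other subset, for instance the rows holding the minimum and maximum of each series.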

Manipulating Axes in ggplot2 in R

I have the following dataframe and I am using ggplot to plot the ind vs values.
ggplot(data=stats,aes(x=ind,y=values,fill=ind))+geom_bar(stat="identity")+coord_flip()+scale_fill_brewer()
stats
values ind
1 238970950 testdb_i
2 130251496 testdb_b
3 314350612 testdb_s
4 234212341 testdb_m
5 222281421 testdb_e
6 183681071 testdb_if
7 491868567 testdb_l
8 372612463 testdb_p
The y-axis ticks are shown as 0e+00, 1e+08, 2e+08 and so on, but I need them in the form 100M (hundred million), 200M (two hundred million), etc. How can I get the desired axis labels in ggplot?
You may try
ggplot(data=stats,aes(x=ind,y=values,fill=ind))+
geom_bar(stat="identity")+
coord_flip()+
scale_fill_brewer()+
scale_y_continuous(labels=function(x) paste0(x/1e6,"M"))
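To see what the labels function does, it can be tried on a few break values outside of ggplot. This is only a small check in base R; the name millions is made up here.

```r
# The labelling function from the answer, given a name so it can be tested
millions <- function(x) paste0(x / 1e6, "M")

# Apply it to the kind of break values the axis was showing
millions(c(0, 1e8, 2e8, 5e8))
# returns "0M", "100M", "200M", "500M"
```

ggplot2 calls the function with the numeric axis breaks and uses the returned character vector as tick labels, so any function with this shape can be plugged in.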

Multiple Plots in R

I want to plot 2 graphs in 1 frame. Basically I want to compare the results.
Anyways, the code I tried is:
plot(male,pch=16,col="red")
lines(male,pch=16,col="red")
par(new=TRUE)
plot(female,pch=16,col="green")
lines(female,pch=16,col="green")
When I run it, I DO get 2 plots in a frame BUT it changes my y-axis. Added my plot below. Anyways, y-axis values are -4,-4,-3,-3,...
It's like both of the plots display their own axis.
Please help.
Thanks
You don't need the second plot. Just use
> plot(male,pch=16,col="red")
> lines(male, pch=16, col = "red")
> lines(female, pch=16, col = "green")
> points(female, pch=16, col = "green")
Note: that will set the frame boundaries based on the first data set, so some data from the second plot could be outside the boundaries of the plot. You can fix it by e.g. setting the limits of the first plot yourself.
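A minimal sketch of setting the limits yourself is shown below. The male and female vectors here are random stand-ins for the question's data, which is not available.

```r
# Made-up data standing in for the question's 'male' and 'female' vectors
set.seed(42)
male   <- rnorm(10, mean = 1)
female <- rnorm(10, mean = 1)

# y-range that covers both series
yl <- range(c(male, female))

# First series sets up the frame with the common limits; type = "b" draws points and lines
plot(male, type = "b", pch = 16, col = "red", ylim = yl, ylab = "value")
# Second series is added to the same frame, so there is only one set of axes
lines(female, type = "b", pch = 16, col = "green")
```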
For this kind of plot I usually like the plotting with ggplot2 much better. The main reason: It generalizes nicely to more than two lines without a lot of code.
The drawback for your sample data is that it is not available as a data.frame, which is required for ggplot2. Furthermore, you always need an x-variable to plot against. So let us first create a data.frame from your data.
dat <- data.frame(index=rep(1:10, 2), vals=c(male, female), group=rep(c('male', 'female'), each=10))
Which leaves us with
> dat
index vals group
1 1 -0.4334269341 male
2 2 0.8829902521 male
3 3 -0.6052638138 male
4 4 0.2270191965 male
5 5 3.5123679143 male
6 6 0.0615821014 male
7 7 3.6280155376 male
8 8 2.3508890457 male
9 9 2.9824432680 male
10 10 1.1938052833 male
11 1 1.3151289227 female
12 2 1.9956491556 female
13 3 0.8229389822 female
14 4 1.2062726250 female
15 5 0.6633392820 female
16 6 1.1331669670 female
17 7 -0.9002109636 female
18 8 3.2137052284 female
19 9 0.3113656610 female
20 10 1.4664434215 female
Note that my command assumes you have 10 data values each. That command would have to be adjusted according to your actual data.
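If the two vectors do not both have 10 values, one way to adjust the command is to derive the index and group columns from the vector lengths. This is a sketch with random stand-in data, not the question's actual vectors.

```r
# Stand-in vectors of different lengths (made up for illustration)
set.seed(42)
male   <- rnorm(8)
female <- rnorm(12)

dat <- data.frame(
  index = c(seq_along(male), seq_along(female)),  # 1..8 followed by 1..12
  vals  = c(male, female),
  group = rep(c("male", "female"),
              times = c(length(male), length(female)))
)
str(dat)
```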
Now we may use the mighty power of ggplot2:
library(ggplot2)
ggplot(dat, aes(x=index, y=vals, color=group)) + geom_point() + geom_line()
The call above has three elements: ggplot initializes the plot, tells R to use dat as datasource and defines the plot aesthetics, or better: Which aesthetic properties of the plot (such as color, position, size, etc.) are influenced by your data. We use the x and y-values as expected and furthermore set the color aesthetic to the grouping variable - that makes ggplot automatically plot two groups with different colors. Finally, we add two geometries, that pretty much do what is written above: Draw lines and draw points.
The result:
If you have your data saved in the standard way in R (in a data.frame), you end up with one line of code. And if after some thousand years of evolution you want to add another gender, it is still one line of code.

R lattice - trying to change labels colors with y.scale.components customisation

I am currently trying to customise a lattice parallel plot by changing its y-axis label colours depending on the text of those labels. I created a custom yscale.components function, as described in many books and forums. However, after assigning a vector of new colours to the ans$left$labels$col parameter, only the default colour (black) is used for the plot.
Here's the code:
test2 <- read.table(textConnection("
species evalue l1 l2 l3
Daphnia.pulex 1.0E-6 17 41 35
Daphnia.pulex 1.0E-10 11 30 25
Daphnia.pulex 1.0E-20 4 14 17
Daphnia.pulex 1.0E-35 4 8 15
Daphnia.pulex 1.0E-50 1 4 8
Daphnia.pulex 1.0E-75 0 2 6
Ixodes.scapularis 1.0E-6 7 20 118
Ixodes.scapularis 1.0E-10 6 17 107
Ixodes.scapularis 1.0E-20 4 6 46
Ixodes.scapularis 1.0E-35 2 3 14
Ixodes.scapularis 1.0E-50 0 0 5
Ixodes.scapularis 1.0E-75 0 0 2
")->con,header=T);close(con)
#data.frame to assign a color to the data, depending on species names on y axis
orga<-c("Daphnia.pulex","Ixodes.scapularis")
color<-c("cornsilk2","darkolivegreen1" );
phylum<-c("arthropoda","arthropoda" );
colorChooser<-data.frame(orga,color,phylum)
#function for custom rendering of left y axis labels
yscale.components.custom<-function(...) {
ans<-yscale.components.default(...)
#vector for new label colors, grey60 by default
new_colors<-c()
new_colors<-rep("grey60",length(ans$left$labels$labels))
# the for() check all labels character and assign the corresponding color with the colorChooser data.frame
n<-1
for (i in ans$left$labels$labels) {
new_colors[n]<-as.character(colorChooser$color[colorChooser$orga==i])
#got the color corresponding to the label, with the colorChooser dataframe
n<-n+1
}
print(length(new_colors))
cat(new_colors,sep="\n") #print the content of the generated color vector
ans$left$labels$col<-new_colors #assign this vector to col parameter
ans
}
#plot everything
bwplot( reorder(species,l1,median)~l1,
data=test2,
panel = function(..., box.ratio) {
panel.grid(h=length(colnames(cdata[,annot.arthro]))-1,v=0,col.line="grey80")
panel.violin(..., col = "white",varwidth = FALSE, box.ratio = box.ratio )
panel.bwplot(..., fill = NULL, box.ratio = .07)
},
yscale.components=yscale.components.custom
)
Here's the output of the cat() command included in the yscale.components.custom function. As you can see, it prints the label colours twice, and each time the vector assigned to ans$left$labels$col has length 2. Is there a second call that sets up the y-axis label colours? Where does it come from?
[1] 2
darkolivegreen1
cornsilk2
[1] 2
darkolivegreen1
cornsilk2
Any help is welcome, I don't understand why the colours are assigned to ans$left$labels$col but everything is drawn in black. I would also like to change the violin border colours using the same colorChooser data.frame, but that's another story...
According to Deepayan Sarkar, whom I asked, the ans$left$labels$col value is apparently ignored during lattice execution.
I used a different solution based on the "scales" argument. Unfortunately, this means I can no longer rely on the reorder call in my lattice formula to order the data series by their median.
For the code mentioned above, I order the series manually and then create a vector of colours in the corresponding order. Then I set my axis label colours with
scales=list(col=c(color1,color2,...))
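The manual ordering step could look roughly like the sketch below. It rebuilds only the species and l1 columns of test2 and the orga and color columns of colorChooser from the question; the names med, lev and label_cols are introduced here for illustration.

```r
# l1 values per species, copied from the question's data
test2 <- data.frame(
  species = rep(c("Daphnia.pulex", "Ixodes.scapularis"), each = 6),
  l1      = c(17, 11, 4, 4, 1, 0,  7, 6, 4, 2, 0, 0)
)
colorChooser <- data.frame(
  orga  = c("Daphnia.pulex", "Ixodes.scapularis"),
  color = c("cornsilk2", "darkolivegreen1"),
  stringsAsFactors = FALSE
)

# Species names ordered by the median of l1, lowest first
med <- tapply(test2$l1, test2$species, median)
lev <- names(sort(med))

# Fix the factor level order by hand, then look up the colours in that same order
test2$species <- factor(test2$species, levels = lev)
label_cols <- colorChooser$color[match(lev, colorChooser$orga)]
print(data.frame(species = lev, colour = label_cols))
```

With the levels fixed this way, the formula can use species directly instead of reorder(species, l1, median), and label_cols can be supplied as the colour vector in scales.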

How can I produce this graph with ggplot2

This is the basic graph R presents when plotting a data frame.
plot(df)
It displays the relationship between all variables.
I know about faceting in ggplot2 but it's used for partition according to specific variables. I want to facet by a target parameter (for color) and split the grid by the variables.
sample data:
prediction.date mean.forcast mean.Error standard.Deviation AIC param.u param.v
2012-08-29 0.0015608102 0.008296402 0.008296402 -6.165365 2 5
2012-08-30 -0.0002720289 0.008537309 0.008537309 -6.164167 2 4
2012-09-02 -0.0014277972 0.008194409 0.008194409 -6.168868 4 0
2012-09-03 0.0016537998 0.008062687 0.008062687 -6.176634 5 3
2012-09-04 -0.0030247699 0.007885009 0.007885009 -6.181844 4 3
2012-09-05 0.0001538991 0.007524703 0.007524703 -6.197240 3 4
If you just need to colour the points in the plot you provided, you can use the col= argument of plot(), giving it the colour names indexed by the variable that determines the colour.
#variable of test result (should be the same length as number of rows in df)
test.result<-c(0,1,1,0,0,1)
plot(df[,3:7],col=c("green","red")[as.factor(test.result)])
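The colour lookup in that call can be checked on its own. Indexing a vector with a factor uses the factor's integer codes, so level "0" picks the first colour and level "1" the second. The name cols is introduced here for illustration.

```r
test.result <- c(0, 1, 1, 0, 0, 1)

# as.factor() gives levels "0" and "1", with integer codes 1 and 2
cols <- c("green", "red")[as.factor(test.result)]
cols
# returns "green", "red", "red", "green", "green", "red"
```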
