GNU PLOT 2D Curve - plot

I am trying to plot the following data:
SMO LogiBoost BFTree
25(>=7) 0.81 0.72 0.62
30(>=7) 0.83 0.76 0.56
35(>=7) 0.84 0.70 0.75
40(>=7) 0.74 0.67 0.58
25(>=8) 0.73 0.76 0.57
30(>=8) 0.78 0.74 0.65
35(>=8) 0.83 0.78 0.68
40(>=8) 0.75 0.67 0.66
25(>=9) 0.69 0.74 0.62
30(>=9) 0.79 0.75 0.62
35(>=9) 0.82 0.82 0.69
40(>=9) 0.78 0.80 0.53
25(>=12) 0.77 0.78 0.67
30(>=12) 0.76 0.74 0.59
35(>=12) 0.91 0.94 0.75
40(>=12) 0.75 0.75 0.64
25(>=15) 0.74 0.74 0.60
30(>=15) 0.80 0.71 0.64
35(>=15) 0.80 0.71 0.76
40(>=15) 0.75 0.75 0.75
SansVar(>=7) 0.80 0.77 0.61
SansVar(>=8) 0.71 0.75 0.56
SansVar(>=9) 0.81 0.76 0.71
SansVar(>=12) 0.84 0.82 0.68
SansVar(>=15) 0.81 0.83 0.75
The first column holds the X labels and the first line holds the series names (the Y labels).
I also tried to add the X labels, but they overlap each other. Is it possible to fix that?
Command to plot:
plot "data.txt" using 2:xtic(1) title 'SMO' with lines, \
     "data.txt" using 3:xtic(1) title 'LogiBoost' with lines, \
     "data.txt" using 4:xtic(1) title 'BFTree' with lines
I found a possible solution, shown below, but the problem is that the X labels still don't fit inside the image.
set xtics rotate by -45

You could try resizing the margins so the rotated labels have room below and to the right of the plot.
reset
set terminal png
set rmargin at screen 0.85
set bmargin at screen 0.25
set output 'out.png'
set xtics rotate by -45 scale 0
# Column 1 holds the tick labels; columns 2-4 hold the three series
plot "data.dat" using 2:xtic(1) title 'SMO' with lines, \
     "" using 3:xtic(1) title 'LogiBoost' with lines, \
     "" using 4:xtic(1) title 'BFTree' with lines


Grouped boxplot in R - simplest way

I have been struggling with creating a very simple grouped boxplot. My data looks as follows
> data
Wörter Sätze Text
P.01 0.15 0.24 0.34
P.02 0.10 0.15 0.08
P.03 0.05 0.18 0.16
P.04 0.55 0.60 0.44
P.05 0.00 0.06 0.26
P.06 0.20 0.65 0.68
P.07 0.15 0.31 0.47
P.08 0.35 0.87 0.69
P.09 0.35 0.75 0.76
N.01 0.40 0.78 0.59
N.02 0.55 0.95 0.76
N.03 0.65 0.96 0.83
N.04 0.60 0.90 0.77
N.05 0.50 0.95 0.82
If I simply execute boxplot(data) I obtain almost what I want. One plot with three boxes, each for one of the variables in my data.
[Image: Boxplot, almost]
What I want is to split each of these into two boxes per variable (one for the P-indexed and one for the N-indexed observations), for a total of six boxes.
I began by introducing a new variable
data$Gruppe <- c(rep("P",9), rep("N",5))
> data
Wörter Sätze Text Gruppe
P.01 0.15 0.24 0.34 P
P.02 0.10 0.15 0.08 P
P.03 0.05 0.18 0.16 P
P.04 0.55 0.60 0.44 P
P.05 0.00 0.06 0.26 P
P.06 0.20 0.65 0.68 P
P.07 0.15 0.31 0.47 P
P.08 0.35 0.87 0.69 P
P.09 0.35 0.75 0.76 P
N.01 0.40 0.78 0.59 N
N.02 0.55 0.95 0.76 N
N.03 0.65 0.96 0.83 N
N.04 0.60 0.90 0.77 N
N.05 0.50 0.95 0.82 N
Now that the data contains a non-numerical variable I cannot simply execute the boxplot() function as before. What would be a minimal alteration to make here to obtain the six plots that I want? (colour coding for the two groups would be nice)
I have encountered some solutions to a grouped boxplot, however the data from which others start tends to be organised differently than my (very simple) one.
Many thanks!
As @teunbrand already mentioned in the comments, you could use pivot_longer to reshape your data into a longer format, keeping Gruppe as an identifier. Mapping Gruppe to fill then gives two boxplots per variable, six in total, like this:
library(tidyr)
library(dplyr)
library(ggplot2)
data$Gruppe <- c(rep("P",9), rep("N",5))
data %>%
  pivot_longer(cols = -Gruppe) %>%
  ggplot(aes(x = name, y = value, fill = Gruppe)) +
  geom_boxplot()
Created on 2023-01-10 with reprex v2.0.2
Data used:
data <- read.table(text = " Wörter Sätze Text
P.01 0.15 0.24 0.34
P.02 0.10 0.15 0.08
P.03 0.05 0.18 0.16
P.04 0.55 0.60 0.44
P.05 0.00 0.06 0.26
P.06 0.20 0.65 0.68
P.07 0.15 0.31 0.47
P.08 0.35 0.87 0.69
P.09 0.35 0.75 0.76
N.01 0.40 0.78 0.59
N.02 0.55 0.95 0.76
N.03 0.65 0.96 0.83
N.04 0.60 0.90 0.77
N.05 0.50 0.95 0.82", header = TRUE)
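For comparison, here is a base-graphics sketch that stays closer to the original boxplot() call. It is not part of the answer above: the small data frame is a stand-in with ASCII column names and only a few of the question's rows, and the colours are arbitrary.

```r
# Small stand-in for the question's data (ASCII column names, a few rows only)
data <- data.frame(
  Words     = c(0.15, 0.10, 0.40, 0.55),
  Sentences = c(0.24, 0.15, 0.78, 0.95),
  Text      = c(0.34, 0.08, 0.59, 0.76),
  row.names = c("P.01", "P.02", "N.01", "N.02")
)

# Derive the group from the first letter of the row names
data$Gruppe <- substr(rownames(data), 1, 1)

# stack() collects the numeric columns into 'values' plus an 'ind' factor
vars <- c("Words", "Sentences", "Text")
long <- cbind(stack(data[, vars]), Gruppe = rep(data$Gruppe, times = length(vars)))

# One box per Gruppe-by-variable combination: 2 x 3 = 6 boxes
bp <- boxplot(values ~ Gruppe + ind, data = long,
              col = c("tomato", "steelblue"), las = 2, xlab = "")
```

The two colours are recycled across the boxes, so each group keeps the same colour for every variable.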

Fail to split string at each instance of character in R

I am using R to try and separate a long string of numbers all separated by the ";" character. The string looks like this:
";0,38;0,33;0,24;0,28; 0,33;0,33;0,38;0,23; 0,33;0,33; 0,38; 0,43; 0,51;0,56;0,33;0,56;0,33;0,43;0,51;0,56;\n\n0,61; 0,66;0,56; 0,66;0,56; 0,61; 0,66;0,61; 0,63; 0,66; 0,71;0,81;0,86; 0,99;0,86; 0,99; 1,12;1,27; 1,54; 1,57"
I have tried to do
strsplit(string,";")
and
str(string,";")
What is a quick way to do this so that I end up with a list of all the numbers in my string? Is there a way to do this with the tidyverse?
The scan function allows semicolons as separators and commas as decimal points (at least for input).
> vals <- scan(text=string, sep=";", dec=",")
Read 42 items
> vals
[1] NA 0.38 0.33 0.24 0.28 0.33 0.33 0.38 0.23 0.33 0.33 0.38 0.43 0.51 0.56 0.33 0.56 0.33
[19] 0.43 0.51 0.56 NA 0.61 0.66 0.56 0.66 0.56 0.61 0.66 0.61 0.63 0.66 0.71 0.81 0.86 0.99
[37] 0.86 0.99 1.12 1.27 1.54 1.57
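If you would rather stay with strsplit(), a sketch along these lines should also work. The string here is a shortened stand-in for the one in the question, and the variable names are only illustrative.

```r
# Shortened stand-in for the string in the question (illustrative only)
string <- ";0,38;0,33; 0,24;\n\n0,61; 1,57"

# Split on ";" and strip surrounding spaces and newlines from each piece
pieces <- trimws(unlist(strsplit(string, ";", fixed = TRUE)))

# Drop the empty piece produced by the leading ";"
pieces <- pieces[nzchar(pieces)]

# Swap the decimal comma for a point, then convert to numeric
vals <- as.numeric(sub(",", ".", pieces, fixed = TRUE))
vals
```

With the scan() result, the NA entries can be dropped afterwards with vals[!is.na(vals)].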

R: Why can't I use dimnames() to assign dim names?

fg = read.table("fungus.txt", header=TRUE, row.names=1);fg
names(dimnames(fg)) = c("Temperature", "Area");names(dimnames(fg))#doesn't work
dimnames(fg) = list("Temperature"=row.names(fg), "Area"=colnames(fg));dimnames(fg)
#doesn't work
You can look at the picture of data I used below:
Using dimnames() to assign dim names to the data.frame doesn't work.
Neither of the two R commands works. The dimnames of fg don't change, and names(dimnames(fg)) is still NULL.
Why does this happen? How can I change the dimnames of this data.frame?
In the end I found that converting the data frame to a matrix works well.
fg = as.matrix(read.table("fungus.txt", header=TRUE, row.names=1))
dimnames(fg) = list("Temp"=row.names(fg), "Isolate"=1:8);fg
And got the output:
    Isolate
Temp    1    2    3    4    5    6    7    8
  55 0.66 0.67 0.43 0.41 0.69 0.63 0.46 0.52
  60 0.82 0.81 0.80 0.79 0.85 0.91 0.53 0.66
  65 0.91 1.09 0.81 0.86 0.95 0.93 0.64 1.10
  70 1.02 1.22 1.03 1.08 1.10 1.13 0.80 1.17
  75 1.06 1.17 0.89 1.02 1.06 1.29 0.94 1.01
  80 0.80 0.81 0.73 0.77 0.80 0.79 0.59 0.95
  85 0.26 0.40 0.36 0.53 0.67 0.53 0.57 0.18
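As for why the data frame version appears to do nothing: as far as I understand it, the dimnames<- method for data frames only copies the two components into row.names and names, so any names on the list itself are discarded, whereas a matrix stores the list as given. A minimal sketch with a made-up two-row data frame (the values and column names are illustrative, not the fungus data):

```r
# Made-up stand-in for the fungus data (illustrative values)
fg <- data.frame(a = c(0.66, 0.82), b = c(0.67, 0.81),
                 row.names = c("55", "60"))

# On a data frame the component values are applied,
# but the names of the list ("Temp", "Isolate") are dropped
dimnames(fg) <- list(Temp = rownames(fg), Isolate = c("1", "2"))
colnames(fg)          # "1" "2"
names(dimnames(fg))   # NULL

# A matrix keeps the names of the dimnames list
m <- as.matrix(fg)
names(dimnames(m)) <- c("Temp", "Isolate")
names(dimnames(m))    # "Temp" "Isolate"
```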
Reply to the comments: if you do not know anything about the code, then do not ask me why I post such a question.

'x' must be numeric ERROR in R while trying to create a Leaf and Stem display

I am a beginner at R and I'm just trying to read a text file that contains values and create a stem display, but I keep getting an error. Here is my code:
setwd("C:/Users/Michael/Desktop/ch1-ch9 data/CH01")
gravity=read.table("C:ex01-11.txt", header=T)
stem(gravity)
Error in stem(gravity) : 'x' must be numeric
The File contains this:
'spec_gravity'
0.31
0.35
0.36
0.36
0.37
0.38
0.4
0.4
0.4
0.41
0.41
0.42
0.42
0.42
0.42
0.42
0.43
0.44
0.45
0.46
0.46
0.47
0.48
0.48
0.48
0.51
0.54
0.54
0.55
0.58
0.62
0.66
0.66
0.67
0.68
0.75
If you can help, I would appreciate it! Thanks!
gravity is a data frame, but stem expects a numeric vector. You need to select a column of your data set and pass that to stem, i.e.
## The first column
stem(gravity[,1])
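A quick way to see the difference, using only the first few values from the file (read from inline text here instead of from disk, which is an assumption made for the sake of a self-contained example):

```r
# First few values from the question's file, read from inline text
gravity <- read.table(text = "'spec_gravity'\n0.31\n0.35\n0.36\n0.36\n0.37",
                      header = TRUE)

is.numeric(gravity)               # FALSE: a data frame, not a numeric vector
is.numeric(gravity$spec_gravity)  # TRUE: a single column is a numeric vector

# Selecting the column by name is equivalent to gravity[, 1] here
stem(gravity$spec_gravity)
```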

A better way to plot lots of lines (in ggplot perhaps)?

Using R 3.0.2, I have a dataframe that looks like
head(tmp)
0 5 10 15 30 60 120 180 240
YKL134C 0.08 -0.03 -0.74 -0.92 -0.80 -0.56 -0.54 -0.42 -0.48
YMR056C -0.33 -0.26 -0.56 -0.58 -0.97 -1.47 -1.31 -1.53 -1.55
YBR085W 0.55 3.33 4.11 3.47 2.16 2.19 2.01 2.09 1.55
YJR155W -0.44 -0.92 -0.27 0.75 0.28 0.45 0.45 0.38 0.51
YNL331C 0.42 0.01 -0.05 0.23 0.19 0.43 0.73 0.95 0.86
YOL165C -0.49 -0.46 -0.25 0.03 -0.26 -0.16 -0.12 -0.37 -0.34
Where row.names() are variable names, names() are measurement times, and the values are measurements. It's several thousand rows deep. Let's call it tmp.
I want to do a sanity check of plotting every variable as time versus value as a line-plot on one plot. What's a better way to do it than naively plotting each line with plot() and lines():
timez <- names(tmp)
plot(x=timez, y=tmp[1,], type="l", ylim=c(-5,5))
for (i in 2:length(tmp[,1])) {
  lines(x=timez,y=tmp[i,])
}
The above crude answer is good enough, but I'm looking for a way to do this right. I had a concussion recently, so sorry if I'm missing something obvious. I've been doing that a lot.
Could it be something with transposing the data.frame so it's each timepoint observed across several thousand variables? Or melt()-ing the data.frame in some meaningful way? Is there some way of handling it in ggplot using aggregate()s of data.frames or something? This isn't the right way to do this, is it?
At a loss.
I personally prefer ggplot2 for all of my plotting needs. Assuming I've understood you correctly, you can put the data in long format with reshape2 and then use ggplot2 to plot all of your lines on the same plot:
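library(reshape2)
library(ggplot2)
# Assumes df holds the variable names in a column called "var"
# and that the time columns are named X0, X5, X10, ...
df2 <- melt(df, id.vars = "var")
names(df2) <- c("var", "time", "value")
# Drop the leading "X" from the column names to get numeric times
df2$time <- as.numeric(substring(df2$time, 2))
ggplot(df2, aes(x = time, y = value, colour = var)) + geom_line()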
You can simply use matplot as follows
DF
## 0 5 10 15 30 60 120 180 240
## YKL134C 0.08 -0.03 -0.74 -0.92 -0.80 -0.56 -0.54 -0.42 -0.48
## YMR056C -0.33 -0.26 -0.56 -0.58 -0.97 -1.47 -1.31 -1.53 -1.55
## YBR085W 0.55 3.33 4.11 3.47 2.16 2.19 2.01 2.09 1.55
## YJR155W -0.44 -0.92 -0.27 0.75 0.28 0.45 0.45 0.38 0.51
## YNL331C 0.42 0.01 -0.05 0.23 0.19 0.43 0.73 0.95 0.86
## YOL165C -0.49 -0.46 -0.25 0.03 -0.26 -0.16 -0.12 -0.37 -0.34
matplot(t(DF), type = "l", xaxt = "n", ylab = "")
axis(side = 1, at = 1:length(names(DF)), labels = names(DF))
xaxt = "n" suppresses plotting of the x-axis annotations. The axis function lets you specify the details of any axis; here it is used to set the x-axis labels.
It should produce one line per row of DF, with the measurement times as x-axis labels.
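If the x axis should reflect the actual spacing of the measurement times rather than evenly spaced labels, a variation like the following may help. It is a sketch using only the first three rows and columns of the data shown above, and it assumes the column names are the bare numbers (hence check.names = FALSE).

```r
# First three rows and columns of the data shown above
tmp <- data.frame(
  `0`  = c(0.08, -0.33, 0.55),
  `5`  = c(-0.03, -0.26, 3.33),
  `10` = c(-0.74, -0.56, 4.11),
  row.names = c("YKL134C", "YMR056C", "YBR085W"),
  check.names = FALSE
)

# Convert the column names to numeric measurement times
times <- as.numeric(names(tmp))

# t(tmp) has one column per variable, so matplot draws one line per row of tmp
matplot(x = times, y = t(tmp), type = "l", lty = 1,
        xlab = "time", ylab = "value")
```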
