Names cut on the x axis of the boxplot


I'm trying to save a boxplot in .tiff format using the code below:
sample_01 <- c(6, 1, 6, 8, 9, 8, 7, 3, 4, 9)
sample_02 <- c(13, 17, 16, 22, 18, 14, 20, 20, 11, 19)
sample_03 <- c(25, 23, 26, 29, 29, 22, 30, 27, 26, 21)
sample_04 <- c(31, 37, 40, 36, 33, 34, 31, 32, 37, 35)
sample_05 <- c(41, 44, 43, 47, 45, 50, 41, 45, 43, 50)
tiff(file = "temp.tiff", width = 3200, height = 3200, units = "px", res = 300)
box <- boxplot(sample_01,sample_02,sample_03,sample_04,sample_05,
names = c("sample_01","sample_02","sample_03","sample_04","sample_05"),
ylab = 'Relative Abundance (%)',
ylim = c(0,55),
col = c('red','green','blue','orange','purple'),
las=2,
cex.axis = 1.5,
cex.lab = 1.5
)
dev.off()
However, the variable names are always cut off on the graph's x-axis. I tried using par(mar = c()) in several different ways, but I was unable to solve the problem. I also changed the height and width values, again without success. How can I make sure the x-axis names are saved in full?

You can set the margins of your plot with par(mar = c(bottom, left, top, right)); a larger bottom value leaves more room for the rotated labels.
As pointed out by @AndersonNBarbosa, par(mar = ...) needs to be called after tiff(...):
tiff(file = "temp.tiff", width = 3200, height = 3200, units = "px", res = 300)
par(mar = c(8,5,2,2))
box <- boxplot(sample_01,sample_02,sample_03,sample_04,sample_05,
names = c("sample_01","sample_02","sample_03","sample_04","sample_05"),
ylab = 'Relative Abundance (%)',
ylim = c(0,55),
col = c('red','green','blue','orange','purple'),
las=2,
cex.axis = 1.5,
cex.lab = 1.5
)
dev.off()

dc37, you made me notice a mistake I was making. In my script, I was calling par(mar = c()) before tiff(), which caused the labels to be cut off, as in the example below:
par(mar = c(8,5,2,2))
tiff(...)
boxplot(...)
dev.off()
Therefore, when saving the image, par(mar = c()) must be called after tiff() for the margins to take effect, as shown below:
tiff(...)
par(mar = c(8,5,2,2))
boxplot(...)
dev.off()
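The order matters because par() settings belong to whichever graphics device is active at the time of the call, and every newly opened device starts from the defaults. Here is a minimal sketch of that behaviour, assuming pdf(file = NULL) as a stand-in device so that nothing is written to disk (the same applies to tiff()); the names mar_first and mar_second are only illustrative:

```r
# First device: set custom margins on it.
pdf(file = NULL)
par(mar = c(8, 5, 2, 2))
mar_first <- par("mar")      # margins of the first device

# Second device, opened AFTER the par() call: it starts from the defaults,
# so the margins set above do not carry over.
pdf(file = NULL)
mar_second <- par("mar")

invisible(dev.off())         # close the second device
invisible(dev.off())         # close the first device

print(mar_first)             # the custom margins
print(mar_second)            # the default margins of a fresh device
```

Calling par() before tiff() therefore changes a different device than the one the boxplot is eventually drawn on.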

Related

How to add or annotate Latex formula as annotation in boxplot() in R?

I want to annotate my boxplot (created in base R) with some text and LaTeX formulas. I tried wrapping the formula in $...$, but it didn't work. Does anyone know a solution?
i = c(1:20)
X = c(13, 18, 25, 58, 25, 31, 39, 42, 17, 35, 46, 22, 18, 20, 26, 14, 33, 19, 20, 21)
df = data.frame(i, X)
boxplot(df$X, data=df, main="Belminuten Data",
xlab=" ", ylab="Aantal Belminuten",
frame = FALSE,
ylimit = c(10, 60),
range=3)
text(x = c(1.3), y = 60, "n = 20") # n should be in italic or in formula style
text(x = c(.7), y = 23.5, "Med = 23.5")
text(x = c(.7), y = 18.5, "Q_1 = 18.5")

library(latex2exp)
i = c(1:20)
X = c(13, 18, 25, 58, 25, 31, 39, 42, 17, 35, 46, 22, 18, 20, 26, 14,
33, 19, 20, 21)
df = data.frame(i, X)
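boxplot(df$X, main="Belminuten Data", # data= dropped: it is not needed when passing df$X directly
xlab=" ", ylab="Aantal Belminuten",
frame = FALSE,
ylim = c(10, 60), # was "ylimit", which is not a boxplot() argument
range=3)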
text(x = c(1.3), y = 60, TeX('$n = 20$'))
text(x = c(.7), y = 13.0, TeX('$Min = 13$'))
text(x = c(.7), y = 18.5, TeX('$Q_1 = 18.5$'))
text(x = c(.7), y = 23.5, TeX('$Med = 23.5$'))
text(x = c(.7), y = 34.0, TeX('$Q_3 = 34$'))
text(x = c(.7), y = 58.0, TeX('$Max = 58$'))
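If installing latex2exp is not an option, base R's plotmath expressions can typeset similar labels. The sketch below is an alternative to the answer above, not part of it. It reads the values from the statistics returned by boxplot() instead of typing them by hand; the names b and s are introduced here only for illustration.

```r
X <- c(13, 18, 25, 58, 25, 31, 39, 42, 17, 35, 46, 22, 18, 20, 26, 14,
       33, 19, 20, 21)

# boxplot() invisibly returns the statistics it draws
b <- boxplot(X, main = "Belminuten Data", ylab = "Aantal Belminuten",
             frame = FALSE, ylim = c(10, 60), range = 3)
s <- b$stats[, 1]  # lower whisker, lower hinge, median, upper hinge, upper whisker

# plotmath: italic() for italics, [ ] for subscripts, == for an equals sign;
# bquote() substitutes the value wrapped in .()
text(x = 1.3, y = 60,   labels = bquote(italic(n) == .(length(X))))
text(x = 0.7, y = s[2], labels = bquote(Q[1] == .(s[2])))
text(x = 0.7, y = s[3], labels = bquote(Med == .(s[3])))
text(x = 0.7, y = s[4], labels = bquote(Q[3] == .(s[4])))
```

See ?plotmath for the full list of supported constructs.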

How to get the perfect "Before-After" graph with connected dots and paired U test using ggplot2?

My data looks like this:
mydata <- data.frame(ID = c(1, 2, 3, 5, 6, 7, 9, 11, 12, 13), #patient ID
t1 = c(37, 66, 28, 60, 44, 24, 47, 44, 33, 47), #evaluation before
t4 = c(33, 45, 27, 39, 24, 29, 24, 37, 27, 42), #evaluation after
sexe = c(1, 2, 2, 1, 1, 1, 2, 2, 2, 1)) #subset
I would like to do a simple before-after graph.
So far, I have managed to get a basic plot with the following code:
library(ggplot2)
ggplot(mydata) +
geom_segment(aes(x = 1, xend = 2, y = t1, yend = t4), size=0.6) +
scale_x_discrete(name = "Intervention", breaks = c("1", "2"), labels = c("T1", "T4"), limits = c(1, 2)) +
scale_y_continuous(name = "Var") + theme_bw()
I am facing multiple issues. Can you help me to:
1. add a black circle at the beginning and the end of every line? (geom_point() doesn't work)
2. make the lines smoother (they look pixelated, especially the second one)?
3. decrease the blank space on the left and right sides of the graph?
4. add the median for T1 and T4 (in red), link those points, compare them with a paired Mann-Whitney test and print the p-value on the graph?
I would prefer not to reformat my database to long format, as I have a lot of other variables and timepoints (not shown here).
I have read other posts, but the solutions provided look so complicated for something that seems simple (yet I can't do it...).
Huge thanks for your help!
I will update the graph as I make progress :)

Here is what I would do. Please feel free to ask questions about what's going on here.
library(tidyverse)
mydata <- data.frame(ID = c(1, 2, 3, 5, 6, 7, 9, 11, 12, 13), #patient ID
t1 = c(37, 66, 28, 60, 44, 24, 47, 44, 33, 47), #evaluation before
t4 = c(33, 45, 27, 39, 24, 29, 24, 37, 27, 42), #evaluation after
sexe = c(1, 2, 2, 1, 1, 1, 2, 2, 2, 1))
pval <- wilcox.test(x = mydata$t1, y = mydata$t4, paired = TRUE, exact = FALSE)$p.value %>% round(2)
df <- mydata %>%
  pivot_longer(2:3, names_to = "Time") %>% # Pivot into long-format
  mutate(sexe = as.factor(sexe),
         Time = as.factor(Time)) # Make factors
ggplot(df, aes(Time, value, color = sexe, group = ID)) +
  geom_point() +
  geom_line() +
  stat_summary(inherit.aes = FALSE, aes(Time, value),
               geom = "point", fun = "median", col = "red",
               size = 3, shape = 24, fill = "red"
  ) +
  annotate("text", x = 1.7, y = 60, label = paste('P-Value is', pval)) +
  coord_cartesian(xlim = c(1.4, 1.6)) +
  theme_bw()
Also be aware that, in long format, it is common for the other variables to simply be repeated on each row for a patient, so having extra columns is not a problem. See the example here:
mydata <- data.frame(ID = c(1, 2, 3, 5, 6, 7, 9, 11, 12, 13), #patient ID
t1 = c(37, 66, 28, 60, 44, 24, 47, 44, 33, 47), #evaluation before
t4 = c(33, 45, 27, 39, 24, 29, 24, 37, 27, 42), #evaluation after
sexe = c(1, 2, 2, 1, 1, 1, 2, 2, 2, 1),
var1 = c(1:10),
var2 = c(1:10),
var3 = c(1:10))
df <- mydata %>%
pivot_longer(2:3,names_to = "Time") %>% # Pivot into long-format
mutate(sexe = as.factor(sexe),
Time = as.factor(Time))

I can address issue (1), the black circles:
First, you should tidy your data so that each column holds one variable (at the moment the 'Var' values on the plot are stored in two columns, 't1' and 't4'). You can achieve this with the tidyr package.
library(tidyr)
mydata_long <- pivot_longer(mydata, c(t1, t4), names_to = "t")
Now creating points is easy, and the rest of the code becomes a lot clearer:
We can tell ggplot that we want 't' groups on x-axis, their values on y-axis and in case of lines, we want them separate for every 'ID'.
library(ggplot2)
ggplot(mydata_long) +
  geom_line(aes(x = t, y = value, group = ID)) + # plotting lines
  geom_point(aes(x = t, y = value)) + # plotting points
  labs(x = "Intervention", y = "Var") + # changing labels
  theme_bw()
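Since the question asks to avoid reshaping, here is a hedged sketch that stays in wide format: it keeps the asker's geom_segment() layer, adds one geom_point() layer per time point, and draws the medians with annotate(). It is an alternative to the two answers above, not taken from them. The names med_t1, med_t4 and p are illustrative, and the "paired U test" is taken to mean the Wilcoxon signed-rank test computed by wilcox.test(paired = TRUE).

```r
library(ggplot2)

mydata <- data.frame(ID = c(1, 2, 3, 5, 6, 7, 9, 11, 12, 13),
                     t1 = c(37, 66, 28, 60, 44, 24, 47, 44, 33, 47),
                     t4 = c(33, 45, 27, 39, 24, 29, 24, 37, 27, 42))

# Summary values computed directly from the wide columns
med_t1 <- median(mydata$t1)
med_t4 <- median(mydata$t4)
pval <- wilcox.test(mydata$t1, mydata$t4, paired = TRUE, exact = FALSE)$p.value

p <- ggplot(mydata) +
  geom_segment(aes(x = 1, xend = 2, y = t1, yend = t4)) +  # one line per patient
  geom_point(aes(x = 1, y = t1)) +                         # circles at T1
  geom_point(aes(x = 2, y = t4)) +                         # circles at T4
  annotate("segment", x = 1, xend = 2, y = med_t1, yend = med_t4,
           colour = "red") +                               # line linking the medians
  annotate("point", x = c(1, 2), y = c(med_t1, med_t4),
           colour = "red", size = 3) +                     # the medians themselves
  annotate("text", x = 1.5, y = 65,
           label = paste("p =", signif(pval, 2))) +
  scale_x_continuous(name = "Intervention", breaks = c(1, 2),
                     labels = c("T1", "T4"), limits = c(0.8, 2.2)) +
  scale_y_continuous(name = "Var") +
  theme_bw()
print(p)
```

Using a continuous x scale with limits close to 1 and 2 is one way to trim the blank space at the sides; saving with ggsave() at a higher dpi is a possible remedy for the pixelated look, although that depends on the graphics device in use.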

Adding a third line to a twoord plot

I have a twoord plot produced with the plotrix package and would like to add a horizontal line representing a particular value to it. The plot is all set up but I need help adding the line.
Here is some sample code:
fake <- matrix(c(1, 2, 3, 4, 5, 22, 30, 47, 98, 62, 20, 40, 10, 15, 15), nrow = 5)
fake <- as.data.frame(m)
horizontallineat <- 50
twrd.p <- twoord.plot(fake$V1,fake$V3,fake$V1,fake$V2, xlab="Bin",
lylim=c(0,100),rylim=c(0,100),type=c("bar","l"),
ylab="Exposure Percentage",rylab="Bin Average PP",
lytickpos=seq(0,100, by = 10),
rytickpos=seq(0,100, by = 10),
ylab.at=50,rylab.at=50,
main="Variable Name",
lcol=3,rcol=4)
This is the plot
Thank you in advance for any insight you are able to offer.

I am not sure whether this is what you want, but you can simply add a line with the base R lines() function. Use code like this:
library(plotrix) # added library
## your code
fake <- matrix(c(1, 2, 3, 4, 5, 22, 30, 47, 98, 62, 20, 40, 10, 15, 15), nrow = 5)
fake <- as.data.frame(fake) # changed "m" to "fake"
twrd.p <- twoord.plot(fake$V1,fake$V3,fake$V1,fake$V2, xlab="Bin",
lylim=c(0,100),rylim=c(0,100),type=c("bar","l"),
ylab="Exposure Percentage",rylab="Bin Average PP",
lytickpos=seq(0,100, by = 10),
rytickpos=seq(0,100, by = 10),
ylab.at=50,rylab.at=50,
main="Variable Name",
lcol=3,rcol=4)
## simple lines() function with x an y coordinates
## we'll add 2 lines for fun
## 1. dashed, thicker, and red
## 2. dots, thicker and black
lines(x = c(1, 2, 3, 4), y= c(40, 60, 40, 70), lty = 2, lwd = 2, col = "red")
lines(x = c(1.25 , 4.75), y = c(95, 25), lty = 3, lwd=2, col = "black")
This yields the following plot
Adding a short or long horizontal line should be simple now, I hope.
Please let me know whether this is what you had in mind.
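For the horizontal line the question asks about, a minimal sketch using base R's abline() might look like this. It assumes the line is meant on the left-axis scale; in this example both axes run from 0 to 100, so the position is the same on either. The line type and colour are arbitrary choices.

```r
library(plotrix)

fake <- as.data.frame(matrix(c(1, 2, 3, 4, 5, 22, 30, 47, 98, 62,
                               20, 40, 10, 15, 15), nrow = 5))
horizontallineat <- 50

twoord.plot(fake$V1, fake$V3, fake$V1, fake$V2, xlab = "Bin",
            lylim = c(0, 100), rylim = c(0, 100), type = c("bar", "l"),
            ylab = "Exposure Percentage", rylab = "Bin Average PP",
            lytickpos = seq(0, 100, by = 10),
            rytickpos = seq(0, 100, by = 10),
            ylab.at = 50, rylab.at = 50,
            main = "Variable Name",
            lcol = 3, rcol = 4)

# Horizontal line across the whole plot region at the chosen value
abline(h = horizontallineat, lty = 2, lwd = 2, col = "red")
```

For a shorter line, segments(x0, y0, x1, y1) with y0 = y1 = horizontallineat would limit it to a chosen x range.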

Optimizing add_trace() in a for loop?

I'm using the add_trace() function in a for loop to create lines for a 3d network graph in plotly's scatter3d mode. Each add_trace call draws an individual line between two nodes in the network. The method works, but with a large number of iterations, each individual iteration seems to slow down very quickly.
Example data can be downloaded here: https://gist.github.com/pravj/9168fe52823c1702a07b
library(igraph)
library(plotly)
G <- read.graph("karate.gml", format = c("gml"))
L <- layout.circle(G)
vs <- V(G)
es <- as.data.frame(get.edgelist(G))
Nv <- length(vs)
Ne <- length(es[1]$V1)
Xn <- L[,1]
Yn <- L[,2]
network <- plot_ly(type = "scatter3d", x = Xn, y = Yn, z = rep(0, Ne), mode = "markers", text = vs$label, hoverinfo = "text", showlegend = F)
for(i in 1:Ne) {
v0 <- es[i,]$V1
v1 <- es[i,]$V2
x0 <- Xn[v0]
y0 <- Yn[v0]
x1 <- Xn[v1]
y1 <- Yn[v1]
df <- data.frame(x = c(x0, x1), y = c(y0, y1), z = c(0, 0))
network <- add_trace(network, data = df, x = x, y = y, z = z, type = "scatter3d", mode = "lines", showlegend = F,
marker = list(color = '#030303'), line = list(width = 0.5))
}
This example is fairly quick, but when I include a few hundred edges or more, the individual iterations start to slow down radically. I tried different optimization methods (vectorisation etc.), but there seems to be no way around the slowness of the add_trace function itself.
Any suggestions?

The most efficient way to add many line segments in plotly is not to add each one as a separate trace, but to use a single trace that contains all the line segments. You can do this by constructing a data frame with the x, y coordinates of each node to be connected, with an NA row between consecutive line segments. Then use connectgaps=FALSE to break the trace into separate segments at each NA. You can see another example of this approach, applied to spaghetti plots, in this answer.
es$breaks <- NA
lines <- data.frame(node=as.vector(t(es)), x=NA, y=NA, z=0)
lines[which(!is.na(lines$node)),]$x <- Xn[lines[which(!is.na(lines$node)),]$node]
lines[which(!is.na(lines$node)),]$y <- Yn[lines[which(!is.na(lines$node)),]$node]
network <- plot_ly(type = "scatter3d", x = Xn, y = Yn, z = rep(0, Nv), # one z value per node
mode = "markers", text = vs$label, hoverinfo = "text",
showlegend = FALSE) %>%
add_trace(data=lines, x=~x, y=~y, z=~z, showlegend = FALSE, # formulas refer to columns of 'lines' (plotly >= 4.0)
type = 'scatter3d', mode = 'lines+markers',
marker = list(color = '#030303'), line = list(width = 0.5),
connectgaps=FALSE)
Reproducible data for this question
For convenience, here are the data for this question. The OP required downloading a .gml file from github, and installing library(igraph) to process the data into these.
es <- structure(list(
V1 = c(1, 1, 2, 1, 2, 3, 1, 1, 1, 5, 6, 1, 2, 3, 4, 1, 3, 3, 1, 5, 6, 1, 1, 4, 1, 2, 3, 4, 6, 7, 1, 2, 1, 2,
1, 2, 24, 25, 3, 24, 25, 3, 24, 27, 2, 9, 1, 25, 26, 29, 3, 9, 15, 16, 19, 21, 23, 24, 30, 31, 32, 9, 10, 14, 15, 16, 19, 20,
21, 23, 24, 27, 28, 29, 30, 31, 32, 33),
V2 = c(2, 3, 3, 4, 4, 4, 5, 6, 7, 7, 7, 8, 8, 8, 8, 9, 9, 10, 11, 11, 11, 12, 13, 13,
14, 14, 14, 14, 17, 17, 18, 18, 20, 20, 22, 22, 26, 26, 28, 28, 28, 29, 30, 30, 31, 31, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33,
33, 33, 33, 33, 33, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34)),
.Names = c("V1", "V2"), row.names = c(NA, -78L), class = "data.frame")
theta <- seq(0,2,length.out=35)[1:34]
Xn <- cospi(theta)
Yn <- sinpi(theta)
Nv <- NROW(Xn)
Ne <- NROW(es)
vs <- data.frame(label = as.character(1:Nv))
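The NA-interleaving step can be hard to picture, so here is a minimal base R sketch on a toy network. The three-node data below are made up for illustration and are not the karate graph; seg_x, seg_y and lines_df are illustrative names.

```r
# Toy network (hypothetical): three nodes, two edges
Xn <- c(0, 1, 0)
Yn <- c(0, 0, 1)
es <- data.frame(V1 = c(1, 2), V2 = c(2, 3))

# rbind() stacks start, end and NA as rows (one column per edge);
# as.vector() then reads the matrix column by column, giving
# start, end, NA, start, end, NA, ...
seg_x <- as.vector(rbind(Xn[es$V1], Xn[es$V2], NA))
seg_y <- as.vector(rbind(Yn[es$V1], Yn[es$V2], NA))

lines_df <- data.frame(x = seg_x, y = seg_y, z = 0)
print(lines_df)
```

Each edge contributes three rows, so the single trace has 3 * Ne rows no matter how many edges there are, and the NA rows are where the line is broken.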

Area plot with missing values in base R

I want to draw an area plot for which the base of the polygon is zero and the data lines are connected to the base by vertical segments at every data break (that is the beginning, the end and possible NAs/NaN).
I drew this:
I had to force vertical downward segments where the series is interrupted by NAs, and I did this by transforming the NAs into 0s. But that doesn't produce vertical segments; it produces polygon lines that slope to the following 0s. I solved the problem for the beginning and the end of the series by adding a y = 0 point at both ends of the series.
But this doesn't fix the problem if the NAs are inside the series.
Any ideas?
Here is some example code (different image):
pollen <- c(45, 257.4, 24.67, 54.6, 89.4, 297, 471.25, 1256.5, 312.25, 969.2, 787.5, 425, NaN, 76.6, 42.67, 38.5, 20.2, 5.67, 15.8, 13.2, 11, 6.25, 6.67, 2.3, 0.5, 30.8, 3.75, 3, 2, 2.2, 3.25, 4.5, 9.6, 15.8, 200.2, NaN)
weeks.vec <- c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40)
plot.ts(y = pollen, x = weeks.vec, col = 'red', ylab = 'Pollen', xlab = 'Weeks', lwd = 3, xy.labels = F, xy.lines = T)
pollen[is.na(pollen)] <- 0
poly.y <- c(0,pollen,0)
poly.x <- c(weeks.vec[1], weeks.vec, weeks.vec[length(weeks.vec)])
polygon(y = poly.y, x = poly.x, density = NA,border = NA, col = rgb(1,0,0, .3))

I'd use ggplot2:
pollen <- c(45, 257.4, 24.67, 54.6, 89.4, 297, 471.25, 1256.5, 312.25, 969.2, 787.5, 425, NaN, 76.6, 42.67, 38.5, 20.2, 5.67, 15.8, 13.2, 11, 6.25, 6.67, 2.3, 0.5, 30.8, 3.75, 3, 2, 2.2, 3.25, 4.5, 9.6, 15.8, 200.2, NaN)
weeks.vec <- c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40)
DF <- data.frame(pollen, weeks.vec)
library(ggplot2)
ggplot(DF, aes(x = weeks.vec, y = pollen)) +
geom_ribbon(aes(ymin = 0, ymax = pollen),
colour = NA, fill = "red", alpha = 0.3) +
geom_line(colour = "red") +
geom_point(colour = "red", size = 3) +
xlab("Week") + ylab("Pollen") +
theme_bw()
But if you must use base plots:
plot.ts(y = pollen, x = weeks.vec, col = 'red',
ylab = 'Pollen', xlab = 'Weeks', lwd = 3,
xy.labels = F, xy.lines = T)
g <- cumsum(!is.finite(pollen))
for (i in unique(g)) {
y <- pollen[g == i]
x <- weeks.vec[g == i]
x <- x[is.finite(y)]
y <- y[is.finite(y)]
x <- c(x, rev(x))
y <- c(y, y * 0)
polygon(y = y, x = x, density = NA,border = NA, col = rgb(1,0,0, .3))
}
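The grouping trick in the base R version relies on cumsum(!is.finite(...)) increasing by one at every missing value, so each run of finite values ends up with its own group id. Here is a small sketch on a made-up vector (y, g, ok and runs are illustrative names):

```r
y <- c(1, 2, NaN, 3, 4, NaN)

# TRUE at each missing value; the running sum gives a new id after every gap
g <- cumsum(!is.finite(y))
print(g)

# Keep only the finite values and split them by group id:
# each list element is one uninterrupted run, i.e. one polygon
ok <- is.finite(y)
runs <- split(y[ok], g[ok])
print(runs)
```

Drawing one polygon per run, closed along y = 0, is what gives the vertical edges at each data break.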
