Issues with axis labeling on boxplots in R

Hopefully this is a quick fix. I am trying to make boxplots of river nutrient concentrations using R code written by the person previously in my position (I am not very experienced with R, and we use it only for this). The problem is that the axes of the output boxplot show multiple overlapping labels, some of which seem to come from another part of the code that I thought did not control axis labels. The original code is shown below (the working directory is already set and the csv files are imported, and I know that part works), and the resulting boxplot is shown in image 1.
Edit: Code below
png(filename="./TP & TN Plotting/Plots/TN Concentration/Historical TN Concentrations Zoomed to Medians.png",
width=10,
height=4,
unit="in",
res=600)
par(mar=c(5,5,3,1),
cex=.75)
tnhistconc<-(boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010],
data=TNhist))
boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010],
data=TNhist,
ylim=c(0,3000),
xaxt="n"),
at=c(1:3,5:8,10:(length(tnhistconc$n)+2)))
axis.break(axis=1,breakpos=c(4),style="slash")
axis.break(axis=1,breakpos=c(9),style="slash")
text(c(1:3,5:8,10:(length(tnhistconc$n)+2)),
-50,
paste("n=",
tnhistconc$n),
cex=0.8)
title(ylab="TN Concentration (ppb)",
xlab="Year")
title(main=paste("Historical (1998 - 2000), (2005 - 2008) + UMass (2012 -
",max(TNhist$Year),") TN Concentration"))
dev.off()
I tried adding ,xlab="",ylab="" after xlab="Year" towards the bottom, since this fixed the issue in other similar sections of boxplot code (although there I had to add it to a different part of the code, see image 2; I also tried it after xaxt="n" as in image 2 and got the same result). It fixes the overlapping text, but the axis labels are still not what I want them to be ("Year" and "TN Concentration (ppb)"), as shown in image 3.
So, does anyone potentially know of a simple fix that might get rid of these unwanted labels and replace them with the correct ones? Am I missing something basic? The same original code seemed to work fine in the past before I was doing this (for 2018 data), and the spreadsheets the data is being imported from are the same, same setup and everything. Many thanks in advance!
Edit: I have a sample dataset which is just the last 2 years of data. See here: https://docs.google.com/spreadsheets/d/10oo9w-IzXkLWdY10A9gHYhDH67MeSibBpc2q67L6o88/edit?usp=sharing
Image 1: Original code result
Image 2: How this code fixed other similar issues
Image 3: Partially fixed result based on edit

I don't know if it is a typo, but you have an extra parenthesis after xaxt = "n".
Maybe you can try something like this:
library(plotrix)  # provides axis.break()

png(filename="./TP & TN Plotting/Plots/TN Concentration/Historical TN Concentrations Zoomed to Medians.png",
width=10, height=4, unit="in", res=600)
par(mar=c(5,5,3,1), cex=.75)
# First call is only used to collect the per-year sample sizes (tnhistconc$n)
tnhistconc<-(boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010], data=TNhist))
# Second call draws the plot without the default x axis and axis titles
boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010],
data=TNhist,
ylim=c(0,3000),
xaxt="n", ylab = "", xlab = "",
at=c(1:3,5:8,10:(length(tnhistconc$n)+2)))
# Mark the gaps between the groups of years
axis.break(axis=1,breakpos=c(4),style="slash")
axis.break(axis=1,breakpos=c(9),style="slash")
# Print the sample size under each box
text(c(1:3,5:8,10:(length(tnhistconc$n)+2)),
-50,
paste("n=",
tnhistconc$n),
cex=0.8)
# Add the axis titles and main title explicitly
title(ylab="TN Concentration (ppb)",
xlab="Year",
main=paste("Historical (1998 - 2000), (2005 - 2008) + UMass (2012 -
",max(TNhist$Year),") TN Concentration"))
dev.off()
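xaxt = "n" suppresses the default x axis (axis.break() then only adds the break marks). Setting xlab and ylab to "" suppresses the default x and y axis titles, which are most likely the source of the overlapping text because boxplot()'s formula method labels the axes with the formula terms by default; the titles you want are then set by title().
Hopefully this works.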
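A minimal sketch of the same idea on made-up data (the data frame dat and its values are hypothetical, not the TNhist data): blank the default titles in boxplot(), then add the wanted ones with title().

```r
set.seed(1)
# Hypothetical stand-in for the real data: two years, 20 samples each
dat <- data.frame(
  Year = rep(c(2018, 2019), each = 20),
  Conc_ppb = c(rnorm(20, mean = 800, sd = 150), rnorm(20, mean = 900, sd = 150))
)

# Empty xlab/ylab stop boxplot() from writing its own axis titles
bp <- boxplot(Conc_ppb ~ Year, data = dat, xlab = "", ylab = "")

# Add the titles explicitly, so nothing is drawn twice
title(xlab = "Year", ylab = "TN Concentration (ppb)")

# The returned list carries the group names and the sample size per box
bp$names
bp$n
```

The value returned by boxplot() is what the code above stores in tnhistconc to print the "n=" labels.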
EDIT: Using ggplot2
Your data frame is already in long format, so it can be plotted with ggplot2 in a few lines. Here the dataset is named df:
library(ggplot2)
ggplot(df, aes(x = as.factor(Year), y = Conc_ppb))+
geom_boxplot()+
labs(x = "Year", y = "TN Concentration (ppb)",
title = paste("Historical (1998 - 2000), (2005 - 2008) + UMass (2012 -
",max(df$Year),") TN Concentration"))


how to mimic histogram plot from flowjo in R using flowCore?

I'm new to flowCore + R. I would like to mimic a histogram plot after gating that can be manually done in FlowJo software. I got something similar but it doesn't look quite right because it is a "density" plot and is shifted. How can I get the x axis to shift over and look similar to how FlowJo outputs the plot? I tried reading this document but couldn't find a plot similar to the one in FlowJo: howtoflowcore Appreciate any guidance. Thanks.
code snippet:
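library(flowCore)
library(ggcyto)  # provides ggcyto() and ggcyto_par_set()
parentpath <- "/parent/path"
subfolder <- "Sample 1"
# file.path() inserts the path separator; full.names = TRUE returns complete paths
fcs_files <- list.files(file.path(parentpath, subfolder), pattern = "\\.fcs$", full.names = TRUE)
fs <- read.flowSet(fcs_files)
# Rectangular gate on forward and side scatter
rect.g <- rectangleGate(filterId = "main",list("FSC-A" = c(1e5, 2e5), "SSC-A" = c(3e4,1e5)))
fs_sub <- Subset(fs, rect.g)
# Density plot of one channel for the 15th sample
p <- ggcyto(fs_sub[[15]], aes(x= `UV-379-A`)) +
geom_density(fill='black', alpha = 0.4) +
ggcyto_par_set(limits = list(x = c(-1e3, 5e4), y = c(0, 6e-5)))
p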
FlowJo output:
R FlowCore output:
The reason for the "shift" is that the x axis in the FlowJo graph is logarithmic (base 10). To achieve the same result in R, add
+ scale_x_log10()
after the existing code. This might interact badly with the axis limits you've set (a log scale cannot show the negative lower limit of -1e3), so bear that in mind.
To make the y-axis "count" rather than density, you can change the first line of your ggcyto() call to:
aes(x= `UV-379-A`, y = after_stat(count))
Let me know if that works - I don't have your data to hand so that's all from memory!
Purely aesthetic changes are relatively easy to look up.
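A rough sketch of both suggestions together, using plain ggplot2 on simulated data rather than ggcyto on a real flowFrame. The data frame cells and its log-normal values are made up for illustration, and after_stat() assumes ggplot2 3.3.0 or later:

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for one gated fluorescence channel
cells <- data.frame(intensity = rlnorm(5000, meanlog = 7, sdlog = 1))

p <- ggplot(cells, aes(x = intensity, y = after_stat(count))) +
  geom_density(fill = "black", alpha = 0.4) +  # y is scaled to counts, not density
  scale_x_log10() +                            # base-10 logarithmic x axis
  labs(x = "Intensity (log10 scale)", y = "Count")

print(p)
```

Whether the result matches FlowJo exactly depends on the axis transform FlowJo applied, which this sketch does not try to reproduce.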

Q: How to combine two types of lines using ggplot?

I am trying to plot the following graph:
This plot was made using a command in R; however, I need to change the x-axis. As you can see, the x-axis starts at 0 and finishes at 46. I want the x-axis to start at 1972 and finish at 2018, i.e. seq(1972, 2018). The data used for this graph is the following:
For regime 1:
structure(c(0.996336942021931, 0.982749831853788, 0.25257000136794,
0.707797489518183, 0.339372705184362, 0.999209103898399, 0.348786927897612,
0.821500770877589, 0.569473419352121, 0.544946043345147, 0.15347485404411,
0.987921203799956, 0.00247541125926418, 0.999925918450173, 0.996940249283586,
0.0141234625702467, 0.105466117156579, 0.999992944275275, 0.991723355647765,
0.0958472062267191, 0.0362729940372193, 0.999999790503447, 0.0750715811130157,
0.999975836828039, 0.998991768987905, 0.327943641159186, 5.05723080618291e-05,
0.999999999869691, 0.995538324405397, 0.123355227931813, 0.999776636825943,
0.00875781169836433, 0.696284480883101, 0.854839147672286, 0.113243492249383,
0.00984853715078062, 0.442061195271808, 0.999959859676686, 0.0249739384218217,
0.715262186931097, 0.269481397703521, 0.708458897302807, 0.0444979324520481,
0.000133950914911277, 0.997976154782607, 0.191386380576805, 0.99775339928206,
0.97921531595208, 0.27690132186733, 0.671995422154737, 0.458800347851363,
0.999155966774432, 0.417000082142666, 0.838969001100901, 0.576424593247709,
0.439169303472056, 0.227227711549776, 0.978527102362448, 0.00408165810824898,
0.999955057843957, 0.994643622809094, 0.00847570472458959, 0.163000467960203,
0.999995704786608, 0.987482614312069, 0.0569007267419926, 0.0585312256476362,
0.999999671060746, 0.118213072794827, 0.99998536150034, 0.998897081324845,
0.212968271334585, 8.35316288758489e-05, 0.999999999920876, 0.993537683112221,
0.188538497918178, 0.999604116439039, 0.00905848219612739, 0.769430430615986,
0.794457999021984, 0.0665707154963958, 0.00776458004359329, 0.5668500474175,
0.999931021995446, 0.0265573724408095, 0.661699294173752, 0.296009575623967,
0.587638579198176, 0.0251758869152202, 0.000220356219397782,
0.997352716237698, 0.191386380576805), .Dim = c(46L, 2L))
For regime 2:
structure(c(0.00366305797806813, 0.0172501681462116, 0.74742999863206,
0.292202510481817, 0.660627294815638, 0.000790896101601132, 0.651213072102388,
0.178499229122411, 0.430526580647879, 0.455053956654853, 0.846525145955889,
0.0120787962000438, 0.997524588740736, 7.40815498269273e-05,
0.00305975071641352, 0.985876537429753, 0.894533882843421, 7.05572472485335e-06,
0.00827664435223535, 0.904152793773281, 0.963727005962781, 2.09496553467159e-07,
0.924928418886985, 2.41631719608902e-05, 0.00100823101209502,
0.672056358840815, 0.999949427691938, 1.30308744399533e-10, 0.00446167559460289,
0.876644772068187, 0.00022336317405711, 0.991242188301636, 0.303715519116899,
0.145160852327714, 0.886756507750617, 0.990151462849219, 0.557938804728191,
4.01403233139628e-05, 0.975026061578178, 0.284737813068903, 0.730518602296479,
0.291541102697193, 0.955502067547952, 0.999866049085089, 0.00202384521739295,
0.808613619423195, 0.00224660071793958, 0.0207846840479196, 0.72309867813267,
0.328004577845263, 0.541199652148637, 0.000844033225568314, 0.582999917857334,
0.161030998899099, 0.423575406752291, 0.560830696527944, 0.772772288450224,
0.0214728976375518, 0.995918341891751, 4.49421560426429e-05,
0.00535637719090558, 0.99152429527541, 0.836999532039797, 4.29521339242403e-06,
0.0125173856879312, 0.943099273258007, 0.941468774352364, 3.28939253926857e-07,
0.881786927205173, 1.46384996596921e-05, 0.00110291867515508,
0.787031728665414, 0.999916468371124, 7.91243531099699e-11, 0.00646231688777926,
0.811461502081822, 0.00039588356096145, 0.990941517803873, 0.230569569384014,
0.205542000978016, 0.933429284503604, 0.992235419956407, 0.4331499525825,
6.89780045536876e-05, 0.973442627559191, 0.338300705826248, 0.703990424376033,
0.412361420801824, 0.97482411308478, 0.999779643780602, 0.00264728376230197,
0.808613619423195), .Dim = c(46L, 2L))
I know that the red line can be plotted using geom_line, but I do not know how to plot the black bars (maybe using geom_bar?), or how to merge the two plots.
Thanks for your help
It's actually plotted using base R (good old times). Using your first dataset (regime one, stored here as Regime1):
plot(Regime1[,1],type="h",xaxt="n",ylab="",cex.axis=0.6,xlab="",xlim=c(0,46))
lines(Regime1[,2],col="red")
mtext("Smoothed Probabilities",2,padj=-5,col="red",cex=0.7)
mtext("Fitted Probabilities",4,padj=1,cex=0.7)
axis(side=1,at=c(0,20,46),labels=c(1972,1992,2018))
The x values are just the row indices, plotted on xlim = c(0, 46), so you turn off the default x-axis ticks using xaxt="n" and then use axis() to put ticks at 0, 20 and 46 with the labels 1972, 1992 and 2018.
The result also depends on your plotting device, so you might have to change the padj values (used in the mtext() calls above) to adjust the side labels. I guess you can check out a post like this for base R plotting functions.
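As a sketch of the index-to-year mapping described above, assuming row 1 corresponds to 1973 and row 46 to 2018 (the offset of 1972 follows the axis() call above; probs is a random stand-in for Regime1[, 1]):

```r
set.seed(42)
probs <- runif(46)        # hypothetical stand-in for Regime1[, 1]
idx   <- seq_along(probs) # row indices 1..46
years <- idx + 1972       # assumed mapping: index 0 would be 1972

# Choose the years to label, then convert them back to index positions
tick_years <- seq(1975, 2015, by = 10)

plot(idx, probs, type = "h", xaxt = "n", xlab = "", ylab = "", xlim = c(0, 46))
axis(side = 1, at = tick_years - 1972, labels = tick_years)
```

Computing the tick positions from the years keeps positions and labels in step if the year range changes.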
In ggplot2, you can create a data.frame with an index column holding the years you need, and call geom_segment() to draw the vertical lines:
library(ggplot2)
Regime1 = data.frame(Regime1)
colnames(Regime1) = c("Fitted","Smoothed")
Regime1$index = 1:nrow(Regime1)+1972
ggplot(Regime1,aes(x=index))+
geom_segment(aes(xend=index,y=0,yend=Fitted,col="Fitted")) +
geom_line(aes(y=Smoothed,col="Smoothed")) + theme_minimal() +
scale_color_manual(values=c("black","red"))
For a ggplot2 solution, you are going to need a data.frame or tibble with 4 columns (Regime, Year, Smoothed, and Fitted). Based on the data you provided, this would have 92 rows.
Now assuming you use those column names (and storing your data into the variable example.dat), a ggplot2 solution is
library(ggplot2)
# Passing the data directly avoids needing the %>% pipe from another package
ggplot( example.dat, aes(x=Year) ) +
geom_line( aes(y=Smoothed), color="red" ) +
geom_linerange( aes(ymax=Fitted), ymin=0 ) +
facet_wrap( ~ Regime, ncol=1 )
Then you might need to adjust some of the scales to get the best plot.
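One possible way to build such a data frame in base R is sketched below. The matrices here are random stand-ins for the two 46 x 2 matrices in the question. The column order (Fitted first, Smoothed second) and the year range 1973 to 2018 are assumptions borrowed from the other answer, and to_long is a hypothetical helper name.

```r
set.seed(1)
# Hypothetical stand-ins for the two 46 x 2 matrices from the question
Regime1 <- matrix(runif(92), nrow = 46, ncol = 2)
Regime2 <- 1 - Regime1  # in the posted data the two regimes sum to 1

years <- 1973:2018      # assumed: one year per row

# Turn one matrix into long format with a Regime label
to_long <- function(m, regime) {
  data.frame(
    Regime   = regime,
    Year     = years,
    Fitted   = m[, 1],  # assumed column order
    Smoothed = m[, 2]
  )
}

example.dat <- rbind(
  to_long(Regime1, "Regime 1"),
  to_long(Regime2, "Regime 2")
)

str(example.dat)
```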

R multi boxplot in one graph with value (quantile)

How can I create multiple boxplots in one graph with the values shown in R?
Currently I'm using this code:
boxplot(Data_frame[ ,2] ~ Data_frame[ ,3])
I tried adding this:
boxplot(Data_frame[ ,2] ~ Data_frame[ ,3])
text(y=fivenum(Data_frame$x), labels =fivenum(Data_frame$x), x=1.25)
But only the first boxplot shows values. How can I show the values for all boxplots in one graph?
Thank you so much!
As far as I understand your question (it is not clear how the fivenum summary should be displayed) here is one solution. It presents the summary using the top axis.
x <- data.frame(
Time = c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3),
Value = c(5,10,15,20,30,50,70,80,100,5,7,9,11,15,17,19,17,19,100,200,300,400,500,700,1000,200))
boxplot(x$Value ~ x$Time)
fivenums <- aggregate(x$Value, by=list(Time=x$Time), FUN=fivenum)
labels <- apply(fivenums[,-1], 1, function(x) paste(x, collapse = ", "))
axis(3, at=fivenums[,1],labels=labels, las=1, col.axis="red")
Of course you can additionally play with the font size or rotation for this summary. Moreover you can break the line in one place, so the label will have smaller width.
Edit
In order to get what you posted in the comment below, you can add
text(x = 3 + 0.5, y = fivenums[3,-1], labels=fivenums[3,-1])
and you will get the five values printed next to the third boxplot;
however, it won't be readable for the other boxplots.
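One way to label every box, sketched with the same example data, is to loop over the statistics that boxplot() returns. A logarithmic y axis is used here so the small and large groups can share one plot, although closely spaced values may still crowd. Note that bp$stats holds whisker ends rather than the raw minimum and maximum, so it can differ from fivenum() when there are outliers.

```r
x <- data.frame(
  Time  = rep(1:3, times = c(9, 9, 8)),
  Value = c(5, 10, 15, 20, 30, 50, 70, 80, 100,
            5, 7, 9, 11, 15, 17, 19, 17, 19,
            100, 200, 300, 400, 500, 700, 1000, 200)
)

# Extra room on the right (xlim) so the labels of the last box fit
bp <- boxplot(Value ~ Time, data = x, log = "y", xlim = c(0.5, 3.9))

# bp$stats has one column per box:
# lower whisker, lower hinge, median, upper hinge, upper whisker
for (i in seq_len(ncol(bp$stats))) {
  text(x = i + 0.4, y = bp$stats[, i], labels = bp$stats[, i],
       pos = 4, cex = 0.7)
}
```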

R Plot Adds Extra Unwanted Line

I'm generating a simple line plot in R, but it adds an unwanted straight horizontal line to the plot. It happens in all of my line plots. I have tried Google, but it only gives me instructions on how to add an extra line, not why this is happening. I am using RStudio 0.98.1028 on Mac OS X Yosemite.
plot(data2$interval,data2$steps,main="Plot of Average Activity",
xlab = "Interval", type="l", ylab="Average steps taken")
I guess the problem is with your data. You might have rows at the end of the data frame that "return" to the origin. Here you have a reproducible example:
data2 <- data.frame(interval = 1:200, steps = rnorm(200, 50, 20))
data2[1,2] <- 0
data2[200,2] <- 0
data2[201, ] <- c(0, 0)
plot(data2$interval,data2$steps,main="Plot of Average Activity",
xlab = "Interval", type="l", ylab="Average steps taken")
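If stray or out-of-order rows are indeed the cause (an assumption, since the original data is not shown), sorting by the x variable or dropping the offending rows before plotting should remove the extra line. A sketch using the same kind of simulated data:

```r
set.seed(1)
data2 <- data.frame(interval = 1:200, steps = rnorm(200, 50, 20))
data2[201, ] <- c(0, 0)  # stray row that sends the line back to the origin

# type = "l" connects points in row order, so check that x is increasing
is.unsorted(data2$interval)

# Option 1: sort by the x variable before plotting
clean <- data2[order(data2$interval), ]

# Option 2 (alternative): drop the stray row instead
# clean <- data2[data2$interval > 0, ]

plot(clean$interval, clean$steps, main = "Plot of Average Activity",
     xlab = "Interval", type = "l", ylab = "Average steps taken")
```

If the same interval values repeat (for example one block of rows per day), sorting alone will not help and the data would need to be averaged per interval first.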
please vote if the answer is fine with you :)

plot labels overlap - how to expand scale of graph sheet?

Setup/Problem:
I have created a simple scatter plot using ggplot2 library and the qplot() function in RStudio.
Issue:
The issue is that the labels overlap when I create the plot.
Question:
Is there a simple way to expand the graph sheet to stop the plot labels overlapping?
Is there a simple way to stop the labels being cut off by the edge of the graph?
I do not want to remove labels. My sense would be to expand the sheet size but I cannot seem to find a way to do that. Any help would be much appreciated.
Research so far
I have investigated the wordcloud library as an alternative but hit the same issue.
I have investigated using the scale_x_continuous(expand = c(.3, .3)) command which does allow me to expand the sheet to address the edge issue but I am looking to see if a better solution exists.
I have read through the ggplot2 manual pages but have failed to find a clean solution. I felt it is time to ask for some help and a few pointers to a solution. If I find a solution I will post it.
Example output (data file below)
Code
library(ggplot2)
library(grid)
td3 <- read.csv("td3.csv")
p <-qplot(X,Y, xaxs = "i", yaxs = "r", las = 1, data=td3, shape=as.factor(Type), label=Identifier, asp = 1)
p <- p + scale_x_continuous(expand = c(.3, .3))
p + geom_text(aes(colour=factor(Type)), angle = 30, size=4, hjust=-0.1, panel.margin = unit(50, "lines"))
Test Data
Identifier,X,Y,,Type
1st Reference Long Title,5,280,,Super fit
2nd Reference Long Title,1,60,,fit
3rd Reference Long Title,1,60,,fit
4th Reference Long Title,3,100,,fit
5th Reference Long Title,1,14,,unfit
6th Reference Long Title,1,48,,fit
7th Reference Long Title,1,48,,fit
8th Reference Long Title,10,80,,fit
9th Reference Long Title,1,24,,unfit
10th Reference Long Title,1,80,,fit
11th Reference Long Title,1,36,,unfit
12th Reference Long Title,1,10,,unfit
13th Reference Long Title,3,60,,fit
14th Reference Long Title,3,120,,fit
15th Reference Long Title,3,80,,fit
16th Reference Long Title,10,400,,Super fit
17th Reference Long Title,5,360,,Super fit
18th Reference Long Title,2,5,,unfit
You could either "increase the size of the canvas" before issuing the plotting command, see for example ?png or ?jpeg.
Alternatively you could use ggsave(); see "R plot: size and resolution".
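A minimal sketch of the first option with base graphics, assuming the png device is available on your system. The file name, dimensions and the three sample points (taken from the test data) are illustrative only:

```r
out_file <- file.path(tempdir(), "labels_wide.png")

# A larger canvas gives long labels more room; size is in inches at 150 dpi
png(out_file, width = 12, height = 8, units = "in", res = 150)

pts <- data.frame(
  X = c(5, 10, 10),
  Y = c(280, 80, 400),
  Identifier = c("1st Reference Long Title",
                 "8th Reference Long Title",
                 "16th Reference Long Title")
)

# Extra space on the right of the x range keeps labels inside the plot
plot(pts$X, pts$Y, xlim = c(0, 16), ylim = c(0, 450), xlab = "X", ylab = "Y")
text(pts$X, pts$Y, labels = pts$Identifier, pos = 4, cex = 0.8)

dev.off()
```

For a ggplot2 object, ggsave() takes width and height arguments that play the same role.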
