Struggling to add legend to box plot in R - r

I am a relatively new R user, and I seem to be stuck on something that should be fairly easy; I just can't find the problem in my code. I am trying to add a legend to a simple box plot, but I cannot get it to line up correctly without the entries overlaying each other.
My box plot:
boxplot(OS, main='Computer Users Surveyed', xlab='Program Used', ylab= "Seconds (s)", col=c('blue', 'gold1'))
Then when I add a legend:
legend("topright", c("linux", "windows"), border="black", fill = "blue", "gold1")
All it does is show me a blue square with the words gold1 - instead of double stacking the Linux and windows groups with the corresponding colors.

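I think you made a simple mistake by not concatenating the fill colors. Without c(), "gold1" is passed as a separate argument and ends up as the legend text instead of a second fill color.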
Mock data:
OS <- data.frame(
x = rnorm(100),
y = runif(100)
)
boxplot(OS, main='Computer Users Surveyed', xlab='Program Used', ylab= "Seconds (s)", col=c('blue', 'gold1'), frame = F)
legend("topright", c("linux", "windows"), border="black", fill = c("blue", "gold1"))

Related

Two histograms on one plot without overlap

I am trying to plot two sets of data on one histogram, but I don't want the bars to overlap, just to sit next to each other in the same plot. Currently I am using this code:
plot(baxishist1,freq=FALSE, xlab = 'B-Axis (mm)', ylab = 'Percent of Sample', main = 'Distribution of B-Axis on Moraine 1', ylim=c(0,30),breaks=seq(25,60,1), col='blue')
par(new=T)
plot(baxishist2,freq=FALSE, xlab = 'B-Axis (mm)', ylab = 'Percent of Sample', main = 'Distribution of B-Axis on Moraine 2', ylim=c(0,30),breaks=seq(25,60,1), col='red')
The result is that the bars of the two histograms overlap.
Can anyone help me to make the bars to be in the same bins but not overlap so that I can see both histograms?
You can make this a little easier to interpret, by using transparent colors.
Let's first generate some data:
a <- rnorm(100)
b <- rnorm(100, mean=3)
And now plot the histograms:
hist(a, col=rgb(1,0,0,0.5))
hist(b, col=rgb(0,1,0,0.5), add=T)
As you can see, both are now somewhat visible, but we would still have to adjust the x-axis manually to accommodate both distributions. In any case, it's still not nice to read or interpret, so I would rather plot two separate histograms, a boxplot or a violin plot.
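If the bars really should sit next to each other in shared bins, one option is to compute both histograms on the same breaks without plotting them and hand the counts to barplot with beside = TRUE. This is only a minimal sketch with made-up data: a, b, breaks and the legend labels are assumptions standing in for the two samples in the question.

```r
# Made-up data standing in for the two samples (assumption)
set.seed(42)
a <- rnorm(100, mean = 40, sd = 5)
b <- rnorm(100, mean = 45, sd = 5)

# Shared breaks that cover both samples
breaks <- pretty(range(c(a, b)), n = 15)

# Compute the histograms without drawing them
ha <- hist(a, breaks = breaks, plot = FALSE)
hb <- hist(b, breaks = breaks, plot = FALSE)

# One row per sample, one column per bin, expressed as percent of each sample
counts <- rbind(sample1 = ha$counts, sample2 = hb$counts)
pct <- 100 * counts / rowSums(counts)

# beside = TRUE puts the two bars of each bin next to each other
barplot(pct, beside = TRUE,
        names.arg = head(breaks, -1),   # label each bin by its lower edge
        col = c("blue", "red"),
        xlab = "B-Axis (mm)", ylab = "Percent of Sample",
        legend.text = c("Sample 1", "Sample 2"))
```

Because both histograms use the same breaks, each pair of bars refers to the same bin, so they can be compared directly.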

y axis labeling in R and how to change x-axis to specific increments in R

I would like to create a plot of this data, with x-axis increments of 500000 and with sample IDs on the y-axis. The following code works to create the plot, but the y-axis labels don't work, and I am unsure how to code the x-axis ticks. Also, I had to add headings manually to the data file (and then add header = TRUE when I assigned d) to get the code to work. Since I use setNames, I shouldn't have had to add the column titles, should I?
d = read.delim("n_reads_per_sample.tsv", header = TRUE, sep = "\t")
xticks <- ( ? increments of 500000 to xmax ? )
dotchart(
sort(setNames(d$n_reads, d$X.sample)),
xlim = c(0, at = xticks, 1 max(d$n_reads)),
labels = dimnames(d[[1]])
,
main = "reads per sample",
xlab = "number of reads",
ylab = "sample"
)
In case the link doesn't work, this is what the file looks like.
x.sample n_reads
LT-145 3193621
LT-323 786578
LT-458 485543
LT-500 3689123
LT-95 3308764
LT-367 765972
LT-205 2090226
LT-245 10238727
I can't get at your full data right now, so I am just using your sample in the question.
Not sure what you mean that the y-axis labels don't work. They seem OK to me. You can get the x-axis labels that you want by suppressing the x-axis produced by dotchart and then making your own axis using the axis function. That requires a little fancy footwork with par. Also, unless you stretch out your graphics window, there will not be enough room to print all of the axis labels. I reduced the font size and stretched the window to get a readable graph.
UpperLimit <- ceiling(max(d$n_reads)/500000)*500000
xticks <- seq(0,UpperLimit, 500000)
par(xaxt = "n")
dotchart(
sort(setNames(d$n_reads, d$X.sample)),
xlim=c(0, UpperLimit),
# no labels argument needed: dimnames(d[[1]]) is NULL, so dotchart uses the names set by setNames()
main = "reads per sample",
xlab = "number of reads",
ylab = "sample"
)
par(xaxt = "s")
axis(1, at=xticks, cex.axis=0.7)
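On the side question about headers: setNames() only attaches names to the vector passed to dotchart; it does not name the columns of d. A file without a header line can still be read by giving the column names to read.delim. A minimal sketch, using the sample rows from the question inline (the column names "sample" and "n_reads" and the variable names rows, reads and upper are my own choices, not from the original file):

```r
# Sample rows from the question, without a header line
rows <- c("LT-145\t3193621", "LT-323\t786578", "LT-458\t485543",
          "LT-500\t3689123", "LT-95\t3308764", "LT-367\t765972",
          "LT-205\t2090226", "LT-245\t10238727")

# header = FALSE plus col.names replaces the manually added heading line
d <- read.delim(text = rows, header = FALSE,
                col.names = c("sample", "n_reads"))

# Named, sorted vector: the names become the y-axis labels in dotchart
reads <- sort(setNames(d$n_reads, d$sample))

# Round the maximum up to the next multiple of 500000, then build the ticks
upper <- ceiling(max(reads) / 500000) * 500000
xticks <- seq(0, upper, by = 500000)
```

With a real file, the text = rows argument would be replaced by the file name; upper and xticks can then be used for xlim and for axis(1, at = ...) as in the code above.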

Making an R histogram plot from a saved hist() call

In R, one can do
x <- rnorm(100, 0, 1) # generate some fake data
hgram <- hist(x, plot=F)
plot(hgram$mids, hgram$counts)
One can further specify a plot type, such as 'h' or 's'. However, these don't really come out looking like a proper histogram. How can one make a nice looking histogram this way?
I thought I would add some input on making decent-looking histograms in R (using "x" from your question).
Using Base R
# histogram with colors and labels
hist(x, main = "Histogram of Fake Data", xlab = paste("x (units of measure)"), border = "blue", col = "green", prob = TRUE)
# add density
lines(density(x))
# add red line at 95th percentile
abline(v = quantile(x, .95), col = "red")
Using Plotly
install.packages("plotly") # only needed once, if plotly is not installed yet
library(plotly)
# basic Plotly histogram
plot_ly(x = x, type = "histogram")
The plotly result should open in a browser window with a variety of interactive controls. More plotly capabilities are available on their website at:
https://plot.ly/r/histograms/#normalized-histogram
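Coming back to the saved hist() call in the question: the object returned by hist(x, plot = FALSE) has class "histogram", and base R has a plot method for that class, so plotting the object itself draws proper bars. A minimal sketch (the seed, colors and titles are arbitrary choices):

```r
set.seed(1)
x <- rnorm(100, 0, 1)          # fake data, as in the question

# Compute the histogram without drawing it
hgram <- hist(x, plot = FALSE)

# plot() dispatches to the method for class "histogram" and draws real bars
plot(hgram, col = "grey", border = "white",
     main = "Histogram from a saved hist() call", xlab = "x")

# The stored pieces can also be drawn by hand, e.g. as touching bars
barplot(hgram$counts, space = 0, names.arg = hgram$mids,
        xlab = "bin midpoint", ylab = "count")
```

The first call is usually enough; the barplot variant is only there to show that counts and mids line up one to one.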

R plot legend not showing colors according to points

So I have made a plot in R, with a lot of different colors indicating which of my 23 categories a point belongs to. The colors of points are added through a vector (stratumcol, which is a factor with 23 levels).
When I add the legend, trying to let that show the colors and their category, it seems they do not match (tested using ordihull, see picture below).
This is my plot code:
plot(pca_nmdsscores, type = "n")
points(pca_nmdsscores, col=stratumcol, cex=1.5, pch = 15)
legend("right","top",levels(stratumcol),cex=.8, col = as.numeric(stratumcol), pch =15, lty = 0) # pch = stratumcol
ordihull(pca_nmdsscores, groups = stratumcol,draw = "polygon", col ="purple",label = T, show.groups = "LateMoistRich")
ordihull(pca_nmdsscores, groups = stratumcol,draw = "polygon", col ="blue",label = T, show.groups = "MidWetPoor")
My plot should be visible at the link below. As you can see, the hull for my category "LateMoistRich" connects the points with the pink-ish color, but in the legend this color is labelled "MidMoistRich".
The same goes for "MidWetPoor": its hull connects the mid-blue points, but in the legend this color is labelled "LateMoistPoor".
How do I solve this problem?
I tried looking for solutions, but didn't come across any that could solve it - including "unique" (which doesn't change anything, since my palette has been defined with 23 colors, so no need to recycle those anyway)
Since I'm a newbie I can't upload an image of my plot (showing the legend and point colors), but you can see it here: http://i.stack.imgur.com/pzn2y.png
Thanks
Edit:
The solution was to not use levels() on my factor, neither in legend = levels(stratumcol) nor in col = levels(stratumcol). Richard and DeveauP suggested levels might be the problem.
This created a new problem: my legend displayed the whole factor, not just the levels in it (but the colors corresponded to the correct point colors, which was the original problem).
This new problem was solved by using unique() instead of levels():
legend("right","top",legend=unique(stratumcol), cex=.8, col = unique(stratumcol), pch=15, lty=0)
I found a solution
Try
legend("right","top",legend = levels(stratumcol),cex=.8, col = levels(stratumcol), pch =15, lty = 0)
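For what it's worth, the mismatch in the original call comes from col = as.numeric(stratumcol): that gives one color code per point, while the legend needs one color per level, in the same order as levels(). A minimal sketch of keeping the two in step, with made-up data (grp, xy and level_cols are hypothetical names, not taken from the question):

```r
# Made-up grouping factor and coordinates (assumptions)
set.seed(1)
grp <- factor(rep(c("early", "mid", "late"), times = 10))
xy <- data.frame(x = rnorm(30), y = rnorm(30))

# One color per level, named in the order of levels(grp)
level_cols <- setNames(c("tomato", "steelblue", "darkgreen"), levels(grp))

# Per-point colors: look up each point's level code in level_cols
point_cols <- level_cols[as.integer(grp)]

plot(xy, type = "n")
points(xy, col = point_cols, pch = 15, cex = 1.5)

# Legend text and legend colors come from the same per-level vector
legend("topright", legend = names(level_cols), col = level_cols,
       pch = 15, cex = 0.8)
```

Since both the points and the legend read their colors from level_cols, a label cannot end up next to the color of another group.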

How do you make one factor show as symbol, and another factor as colour in nMDS (vegan)?

I am trying to make an nMDS plot of data with a nested factor. I would like the nMDS to show both factors on one plot by using symbols and colour.
In this reproducible example, if use was nested in moisture, I would like the plot to show Moisture as different symbols, and then Use as different colours.
So far I have figured out this:
library("vegan")
library("BiodiversityR")
data(dune, dune.env)
MDS <- metaMDS(dune, distance="bray", strata=dune.env$Moisture)
MDS
plot(MDS$points[,2], MDS$points[,1], type="n", main="Communities by Use",
xlab="NMDS Axis 1", ylab="NMDS Axis 2", xlim=c(-1.5,1.5), ylim=c(-1.5,1.5))
ordisymbol(MDS, dune.env, factor="Use", cex=1.25, rainbow=T, legend=T)
Which gives me the different uses as both different symbols and colours, but shows me nothing about moisture. Is it possible to make it show the different factors instead? I'm assuming it might be somewhere in the MDS$points[,] arguments but I'm not sure what exactly those are doing.
Figured it out by modifying the answer from this question: Plot points of metaMDS
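library(vegan) # provides metaMDS(), orditorp() and the dune data
data(dune, dune.env)
dune.MDS <- metaMDS(dune, distance = "bray", strata = dune.env$Moisture)
dune.MDS
# one factor drives the symbol, the other the colour
gr.moi <- factor(dune.env$Moisture)
gr.use <- factor(dune.env$Use)
pchs <- 0:5 # plotting symbols, indexed by Moisture level
col.gr <- c("red", "blue", "purple") # one colour per Use level
plot(dune.MDS, type = "n", display = "sites")
orditorp(dune.MDS, display = "species", col = "dark grey", air = 0.01)
points(dune.MDS, display = "sites", pch = pchs[gr.moi], col = col.gr[gr.use])
# legends: symbols for Moisture (only as many as there are levels), colours for Use
legend("topright", legend = levels(gr.moi), bty = "n", col = "black", pch = pchs[seq_along(levels(gr.moi))])
legend("bottomright", legend = levels(gr.use), bty = "n", col = col.gr, pch = 20)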
And it will produce a lovely plot with symbols and colours exactly how I wanted :)

Resources