Coloured Boxplot - r

I have a matrix with 12 columns, and I am using the boxplot function in R to plot it.
The following commands are used:
pdf("data.pdf")
data<-read.table("data1", header=T)
boxplot(data, outline=F)
dev.off()
I want the first three boxplots in red, green, and blue, the next three in yellow, the next three in orange, and the last three in purple.
How can I do this?
Thank you

To get colours, you just need to pass a vector of colours to the boxplot function:
##Create some dummy data
dd = matrix(runif(10*12), ncol=12)
##Create a vector of 12 colours
cols = rep(c("yellow", "orange", "purple"), each=3)
cols = c("red", "green", "blue", cols)
##Plot as normal
boxplot(dd, col=cols)
BTW, don't load your data at every iteration of your for loop. Load it once:
data <- read.table("data1", header=T)
pdf("data.pdf")
boxplot(data, outline=F)
dev.off()

Related

Multi-panel plots with for loops

I want to create multi-layer plots using for-loops. The main dataframe I am working with has the following characteristics:
product:  55_ab_LL_bubbles_D1  55_ab_LL_troubles_D1  34_ac_LL_bubbles_D1  34_ac_LL_troubles_D1
Color
Blue      453.3                766.1                 562.1                883.3
Green     775.5                897.1                 434.5                983.4
Purple    883.4                445.7                 787.2                555.5
Yellow    764.1                445.6                 887.3                673.5
In the code below, I loop over the row names (Color) to create one scatter plot per colour.
In addition to looping over the row names, I would like to create a separate scatter plot for each product, identified by the leading number in the column name ("55_", "34_", etc.). That is, I want to group all data points that share the same product number and draw an independent scatter plot for each number and each colour. So instead of the four scatter plots I get now (one per colour), I would like eight (one per colour and product number).
Any suggestions are appreciated :)
CODE:
pdf("scatterplot.pdf")
for (i in seq_len(nrow(data))) {
  df <- data.frame(x = data[i, grep("bubbles_", colnames(data))],
                   y = data[i, grep("troubles_", colnames(data))])
  plot(df$x, df$y,
       xlim = xy, ylim = xy)
}
dev.off()
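One possible approach, sketched under the assumption that the product number is everything before the first underscore in the column name: derive that prefix once, then nest a loop over the prefixes inside the loop over the colours. The toy data frame copies the values from the table above; the shared axis limits `xy` are an assumption (the range of all values), since `xy` is not defined in the question.

```r
# Toy data mirroring the table in the question
data <- data.frame(
  `55_ab_LL_bubbles_D1`  = c(453.3, 775.5, 883.4, 764.1),
  `55_ab_LL_troubles_D1` = c(766.1, 897.1, 445.7, 445.6),
  `34_ac_LL_bubbles_D1`  = c(562.1, 434.5, 787.2, 887.3),
  `34_ac_LL_troubles_D1` = c(883.3, 983.4, 555.5, 673.5),
  row.names = c("Blue", "Green", "Purple", "Yellow"),
  check.names = FALSE
)

# Product number = everything before the first underscore (assumption)
prefix <- sub("_.*$", "", colnames(data))
is_bubbles  <- grepl("bubbles_", colnames(data))
is_troubles <- grepl("troubles_", colnames(data))

# Assumed axis limits: the range of all values in the table
xy <- range(as.matrix(data))

n_plots <- 0
pdf("scatterplot.pdf")
for (colour in rownames(data)) {
  for (p in unique(prefix)) {
    # Values for this colour and this product number only
    x <- unlist(data[colour, prefix == p & is_bubbles])
    y <- unlist(data[colour, prefix == p & is_troubles])
    plot(x, y, xlim = xy, ylim = xy, main = paste(colour, p))
    n_plots <- n_plots + 1
  }
}
dev.off()
```

With the four colours and two product numbers in the table, this writes eight pages to the PDF, one per colour and product number.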

Plot3d only plots first two principal components

I am currently trying to plot the colours of different subgroups of a large dataset. I have separated the data into 6 subgroups with 6 colours. However, my plot3d function only plots the first two principal components.
Here is an example of the plot.
Here is the code. I have run a PCA on my dataset and originally only wanted to show the first 3 principal components, but I have tried plotting all the principal components to make sure the problem isn't in the data.
PCA_Model <- prcomp(t(Input_dataset), center = T, scale=F)
samples_names <- row.names(PCA_Model$rotation)
# Bind sample names to their subgroup
pca_matrix <- cbind(samples_names, "Subgroup"=labeled_subgroup, stringsAsFactors=FALSE)
# Link dataframe to color
colours <- as.character(factor(pca_matrix[,"Subgroup"], levels = paste0("C", 1:6),labels = c("blue",
"red", "yellow", "green", "black", "white")))
plot3d(PCA_Model$x[,1:440], col=colours)
The dataset is very diverse so should show all subgroups. Any help would be much appreciated!
You may be using the wrong plotting function. Using scatter3D from the latest version of the plot3D package:
library(plot3D)
# fit PCA model
PCA_Model <- prcomp(dplyr::select(iris, -Species), center = T, scale=F)
# Plot
scatter3D(x = PCA_Model$x[,1], y = PCA_Model$x[,2], z = PCA_Model$x[,3],
# just use the factor to color the points:
col = factor(iris$Species))
I think you get something odd from feeding character strings into the col option of plot3d. Below is an example of how to supply the colours: create a colour vector first, name it after your levels, and then index it by each point's level. Adjust the script below for 6 colours:
library(rgl)
library(RColorBrewer)
pca = prcomp(iris[,-5])$x
COLS = brewer.pal(3,"Set1")
names(COLS) = levels(iris$Species)
plot3d(pca,col=COLS[as.character(iris$Species)])
I used snapshot3d() to capture the image, and the axis labels seem quite squished
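As a rough sketch of that adjustment for six subgroups, the same named-vector lookup can be done in base R before calling plot3d. The subgroup labels here are hypothetical stand-ins for labeled_subgroup, and "orange" replaces the "white" from the question as an assumption, since white points would be hard to see on a white background.

```r
# Hypothetical subgroup labels, one per sample (stand-in for labeled_subgroup)
labeled_subgroup <- c("C3", "C1", "C6", "C2", "C5", "C4", "C1")

# One colour per subgroup, named after the subgroup levels
COLS <- c("blue", "red", "yellow", "green", "black", "orange")
names(COLS) <- paste0("C", 1:6)

# Look up each sample's colour by its subgroup name
point_cols <- unname(COLS[labeled_subgroup])

# With rgl loaded, the result could then be passed on, e.g.:
# plot3d(PCA_Model$x[, 1:3], col = point_cols)
```

Indexing by name rather than by position means the mapping stays correct even if the samples are not sorted by subgroup.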

R plot3d coloring

I am using plot3d() from rgl to make 3D scatter plots, using three columns, of samples in a data frame. Furthermore, I am using a fourth column colorby from my data frame (where each sample takes values, say, 10, 11, 12 as factors/levels) to color the points in the plot.
When using plot() to make 2D plots, I first set palette(rainbow(3)) and then col = colorby within plot() followed by legend(). The plot, colors and legend work fine.
However, when I repeat the last part for plot3d(), the coloring gets mixed up: the levels are not assigned the same colors as in plot(). Moreover, if I use legend3d("topright", legend = levels(colorby), col = rainbow(3)) to create a legend, it looks the same as the 2D legend, but the coloring in the 3D plot is clearly wrong.
Where am I going wrong?
This looks correct to me:
df <- data.frame(x=1:9, y=9:1, z=101:109, colorby=gl(3,3))
palette(rainbow(3))
plot(x~y, df, col = df$colorby)
library(rgl)
with(df, plot3d(x,y,z, col = colorby))
legend3d("topright", legend = levels(df$colorby), col = levels(df$colorby), pch=19)
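If the factor levels are values such as 10, 11, 12 rather than 1, 2, 3, one way to sidestep any ambiguity in how a plotting function interprets a factor is to do the level-to-colour lookup by hand. This is only a sketch; the names pal and point_cols are illustrative, and the plot3d and legend3d calls are left as comments because they need rgl.

```r
# Factor whose levels are 10, 11, 12, as described in the question
colorby <- factor(c(10, 11, 12, 10, 12))

# One colour per level, in level order
pal <- rainbow(nlevels(colorby))

# as.integer() on a factor returns the level codes (1, 2, 3, ...),
# so this picks the colour that belongs to each point's level
point_cols <- pal[as.integer(colorby)]

# The same vectors can then feed both the plot and the legend, e.g.:
# plot3d(x, y, z, col = point_cols)
# legend3d("topright", legend = levels(colorby), col = pal, pch = 19)
```

Because the plot and the legend are both built from pal, they cannot drift apart.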

How to add colour matched legend to a R matplot

I plot several lines on a graph using matplot:
matplot(cumsum(as.data.frame(daily.pnl)),type="l")
This gives me default colours for each line - which is fine,
But I now want to add a legend that reflects those same colours - how can I achieve that?
PLEASE NOTE - I am trying NOT to specify the colours to matplot in the first place.
legend(0,0,legend=spot.names,lty=1)
Gives me all the same colour.
matplot's default col argument is 1:6, recycled across the columns of your data, so for up to six columns the colours are simply 1, 2, ..., n. You can therefore add a legend like this:
nn <- ncol(daily.pnl)
legend("top", colnames(daily.pnl),col=seq_len(nn),cex=0.8,fill=seq_len(nn))
Using the cars data set as an example, here is the complete code to add a legend. It is better to use layout to place the legend neatly.
daily.pnl <- cars
nn <- ncol(daily.pnl)
layout(matrix(c(1,2),nrow=1), widths=c(4,1))
par(mar=c(5,4,4,0)) #No margin on the right side
matplot(cumsum(as.data.frame(daily.pnl)),type="l")
par(mar=c(5,0,4,2)) #No margin on the left side
plot(c(0,1),type="n", axes=F, xlab="", ylab="")
legend("center", colnames(daily.pnl),col=seq_len(nn),cex=0.8,fill=seq_len(nn))
I have tried to reproduce what you are looking for using the iris dataset. I get the plot with the following expression:
matplot(cumsum(iris[,1:4]), type = "l")
Then, to add a legend, you can specify the default lines colour and type, i.e., numbers 1:4 as follows:
legend(0, 800, legend = colnames(iris)[1:4], col = 1:4, lty = 1:4)
Now the legend and the plot use the same colours and line types. Note that you might need to change the coordinates for the legend accordingly.
I like @agstudy's trick to get a nice legend.
For the sake of comparison, I took @agstudy's example and plotted it with ggplot2.
The first step is to "melt" the data set:
require(reshape2)
df <- data.frame(x=1:nrow(cars), cumsum(data.frame(cars)))
df.melted <- melt(df, id="x")
The second step looks rather simple in comparison to the solution with matplot
require(ggplot2)
qplot(x=x, y=value, color=variable, data=df.melted, geom="line")
Interestingly, @agstudy's solution does the trick, but only for n ≤ 6.
Here we have a matrix with 8 columns. The colours of the first 6 labels are correct.
The 7th and 8th are wrong: the colours in the plot restart from the beginning (black, red, ...), whereas in the legend they continue (yellow, grey, ...).
I still haven't figured out why this is the case. I may update this post with my findings.
matplot(x = lambda, y = t(ridge$coef), type = "l", main="Ridge regression", xlab="λ", ylab="Coefficient-value", log = "x")
nr = nrow(ridge$coef)
legend("topright", rownames(ridge$coef), col=seq_len(nr), cex=0.8, lty=seq_len(nr), lwd=2)
Just discovered that matplot uses line types 1:5 and colours 1:6 to establish the appearance of the lines, recycling both when there are more columns. If you want to create a legend, try the following approach:
## Plot multiple columns of the data frame 'GW' with matplot
cstart = 10 # from column
cend = cstart + 20 # to column
nr <- cstart:cend
ltyp <- rep_len(1:5, length(nr)) # the line types matplot uses (1:5, recycled to one per column)
cols <- rep_len(1:6, length(nr)) # the colours matplot uses (1:6, recycled to one per column)
matplot(x,GW[,nr],type='l')
legend("bottomright", as.character(nr), col=cols, cex=0.8, lty=ltyp, ncol=3)
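The recycling described above can be checked on a small self-contained example. This is a sketch with made-up data (a matrix m of 8 random-walk columns); it uses rep_len() to build exactly one colour and one line type per column, matching matplot's defaults of col = 1:6 and lty = 1:5.

```r
set.seed(1)
# Made-up data: 8 cumulative random walks, one per column
m <- apply(matrix(rnorm(50 * 8), ncol = 8), 2, cumsum)
colnames(m) <- paste0("series", 1:8)

n <- ncol(m)
cols <- rep_len(1:6, n)  # 1 2 3 4 5 6 1 2
ltys <- rep_len(1:5, n)  # 1 2 3 4 5 1 2 3

# matplot is left on its defaults; the legend reproduces them
matplot(m, type = "l")
legend("topleft", colnames(m), col = cols, lty = ltys, cex = 0.8, ncol = 2)
```

Since the colours repeat every 6 columns and the line types every 5, a colour and line type pair only repeats after 30 columns.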

Legend of a raster map with categorical data

I would like to plot a raster containing 4 different values (1) with a categorical text legend describing the categories, such as (2), but with colour boxes:
I've tried using legend, for example:
legend( 1,-20,legend = c("land","ocean/lake", "rivers","water bodies"))
but I don't know how to associate one value to the displayed color. Is there a way to retrieve the colour displayed with 'plot' and to use it in the legend?
The rasterVis package includes a Raster method for levelplot(), which plots categorical variables and produces an appropriate legend:
library(raster)
library(rasterVis)
## Example data
r <- raster(ncol=4, nrow=2)
r[] <- sample(1:4, size=ncell(r), replace=TRUE)
r <- as.factor(r)
## Add a landcover column to the Raster Attribute Table
rat <- levels(r)[[1]]
rat[["landcover"]] <- c("land","ocean/lake", "rivers","water bodies")
levels(r) <- rat
## Plot
levelplot(r, col.regions=rev(terrain.colors(4)), xlab="", ylab="")
By default, the colours used in a raster plot are generated by rev(terrain.colors()) (see ?raster::plot). You can use this to re-create that sequence of 4 colours for your legend, or choose any other sequence of colours:
my_col = rev(terrain.colors(n = 4))
# my_col = c('beige','red','green','blue')
First plot the map using the colour sequence. legend = FALSE gets rid of the standard colour bar:
plot(my_raster, legend = FALSE, col = my_col)
Add a custom legend to the bottom left. Use the fill argument to generate coloured boxes:
legend(x='bottomleft', legend = c("land", "ocean/lake", "rivers", "water bodies"), fill = my_col)
