Select colour of boxplot based on function in R - r

I am plotting boxplots of fish biomass by ReefName, ordered by median biomass. Every ReefName (site) is either inside or outside an MPA, i.e. MPA=="1" or MPA=="0". Currently all boxes are drawn in green.
How can I show MPA=="0" sites in blue and MPA=="1" sites in green, for example, while keeping the boxes ordered by median fish biomass?
MPA <- factor(Fish$MPA)
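# median must be an argument of reorder(), not of with(); otherwise reorder() falls back to the mean
bymedian <- with(Fish, reorder(ReefName, log10(Biomassm+1), median))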
boxplot(log10(Biomassm+1) ~ bymedian, data = Fish,
xlab = "ReefName", ylab = "Biomassm",
main = "Biomassm in Caribbean", varwidth = TRUE,
col=(c("darkgreen")), las=3, cex.axis=0.3)
Thank you

It might be a better idea to use the ggplot2 package for this. Your code would then look like this:
library(ggplot2)
# factor(MPA) gives the discrete fill that scale_fill_manual() expects
ggplot(data=Fish, aes(x=reorder(ReefName, log10(Biomassm+1), median), y=Biomassm, fill=factor(MPA))) +
geom_boxplot() +
scale_y_log10("Biomassm") +
xlab("ReefName") +
scale_fill_manual(values=c("blue", "green")) +
ggtitle("Biomassm in Caribbean")

Here's a set of boxplots coloured depending on the value of MPA:
# generate some data
set.seed(1)
X = matrix(rnorm(100), ncol=10)
# order by median
X = X[,order(apply(X, 2, median))]
# some fake MPA values
MPA = round(runif(n=10, min=0, max=1))
# generate boxplots and check if MPA==1
boxplot(X, col=ifelse(test=MPA==1, yes='green', no='blue'))
# add legend
legend(x='bottomleft', fill=c('green','blue'), legend=c('MPA=1', 'MPA=0'), inset=c(0.01))
The output of ifelse is a vector of colours according to the MPA values and these are used to colour the boxes:
[1] "blue" "blue" "green" "blue" "blue" "green" "green" "blue" "blue" "green"
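Applied to data shaped like the question's, the same idea might look like the sketch below. The Fish data frame here is invented for illustration (six hypothetical reefs with made-up biomass values); only the column names ReefName, MPA and Biomassm come from the question. The step that matters is building one colour per reef in the same order as the levels of the reordered factor, since boxplot assigns col to the boxes in level order.

```r
set.seed(42)
# Hypothetical stand-in for the Fish data frame (all values are invented)
Fish <- data.frame(
  ReefName = rep(paste0("Reef", 1:6), each = 20),
  MPA      = rep(c(0, 1, 0, 1, 1, 0), each = 20),
  Biomassm = rexp(120, rate = rep(1 / c(5, 50, 20, 100, 10, 70), each = 20))
)
Fish$logB <- log10(Fish$Biomassm + 1)

# Reef names as a factor whose levels are sorted by median log biomass
bymedian <- with(Fish, reorder(ReefName, logB, median))

# One MPA value per reef, returned in the order of levels(bymedian)
mpa_by_reef <- as.vector(tapply(Fish$MPA, bymedian, function(v) v[1]))
box_cols <- ifelse(mpa_by_reef == 1, "green", "blue")

boxplot(logB ~ bymedian, data = Fish, col = box_cols, varwidth = TRUE, las = 3,
        xlab = "ReefName", ylab = "log10(Biomassm + 1)")
legend("topleft", fill = c("green", "blue"), legend = c("MPA = 1", "MPA = 0"))
```

Because the colours are looked up per factor level rather than per row, the ordering by median and the colouring by MPA stay independent of each other.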

Related

How to partly colorize histogram?

I've been trying to color specific bins above a defined threshold in the following data frame (df)
df <- read.table("https://pastebin.com/raw/3En2GWG6", header=T)
I've been following this example (Change colour of specific histogram bins in R), but I cannot seem to adapt its suggestions to my data, so I wanted to ask here on Stack Overflow.
I would like all bins with values above 0.100 to be "red", and the rest to have either no color or black (I defined black, but I would prefer no color).
Here is what I tried:
col<-(df$consumption>=0.100)
table(col) # I can see 40 points above 0.100, the rest below
col[which(col=="TRUE")] <- "firebrick1"
col[which(col=="FALSE")] <- "black"
hist(df$consumption, breaks = 1000, xlim = c(0,0.2), col=col,xlab= "Consumption [MG]")
However, the whole graph comes out red, which doesn't make sense.
In other words, I would like anything to the right side of the line below to be red
hist(df$consumption, breaks = 1000, xlim = c(0,0.2),xlab= "Consumption [MG]")
abline(v=c(.100), col=c("red"),lty=c(1), lwd=c(5))
Simply plot two histograms on top of each other using add=TRUE and sub-setting the second.
hist(df$consumption, breaks=1000, xlim=c(0,.2),xlab= "Consumption [MG]")
hist(df$consumption[df$consumption > .100], breaks=1000, xlim=c(0,.2), col=2, add=TRUE)
abline(v=.100, col=2, lty=3)
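One caveat with the overlay approach: when breaks is a single number, hist treats it as a suggestion and picks "pretty" breakpoints from the range of the data it is given, so the subset may end up with different bins from the full histogram. A minimal sketch of one way around this is to reuse the breakpoints of the first histogram. The consumption vector below is simulated as a stand-in for df$consumption and is not the asker's data.

```r
set.seed(1)
# Hypothetical stand-in for df$consumption (simulated, not the real data)
consumption <- rexp(5000, rate = 30)

# Draw the full histogram and keep the returned object for its breakpoints
h <- hist(consumption, breaks = 100, xlim = c(0, 0.2), xlab = "Consumption [MG]")

# Overlay the subset using exactly the same breakpoints
above <- consumption[consumption > 0.1]
h2 <- hist(above, breaks = h$breaks, col = 2, add = TRUE)
abline(v = 0.1, col = 2, lty = 3)
```

With shared breakpoints, every red bar sits on top of a bar of the full histogram and can never be taller than it.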
This follows the approach you were attempting. The colour vector should not have one entry per data point; it needs one entry per histogram bin, set according to whether that bin lies above your cutoff.
# store the histogram as an object
h <- hist(df$consumption, breaks = 1000)
# assign one colour per bin from the bin midpoints (h$breaks has one more element than there are bins)
cols <- ifelse(h$mids > 0.1, "firebrick1", "black")
# use the color vector
plot(h, col = cols, xlim=c(0,.2),xlab= "Consumption [MG]")
abline(v=c(.100), col=c("red"),lty=c(1), lwd=c(5))

Change the legend on the color key of Heat Map in R

I have created a heatmap in R using heatmap.2. Everything went fine except for the color key in my figure.
I'm plotting the expression values of genes in 'control' and 'case' conditions. In the final heatmap, all upregulated genes are red in the 'case' condition and all downregulated genes are green in the 'case' condition, which is perfect. But in the color key, red shows low values and green shows high values.
In the legend of the color key, the value range is 20-100 and the colors transition from 'red' to 'green'. I don't know how these values are calculated, as my data only has values from 2.0 to 13.0. This is what my code looks like:
my_palette <- colorRampPalette(c("red", "green"), (n=100))
heatmap.2(mat_data, Rowv=F, Colv=F, trace="none", dendrogram="none", density.info="none", key=TRUE,col=my_palette,
notecol="black",cexRow=0.45, cexCol=0.75,
offsetCol=0.5, symm=F,symkey=F, scale="none")
Could someone let me know how exactly these values are calculated, and how I can invert them so that 'red' shows high values and 'green' shows low values?
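On the colour direction, a minimal sketch that stays with a palette vector: colorRampPalette returns a function, and the number of colours is the argument of that returned function, not a second argument of colorRampPalette itself (extra arguments there are forwarded to colorRamp). Listing "green" first and "red" last puts green at the low end, assuming heatmap.2 assigns the first colour of col to the lowest values, as image does. This does not explain the 20-100 range shown in the key.

```r
# colorRampPalette() returns a function; the number of colours goes in the second call
my_palette <- colorRampPalette(c("green", "red"))(100)

my_palette[1]    # first colour, intended for the lowest values
my_palette[100]  # last colour, intended for the highest values

# Assumed usage, otherwise as in the question (requires the gplots package):
# heatmap.2(mat_data, col = my_palette, trace = "none", scale = "none")
```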
I like to use ggplot2.
# Package
library(ggplot2)
# Sample data
mat_data = data.frame(x = rep(1:3,3), y = rep(1:3, each = 3), value = 1:9);
# Plot using ggplot
p <- ggplot(data = mat_data) + # Set data
geom_tile(aes(x = x, y = y, fill = value)) + # Define variables to plot
scale_fill_gradient(low = "green", high = "red") # Set colour scheme: green for low values, red for high values
# Display plot
p
See here for a more detailed example.

Different color for different line for plot in R

I am trying to plot survival curves with ten different colors for ten different lines; it is basically the survival probability for each of ten regions.
The problem right now is that the default palette has only 8 colors, so the last two colors repeat even if I give col=1:10, which makes the result difficult to interpret.
Can I use the rainbow function here?
pdf("E:/survplot.pdf", width=10, height=10)
plot(survfit(survobj ~ strata(region), data=rfdata,type="kaplan-meier",conf.int=FALSE),mark.time=FALSE,col=1:10,xlim=c(0,400),ylim=c(0,1),ylab="Probability", xlab="Days on Lot",main="Survival Probability")
legend("topright",legend = c(1:10), lty = 1, col = 1:10,title = "Regions")
dev.off()
palette()
[1] "black" "red" "green3" "blue" "cyan" "magenta" "yellow" "gray"
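A minimal sketch of the rainbow idea: rainbow(10) returns ten distinct colour strings that do not depend on palette(), and the same vector can be given to both the plot and the legend. The curves below are arbitrary decreasing step lines used as stand-ins, not survival estimates; in the question's code the vector would replace col=1:10 in both the plot and legend calls.

```r
n_regions <- 10
region_cols <- rainbow(n_regions)   # ten distinct colours, independent of palette()

# Stand-in curves only: ten decreasing step lines, not real survival estimates
days <- seq(0, 400, by = 10)
plot(0, 0, type = "n", xlim = c(0, 400), ylim = c(0, 1),
     xlab = "Days on Lot", ylab = "Probability", main = "Survival Probability")
for (i in seq_len(n_regions)) {
  lines(days, exp(-days / (50 * i)), type = "s", col = region_cols[i])
}
legend("topright", legend = seq_len(n_regions), lty = 1,
       col = region_cols, title = "Regions")
```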

Labeling a dotchart

Given the following example and chart, how would one plot the exact x-axis value next to each point?
x <- mtcars[order(mtcars$mpg),] # sort by mpg
x$cyl <- factor(x$cyl) # it must be a factor
x$color[x$cyl==4] <- "red"
x$color[x$cyl==6] <- "blue"
x$color[x$cyl==8] <- "darkgreen"
dotchart(x$mpg,labels=row.names(x),cex=.7,groups= x$cyl,
main="Gas Mileage for Car Models\ngrouped by cylinder",
xlab="Miles Per Gallon", gcolor="black", color=x$color)
I tried modifying the labels argument, but it only adjusts the y axis label name.
You'll need to sort by category, then by x. Then you can use text as Ricardo suggests, accounting for the breaks between categories.
x <- mtcars[order(-mtcars$cyl, mtcars$mpg),]
# sort by category, then by position within category
# As above
x$cyl <- factor(x$cyl) # it must be a factor
x$color[x$cyl==4] <- "red"
x$color[x$cyl==6] <- "blue"
x$color[x$cyl==8] <- "darkgreen"
dotchart(x$mpg,labels=row.names(x),cex=.7,groups= x$cyl,
main="Gas Mileage for Car Models\ngrouped by cylinder",
xlab="Miles Per Gallon", gcolor="black", color=x$color)
# Adding text
text(x = x$mpg,
y = 1:nrow(x) + ifelse(x$cyl == "6", 2, ifelse(x$cyl == "4", 4, 0)),
labels= x$mpg,
cex = 0.5,
pos = 4)
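The hard-coded offsets (0, 2, 4) work for three groups in this order. A sketch of a more general version derives them from the group changes instead, assuming the rows are already sorted so that each group is contiguous and that dotchart leaves two extra rows between groups, as the offsets above suggest. The names grp_cols, gap and y are introduced here for illustration.

```r
# Rows sorted by category, then by mpg, so that groups are contiguous
x <- mtcars[order(-mtcars$cyl, mtcars$mpg), ]
x$cyl <- factor(x$cyl)
grp_cols <- c("4" = "red", "6" = "blue", "8" = "darkgreen")
x$color <- unname(grp_cols[as.character(x$cyl)])

dotchart(x$mpg, labels = row.names(x), cex = 0.7, groups = x$cyl,
         xlab = "Miles Per Gallon", gcolor = "black", color = x$color)

# Each change of group pushes the following points up by two extra rows
gap <- 2 * cumsum(c(0, diff(as.numeric(x$cyl)) != 0))
y <- seq_len(nrow(x)) + gap
text(x = x$mpg, y = y, labels = x$mpg, cex = 0.5, pos = 4)
```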

Change ordering of values in 'y' axis for dotchart

The following code :
avector <- as.vector(top.links.added.overall$Amount)
x <- as.vector(top.links.added.overall[order(avector),])
row.names(x) <- c("Yahoo" ,"Cnn", "Google")
x$color[x$Amount == 100] <- "red"
x$color[x$Amount == 500] <- "blue"
x$color[x$Amount == 1000] <- "darkgreen"
dotchart(x$Amount,
labels = row.names(x),
cex=.7,
groups = x$Amount,
gcolor = "black",
color = x$color,
pch=19,
main = "Gas Mileage for Car Models\ngrouped by cylinder",
xlab = "Miles Per Gallon")
The code above generates a dot chart. Here is the dataset file behind top.links.added.overall:
Amount,Name
1000,Google
500,Cnn
100,Yahoo
When I remove the code :
row.names(x) <- c("Yahoo" ,"Cnn", "Google")
I get row names of 1,2,3
I don't think I should need to set the names of the 'y' axis myself. How can the code for the graph be amended so that the company with the lowest numerical value (in this case Yahoo) starts at the beginning (bottom) of the 'y' axis instead of at the top, which is what currently happens?
I don't think I can test it with the offered R data objects but perhaps something along these lines:
x <- as.vector(top.links.added.overall[order(-avector),])
row.names(x) <- rev( c("Yahoo" ,"Cnn", "Google") )
This applies mathematical negation to the order argument and uses the rev (reverse) function on the row names.
Edit: I now understand your frustration, but after looking at the code I decided to try this which seems to do it:
dotchart(x$Amount,
labels = row.names(x),
cex=.7,
groups = -x$Amount, # the code sorts by `as.numeric(groups)`
gcolor = "black",
color = x$color,
pch=19,
main = "Gas Mileage for Car Models\ngrouped by cylinder",
xlab = "Miles Per Gallon")
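As an alternative sketch, not the approach above: when every group holds a single point, the groups argument can be dropped altogether. Without groups, dotchart draws the first element at the bottom, so sorting the rows in ascending order of Amount puts Yahoo lowest. The data frame is rebuilt from the file listing in the question, and taking the labels from the Name column avoids setting row names by hand.

```r
# Data copied from the file listing in the question
top.links.added.overall <- data.frame(
  Amount = c(1000, 500, 100),
  Name   = c("Google", "Cnn", "Yahoo")
)

# Ascending order: the first row is drawn at the bottom of the chart
x <- top.links.added.overall[order(top.links.added.overall$Amount), ]
x$color <- c("red", "blue", "darkgreen")   # colours for 100, 500, 1000 in that order

dotchart(x$Amount, labels = as.character(x$Name), cex = 0.7,
         color = x$color, pch = 19, xlab = "Amount")
```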
