How to partly colorize histogram? - r

I've been trying to color specific bins above a defined threshold in the following data frame (df)
df <- read.table("https://pastebin.com/raw/3En2GWG6", header=T)
I've been following this example (Change colour of specific histogram bins in R), but I cannot seem to adapt its suggestions to my data, so I wanted to ask here on Stack Overflow.
I would like all bins with values above 0.100 to be "red", and the rest all to be either no color, or just black (I defined black, but I would prefer no color)
Here is what I tried:
col<-(df$consumption>=0.100)
table(col) # I can see 40 points above 0.100, the rest below
col[which(col=="TRUE")] <- "firebrick1"
col[which(col=="FALSE")] <- "black"
hist(df$consumption, breaks = 1000, xlim = c(0,0.2), col=col,xlab= "Consumption [MG]")
However, the whole graph is red, and that doesn't make sense..?
In other words, I would like anything to the right side of the line below to be red
hist(df$consumption, breaks = 1000, xlim = c(0,0.2),xlab= "Consumption [MG]")
abline(v=c(.100), col=c("red"),lty=c(1), lwd=c(5))

Simply plot two histograms on top of each other using add=TRUE, subsetting the data for the second. Reuse the breaks of the first histogram so that the bins of both line up.
h <- hist(df$consumption, breaks=1000, xlim=c(0,.2), xlab="Consumption [MG]")
hist(df$consumption[df$consumption > .100], breaks=h$breaks, col=2, add=TRUE)
abline(v=.100, col=2, lty=3)

This is along the lines of what you were doing. The col argument needs one color per histogram bin, not one per data point, so build the color vector from the bins rather than from the raw values.
# store the histogram as an object
h <- hist(df$consumption, breaks = 1000)
# one color per bin, based on the bin midpoints (h$breaks has one more element than there are bins)
cols <- ifelse(h$mids > 0.1, "firebrick1", "black")
# use the color vector
plot(h, col = cols, xlim=c(0,.2),xlab= "Consumption [MG]")
abline(v=c(.100), col=c("red"),lty=c(1), lwd=c(5))
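A minimal self-contained sketch of the same idea, for readers without access to the pastebin file. The `consumption` vector is simulated and is an assumption, not the asker's data. It colors by bin midpoint and uses `NA` to leave the remaining bars unfilled, which is what the question preferred over black.

```r
# Simulated stand-in for df$consumption (assumption: not the asker's real data)
set.seed(1)
consumption <- rexp(5000, rate = 40)

# Compute the bins without drawing, so the colors can be derived first
h <- hist(consumption, breaks = 100, plot = FALSE)

threshold <- 0.1
# One color per bin: red where the bin midpoint lies above the threshold,
# NA (no fill) everywhere else
cols <- ifelse(h$mids > threshold, "firebrick1", NA)

plot(h, col = cols, xlim = c(0, 0.2), xlab = "Consumption [MG]", main = "")
abline(v = threshold, col = "red", lwd = 2)
```

Because `cols` is built from `h$mids`, it always has exactly one entry per bar, whatever number of breaks `hist()` ends up choosing.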

Related

Plot raster with continuous color palette with zero in white (R Base)

As much as I looked at other questions I couldn't solve my problem (I'm new in R).
I simply need to plot a raster where the minimum value (let's say color red) goes to zero (white) and from zero to maximum (color blue) continuously.
I would like to create that color palette independently if the data is symmetrically distributed in negative and positive values.
Let's say I have a raster with this values:
library(raster)
values <- c(seq(-2000,0,by=1),seq(1,499,by=1))
values <- sample(values)
r <- raster(ncol=50,nrow=50)
r <- setValues(r,values)
plot(r)
If this has already been resolved in another question, I would appreciate any information.
So, you can use RColorBrewer or colorRampPalette to achieve this (or a combination of both) and by setting breaks.
library(raster)
library(RColorBrewer)
breakpoints <- c(-2000, seq(0, 500, length.out = 10)) # 11 breaks for 10 colors, covering values up to 500
colors <- c("red", RColorBrewer::brewer.pal(9, "Blues"))
plot(r, breaks = breakpoints, col = colors)
You can also do something similar with colorRampPalette by giving it two colors. Then, in the second pair of parentheses, define how many colors you want on the gradient.
colors <- c("red", colorRampPalette(c("steelblue1", "steelblue4"))(9))
You can also use these both in conjunction with one another.
colors <- c("red", colorRampPalette(RColorBrewer::brewer.pal(9, "Blues"))(11))
If you want red also continuous, then you could create a gradient for both.
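breakpoints <- c(seq(-2000, 0, length.out = 10), seq(0, 500, length.out = 10)[-1]) # 19 breaks for 18 colors
colors <- c(rev(RColorBrewer::brewer.pal(9, "Reds")), RColorBrewer::brewer.pal(9, "Blues")) # darkest red at the minimum, palest shades next to zero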
plot(r, breaks = breakpoints, col = colors)
It's a little easier to set these using ggplot. You essentially need to rescale the values to a scale of 0 to 1 and tell the colour scale where 0 falls on it, so that 0 becomes the "midpoint".
library(ggplot2)
rdf <- as.data.frame(r, xy = TRUE)
# position of zero on the 0-1 scale of the data
zero_pos <- (0 - min(rdf$layer)) / (max(rdf$layer) - min(rdf$layer))
ggplot(rdf) +
  geom_raster(aes(x, y, fill = layer)) +
  scale_fill_gradientn(
    # red (minimum), white (zero), blue (maximum); values needs one entry per colour
    colours = RColorBrewer::brewer.pal(11, "RdBu")[c(1, 6, 11)],
    values = c(0, zero_pos, 1)
  )
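Since the question asks for base R, here is a hedged sketch that builds the breaks and colors without any extra packages. It draws a plain matrix with `image()` so that it runs on its own. The number of steps `n` is an arbitrary choice, and the same `breaks` and `cols` vectors could be passed to `plot(r, breaks = ..., col = ...)` as in the answer above.

```r
# Simulated values with an asymmetric range, as in the question
set.seed(1)
vals <- sample(c(seq(-2000, 0), seq(1, 499)))
m <- matrix(vals, nrow = 50, ncol = 50)

n <- 100  # color steps on each side of zero (arbitrary choice)

# Breaks are spaced separately on each side, so zero is always a break
# even though the negative range is four times wider than the positive one
breaks <- c(seq(min(vals), 0, length.out = n + 1),
            seq(0, max(vals), length.out = n + 1)[-1])

# Red fades to white at zero, then white deepens to blue
cols <- c(colorRampPalette(c("red", "white"))(n),
          colorRampPalette(c("white", "blue"))(n))

image(m, col = cols, breaks = breaks, axes = FALSE)
```

Both halves of the palette end in white at the zero break, so the transition stays continuous however lopsided the data range is.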

Single histogram with two or more colors depending on xaxis values

I know it was already answered here, but only for ggplot2 histogram.
Let's say I have the following code to generate a histogram with red bars and blue bars, same number of each (six red and six blue):
set.seed(69)
hist(rnorm(500), col = c(rep("red", 6), rep("blue", 7)), breaks = 10)
I have the following image as output:
I would like to automate the entire process. How can I use values on the x axis to set a condition that colors the histogram bars (with two or more colors) using the hist() function, without having to specify the number of repetitions of each color?
Assistance most appreciated.
The hist function uses the pretty function to determine break points, so you can do this:
set.seed(69)
x <- rnorm(500)
breaks <- pretty(x,10)
col <- ifelse(1:length(breaks) <= length(breaks)/2, "red", "blue")
hist(x, col = col, breaks = breaks)
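If the condition should depend on actual x values rather than on the position of the bin, one option is to compute the histogram first and classify the bin midpoints. This is a sketch under assumptions: the cut points at -1 and 1 and the three color names are hypothetical and only illustrate the rule.

```r
set.seed(69)
x <- rnorm(500)

# Compute the histogram without drawing so the bin midpoints are available
h <- hist(x, breaks = 10, plot = FALSE)

# Hypothetical rule: three color zones defined on the x axis
zones <- cut(h$mids, breaks = c(-Inf, -1, 1, Inf),
             labels = c("red", "grey", "blue"))
cols <- as.character(zones)  # one color name per bin

plot(h, col = cols, main = "Histogram colored by x value")
```

Adding more zones only requires more cut points and labels; no repetition counts need to be worked out by hand.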
When I want to do this, I actually tabulate the data and make a barplot as follows (a bar plot of data tabulated into equal-width bins is effectively a histogram):
set.seed(69)
dat <- rnorm(500, 0, 1)
tab <- table(round(dat, 1)) # round the data, because rnorm is precise beyond most real data
bools <- as.numeric(names(tab)) >= 0 # your condition here
cols <- c("grey", "dodgerblue4")[bools + 1] # note that FALSE + 1 = 1 and TRUE + 1 = 2
barplot(tab, border = "white", col = cols, main = "Histogram with barplot")
The output:

How to get R plot to plot variable on heat.color scale

I'm plotting data in R. I'm running the following two commands:
plot(x = df$Latitude, df$Longitude, col = heat.colors(nrow(df)), type = "p")
plot(x = df$Latitude, df$Longitude, col = df$feature, type = "p")
The first line plots the points along a color gradient (points with higher values are red, points with lower values are yellow) and the second line plots the data with colors dictated by the integer values in feature.
However, I want to combine both such that I'm plotting points with colors on a scale using the numeric values from feature. In some sense, I want to pass two arguments to col. How can I do this?
You can try:
# some data
set.seed(123)
x <- rnorm(100)
# Create some breaks and use colorRampPalette to transform the breaks into a color code
gr <- .bincode(x, seq(min(x), max(x), length.out = length(x)), include.lowest = TRUE)
col <- colorRampPalette(c("red", "white", "blue"))(length(x))[gr]
# the plot:
plot(x, pch=16, col=col)
For a legend see solutions here or here
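Applied to the question's setting, the same binning idea might look like the sketch below. The data frame is simulated (only the column names come from the question), and the choice of 20 color steps is an assumption.

```r
# Simulated stand-in for the asker's data frame (column names follow the question)
set.seed(123)
df <- data.frame(Latitude  = runif(100, 40, 41),
                 Longitude = runif(100, -75, -74),
                 feature   = rpois(100, 5))

n_cols <- 20
# heat.colors() starts at red, so reverse it: low values pale, high values red
pal <- rev(heat.colors(n_cols))

# cut() assigns every feature value to one of n_cols equal-width intervals
bin <- cut(df$feature, breaks = n_cols, labels = FALSE, include.lowest = TRUE)

plot(df$Latitude, df$Longitude, col = pal[bin], pch = 16,
     xlab = "Latitude", ylab = "Longitude")
```

Here `col` still receives a single vector, but each entry is looked up from the palette by the point's `feature` bin, which combines the two plots from the question.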

How to plot the value of abline in R?

I used this code to make this plot:
plot(p, cv2,col=rgb(0,100,0,50,maxColorValue=255),pch=16,
panel.last=abline(h=67,v=1.89, lty=1,lwd=3))
My plot looks like this:
1.) How can I plot the value of the ablines in a simple plot?
2.) How can I scale my plot so that both lines appear in the middle?
To change the scale of the plot so that the lines are in the middle, change the axis limits, i.e.
x<-1:10
y<-1:10
plot(x,y)
abline(a=1,b=0,v=1)
changed to:
x<-1:10
y<-1:10
plot(x,y,xlim=c(-30,30))
abline(a=1,b=0,v=1)
By "value" I assume you mean where the line cuts the x-axis? Something like a text label? i.e.:
text((0), min(y), "number", pos=2)
if you want the label on the x axis then try:
abline(a=1,b=0,v=1)
axis(1, at=1,labels=1)
to prevent overlap between labels you could remove the zero i.e.:
plot(x,y,xlim=c(-30,30),yaxt="n")
axis(2, at=c(1.77,5,10,15,20,25))
Or, before you plot, extend the margins and add the labels further from the axis:
par(mar = c(6.5, 6.5, 6.5, 6.5))
plot(x,y,xlim=c(-30,30))
abline(a=1,b=0,v=1)
axis(2, at=1.77,labels=1.77,mgp = c(10, 2, 0))
Similar in spirit to the answer proposed by @user1317221, here is my suggestion
# generate some fake points
x <- rnorm(100)
y <- rnorm(100)
# positions of the lines
vert = 0.5
horiz = 1.3
To display the lines at the center of the plot, first compute the horizontal and vertical distances between the data points and the lines, then adjust the limits adequately.
# compute the limits, in order for the lines to be centered
# NB: we add a small fraction (here 10%) to leave some empty space,
# available to plot the values inside the frame (useful for one of the solutions, see below)
xlim = vert + c(-1.1, 1.1) * max(abs(x-vert))
ylim = horiz + c(-1.1, 1.1) * max(abs(y-horiz))
# do the main plotting
plot(x, y, xlim=xlim, ylim=ylim)
abline(h=horiz, v=vert)
Now, you could plot the 'values of the lines', either on the axes (the line parameter allows you to control for possible overlapping):
mtext(c(vert, horiz), side=c(1,2))
or alternatively within the plotting frame:
text(x=vert, y=ylim[1], labels=vert, adj=c(1.1,1), col='blue')
text(x=xlim[1], y=horiz, labels=horiz, adj=c(0.9,-0.1), col='blue')
HTH
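The centering logic above can be wrapped in a small function. This is a sketch: `centered_limits` and its `pad` argument are hypothetical names introduced here, not part of base R, and the red tick labels are just one way to show the line positions.

```r
# Hypothetical helper: limits symmetric around `center` that still contain all of `v`
centered_limits <- function(v, center, pad = 0.1) {
  center + c(-1, 1) * (1 + pad) * max(abs(v - center))
}

set.seed(42)
x <- rnorm(100)
y <- rnorm(100)
vert  <- 0.5   # position of the vertical line
horiz <- 1.3   # position of the horizontal line

xlim <- centered_limits(x, vert)
ylim <- centered_limits(y, horiz)

plot(x, y, xlim = xlim, ylim = ylim)
abline(h = horiz, v = vert, lwd = 2)

# Show the line positions as extra tick labels on both axes
axis(1, at = vert,  labels = vert,  col.axis = "red")
axis(2, at = horiz, labels = horiz, col.axis = "red")
```

The limits are symmetric around each line, so both lines cross in the middle of the frame, and the padding keeps the outermost point away from the border.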

scatter plot specifying color and labelling axis in r

I have following data and plot:
pos <- rep(1:2000, 20)
xv =c(rep(1:20, each = 2000))
# colrs <- unique(xv)
colrs <- xv # edits
yv =rnorm(2000*20, 0.5, 0.1)
xv = lapply(unique(xv), function(x) pos[xv==x])
to.add = cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
plot (bp,yv, pch = "*", col = colrs)
I have few issues in this plot I could not figure out.
(1) I want to use a different color for each group, or two different colors for different groups (i.e. xv), but when I tried a color function the result was just a colorful mixture. I also need to highlight some points (for example bp 4000 to 4500 in blue).
(2) Instead of bp positions I want to put a tick mark and label with the group.
Thank you, appreciate your help.
Edits: with the help of the following answer (using a slightly different approach that also works if I have an unbalanced number of points in each group) I could get a similar plot. One question about colors remains: what if I want to use two alternating colors for alternate groups?
You can solve your colour issue by repeating the colour index as many times as each group has points plotted (this assumes colrs <- unique(xv), i.e. one colour index per group), like so:
plot (bp,yv, pch = "*", col = rep(colrs,each=2000))
The default colour palette (see ?palette or palette() ) will wrap around itself and you might want to specify your own to get 20 distinct colours.
To relabel the x axis, try plotting without the axis and then specifying the points and labels manually.
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20)
If you are trying to squeeze a lot of labels in there, you might have to shrink the text (cex.axis)or spin the labels 90 degrees (las=2).
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20,cex.axis=0.7,las=2)
Result:
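For the follow-up about two alternating colours, and for groups of unequal size, one possible sketch keeps a group index for every point and derives both the colours and the tick positions from it. The `group` vector and the colour names are assumptions made for this example. The layout of `bp` matches the question's construction (2000 positions per group, shifted by 3000 per group).

```r
set.seed(1)
n_groups <- 20
group <- rep(seq_len(n_groups), each = 2000)  # group index for every point
pos   <- rep(1:2000, n_groups)
bp    <- pos + (group - 1) * 3000             # same layout as in the question
yv    <- rnorm(length(bp), 0.5, 0.1)

# Odd groups get the first colour, even groups the second
cols <- c("grey30", "orange")[(group %% 2 == 0) + 1]
# Override the colour for the points to highlight
cols[bp >= 4000 & bp <= 4500] <- "blue"

# One tick per group, placed at the mean position of that group,
# so unbalanced groups are labelled correctly as well
ticks <- tapply(bp, group, mean)

plot(bp, yv, pch = "*", col = cols, xaxt = "n", xlab = "group")
axis(1, at = ticks, labels = names(ticks), cex.axis = 0.7, las = 2)
```

Since the tick positions come from `tapply()`, no hard-coded `seq()` of positions is needed when the group sizes change.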
One way is you could use a nested ifelse.
I'm still learning R, but one way it could be done would look something like:
plot(whatev$x, whatev$y, col=ifelse(xv<2000, "red", ifelse(xv<4000, "yellow", "blue")))
You could nest as many of these as you want to have specificity on the colors and the intervals. The ifelse command is of form ifelse(TEST, True, False).
A simpler way would be to use the unique groups in xv to assign rainbow colors.
colrs <- rainbow(length(unique(xv))) # one color per unique group in xv
plot(whatev$x, whatev$y, col = colrs[as.integer(factor(xv))]) # look up each point's color by its group
I hope I got all that right. I'm still learning R myself.
I'm going to go out on a limb and guess that your real data are something like 2000 values of things from 20 different groups. For instance, heights of 2000 plants of 20 different species. In such a case, you might want to look at the dotplot() function (or as illustrated below, dotplot.table()) in the lattice package.
Generate matrix of hypothetical values:
set.seed(1)
myY <- sapply( seq_len(20), function(x) rnorm(2000, x^(1/3)))
Transpose matrix to get groups as rows
myY <- t(myY)
Provide names of groups to matrix:
dimnames(myY)[[1]]<-paste("group", seq_len(nrow(myY)))
Load lattice package
library(lattice)
Generate dotplot
dotplot(myY, horizontal = FALSE, panel = function(x, y, horizontal, ...) {
panel.dotplot(x = x, y = y, horizontal = horizontal, jitter.x = TRUE,
col = seq_len(20)[x], pch = "*", cex = 1.5)
}, scales = list(x = list(rot = 90))
)
Which looks like (with unfortunate y-axis labeling):
Seeing that @JohnCLK is requesting a way of colouring by values on the x axis, I tried these demos in ggplot2. Each uses a dummy variable that is coded based on the values or ranges to be highlighted in the other variables.
So, first set up the data, as in the question:
pos <- rep(1:2000, 20)
xv <- c(rep(1:20, each = 2000))
yv <- rnorm(2000*20, 0.5, 0.1)
xv <- lapply(unique(xv), function(x) pos[xv==x])
to.add <- cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
Then load ggplot2, prepare a couple of utility functions, and set the default theme:
library("ggplot2")
make.png <- function(p, fName) {
png(fName, width=640, height=480, units="px")
print(p)
dev.off()
}
make.plot <- function(df) {
p <- ggplot(df,
aes(x = bp,
y = yv,
colour = highlight))
p <- p + geom_point()
p <- p + theme(legend.position = "none")
return(p)
}
theme_set( theme_bw() )
Draw a plot which highlights values in a defined range on the vertical axis:
# highlight a horizontal band
df <- data.frame(cbind(bp, yv))
df$highlight <- 0
df$highlight[ df$yv >= 0.4 & df$yv < 0.45 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_horizontal.png")
Next draw a plot which highlights values in a defined range on the x axis, a vertical band:
# highlight a vertical band
df$highlight <- 0
df$highlight[ df$bp >= 38000 & df$bp < 42000 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_vertical.png")
And finally draw a plot which highlights alternating vertical bands, by x value:
# highlight alternating bands
library("gtools")
alt.band.width <- 2000
df$highlight <- as.integer(df$bp / alt.band.width)
df$highlight <- ifelse(odd(df$highlight), 1, 0)
p <- make.plot(df)
print(p)
make.png(p, "demo_alternating.png")
Hope this helps; it was good practice anyway.
