I would like to set the x axis on a ggtree heatmap. This is my code:
p <- ggtree(working_tree, open.angle = 15, size = 0.1) %<+% avian %<+% color +
  aes(color = I(colour)) +
  geom_tippoint(size = 2) +
  geom_tiplab(size = 3, colour = "black") +
  theme_tree2()
# reverse the x axis (time before present) and show positive labels
p1 <- revts(p) + scale_x_continuous(labels = abs)
h1 <- gheatmap(p1, landuse,
               offset = 15, width = 0.05, font.size = 3,
               colnames_position = "top", colnames_angle = 0,
               colnames_offset_y = 0, hjust = 0) +
  scale_fill_manual(breaks = c("Forest", "Jungle rubber", "Rubber", "Oil palm"),
                    values = c("#458B00", "#76EE00", "#1874CD", "#00BFFF"),
                    name = "Land use system", na.value = "white")
and this is the plot I got.
The problem is that when I add the heatmap, the x-axis range automatically changes to 0 to 60. However, the range I want is 0 to 80.
Does anyone know how to do this, or does anyone have experience with it?
Update
I have since solved this by using scale_x_continuous like this:
scale_x_continuous(breaks = seq(-80, 0, 20), labels = abs(seq(-80, 0, 20)))
For anyone interested in plotting a geological timescale in R, I suggest using the deeptime package.
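For example, a minimal sketch of combining deeptime with the tree above (untested here; the argument names xlim, neg and abbrv are from my recollection of the deeptime documentation, so check ?coord_geo):
library(deeptime)
# coord_geo() draws the geological timescale along the axis; neg = TRUE tells it
# that the revts()-transformed ages are negative numbers
revts(p) +
  coord_geo(xlim = c(-80, 0), neg = TRUE, abbrv = FALSE) +
  scale_x_continuous(breaks = seq(-80, 0, 20), labels = abs(seq(-80, 0, 20)))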
I have a dataframe like so:
library(dplyr)   # needed below for the pipe, group_by() and dense_rank()
set.seed(453)
year <- as.factor(c(rep("1998", 20), rep("1999", 16)))
lepsp <- c(letters[1:20], c('a', 'b', 'c'), letters[8:20])
freq <- c(sample(1:15, 20, replace = TRUE), sample(1:18, 16, replace = TRUE))
df <- data.frame(year, lepsp, freq)
df <- df %>%
  group_by(year) %>%
  mutate(rank = dense_rank(-freq))
Frequencies freq of each lepsp within each year are ranked in the rank column: the largest freq values get the smallest rank and the smallest freq values get the largest rank. Rankings are repeated when levels of lepsp have the same abundance.
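For reference, dense_rank() treats ties exactly as described; a quick check:
dplyr::dense_rank(-c(15, 9, 9, 2))
# [1] 1 2 2 3   (largest freq gets rank 1; tied values share a rank)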
I would like to split df into subsets by year and then plot each subset as a panel of a multipanel figure; essentially these are species abundance curves. The x-axis would be rank and the y-axis needs to be freq.
In my real dataframe I have 22 years of data. I would prefer the graphs to be displayed as 2 columns of 4 rows for a total of 8 graphs per page. Essentially I would have to repeat the solution offered here 3 times.
I also need to demarcate the 25%, 50% and 75% quartiles with vertical lines to look like this (desired result):
It would be great if each graph specified the year to which it belongs, but since all panels share the same axes, I do not want the x and y labels repeated for each graph.
I have tried to plot multiple lines on the same graph but it gets messy.
year.vec <- unique(df$year)
plot(sort(df$freq[df$year == year.vec[1]], decreasing = TRUE),
     bg = 1, type = "b", ylab = "Abundance", xlab = "Rank",
     pch = 21, ylim = c(0, max(df$freq)))
for (i in 2:22) {
  points(sort(df$freq[df$year == year.vec[i]], decreasing = TRUE),
         bg = i, type = "b", pch = 21)
}
legend("topright", legend=year.vec, pt.bg=1:22, pch=21)
I have also tried a loop; however, it does not produce an output and is missing some of the arguments I would like to include:
jpeg('pract.jpg')
par(mfrow = c(4, 2)) # 4 rows and 2 columns
for (i in unique(levels(year))) {
plot(df$rank,df$freq, type="p", main = i)
}
dev.off()
Update
(Attempted result)
I found the following code after my post, which gets me a little closer but is still missing some of the features I would like:
library(reshape2)
library(ggplot2)
library (ggthemes)
x <- ggplot(data = df2, aes(x = rank, y = rabun)) +
geom_point(aes(fill = "dodgerblue4")) +
theme_few() +
ylab("Abundance") + xlab("Rank") +
theme(axis.title.x = element_text(size = 15),
axis.title.y = element_text(size = 15),
axis.text.x = element_text(size = 15),
axis.text.y = element_text(size = 15),
plot.title = element_blank(), # we don't want individual plot titles as the facet "strip" will give us this
legend.position = "none", # we don't want a legend either
panel.border = element_rect(fill = NA, color = "darkgrey", size = 1.25, linetype = "solid"),
axis.ticks = element_line(colour = 'darkgrey', size = 1.25, linetype = 'solid')) # here, I just alter to colour and thickness of the plot outline and tick marks. You generally have to do this when faceting, as well as alter the text sizes (= element_text() in theme also)
x
x <- x + facet_wrap( ~ year, ncol = 4)
x
I prefer base R to modify graph features, and have not been able to find a method using base R that meets all my criteria above. Any help is appreciated.
Here's a ggplot approach. First off, I made some more data to get the 3x2 layout:
# note: year must be numeric for the arithmetic below; if it is still a factor,
# convert it first with df$year <- as.numeric(as.character(df$year))
df = rbind(df, mutate(df, year = year + 4), mutate(df, year = year + 8))
Then we do a little manipulation to generate the quantiles and labels by group:
library(tidyr)   # for gather(); dplyr is assumed to be loaded from the question code
df_summ = df %>%
  group_by(year) %>%
  do(as.data.frame(t(quantile(.$rank, probs = c(0, 0.25, 0.5, 0.75)))))
names(df_summ)[2:5] = paste0("q", 0:3)
df_summ_long = gather(df_summ, key = "q", value = "value", -year) %>%
  inner_join(data.frame(q = paste0("q", 0:3),
                        lab = c("Common", "Rare-75% -->", "Rare-50% -->", "Rare-25% -->"),
                        stringsAsFactors = FALSE))
With the data in good shape, plotting is fairly simple:
library(ggthemes)
library(ggplot2)
ggplot(df, aes(x = rank, y = freq)) +
geom_point() +
theme_few() +
labs(y = "Abundance (% of total)", x = "Rank") +
geom_vline(data = df_summ_long[df_summ_long$q != "q0", ], aes(xintercept = value), linetype = 4, size = 0.2) +
geom_text(data = df_summ_long, aes(x = value, y = Inf, label = lab), size = 3, vjust = 1.2, hjust = 0) +
facet_wrap(~ year, ncol = 2)
There's some work left to do - mostly the overlapping rarity labels. It might not be such an issue with your actual data, but if it is, you could pull the max y values into df_summ_long and stagger them a little, using actual y coordinates instead of Inf to place them near the top as I did.
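A minimal sketch of that staggering idea (assuming dplyr is loaded; the column name y_lab is made up for this example):
# alternate label heights near the top of each panel so neighbours don't collide
top_y <- max(df$freq)
df_summ_long <- df_summ_long %>%
  group_by(year) %>%
  mutate(y_lab = top_y - 0.75 * ((seq_len(n()) - 1) %% 2)) %>%
  ungroup()
# then in the plot: geom_text(data = df_summ_long,
#                             aes(x = value, y = y_lab, label = lab),
#                             size = 3, vjust = 0, hjust = 0)
If the full 22 years then need to be split into pages of eight panels (2 columns by 4 rows), ggforce::facet_wrap_paginate(~ year, ncol = 2, nrow = 4, page = 1) is one way to do that, assuming the ggforce package is an option.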
I am trying to recreate the basic temperature trend of this Paleotemperature figure in R. (Original image and data.)
The scale interval of the x-axis changes from 100s of millions of years to 10s of millions to millions, and then to 100s of thousands, and so on, but the tick marks are evenly spaced. The original figure was carefully laid out as five separate graphs in Excel to achieve the spacing. I am trying to get the same x-axis layout in R.
I have tried two basic approaches. The first was to use par(fig=c(x1,x2,y1,y2)) to make five separate graphs placed side by side. The problem is that the intervals between tick marks are not uniform and the labels overlap.
#1
par(fig = c(0, 0.2, 0, 0.5), mar = c(3, 4, 0, 0))
plot(paleo1$T ~ paleo1$Years, col = 'red3', xlim = c(540, 60), bty = 'l',
     type = 'l', ylim = c(-6, 15), ylab = 'Temperature Anomaly (°C)')
abline(0, 0, col = 'gray')
#2
par(fig = c(0.185, 0.4, 0, 0.5), mar = c(3, 0, 0, 0), new = TRUE)
plot(paleo2$T ~ paleo2$Year, col = 'forestgreen', axes = FALSE, type = 'l',
     xlim = c(60, 5), ylab = '', ylim = c(-6, 15))
axis(1, xlim = c(60, 5))
abline(0, 0, col = 'gray')
#etc.
The second approach (and my preferred approach, if possible) is to plot the data in a single graph. This produces non-uniform distances between tick marks because they follow their "natural" order. (Edit: example data added, as well as a link to the full data set.)
years <- c(500,400,300,200,100,60,50,40,30,20,10,5,4,3,2,1)
temps <- c(13.66, 8.6, -2.16, 3.94, 8.44, 5.28, 12.98, 8.6, 5, 5.34, 3.66, 2.65, 0.78, 0.25, -1.51, -1.77)
test <- data.frame(years, temps)
names(test) <- c('Year','T')
# The full csv file can be used with this line instead of the above.
# test <- read.csv('https://www.dropbox.com/s/u0dfmlvzk0ztpkv/paleo_test.csv?dl=1')
plot(test$T ~ test$Year, type = 'l', xaxt = 'n', xlim = c(520, 1), bty = 'l',
     ylim = c(-5, 15), xlab = "", ylab = 'Temperature Anomaly (°C)')
ticklabels <- c(500, 400, 300, 200, 100, 60, 50, 40, 30, 20, 10, 5, 4, 3, 2, 1)
axis(1, at = ticklabels)
Adding log='x' to plot comes closest, but the intervals between ticks are still not even and the actual scale is, of course, not a log scale.
My examples only go down to 1 million years because I am trying to solve the problem first, but the goal is to match the original figure above. I am open to ggplot solutions, although I am only fleetingly familiar with it.
I will strike a different note by saying: don't. In my experience, the harder something is to do in ggplot2 (and, to a lesser extent, base graphics), the less likely it is to be a good idea. Here, the problem is that repeatedly changing the scale like this is more likely to lead the viewer astray.
Instead, I recommend using a log scale and manually setting your cutoffs.
First, here is some longer data, just to cover the full likely scale of your question:
longerTest <-
data.frame(
Year = rep(1:9, times = 6) * rep(10^(3:8), each = 9)
, T = rnorm(6*9))
Then, I picked some cutoffs to place the labels at in the plot. These can be adjusted to whatever you want, but are at least a starting point for reasonably spaced ticks:
forLabels <-
rep(c(1,2,5), times = 6) * rep(10^(3:8), each = 3)
Then, I manually set some suffixes to append to the labels. Thus, instead of having to say "Thousands of years" under part of the axis, you can just label those values with a "k". Each order of magnitude gets a value. Note that the names are just to help keep things straight: below I just use the index to match, so if you skip the first two you will need to adjust the indexing below.
toAppend <-
c("1" = "0"
, "2" = "00"
, "3" = "k"
, "4" = "0k"
, "5" = "00k"
, "6" = "M"
, "7" = "0M"
, "8" = "00M")
Then, I change my forLabels into the text versions I want to use by grabbing the first digit, and concatenating with the correct suffix from above.
myLabels <-
paste0(
substr(as.character(forLabels), 1, 1)
, toAppend[floor(log10(forLabels))]
)
This gives:
[1] "1k" "2k" "5k" "10k" "20k" "50k" "100k" "200k" "500k" "1M" "2M"
[12] "5M" "10M" "20M" "50M" "100M" "200M" "500M"
You could likely use these for base graphics, but getting the log scale to do what you want is sometimes tricky. Instead, since you said you are open to a ggplot2 solution, I grabbed this modified log scale from this answer to get a log scale that runs from big to small:
library("scales")
reverselog_trans <- function(base = exp(1)) {
trans <- function(x) -log(x, base)
inv <- function(x) base^(-x)
trans_new(paste0("reverselog-", format(base)), trans, inv,
log_breaks(base = base),
domain = c(1e-100, Inf))
}
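A quick check of what the transform does (trans objects expose their transform and inverse functions):
rl <- reverselog_trans(10)
rl$transform(c(10, 1000, 1e6))  # -1 -3 -6: larger years map to more negative axis positions
rl$inverse(-3)                  # 1000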
Then, just pass in the data, and set the scale with the desired breaks:
ggplot(longerTest
, aes(x = Year
, y = T)) +
geom_line() +
scale_x_continuous(
breaks = forLabels
, labels = myLabels
, trans=reverselog_trans(10)
)
This gives a plot with a consistent scale that is labelled far more uniformly.
If you want colors, you can do that using cut:
ggplot(longerTest
, aes(x = Year
, y = T
, col = cut(log10(Year)
, breaks = c(3,6,9)
, labels = c("Thousands", "Millions")
, include.lowest = TRUE)
, group = 1
)) +
geom_line() +
scale_x_continuous(
breaks = forLabels
, labels = myLabels
, trans=reverselog_trans(10)
) +
scale_color_brewer(palette = "Set1"
, name = "How long ago?")
Here is a version using facet_wrap to create different scales. I used six period bins here, but you can set whatever thresholds you want instead.
longerTest$Period <-
cut(log10(longerTest$Year)
, breaks = c(3, 4, 5, 6, 7, 8, 9)
, labels = paste(rep(c("", "Ten", "Hundred"), times = 2)
, rep(c("Thousands", "Millions"), each = 3) )
, include.lowest = TRUE)
longerTest$Period <-
factor(longerTest$Period
, levels = rev(levels(longerTest$Period)))
newBreaks <-
rep(c(2,4,6,8, 10), times = 6) * rep(10^(3:8), each = 5)
newLabels <-
paste0(
substr(as.character(newBreaks), 1, 1)
, toAppend[floor(log10(newBreaks))]
)
ggplot(longerTest
, aes(x = Year
, y = T
)) +
geom_line() +
facet_wrap(~Period, scales = "free_x") +
scale_x_reverse(
breaks = newBreaks
, labels = newLabels
)
which gives one panel per period bin.
Here is a start:
#define the panels
breaks <- c(-Inf, 8, 80, Inf)
test$panel <- cut(test$Year, breaks, labels = FALSE)
test$panel <- ordered(test$panel, levels = unique(test$panel))
#for correct scales
dummydat <- data.frame(Year = c(0, 8, 8, 80, 80, max(test$Year)),
T = mean(test$T),
panel = ordered(rep(1:3, each = 2), levels = levels(test$panel)))
library(ggplot2)
ggplot(test, aes(x = Year, y = T, color = panel)) +
geom_line() +
geom_blank(data = dummydat) + #for correct scales
facet_wrap(~ panel, nrow = 1, scales = "free_x") +
theme_minimal() + #choose a theme you like
theme(legend.position = "none", #and customize it
panel.spacing.x = unit(0, "cm"),
strip.text = element_blank() ,
strip.background = element_blank()) +
scale_x_reverse(expand = c(0, 0))
Here's a basic example of doing it with separate plots using gridExtra. This may be useful to combine with extra grobs, for instance to create the epoch boxes across the top (not done here). If so desired, this might be best combined with Roland's solution.
# ggplot with gridExtra
library('ggplot2')
library('gridExtra')
library('grid')
d1 <- test[1:5, ]
d2 <- test[6:11, ]
d3 <- test[12:16, ]
plot1 <- ggplot(d1, aes(y = T, x = seq_len(nrow(d1)))) +
  geom_line() +
  ylim(c(-5, 15)) +
  theme_minimal() +
  theme(axis.title.x = element_blank(),
        plot.margin = unit(c(1, 0, 1, 1), "cm")) +
  # label the evenly spaced positions with the original Year values (one option; adjust as needed)
  scale_x_continuous(breaks = seq_len(nrow(d1)), labels = d1$Year)
plot2 <- ggplot(d2, aes(y = T, x = seq_len(nrow(d2)))) +
geom_line() +
ylim(c(-5, 15)) +
theme_minimal() +
theme(axis.text.y = element_blank(),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.x = element_blank(),
plot.margin = unit(c(1,0,1,0), "cm"))
plot3 <- ggplot(d3, aes(y = T, x = seq_len(nrow(d3)))) +
geom_line() +
theme_minimal() +
theme(axis.text.y = element_blank(),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.x = element_blank(),
plot.margin = unit(c(1,0,1,0), "cm")) +
ylim(c(-5, 15))
# put together
grid.arrange(plot1, plot2, plot3, nrow = 1,
widths = c(1.5,1,1)) # allow extra width for first plot which has y axis
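As a hedged extension of that idea, grid.arrange() also accepts top and bottom annotations, which is one place a shared axis title (or, with custom grobs, the epoch boxes) could go; the labels here are placeholders:
grid.arrange(plot1, plot2, plot3, nrow = 1,
             widths = c(1.5, 1, 1),
             top = textGrob("Epoch boxes could be drawn here with custom grobs",
                            gp = gpar(fontsize = 10)),
             bottom = textGrob("Millions of years ago"))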
I'd like to make small returns in this plot more visible. The most appropriate function seems to be scale_colour_gradient2, but this washes out the small returns, which happen most often. Using limits helped but I couldn't work out how to set oob (out of bounds) so it would just have a "saturated" value rather than be grey. And the log transform just made small values stand out. Has someone else figured out how to do this elegantly?
library(zoo)
library(ggplot2)
library(tseries)
spx <- get.hist.quote(instrument="^gspc", start="2000-01-01",
end="2013-12-14", quote="AdjClose",
provider="yahoo", origin="1970-01-01",
compression="d", retclass="zoo")
spx.rtn <- diff(log(spx$AdjClose)) * 100
rtn.data <- data.frame(x=time(spx.rtn),yend=spx.rtn)
p <- ggplot(rtn.data) +
geom_segment(aes(x=x,xend=x,y=0,yend=yend,colour=yend)) +
xlab("") + ylab("S&P 500 Daily Return %") +
theme(legend.position="null",axis.title.x=element_blank())
# low returns invisible
p + scale_colour_gradient2(low="blue",high="red")
# extreme values are grey
p + scale_colour_gradient2(low="blue",high="red",limits=c(-3,3))
# log transform returns has opposite problem
max_val <- max(log(abs(spx.rtn)))
values <- seq(-max_val, max_val, length = 11)
library(RColorBrewer)
p + scale_colour_gradientn(colours = brewer_pal(type="div",pal="RdBu")(11),
values = values
, rescaler = function(x, ...) sign(x)*log(abs(x)), oob = identity)
Here is another possibility, using scale_colour_gradientn. Mapping of colours is set using values = rescale(...) so that resolution is higher for values close to zero. I had a look at some colour scales here: http://colorbrewer2.org. I chose a 5-class diverging colour scheme, RdBu, from red to blue via near-white. There might be other scales that suit your needs better, this is just to show the basic principles.
# check the colours
library(RColorBrewer)
# cols <- brewer_pal(pal = "RdBu")(5) # not valid in 1.1-2
# reverse the palette so blue sits at the low (negative) end, as described below
cols <- rev(brewer.pal(n = 5, name = "RdBu"))
cols
# [1] "#0571B0" "#92C5DE" "#F7F7F7" "#F4A582" "#CA0020"
# show_col(cols) # not valid in 1.1-2
display.brewer.pal(n = 5, name = "RdBu")
Using rescale, -10 corresponds to blue #0571B0; -1 to light blue #92C5DE; 0 to light grey #F7F7F7; 1 to light red #F4A582; and 10 to red #CA0020. Values between -1 and 1 are interpolated between light blue and light red, etc. Thus, the mapping is not linear and resolution is higher for small values.
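Concretely, those five anchor points land at these positions on the 0-1 colour scale:
library(scales)
rescale(c(-10, -1, 0, 1, 10))
# [1] 0.00 0.45 0.50 0.55 1.00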
library(ggplot2)
library(scales) # needed for rescale
ggplot(rtn.data) +
geom_segment(aes(x = x, xend = x, y = 0, yend = yend, colour = yend)) +
xlab("") + ylab("S&P 500 Daily Return %") +
scale_colour_gradientn(colours = cols,
values = rescale(c(-10, -1, 0, 1, 10)),
guide = "colorbar", limits=c(-10, 10)) +
theme(legend.position = "null", axis.title.x = element_blank())
how about:
p + scale_colour_gradient2(low="blue",high="red",mid="purple")
or
p + scale_colour_gradient2(low="blue",high="red",mid="darkgrey")