I'm trying to generate a forest plot with the forestploter package to visualize stratified hazard ratios. Although I followed the instructions in the package guide, the plot is not aligned properly when I select only a few variables.
For example, when I include only gender and age, the plot is misaligned.
library(grid)
library(forestploter)
# Read provided sample example data
dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))
# Keep needed columns
dt <- dt[,1:6]
# indent the subgroup if there is a number in the placebo column
dt$Subgroup <- ifelse(is.na(dt$Placebo),
                      dt$Subgroup,
                      paste0(" ", dt$Subgroup))
# Replace NA with blank; otherwise NA will be converted to the string "NA".
dt$Treatment <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$Placebo <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
dt$se <- (log(dt$hi) - log(dt$est))/1.96
# Add blank column for the forest plot to display CI.
# Adjust the column width with space.
dt$` ` <- paste(rep(" ", 20), collapse = " ")
# Create confidence interval column to display
dt$`HR (95% CI)` <- ifelse(is.na(dt$se), "",
                           sprintf("%.2f (%.2f to %.2f)",
                                   dt$est, dt$low, dt$hi))
head(dt)
The plot is misaligned when I choose only two variables:
p <- forest(dt[c(2:7),c(1:3, 8:9)],
            est = dt$est,
            lower = dt$low,
            upper = dt$hi,
            sizes = 1,
            ci_column = 4,
            ref_line = 1,
            arrow_lab = c("Placebo Better", "Treatment Better"),
            xlim = c(0, 4),
            ticks_at = c(0.5, 1, 2, 3),
            footnote = "This is the demo data. Please feel free to change\nanything you want.")
# Print plot
plot(p)
Is there a way to specify certain variables for the plot while preserving proper alignment?
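One likely cause, offered here as a sketch rather than a confirmed diagnosis: the data frame passed to forest() is subset to rows 2:7, but est, lower and upper are still the full-length columns, so the intervals no longer line up with the displayed rows. The snippet below reuses one row index for the data and for the three vectors. The names rows, cols and dt_sub are introduced only for illustration, and the se column is skipped because sizes = 1 is used.

```r
library(grid)
library(forestploter)

# Same preparation as in the question, using the package's example data
dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))
dt <- dt[, 1:6]
dt$Subgroup <- ifelse(is.na(dt$Placebo), dt$Subgroup, paste0(" ", dt$Subgroup))
dt$Treatment <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$Placebo <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
dt$` ` <- paste(rep(" ", 20), collapse = " ")
dt$`HR (95% CI)` <- ifelse(is.na(dt$est), "",
                           sprintf("%.2f (%.2f to %.2f)", dt$est, dt$low, dt$hi))

# Hypothetical helper names: one row index shared by the table and the estimates
rows <- 2:7
cols <- c("Subgroup", "Treatment", "Placebo", " ", "HR (95% CI)")
dt_sub <- dt[rows, cols]

p <- forest(dt_sub,
            est = dt$est[rows],    # subset with the same rows as the table
            lower = dt$low[rows],
            upper = dt$hi[rows],
            sizes = 1,
            ci_column = 4,         # position of the blank column within dt_sub
            ref_line = 1,
            arrow_lab = c("Placebo Better", "Treatment Better"),
            xlim = c(0, 4),
            ticks_at = c(0.5, 1, 2, 3))
plot(p)
```

Keeping the index in a single variable means the table and the estimates cannot drift apart when the selection changes.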
I am using the R package circlize to create a circos plot.
I am aiming to create something similar to Figure 2 in this paper: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004812.
I would like to shade specified parts of the chromosomes with different, manually chosen colours, but I am struggling.
Reproducible code:
### load packages
library("tidyverse")
library("circlize")
### Generate mock data
# Chromosome sizes - genome with 5 chromosomes size 1-5kb
chrom <- c(1,2,3,4,5)
start <- c(0,0,0,0,0)
end <- c(1000,1700,2200,3100,5000)
chr_sizes_df <- data.frame(chrom,start,end)
# Areas of interest - where I want 'shade_col' shading
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
chr_regions_df <- data.frame(chr,start,end)
# Recombinations - to be depicted with lines connecting chromosomes
chr1 <- c(1,2,2,3,3,3,3,4,4,5,5,5,5)
chr1_pos <- c(100,150,170,20,2100,900,950,200,3000,100,3100,3300,4900)
chr2 <- c(1,4,2,1,3,3,5,5,4,3,5,4,2)
chr2_pos <- c(100,3000,170,100,100,900,3200,4800, 3050,10,3100,3300,40)
location <- c("Non coding", "Coding", "Non coding", "Non coding", "Coding", "Coding", "Coding", "Non coding", "Non coding", "Non coding", "Coding", "Coding", "Non coding")
sv_df <- data.frame(chr1,chr1_pos,chr2,chr2_pos,location)
# SNPs - to be depicted with dots or lines
chrom <- c(1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5)
pos <- c(350,600,200,650,700,300,1100,1500,2000,400,1500,1800,2000,2700,200,1000,1050,2000,2500,4950)
snp_df <- data.frame(chrom,pos)
### Prepare for plot
# Generate colour scheme
sv_df$location_col <- ifelse(sv_df$location=="Coding", "#FB8072",
                             ifelse(sv_df$location=="Non coding", "#80B1D3",
                                    "#e9e9e9")
)
# Specify chromosome block shading
shade_col <- "#3F75AB"
# Format rearrangement data
nuc1 <- sv_df %>% select(chr1,chr1_pos) # Start positions
nuc2 <- sv_df %>% select(chr2,chr2_pos) # End positions
### Generating plot
## Basic circos graphic parameters
circos.clear()
circos.par(cell.padding=c(0,0,0,0),
           track.margin=c(0,0.05),
           start.degree = 90,
           gap.degree = 3,
           clock.wise = TRUE)
## Sector details
circos.initialize(factors = chr_sizes_df$chrom,
                  xlim = cbind(chr_sizes_df$start, chr_sizes_df$end))
## Generate basic outline with chromosomes
circos.track(ylim=c(0, 1), panel.fun=function(x, y) {
  chr=CELL_META$sector.index
  xlim=CELL_META$xlim
  ylim=CELL_META$ylim
  circos.text(mean(xlim), mean(ylim), chr)
},bg.col="#cde3f9", bg.border=TRUE, track.height=0.1)
## Add recombinations - coloured by coding vs non-coding etc
circos.genomicLink(nuc1, nuc2,
                   col=sv_df$location_col,
                   h.ratio=0.6,
                   lwd=3)
The above code produces the plot shown below:
I want to use chr_regions_df to specify the chromosome areas to shade with shade_col. I have tried a few things: draw.sector doesn't work well because it needs angles rather than positions, and the angles are hard to work out. There are cytoband options via circos.initializeWithIdeogram(), but this seems to use pre-specified cytoband formats for certain species rather than custom-made areas for shading as in my use case (which is also why I couldn't use the question "supplying user defined color in r circlize package").
Many thanks for your help.
To draw custom colored areas within chromosomes, use circos.genomicTrackPlotRegion, where you need to provide a bed-like data frame with an additional column specifying the color to be used for each area.
#the first column should match the chromosome names used in 'circos.initialize'
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
#chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
shade_col <- c("blue","red","blue","red","blue","red","blue","red","blue","red","blue","red")
chr_regions_df <- data.frame(chrom_num,start,end,shade_col)
After running circos.initialize, draw the chromosomes with their shaded area. In panel.fun, the first argument (region) contains the coordinates of each feature while the second (value) contains all but the first 3 columns of the data frame.
circos.genomicTrackPlotRegion(chr_regions_df, ylim = c(0, 1),
                              panel.fun = function(region, value, ...) {
                                col = value$shade_col
                                circos.genomicRect(region, value,
                                                   ybottom = 0, ytop = 1,
                                                   col = col, border = NA)
                                xlim = get.cell.meta.data("xlim")
                                circos.rect(xlim[1], 0, xlim[2], 1, border = "black")
                                ylim = get.cell.meta.data("ylim")
                                chr = get.current.sector.index()
                                circos.text(mean(xlim), mean(ylim), chr)
                              }, bg.col = "#cde3f9", bg.border=TRUE, track.height=0.1)
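For completeness, here is a minimal end-to-end sketch that combines the initialization from the question with the shaded track above. It rests on a few assumptions that are not in the original posts: chromosome names are stored as character strings in both data frames, a single colour ("#3F75AB", the shade_col from the question) is recycled for every region, and a bounds check (the names size_lookup and in_bounds are illustrative) is added before plotting.

```r
library(circlize)

# Chromosome sizes from the question; names kept as character (assumption)
chr_sizes_df <- data.frame(chrom = as.character(1:5),
                           start = 0,
                           end = c(1000, 1700, 2200, 3100, 5000),
                           stringsAsFactors = FALSE)

# Regions to shade: bed-like columns first, colour as an extra column
chr_regions_df <- data.frame(
  chrom = as.character(c(1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5)),
  start = c(0, 900, 0, 1550, 0, 800, 2000, 0, 2800, 0, 3000, 4800),
  end = c(150, 1000, 185, 1700, 210, 1000, 2200, 300, 3100, 400, 3300, 5000),
  shade_col = "#3F75AB",   # one colour recycled for all regions
  stringsAsFactors = FALSE)

# Illustrative sanity check: every region must lie inside its chromosome
size_lookup <- setNames(chr_sizes_df$end, chr_sizes_df$chrom)
in_bounds <- chr_regions_df$start >= 0 &
  chr_regions_df$end <= size_lookup[chr_regions_df$chrom]
stopifnot(all(in_bounds))

circos.clear()
circos.par(cell.padding = c(0, 0, 0, 0), track.margin = c(0, 0.05),
           start.degree = 90, gap.degree = 3, clock.wise = TRUE)
circos.initialize(factors = chr_sizes_df$chrom,
                  xlim = cbind(chr_sizes_df$start, chr_sizes_df$end))
circos.genomicTrackPlotRegion(chr_regions_df, ylim = c(0, 1),
  panel.fun = function(region, value, ...) {
    # Shade the listed regions, then label the sector
    circos.genomicRect(region, value, ybottom = 0, ytop = 1,
                       col = value$shade_col, border = NA)
    circos.text(mean(CELL_META$xlim), mean(CELL_META$ylim),
                CELL_META$sector.index)
  }, bg.col = "#cde3f9", bg.border = TRUE, track.height = 0.1)
circos.clear()
```

To use different colours per region, replace the single shade_col value with a vector of the same length as the number of rows.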
I've taken the code from the site below to make a correlation matrix heatmap. How do I format the numbers in the heatmap to show only 2 decimal places?
http://blog.revolutionanalytics.com/2014/08/quantitative-finance-applications-in-r-8.html
library(xts)
library(Quandl)
my_start_date <- "1998-01-05"
SP500.Q <- Quandl("YAHOO/INDEX_GSPC", start_date = my_start_date, type = "xts")
RUSS2000.Q <- Quandl("YAHOO/INDEX_RUT", start_date = my_start_date, type = "xts")
NIKKEI.Q <- Quandl("NIKKEI/INDEX", start_date = my_start_date, type = "xts")
HANG_SENG.Q <- Quandl("YAHOO/INDEX_HSI", start_date = my_start_date, type = "xts")
DAX.Q <- Quandl("YAHOO/INDEX_GDAXI", start_date = my_start_date, type = "xts")
CAC.Q <- Quandl("YAHOO/INDEX_FCHI", start_date = my_start_date, type = "xts")
KOSPI.Q <- Quandl("YAHOO/INDEX_KS11", start_date = my_start_date, type = "xts")
# Depending on the index, the final price for each day is either
# "Adjusted Close" or "Close Price". Extract this single column for each:
SP500 <- SP500.Q[,"Adjusted Close"]
RUSS2000 <- RUSS2000.Q[,"Adjusted Close"]
DAX <- DAX.Q[,"Adjusted Close"]
CAC <- CAC.Q[,"Adjusted Close"]
KOSPI <- KOSPI.Q[,"Adjusted Close"]
NIKKEI <- NIKKEI.Q[,"Close Price"]
HANG_SENG <- HANG_SENG.Q[,"Adjusted Close"]
# The xts merge(.) function will only accept two series at a time.
# We can, however, merge multiple columns by downcasting to *zoo* objects.
# Remark: "all = FALSE" uses an inner join to merge the data.
z <- merge(as.zoo(SP500), as.zoo(RUSS2000), as.zoo(DAX), as.zoo(CAC),
as.zoo(KOSPI), as.zoo(NIKKEI), as.zoo(HANG_SENG), all = FALSE)
# Set the column names; these will be used in the heat maps:
myColnames <- c("SP500","RUSS2000","DAX","CAC","KOSPI","NIKKEI","HANG_SENG")
colnames(z) <- myColnames
# Cast back to an xts object:
mktPrices <- as.xts(z)
# Next, calculate log returns:
mktRtns <- diff(log(mktPrices), lag = 1)
head(mktRtns)
mktRtns <- mktRtns[-1, ] # Remove resulting NA in the 1st row
require(gplots)
generate_heat_map <- function(correlationMatrix, title)
{
  heatmap.2(x = correlationMatrix, # the correlation matrix input
            cellnote = correlationMatrix, # places correlation value in each cell
            main = title, # heat map title
            symm = TRUE, # configure diagram as standard correlation matrix
            dendrogram="none", # do not draw a row dendrogram
            Rowv = FALSE, # keep ordering consistent
            trace="none", # turns off trace lines inside the heat map
            density.info="none", # turns off density plot inside color legend
            notecol="black") # set font color of cell labels to black
}
corr1 <- cor(mktRtns) * 100
generate_heat_map(corr1, "Correlations of World Market Returns, Jan 1998 - Present")
You might want the color values to use the full unrounded number, but show a rounded number.
In that case do this...
generate_heat_map <- function(correlationMatrix, title)
{
  heatmap.2(x = correlationMatrix, # the correlation matrix input
            cellnote = round(correlationMatrix, 2), # places correlation value in each cell
            main = title, # heat map title
            symm = TRUE, # configure diagram as standard correlation matrix
            dendrogram="none", # do not draw a row dendrogram
            Rowv = FALSE, # keep ordering consistent
            trace="none", # turns off trace lines inside the heat map
            density.info="none", # turns off density plot inside color legend
            notecol="black") # set font color of cell labels to black
}
If you want the colors to match the displayed numbers exactly, leave the existing function alone and change the input:
corr1 <- round(cor(mktRtns) * 100, 2)
generate_heat_map(corr1, "Correlations of World Market Returns, Jan 1998 - Present")
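One detail worth noting: round() returns numbers, so trailing zeros are dropped when the labels are drawn (100 rather than 100.00). If a fixed two-decimal display is wanted, a possible variation is to pass a character matrix as cellnote. The sketch below uses the built-in mtcars data as a stand-in for the market returns, and the names corr and note are illustrative only.

```r
# Stand-in correlation matrix (mtcars replaces the market data here)
corr <- cor(mtcars[, 1:4]) * 100

# Format every cell with exactly two decimals, keeping the matrix shape
note <- matrix(sprintf("%.2f", corr),
               nrow = nrow(corr),
               dimnames = dimnames(corr))
print(note)

# Optional: draw the heat map if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x = corr,          # colours use the unrounded values
                    cellnote = note,   # labels use the formatted strings
                    symm = TRUE,
                    dendrogram = "none",
                    Rowv = FALSE,
                    trace = "none",
                    density.info = "none",
                    notecol = "black")
}
```

The colours still come from the unrounded matrix, as in the first option above; only the labels change.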
This is an advanced question.
I use my own layout for the chartSeries quantmod function, and I can even create my own newTA. Everything works fine. But ...
What I want to do but can't:
a) Manipulate the legend of each of the 3 charts:
- move it to another corner (from "topleft" to "topright")
- change the content
- remove it completely if needed
b) My indicator generates 2 legends:
value1
value2
Same as above: how could I modify them? How could I delete them?
c) Control the position and range of the y-axis (place it on the left or right, or even remove it). The same applies when there is a secondary axis on the graph.
d) Modify the main legend (the one in the top right, where the range of dates is written).
A working sample code:
# Load Library
library(quantmod)
# Get Data
getSymbols("SPY", src="yahoo", from = "2010-01-01")
# Create my indicator (30 values)
value1 <- rnorm(30, mean = 50, sd = 25)
value2 <- rnorm(30, mean = 50, sd = 25)
# merge with the first 30 rows of SPY
dataset <- merge(first(SPY, n = 30),
value1,
value2)
# **** dataset now has 8 columns:
# - Open
# - High
# - Low
# - Close
# - Volume
# - Adjusted
# - value1 (my indicator value 1)
# - value2 (my indicator value 2)
#
# create my TA function - this could also be achieved using the preFUN option of newTA
myTAfun <- function(a){
  # input: a: function will receive whole dataset
  a[,7:8] # just return my indicator values
}
# create my indicator to add to chartSeries
newMyTA <- newTA(FUN = myTAfun, # chartSeries will pass whole dataset,
                 # I just want to process the last 2 columns
                 lty = c("solid", "dotted"),
                 legend.name = "My_TA",
                 col = c("red", "blue")
)
# define my layout
layout(matrix(c(1, 2, 3), 3, 1),
       heights = c(2.5, 1, 1.5)
)
# create the chart
chartSeries(dataset,
            type = "candlesticks",
            main = "",
            show.grid = FALSE,
            name = "My_Indicator_Name",
            layout = NULL, # bypass internal layout
            up.col = "blue",
            dn.col = "red",
            TA = c(newMyTA(),
                   addVo()
            ),
            plot = TRUE,
            theme = chartTheme("wsj")
)
I have tried using the legend command, and also the option legend.name (with very limited control of the output).
I have had a look at the chob object returned by chartSeries, but I can't figure out what to do next ...
After some time learning a little bit more about R internals, S3 and S4 objects, and quantmod package, I've come up with the solution. It can be used to change anything in the graph.
A) If the legend belongs to a secondary indicator window:
Do not print the chartSeries (set the option plot = FALSE) and get the returned "chob" object.
In one of the slots of the "chob" object there is a "chobTA" object with 2 parameters related to the legend. Set them to NULL.
Finally, call the hidden function chartSeries.chob.
In my case:
#get the chob object
my.chob <- chartSeries(dataset,
                       type = "candlesticks",
                       main = "",
                       show.grid = FALSE,
                       name = "My_Indicator_Name",
                       layout = NULL, # bypass internal layout
                       up.col = "blue",
                       dn.col = "red",
                       TA = c(newMyTA(),
                              addVo()
                       ),
                       plot = FALSE, # do not plot, just get the chob
                       #plot = TRUE,
                       theme = chartTheme("wsj")
)
#if the legend is in a secondary window, and represents
#an indicator created with newTA(), this will work:
#(slots of S4 objects are accessed with @, list elements with $)
my.chob@passed.args$TA[[1]]@params$legend <- NULL
my.chob@passed.args$TA[[1]]@params$legend.name <- NULL
quantmod:::chartSeries.chob(my.chob)
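The nested assignment above mixes S4 slot access and list access. As a small illustration of that mechanism only, here is a toy version that does not use quantmod: the classes ToyChob and ToyTA are hypothetical stand-ins for "chob" and "chobTA", with just enough structure to show how setting a list element inside a slot to NULL removes it.

```r
library(methods)

# Hypothetical stand-ins for quantmod's "chobTA" and "chob" classes
setClass("ToyTA", representation(params = "list"))
setClass("ToyChob", representation(passed.args = "list"))

ta <- new("ToyTA",
          params = list(legend = "auto", legend.name = "My_TA", col = "red"))
toy <- new("ToyChob", passed.args = list(TA = list(ta)))

# Slots are reached with @; list elements with $ or [[ ]].
# Assigning NULL to a list element deletes that element.
toy@passed.args$TA[[1]]@params$legend <- NULL
toy@passed.args$TA[[1]]@params$legend.name <- NULL

remaining <- names(toy@passed.args$TA[[1]]@params)
print(remaining)
```

Only the col entry is left in params afterwards, which mirrors what happens to the legend parameters of the real object.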
B) In any other case, it is possible to modify "chartSeries.chob", "chartTA", "chartBBands", etc. and then call chartSeries.chob.
In my case:
fixInNamespace("chartSeries.chob", ns = "quantmod")
quantmod:::chartSeries.chob(my.chob)
It is enough to comment out the lines related to legend() by adding "#" at the beginning of each one.
That's it.