I am using the R package circlize to create a circos plot.
I am aiming to create something similar to Figure 2 in this paper: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004812.
I would like to specify which parts of each chromosome are shaded, using manually entered colours, but I am struggling.
Reproducible code:
### load packages
library("tidyverse")
library("circlize")
### Generate mock data
# Chromosome sizes - genome with 5 chromosomes size 1-5kb
chrom <- c(1,2,3,4,5)
start <- c(0,0,0,0,0)
end <- c(1000,1700,2200,3100,5000)
chr_sizes_df <- data.frame(chrom,start,end)
# Areas of interest - where I want 'shade_col' shading
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
chr_regions_df <- data.frame(chr,start,end)
# Recombinations - to be depicted with lines connecting chromosomes
chr1 <- c(1,2,2,3,3,3,3,4,4,5,5,5,5)
chr1_pos <- c(100,150,170,20,2100,900,950,200,3000,100,3100,3300,4900)
chr2 <- c(1,4,2,1,3,3,5,5,4,3,5,4,2)
chr2_pos <- c(100,3000,170,100,100,900,3200,4800, 3050,10,3100,3300,40)
location <- c("Non coding", "Coding", "Non coding", "Non coding", "Coding", "Coding", "Coding", "Non coding", "Non coding", "Non coding", "Coding", "Coding", "Non coding")
sv_df <- data.frame(chr1,chr1_pos,chr2,chr2_pos,location)
# SNPs - to be depicted with dots or lines
chrom <- c(1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5)
pos <- c(350,600,200,650,700,300,1100,1500,2000,400,1500,1800,2000,2700,200,1000,1050,2000,2500,4950)
snp_df <- data.frame(chrom,pos)
### Prepare for plot
# Generate colour scheme
sv_df$location_col <- ifelse(sv_df$location=="Coding", "#FB8072",
ifelse(sv_df$location=="Non coding", "#80B1D3",
"#e9e9e9")
)
# Specify chromosome block shading
shade_col <- "#3F75AB"
# Format rearrangement data
nuc1 <- sv_df %>% select(chr1,chr1_pos) # Start positions
nuc2 <- sv_df %>% select(chr2,chr2_pos) # End positions
### Generating plot
## Basic circos graphic parameters
circos.clear()
circos.par(cell.padding=c(0,0,0,0),
track.margin=c(0,0.05),
start.degree = 90,
gap.degree = 3,
clock.wise = TRUE)
## Sector details
circos.initialize(factors = chr_sizes_df$chrom,
xlim = cbind(chr_sizes_df$start, chr_sizes_df$end))
## Generate basic outline with chromosomes
circos.track(ylim=c(0, 1), panel.fun=function(x, y) {
chr=CELL_META$sector.index
xlim=CELL_META$xlim
ylim=CELL_META$ylim
circos.text(mean(xlim), mean(ylim), chr)
},bg.col="#cde3f9", bg.border=TRUE, track.height=0.1)
## Add recombinations - coloured by coding vs non-coding etc
circos.genomicLink(nuc1, nuc2,
col=sv_df$location_col,
h.ratio=0.6,
lwd=3)
The above code produces the plot shown below:
I want to use chr_regions_df to specify the chromosome areas to shade with shade_col. I have tried a few things: draw.sector doesn't work well because it needs angles rather than positions, which are hard to work out. There are cytoband options in circos.initializeWithIdeogram(), but these seem to rely on pre-specified cytoband formats for certain species rather than custom-made shading areas as in my use case (which is also why I couldn't use "supplying user defined color in r circlize package").
Many thanks for your help.
To draw custom colored areas within chromosomes, use circos.genomicTrackPlotRegion, where you need to provide a bed-like data frame with an additional column specifying the color to be used for each area.
#the first column should match the chromosome names used in 'circos.initialize'
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
#chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
shade_col <- c("blue","red","blue","red","blue","red","blue","red","blue","red","blue","red")
chr_regions_df <- data.frame(chrom_num,start,end,shade_col)
After running circos.initialize, draw the chromosomes with their shaded area. In panel.fun, the first argument (region) contains the coordinates of each feature while the second (value) contains all but the first 3 columns of the data frame.
circos.genomicTrackPlotRegion(chr_regions_df, ylim = c(0, 1),
panel.fun = function(region, value, ...) {
col = value$shade_col
circos.genomicRect(region, value,
ybottom = 0, ytop = 1,
col = col, border = NA)
xlim = get.cell.meta.data("xlim")
circos.rect(xlim[1], 0, xlim[2], 1, border = "black")
ylim = get.cell.meta.data("ylim")
chr = get.current.sector.index()
circos.text(mean(xlim), mean(ylim), chr)
}, bg.col = "#cde3f9", bg.border=TRUE, track.height=0.1)
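The asker's original chr_regions_df labels chromosomes as "chr1" to "chr5", whereas the sectors were initialised as 1 to 5. The following is a minimal sketch, assuming every label has the form "chr<number>", of deriving a matching first column instead of retyping it. The name sector_names is introduced here only for the sanity check.

```r
# Region table as in the question, with "chr"-prefixed labels
chr   <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end   <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)

# Strip the prefix so the first column matches the sector names 1..5
chrom_num <- as.integer(sub("^chr", "", chr))

# Alternate colours per region; any vector of valid R colours of the same length works
shade_col <- rep(c("blue", "red"), length.out = length(chr))

chr_regions_df <- data.frame(chrom_num, start, end, shade_col)

# Sanity check: every region must lie on a sector that was initialised
sector_names <- c(1, 2, 3, 4, 5)  # hypothetical helper, mirrors chr_sizes_df$chrom
stopifnot(all(chr_regions_df$chrom_num %in% sector_names))
```

If a region refers to a sector name that was never initialised, the check fails before any plotting is attempted.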
I'm attempting to generate a forest plot using the forestploter package to visualize the stratified hazard ratio. Despite following the instructions in the package guide, the plot is not aligned properly when I select only a few variables for it.
As an example, when I only include gender and age, the plot is misaligned, as shown in the attached image.
library(grid)
library(forestploter)
# Read provided sample example data
dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))
# Keep needed columns
dt <- dt[,1:6]
# indent the subgroup if there is a number in the placebo column
dt$Subgroup <- ifelse(is.na(dt$Placebo),
dt$Subgroup,
paste0(" ", dt$Subgroup))
# NA to blank or NA will be transformed to character.
dt$Treatment <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$Placebo <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
dt$se <- (log(dt$hi) - log(dt$est))/1.96
# Add blank column for the forest plot to display CI.
# Adjust the column width with space.
dt$` ` <- paste(rep(" ", 20), collapse = " ")
# Create confidence interval column to display
dt$`HR (95% CI)` <- ifelse(is.na(dt$se), "",
sprintf("%.2f (%.2f to %.2f)",
dt$est, dt$low, dt$hi))
head(dt)
The plot is misaligned when I choose only two variables:
p <- forest(dt[c(2:7),c(1:3, 8:9)],
est = dt$est,
lower = dt$low,
upper = dt$hi,
sizes = 1,
ci_column = 4,
ref_line = 1,
arrow_lab = c("Placebo Better", "Treatment Better"),
xlim = c(0, 4),
ticks_at = c(0.5, 1, 2, 3),
footnote = "This is the demo data. Please feel free to change\nanything you want.")
# Print plot
plot(p)
resulting in this:
Is there a way to specify certain variables for the plot while preserving proper alignment?
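One possible cause, offered as an assumption rather than a confirmed diagnosis: the table passed to forest() is subset to rows 2:7, while est, lower and upper are still the full-length columns, so the intervals may be paired with the wrong rows. A minimal base R sketch of subsetting everything with the same index follows. The toy values and the names rows and tab are made up for illustration, and the forest() call is left as a comment because it needs the package and the real data.

```r
# Toy stand-in for the example data (values are invented for illustration)
dt <- data.frame(
  Subgroup = paste("Group", 1:8),
  est = c(NA, 1.2, 0.8, NA, 1.5, 0.9, 1.1, NA),
  low = c(NA, 0.9, 0.5, NA, 1.1, 0.6, 0.8, NA),
  hi  = c(NA, 1.6, 1.2, NA, 2.0, 1.3, 1.5, NA)
)

# Choose the rows once and reuse the same index everywhere
rows <- 2:7
tab  <- dt[rows, ]
est  <- dt$est[rows]
low  <- dt$low[rows]
hi   <- dt$hi[rows]

# Each plotted interval now corresponds to exactly one displayed row
stopifnot(length(est) == nrow(tab),
          length(low) == nrow(tab),
          length(hi)  == nrow(tab))

# With the real data, the call would then look like (not run here):
# forest(dt[rows, c(1:3, 8:9)], est = dt$est[rows],
#        lower = dt$low[rows], upper = dt$hi[rows], ...)
```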
I've taken the code below from this site to make a correlation matrix heatmap. How do I format the numbers in the heatmap so that they show only 2 decimal places?
http://blog.revolutionanalytics.com/2014/08/quantitative-finance-applications-in-r-8.html
library(xts)
library(Quandl)
my_start_date <- "1998-01-05"
SP500.Q <- Quandl("YAHOO/INDEX_GSPC", start_date = my_start_date, type = "xts")
RUSS2000.Q <- Quandl("YAHOO/INDEX_RUT", start_date = my_start_date, type = "xts")
NIKKEI.Q <- Quandl("NIKKEI/INDEX", start_date = my_start_date, type = "xts")
HANG_SENG.Q <- Quandl("YAHOO/INDEX_HSI", start_date = my_start_date, type = "xts")
DAX.Q <- Quandl("YAHOO/INDEX_GDAXI", start_date = my_start_date, type = "xts")
CAC.Q <- Quandl("YAHOO/INDEX_FCHI", start_date = my_start_date, type = "xts")
KOSPI.Q <- Quandl("YAHOO/INDEX_KS11", start_date = my_start_date, type = "xts")
# Depending on the index, the final price for each day is either
# "Adjusted Close" or "Close Price". Extract this single column for each:
SP500 <- SP500.Q[,"Adjusted Close"]
RUSS2000 <- RUSS2000.Q[,"Adjusted Close"]
DAX <- DAX.Q[,"Adjusted Close"]
CAC <- CAC.Q[,"Adjusted Close"]
KOSPI <- KOSPI.Q[,"Adjusted Close"]
NIKKEI <- NIKKEI.Q[,"Close Price"]
HANG_SENG <- HANG_SENG.Q[,"Adjusted Close"]
# The xts merge(.) function will only accept two series at a time.
# We can, however, merge multiple columns by downcasting to *zoo* objects.
# Remark: "all = FALSE" uses an inner join to merge the data.
z <- merge(as.zoo(SP500), as.zoo(RUSS2000), as.zoo(DAX), as.zoo(CAC),
as.zoo(KOSPI), as.zoo(NIKKEI), as.zoo(HANG_SENG), all = FALSE)
# Set the column names; these will be used in the heat maps:
myColnames <- c("SP500","RUSS2000","DAX","CAC","KOSPI","NIKKEI","HANG_SENG")
colnames(z) <- myColnames
# Cast back to an xts object:
mktPrices <- as.xts(z)
# Next, calculate log returns:
mktRtns <- diff(log(mktPrices), lag = 1)
head(mktRtns)
mktRtns <- mktRtns[-1, ] # Remove resulting NA in the 1st row
require(gplots)
generate_heat_map <- function(correlationMatrix, title)
{
heatmap.2(x = correlationMatrix, # the correlation matrix input
cellnote = correlationMatrix, # places correlation value in each cell
main = title, # heat map title
symm = TRUE, # configure diagram as standard correlation matrix
dendrogram="none", # do not draw a row dendrogram
Rowv = FALSE, # keep ordering consistent
trace="none", # turns off trace lines inside the heat map
density.info="none", # turns off density plot inside color legend
notecol="black") # set font color of cell labels to black
}
corr1 <- cor(mktRtns) * 100
generate_heat_map(corr1, "Correlations of World Market Returns, Jan 1998 - Present")
You might want the colors to be based on the full unrounded values while the cells display rounded numbers.
In that case, round only the cellnote argument:
generate_heat_map <- function(correlationMatrix, title)
{
heatmap.2(x = correlationMatrix, # the correlation matrix input
cellnote = round(correlationMatrix, 2), # places correlation value in each cell
main = title, # heat map title
symm = TRUE, # configure diagram as standard correlation matrix
dendrogram="none", # do not draw a row dendrogram
Rowv = FALSE, # keep ordering consistent
trace="none", # turns off trace lines inside the heat map
density.info="none", # turns off density plot inside color legend
notecol="black") # set font color of cell labels to black
}
If you want the colors to match the displayed numbers exactly, leave the existing function alone and round the input instead:
corr1 <- round(cor(mktRtns) * 100, 2)
generate_heat_map(corr1, "Correlations of World Market Returns, Jan 1998 - Present")
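Note that round() returns numbers, so trailing zeros are dropped when they are printed (12.5 rather than 12.50). If exactly two decimals should always be shown, one option is to build a character matrix with sprintf(). This is a sketch under the assumption that cellnote accepts a matrix of character strings; the matrix m and its values are invented for illustration.

```r
# Small stand-in for a correlation matrix scaled to percent
m <- matrix(c(100, 12.3456, 12.3456, 100), nrow = 2,
            dimnames = list(c("A", "B"), c("A", "B")))

# sprintf() returns a plain character vector, so restore the matrix shape and names
note <- matrix(sprintf("%.2f", m), nrow = nrow(m), dimnames = dimnames(m))

print(note)
```

The character matrix note could then be passed as cellnote while the numeric matrix is still used for x, so the colours stay based on the unrounded values.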
I have a matrix (named ichimoku) with 516 rows and 2 columns, each containing values to be plotted. The goal is to recreate the clouds for the Ichimoku strategy.
Using matplot, I am able to plot these two curves, but what I want is to shade the area between them. I have two problems:
I tried using polygon to shade the area, but it does not work. I suspect this is because the two series (senkouA and senkouB) cross several times on the plot instead of one always being greater than the other.
I would like the area to be shaded green where senkouA>senkouB and red where senkouB>senkouA, but from what I have read a polygon can only have one color.
Is there a function other than polygon that might help me achieve this, that is, a green shaded area between senkouA and senkouB where senkouA>senkouB and a red shaded area where senkouB>senkouA?
The ichimoku matrix looks like this (the first column is senkouA, the other senkouB)
[,1] [,2]
[1,] 23323.62 23320.53
[2,] 23334.67 23328.71
[3,] 23334.11 23323.06
[4,] 23332.94 23323.06
...
here is my matplot function (which works):
matplot(ichimoku,lty=1,lwd=1,pch=20,type="l",col=c("red","blue"))
and my polygon function (which doesn't):
polygon(c(1:516,516:1),c(senkouA,senkouB),col='green')
If you find the intersections between the curves, then you can draw the polygons between the intersections. Here is a modification of a previous post where they find intersections between curves, and a function to draw the polygons.
## Some sample data
set.seed(0)
dat <- data.frame(x1=3*sin(3*(x=seq(0,10,len=100)))+rnorm(100),
x2=2*cos(x)+rnorm(100))
## https://stackoverflow.com/questions/20519431/finding-point-of-intersection-in-r
intersects <- function(x1, x2) {
seg1 <- which(!!diff(x1 > x2)) # location of first point in crossing segments
above <- x2[seg1] > x1[seg1] # which curve is above prior to crossing
slope1 <- x1[seg1+1] - x1[seg1]
slope2 <- x2[seg1+1] - x2[seg1]
x <- seg1 + ((x2[seg1] - x1[seg1]) / (slope1 - slope2))
y <- x1[seg1] + slope1*(x - seg1)
data.frame(x=x, y=y, pindex=seg1, pabove=(1:2)[above+1L]) # pabove is greater curve prior to crossing
}
ichimoku <- function(data, addLines=TRUE) {
## Find points of intersections
ints <- intersects(data[,1], data[,2])
intervals <- findInterval(1:nrow(data), c(0, ints$x))
## Make plot
matplot(data, type="n", col=2:3, lty=1, lwd=4)
legend("topright", c("A", "B"), col=3:2, lty=1, lwd=2)
## Draw the polygons
for (i in seq_along(table(intervals))) {
xstart <- ifelse(i == 1, 0, ints$x[i-1])
ystart <- ifelse(i == 1, data[1, 1], ints$y[i-1]) # use the 'data' argument (not the global 'dat'); start at the first value of curve 1
xend <- ints$x[i]
yend <- ints$y[i]
x <- seq(nrow(data))[intervals == i]
polygon(c(xstart, x, xend, rev(x)), c(ystart, data[x,1], yend, rev(data[x,2])),
col=ints$pabove[i]%%2+2)
}
## Add lines for curves
if (addLines)
invisible(lapply(1:2, function(x) lines(seq(nrow(data)), data[,x], col=x%%2+2, lwd=2)))
}
## Plot the data
ichimoku(dat)
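The interpolation inside intersects() can be checked by hand on a tiny example. The sketch below uses two made-up series, a and b, sampled at x = 1, 2, 3, and applies the same linear-interpolation formula to locate the single crossing.

```r
# Two short series sampled at x = 1, 2, 3 (values invented for illustration)
a <- c(0, 2, 4)
b <- c(1, 1, 1)

# Index of the first point of each segment in which the curves swap order
seg <- which(diff(a > b) != 0)

# Per-segment slopes (the x spacing is 1, so slope = difference)
slope_a <- a[seg + 1] - a[seg]
slope_b <- b[seg + 1] - b[seg]

# Solve a[seg] + slope_a * t = b[seg] + slope_b * t for the offset t within the segment
x_cross <- seg + (b[seg] - a[seg]) / (slope_a - slope_b)
y_cross <- a[seg] + slope_a * (x_cross - seg)

print(c(x = x_cross, y = y_cross))
```

Here a rises from 0 to 2 between x = 1 and x = 2 while b stays at 1, so the curves meet halfway along that segment.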
Here is some code that works for a simple version of your problem, in which the lines only cross once. I haven't tested it for repeated crossings, however.
# Make toy data
ichimoku <- data.frame(senkouA = rep(10, 10), senkouB = c(3, 5, 4, 7, 10, 11, 15, 12, 13, 14))
# Make indices for the conditions that define the fill colors. They need to intersect for the polygons to connect.
index.green = with(ichimoku, as.logical(senkouA >= senkouB))
index.red = with(ichimoku, as.logical(senkouA <= senkouB))
# Make the line plot
matplot(ichimoku, lty=1, lwd=1, pch=20, type="l", col=c("red","blue"))
# Now add polygons with fill color based on those conditions by subsetting the task using the indices.
with(ichimoku, polygon(x = c(seq(length(senkouA))[index.green], rev(seq(length(senkouA))[index.green])),
y = c(senkouB[index.green], senkouA[index.green]), col = "green"))
with(ichimoku, polygon(x = c(seq(length(senkouA))[index.red], rev(seq(length(senkouA))[index.red])),
y = c(senkouB[index.red], senkouA[index.red]), col = "red"))
Here's my result:
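For repeated crossings, one untested idea is to split the index into consecutive runs where one series is on top, so that each run can get its own polygon. The sketch below only finds the runs, using rle() on made-up data; the names above and runs are introduced for illustration. Polygons drawn per run would still leave small gaps at the crossings unless the intersection points are added as extra vertices.

```r
# Toy series that cross several times (values invented for illustration)
senkouA <- c(1, 3, 3, 1, 1, 3)
senkouB <- rep(2, 6)

# TRUE where senkouA is on or above senkouB
above <- senkouA >= senkouB

# Run-length encode the condition to get consecutive stretches
r      <- rle(above)
ends   <- cumsum(r$lengths)
starts <- ends - r$lengths + 1

# One row per run: index range and which fill colour it would get
runs <- data.frame(start = starts, end = ends,
                   col = ifelse(r$values, "green", "red"))
print(runs)
```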
This is an advanced question.
I use my own layout for the chartSeries quantmod function, and I can even create my own newTA. Everything works fine. But ...
What I want to do but can't:
a) Manipulate the legend of each of the 3 charts:
- move it to another corner (from "topleft" to "topright")
- change the content
- remove it completely if needed ...
b) My indicator generates 2 legends:
value1
value2
Same as above: how could I modify them? How could I delete them?
c) Control the position and range of the y-axis (place it on the left or right, or even remove it).
The same applies when there is a secondary axis on the graph.
d) Modify the main legend (the one in the top right, where the range of dates is written).
A working sample code:
# Load Library
library(quantmod)
# Get Data
getSymbols("SPY", src="yahoo", from = "2010-01-01")
# Create my indicator (30 values)
value1 <- rnorm(30, mean = 50, sd = 25)
value2 <- rnorm(30, mean = 50, sd = 25)
# merge with the first 30 rows of SPY
dataset <- merge(first(SPY, n = 30),
value1,
value2)
# **** data has now 8 columns:
# - Open
# - High
# - Low
# - Close
# - Volume
# - Adjusted
# - a (my indicator value 1)
# - b (my indicator value 2)
#
# create my TA function - This could also be achieved using the preFUN option of newTA
myTAfun <- function(a){
# input: a: function will receive whole dataset
a[,7:8] # just return my indicator values
}
# create my indicator to add to chartSeries
newMyTA <- newTA(FUN = myTAfun, # chartSeries will pass whole dataset,
# I just want to process the last 2 columns
lty = c("solid", "dotted"),
legend.name = "My_TA",
col = c("red", "blue")
)
# define my layout
layout(matrix(c(1, 2, 3), 3, 1),
heights = c(2.5, 1, 1.5)
)
# create the chart
chartSeries(dataset,
type = "candlesticks",
main = "",
show.grid = FALSE,
name = "My_Indicator_Name",
layout = NULL, # bypass internal layout
up.col = "blue",
dn.col = "red",
TA = c(newMyTA(),
addVo()
),
plot = TRUE,
theme = chartTheme("wsj")
)
I have tried using legend command, and also the option legend.name (with very limited control of the output).
I have had a look at the chob object returned by chartSeries, but I can't figure out what to do next ...
Image below:
After some time learning a little bit more about R internals, S3 and S4 objects, and quantmod package, I've come up with the solution. It can be used to change anything in the graph.
A) If the legend belongs to a secondary indicator window:
Do not print the chartSeries (set the option plot = FALSE) and keep the returned "chob" object.
In one of the slots of the "chob" object there is a "chobTA" object with 2 params related to the legend. Set them to NULL.
Finally, call the hidden function chartSeries.chob.
In my case:
#get the chob object
my.chob <- chartSeries(dataset,
type = "candlesticks",
main = "",
show.grid = FALSE,
name = "My_Indicator_Name",
layout = NULL, # bypass internal layout
up.col = "blue",
dn.col = "red",
TA = c(newMyTA(),
addVo()
),
plot = FALSE, # do not plot, just get the chob
#plot = TRUE,
theme = chartTheme("wsj")
)
#if the legend is in a secondary window, and represents
#an indicator created with newTA(), this will work:
my.chob@passed.args$TA[[1]]@params$legend <- NULL
my.chob@passed.args$TA[[1]]@params$legend.name <- NULL
quantmod:::chartSeries.chob(my.chob)
B) In any other case, it is possible to modify "chartSeries.chob", "chartTA", "chartBBands", etc and then call chartSeries.chob
In my case:
fixInNamespace("chartSeries.chob", ns = "quantmod")
quantmod:::chartSeries.chob(my.chob)
It is enough to add "#" at the beginning of the lines related to legend() to comment them out.
That's it.
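The slot manipulation in part A can be tried without quantmod. The sketch below uses two made-up S4 classes, ToyChob and ToyTA, which only mimic the nesting described above (an object with a passed.args list holding TA objects that have a params list); they are not the real quantmod classes.

```r
library(methods)

# Hypothetical stand-ins for the "chob" and "chobTA" classes
setClass("ToyTA",   slots = c(params = "list"))
setClass("ToyChob", slots = c(passed.args = "list"))

ta   <- new("ToyTA", params = list(legend = "auto", legend.name = "My_TA", col = "red"))
chob <- new("ToyChob", passed.args = list(TA = list(ta)))

# '@' reaches into S4 slots; assigning NULL to a list element removes it
chob@passed.args$TA[[1]]@params$legend      <- NULL
chob@passed.args$TA[[1]]@params$legend.name <- NULL

# Only the untouched parameter remains
print(names(chob@passed.args$TA[[1]]@params))
```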
I would like to create an interactive version of charts.PerformanceSummary() using rCharts.
This is my attempt so far, but I am struggling to put it all together.
# Load xts and PerformanceAnalytics package
require(xts)
require(PerformanceAnalytics)
# Generate rtns data
set.seed(123)
X.stock.rtns <- xts(rnorm(1000,0.00001,0.0003), Sys.Date()-(1000:1))
Y.stock.rtns <- xts(rnorm(1000,0.00003,0.0004), Sys.Date()-(1000:1))
Z.stock.rtns <- xts(rnorm(1000,0.00005,0.0005), Sys.Date()-(1000:1))
rtn.obj <- merge(X.stock.rtns , Y.stock.rtns, Z.stock.rtns)
colnames(rtn.obj) <- c("x.stock.rtns","y.stock.rtns","z.stock.rtns")
# The below output is what we are aiming for
charts.PerformanceSummary(rtn.obj,lwd=1,main="Performance of stocks x,y and z")
# So this is what I have tried to do to replicate the data and try and generate graphs
# custom function to convert xts to data.frame
xts.2.df <- function(xts.obj){
df <- ggplot2:::fortify(xts.obj)
df[,1] <- as.character(df[,1])
df
}
# calculating the data for the top and bottom graph
cum.rtn <- do.call(merge,lapply(seq(ncol(rtn.obj)),function(y){cumprod(rtn.obj[,y]+1)-1}))
dd.rtn <- do.call(merge,lapply(seq(ncol(rtn.obj)),function(y){Drawdowns(rtn.obj[,y])}))
# Loading rCharts package
require(devtools)
install_github('ramnathv/rCharts', ref='dev')
require(rCharts)
# creating the first cumulative return graph
m1 <- mPlot(x = "Index", y = c("x.stock.rtns","y.stock.rtns","z.stock.rtns"), type = "Line", data = xts.2.df(cum.rtn),
pointSize = 0, lineWidth = 1)
# Top cumulative return graph
m1
# Creating the individual bar graphs that are to be shown when one line is hovered over
m.x <- mPlot(x = "Index", y = c("x.stock.rtns"), type="Bar",data = xts.2.df(rtn.obj))
m.y <- mPlot(x = "Index", y = c("y.stock.rtns"), type="Bar",data = xts.2.df(rtn.obj))
m.z <- mPlot(x = "Index", y = c("z.stock.rtns"), type="Bar",data = xts.2.df(rtn.obj))
# Creating the drawdown graph
m2 <- mPlot(x = "Index", y = c("x.stock.rtns","y.stock.rtns","z.stock.rtns"), type = "Line", data = xts.2.df(dd.rtn),
pointSize = 0, lineWidth = 1)
m2
So there are few parts to the question:
How do you put three morris.js charts together so that they are linked?
Can you make bold the line that is being hovered over in the top graph (m1)?
How do you get the middle one (i.e. one of m.x, m.y, m.z) to change according to what is being hovered over, i.e. if hovering over stock z, then stock z's returns (m.z) show up in the middle?
Can you get it to make bold in the bottom graph, the same asset that is being made bold in the top graph?
Can you change the information displayed in the floating box so that it shows some stats about the asset being hovered over?
How do you add axes labels?
How do you add an overall title?
BONUS: How do you integrate crossfilter.js into it so that a subset of time can be chosen...and all graphs get re-drawn?
Even if you can't answer all parts any help/comments/answers would be appreciated...