Heat Map in R error

I want to put a heat map on a matrix:
library(ggplot2)
library(RColorBrewer)
library(gplots)
data <- read.csv("C://Users//TestHeatMap/test.csv",sep=",",header= TRUE)
rnames <- data[,1]
mat_data <- data.matrix(data[,2:ncol(data)])
rownames(mat_data) <- rnames
mat_data
Here is what mat_data looks like:
             var1       var2      var3      var4
meas 1  0.7305017 0.06576355 0.3570861 0.5359282
meas2   0.3403525 0.35159679 0.2881559 0.2078828
meas 3  0.4292799 0.02639957 0.7336405 0.6969559
meas 4  0.4345162 0.91674849 0.8345379 0.4165677
meas 5  0.2000233 0.21788421 0.7484787 0.8300173
meas 6  0.1365909 0.96092637 0.5466718 0.8219013
meas 7  0.2752694 0.25753156 0.7471216 0.1959987
meas 8  0.5394913 0.64510271 0.4484584 0.9255199
meas 9  0.8634208 0.55507594 0.1108058 0.1642815
meas 10 0.9111965 0.60704937 0.3522915 0.7832306
my_palette <- colorRampPalette(c("red", "yellow", "green"))(n = 299)
col_breaks = c(seq(-1,0,length=100), # for red
seq(0,0.8,length=100), # for yellow
seq(0.8,1,length=100)) # for green
row_distance = dist(mat_data, method = "manhattan")
row_cluster = hclust(row_distance, method = "ward")
col_distance = dist(t(mat_data), method = "manhattan")
col_cluster = hclust(col_distance, method = "ward")
heatmap.2(mat_data,
cellnote = mat_data, # same data set for cell labels
main = "Correlation", # heat map title
notecol="black", # change font color of cell labels to black
density.info="none", # turns off density plot inside color legend
trace="none", # turns off trace lines inside the heat map
margins =c(12,9), # widens margins around plot
col=my_palette, # use the color palette defined earlier
breaks=col_breaks, # enable color transition at specified limits
dendrogram="none", # draw no dendrograms
Colv="NA",
key = TRUE,
keysize = 1,
# The 2 lines below cause an error.
# The default sorting is measurement10, then measurement10, then measurement8, ...
# I want the order to be measurement1, then measurement2, measurement3, etc., so I added the 2
# lines below.
Rowv = as.dendrogram(row_cluster), # order rows by the row clustering
Colv = as.dendrogram(col_cluster) # order columns by the column clustering
)
The error I am getting is:
Error in heatmap.2(mat_data, cellnote = mat_data, main = "Correlation", :
formal argument "Colv" matched by multiple actual arguments

That means heatmap.2 received two arguments named Colv: Colv="NA" near the top of the call and Colv = as.dendrogram(col_cluster) near the bottom. You can't assign two different values to Colv, so delete either the "NA" assignment or the as.dendrogram assignment.
I'd reread the help file carefully to be sure you're assigning the right things.
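The same message can be reproduced without gplots. This is only a minimal sketch: f is a hypothetical toy function standing in for heatmap.2, and the only thing it shares with the real function is a formal argument named Colv.

```r
# Toy stand-in for heatmap.2 (hypothetical): one formal argument named Colv.
f <- function(Colv = NULL, ...) Colv

# Supplying Colv twice fails during argument matching, before f's body runs.
msg <- tryCatch(
  f(Colv = NA, Colv = 1),
  error = function(e) conditionMessage(e)
)
print(msg)

# Supplying it once works.
ok <- f(Colv = 1)
print(ok)
```

In the call from the question, keeping only one of the two Colv lines removes the error; which one to keep depends on whether the columns should be reordered.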

Related

How to shade custom blocks in Circlize package in R

I am using the R package circlize to create a circos plot.
I am aiming to create something similar to Figure 2 in this paper: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004812.
I would like to custom specify where to shade parts of the chromosomes with different, manually entered colours, but I am struggling.
Reproducible code:
### load packages
library("tidyverse")
library("circlize")
### Generate mock data
# Chromosome sizes - genome with 5 chromosomes size 1-5kb
chrom <- c(1,2,3,4,5)
start <- c(0,0,0,0,0)
end <- c(1000,1700,2200,3100,5000)
chr_sizes_df <- data.frame(chrom,start,end)
# Areas of interest - where I want 'shade_col' shading
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
chr_regions_df <- data.frame(chr,start,end)
# Recombinations - to be depicted with lines connecting chromosomes
chr1 <- c(1,2,2,3,3,3,3,4,4,5,5,5,5)
chr1_pos <- c(100,150,170,20,2100,900,950,200,3000,100,3100,3300,4900)
chr2 <- c(1,4,2,1,3,3,5,5,4,3,5,4,2)
chr2_pos <- c(100,3000,170,100,100,900,3200,4800, 3050,10,3100,3300,40)
location <- c("Non coding", "Coding", "Non coding", "Non coding", "Coding", "Coding", "Coding", "Non coding", "Non coding", "Non coding", "Coding", "Coding", "Non coding")
sv_df <- data.frame(chr1,chr1_pos,chr2,chr2_pos,location)
# SNPs - to be depicted with dots or lines
chrom <- c(1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5)
pos <- c(350,600,200,650,700,300,1100,1500,2000,400,1500,1800,2000,2700,200,1000,1050,2000,2500,4950)
snp_df <- data.frame(chrom,pos)
### Prepare for plot
# Generate colour scheme
sv_df$location_col <- ifelse(sv_df$location=="Coding", "#FB8072",
ifelse(sv_df$location=="Non coding", "#80B1D3",
"#e9e9e9")
)
# Specify chromosome block shading
shade_col <- "#3F75AB"
# Format rearrangement data
nuc1 <- sv_df %>% select(chr1,chr1_pos) # Start positions
nuc2 <- sv_df %>% select(chr2,chr2_pos) # End positions
### Generating plot
## Basic circos graphic parameters
circos.clear()
circos.par(cell.padding=c(0,0,0,0),
track.margin=c(0,0.05),
start.degree = 90,
gap.degree = 3,
clock.wise = TRUE)
## Sector details
circos.initialize(factors = chr_sizes_df$chrom,
xlim = cbind(chr_sizes_df$start, chr_sizes_df$end))
## Generate basic outline with chromosomes
circos.track(ylim=c(0, 1), panel.fun=function(x, y) {
chr=CELL_META$sector.index
xlim=CELL_META$xlim
ylim=CELL_META$ylim
circos.text(mean(xlim), mean(ylim), chr)
},bg.col="#cde3f9", bg.border=TRUE, track.height=0.1)
## Add recombinations - coloured by coding vs non-coding etc
circos.genomicLink(nuc1, nuc2,
col=sv_df$location_col,
h.ratio=0.6,
lwd=3)
The above code produces the plot shown below:
I want to use chr_regions_df to specify the chromosome areas to shade with shade_col. I have tried a few things: draw.sector doesn't work well because it needs angles rather than positions, which are hard to work out. There are cytoband options via circos.initializeWithIdeogram(), but this seems to use pre-specified cytoband formats for certain species rather than custom-made areas for shading as in my use case (which is also why I couldn't use the question "supplying user defined color in r circlize package").
Many thanks for your help.

To draw custom colored areas within chromosomes, use circos.genomicTrackPlotRegion, where you need to provide a bed-like data frame with an additional column specifying the color to be used for each area.
#the first column should match the chromosome names used in 'circos.initialize'
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
#chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
shade_col <- c("blue","red","blue","red","blue","red","blue","red","blue","red","blue","red")
chr_regions_df <- data.frame(chrom_num,start,end,shade_col)
After running circos.initialize, draw the chromosomes with their shaded area. In panel.fun, the first argument (region) contains the coordinates of each feature while the second (value) contains all but the first 3 columns of the data frame.
circos.genomicTrackPlotRegion(chr_regions_df, ylim = c(0, 1),
panel.fun = function(region, value, ...) {
col = value$shade_col
circos.genomicRect(region, value,
ybottom = 0, ytop = 1,
col = col, border = NA)
xlim = get.cell.meta.data("xlim")
circos.rect(xlim[1], 0, xlim[2], 1, border = "black")
ylim = get.cell.meta.data("ylim")
chr = get.current.sector.index()
circos.text(mean(xlim), mean(ylim), chr)
}, bg.col = "#cde3f9", bg.border=TRUE, track.height=0.1)
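To make the region/value split described above concrete, here is a rough base R sketch of what panel.fun sees for each sector. It does not call circlize at all; split_for_panel is a hypothetical helper written only for illustration, and it uses a shortened version of the data frame.

```r
# Bed-like data frame: sector name, start, end, then any extra columns.
chr_regions_df <- data.frame(
  chrom_num = c(1, 1, 2, 2),
  start     = c(0, 900, 0, 1550),
  end       = c(150, 1000, 185, 1700),
  shade_col = c("blue", "red", "blue", "red"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a circlize function): mimic the per-sector split.
split_for_panel <- function(bed) {
  lapply(split(bed, bed[[1]]), function(rows) {
    list(
      region = rows[, 2:3],                  # start and end coordinates
      value  = rows[, -(1:3), drop = FALSE]  # everything after column 3
    )
  })
}

panels <- split_for_panel(chr_regions_df)
print(panels[["2"]]$region)
print(panels[["2"]]$value$shade_col)
```

This is why value$shade_col inside panel.fun lines up row by row with the rectangles drawn from region.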

R print groups of data points in different colors

I'm doing some basic statistics in R and I'm trying to use a different color for each iteration of the loop, so that all the data points for i=1 share one color, all the data points for i=2 share another, and so on. Ideally the colors for the varying i would range from yellow to blue, for example. (I already tried colorRamp and similar functions but didn't manage to get it working.)
Thanks for your help.
library(ggplot2)
#dput(thedata[,2])
#c(1.28994585412464, 1.1317747077577, 1.28029504741834, 1.41172820353708,
#1.13172920065253, 1.40276516298315, 1.43679599499374, 1.90618019359643,
#2.33626745030772, 1.98362330686504, 2.22606615548188, 2.40238822720322)
#dput(thedata[,4])
#c(NA, -1.7394747097211, 2.93081902519318, -0.33212717268786,
#-1.78796119503752, -0.5080871442002, -0.10110379236627, 0.18977632798691,
#1.7514277696687, 1.50275797771879, -0.74632159611221, 0.0978774103243802)
#OR
#dput(thedata[,c(2,4)])
#structure(list(LRUN74TTFRA156N = c(1.28994585412464, 1.1317747077577,
#1.28029504741834, 1.41172820353708, 1.13172920065253, 1.40276516298315,
#1.43679599499374, 1.90618019359643, 2.33626745030772, 1.98362330686504,
#2.22606615548188, 2.40238822720322), SELF = c(NA, -1.7394747097211,
#2.93081902519318, -0.33212717268786, -1.78796119503752, -0.5080871442002,
#-0.10110379236627, 0.18977632798691, 1.7514277696687, 1.50275797771879,
#-0.74632159611221, 0.0978774103243802)), row.names = c(NA, 12L
#), class = "data.frame")
x1=1
xn=x1+3
plot(0,0,col="white",xlim=c(0,12),ylim=c(-5,7.5))
for(i in 1:3){
y=thedata[x1:xn,4]
x=thedata[x1:xn,2]
reg<-lm(y~x)
points(x,y,col=colors()[i])
abline(reg,col=colors()[i])
x1=x1+4
xn=x1+3
}

The basic idea of colorRamp and colorRampPalette is that they are function factories: they are functions that return functions.
From the help page:
colorRampPalette returns a function that takes an integer argument (the required number of colors) and returns a character vector of colors (see rgb) interpolating the given sequence (similar to heat.colors or terrain.colors).
So, we'll get a yellow-to-blue palette function from colorRampPalette, and then we'll give it the number of colors we want along that ramp to actually get the colors:
# create the palette function
my_palette = colorRampPalette(colors = c("yellow", "blue"))
# test it out, see how it works
my_palette(3)
# [1] "#FFFF00" "#7F7F7F" "#0000FF"
my_palette(5)
# [1] "#FFFF00" "#BFBF3F" "#7F7F7F" "#3F3FBF" "#0000FF"
# Now on with our plot
x1 = 1
xn = x1 + 3
# Set the number of iterations (number of colors needed) as a variable:
nn = 3
# Get the colors from our palette function
my_cols = my_palette(nn)
# type = 'n' means nothing will be plotted, no points, no lines
plot(0, 0, type = 'n',
xlim = c(0, 12),
ylim = c(-5, 7.5))
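# plot each group of 4 rows (this assumes thedata is the two-column data frame from dput(thedata[,c(2,4)]))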
for (i in 1:nn) {
y = thedata[x1:xn, 2]
x = thedata[x1:xn, 1]
reg <- lm(y ~ x)
# use the ith color
points(x, y, col = my_cols[i])
abline(reg, col = my_cols[i])
x1 = x1 + 4
xn = x1 + 3
}
You can also just visualize the palette: try the following code for different n values. You can experiment with other options too, such as different starting colors. I like the results better with the space = "Lab" argument for the palette.
n = 10
my_palette = colorRampPalette(colors = c("yellow", "blue"), space = "Lab")
n_palette = my_palette(n)
plot(1:n, rep(1, n), col = n_palette, pch = 15, cex = 4)
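The loop above can also be written with one color per row. The following is a sketch rather than the answer's own method: it rebuilds thedata from the two-column dput output in the question, and the names grp and point_cols are introduced here for illustration.

```r
# Data rebuilt from the dput(thedata[,c(2,4)]) output in the question.
thedata <- data.frame(
  LRUN74TTFRA156N = c(1.28994585412464, 1.1317747077577, 1.28029504741834,
                      1.41172820353708, 1.13172920065253, 1.40276516298315,
                      1.43679599499374, 1.90618019359643, 2.33626745030772,
                      1.98362330686504, 2.22606615548188, 2.40238822720322),
  SELF = c(NA, -1.7394747097211, 2.93081902519318, -0.33212717268786,
           -1.78796119503752, -0.5080871442002, -0.10110379236627,
           0.18977632798691, 1.7514277696687, 1.50275797771879,
           -0.74632159611221, 0.0978774103243802)
)

nn <- 3                                    # number of groups
grp <- rep(seq_len(nn), each = 4)          # rows 1-4, 5-8, 9-12
my_cols <- colorRampPalette(c("yellow", "blue"))(nn)
point_cols <- my_cols[grp]                 # one color per row

plot(thedata$LRUN74TTFRA156N, thedata$SELF, col = point_cols, pch = 19,
     xlab = "LRUN74TTFRA156N", ylab = "SELF")

# One regression line per group, drawn in that group's color.
for (g in seq_len(nn)) {
  reg <- lm(SELF ~ LRUN74TTFRA156N, data = thedata[grp == g, ])
  abline(reg, col = my_cols[g])
}
```

Indexing the palette with the group vector keeps the points and the regression lines in sync, because both take their colors from my_cols.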

Besides lacking a reproducible example, your question seems to contain some misconceptions.
First, the function colors doesn't take a numeric argument; see ?colors. So if you want to fetch a different color in each iteration, you need to call it like colors()[i]. The code should look something like this (in the absence of a reproducible example):
for (i in 20:30){
plot(1:10, 1:10, col = colors()[i])
}
Please bear in mind that using x1 and xn in the first and second lines inside the for loop before defining them will cause an error too.
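A quick look at what colors() returns shows why indexing is needed. This is a small sketch; all_cols and first_three are names introduced here.

```r
# colors() takes no index; it returns the full vector of built-in color names.
all_cols <- colors()
print(class(all_cols))
print(length(all_cols))

# Index the returned vector to pick individual colors.
first_three <- all_cols[1:3]
print(first_three)
```

Note that the first entry is "white", so a loop that starts at i = 1 and uses colors()[i] draws its first group in white, which is invisible on a white background.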

Adjusting the y_scale/y_shift parameter in the colored_bars function of dendextend package in R?

I am trying to plot many attributes in color bars beneath a dendrogram and am having trouble getting the positioning right (i.e. how to adjust y_scale and/or y_shift). The default plots 5 out of 25 color bars and setting y_shift=0.7 does allow all color bars to show up, although they cover the dendrogram.
I was wondering how you would change the last lines to get the spacing correct and how you came up with the correct adjustments? Thank you!
### LIBRARIES #################################################################
library(RColorBrewer);
library(amap);
library(dendextend);
### MADE UP DATA ##############################################################
N <- 200 # number samples
G <- 1000 # number of features
A <- 25 # number of attributes (# color_bars)
# data will make no sense, just for example
data <- matrix(rnorm(G*N,mean=0,sd=1), G, N);
data <- cov(data);
# fake binary attributes
attributes <- matrix(sample(c(1,2), N*A, replace=TRUE), N, A);
### DENDROGRAM ################################################################
hc <- hcluster(data, method="correlation", link="average");
dend <- as.dendrogram(hc);
dend.idx <- labels(dend);
### COLOR BAR COLORS ##########################################################
# gather enough colors for all of the different attributes
color1 <- list(brewer.pal(12,"Set3"));
color2 <- list(brewer.pal(12, "Paired"));
color3 <- list(brewer.pal(8, "Dark2"));
color <- c(color1[[1]], color2[[1]], color3[[1]]);
# going to bin them all into 4 bins
n <- 4;
# make A=25 color bars, each with 4 shades of a single color
color.bar.colors <- NULL;
for (count in 1:A){
col.func <- colorRampPalette(c("white", color[count]));
col.gradient <- col.func(n);
col.item <- col.gradient[attributes[,count]];
color.bar.colors <- cbind(color.bar.colors, col.item[dend.idx]);
}
### PLOT ######################################################################
svg("minimal-dend-example.svg", width=100, height = 10);
dend %>%
plot(main = "sample");
colored_bars(
colors = color.bar.colors,
dend = dend,
sort_by_labels_order = FALSE,
# y_shift= 0.7
);
dev.off();

How to Format Numbers in Heatmap.2 in R

I've taken the code below from this site to make a correlation matrix heatmap. How do I format the numbers in the heatmap to show only 2 decimal places?
http://blog.revolutionanalytics.com/2014/08/quantitative-finance-applications-in-r-8.html
library(xts)
library(Quandl)
my_start_date <- "1998-01-05"
SP500.Q <- Quandl("YAHOO/INDEX_GSPC", start_date = my_start_date, type = "xts")
RUSS2000.Q <- Quandl("YAHOO/INDEX_RUT", start_date = my_start_date, type = "xts")
NIKKEI.Q <- Quandl("NIKKEI/INDEX", start_date = my_start_date, type = "xts")
HANG_SENG.Q <- Quandl("YAHOO/INDEX_HSI", start_date = my_start_date, type = "xts")
DAX.Q <- Quandl("YAHOO/INDEX_GDAXI", start_date = my_start_date, type = "xts")
CAC.Q <- Quandl("YAHOO/INDEX_FCHI", start_date = my_start_date, type = "xts")
KOSPI.Q <- Quandl("YAHOO/INDEX_KS11", start_date = my_start_date, type = "xts")
# Depending on the index, the final price for each day is either
# "Adjusted Close" or "Close Price". Extract this single column for each:
SP500 <- SP500.Q[,"Adjusted Close"]
RUSS2000 <- RUSS2000.Q[,"Adjusted Close"]
DAX <- DAX.Q[,"Adjusted Close"]
CAC <- CAC.Q[,"Adjusted Close"]
KOSPI <- KOSPI.Q[,"Adjusted Close"]
NIKKEI <- NIKKEI.Q[,"Close Price"]
HANG_SENG <- HANG_SENG.Q[,"Adjusted Close"]
# The xts merge(.) function will only accept two series at a time.
# We can, however, merge multiple columns by downcasting to *zoo* objects.
# Remark: "all = FALSE" uses an inner join to merge the data.
z <- merge(as.zoo(SP500), as.zoo(RUSS2000), as.zoo(DAX), as.zoo(CAC),
as.zoo(KOSPI), as.zoo(NIKKEI), as.zoo(HANG_SENG), all = FALSE)
# Set the column names; these will be used in the heat maps:
myColnames <- c("SP500","RUSS2000","DAX","CAC","KOSPI","NIKKEI","HANG_SENG")
colnames(z) <- myColnames
# Cast back to an xts object:
mktPrices <- as.xts(z)
# Next, calculate log returns:
mktRtns <- diff(log(mktPrices), lag = 1)
head(mktRtns)
mktRtns <- mktRtns[-1, ] # Remove resulting NA in the 1st row
require(gplots)
generate_heat_map <- function(correlationMatrix, title)
{
heatmap.2(x = correlationMatrix, # the correlation matrix input
cellnote = correlationMatrix, # places correlation value in each cell
main = title, # heat map title
symm = TRUE, # configure diagram as standard correlation matrix
dendrogram="none", # do not draw a row dendrogram
Rowv = FALSE, # keep ordering consistent
trace="none", # turns off trace lines inside the heat map
density.info="none", # turns off density plot inside color legend
notecol="black") # set font color of cell labels to black
}
corr1 <- cor(mktRtns) * 100
generate_heat_map(corr1, "Correlations of World Market Returns, Jan 1998 - Present")

You might want the colors to use the full unrounded values while the cells show rounded numbers.
In that case, round only the cellnote argument:
generate_heat_map <- function(correlationMatrix, title)
{
heatmap.2(x = correlationMatrix, # the correlation matrix input
cellnote = round(correlationMatrix, 2), # places correlation value in each cell
main = title, # heat map title
symm = TRUE, # configure diagram as standard correlation matrix
dendrogram="none", # do not draw a row dendrogram
Rowv = FALSE, # keep ordering consistent
trace="none", # turns off trace lines inside the heat map
density.info="none", # turns off density plot inside color legend
notecol="black") # set font color of cell labels to black
}
If you want the colors to match the displayed numbers exactly, leave the existing function alone and round the input instead:
corr1 <- round(cor(mktRtns) * 100, 2)
generate_heat_map(corr1, "Correlations of World Market Returns, Jan 1998 - Present")
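The difference between rounding the labels and rounding the data can be checked on a small matrix without downloading any prices. This is a sketch with made-up random returns; note_round and note_fixed are names introduced here. The formatC variant is an extra suggestion, on the assumption that cellnote also accepts a character matrix, for when trailing zeros should be kept.

```r
set.seed(42)
# Made-up returns for three hypothetical series, only for illustration.
returns <- matrix(rnorm(60), ncol = 3,
                  dimnames = list(NULL, c("A", "B", "C")))
corr <- cor(returns) * 100

# Rounded numeric labels: the plotted matrix corr itself stays unrounded.
note_round <- round(corr, 2)

# Character labels with exactly two decimals (keeps trailing zeros).
note_fixed <- formatC(corr, format = "f", digits = 2)

print(note_round)
print(note_fixed)
```

Either label matrix could then be passed as cellnote while x stays the unrounded corr.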

R quantmod chartSeries newTA chob - modify legend and axis (primary and secondary)

This is an advanced question.
I use my own layout for the chartSeries quantmod function, and I can even create my own newTA. Everything works fine. But ...
What I want to do but can't:
a) Manipulate the legend of each of the 3 charts:
- move it to another corner (from "topleft" to "topright")
- change the content
- remove it completely if needed
b) My indicator generates 2 legends:
value1
value2
Same as above: how could I modify them? How could I delete them?
c) Control the position and range of the y-axis (place it on the left or right, or even remove it), and likewise when there is a secondary axis on the graph.
d) Modify the main legend (the one in the top right, where the range of dates is written).
A working sample code:
# Load Library
library(quantmod)
# Get Data
getSymbols("SPY", src="yahoo", from = "2010-01-01")
# Create my indicator (30 values)
value1 <- rnorm(30, mean = 50, sd = 25)
value2 <- rnorm(30, mean = 50, sd = 25)
# merge with the first 30 rows of SPY
dataset <- merge(first(SPY, n = 30),
value1,
value2)
# **** data has now 8 columns:
# - Open
# - High
# - Low
# - Close
# - Volume
# - Adjusted
# - a (my indicator value 1)
# - b (my indicator value 2)
#
# create my TA function - This could also be achieved using the preFUN option of newTA
myTAfun <- function(a){
# input: a: function will receive whole dataset
a[,7:8] # just return my indicator values
}
# create my indicator to add to chartSeries
newMyTA <- newTA(FUN = myTAfun, # chartSeries will pass whole dataset,
# I just want to process the last 2 columns
lty = c("solid", "dotted"),
legend.name = "My_TA",
col = c("red", "blue")
)
# define my layout
layout(matrix(c(1, 2, 3), 3, 1),
heights = c(2.5, 1, 1.5)
)
# create the chart
chartSeries(dataset,
type = "candlesticks",
main = "",
show.grid = FALSE,
name = "My_Indicator_Name",
layout = NULL, # bypass internal layout
up.col = "blue",
dn.col = "red",
TA = c(newMyTA(),
addVo()
),
plot = TRUE,
theme = chartTheme("wsj")
)
I have tried using legend command, and also the option legend.name (with very limited control of the output).
I have had a look at the chob object returned by chartSeries, but I can't figure out what to do next ...

After some time learning a little more about R internals, S3 and S4 objects, and the quantmod package, I've come up with a solution. It can be used to change anything in the graph.
A) If the legend belongs to a secondary indicator window:
Do not plot the chartSeries (set plot = FALSE) and keep the returned "chob" object.
One of the slots of the "chob" object holds a "chobTA" object with 2 params related to the legend. Set them to NULL.
Finally, call the hidden function chartSeries.chob.
In my case:
#get the chob object
my.chob <- chartSeries(dataset,
type = "candlesticks",
main = "",
show.grid = FALSE,
name = "My_Indicator_Name",
layout = NULL, # bypass internal layout
up.col = "blue",
dn.col = "red",
TA = c(newMyTA(),
addVo()
),
plot = FALSE, # do not plot, just get the chob
#plot = TRUE,
theme = chartTheme("wsj")
)
#if the legend is in a secondary window, and represents
#an indicator created with newTA(), this will work:
my.chob@passed.args$TA[[1]]@params$legend <- NULL
my.chob@passed.args$TA[[1]]@params$legend.name <- NULL
quantmod:::chartSeries.chob(my.chob)
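The nested slot assignment used in A) can be tried on toy objects without quantmod or any downloaded data. This is only a sketch: ToyChob and ToyTA are hypothetical classes made up here to mirror the slot and list names used above, and they are not quantmod's real class definitions.

```r
library(methods)

# Hypothetical toy classes that mimic the nesting: chob -> passed.args -> TA -> params.
setClass("ToyTA", representation(params = "list"))
setClass("ToyChob", representation(passed.args = "list"))

ta <- new("ToyTA",
          params = list(legend = "auto", legend.name = "My_TA", col = "red"))
chob <- new("ToyChob", passed.args = list(TA = list(ta)))

# Assigning NULL to a list element inside a slot removes that element.
chob@passed.args$TA[[1]]@params$legend <- NULL
chob@passed.args$TA[[1]]@params$legend.name <- NULL

remaining <- names(chob@passed.args$TA[[1]]@params)
print(remaining)
```

The other entries in params are left untouched, which is why the rest of the indicator still draws after the legend entries are removed.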
B) In any other case, it is possible to modify "chartSeries.chob", "chartTA", "chartBBands", etc and then call chartSeries.chob
In my case:
fixInNamespace("chartSeries.chob", ns = "quantmod")
quantmod:::chartSeries.chob(my.chob)
It is enough to add "#" at the beginning of the lines related to legend(), which comments them out.
That's it.
