I'm trying to draw a concentric pie chart. The inner pie represents three classes of subjects, and each class has to be split into 3 sub-classes (of course the slices for the sub-classes have to line up with the corresponding inner slice).
This is what I tried:
layout(matrix(c(1,1,1,1,2,1,1,1,1), nrow=3))
pie(x=c(14,22,15,3,15,33,0,6,45),labels="",col=c("#f21c39","#dba814","#7309de"))
pie(x=c(51,51,51),labels=c("O","VG","V"),col=c("#c64719","#0600f5","#089c1f"))
This worked, but the inner pie is too small. I tried to play with the radius option, but then the outer slices no longer correspond to the inner ones. How can I adjust them?
Use par(new=TRUE) to overplot the pies rather than layout() in this case:
pie(x=c(14,22,15,3,15,33,0,6,45),labels="",
col=c("#f21c39","#dba814","#7309de"))
par(new=TRUE)
pie(x=c(51,51,51),labels=c("O","VG","V"),radius=.5,
col=c("#c64719","#0600f5","#089c1f"))
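A minimal sketch of why the slices line up: if the inner values are computed as group sums of the outer values, both pies share the same total and the same starting angle, so every inner boundary coincides with an outer one. The grouping vector classes below is an assumption for illustration (three consecutive sub-classes per class); the counts and colours are the ones from the question.

```r
# Outer ring: nine sub-class counts, three per class (values from the question)
outer_counts <- c(14, 22, 15, 3, 15, 33, 0, 6, 45)
# Assumed grouping: consecutive triples belong to classes O, VG and V
classes <- factor(rep(c("O", "VG", "V"), each = 3), levels = c("O", "VG", "V"))
# Inner ring: one slice per class, sized as the sum of its sub-classes
inner_counts <- tapply(outer_counts, classes, sum)

pie(outer_counts, labels = "", col = c("#f21c39", "#dba814", "#7309de"))
par(new = TRUE)  # draw the next pie on top of the current one
pie(inner_counts, labels = names(inner_counts), radius = 0.5,
    col = c("#c64719", "#0600f5", "#089c1f"))
```

Computing the inner values this way avoids hard-coding 51 three times and keeps the two rings consistent if the counts change.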
Three years later: this can be achieved using the sunburstR package. http://timelyportfolio.github.io/sunburstR/example_baseball.html
Example:
library(dplyr)      # %>%, select(), group_by(), summarise()
library(sunburstR)  # sunburst()
DF <- data.frame(LOGRECNO = c(60, 61, 62, 63, 64, 65),
STATE = c(1, 1, 1, 1, 1, 1),
COUNTY = c(1, 1, 1, 1, 1, 1),
TRACT = c(21100, 21100, 21100, 21100, 21100, 21100),
BLOCK = c(1053, 1054, 1055, 1056, 1057, 1058))
DF$BLOCKID <-
paste(DF$LOGRECNO, DF$STATE, DF$COUNTY,
DF$TRACT, DF$BLOCK, sep = "-")
DF %>%
select(BLOCKID) %>%
group_by(BLOCKID) %>%
summarise(Tots=n())->dftest
sunburst(dftest)
I'm sure you are able to adapt this to suit your needs!
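As a rough sketch of that adaptation, assuming sunburst() is given a two-column data frame whose first column is a "-"-separated path and whose second column is a size (the column names path and size are made up here), the counts from the question could be arranged like this:

```r
# Same counts as in the question; class and sub-class names are joined with "-"
counts <- data.frame(
  class    = rep(c("O", "VG", "V"), each = 3),
  subclass = rep(c("1", "2", "3"), times = 3),
  size     = c(14, 22, 15, 3, 15, 33, 0, 6, 45)
)
paths <- data.frame(
  path = paste(counts$class, counts$subclass, sep = "-"),
  size = counts$size
)
# Zero-sized leaves would be invisible anyway, so drop them
paths <- paths[paths$size > 0, ]

# Build the widget only if sunburstR is installed; show it in interactive use
if (requireNamespace("sunburstR", quietly = TRUE)) {
  widget <- sunburstR::sunburst(paths)
  if (interactive()) print(widget)
}
```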
You could also use the ggsunburst package:
# install ggsunburst
if (!require("ggplot2")) install.packages("ggplot2")
if (!require("rPython")) install.packages("rPython")
install.packages("http://genome.crg.es/~didac/ggsunburst/ggsunburst_0.0.9.tar.gz", repos=NULL, type="source")
library(ggsunburst)
df <- read.table(header=T, text = "
parent node size
O 1 14
O 2 22
O 3 15
V 1 3
V 2 15
V 3 33
VG 1 1
VG 2 6
VG 3 45")
write.table(df, file = 'df.txt', sep = ',', row.names = F)
sb <- sunburst_data('df.txt', type = "node_parent", sep = ",")
p <- sunburst(sb, node_labels = T, leaf_labels = F, rects.fill.aes = "name")
cols <- c("O" = "#c64719", "V" = "#0600f5", "VG" = "#089c1f", "1" = "#f21c39", "2" = "#dba814", "3" = "#7309de")
p + scale_fill_manual(values = cols)
I am trying to generate a heatmap like the one in the following figure. I have already tried pheatmap, and the code is as follows:
breaks_2 <- seq(min(0), max(2), by = 0.1)
pheatmap::pheatmap(
mat = data,
cluster_cols = F,
cluster_rows = F,
scale = "column",
border_color = "white",
color = inferno(20),
show_colnames = TRUE,
show_rownames = FALSE,
breaks = breaks_2
)
But this does not seem to work. As far as I understand, I am either defining the breaks incorrectly or need to use a package other than pheatmap. Any suggestions would be really helpful.
The color scale in pheatmap adjusts to the range of the input data. If you want anything above a certain value to be coloured daffodil, then simply send pheatmap a copy of your data with the highest values rounded to 2.
Suppose you have a data frame like this, with values anywhere between 0 and 3:
set.seed(1)
data <- as.data.frame(matrix(runif(64, 0, 3), nrow = 8))
names(data) <- LETTERS[1:8]
data
#> A B C D E F G H
#> 1 0.7965260 1.8873421 2.1528555 0.801662 1.4806239 2.46283888 2.1969412 0.9488151
#> 2 1.1163717 0.1853588 2.9757183 1.158342 0.5586528 1.94118058 2.0781947 1.5559028
#> 3 1.7185601 0.6179237 1.1401055 0.040171 2.4821200 2.34879829 1.4328589 1.9860152
#> 4 2.7246234 0.5296703 2.3323357 1.147164 2.0054002 1.65910893 2.5836284 1.2204906
#> 5 0.6050458 2.0610685 2.8041157 2.609073 2.3827196 1.58915874 1.3142913 2.7386278
#> 6 2.6951691 1.1523112 0.6364276 1.021047 0.3238309 2.36806870 0.7343918 0.8808101
#> 7 2.8340258 2.3095243 1.9550213 1.446240 2.1711328 0.06999361 0.2120371 1.3771972
#> 8 1.9823934 1.4930977 0.3766653 1.798697 1.2338233 1.43169020 0.2983985 0.9971840
Some of the values are greater than two. We want all of these to appear the same colour on our heatmap, so we create a copy of our data for plotting, and round down all of the values that were greater than 2 to be exactly 2:
data_2 <- data
data_2[] <- lapply(data_2, function(x) { x[x > 2] <- 2; x })
So now if we run pheatmap on data_2, we see that all the values that were greater than 2 in our original data frame are coloured daffodil.
library(viridis)
library(pheatmap)
breaks_2 <- seq(0, 2, by = 0.1)
pheatmap(
mat = data_2,
cluster_cols = F,
cluster_rows = F,
border_color = "white",
scale = 'none',
color = inferno(22),
show_colnames = TRUE,
show_rownames = FALSE,
legend_breaks = breaks_2
)
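The capping step can also be written with pmin(), which compares each cell with the cap and keeps the smaller value. This is only an alternative sketch of the same idea; the name data_capped is illustrative, and the result is a matrix rather than a data frame.

```r
set.seed(1)
data <- as.data.frame(matrix(runif(64, 0, 3), nrow = 8))
names(data) <- LETTERS[1:8]

# pmin() keeps the dimensions and dimnames of its first argument
data_capped <- pmin(as.matrix(data), 2)

max(data_capped)  # no value exceeds the cap of 2
```

The capped matrix can be passed to pheatmap() in place of data_2.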
I created a Sankey diagram using the plotly package.
Please look at the example below. I tried to make five streams, 1_6_7, 2_6_7, and so on, but two of the five links between 6 and 7 disappeared. As far as I can see, plotly allows only three or fewer links between two nodes.
Can I remove this restriction? Any help would be greatly appreciated.
Here is an example code and the outputs:
d <- expand.grid(1:5, 6, 7)
node_label <- 1:max(d)
node_colour <- scales::alpha(RColorBrewer::brewer.pal(7, "Set2"), 0.8)
link_source_nodeind <- c(d[,1], d[,2]) - 1
link_target_nodeind <- c(d[,2], d[,3]) - 1
link_value <- rep(100, nrow(d) * 2)
link_label <- rep(paste(d[,1], d[,2], d[,3], sep = "_"), 2)
link_colour <- rep(scales::alpha(RColorBrewer::brewer.pal(5, "Set2"), 0.2), 2)
p <- plotly::plot_ly(type = "sankey",
domain = c(x = c(0,1), y = c(0,1)),
orientation = "h",
node = list(label = node_label,
color = node_colour),
link = list(source = link_source_nodeind,
target = link_target_nodeind,
value = link_value,
label = link_label,
color = link_colour))
p
I have some authors with their city or country of affiliation. I would like to know if it is possible to plot the coauthors' networks (figure 1), on the map, having the coordinates of the countries. Please consider multiple authors from the same country. [EDIT: Several networks could be generated as in the example and should not show avoidable overlaps]. This is intended for dozens of authors. A zooming option is desirable. Bounty promise +100 for future better answer.
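library(RefManageR)  # as.BibEntry()
library(network)     # network(), %v%
library(GGally)      # ggnet2()
library(ggplot2)     # theme()
refs5 <- read.table(text="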
row bibtype year volume number pages title journal author
Bennett_1995 article 1995 76 <NA> 113--176 angiosperms. \"Annals of Botany\" \"Bennett Md, Leitch Ij\"
Bennett_1997 article 1997 80 2 169--196 estimates. \"Annals of Botany\" \"Bennett MD, Leitch IJ\"
Bennett_1998 article 1998 82 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"Bennett MD, Leitch IJ, Hanson L\"
Bennett_2000 article 2000 82 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"Bennett MD, Someone IJ\"
Leitch_2001 article 2001 83 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"Leitch IJ, Someone IJ\"
New_2002 article 2002 84 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"New IJ, Else IJ\"" , header=TRUE,stringsAsFactors=FALSE)
rownames(refs5) <- refs5[,1]
refs5<-refs5[,2:9]
citations <- as.BibEntry(refs5)
authorsl <- lapply(citations, function(x) as.character(toupper(x$author)))
unique.authorsl<-unique(unlist(authorsl))
coauth.table <- matrix(nrow=length(unique.authorsl),
ncol = length(unique.authorsl),
dimnames = list(unique.authorsl, unique.authorsl), 0)
for(i in 1:length(citations)){
paper.auth <- unlist(authorsl[[i]])
coauth.table[paper.auth,paper.auth] <- coauth.table[paper.auth,paper.auth] + 1
}
coauth.table <- coauth.table[rowSums(coauth.table)>0, colSums(coauth.table)>0]
diag(coauth.table) <- 0
coauthors<-coauth.table
bip = network(coauthors,
matrix.type = "adjacency",
ignore.eval = FALSE,
names.eval = "weights")
authorcountry <- read.table(text="
author country
1 \"LEITCH IJ\" Argentina
2 \"HANSON L\" USA
3 \"BENNETT MD\" Brazil
4 \"SOMEONE IJ\" Brazil
5 \"NEW IJ\" Brazil
6 \"ELSE IJ\" Brazil",header=TRUE,fill=TRUE,stringsAsFactors=FALSE)
matched<- authorcountry$country[match(unique.authorsl, authorcountry$author)]
bip %v% "Country" = matched
colorsmanual<-c("red","darkgray","gainsboro")
names(colorsmanual) <- unique(matched)
gdata<- ggnet2(bip, color = "Country", palette = colorsmanual, legend.position = "right",label = TRUE,
alpha = 0.9, label.size = 3, edge.size="weights",
size="degree", size.legend="Degree Centrality") + theme(legend.box = "horizontal")
gdata
In other words, I want to add the names of authors, lines and bubbles to the map. Note that several authors may be from the same city or country and should not overlap.
Figure 1 Network
EDIT: The current JanLauGe answer overlaps two non-related networks. authors "ELSE" and "NEW" need to be apart from others as in figure 1.
Are you looking for a solution using exactly the packages you used, or would you be happy to use a suite of other packages? Below is my approach, in which I extract the graph properties from the network object and plot them on a map using the ggplot2 and maps packages.
First I recreate the example data you gave.
library(tidyverse)
library(sna)
library(maps)
library(ggrepel)
library(GGally)  # ggnet2(), used below
set.seed(1)
coauthors <- matrix(
c(0,3,1,1,3,0,1,0,1,1,0,0,1,0,0,0),
nrow = 4, ncol = 4,
dimnames = list(c('BENNETT MD', 'LEITCH IJ', 'HANSON L', 'SOMEONE ELSE'),
c('BENNETT MD', 'LEITCH IJ', 'HANSON L', 'SOMEONE ELSE')))
coords <- data_frame(
country = c('Argentina', 'Brazil', 'USA'),
coord_lon = c(-63.61667, -51.92528, -95.71289),
coord_lat = c(-38.41610, -14.23500, 37.09024))
authorcountry <- data_frame(
author = c('LEITCH IJ', 'HANSON L', 'BENNETT MD', 'SOMEONE ELSE'),
country = c('Argentina', 'USA', 'Brazil', 'Brazil'))
Now I generate the graph object using the network function (from the network package, which sna loads):
# Generate network
bip <- network(coauthors,
matrix.type = "adjacency",
ignore.eval = FALSE,
names.eval = "weights")
# Graph with ggnet2 for centrality
gdata <- ggnet2(bip, color = "Country", legend.position = "right",label = TRUE,
alpha = 0.9, label.size = 3, edge.size="weights",
size="degree", size.legend="Degree Centrality") + theme(legend.box = "horizontal")
From the network object we can extract the values of each edge, and from the ggnet2 object we can get degree of centrality for nodes as below:
# Combine data
authors <-
# Get author numbers
data_frame(
id = seq(1, nrow(coauthors)),
author = sapply(bip$val, function(x) x$vertex.names)) %>%
left_join(
authorcountry,
by = 'author') %>%
left_join(
coords,
by = 'country') %>%
# Jittering points to avoid overlap between two authors
mutate(
coord_lon = jitter(coord_lon, factor = 1),
coord_lat = jitter(coord_lat, factor = 1))
# Get edges from network
networkdata <- sapply(bip$mel, function(x)
c('id_inl' = x$inl, 'id_outl' = x$outl, 'weight' = x$atl$weights)) %>%
t %>% as_data_frame
dt <- networkdata %>%
left_join(authors, by = c('id_inl' = 'id')) %>%
left_join(authors, by = c('id_outl' = 'id'), suffix = c('.from', '.to')) %>%
left_join(gdata$data %>% select(label, size), by = c('author.from' = 'label')) %>%
mutate(edge_id = seq(1, nrow(.)),
from_author = author.from,
from_coord_lon = coord_lon.from,
from_coord_lat = coord_lat.from,
from_country = country.from,
from_size = size,
to_author = author.to,
to_coord_lon = coord_lon.to,
to_coord_lat = coord_lat.to,
to_country = country.to) %>%
select(edge_id, starts_with('from'), starts_with('to'), weight)
Should look like this now:
dt
# A tibble: 8 × 11
edge_id from_author from_coord_lon from_coord_lat from_country from_size to_author to_coord_lon
<int> <chr> <dbl> <dbl> <chr> <dbl> <chr> <dbl>
1 1 BENNETT MD -51.12756 -16.992729 Brazil 6 LEITCH IJ -65.02949
2 2 BENNETT MD -51.12756 -16.992729 Brazil 6 HANSON L -96.37907
3 3 BENNETT MD -51.12756 -16.992729 Brazil 6 SOMEONE ELSE -52.54160
4 4 LEITCH IJ -65.02949 -35.214117 Argentina 4 BENNETT MD -51.12756
5 5 LEITCH IJ -65.02949 -35.214117 Argentina 4 HANSON L -96.37907
6 6 HANSON L -96.37907 36.252312 USA 4 BENNETT MD -51.12756
7 7 HANSON L -96.37907 36.252312 USA 4 LEITCH IJ -65.02949
8 8 SOMEONE ELSE -52.54160 -9.551913 Brazil 2 BENNETT MD -51.12756
# ... with 3 more variables: to_coord_lat <dbl>, to_country <chr>, weight <dbl>
Now moving on to plotting this data on a map:
world_map <- map_data('world')
myMap <- ggplot() +
# Plot map
geom_map(data = world_map, map = world_map, aes(map_id = region),
color = 'gray85',
fill = 'gray93') +
xlim(c(-120, -20)) + ylim(c(-50, 50)) +
# Plot edges
geom_segment(data = dt,
alpha = 0.5,
color = "dodgerblue1",
aes(x = from_coord_lon, y = from_coord_lat,
xend = to_coord_lon, yend = to_coord_lat,
size = weight)) +
scale_size(range = c(1,3)) +
# Plot nodes
geom_point(data = dt,
aes(x = from_coord_lon,
y = from_coord_lat,
size = from_size,
colour = from_country)) +
# Plot names
geom_text_repel(data = dt %>%
select(from_author,
from_coord_lon,
from_coord_lat) %>%
unique,
colour = 'dodgerblue1',
aes(x = from_coord_lon, y = from_coord_lat, label = from_author)) +
coord_equal() +
theme_bw()
Obviously you can change the colour and design in the usual way with ggplot2 grammar. Notice that you could also use geom_curve and the arrow aesthetic to get a plot similar to the one in the uber post linked in the comments above.
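A minimal sketch of the geom_curve variant, using a made-up two-row edge table (edges) built from the country coordinates given earlier rather than the full dt object:

```r
library(ggplot2)

# Toy edge list: Brazil -> Argentina and Brazil -> USA (country centres from above)
edges <- data.frame(
  from_lon = c(-51.92528, -51.92528),
  from_lat = c(-14.23500, -14.23500),
  to_lon   = c(-63.61667, -95.71289),
  to_lat   = c(-38.41610,  37.09024)
)

p <- ggplot(edges) +
  # Curved edges with an arrow head at the target end
  geom_curve(aes(x = from_lon, y = from_lat, xend = to_lon, yend = to_lat),
             curvature = 0.2, colour = "dodgerblue1", alpha = 0.5,
             arrow = arrow(length = unit(0.03, "npc"))) +
  geom_point(aes(x = from_lon, y = from_lat)) +
  coord_equal() +
  theme_bw()
```

In the full plot this layer would replace geom_segment, with data = dt and the same from_/to_ columns.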
In an effort to avoid the overlap of the two networks, I came up with this modification of the x and y coordinates of the ggplot layout, which by default does not overlap the networks (see figure 1 in the question).
# get centroid positions for countries
# add coordinates to authorcountry table
# download and unzip
# https://worldmap.harvard.edu/data/geonode:country_centroids_az8
setwd("~/country_centroids_az8")
library(rgdal)
cent <- readOGR('.', "country_centroids_az8", stringsAsFactors = F)
countrycentdf<-cent@data[,c("name","Longitude","Latitude")]
countrycentdf$name[which(countrycentdf$name=="United States")]<-"USA"
colnames(countrycentdf)[names(countrycentdf)=="name"]<-"country"
authorcountry$Longitude<-countrycentdf$Longitude[match(authorcountry$country,countrycentdf$country)]
authorcountry$Latitude <-countrycentdf$Latitude [match(authorcountry$country,countrycentdf$country)]
# original coordinates of plot and their transformation
ggnetbuild<-ggplot_build(gdata)
allcoord<-ggnetbuild$data[[3]][,c("x","y","label")]
allcoord$Latitude<-authorcountry$Latitude [match(allcoord$label,authorcountry$author)]
allcoord$Longitude<-authorcountry$Longitude [match(allcoord$label,authorcountry$author)]
allcoord$country<-authorcountry$country [match(allcoord$label,authorcountry$author)]
# increase with factor the distance among dots
factor<-7
allcoord$coord_lat<-allcoord$y*factor+allcoord$Latitude
allcoord$coord_lon<-allcoord$x*factor+allcoord$Longitude
allcoord$author<-allcoord$label
# plot as in answer of JanLauGe, without jitter
library(tidyverse)
library(ggrepel)
authors <-
# Get author numbers
data_frame(
id = seq(1, nrow(coauthors)),
author = sapply(bip$val, function(x) x$vertex.names)) %>%
left_join(
allcoord,
by = 'author')
# Continue as in answer of JanLauGe
networkdata <- ##
dt <- ##
world_map <- map_data('world')
myMap <- ##
myMap
I am running into a problem where I cannot color different sections of a graph I am working on. I want to change the color based on a threshold on the y axis. It seems easy from the Highcharts JavaScript documentation, but it fails when I try to translate it into R to use rCharts. Here is what I have, with an example data frame called final.
library(rCharts)
final <- data.frame(x = c(1,2,3,4,5), y = c(15,17,20,21,22), radius = rep(3,5))
temp <- apply(final[,c("x","y","radius"])], 1, as.list)
a <- Highcharts$new()
a$series(animation = FALSE, name = "placeholder", data = temp, type = "scatter",
lineWidth = 1, zoneAxis = "y", zones = list(list(value = 20,
color = "rgba(223, 83, 83, 0.5)"),
list(color = "rgba(50, 205, 50, 0.5)")))
This does not change the plot from the default color, so I'm very confused and would greatly appreciate any help or possibly another solution. Thanks!
This is modeled off of the documentation for zones here: http://www.highcharts.com/docs/chart-concepts/series#4aaN
Here are the points I do not understand:
This code has a stray ] inside the c() call, which is a syntax error:
temp <- apply(final[,c("x","y","radius"])], 1, as.list)
I guess your intent was to select all columns, which can be achieved with a simple:
temp <- apply(final[ , ], 1, as.list)
But this code also seems strange to me, because the same result can be obtained with just:
temp <- apply(final, 1, as.list)
temp
[[1]]
[[1]]$x
[1] 1
[[1]]$y
[1] 15
[[1]]$radius
[1] 3
[[2]]
[[2]]$x
[1] 2
[[2]]$y
[1] 17
[[2]]$radius
[1] 3
[[3]]
[[3]]$x
[1] 3
[[3]]$y
[1] 20
[[3]]$radius
[1] 3
[[4]]
[[4]]$x
[1] 4
[[4]]$y
[1] 21
[[4]]$radius
[1] 3
[[5]]
[[5]]$x
[1] 5
[[5]]$y
[1] 22
[[5]]$radius
[1] 3
That is, if this is what you wanted.
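One caveat with apply(): it converts the data frame to a matrix first, so if the columns had mixed types everything would be coerced to character. A sketch of a row-wise alternative that keeps each column's type (the name pts is just illustrative):

```r
final <- data.frame(x = c(1, 2, 3, 4, 5),
                    y = c(15, 17, 20, 21, 22),
                    radius = rep(3, 5))

# One list(x = , y = , radius = ) per row, without going through a matrix
pts <- lapply(seq_len(nrow(final)), function(i) as.list(final[i, ]))

str(pts[[1]])
```

For the all-numeric data frame in the question, both approaches give one named list per point.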
Edit: Color changed (another approach)
I changed the color, though I have to admit I arrived at this solution because I do not know which package the rgba function belongs to.
I changed rgba to rgb, unquoted it (are you sure it should be quoted?) and set max = 255 in rgb. Here is the code.
a$series( animation = FALSE, name = "placeholder",
data = temp, type = "scatter",
lineWidth = 1, zoneAxis = "y",
zones = list(list(value = 20,
color = rgb(223, 83, 83, 0.5, max = 255)),
list(color = rgb(50, 205, 50, 0.5, max = 255))
)
)
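One thing worth checking in the call above: in grDevices::rgb(), alpha is measured on the same scale as maxColorValue, so alpha = 0.5 with max = 255 is almost fully transparent rather than 50% opaque. A sketch of the scaling, assuming half opacity is what is wanted (half_red and half_green are illustrative names):

```r
# rgb() returns a hex string of the form "#RRGGBBAA";
# with maxColorValue = 255, alpha must also be given on the 0-255 scale
half_red   <- rgb(223, 83, 83, alpha = 0.5 * 255, maxColorValue = 255)
half_green <- rgb(50, 205, 50, alpha = 0.5 * 255, maxColorValue = 255)

# Same zone structure as above, with the rescaled colours
zones <- list(list(value = 20, color = half_red),
              list(color = half_green))
```

This only illustrates the R side; how the chart renders an 8-digit hex colour is not something the sketch verifies.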
I'm using Paul Bleicher's Calendar Heatmap to visualize some events over time, and I'd like to add black-and-white fill patterns instead of (or on top of) the color coding to increase the readability of the Calendar Heatmap when printed in black and white.
Here is an example of how the Calendar Heatmap looks in color,
and here is how it looks in black and white,
it gets very difficult to distinguish between the individual levels in black and white.
Is there an easy way to get R to add some kind of pattern to the 6 levels instead of color?
Code to reproduce the Calendar Heatmap in color.
source("http://blog.revolution-computing.com/downloads/calendarHeat.R")
stock <- "MSFT"
start.date <- "2012-01-12"
end.date <- Sys.Date()
quote <- paste("http://ichart.finance.yahoo.com/table.csv?s=", stock, "&a=", substr(start.date,6,7), "&b=", substr(start.date, 9, 10), "&c=", substr(start.date, 1,4), "&d=", substr(end.date,6,7), "&e=", substr(end.date, 9, 10), "&f=", substr(end.date, 1,4), "&g=d&ignore=.csv", sep="")
stock.data <- read.csv(quote, as.is=TRUE)
# convert the continuous var to a categorical var
stock.data$by <- cut(stock.data$Adj.Close, b = 6, labels = F)
calendarHeat(stock.data$Date, stock.data$by, varname="MSFT Adjusted Close")
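The calendarHeat() call only needs a date vector and a value vector, so the discretisation step can be tried without the download. A sketch with simulated prices (fake.data and its values are made up for illustration, not real MSFT quotes):

```r
set.seed(123)
# Hypothetical stand-in for stock.data: one value per day, a random walk around 30
dates <- seq(as.Date("2012-01-12"), by = "day", length.out = 200)
fake.data <- data.frame(
  Date      = dates,
  Adj.Close = 30 + cumsum(rnorm(length(dates), sd = 0.3))
)

# Same discretisation as above: six equal-width bins, returned as integers 1..6
fake.data$by <- cut(fake.data$Adj.Close, breaks = 6, labels = FALSE)

table(fake.data$by)
```

Once calendarHeat.R has been sourced, calendarHeat(fake.data$Date, fake.data$by) should take the place of the call above.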
Update 02-13-2013 03:52:11Z: what do I mean by adding a pattern?
I envision adding a pattern to the individual day-boxes in the Calendar Heatmap, the way a pattern is added to the individual slices in the pie chart to the right (B) in this plot,
found here, or something like the states in this plot.
I answered this question before it got a bounty. It looks like the OP found my previous answer a little bit complicated, so I organized the code in a single gist here. You just need to download the file and source it.
I created a new function, extra.calendarHeat, which is an extension of the first one and draws a heatmap of a double time series (dat, value1, value2). I added these new parameters:
pch.symbol : vector of symbols, default 15:20
cex.symbol : cex of the symbols, default = 2
col.symbol : color of symbols, default #00000044
pvalues : value of symbols
Here are some examples:
## I am using same data
stock <- "MSFT"
start.date <- "2012-01-12"
end.date <- Sys.Date()
quote <- paste("http://ichart.finance.yahoo.com/table.csv?s=",
stock,
"&a=", substr(start.date,6,7),
"&b=", substr(start.date, 9, 10),
"&c=", substr(start.date, 1,4),
"&d=", substr(end.date,6,7),
"&e=", substr(end.date, 9, 10),
"&f=", substr(end.date, 1,4),
"&g=d&ignore=.csv", sep="")
stock.data <- read.csv(quote, as.is=TRUE)
p1 <- extra.calendarHeat(dates= stock.data$Date, values = stock.data$Adj.Close,
pvalues = stock.data$Volume,
varname="W&B MSFT Adjusted Close
\n Volume as no border symbol ")
## multiply symbols
p2 <- extra.calendarHeat(dates= stock.data$Date, values = stock.data$Adj.Close,
pvalues = stock.data$Volume,
varname="W&B MSFT Adjusted Close \n
black Volume as multiply symbol ",
pch.symbol = c(3,4,8,9),
col.symbol='black')
## circles symbols
p3 <- extra.calendarHeat(dates= stock.data$Date, values = stock.data$Adj.Close,
pvalues = stock.data$Volume,
varname="W&B MSFT Adjusted Close \n blue Volume as circles",
pch.symbol = c(1,10,13,16,18),
col.symbol='blue')
## triangles symbols
p4 <- extra.calendarHeat(dates= stock.data$Date, values = stock.data$Adj.Close,
pvalues = stock.data$Volume,
varname="W&B MSFT Adjusted Close \n red Volume as triangles",
pch.symbol = c(2,6,17,24,25),
col.symbol='red')
p5 <- extra.calendarHeat(dates= stock.data$Date, values = stock.data$Adj.Close,
varname="MSFT Adjusted Close",
pch.symbol = LETTERS,
col.symbol='black')
# symbols are LETTERS
p6 <- extra.calendarHeat(dates= stock.data$Date, values = stock.data$Adj.Close,
pvalues = stock.data$Volume,
varname="MSFT Adjusted Close \n Volume as LETTERS symbols",
pch.symbol = letters,
color='r2b')
You can use panel.levelplot.points from latticeExtra to add a pattern. I think the question as asked is a little bit specific, so I will try to generalize it. The idea is to give the steps to transform a time series into a calendar heatmap with 2 patterns (fill color and a shape). We can imagine multiple time series (Close/Open). For example, you can get something like this
or like this, using a ggplot2 theme:
The function calendarHeat, given a single time series (dat, value), transforms the data like this:
date.seq value dotw woty yr month seq
1 2012-01-01 NA 0 2 2012 1 1
2 2012-01-02 NA 1 2 2012 1 2
3 2012-01-03 NA 2 2 2012 1 3
4 2012-01-04 NA 3 2 2012 1 4
5 2012-01-05 NA 4 2 2012 1 5
6 2012-01-06 NA 5 2 2012 1 6
So I assume that I have data formatted like this; otherwise, I extracted the data transformation part of calendarHeat into a function (see this gist):
dat <- transformdata(stock.data$Date, stock.data$by)
Then the calendar is essentially a levelplot with custom scales, a custom theme and a custom panel function.
library(latticeExtra)
levelplot(value~woty*dotw | yr, data=dat, border = "black",
layout = c(1, nyr%%7),
col.regions = (calendar.pal(ncolors)),
aspect='iso',
between = list(x=0, y=c(1,1)),
strip=TRUE,
panel = function(...) {
panel.levelplot(...)
calendar.division(...)
panel.levelplot.points(...,na.rm=T,
col='blue',alpha=0.5,
## you can play with cex and pch here to get the pattern you
## like
cex =dat$value/max(dat$value,na.rm=T)*3,
pch=ifelse(is.na(dat$value),NA,20),
type = c("p"))
},
scales= scales,
xlim =extendrange(dat$woty,f=0.01),
ylim=extendrange(dat$dotw,f=0.1),
cuts= ncolors - 1,
colorkey= list(col = calendar.pal(ncolors), width = 0.6, height = 0.5),
subscripts=TRUE,
par.settings = calendar.theme)
Where the scales are:
scales = list(
x = list( at= c(seq(2.9, 52, by=4.42)),
labels = month.abb,
alternating = c(1, rep(0, (nyr-1))),
tck=0,
cex =1),
y=list(
at = c(0, 1, 2, 3, 4, 5, 6),
labels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday"),
alternating = 1,
cex =1,
tck=0))
And the theme is set as:
calendar.theme <- list(
xlab=NULL,ylab=NULL,
strip.background = list(col = "transparent"),
strip.border = list(col = "transparent"),
axis.line = list(col="transparent"),
par.strip.text=list(cex=2))
The panel function uses a function calendar.division. In fact, drawing the division of the grid (the black month contour) is very long and is done with the grid package the hard way (panel focus...). I changed it a little bit, and now I call it in the lattice panel function as calendar.division.
We can use ggplot2's scale_shape_manual to get us shapes that appear close to shading, and we can plot these over the grey heatmap.
Note: This was adapted from @Jay's comments in the original blog posting for the calendar heatmap
# PACKAGES
library(ggplot2)
library(data.table)
# Transform data
stock.data <- transform(stock.data,
week = as.POSIXlt(Date)$yday %/% 7 + 1,
month = as.POSIXlt(Date)$mon + 1,
wday = factor(as.POSIXlt(Date)$wday, levels=0:6, labels=levels(weekdays(1, abb=FALSE)), ordered=TRUE),
year = as.POSIXlt(Date)$year + 1900)
# find when the months change
# Not used, but could be
stock.data$mchng <- as.logical(c(0, diff(stock.data$month)))
# we need dummy data for Sunday / Saturday to be included.
# These added rows will not be plotted due to their NA values
dummy <- as.data.frame(stock.data[1:2, ])
dummy[, -which(names(dummy) %in% c("wday", "year"))] <- NA
dummy[, "wday"] <- weekdays(2:3, FALSE)
dummy[, "mchng"] <- TRUE
rbind(dummy, stock.data) -> stock.data
# convert the continuous var to a categorical var
stock.data$Adj.Disc <- cut(stock.data$Adj.Close, b = 6, labels = F)
# vals is the greyscale tones used for the outer monthly borders
vals <- gray(c(.2, .5))
# PLOT
# Expected warning due to dummy variable with NA's:
# Warning message:
# Removed 2 rows containing missing values (geom_point).
ggplot(stock.data) +
aes(week, wday, fill=as.factor(Adj.Disc),
shape=as.factor(Adj.Disc), color=as.factor(month %% 2)) +
geom_tile(linetype=1, size=1.8) +
geom_tile(linetype=6, size=0.4, color="white") +
scale_color_manual(values=vals) +
geom_point(aes(alpha=0.2), color="black") +
scale_fill_grey(start=0, end=0.9) + scale_shape_manual(values=c(2, 3, 4, 12, 14, 8)) +
theme(legend.position="none") + labs(y="Day of the Week") + facet_wrap(~ year, ncol = 1)
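The core of this approach, shapes drawn on top of grey tiles, can be tried in isolation. A minimal sketch on a made-up grid (toy and its random levels are purely illustrative, not stock data):

```r
library(ggplot2)

set.seed(42)
# Toy calendar grid: 8 weeks x 7 days, each cell assigned one of 6 levels
toy <- data.frame(week = rep(1:8, each = 7), wday = rep(1:7, times = 8))
toy$level <- factor(sample(1:6, nrow(toy), replace = TRUE), levels = 1:6)

p <- ggplot(toy, aes(week, wday, fill = level, shape = level)) +
  geom_tile(colour = "white") +                          # grey heatmap cells
  geom_point(colour = "black") +                         # one symbol per level
  scale_fill_grey(start = 0.4, end = 0.95) +
  scale_shape_manual(values = c(2, 3, 4, 12, 14, 8)) +   # same shapes as above
  theme(legend.position = "none")
```

Because the level is encoded twice (grey tone and symbol), the cells should stay distinguishable when printed in black and white.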