Highlight a Specific Section of a Treemap using Treemap package in R - r

I am using the Treemap package in R to highlight the number of COVID outbreaks in different settings. I am making a number of different reports using R Markdown. Each one describes a different type of settings and I would like to highlight that setting in the treemap for each report, showing what proportion of total outbreaks occur in the setting in question. For example you I am currently working on the K-12 school report and would like to highlight the box representing that category in the figure.
I was previously using an exploded donut pie chart however there were two many subcategories and the graph became hard to read.
I am picturing a way to change the label or border on one specific box, ie. put a yellow border around the box or make the label yellow. I found a way to do both these things for all the boxes but not just one specific box. I made this image using the snipping tool to further illustrate what the desired outcome might look like. The code to generate the treemap can be found in the link below. It looks like this:
# library
library(treemap)
# Build Dataset
group <- c(rep("group-1",4),rep("group-2",2),rep("group-3",3))
subgroup <- paste("subgroup" , c(1,2,3,4,1,2,1,2,3), sep="-")
value <- c(13,5,22,12,11,7,3,1,23)
data <- data.frame(group,subgroup,value)
# treemap
treemap(data,
index=c("group","subgroup"),
vSize="value",
type="index"
)
This is the most straightforward information I can find about the package, this is where I took the sample image and code from: https://www.r-graph-gallery.com/236-custom-your-treemap.html

It looks like the treemap package doesn't have a built-in way to do this. But we can hack it by using the data frame returned by treemap() and adding a rectangle to the appropriate viewport.
# Plot the treemap and save the data used for plotting.
t = treemap(data,
index = c("group", "subgroup"),
vSize = "value",
type = "index"
)
# Add a rectangle around subgroup-2.
library(grid)
library(dplyr)
with(
# t$tm is a data frame with one row per rectangle. Filter to the group we
# want to highlight.
t$tm %>%
filter(group == "group-1",
subgroup == "subgroup-2"),
{
# Use grid.rect to add a rectangle on top of the treemap.
grid.rect(x = x0 + (w / 2),
y = y0 + (h / 2),
width = w,
height = h,
gp = gpar(col = "yellow", fill = NA, lwd = 4),
vp = "data")
}
)

Related

Is there a way to have a highlighted chart as well as have interactivity of selecting elements in R?

I have come across a beautiful chart on this webpage: https://ourworldindata.org/coronavirus and interested to know if we can build the same chart in R with functionality of having highlighted series as well as selecting any line on hovering ?
I have build static highlighted charts using gghighlight but those are not interactive.
Plotly can help in interaction but I think they don't work with gghighlight.
So how can we have the combination of both highlight and interactivity in charts as in the link shared on top ?
Is it possible to achieve same results in R ? It would be really helpful if someone could share an example or link that can help.
(UPDATE: May be I can manually highlight lines by creating a factor column instead of using gghighlight and then pass it to ggplotly but then can ggplotly or some other library provide similar results on hover ?)
(NOTE: Not looking for animation. Just need highlighted, hover over interactive chart)
Below is the snapshot of same chart hovered over US (This chart is also similar to the one shared in World Economic Forum many times.)
Using plotly you can use highlight() to achive this.
This is a slightly modified example from here:
library(plotly)
# load the `txhousing` dataset
data(txhousing, package = "ggplot2")
# declare `city` as the SQL 'query by' column
tx <- highlight_key(txhousing, ~city)
# initiate a plotly object
base <- plot_ly(tx, color = I("black")) %>%
group_by(city)
# create a time series of median house price
time_series <- base %>%
group_by(city) %>%
add_lines(x = ~date, y = ~median)
highlight(
time_series,
on = "plotly_hover",
selectize = FALSE,
dynamic = FALSE,
color = "red",
persistent = FALSE
)

Plot a table with box size changing

Does anyone have an idea how is this kind of chart plotted? It seems like heat map. However, instead of using color, size of each cell is used to indicate the magnitude. I want to plot a figure like this but I don't know how to realize it. Can this be done in R or Matlab?
Try scatter:
scatter(x,y,sz,c,'s','filled');
where x and y are the positions of each square, sz is the size (must be a vector of the same length as x and y), and c is a 3xlength(x) matrix with the color value for each entry. The labels for the plot can be input with set(gcf,properties) or xticklabels:
X=30;
Y=10;
[x,y]=meshgrid(1:X,1:Y);
x=reshape(x,[size(x,1)*size(x,2) 1]);
y=reshape(y,[size(y,1)*size(y,2) 1]);
sz=50;
sz=sz*(1+rand(size(x)));
c=[1*ones(length(x),1) repmat(rand(size(x)),[1 2])];
scatter(x,y,sz,c,'s','filled');
xlab={'ACC';'BLCA';etc}
xticks(1:X)
xticklabels(xlab)
set(get(gca,'XLabel'),'Rotation',90);
ylab={'RAPGEB6';etc}
yticks(1:Y)
yticklabels(ylab)
EDIT: yticks & co are only available for >R2016b, if you don't have a newer version you should use set instead:
set(gca,'XTick',1:X,'XTickLabel',xlab,'XTickLabelRotation',90) %rotation only available for >R2014b
set(gca,'YTick',1:Y,'YTickLabel',ylab)
in R, you should use ggplot2 that allows you to map your values (gene expression in your case?) onto the size variable. Here, I did a simulation that resembles your data structure:
my_data <- matrix(rnorm(8*26,mean=0,sd=1), nrow=8, ncol=26,
dimnames = list(paste0("gene",1:8), LETTERS))
Then, you can process the data frame to be ready for ggplot2 data visualization:
library(reshape)
dat_m <- melt(my_data, varnames = c("gene", "cancer"))
Now, use ggplot2::geom_tile() to map the values onto the size variable. You may update additional features of the plot.
library(ggplot2)
ggplot(data=dat_m, aes(cancer, gene)) +
geom_tile(aes(size=value, fill="red"), color="white") +
scale_fill_discrete(guide=FALSE) + ##hide scale
scale_size_continuous(guide=FALSE) ##hide another scale
In R, corrplotpackage can be used. Specifically, you have to use method = 'square' when creating the plot.
Try this as an example:
library(corrplot)
corrplot(cor(mtcars), method = 'square', col = 'red')

How to overlay multiple TA in new plot using quantmod?

We can plot candle stick chart using chart series function chartSeries(Cl(PSEC)) I have created some custom values (I1,I2 and I3) which I want to plot together(overlay) outside the candle stick pattern. I have used addTA() for this purpose
chartSeries(Cl(PSEC)), TA="addTA(I1,col=2);addTA(I2,col=3);addTA(I3,col=4)")
The problem is that it plots four plots for Cl(PSEC),I1,I2 and I3 separately instead of two plots which I want Cl(PSEC) and (I1,I2,I3)
EDITED
For clarity I am giving a sample code with I1, I2 and I3 variable created for this purpose
library(quantmod)
PSEC=getSymbols("PSEC",auto.assign=F)
price=Cl(PSEC)
I1=SMA(price,3)
I2=SMA(price,10)
I3=SMA(price,15)
chartSeries(price, TA="addTA(I1,col=2);addTA(I2,col=3);addTA(I3,col=4)")
Here is an option which preserves largely your original code.
You can obtain the desired result using the option on=2 for each TA after the first:
library(quantmod)
getSymbols("PSEC")
price <- Cl(PSEC)
I1 <- SMA(price,3)
I2 <- SMA(price,10)
I3 <- SMA(price,15)
chartSeries(price, TA=list("addTA(I1, col=2)", "addTA(I2, col=4, on=2)",
"addTA(I3, col=5, on=2)"), subset = "last 6 months")
If you want to overlay the price and the SMAs in one chart, you can use the option on=1 for each TA.
Thanks to #hvollmeier who made me realize with his answer that I had misunderstood your question in the previous version of my answer.
PS: Note that several options are described in ?addSMA(), including with.col which can be used to select a specific column of the time series (Cl is the default column).
If I understand you correctly you want the 3 SMAs in a SUBPLOT and NOT in your main chart window.You can do the following using newTA.
Using your data:
PSEC=getSymbols("PSEC",auto.assign=F)
price=Cl(PSEC)
Now plotting a 10,30,50 day SMA in a window below the main window:
chartSeries(price['2016'])
newSMA <- newTA(SMA, Cl, on=NA)
newSMA(10)
newSMA(30,on=2)
newSMA(50,on=2)
The key is the argument on. Use on = NA in defining your new TA function, because the default value foron is 1, which is the main window. on = NA plots in a new window. Then plot the remaining SMAs to the same window as the first SMA. Style the colours etc.to your liking :-).
You may want to consider solving this task using plotting with the newer quantmod charts in the quantmod package (chart_Series as opposed to chartSeries).
Pros:
-The plots look cleaner and better (?)
-have more flexibility via editing the pars and themes options to chart_Series (see other examples here on SO for the basics of things you can do with pars and themes)
Cons:
-Not well documented.
PSEC=getSymbols("PSEC",auto.assign=F)
price=Cl(PSEC)
chart_Series(price, subset = '2016')
add_TA(SMA(price, 10))
add_TA(SMA(price, 30), on = 2, col = "green")
add_TA(SMA(price, 50), on = 2, col = "red")
# Make plot all at once (this approach is useful in shiny applications):
print(chart_Series(price, subset = '2016', TA = 'add_TA(SMA(price, 10), yaxis = list(0, 10));
add_TA(SMA(price, 30), on = 2, col = "purple"); add_TA(SMA(price, 50), on = 2, col = "red")'))

How do I exclude parameters from an RDA plot

I'm still relatively inexperienced manipulating plots in R, and am in need of assistance. I ran a redundancy analysis in R using the rda() function, but now I need to simplify the figure to exclude unnecessary information. The code I'm currently using is:
abio1516<-read.csv("1516 descriptors.csv")
attach(abio1516)
bio1516<-read.csv("1516habund.csv")
attach(bio1516)
rda1516<-rda(bio1516[,2:18],abio1516[,2:6])
anova(rda1516)
RsquareAdj(rda1516)
summary(rda1516)
varpart(bio1516[,2:18],~Distance_to_source,~Depth, ~Veg._cover, ~Surface_area,data=abio1516)
plot(rda1516,bty="n",xaxt="n",yaxt="n",main="1516; P=, R^2=",
ylab="Driven by , Var explained=",xlab="Driven by , Var explained=")
The produced plot looks like this:
Please help me modify my code to: exclude the sites (sit#), all axes, and the internal dashed lines.
I'd also like to either expand the size of the field, or move the vector labels to all fit in the plotting field.
updated as per responses, working code below this point
plot(rda,bty="n",xaxt="n",yaxt="n",type="n",main="xxx",ylab="xxx",xlab="xxx
Overall best:xxx")
abline(h=0,v=0,col="white",lwd=3)
points(rda,display="species",col="blue")
points(rda,display="cn",col="black")
text(rda,display="cn",col="black")
Start by plotting the rda with type = "n" which generates an empty plot to which you can add the things you want. The dotted lines are hard coded into the plot.cca function, so you need either make your own version, or use abline to hide them (then use box to cover up the holes in the axes).
require(vegan)
data(dune, dune.env)
rda1516 <- rda(dune~., data = dune.env)
plot(rda1516, type = "n")
abline(h = 0, v = 0, col = "white", lwd = 3)
box()
points(rda1516, display = "species")
points(rda1516, display = "cn", col = "blue")
text(rda1516, display = "cn", col = "blue")
If the text labels are not in the correct position, you can use the argument pos to move them (make a vector as long as the number of arrows you have with the integers 1 - 4 to move the label down, left, up, or right. (there might be better solutions to this)

ggplot2 equivalent of 'factorization or categorization' in googleVis in R

Due to static graph prepared by ggplot, we are shifting our graphs to googleVis with interactive charts. But when it comes to categorization we are facing many problems. Let me give example which will help you understand:
#dataframe
df = data.frame( x = sample(1:100), y = sample(1:100), cat = sample(c('a','b','c'), 100, replace=TRUE) )
ggplot2 provides parameter like alpha, colour, linetype, size which we can use with categories like shown below:
ggplot(df) + geom_line(aes(x = x, y = y, colour = cat))
Not just line chart, but majority of ggplot2 graphs provide categorization based on column values. Now I would like to do the same in googleVis, based on value df$cat I would like parameters to get changed or grouping of line or charts.
Note:
I have already tried dcast to make multiple columns based on category column and use those multiple columns as Y input, but that it not what I would like to do.
Can anyone help me regarding this?
Let me know if you need more information.
vrajs5 you are not alone! We struggled with this issue. In our case we wanted to fill bar charts like in ggplot. This is the solution. You need to add specifically named columns, linked to your variables, to your data table for googleVis to pick up.
In my fill example, these are called roles, but once you see my syntax you can abstract it to annotations and other cool features. Google has them all documented here (check out superheroes example!) but it was not obvious how it applied to r.
#mages has this documented on this webpage, which shows features not in demo(googleVis):
http://cran.r-project.org/web/packages/googleVis/vignettes/Using_Roles_via_googleVis.html
EXAMPLE ADDING NEW DIMENSIONS TO GOOGLEVIS CHARTS
# in this case
# How do we fill a bar chart showing bars depend on another variable?
# We wanted to show C in a different fill to other assets
suppressPackageStartupMessages(library(googleVis))
library(data.table) # You can use data frames if you don't like DT
test.dt = data.table(px = c("A","B","C"), py = c(1,4,9),
"py.style" = c('silver', 'silver', 'gold'))
# Add your modifier to your chart as a new variable e.g. py1.style
test <-gvisBarChart(test.dt,
xvar = "px",
yvar = c("py", "py.style"),
options = list(legend = 'none'))
plot(test)
We have shown py.style deterministically here, but you could code it to be dependent on your categories.
The secret is myvar.googleVis_thing_youneed linking the variable myvar to the googleVis feature.
RESULT BEFORE FILL (yvar = "py")
RESULT AFTER FILL (yvar = c("py", "py.style"))
Take a look at mages examples (code also on Github) and you will have cracked the "categorization based on column values" issue.

Resources