I am running into a problem where I cannot color different sections of a graph I am working on. I want to change the color based on a threshold on the y axis. It looks easy in the Highcharts JavaScript documentation, but it fails when I try to translate it into R using rCharts. Here is what I have, with an example data frame called final.
library(rCharts)
final <- data.frame(x = c(1,2,3,4,5), y = c(15,17,20,21,22), radius = rep(3,5))
temp <- apply(final[,c("x","y","radius"])], 1, as.list)
a <- Highcharts$new()
a$series(animation = FALSE, name = "placeholder", data = temp, type = "scatter",
lineWidth = 1, zoneAxis = "y", zones = list(list(value = 20,
color = "rgba(223, 83, 83, 0.5)"),
list(color = "rgba(50, 205, 50, 0.5)")))
This does not change the plot from the default color, so I'm very confused and would greatly appreciate any help or possibly another solution. Thanks!
This is modeled off of the documentation for zones here: http://www.highcharts.com/docs/chart-concepts/series#4aaN
Here are the points I do not understand:
This code has a stray ] inside the c() call, which makes it a syntax error:
temp <- apply(final[,c("x","y","radius"])], 1, as.list)
Your intent, I guess, was to select all columns, which can be achieved with a simple:
temp <- apply(final[ , ], 1, as.list)
But even this looks strange to me, because the same result can be obtained with just:
temp <- apply(final, 1, as.list)
temp
[[1]]
[[1]]$x
[1] 1
[[1]]$y
[1] 15
[[1]]$radius
[1] 3
[[2]]
[[2]]$x
[1] 2
[[2]]$y
[1] 17
[[2]]$radius
[1] 3
[[3]]
[[3]]$x
[1] 3
[[3]]$y
[1] 20
[[3]]$radius
[1] 3
[[4]]
[[4]]$x
[1] 4
[[4]]$y
[1] 21
[[4]]$radius
[1] 3
[[5]]
[[5]]$x
[1] 5
[[5]]$y
[1] 22
[[5]]$radius
[1] 3
If of course you wanted that.
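As a quick check of the claim above, here is a minimal sketch that rebuilds the question's final data frame and compares the two calls. The names by_cols and whole are made up for this illustration.

```r
# Rebuild the example data frame from the question
final <- data.frame(x = c(1, 2, 3, 4, 5),
                    y = c(15, 17, 20, 21, 22),
                    radius = rep(3, 5))

# Explicit column selection (the question's line with the stray "]" removed)
by_cols <- apply(final[, c("x", "y", "radius")], 1, as.list)

# No column selection at all
whole <- apply(final, 1, as.list)

# Both are lists with one named list (x, y, radius) per row
identical(by_cols, whole)
length(whole)
```

The two only coincide because the selection names every column of final. With extra columns in the data frame, the explicit selection would matter.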
Edit: Color changed (another approach)
I got the color to change, although I have to admit I arrived at this solution because I did not know which package the rgba function belongs to.
I changed rgba to rgb, unquoted it (are you sure it should be quoted?) and passed max = 255 to rgb (max partially matches rgb's maxColorValue argument). Here is the code.
a$series( animation = FALSE, name = "placeholder",
data = temp, type = "scatter",
lineWidth = 1, zoneAxis = "y",
zones = list(list(value = 20,
color = rgb(223, 83, 83, 0.5, max = 255)),
list(color = rgb(50, 205, 50, 0.5, max = 255))
)
)
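One detail worth checking in the call above: in base R's rgb(), alpha is measured on the same scale as the colour channels, so with maxColorValue = 255 an alpha of 0.5 is almost fully transparent rather than 50% opaque. Note also that the quoted "rgba(...)" in the question is a CSS colour string handed through to JavaScript, not an R function call. The sketch below only illustrates the R side. The names semi_red, semi_green and faint_red are made up here, and whether rCharts/Highcharts renders 8-digit hex colours as intended is an assumption I have not verified.

```r
# alpha shares the 0..maxColorValue scale with red, green and blue
semi_red   <- rgb(223, 83, 83, alpha = 0.5 * 255, maxColorValue = 255)
semi_green <- rgb(50, 205, 50, alpha = 0.5 * 255, maxColorValue = 255)

# Same call shape as in the answer above: alpha = 0.5 out of 255
faint_red <- rgb(223, 83, 83, alpha = 0.5, maxColorValue = 255)

# Decode the hex strings again to inspect the alpha channel (0..255)
col2rgb(c(semi_red, semi_green, faint_red), alpha = TRUE)["alpha", ]
```

If the half-transparent look from the question is wanted, the first two values are the ones to pass as color.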
Related
I am trying to generate a heatmap as the following figure. I have already tried pheatmap and the code is as follows:
breaks_2 <- seq(min(0), max(2), by = 0.1)
pheatmap::pheatmap(
mat = data,
cluster_cols = F,
cluster_rows = F,
scale = "column",
border_color = "white",
color = inferno(20),
show_colnames = TRUE,
show_rownames = FALSE,
breaks = breaks_2
)
But this does not seem to work. As far as I understand, I am either defining the breaks incorrectly or need a package other than pheatmap. Any suggestion would be really helpful.
The color scale in pheatmap adjusts to the range of the input data. If you want anything above a certain value to be coloured daffodil, then simply send pheatmap a copy of your data with the highest values rounded to 2.
Suppose you have a data frame like this, with values anywhere between 0 and 3:
set.seed(1)
data <- as.data.frame(matrix(runif(64, 0, 3), nrow = 8))
names(data) <- LETTERS[1:8]
data
#> A B C D E F G H
#> 1 0.7965260 1.8873421 2.1528555 0.801662 1.4806239 2.46283888 2.1969412 0.9488151
#> 2 1.1163717 0.1853588 2.9757183 1.158342 0.5586528 1.94118058 2.0781947 1.5559028
#> 3 1.7185601 0.6179237 1.1401055 0.040171 2.4821200 2.34879829 1.4328589 1.9860152
#> 4 2.7246234 0.5296703 2.3323357 1.147164 2.0054002 1.65910893 2.5836284 1.2204906
#> 5 0.6050458 2.0610685 2.8041157 2.609073 2.3827196 1.58915874 1.3142913 2.7386278
#> 6 2.6951691 1.1523112 0.6364276 1.021047 0.3238309 2.36806870 0.7343918 0.8808101
#> 7 2.8340258 2.3095243 1.9550213 1.446240 2.1711328 0.06999361 0.2120371 1.3771972
#> 8 1.9823934 1.4930977 0.3766653 1.798697 1.2338233 1.43169020 0.2983985 0.9971840
Some of the values are greater than two. We want all of these to appear the same colour on our heatmap, so we create a copy of our data for plotting, and round down all of the values that were greater than 2 to be exactly 2:
data_2 <- data
data_2[] <- lapply(data_2, function(x) { x[x > 2] <- 2; x })
So now if we run pheatmap on data_2, we see that all the values that were greater than 2 in our original data frame are coloured daffodil.
library(viridis)
library(pheatmap)
breaks_2 <- seq(0, 2, by = 0.1)
pheatmap(
mat = data_2,
cluster_cols = F,
cluster_rows = F,
border_color = "white",
scale = 'none',
color = inferno(22),
show_colnames = TRUE,
show_rownames = FALSE,
legend_breaks = breaks_2
)
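The capping step used above can also be written with base R's pmin(), which takes the element-wise minimum of each column and the cap. This is only an alternative spelling of the same idea. data_pmin is a name introduced for this sketch, and the data is regenerated with the same seed so the block runs on its own.

```r
# Same sample data as above
set.seed(1)
data <- as.data.frame(matrix(runif(64, 0, 3), nrow = 8))
names(data) <- LETTERS[1:8]

# Capping as written in the answer
data_2 <- data
data_2[] <- lapply(data_2, function(x) { x[x > 2] <- 2; x })

# Equivalent capping with pmin(): every value above 2 becomes exactly 2
data_pmin <- data
data_pmin[] <- lapply(data_pmin, pmin, 2)

# Both approaches should give the same data frame
all.equal(data_2, data_pmin)
max(data_pmin)
```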
With the sample data and code below, I'm able to dynamically draw gt() tables for each element of a list of data frames, and I set a color for the error column:
df <- structure(list(id = c("M0000607", "M0000609", "M0000612"), `2021-08(actual)` = c(12.6,
19.2, 8.3), `2021-09(actual)` = c(10.3, 17.3, 6.4), `2021-10(actual)` = c(8.9,
15.7, 5.3), `2021-11(actual)` = c(7.3, 14.8, 3.1), `2021-12(actual)` = c(6.1,
14.2, 3.5), `2021-08(pred)` = c(11.65443222, 14.31674997, 7.084180415
), `2021-09(pred)` = c(12.29810914, 17.7143733, 6.057927385),
`2021-10(pred)` = c(9.619846116, 15.54553601, 6.525992602
), `2021-11(pred)` = c(8.352097939, 13.97318204, 3.164682627
), `2021-12(pred)` = c(6.113631596, 14.16243166, 3.288372517
), `2021-08(error)` = c(2.082307066, 1.146759554, 0.687406723
), `2021-09(error)` = c(1.631350383, 2.753457736, 2.952737781
), `2021-10(error)` = c(0.945567783, 4.883250027, 1.215819585
), `2021-11(error)` = c(1.998109138, 0.414373304, 0.342072615
), `2021-12(error)` = c(0.719846116, 0.154463985, 1.225992602
)), class = "data.frame", row.names = c(NA, -3L))
year_months <- c('2021-12', '2021-11', '2021-10')
curr <- lubridate::ym(year_months)
prev <- curr - months(2L)
dfs <- mapply(function(x, y) {
df[c(
"id",
format(seq.Date(y, x, by = "month"), "%Y-%m(actual)"),
format(x, "%Y-%m(pred)"),
format(x, "%Y-%m(error)")
)]
}, curr, prev, SIMPLIFY = FALSE)
plotGT <- function(data){
plot <- data %>%
gt() %>%
data_color(
columns = 6, # set color for error column
colors = scales::col_numeric(
palette =
c("blue", "green", "orange", "red"), # named with color 1
# c('#feb8cd', '#ffffff', '#69cfd5'), # named with color 2
domain = c(0, 10)
)
)
print(plot)
# gtsave(plot, file = file.path(glue("./plot_color1.png")))
}
mapply(plotGT, dfs)
Result for colors c("blue", "green", "orange", "red"):
Result for colors c('#feb8cd', '#ffffff', '#69cfd5'):
To go further, I would like to save the outputs conditionally: if I choose the first color palette, the plot should be named e.g. plot_color1.png, and for the second, plot_color2.png. I also want to run the whole code once and save both figures in one go.
So my question is: how could I modify the code above to achieve that? Thanks in advance for your help.
Maybe some code like gtsave(plot, file = file.path(glue("./plot_color{i}.png"))) based on if-else conditions, but I don't know how to do that exactly.
One option would be to make use of a named list of color palettes like so, which would also make it easier to switch between different palettes:
EDIT
I fixed a bug. I used a <- inside the pals list instead of =, which was the reason for the error you got.
To loop over the palettes I added pal_choice as an argument to your table function. That way we can loop over pals using e.g. lapply.
Additionally, as you are looping over multiple dfs, I added a name argument and added names to your list of data frames. As it was, the tables were all exported under the same filename, so you actually ended up with one file containing only the last table.
I also commented out the print for the reprex.
library(gt)
pal_choice <- "color2"
pals <- list(color1 = c("blue", "green", "orange", "red"),
color2 = c('#feb8cd', '#ffffff', '#69cfd5'))
plotGT <- function(data, name, pal_choice){
plot <- data %>%
gt() %>%
data_color(
columns = 6, # set color for error column
colors = scales::col_numeric(
palette = pals[[pal_choice]],
domain = c(0, 10)
)
)
#print(plot)
gtsave(plot, filename = glue::glue("./plot_{name}_{pal_choice}.png"))
}
names(dfs) <- letters[seq_along(dfs)]
lapply(names(pals), function(x) {
mapply(plotGT, dfs, names(dfs), MoreArgs = list(pal_choice = x))
})
#> [[1]]
#> a
#> "/private/var/folders/l7/hltv70p95zqfdq9z09h8m9bw0000gn/T/Rtmp9LLLHO/reprex-2b71746b0fd-petit-dore/plot_a_color1.png"
#> b
#> "/private/var/folders/l7/hltv70p95zqfdq9z09h8m9bw0000gn/T/Rtmp9LLLHO/reprex-2b71746b0fd-petit-dore/plot_b_color1.png"
#> c
#> "/private/var/folders/l7/hltv70p95zqfdq9z09h8m9bw0000gn/T/Rtmp9LLLHO/reprex-2b71746b0fd-petit-dore/plot_c_color1.png"
#>
#> [[2]]
#> a
#> "/private/var/folders/l7/hltv70p95zqfdq9z09h8m9bw0000gn/T/Rtmp9LLLHO/reprex-2b71746b0fd-petit-dore/plot_a_color2.png"
#> b
#> "/private/var/folders/l7/hltv70p95zqfdq9z09h8m9bw0000gn/T/Rtmp9LLLHO/reprex-2b71746b0fd-petit-dore/plot_b_color2.png"
#> c
#> "/private/var/folders/l7/hltv70p95zqfdq9z09h8m9bw0000gn/T/Rtmp9LLLHO/reprex-2b71746b0fd-petit-dore/plot_c_color2.png"
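The <- versus = bug mentioned in the EDIT is easy to reproduce in isolation. The sketch below uses two made-up lists, pals_bad and pals_good. It is only meant to show why looking a palette up by name fails in the first case, not to reconstruct the exact error from the original thread.

```r
# With `<-`, each argument is an assignment expression: the list elements
# stay unnamed, and color1/color2 are created as ordinary variables instead
pals_bad <- list(color1 <- c("blue", "green", "orange", "red"),
                 color2 <- c("#feb8cd", "#ffffff", "#69cfd5"))
names(pals_bad)                 # no names at all
is.null(pals_bad[["color2"]])   # lookup by name finds nothing

# With `=`, the arguments become named list elements
pals_good <- list(color1 = c("blue", "green", "orange", "red"),
                  color2 = c("#feb8cd", "#ffffff", "#69cfd5"))
names(pals_good)
pals_good[["color2"]]
```

Passing pals_bad[["color2"]], which is NULL, on as a palette is the kind of thing that would then fail further down.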
I'm trying to get familiar with plotly's functionality and syntax and have tried several of the scripts provided to compose and render plots of data. However, when generating the plotly output using RStudio I'm getting the following error: "Warning message:
Specifying width/height in layout() is now deprecated.
Please specify in ggplotly() or plot_ly()"
The output image appears jumbled and uninterpretable in the RStudio console and I've tried a few changes like setting the plotly object's width and height equal to null etc without luck.
Here is one of the sample scripts I've used when experiencing this issue:
library(plotly)
trace1 <- list(
x = c("Aug-12", "Sep-12", "Oct-12", "Nov-12", "Dec-12", "Jan-12", "Feb-13", "Mar-13", "Apr-13", "May-13", "Jun-13", "Jul-13"),
y = c(65, 77, 112, 279, 172, 133, 152, 106, 79, 225, 99, 150),
hoverinfo = "x+y+name",
line = list(
color = "#5BC075",
width = "3"
),
mode = "lines",
name = "Median deal size",
type = "scatter",
uid = "a8e83b",
xsrc = "jackluo:508:b357d2",
ysrc = "jackluo:508:d19900"
)
trace2 <- list(
x = c("Aug-12", "Sep-12", "Oct-12", "Nov-12", "Dec-12", "Jan-12", "Feb-13", "Mar-13", "Apr-13", "May-13", "Jun-13", "Jul-13"),
y = c(116, 125, 126, 125, 244, 136, 80, 82, 89, 82, 95, 107),
hoverinfo = "x+y+name",
line = list(
color = "#CC6E55",
width = "3"
),
mode = "lines",
name = "Number of deals",
type = "scatter",
uid = "2be33b",
xsrc = "jackluo:508:b357d2",
ysrc = "jackluo:508:5d533d"
)
data <- list(trace1, trace2)
layout <- list(
autosize = TRUE,
font = list(
family = "Overpass",
size = 12
),
height = 720,
legend = list(
x = 0,
y = -0.1,
bgcolor = "rgba(255, 255, 255, 0)",
orientation = "h"
),
margin = list(
r = 40,
t = 40,
b = 40,
l = 40,
pad = 2
),
title = "",
width = 1280,
xaxis = list(
autorange = TRUE,
nticks = 12,
range = c(0, 11),
rangemode = "tozero",
type = "category"
),
yaxis = list(
autorange = TRUE,
range = c(0, 293.6842105263158),
rangemode = "tozero",
type = "linear"
)
)
p <- plot_ly()
p <- add_trace(p, x=trace1$x, y=trace1$y, hoverinfo=trace1$hoverinfo, line=trace1$line, mode=trace1$mode, name=trace1$name, type=trace1$type, uid=trace1$uid, xsrc=trace1$xsrc, ysrc=trace1$ysrc)
p <- add_trace(p, x=trace2$x, y=trace2$y, hoverinfo=trace2$hoverinfo, line=trace2$line, mode=trace2$mode, name=trace2$name, type=trace2$type, uid=trace2$uid, xsrc=trace2$xsrc, ysrc=trace2$ysrc)
p <- layout(p, autosize=layout$autosize, font=layout$font, height=layout$height, legend=layout$legend, margin=layout$margin, title=layout$title, width=layout$width, xaxis=layout$xaxis, yaxis=layout$yaxis)
p$x$layout$width <- NULL
p$x$layout$height <- NULL
p$width <- NULL
p$height <- NULL
p
Any help resolving this issue so charts are correctly scaled and legible would be much appreciated!
As @NoahOlsen suggested, you need to format your x-axis values as a date.
trace1$x <- as.Date(paste0("01-", trace1$x), format = "%d-%b-%y")
trace2$x <- as.Date(paste0("01-", trace2$x), format = "%d-%b-%y")
Explanation
as.Date() tries to convert its input into a Date object. It works well with ISO date strings (e.g., 2019-04-21), but needs some help with trickier formats.
From ?strptime:
%d - Day of the month as decimal number (01–31).
%b - Abbreviated month name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)
%y - Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19. (This is the specifier used in the format string above. %Y, by contrast, is the year with century.)
Furthermore, we also need a specific day of the month. As it does not exist in your data, I added 01- via paste0() to every value of the date vector. Other values, such as 15-, would also have been a valid choice (depending on your data and what type of output you expect). This way, we can make the function recognize your date via format = "%d-%b-%y".
Check out ?as.Date and ?strptime for more information. Ping me if you require further guidance. Happy to help.
It looks like your X axis is a character rather than a date so the axis is sorted alphabetically rather than chronologically. I would try making the x values dates.
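A small sketch of the point made in both answers above: as plain character strings the labels sort alphabetically, while after conversion to Date they sort chronologically. The vector x_chr is a shortened, made-up version of the question's x values. Because %b depends on the locale's month abbreviations, the block switches LC_TIME to "C" so that English names are expected.

```r
# %b matches month abbreviations of the current locale; use English ones here
invisible(Sys.setlocale("LC_TIME", "C"))

x_chr <- c("Aug-12", "Sep-12", "Oct-12", "Nov-12", "Dec-12", "Jan-13")

# Character sort: alphabetical, so months end up out of order
sort(x_chr)

# Prefix a day of the month, then parse day / abbreviated month / 2-digit year
x_date <- as.Date(paste0("01-", x_chr), format = "%d-%b-%y")

# Date sort: chronological
sort(x_date)
```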
I really appreciate the 'plotly' R package. Currently I am running into an issue where I want to visualize a data frame as points and map the point size (and potentially the shape) to a column of the data frame.
The problem I run into with my own dataset is that the sizes are somehow "mixed up", in the sense that the bigger points don't correspond to the bigger values.
I haven't fully understood the options I have with plotly (sizeref and other marker options, the fundamental difference between mapping the dimension directly or in the marker arguments, etc.), so this is my best shot at a minimal example.
(The second plot is closer to what I currently do. If this one could be fixed, it would be preferable to me)
Your thoughts are greatly appreciated. :)
library(plotly)
set.seed(1)
df <- data.frame(x = 1:10,
y = rep(c("id1", "id2"), 5),
col = factor(sample(3, 10, replace = TRUE)))
df$size <- c(40, 40, 40, 30, 30, 30, 20, 20, 20, 10)
df
#> x y col size
#> 1 1 id1 1 40
#> 2 2 id2 2 40
#> 3 3 id1 2 40
#> 4 4 id2 3 30
#> 5 5 id1 1 30
#> 6 6 id2 3 30
#> 7 7 id1 3 20
#> 8 8 id2 2 20
#> 9 9 id1 2 20
#> 10 10 id2 1 10
# Mapping looks right, but the size may not be correct
plot_ly(df,
x = ~x,
y = ~y,
color = ~col,
size = ~size,
type = 'scatter',
mode = 'markers',
hoverinfo = "text",
text = ~paste('</br> x: ', x,
'</br> y: ', y,
'</br> col: ', col,
'</br> size: ', size)
# , marker = list(size = ~size)
)
# Size looks right, but mapping to points is wrong
plot_ly(df,
x = ~x,
y = ~y,
color = ~col,
# size = ~size,
type = 'scatter',
mode = 'markers',
hoverinfo = "text",
text = ~paste('</br> x: ', x,
'</br> y: ', y,
'</br> col: ', col,
'</br> size: ', size)
, marker = list(size = ~size)
)
devtools::session_info() # excerpt
#> plotly * 4.8.0
I'm trying to make a concentric pie chart. The inner pie represents three classes of subjects, and each class has to be split into 3 sub-classes (of course the slices for the sub-classes have to line up with the corresponding inner slice).
This is what I tried:
layout(matrix(c(1,1,1,1,2,1,1,1,1), nrow=3))
pie(x=c(14,22,15,3,15,33,0,6,45), labels="", col=c("#f21c39","#dba814","#7309de"))
pie(x=c(51,51,51), labels=c("O","VG","V"), col=c("#c64719","#0600f5","#089c1f"))
This worked, but the inner pie is too small. I tried to play with the radius option, but then the outer slices no longer line up with the inner ones. How can I adjust them?
Use par(new=TRUE) to overplot the pies rather than layout() in this case:
pie(x=c(14,22,15,3,15,33,0,6,45),labels="",
col=c("#f21c39","#dba814","#7309de"))
par(new=TRUE)
pie(x=c(51,51,51),labels=c("O","VG","V"),radius=.5,
col=c("#c64719","#0600f5","#089c1f"))
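The overplotted slices line up because each inner slice equals the sum of its three outer slices (51 each in this data). A hedged sketch of deriving the inner values from the outer ones, instead of typing them by hand, could look like this. The names outer_counts, classes, group and inner_counts are introduced for this illustration only.

```r
outer_counts <- c(14, 22, 15, 3, 15, 33, 0, 6, 45)
classes <- c("O", "VG", "V")

# Three consecutive outer slices belong to each class, in plotting order
group <- factor(rep(classes, each = 3), levels = classes)

# Inner slice = total of its outer slices, so the boundaries coincide
inner_counts <- as.vector(tapply(outer_counts, group, sum))
inner_counts

# Outer ring first, then the smaller inner pie drawn on top of it
pie(outer_counts, labels = "",
    col = c("#f21c39", "#dba814", "#7309de"))
par(new = TRUE)
pie(inner_counts, labels = classes, radius = 0.5,
    col = c("#c64719", "#0600f5", "#089c1f"))
```

This way the alignment still holds when the classes have different totals, as long as the outer values are listed class by class.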
Three years later: this can be achieved using the sunburstR package. http://timelyportfolio.github.io/sunburstR/example_baseball.html
Example:
library(dplyr)      # %>%, select(), group_by(), summarise(), n()
library(sunburstR)  # sunburst()
DF <- data.frame(LOGRECNO = c(60, 61, 62, 63, 64, 65),
STATE = c(1, 1, 1, 1, 1, 1),
COUNTY = c(1, 1, 1, 1, 1, 1),
TRACT = c(21100, 21100, 21100, 21100, 21100, 21100),
BLOCK = c(1053, 1054, 1055, 1056, 1057, 1058))
DF$BLOCKID <-
paste(DF$LOGRECNO, DF$STATE, DF$COUNTY,
DF$TRACT, DF$BLOCK, sep = "-")
DF %>%
select(BLOCKID) %>%
group_by(BLOCKID) %>%
summarise(Tots=n())->dftest
sunburst(dftest)
I'm sure you are able to adapt this to suit your needs!
You could also use the ggsunburst package:
# install ggsunburst
if (!require("ggplot2")) install.packages("ggplot2")
if (!require("rPython")) install.packages("rPython")
install.packages("http://genome.crg.es/~didac/ggsunburst/ggsunburst_0.0.9.tar.gz", repos=NULL, type="source")
library(ggsunburst)
df <- read.table(header=T, text = "
parent node size
O 1 14
O 2 22
O 3 15
V 1 3
V 2 15
V 3 33
VG 1 1
VG 2 6
VG 3 45")
write.table(df, file = 'df.txt', sep = ',', row.names = F)
sb <- sunburst_data('df.txt', type = "node_parent", sep = ",")
p <- sunburst(sb, node_labels = T, leaf_labels = F, rects.fill.aes = "name")
cols <- c("O" = "#c64719", "V" = "#0600f5", "VG" = "#089c1f", "1" = "#f21c39", "2" = "#dba814", "3" = "#7309de")
p + scale_fill_manual(values = cols)