What I would like to do is use color_tile on a column based on a calculation instead of basing it on its own continuous scale. I have code to format the color of the text in the following example:
library(formattable)
my.df <- data.frame(
Type = c("A", "B", "C", "D", "E"),
Count = c(5, 7, 3, 8, 4),
Expect = c(2, 6, 3, 15, 0)
)
formattable(my.df, list(
Type = formatter("span",
style = ~style(color = ifelse(Count/Expect - 1 > .2, "red",
ifelse(Count/Expect - 1 < -.2, "green", "black")))
)
))
Those are the colors I want, but instead of colored text I would like to have tiles. I haven't figured out how I can use color_tile that is based on a calculation or based on something other than its own column's values. Can I condition using tiles?
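One possible approach, given as a sketch rather than an official recipe: color_tile is itself a formatter("span", ...) that sets display, padding, border-radius and background-color, so the same style can be written by hand with the background computed from other columns. The helper name deviation_colour, its tol argument, and the "lightgray" neutral colour (swapped in for "black" so the text stays readable on the tile) are assumptions of mine, not part of formattable.

```r
library(formattable)

my.df <- data.frame(
  Type = c("A", "B", "C", "D", "E"),
  Count = c(5, 7, 3, 8, 4),
  Expect = c(2, 6, 3, 15, 0)
)

# Hypothetical helper: maps the relative deviation Count/Expect - 1 to a colour.
# Note that Expect = 0 gives Inf, which falls into the "red" branch.
deviation_colour <- function(count, expect, tol = 0.2) {
  deviation <- count / expect - 1
  ifelse(deviation > tol, "red",
         ifelse(deviation < -tol, "green", "lightgray"))
}

# Same CSS properties that color_tile sets, but with a computed background.
formattable(my.df, list(
  Type = formatter("span",
    style = ~ style(display = "block",
                    padding = "0 4px",
                    `border-radius` = "4px",
                    `background-color` = deviation_colour(Count, Expect)))
))
```

Because the style is a one-sided formula, Count and Expect are looked up in the data frame, so the tile on Type can depend on any column.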
Is there a distance function that can calculate both binary and numeric column distances at once?
tibble( Observation = c(1:6), V1 = c(3, 5, 4, 6, 9, 5),
V2 = c("a", "b", "a", "c", "b", "a"),
label = c("Red", "Red", "Blue", "Blue", "Red", "Blue")) %>%
select(2:4) %>%
fastDummies::dummy_cols() %>%
select(c(-V2, -label))
I typically use dist(df, method = 'binary'), but now I have a numeric column alongside the new dummy columns I created. The numeric column, V1, is as important as the dummy variables.
The distmix function from the kmed package handles this: you pass the column indices of the numeric, binary, and categorical variables in idnum, idbin, and idcat respectively. From ?distmix:
idnum: A vector of column index of the numerical variables.
idbin: A vector of column index of the binary variables.
idcat: A vector of column index of the categorical variables.
library(kmed)
distmix(df1, idnum = 1, idbin = 2:ncol(df1))
In the example data, the numeric column is the first column and all other columns are binary, so we specify 2:ncol(df1) as the index for idbin.
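To make the idea of a mixed numeric/binary distance concrete, here is a hand-rolled Gower-style distance in base R. It is only an illustration under my own assumptions (range-scaling the numeric column, treating each dummy as a 0/1 mismatch, equal weights) and is not claimed to reproduce what distmix returns. The function name gower_like and the hard-coded dummy columns are hypothetical.

```r
# Same values as the example data, with the dummy columns written out by hand.
df <- data.frame(
  V1         = c(3, 5, 4, 6, 9, 5),
  V2_a       = c(1, 0, 1, 0, 0, 1),
  V2_b       = c(0, 1, 0, 0, 1, 0),
  V2_c       = c(0, 0, 0, 1, 0, 0),
  label_Blue = c(0, 0, 1, 1, 0, 1),
  label_Red  = c(1, 1, 0, 0, 1, 0)
)

# Hypothetical helper: scale numeric columns to [0, 1], then average the
# per-column absolute differences (Manhattan distance divided by column count).
# Assumes every numeric column has a non-zero range.
gower_like <- function(x, num_cols) {
  x <- as.matrix(x)
  for (j in num_cols) {
    rng <- range(x[, j])
    x[, j] <- (x[, j] - rng[1]) / (rng[2] - rng[1])
  }
  dist(x, method = "manhattan") / ncol(x)
}

d <- gower_like(df, num_cols = 1)
d
```

Every column contributes a value between 0 and 1, so the numeric column carries the same weight as each dummy column.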
data
library(dplyr) # provides tibble(), select() and %>%
df1 <- tibble( Observation = c(1:6), V1 = c(3, 5, 4, 6, 9, 5),
V2 = c("a", "b", "a", "c", "b", "a"),
label = c("Red", "Red", "Blue", "Blue", "Red", "Blue")) %>%
select(2:4) %>%
fastDummies::dummy_cols() %>%
select(c(-V2, -label))
I have two data sets from which I would like to generate histograms showing how the data overlap by name (A, B, C). I have written a custom function so I can use ggplot with map2.
I would like the graphs to be titled according to the name of each data set, so "A", "B", "C". Does anyone know of a way to do this?
# load packages
library(ggplot2)
library(dplyr)
library(purrr)
## load and format data 1
df1_raw <- data.frame(name = c("A", "B", "C", "A", "C", "B"),
start = c(1, 3, 4, 5, 2, 1),
end = c(6, 5, 7, 8, 6, 7))
df1 <- split(x = df1_raw, f = df1_raw$name) # split data by name
df1 <- lapply(df1, function(x) Map(seq.int, x$start, x$end)) # generate sequence intervals
df1 <- map(df1, unlist) # unlist sequences
df1 <- lapply(df1, data.frame) # convert to df
## load and format data 2
df2_raw <- data.frame(name = c("C", "B", "C", "A", "A", "B"),
start = c(5, 4, 3, 4, 4, 5),
end = c(7, 8, 7, 6, 9, 6))
df2 <- split(x = df2_raw, f = df2_raw$name) # split data by name
df2 <- lapply(df2, function(x) Map(seq.int, x$start, x$end)) # generate sequence intervals
df2 <- map(df2, unlist) # unlist sequences
df2 <- lapply(df2, data.frame) # convert to df
## write custom ggplot function and generate graphs
gplot <- function(data1, data2) {
ggplot() +
geom_histogram(data = data1, aes(x = X..i..), binwidth = 1, color = "grey", fill = "grey") +
geom_histogram(data = data2, aes(x = X..i..), binwidth = 1, fill = "pink", alpha = 0.7) +
labs(
title = ls(data1))
}
hist <- map2(df1, df2, gplot)
I also tried the following in the title field in my function:
deparse(substitute(data1))
Another option, similar to what @GregorThomas mentioned in the comments: you could add a name variable to your data.frames and pull from that in your gplot() function. I've also shown how you might combine a few of your data manipulation steps:
# load packages
library(ggplot2)
library(dplyr)
library(purrr)
## load and format data 1
df1_raw <- data.frame(name = c("A", "B", "C", "A", "C", "B"),
start = c(1, 3, 4, 5, 2, 1),
end = c(6, 5, 7, 8, 6, 7))
df1 <- df1_raw %>%
split(.$name) %>% # split data by name
imap(function(x, x_name) {
data.frame(value = Map(seq.int, x$start, x$end) %>% unlist,
name = x_name)
})
## load and format data 2
df2_raw <- data.frame(name = c("C", "B", "C", "A", "A", "B"),
start = c(5, 4, 3, 4, 4, 5),
end = c(7, 8, 7, 6, 9, 6))
df2 <- df2_raw %>%
split(.$name) %>% # split data by name
imap(function(x, x_name) {
data.frame(value = Map(seq.int, x$start, x$end) %>% unlist,
name = x_name)
})
## change the title component of your previous function
gplot <- function(data1, data2) {
ggplot() +
geom_histogram(data = data1, aes(x = value), binwidth = 1, color = "grey", fill = "grey") +
geom_histogram(data = data2, aes(x = value), binwidth = 1, fill = "pink", alpha = 0.7) +
ggtitle(data1$name[1])
}
## plot it
map2(df1, df2, gplot)
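If you would rather not store the name inside each data frame, a variation is to pass the list names directly as the title with purrr::imap. This is a sketch under the assumption that both lists carry the same names; the function name gplot_titled is hypothetical, and the toy lists below are written out by hand to match the sequences the example data generates.

```r
library(ggplot2)
library(purrr)

# Stand-ins for the lists built above: one data frame per name,
# each with a single `value` column.
df1 <- list(A = data.frame(value = c(1:6, 5:8)),
            B = data.frame(value = c(3:5, 1:7)),
            C = data.frame(value = c(4:7, 2:6)))
df2 <- list(A = data.frame(value = c(4:6, 4:9)),
            B = data.frame(value = c(4:8, 5:6)),
            C = data.frame(value = c(5:7, 3:7)))

# Hypothetical variant of gplot() that takes the title as an argument.
gplot_titled <- function(data1, data2, title) {
  ggplot() +
    geom_histogram(data = data1, aes(x = value), binwidth = 1,
                   color = "grey", fill = "grey") +
    geom_histogram(data = data2, aes(x = value), binwidth = 1,
                   fill = "pink", alpha = 0.7) +
    ggtitle(title)
}

# imap() hands each element and its name to the function;
# df2 is matched by name rather than by position.
plots <- imap(df1, function(d1, nm) gplot_titled(d1, df2[[nm]], nm))
```

Matching df2 by name means the two lists do not have to be in the same order.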
I'd like to color the nodes of a graph based on an attribute in the original dataframe. But I think I haven't "carried through" that aesthetic variable to the graph.
Example here that works:
library(dplyr)
library(igraph)
library(ggraph)
data <-
tibble(
from = c("a", "a", "a", "b", "b", "c"),
to = c(1, 2, 3, 1, 4, 2),
type = c("X", "Y", "Y", "X", "Y", "X")
)
graph <-
graph_from_data_frame(data)
ggraph(graph,
layout = "fr") +
geom_node_point() +
geom_edge_link()
I'd like something like geom_node_point(aes(color = type)), but I haven't made type findable in the graph. How can I do that?
The issue here is that you added the type column as an edge-attribute whereas geom_node_point expects a vertex-attribute (see ?graph_from_data_frame: Additional columns are considered as edge attributes.).
Another issue is that type is not consistent for either node column (e.g. a is associated with type X and also Y, the same is true for node 2).
To address the first issue you could add additional vertex information to the vertices argument of the graph_from_data_frame function.
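A minimal sketch of that vertices approach, assuming you can supply one attribute value per node: the first column of the vertices data frame holds the vertex names and the remaining columns become vertex attributes. The attribute name node_type and its values "letter" and "number" are made up for illustration.

```r
library(igraph)

edges <- data.frame(
  from = c("a", "a", "a", "b", "b", "c"),
  to   = c("1", "2", "3", "1", "4", "2")
)

# One row per vertex; node_type is a hypothetical attribute.
vertices <- data.frame(
  name      = c("a", "b", "c", "1", "2", "3", "4"),
  node_type = c("letter", "letter", "letter",
                "number", "number", "number", "number")
)

graph <- graph_from_data_frame(edges, vertices = vertices)
V(graph)$node_type
```

With the attribute stored on the vertices, geom_node_point(aes(color = node_type)) can find it.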
The simplest solution to address both issues is to add the type attribute after creating the graph:
data <-
tibble(
from = c("a", "a", "a", "b", "b", "c"),
to = c(1, 2, 3, 1, 4, 2)
)
graph <- graph_from_data_frame(data)
V(graph)$type <- bipartite.mapping(graph)$type
The bipartite.mapping function assigns TRUE or FALSE to each vertex, so that all vertices of one kind (here the letters) share one value and all vertices of the other kind (the numbers) share the other.
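Putting it together with the plot from the question might look like the sketch below. It uses bipartite_mapping, the underscore spelling of the same function in newer igraph releases; the size = 4 setting is just a cosmetic choice of mine.

```r
library(dplyr)
library(igraph)
library(ggraph)

data <- tibble(
  from = c("a", "a", "a", "b", "b", "c"),
  to   = c(1, 2, 3, 1, 4, 2)
)

graph <- graph_from_data_frame(data)

# Store the two-colouring of the bipartite graph as a vertex attribute.
V(graph)$type <- bipartite_mapping(graph)$type

# Draw edges first so the coloured points sit on top of them.
p <- ggraph(graph, layout = "fr") +
  geom_edge_link() +
  geom_node_point(aes(color = type), size = 4)
p
```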
I would like to create a flowchart with the DiagrammeR package in R. The graphic should look as in the following example, but I would like to add some labels above the arrows.
Consider the following reproducible example in R:
library("DiagrammeR")
# Create a node data frame (ndf)
ndf <- create_node_df(n = 4,
shape = c("rectangle"))
# Create an edge data frame (edf)
edf <- create_edge_df(from = c(1, 2, 3, 3),
to = c(4, 3, 1, 4),
rel = c("a", "b", "c", "d"))
# Create a graph with the ndf and edf
graph <- create_graph(nodes_df = ndf,
edges_df = edf)
# Create a PDF file for the graph (`graph.pdf`)
graph %>%
export_graph(file_name = "graph.pdf",
title = "Simple Graph")
With this code, I can create the following graph:
Above the 4 arrows of the graph I would like to add the labels a, b, c and d. Unfortunately, I wasn't able to find anything about that in the documentation. It seems like I am doing something wrong with the rel argument of the function create_edge_df.
I think you can pass label = my_vector_of_labels to create_edge_df:
# Create an edge data frame (edf) using DiagrammeR v0.9.2
edf <- create_edge_df(from = c(1, 2, 3, 3),
to = c(4, 3, 1, 4),
rel = c("a", "b", "c", "d"),
label = c("a", "b", "c", "d"))
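For completeness, a sketch of the whole example with the label column in place. It assumes the same DiagrammeR functions used in the question and swaps the PDF export for render_graph so the result shows up in the viewer.

```r
library(DiagrammeR)

# Node data frame: four rectangular nodes.
ndf <- create_node_df(n = 4,
                      shape = "rectangle")

# Edge data frame: `label` holds the text drawn next to each arrow.
edf <- create_edge_df(from = c(1, 2, 3, 3),
                      to = c(4, 3, 1, 4),
                      rel = c("a", "b", "c", "d"),
                      label = c("a", "b", "c", "d"))

graph <- create_graph(nodes_df = ndf,
                      edges_df = edf)

# Inspect the stored edge labels, then draw the graph.
get_edge_df(graph)$label
render_graph(graph)
```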
With the library highcharter in R, I would like to reuse the default legend of the heatmap (a gradient color with some value) but with a bar plot.
How is this possible?
Many thanks
Regards
Sam
Yes, here it is. The first figure is what I want and the second one is what I have done. The two axes are inverted, but that is not important. Now, I would like to draw the two legends:
one with a color gradient that corresponds to the colors of the circles (slot 'Color' in my example dataframe)
one with the different sizes of the circles (values are in slot 'Count' of the example dataframe)
What I would like to draw
What I have done
The dataframe I used for the example is the following:
SO <- data.frame(Description=c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"),
Ratio = c(0.22, 0.14, 0.14, 0.14, 0.14, 0.10, 0.10, 0.08, 0.06, 0.06, 0.06),
Color = c(31, 20, 20, 13, 13, 12, 7, 13, 7, 6, 5),
Count = c(11, 7, 7, 7, 7, 5, 5, 4, 3, 3, 3))
The code I wrote for that second plot is the following:
nRes <- nrow(SO)
SO <- SO[order(SO$Ratio, decreasing=TRUE),]
SO <- SO[seq(1:nRes),]
colfunc <- colorRampPalette(c("red","royalblue"))
nbColors <- 5
pal <- colfunc(nbColors)
t <- SO$Color
d <- (max(t) - min(t))/nbColors
base <- seq(from=min(t), to=max(t), by = d)
tmpList <- lapply(t, function(x){
if (x == min(t)){ ind <- 1}
else {ind <- which(x > base)[length(which(x > base))]}
})
myColorsIndex <- unlist(tmpList)
df <- data.frame(x=c(0:(nRes-1)),
y=SO$Ratio,
z=SO$Count,
color=pal[myColorsIndex],
colorSegment=pal[myColorsIndex],
Color = SO$Color,
name = SO$Description)
highchart() %>%
hc_chart(type = "bubble") %>%
hc_add_series(df) %>%
hc_legend(enabled = FALSE) %>%
hc_xAxis(type = "category", categories = df$name) %>%
hc_yAxis(title = list(text = "Ratio"))
Thanks for any help
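As a side note on the code above, the manual loop that assigns each Color value to a palette bin could be sketched with base R's cut(). This does not solve the legend question; it is only a shorter way to compute myColorsIndex, assuming equal-width bins between the minimum and maximum. The variable names col_values and breaks are my own.

```r
# Color values from the example dataframe.
col_values <- c(31, 20, 20, 13, 13, 12, 7, 13, 7, 6, 5)

nbColors <- 5
pal <- colorRampPalette(c("red", "royalblue"))(nbColors)

# nbColors equal-width bins; length.out keeps both endpoints exact.
breaks <- seq(min(col_values), max(col_values), length.out = nbColors + 1)

# include.lowest = TRUE puts the minimum in bin 1, like the x == min(t) branch.
myColorsIndex <- cut(col_values, breaks = breaks,
                     include.lowest = TRUE, labels = FALSE)

data.frame(Color = col_values, bin = myColorsIndex, hex = pal[myColorsIndex])
```

Using a separate name such as col_values also avoids masking base R's t() function.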