DiagrammeR: How to add labels beside arrows? - r

I would like to create a flowchart with the DiagrammeR package in R. The graphic should look as in the following example, but I would like to add some labels above the arrows.
Consider the following reproducible example in R:
library("DiagrammeR")
# Create a node data frame (ndf)
ndf <- create_node_df(n = 4,
                      shape = c("rectangle"))
# Create an edge data frame (edf)
edf <- create_edge_df(from = c(1, 2, 3, 3),
                      to = c(4, 3, 1, 4),
                      rel = c("a", "b", "c", "d"))
# Create a graph with the ndf and edf
graph <- create_graph(nodes_df = ndf,
                      edges_df = edf)
# Create a PDF file for the graph (`graph.pdf`)
graph %>%
  export_graph(file_name = "graph.pdf",
               title = "Simple Graph")
With this code, I can create the following graph:
Above the 4 arrows of the graph I would like to add the labels a, b, c and d. Unfortunately, I wasn't able to find anything about that in the documentation. It seems like I am doing something wrong with the rel argument of the function create_edge_df.

I think you need to add label = my_vector_of_labels to the call to create_edge_df:
# Create an edge data frame (edf) using DiagrammeR v 0.9.2
edf <- create_edge_df(from = c(1, 2, 3, 3),
                      to = c(4, 3, 1, 4),
                      rel = c("a", "b", "c", "d"),
                      label = c("a", "b", "c", "d"))
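Putting the answer together with the question's code, a minimal end-to-end sketch might look like the following. It assumes a DiagrammeR version in which create_edge_df() accepts a label column as an edge attribute (the answer mentions 0.9.2). The names edge_labels and widget are illustrative, and get_edge_df() is used only to confirm that the labels were stored.

```r
library(DiagrammeR)

# Node data frame: four rectangular nodes
ndf <- create_node_df(n = 4, shape = "rectangle")

# Edge data frame: `label` holds the text drawn next to each arrow,
# while `rel` only stores the relationship type and is not displayed
edf <- create_edge_df(
  from  = c(1, 2, 3, 3),
  to    = c(4, 3, 1, 4),
  rel   = c("a", "b", "c", "d"),
  label = c("a", "b", "c", "d")
)

graph <- create_graph(nodes_df = ndf, edges_df = edf)

# Confirm that the labels are stored as an edge attribute
edge_labels <- get_edge_df(graph)$label

# Build the widget; print `widget` (or call export_graph() as in the
# question) to see the labelled arrows
widget <- render_graph(graph)
```

Printing widget in an interactive session should show the same four nodes as before, now with a, b, c and d next to the arrows.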

Related

For loop over the same variable in multiple datasets

I have multiple datasets and would like to create a contingency table for the same variable in each of them. I am attempting to write a for loop over these datasets, but am having difficulty accessing the necessary variable. Here's a fake set-up to illustrate my issue:
data1 <- data.frame(name = c("A", "B", "C"),
                    value1 = c(1, 2, 2),
                    value2 = c(1, 3, 7))
data2 <- data.frame(name = c("D", "E", "F"),
                    value1 = c(3, 4, 3),
                    value2 = c(8, 2, 1))
datasets <- c("data1", "data2")
If I manually execute table(data1$value1) then I receive a result. However, if I try something like the following:
for (i in seq_along(datasets)) {
  variable <- datasets[[i]]$value1
  table(variable)
}
then R throws an error message "Error: $ operator is invalid for atomic vectors." Given this, what is the best way to achieve my initial aim?
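The error arises because datasets holds the names of the data frames as character strings, so datasets[[i]] is the string "data1" rather than the data frame itself. One possible sketch, assuming the data frames can be collected in a named list (the names dataset_list and tables are illustrative):

```r
data1 <- data.frame(name = c("A", "B", "C"),
                    value1 = c(1, 2, 2),
                    value2 = c(1, 3, 7))
data2 <- data.frame(name = c("D", "E", "F"),
                    value1 = c(3, 4, 3),
                    value2 = c(8, 2, 1))

# Store the data frames themselves, not their names
dataset_list <- list(data1 = data1, data2 = data2)

# One contingency table of value1 per dataset
tables <- lapply(dataset_list, function(d) table(d$value1))

# Inside a for loop, results are not auto-printed, so print them explicitly
for (nm in names(tables)) {
  cat(nm, "\n")
  print(tables[[nm]])
}
```

If only the names are available, mget(datasets) would build the same named list from the original character vector.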

How can I reclassify a categorical raster in terra using category names

I'd like to be able to merge two categories in a categorical raster. The only solution I've figured out so far uses the level index number, not the name of the category. How could I do this using the name of the category?
library(terra)
m <- matrix(rep(c("a", "b", "c"), each = 3), nrow = 3, ncol = 3)
x <- rast(m)
x[x$lyr.1 == "c"]
m2 <- matrix(c("a", "a", "b", "b", "c", "b"), nrow = 3, ncol = 2, byrow = TRUE)
test <- classify(x, m2)     # doesn't work with category names
test <- subst(x, "c", "b")  # doesn't work with category names
test <- subst(x, 2, 1)      # works with category index
Example data
library(terra)
m <- matrix(rep(c("a", "b", "c"), each = 3), nrow = 3, ncol = 3)
x <- rast(m)
m2 <- matrix(c("a", "a", "b", "b", "c", "b"), nrow = 3, ncol = 2, byrow = TRUE)
With the current version of terra you can do either
test1 <- subst(x, "c", "b")
or
test2 <- subst(x, 2, 1, raw=TRUE)
Alternatively, the category table can be edited directly, for example with forcats::fct_collapse():
library(terra)
library(tidyverse)
m <- matrix(rep(c("a", "b", "c"), each = 3), nrow = 3, ncol = 3)
x <- rast(m)
plot(x)
reclassified <- cats(x)[[1]] %>%
  mutate(label_reclass = forcats::fct_collapse(cats(x)[[1]]$label, c = "b"))
x <- categories(x, layer = 1, value = reclassified, active = 2)
plot(x)
levels(x)
[[1]]
  value label_reclass
1     1             a
2     2             c
3     3             c

Plotting Number of Times Value Appears in Two Dataframes in R

I have two sets of data. Each contains a column for the name of the molecule and a column for the number of times that molecule appears in the sample. I want to create a scatterplot with the number of times a molecule appears in dataset #1 on the x-axis and the number of times it appears in dataset #2 on the y-axis. If a molecule is in one dataset and not the other, it appears 0 times.
Example:
dat1 <- data.frame(
  name = c("A", "B", "D", "E"),
  count = c(10, 1, 30, 10)
)
dat2 <- data.frame(
  name = c("A", "B", "C", "F"),
  count = c(1, 3, 50, 40)
)
Point #1 would be (10,1) corresponding to A, Point #2 would be (1,3), Point #3 would be (0,50) and so on. I don't want to label my points since my datasets contain tens of thousands of molecules.
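Try joining the data frames:
library(dplyr)
library(ggplot2)
# Keep every molecule from both data sets; counts missing on one side become 0
full_join(dat1, dat2, by = "name") %>%
  mutate_all(function(xx) ifelse(is.na(xx), 0, xx)) %>%
  ggplot(aes(count.x, count.y)) +
  geom_point()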
which produces
You would need a full_join():
library(dplyr)
library(ggplot2)
#Data
dat1 <- data.frame(
name = c("A", "B", "D", "E"),
count = c(10, 1, 30, 10)
)
dat2 <- data.frame(
name = c("A", "B", "C", "F"),
count = c(1, 3, 50, 40)
)
#Code
dat1 %>%
  full_join(dat2 %>% rename(count2 = count)) %>%
  replace(is.na(.), 0) %>%
  ggplot(aes(x = count, y = count2)) +
  geom_point() +
  geom_text(aes(label = name), vjust = -0.5)
Output:
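To check the plotted coordinates without drawing anything, the joined table can be inspected directly. The following is a minimal sketch in base R, using merge() with all = TRUE as a stand-in for full_join(); the names joined, count1 and count2 are illustrative and not part of the answers above.

```r
dat1 <- data.frame(
  name = c("A", "B", "D", "E"),
  count = c(10, 1, 30, 10)
)
dat2 <- data.frame(
  name = c("A", "B", "C", "F"),
  count = c(1, 3, 50, 40)
)

# Full outer join on name; the suffixes give columns count1 and count2
joined <- merge(dat1, dat2, by = "name", all = TRUE, suffixes = c("1", "2"))

# A molecule missing from one data set appears 0 times there
joined[is.na(joined)] <- 0

print(joined)
```

Each row of joined is one point of the scatterplot, for example (10, 1) for A and (0, 50) for C. With tens of thousands of molecules, dropping the geom_text() layer keeps the plot unlabelled, as the question requests.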

color nodes in igraph ggraph r

I'd like to color the nodes of a graph based on an attribute in the original dataframe. But I think I haven't "carried through" that aesthetic variable to the graph.
Example here that works:
library(dplyr)
library(igraph)
library(ggraph)
data <-
  tibble(
    from = c("a", "a", "a", "b", "b", "c"),
    to = c(1, 2, 3, 1, 4, 2),
    type = c("X", "Y", "Y", "X", "Y", "X")
  )
graph <-
  graph_from_data_frame(data)
ggraph(graph,
       layout = "fr") +
  geom_node_point() +
  geom_edge_link()
I'd like something like geom_node_point(aes(color = type)), but it seems I haven't made type available as a node attribute in the graph. How can I do that?
The issue here is that you added the type column as an edge attribute, whereas geom_node_point expects a vertex attribute (see ?graph_from_data_frame: "Additional columns are considered as edge attributes.").
Another issue is that type is not consistent for either node column (e.g. a is associated with type X and also Y, the same is true for node 2).
To address the first issue you could add additional vertex information to the vertices argument of the graph_from_data_frame function.
The simplest solution to address both issues is to add the type attribute after creating the graph:
data <-
  tibble(
    from = c("a", "a", "a", "b", "b", "c"),
    to = c(1, 2, 3, 1, 4, 2)
  )
graph <- graph_from_data_frame(data)
V(graph)$type <- bipartite.mapping(graph)$type
The bipartite.mapping function (bipartite_mapping in recent igraph versions) assigns TRUE to the vertices on one side of the bipartite graph and FALSE to those on the other, so every vertex gets a single, consistent type.
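With type stored as a vertex attribute, it can be mapped to the node color. The sketch below assumes igraph and ggraph are installed, uses a plain data.frame instead of a tibble to avoid the dplyr dependency, and calls bipartite_mapping(), the underscore spelling of the same function. The object name p and the point size are illustrative choices.

```r
library(igraph)
library(ggraph)

data <- data.frame(
  from = c("a", "a", "a", "b", "b", "c"),
  to   = c(1, 2, 3, 1, 4, 2)
)
graph <- graph_from_data_frame(data)

# Vertex attribute: one logical value per side of the bipartite graph
V(graph)$type <- bipartite_mapping(graph)$type

# Map the vertex attribute to the node color
p <- ggraph(graph, layout = "fr") +
  geom_edge_link() +
  geom_node_point(aes(color = type), size = 4)
```

Printing p draws the graph with the letter nodes in one color and the number nodes in another.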

Tile Formats Using Formattable in R

What I would like to do is use color_tile on a column based on a calculation instead of basing it on its own continuous scale. I have code to format the color of the text in the following example:
library(formattable)
my.df <- data.frame(
  Type = c("A", "B", "C", "D", "E"),
  Count = c(5, 7, 3, 8, 4),
  Expect = c(2, 6, 3, 15, 0)
)
formattable(my.df, list(
  Type = formatter("span",
    style = ~style(color = ifelse(Count/Expect - 1 > .2, "red",
                           ifelse(Count/Expect - 1 < -.2, "green", "black")))
  )
))
Those are the colors I want, but instead of colored text I would like to have tiles. I haven't figured out how I can use color_tile that is based on a calculation or based on something other than its own column's values. Can I condition using tiles?
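One approach that may work is to build the tile by hand: use a block-level span with a background color, which is roughly the styling a tile needs, and compute that color from other columns. This is a sketch rather than a built-in formattable feature. ratio_colour is a hypothetical helper, the padding and border-radius values are illustrative, and "lightgray" replaces "black" for the neutral case so that the text stays readable (an assumption, not something the question asks for).

```r
library(formattable)

my.df <- data.frame(
  Type   = c("A", "B", "C", "D", "E"),
  Count  = c(5, 7, 3, 8, 4),
  Expect = c(2, 6, 3, 15, 0)
)

# Hypothetical helper: map the Count/Expect ratio to a tile color
ratio_colour <- function(count, expect) {
  ratio <- count / expect - 1
  ifelse(ratio > 0.2, "red",
         ifelse(ratio < -0.2, "green", "lightgray"))
}

# Style the Type column as a tile whose background depends on other columns
tile_table <- formattable(my.df, list(
  Type = formatter(
    "span",
    style = ~ style(
      display = "block",
      padding = "0 4px",
      `border-radius` = "4px",
      `background-color` = ratio_colour(Count, Expect)
    )
  )
))
```

Printing tile_table renders the Type column as tiles. Because the style formula is evaluated against the whole data frame, any column can drive the color.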
