I'm trying to plot a metaMDS ordination with the species labels (spp) taken from the column names of the community data.
dat<-data.frame(
"site" = c("a", "a", "a",'a', "b", "b", "b", 'b', "c", "c", "c",'c'),
"sample"=c(rep(1,4),rep(2,4),rep(3,4)),
"sp1"=c(1,3,3,2,4,2,1,5,3,6,1,5),
"sp2"=c(1,3,3,2,4,2,1,5,3,6,1,5),
"sp3"=c(1,3,3,2,4,2,1,5,3,6,1,5),
"sp4"=c(2,3,4,1,5,3,1,5,5,8,9,1),
"sp5"=c(3,4,3,1,6,7,5,8,3,1,3,2)
)
I tried
```
dat1<-dat[,3:7]
dmds<-metaMDS(dat1, distance = "bray", autotransform = FALSE)
mds1 <- dmds$points[,1]
mds2 <- dmds$points[,2]
plt<-cbind(dat,mds1,mds2)
spp<-names(dat1)
ggplot(plt,aes(mds1,mds2, shape=site,color=site))+
geom_point()+
geom_text(aes(label=spp), size=3)
```
However, I get an error:
Error: Aesthetics must be either length 1 or the same as the data (12): label
This happens because the number of rows of dat (12) and the length of spp<-names(dat1) (5) are different.
Is there any way to work around this error?
I would like to have something like this, but with the correct number of spp labels.
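One possible way around the length mismatch, sketched here on the assumption that the labels should sit at the species scores that metaMDS computes, is to give geom_text its own data frame with one row per species. The names site_scores and spp_scores are introduced only for this illustration, and set.seed and trace = FALSE are additions to make the run repeatable and quiet.

```r
library(vegan)
library(ggplot2)

# Same data as in the question
dat <- data.frame(
  site = rep(c("a", "b", "c"), each = 4),
  sample = rep(1:3, each = 4),
  sp1 = c(1, 3, 3, 2, 4, 2, 1, 5, 3, 6, 1, 5),
  sp2 = c(1, 3, 3, 2, 4, 2, 1, 5, 3, 6, 1, 5),
  sp3 = c(1, 3, 3, 2, 4, 2, 1, 5, 3, 6, 1, 5),
  sp4 = c(2, 3, 4, 1, 5, 3, 1, 5, 5, 8, 9, 1),
  sp5 = c(3, 4, 3, 1, 6, 7, 5, 8, 3, 1, 3, 2)
)
dat1 <- dat[, 3:7]

set.seed(1)  # NMDS uses random starts
dmds <- metaMDS(dat1, distance = "bray", autotransform = FALSE, trace = FALSE)

# One row per sample (12 rows) and one row per species (5 rows)
site_scores <- data.frame(site = dat$site, scores(dmds, display = "sites"))
spp_scores <- data.frame(spp = colnames(dat1), scores(dmds, display = "species"))

# Points come from the site scores; labels come from the species scores
p <- ggplot(site_scores, aes(NMDS1, NMDS2)) +
  geom_point(aes(shape = site, colour = site)) +
  geom_text(data = spp_scores, aes(label = spp), size = 3)
p
```

Because shape and colour are mapped only inside geom_point, the species data frame does not need a site column. Note that sp1, sp2 and sp3 are identical in the example data, so their labels will overlap.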
As a further potential direction: I think you mean that you'll have a separate set of species for each site, rather than the same five species (sp1:sp5) labelled in each site? Here's a worked example of what that might look like, using three data frames so that each site can have its own columns representing a different set of species:
library(tidyverse)
library(vegan)
dat_a<-data.frame(
"site" = "a",
sample = c(1:4),
"sp1_a"=c(1,3,3,2),
"sp2_a"=c(1,3,3,2),
"sp3_a"=c(1,3,3,2),
"sp4_a"=c(2,3,4,1),
"sp5_a"=c(3,4,3,1)
)
dat_b<-data.frame(
"site" = "b",
sample = c(1:4),
"sp1_b"=c(4,2,1,5),
"sp2_b"=c(4,2,1,5),
"sp3_b"=c(4,2,1,5),
"sp4_b"=c(5,3,1,5),
"sp5_b"=c(6,7,5,8)
)
dat_c<-data.frame(
"site" = "c",
sample = c(1:4),
"sp1_c"=c(3,6,1,5),
"sp2_c"=c(3,6,1,5),
"sp3_c"=c(3,6,1,5),
"sp4_c"=c(5,8,9,1),
"sp5_c"=c(3,1,3,2)
)
# Use purrr::map to run metaMDS on the species columns of each data frame
results <-
  map(list(dat_a, dat_b, dat_c), function(x)
    metaMDS(x[, 3:7], distance = "bray", autotransform = FALSE))
results %>%
map2(c("a", "b", "c"), ~ tibble(
site = .y,
species = rownames(.x$species),
x = .x$species[,1],
y = .x$species[,2]
)) %>%
bind_rows() %>%
ggplot(aes(x, y, label = species, col = site)) +
geom_text()
Created on 2021-05-08 by the reprex package (v2.0.0)
I would like to create a flowchart with the DiagrammeR package in R. The graphic should look as in the following example, but I would like to add some labels above the arrows.
Consider the following reproducible example in R:
library("DiagrammeR")
# Create a node data frame (ndf)
ndf <- create_node_df(n = 4,
shape = c("rectangle"))
# Create an edge data frame (edf)
edf <- create_edge_df(from = c(1, 2, 3, 3),
to = c(4, 3, 1, 4),
rel = c("a", "b", "c", "d"))
# Create a graph with the ndf and edf
graph <- create_graph(nodes_df = ndf,
edges_df = edf)
# Create a PDF file for the graph (`graph.pdf`)
graph %>%
export_graph(file_name = "graph.pdf",
title = "Simple Graph")
With this code, I can create the following graph:
Above the four arrows of the graph I would like to add the labels a, b, c and d. Unfortunately, I wasn't able to find anything about that in the documentation. It seems like I am doing something wrong with the rel argument of the function create_edge_df.
I think you need to add label = my_vector_of_labels to the call to create_edge_df:
# Create an edge data frame (edf) using DiagrammeR v0.9.2
edf <- create_edge_df(from = c(1, 2, 3, 3),
to = c(4, 3, 1, 4),
rel = c("a", "b", "c", "d"),
label = c("a", "b", "c", "d"))
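Putting that together with the node data frame from the question gives a complete sketch. It only rearranges the calls already shown above, plus get_edge_df to inspect what is stored in the graph and render_graph to display it instead of exporting a PDF.

```r
library(DiagrammeR)

# Node data frame as in the question
ndf <- create_node_df(n = 4, shape = "rectangle")

# Edge data frame with a label for each edge
edf <- create_edge_df(from = c(1, 2, 3, 3),
                      to = c(4, 3, 1, 4),
                      rel = c("a", "b", "c", "d"),
                      label = c("a", "b", "c", "d"))

graph <- create_graph(nodes_df = ndf, edges_df = edf)

# The label column should now be part of the edge data frame
get_edge_df(graph)

# Display the graph; the labels are drawn next to the arrows
render_graph(graph)
```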
What I would like to do is use color_tile on a column based on a calculation instead of basing it on its own continuous scale. I have code to format the color of the text in the following example:
library(formattable)
my.df <- data.frame(
Type = c("A", "B", "C", "D", "E"),
Count = c(5, 7, 3, 8, 4),
Expect = c(2, 6, 3, 15, 0)
)
formattable(my.df, list(
Type = formatter("span",
style = ~style(color = ifelse(Count/Expect - 1 > .2, "red",
ifelse(Count/Expect - 1 < -.2, "green", "black")))
)
))
Those are the colors I want, but instead of colored text I would like to have tiles. I haven't figured out how I can use color_tile that is based on a calculation or based on something other than its own column's values. Can I condition using tiles?
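One possible approach, sketched here rather than taken from an answer in the thread, is to skip color_tile and build the tile with formatter directly, setting background-color from the calculation. The helper tile_colour is a hypothetical name, and "transparent" replaces "black" for the neutral case on the assumption that a black tile would hide the text.

```r
library(formattable)

my.df <- data.frame(
  Type = c("A", "B", "C", "D", "E"),
  Count = c(5, 7, 3, 8, 4),
  Expect = c(2, 6, 3, 15, 0)
)

# Hypothetical helper: pick a tile colour from the relative deviation
tile_colour <- function(count, expect, tol = 0.2) {
  ratio <- count / expect - 1
  ifelse(ratio > tol, "red",
         ifelse(ratio < -tol, "green", "transparent"))
}

# display = "block" makes the span fill the cell, which gives the tile look
tile_formatter <- formatter(
  "span",
  style = ~ style(display = "block",
                  padding = "0 4px",
                  `border-radius` = "4px",
                  `background-color` = tile_colour(Count, Expect))
)

formattable(my.df, list(Type = tile_formatter))
```

Since the style formula is evaluated against the whole data frame, the same formatter can be attached to any column, not only Type.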
I'm working with public transit data in the GTFS standard and have been building edge lists of origin stop to target stop in a sequence across an entire route. I've put some sample R code below to show a sample of the data and graph.
library(igraph)
# edgelist with two nodes with outdegree > 1.
edgelist <- data.frame(source = c("Z","A", "B", "C", "D", "E", "F", "F", "A"),
target = c("A","B", "C", "D", "E", "F", "G", "H", "I"),
edge_sequence = c(0,1, 2, 3, 4, 5, 6, NA , NA),
source_node_out_degree = c(1,1, 1, 1, 1, 1, 2, 2, 2),
group = factor(c(1,1,1,1,1,1,1,2,2)))
# i would like to remove edges within my sequence that have an outdegree of
# one and merge the original source with the
plot(graph.data.frame(edgelist), edge.arrow.size = 0.3)
Below is the edgelist I would like to generate. In this example I've collapsed the path from A to F into a single edge A->F because (a) it lies along the sequence and (b) only nodes with an out-degree of one lie between A and F.
# the expected edgelist after simplifying the network. Connecting nodes that
# have outdegree > 1 on the sequence of edges.
new_expected_edgelist <- data.frame(source = c("Z","A", "F", "F", "A"),
target = c("A","F", "G", "H", "I"))
# edges with outdegree == 1 have been reduced.
plot(graph.data.frame(new_expected_edgelist), edge.arrow.size = 0.3)
This would allow me to simplify my network so that I visualize only the edges that are shared between multiple public transit routes. Some routes extend for many stops without connecting to any other route, which makes visualizing the complexity of the network more difficult.
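You can use the contract.vertices function:
g <- graph.data.frame(edgelist)
# The mapping gives a new vertex id for each of the 10 vertices, which
# graph.data.frame orders as Z, A, B, C, D, E, F, G, H, I.
# Here B, C, D, E and F are all merged into vertex 3.
h <- contract.vertices(g, c(1, 2, 3, 3, 3, 3, 3, 8, 9, 10))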
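As an alternative that reproduces the expected edgelist from the question exactly, the sketch below works on the edge list in base R. It assumes that a stop can be dropped when it has exactly one incoming and one outgoing edge; collapse_chains is a hypothetical helper name, not an igraph function.

```r
edgelist <- data.frame(
  source = c("Z", "A", "B", "C", "D", "E", "F", "F", "A"),
  target = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: repeatedly remove pass-through nodes
# (in-degree 1 and out-degree 1) and reconnect their neighbours
collapse_chains <- function(edges) {
  edges <- data.frame(source = as.character(edges$source),
                      target = as.character(edges$target),
                      stringsAsFactors = FALSE)
  repeat {
    out_deg <- table(edges$source)
    in_deg <- table(edges$target)
    pass <- intersect(names(out_deg)[as.vector(out_deg) == 1],
                      names(in_deg)[as.vector(in_deg) == 1])
    # A node whose only edge is a self-loop cannot be collapsed further
    pass <- setdiff(pass, edges$source[edges$source == edges$target])
    if (length(pass) == 0) break
    v <- pass[1]
    i_in <- which(edges$target == v)   # the single edge arriving at v
    i_out <- which(edges$source == v)  # the single edge leaving v
    edges$target[i_in] <- edges$target[i_out]
    edges <- edges[-i_out, , drop = FALSE]
  }
  rownames(edges) <- NULL
  edges
}

simplified <- collapse_chains(edgelist)
simplified
```

The result can be passed to graph.data.frame and plotted in the same way as new_expected_edgelist. Z is kept because it has no incoming edge, so it does not count as a pass-through node under this rule.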
I implemented the FR test here and now I would like to test it by means of visualizing the resulting minimum spanning trees in R. The vertices and edges should be plotted in a coordinate system.
Moreover I want to set the color for every dot (depending on to which sample it belongs) and express a possible third dimension through the size of the dots.
This is what I have got so far:
library(ggplot2)
nodes <- data.frame(cbind(c("A", "A", "A", "B", "B", "B"), c(1,2,3,8,2,1), c(6,3,1,4,5,6)))
edges <- data.frame(cbind(c("A", "A", "A"), c("A", "B", "B"), c(1,3,2), c(6,1,5), c(2,8,1), c(3,4,6)))
p <- ggplot() +
geom_point(nodes, aes(x=nodes[,2], y=nodes[,3])) +
geom_line(edges)
p
I also think igraph would be best here...
nodes <- data.frame(a=c("A", "A", "A", "B", "B", "B"), b=c(1,2,3,8,2,1),
d=c(6,3,1,4,5,6))
# cbind() coerced your node coordinates to character, so I have removed it here
edges <- data.frame(a=c("A", "A", "A"), b=c("A", "B", "B"), d=c(1,3,2),
e=c(6,1,5), f=c(2,8,1), g=c(3,4,6))
Here is an example using your data as above, with the vertex colours stored in colouring and the layout coordinates stored in coords:
library(igraph)
from <- c(rep(edges[,3],3),rep(edges[,4],2),edges[,5])
to <- c(edges[,4],edges[,5],edges[,6],edges[,5],edges[,6],edges[,6])
myedges <- data.frame(from,to)
actors <- data.frame(acts=c(1,2,3,4,5,6,8))
colouring <- sample(colours(), 7)
sizes <- sample(15,7)
coords<-cbind(x=runif(7,0,1),y=runif(7,0,1))
myg <- graph.data.frame(myedges, vertices=actors, directed=FALSE)
V(myg)$colouring <- colouring
V(myg)$sizes <- sizes
plot(myg,vertex.color=V(myg)$colouring,vertex.size=V(myg)$sizes,
layout=coords,edge.color="#55555533")
For plotting a spanning tree there are also many options, e.g.
d <- c(1,2,3)
E(myg)$colouring <- "#55555533"
E(myg, path=d)$colouring <- "red"
V(myg)[ d ]$colouring <- "red"
plot(myg,vertex.color=V(myg)$colouring,vertex.size=V(myg)$sizes
,edge.width=3,layout=coords,edge.color=E(myg)$colouring )
With axes:
plot(myg,vertex.color=V(myg)$colouring,vertex.size=V(myg)$sizes
,edge.width=3,layout=coords,edge.color=E(myg)$colouring, axes=TRUE )
Use rescale=FALSE to keep the original axis scale.
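If you would rather stay in ggplot2, a minimal sketch is to draw the edges with geom_segment and the nodes with geom_point. It assumes that columns 3 to 6 of the original edges data hold the start and end coordinates of each edge (x, y, xend, yend); the z column is a placeholder for the third dimension and is not part of the original data.

```r
library(ggplot2)

# Node coordinates from the question, built without cbind() so they stay numeric
nodes <- data.frame(
  sample = c("A", "A", "A", "B", "B", "B"),
  x = c(1, 2, 3, 8, 2, 1),
  y = c(6, 3, 1, 4, 5, 6),
  z = 1:6  # placeholder values for the third dimension
)

# Assumed reading of the edge columns: start (x, y) and end (xend, yend)
edges <- data.frame(
  x = c(1, 3, 2),
  y = c(6, 1, 5),
  xend = c(2, 8, 1),
  yend = c(3, 4, 6)
)

p <- ggplot() +
  geom_segment(data = edges,
               aes(x = x, y = y, xend = xend, yend = yend),
               colour = "grey50") +
  geom_point(data = nodes,
             aes(x = x, y = y, colour = sample, size = z))
p
```

Each layer gets its own data argument, so the node and edge tables can have different numbers of rows.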