I have a network with some directed and some undirected edges. I'm trying to plot it with igraph using the arrow.mode parameter, but the plot always shows arrows drawn with the default settings. Here's an example.
Here are some data:
spearRhoP_lagged4 <- structure(list(Var1 = c("ARISA_538.9", "ARISA_538.9", "ARISA_666.4",
"ARISA_686.9", "ARISA_538.9", "ARISA_594.1"), Var2 = c("ARISA_666.4",
"ARISA_686.9", "ARISA_686.9", "ARISA_666.4", "ARISA_561.8", "ARISA_561.8"
), rho = c(0.280885191364122, 0.415365287156247, 0.614493076574831,
0.312630564055403, 0.295296877306726, 0.381890811408216), p = c(0.00206314544835896,
2.9098006351119e-06, 1.35005674822095e-13, 0.000567475872663549,
0.00116911931220592, 1.98010880043619e-05), delay = c(0, 0, 0,
1, 0, 0), fdr = c(0.0135393920048557, 7.97032347878478e-05, 2.83511917126399e-11,
0.00503534929264839, 0.00898225813036257, 0.000366902513022),
arrow = c("-", "-", "-", ">", "-", "-")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
Var1 = c("ARISA_538.9", "ARISA_538.9", "ARISA_538.9", "ARISA_594.1",
"ARISA_666.4", "ARISA_686.9"), Var2 = c("ARISA_561.8", "ARISA_666.4",
"ARISA_686.9", "ARISA_561.8", "ARISA_686.9", "ARISA_666.4"
), .rows = list(5L, 1L, 2L, 6L, 3L, 4L)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))
Then I build the graph
LaggedSpearGraph <- graph_from_data_frame(spearRhoP_lagged4)
Lastly I plot the graph, telling it that I want the arrow direction to be specified by the parameter arrow
plot(LaggedSpearGraph,
vertex.size=2,
arrow.mode = E(LaggedSpearGraph)$arrow)
I get an output that looks like this.
But what I want is a network where there is only one edge with an arrow on it.
Any suggestions?
You need to add edge. as a prefix, i.e. use edge.arrow.mode instead of arrow.mode:
LaggedSpearGraph <- graph_from_data_frame(spearRhoP_lagged4, directed = TRUE)
plot(LaggedSpearGraph,
vertex.size=10,
edge.arrow.mode = E(LaggedSpearGraph)$arrow)
See here:
https://github.com/igraph/igraph/issues/954
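As a minimal sketch of the same idea, the arrow column can also be turned into the numeric codes that igraph documents for edge.arrow.mode (0 = no arrow, 2 = forward arrow). The toy edge list and the name arrow_code are made up for illustration, and the plot call only runs if igraph is installed.

```r
# Hypothetical toy edge list: only the fourth edge should carry an arrow
edges <- data.frame(
  from  = c("A", "A", "B", "C"),
  to    = c("B", "C", "C", "B"),
  arrow = c("-", "-", "-", ">"),
  stringsAsFactors = FALSE
)

# Map the character marker to igraph's numeric arrow codes
# (0 = no arrow, 2 = forward arrow)
arrow_code <- ifelse(edges$arrow == ">", 2L, 0L)

# Plot only when igraph is available; note the "edge." prefix
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = TRUE)
  plot(g, vertex.size = 10, edge.arrow.mode = arrow_code)
}
```

The graph itself still has to be built as directed; the per-edge arrow mode only controls how each edge is drawn.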
My data is a set of activities completed by persons. The sequence of activities a person takes varies. The data below show the activity at each step (Step1, Step2, etc.). I'd like an alluvial plot that labels the activities at each step (each one a different node: 1, 2, 3...). What is the best approach? Here's what I have so far:
df<-structure(list(acts_activity_id = c("9928131", "445661", "686203", "687868", "688564"), Step1 = c("Unable to Reach", "Unable to Reach",
"Search Correspondence", "Unable to Reach", "Unable to Reach"), Step2 = c("Match Request", NA, "Connection Made", NA, "Match Request"
), Step3 = c("Support Group Request", NA, "Connection Contact Attempt", NA, "Support Group Request"),Step4 = c("Information Provided",
NA, "Not Available to Support", NA, "Information Provided"),
Step5 = c(NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_)), class = c("grouped_df", "tbl_df", "tbl",
"data.frame"),
row.names = c(NA, -5L),
groups = structure(list(acts_activity_id = c("9928131", "445661", "686203", "687868", "688564"), .rows = structure(list(1L, 2L, 3L, 4L, 5L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), .drop = TRUE))
df %>%
ggplot(
aes(
axis1=Step1, #each step has different values; individuals go thru different sequence of steps
axis2=Step2, axis3=Step3, axis4=Step4, axis5=Step5 ))+
geom_flow()+
geom_stratum()+
labs(title="Activity Sequence")
If you have your data in this order (each column is a set of different activities), then use ggsankey:
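library(dplyr)    # provides %>%
library(ggplot2)
library(ggsankey)
# Drop the ID column so that only the Step columns remain
df$acts_activity_id <- NULL
# Reshape to the long format (x, node, next_x, next_node) that geom_sankey expects
x <- df %>% ggsankey::make_long(Step1, Step2, Step3, Step4, Step5)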
ggplot(x, aes(x = x, next_x = next_x,
node = node, next_node = next_node,
fill = factor(node), label = node)) +
geom_sankey(flow.alpha = 0.6, node.color = "gray30") +
geom_sankey_label(size = 3, color = "white", fill = "gray40") +
scale_fill_viridis_d() +
theme_sankey(base_size = 18) +
labs(x = NULL) +
theme(legend.position = "none",
plot.title = element_text(hjust = .5))
I have a small issue. Here's my code:
ggplot(GFAPdata_numb, aes(x=Level, y=Pos.Area, fill=Statut))+
geom_bar(stat="identity", color="black", position = "dodge")+
geom_errorbar(aes(ymin=lower, ymax=higher), width=.2, position=position_dodge(.9))
And for some weird and unknown reason, my plot looks like this (image: weird dodge).
And I don't know why! The dodge seems to have worked somehow, but it looks like the "ghost" of the data is still stacked and messing up my error bars...
Do you have any idea what's causing that?
Edit: I was asked to add some data with dput, so here it is (first time using this function, so I'm not sure I did it right)
> dput(head(GFAPdata_numb))
structure(list(Agneau = c(1L, 1L, 1L, 1L, 2L, 2L), Statut = c("terme",
"terme", "terme", "terme", "terme", "terme"), Area = c(6.53,
6.53, 6.53, 6.53, 4.93, 4.93), Level = c("Weak", "Pos", "Strong",
"Neg", "Weak", "Pos"), Values = c(6744015L, 5076648L, 787615L,
13099676L, 5356151L, 3978924L), Positivity = c(0.262331844844596,
0.197473824638087, 0.0306370160768142, 0.509557314440504, 0.275961978086681,
0.205003880155091), Pos.Area = c(0.0401733299915154, 0.0302410144928157,
0.00469173293672499, 0.0780332793936453, 0.0559760604638299,
0.041582937151134), moyenne = c(0.0382848392036753, 0.0382848392036753,
0.0382848392036753, 0.0382848392036753, 0.050709939148073, 0.050709939148073
), ecart.type = c(0.0304231534615388, 0.0304231534615388, 0.0304231534615388,
0.0304231534615388, 0.0391149666608345, 0.0391149666608345),
SEM = c(0.0152115767307694, 0.0152115767307694, 0.0152115767307694,
0.0152115767307694, 0.0195574833304173, 0.0195574833304173
), lower = c(0.00847014881136729, 0.00847014881136729, 0.00847014881136729,
0.00847014881136729, 0.0123772718204552, 0.0123772718204552
), higher = c(0.0680995295959834, 0.0680995295959834, 0.0680995295959834,
0.0680995295959834, 0.0890426064756909, 0.0890426064756909
)), row.names = c(NA, -6L), groups = structure(list(Agneau = 1:2,
.rows = structure(list(1:4, 5:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
OK, I think I managed to fix the issue myself after a couple of days away from it to clear my head.
I was using "Pos.Area" as my y-value (that is, the value for each of my samples) instead of the mean of Pos.Area to create my plot. I guess that's why my error bars were so wild: I was drawing an error bar for every individual value of Pos.Area.
Once I changed that, the plot was much better.
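A rough sketch of that fix, on made-up data: summarise to one row per Level/Statut combination first, then plot the means with their error bars. The toy values, the second Statut level "premature", and the names toy, bars, mean_area and sem are all assumptions for illustration. The error bars here are mean ± SEM, which may not match the interval used in the original data.

```r
# Hypothetical toy data: two animals per Statut/Level combination
toy <- data.frame(
  Statut   = rep(c("terme", "premature"), each = 6),
  Level    = rep(c("Weak", "Pos", "Strong"), times = 4),
  Pos.Area = c(0.040, 0.030, 0.005, 0.056, 0.042, 0.007,
               0.050, 0.020, 0.010, 0.060, 0.030, 0.020),
  stringsAsFactors = FALSE
)

# Standard error of the mean
sem <- function(v) sd(v) / sqrt(length(v))

# One row per bar: mean and SEM for each Statut/Level combination
bars <- aggregate(Pos.Area ~ Statut + Level, data = toy, FUN = mean)
names(bars)[3] <- "mean_area"
bars$sem    <- aggregate(Pos.Area ~ Statut + Level, data = toy, FUN = sem)$Pos.Area
bars$lower  <- bars$mean_area - bars$sem
bars$higher <- bars$mean_area + bars$sem

# Plot the summarised data, so each dodged bar gets exactly one error bar
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(bars, aes(x = Level, y = mean_area, fill = Statut)) +
    geom_col(color = "black", position = position_dodge(0.9)) +
    geom_errorbar(aes(ymin = lower, ymax = higher),
                  width = 0.2, position = position_dodge(0.9))
  print(p)
}
```

Because bars has a single row per bar, nothing is left to stack underneath the dodged bars.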
I have several data objects nested in one large list, which I need to stack using rbind. However, before stacking them, I need to convert the column names to lower case, since the data objects were stored with different case styles. How could I do this?
Toy data
df <- list(structure(list(a = 1:3, x = c(-1.99, -1.11, -0.34), y = c("C", "B", "A")), .Names = c("a", "x",
"y"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-3L)), structure(list(a = 1:3, x = c(-0.44, -1.07,
-0.23)), .Names = c("A", "x"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -3L)), structure(list(
a = 1:3, x = c(-0.62, -0.60, -0.06
), y = c(3L, 2L, 1L)), .Names = c("a", "X", "y"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -3L)))
lapply(df, names)
rbind
data.table::rbindlist(df, fill=TRUE, idcol = TRUE)
Here is a solution using lapply. However, it creates a duplicate of the original list.
df_lower <- lapply(df, function(x) setNames(x, tolower(names(x))))
Write a function and use lapply to apply it to all the data frames. This will change all column names to lower case.
colname_fun <- function(dt){
dt <- setNames(dt, tolower(names(dt)))
return(dt)
}
lapply(df, colname_fun)
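For completeness, here is a rough base-R sketch of the whole pipeline: lower-case the names, then stack. The toy list dfs and the helper names are made up for illustration. Padding missing columns with NA is just one way to let base rbind() cope with data frames that have different columns.

```r
# Hypothetical toy list: column names differ only in case, one column is missing
dfs <- list(
  data.frame(a = 1:2, x = c(0.1, 0.2), stringsAsFactors = FALSE),
  data.frame(A = 3:4, X = c(0.3, 0.4), y = c("p", "q"), stringsAsFactors = FALSE)
)

# Step 1: lower-case the column names of every element
dfs_lower <- lapply(dfs, function(d) setNames(d, tolower(names(d))))

# Step 2: add any missing column as NA so all elements share the same columns
all_cols <- unique(unlist(lapply(dfs_lower, names)))
padded <- lapply(dfs_lower, function(d) {
  for (col in setdiff(all_cols, names(d))) d[[col]] <- NA
  d[all_cols]
})

# Step 3: stack with base rbind()
stacked <- do.call(rbind, padded)
```

If data.table is available, calling data.table::rbindlist(dfs_lower, fill = TRUE) on the lower-cased list should make the padding step unnecessary.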
Pleased to meet you.
EDIT: Solution
As pointed out by MartineJ and emilliman5, nodes should be uniquely labelled (below).
library("riverplot")
nodes<-structure(list(ID = c("2011+", "2011-", "2016+", "2016-"), x = c(20,
20, 30, 30), y = c(50, 40, 50, 40)), class = "data.frame", row.names = c(NA,
-4L))
edges<-structure(list(N1 = c("2011+", "2011-", "2011+", "2011-"), N2 =
c("2016+", "2016-", "2016-", "2016+"), Value = c(461, 7, 0, 46)), class =
"data.frame", row.names = c(NA, -4L))
river <- makeRiver(nodes,edges)
riverplot(river)
I've been trying to plot a Sankey diagram/riverplot (using the riverplot package) of how cancer registrations evolve over time, though this code has brought me little success so far. Could anyone point out what is wrong with it?
Warning message: In checkedges(x2$edges, names(x2)) : duplicated edge information, removing 1 edges
Here is the suspect code:
library("riverplot")
edges<-structure(list(N1 = c("+", "-", "+", "-"), N2 = c("+", "-", "-", "+"), Value = c(664L, 50L, 0L, 46L)), .Names = c("N1", "N2", "Value"), class = "data.frame", row.names = c(NA, -4L))
nodes = data.frame(ID = unique(c(edges$N1, edges$N2)), stringsAsFactors = FALSE)
nodes$x = c(1,2)
rownames(nodes) = nodes$ID
rp <- list(nodes=nodes, edges=edges)
class(rp) <- c(class(rp), "riverplot")
plot(rp)
And the data, which are included in the code:
N1 N2 Value
+ + 664
- - 50
+ - 0
- + 46
Eternally grateful.
It looks like you're using the same value multiple times in N1 (and in N2). Try making them all different (per column) and try again, for instance:
N1: plus1 minus1 plus2 minus2
If you want to show only + and -: makeRiver has a node_labels option.
Your nodes need unique IDs; you can then use nodes$labels to restore the original display labels:
library(riverplot)
edges<-structure(list(N1 = c("+", "-", "+", "-"), N2 = c("+", "-", "-", "+"), Value = c(664L, 50L, 0L, 46L)), .Names = c("N1", "N2", "Value"), class = "data.frame", row.names = c(NA, -4L))
edges$N1 <- paste0(edges$N1, "a")
edges$N2 <- paste0(edges$N2, "b")
nodes = data.frame(ID = unique(c(edges$N1, edges$N2)), stringsAsFactors = FALSE)
nodes$x = c(1,1,2,2)
nodes$labels <- as.character(substr(nodes$ID, 1, 1))
rownames(nodes) = nodes$ID
rp <- list(nodes=nodes, edges=edges)
class(rp) <- c(class(rp), "riverplot")
plot(rp)
My data is structured as follows:
dput(head(CharacterAnalysis,5))
structure(list(Character = c("A", "a", "B", "b", "C"),
Descriptor = c("Jog", "Change Direction", "Shuffle", "Walk", "Stop")),
.Names = c("Character", "Descriptor"),
row.names = c(NA, 5L), class = "data.frame")
I wish to lookup the Character and relevant Descriptor in the following data frame, but am unsure how to do so:
dput(head(StringAnalysis,3))
structure(list(MovementString = c("ACb", "aAaB", "BbCa")),
.Names = c("MovementString"),
row.names = c(NA, 3L), class = "data.frame")
My expected outcome/ data frame would be:
dput(head(Output,3))
structure(list(MovementString = c("ACb", "aAaB", "BbCa"),
MovementPerformed = c("Jog/ Stop/ Walk", "Change Direction/ Jog/ Change Direction/ Shuffle", "Shuffle/ Walk/ Stop/ Change Direction")),
.Names = c("MovementString", "MovementPerformed"),
row.names = c(NA, 3L), class = "data.frame")
I would like a forward slash (/) or similar to separate each Descriptor, as it signals a new movement. Any advice on how to do this, please? My data frame CharacterAnalysis is over 1 million rows long, so I do not want to search for each MovementString separately!
Thank you.
CharacterAnalysis <-
structure(list(Character = c("A", "a", "B", "b", "C"),
Descriptor = c("Jog", "Change Direction", "Shuffle", "Walk", "Stop")),
.Names = c("Character", "Descriptor"),
row.names = c(NA, 5L), class = "data.frame")
Output <-
structure(list(MovementString = c("ACb", "aAaB", "BbCa"),
MovementPerformed = c("Jog/ Stop/ Walk", "Change Direction/ Jog/ Change Direction/ Shuffle", "Shuffle/ Walk/ Stop/ Change Direction")),
.Names = c("MovementString", "MovementPerformed"),
row.names = c(NA, 3L), class = "data.frame")
# A simple approach based on names
# Build the lookup table just once
m <- CharacterAnalysis$Descriptor
names(m) <- CharacterAnalysis$Character
# Build the MovementPerformed column
Output$MovementPerformed <-
sapply(strsplit(Output$MovementString,""),
FUN = function(x) paste(m[x], collapse = "/ "))
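The same named-vector lookup can be sketched against a fresh StringAnalysis data frame, using vapply so the result is guaranteed to be a character vector. The names lookup and chars are made up for illustration, and the lookup table is written out by hand here only to keep the example self-contained.

```r
# Lookup table as a named character vector (names are case-sensitive)
lookup <- c(A = "Jog", a = "Change Direction", B = "Shuffle",
            b = "Walk", C = "Stop")

StringAnalysis <- data.frame(
  MovementString = c("ACb", "aAaB", "BbCa"),
  stringsAsFactors = FALSE
)

# Split each string into single characters, look each one up by name,
# and paste the descriptors back together
StringAnalysis$MovementPerformed <- vapply(
  strsplit(StringAnalysis$MovementString, ""),
  function(chars) paste(lookup[chars], collapse = "/ "),
  character(1)
)
```

Indexing by name is vectorised, so the lookup table is built once and each string only costs a split and a paste.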