ordering nodes in Sankey diagram using rCharts

I'm building a Sankey diagram in R using rCharts, following https://github.com/timelyportfolio/rCharts_d3_sankey.
Everything works except that I'd like to control the placement of the nodes. When I run the R script, it produces this:
I want all node columns in ascending order, like the 2012 and 2013 node columns, and like the image below (which I modified manually).
My edge list is already sorted in the proper order, as you can see:
g <- graph.data.frame(network.df[ , c("source","target","weight")])
edgelist <- get.data.frame(g)
colnames(edgelist) <- c("source","target","value")
edgelist$source <- as.character(edgelist$source)
edgelist$target <- as.character(edgelist$target)
edgelist #<-edgelist is sorted by source
source target value
1 2012-0 2013-0 5
2 2012-0 2013-1 21
3 2012-1 2013-1 79
4 2013-0 2014-0 42
5 2013-0 2014-1 10
6 2013-0 2014-2 13
7 2013-0 2014-3 19
8 2013-0 2014-4 12
9 2013-0 2014-5 1
10 2013-1 2014-0 29
11 2013-1 2014-1 29
12 2013-1 2014-2 23
13 2013-1 2014-3 54
14 2013-1 2014-4 17
15 2014-0 2015-0 2
16 2014-0 2015-1 8
17 2014-0 2015-2 1
18 2014-0 2015-3 1
19 2014-0 2015-4 9
20 2014-1 2015-0 5
21 2014-1 2015-1 13
22 2014-1 2015-2 68
23 2014-1 2015-3 7
24 2014-1 2015-4 66
25 2014-2 2015-0 9
26 2014-2 2015-2 23
27 2014-2 2015-3 21
28 2014-3 2015-3 56
29 2014-4 2015-4 2
30 2014-5 2015-5 1
31 2015-0 2016-0 1
32 2015-0 2016-1 1
33 2015-0 2016-2 4
<more rows omitted>
sankeyPlot <- rCharts$new()
sankeyPlot$setLib('/rCharts_d3_sankey-gh-pages/rCharts_d3_sankey-gh-pages')
sankeyPlot$setTemplate(script = "rCharts_d3_sankey-gh-pages/rCharts_d3_sankey-gh-pages/layouts/chart.html")

Related

Convert entire data set into numeric form during retrieving the file

I am working on a Shiny app and want to convert the entire data set to numeric form. I use the code below to read a file from the local PC. What changes are needed so that the whole data set is converted to numeric while it is being read?
datami <- reactive({
  file1 <- input$file
  # nothing has been uploaded yet
  if (is.null(file1)) { return() }
  read.csv(file = file1$datapath, sep = input$sep, header = input$header,
           stringsAsFactors = input$stringAsFactors)
})
output$table <- renderPrint({
  if (is.null(datami())) { return() }
  str(datami())
})
tabsetPanel(tabPanel("Data", div(h5("Data", style = "color:red")), verbatimTextOutput("table")))

Depending on how you want to deal with lower/uppercase letters (if you have them in your data) we could do one of the following:
MRE:
letter_variable <- c(letters, LETTERS)
Same numeric value for upper and lower case letters:
letter_variable_as_numeric1 <- as.numeric(factor(toupper(letter_variable), levels = LETTERS))
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
[22] 22 23 24 25 26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
[43] 17 18 19 20 21 22 23 24 25 26
Different numeric value for upper and lower case letters:
letter_variable_as_numeric2 <- as.numeric(factor(letter_variable, levels = c(letters, LETTERS)))
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
[22] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
[43] 43 44 45 46 47 48 49 50 51 52
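To apply the same idea to a whole data frame, as the question asks, one possible sketch follows. The helper name to_numeric_df and the sample data are hypothetical, and the sketch assumes that columns which cannot be parsed as numbers should be replaced by their factor codes.

```r
# Hypothetical helper: convert every column of a data frame to numeric.
to_numeric_df <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) return(col)
    as_num <- suppressWarnings(as.numeric(as.character(col)))
    # keep the parsed values when every non-missing entry was a number stored as text
    if (!anyNA(as_num[!is.na(col)])) return(as_num)
    # otherwise fall back to factor codes (levels sorted in the default order)
    as.numeric(factor(col))
  })
  df
}

# Made-up sample data, not taken from the question.
raw <- data.frame(id = c("1", "2", "3"),
                  grade = c("b", "a", "b"),
                  score = c(1.5, 2.5, 3.5),
                  stringsAsFactors = FALSE)
converted <- to_numeric_df(raw)
str(converted)
```

Inside the Shiny reactive from the question, the read.csv(...) call could be wrapped as to_numeric_df(read.csv(...)) so that the conversion happens while the file is read.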

Optimizing preprocessing data frame in R

I have the following data frame with the name dataValues:
dates hours
1 2015-10-12 1
5 2015-10-12 5
9 2015-10-12 9
11 2015-10-12 11
14 2015-10-12 14
15 2015-10-12 15
17 2015-10-12 17
19 2015-10-12 19
22 2015-10-12 22
23 2015-10-12 23
24 2015-10-12 24
27 2015-10-13 3
29 2015-10-13 5
33 2015-10-13 9
36 2015-10-13 12
37 2015-10-13 13
38 2015-10-13 14
40 2015-10-13 16
42 2015-10-13 18
44 2015-10-13 20
45 2015-10-13 21
46 2015-10-13 22
47 2015-10-13 23
49 2015-10-14 1
54 2015-10-14 6
56 2015-10-14 8
59 2015-10-14 11
60 2015-10-14 12
61 2015-10-14 13
63 2015-10-14 15
64 2015-10-14 16
66 2015-10-14 18
69 2015-10-14 21
71 2015-10-14 23
72 2015-10-14 24
I have preprocessed this data frame to collect all hours for each day. The result is stored in the variable totallist, which looks like this:
[[1]]
[1] 1 5 9 11 14 15 17 19 22 23 24
[[2]]
[1] 3 5 9 12 13 14 16 18 20 21 22 23
[[3]]
[1] 1 6 8 11 12 13 15 16 18 21 23 24
The code I used for this is the following:
uniqueDates <- unique(dataValues$dates)
totallist <- {}
for(date in uniqueDates){
  templist <- {}
  for(i in 1:length(dataValues$dates)){
    if(dataValues$dates[i]==date){
      # collect the hours recorded on this date
      templist <- append(templist,dataValues$hours[i])
    }
  }
  totallist <- append(totallist,list(templist))
}
For the example in this question (with 3 days) it works fine and the result is what I want, but if I use this on a large dataset (which has about 260 days), it takes about 6 to 7 minutes to finish.
Is there a faster way to do what I want?

Try any of these:
# 1
with(unique(dataValues), split(hours, dates))
# 1a - variation of last solution
with(dataValues, lapply(split(hours, dates), unique))
# 2
unstack(unique(dataValues), hours ~ dates)
# 2a - variation of last solution
lapply(unstack(dataValues, hours ~ dates), unique)
Note that if the data values are known to be unique already, as is the case in the sample data shown in the question, then unique(dataValues) in #1 and #2 could be replaced with just dataValues.
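As a minimal sketch of why option #1 helps: the loop in the question scans every row once per date and grows its vectors with append, whereas split groups the hours in a single pass. The data below is a small subset of the question's data frame, and the name unnamed is only for illustration.

```r
# Small subset of the question's data (dates and hours only).
dataValues <- data.frame(
  dates = c("2015-10-12", "2015-10-12", "2015-10-12",
            "2015-10-13", "2015-10-13",
            "2015-10-14"),
  hours = c(1, 5, 9, 3, 5, 1),
  stringsAsFactors = FALSE
)

# One vectorised pass: group the hours by date.
totallist <- with(dataValues, split(hours, dates))

# Drop the date names to match the unnamed list built by the original loop.
unnamed <- unname(totallist)
print(totallist)
```

The named list is usually more convenient, since totallist[["2015-10-13"]] looks up a day directly.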

I believe you would be better off using the tapply function. I've created a simpler data frame just to show what it does:
df <- data.frame(dates=rep(c("2015-01-02","2015-01-03","2015-01-04"),10),hours=trunc(runif(30,1,10)))
tapply(df$hours,df$dates,unique)
Output:
$`2015-01-02`
[1] 2 8 6 1 5
$`2015-01-03`
[1] 7 5 2 3
$`2015-01-04`
[1] 1 2 6 5 8 4 9

How to update and replace part of old data

I want to merge the data frames OldData and NewData.
In this case, Nov-2015 and Dec-2015 are present in both data frames.
Since NewData is the most accurate update available, I want to update the values for Nov-2015 and Dec-2015 using the values in NewData, and also add the records for Jan-2016 and Feb-2016.
Can anyone help?
OldData
Month Value
1 Jan-2015 3
2 Feb-2015 76
3 Mar-2015 31
4 Apr-2015 45
5 May-2015 99
6 Jun-2015 95
7 Jul-2015 18
8 Aug-2015 97
9 Sep-2015 61
10 Oct-2015 7
11 Nov-2015 42
12 Dec-2015 32
NewData
Month Value
1 Nov-2015 88
2 Dec-2015 45
3 Jan-2016 32
4 Feb-2016 11
Here is the output I want:
JoinData
Month Value
1 Jan-2015 3
2 Feb-2015 76
3 Mar-2015 31
4 Apr-2015 45
5 May-2015 99
6 Jun-2015 95
7 Jul-2015 18
8 Aug-2015 97
9 Sep-2015 61
10 Oct-2015 7
11 Nov-2015 88
12 Dec-2015 45
13 Jan-2016 32
14 Feb-2016 11
Thanks to @akrun, the problem is solved, and the following code works smoothly!
rbindlist(list(OldData, NewData))[!duplicated(Month, fromLast=TRUE)]
Update: Now, let's extend the problem a little.
Suppose OldData and NewData have another column called "Type".
How do we merge/update them this time?
> OldData
Month Type Value
1 2015-01 A 3
2 2015-02 A 76
3 2015-03 A 31
4 2015-04 A 45
5 2015-05 A 99
6 2015-06 A 95
7 2015-07 A 18
8 2015-08 A 97
9 2015-09 A 61
10 2015-10 A 7
11 2015-11 B 42
12 2015-12 C 32
13 2015-12 D 77
> NewData
Month Type Value
1 2015-11 A 88
2 2015-12 C 45
3 2015-12 D 22
4 2016-01 A 32
5 2016-02 A 11
JoinData should take every updated value from NewData, as follows:
> JoinData
Month Type Value
1 2015-01 A 3
2 2015-02 A 76
3 2015-03 A 31
4 2015-04 A 45
5 2015-05 A 99
6 2015-06 A 95
7 2015-07 A 18
8 2015-08 A 97
9 2015-09 A 61
10 2015-10 A 7
11 2015-11 B 42
12 2015-11 A 88 (originally not included, added from NewData)
13 2015-12 C 45 (value updated from NewData)
14 2015-12 D 22 (value updated from NewData)
15 2016-01 A 32 (newly added from NewData)
16 2016-02 A 11 (newly added from NewData)
Thanks to @akrun, I now have a solution for the second question as well.
Thanks to everyone here for the help!
Here is the answer:
d1 <- merge(OldData, NewData, by = c("Month", "Type"), all = TRUE)
d2 <- transform(d1, Value.x = ifelse(!is.na(Value.y), Value.y, Value.x))[-4]
d2[!duplicated(d2[1:2], fromLast = TRUE), ]

Here is an option using data.table (a similar approach to the one @thelatemail mentioned in the comments):
library(data.table)
rbindlist(list(OldData, NewData))[!duplicated(Month, fromLast=TRUE)]
Or
rbindlist(list(OldData, NewData))[,if(.N >1) .SD[.N] else .SD, Month]
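The same keep-the-last-row idea can be extended to the two-key case from the question's update (Month plus Type). The following is a base R sketch on a small subset of the question's data, not the approach given in the answers above; the names combined and keys are only for illustration.

```r
# Subset of the question's second example.
OldData <- data.frame(Month = c("2015-10", "2015-11", "2015-12", "2015-12"),
                      Type  = c("A", "B", "C", "D"),
                      Value = c(7, 42, 32, 77),
                      stringsAsFactors = FALSE)
NewData <- data.frame(Month = c("2015-11", "2015-12", "2015-12", "2016-01"),
                      Type  = c("A", "C", "D", "A"),
                      Value = c(88, 45, 22, 32),
                      stringsAsFactors = FALSE)

# Stack old rows first and new rows last, then keep the last row per (Month, Type).
combined <- rbind(OldData, NewData)
keys <- combined[c("Month", "Type")]
JoinData <- combined[!duplicated(keys, fromLast = TRUE), ]

# Sort by the key columns and renumber the rows.
JoinData <- JoinData[order(JoinData$Month, JoinData$Type), ]
rownames(JoinData) <- NULL
print(JoinData)
```

Because NewData is stacked last, fromLast = TRUE makes its rows win whenever a (Month, Type) pair appears in both data frames. Rows that exist only in one of them, such as 2015-11 B, pass through unchanged.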

Error while plotting a tree with some squirrels using trees package

I am using the trees package by @jbaums, found here and explained in this post.
My data are the following.
The tree is composed of the trunk
Trunk
[1] 13.60415
and the branches
Tree
TreeBranchLength TreeBranchID
1 10.004269 1
2 7.994269 2
3 9.028834 11
4 10.817401 12
5 8.551311 111
6 10.599798 112
7 11.073243 121
8 13.367392 122
9 9.625431 1111
10 10.793569 1112
11 9.896499 11121
12 8.687741 11122
13 7.791180 1211
14 12.506105 1212
15 6.768478 1221
16 10.441796 1222
17 10.751892 1121
18 9.458651 1122
19 10.768509 11221
20 10.150673 11222
21 12.377448 111211
22 12.235136 111212
23 9.074079 11211
24 9.996334 11212
25 9.807019 112221
26 10.895809 112222
27 6.741274 1122211
28 15.841272 1122212
29 5.753920 11222111
30 8.846389 11222112
31 11.925961 112111
32 9.780776 112112
33 8.207965 12221
34 10.079375 12222
and the 50 squirrel populations:
Populations
PopulationPositionOnBranch PopulationBranchID ID
1 10.6321655 112111 1
2 1.0644897 1 2
3 3.9315473 1 3
4 1.0310244 0 4
5 9.1768846 0 5
6 13.4267181 0 6
7 7.9461528 0 7
8 6.0533401 121 8
9 2.1227425 121 9
10 1.8256787 121 10
11 4.7332588 11222112 11
12 4.4837432 11222112 12
13 4.6200834 11222112 13
14 2.5622276 1221 14
15 1.2446683 1221 15
16 7.0674052 111 16
17 1.3854674 111 17
18 4.8735635 111 18
19 9.5007998 1222 19
20 6.6373468 1222 20
21 12.6757728 122 21
22 4.2685465 122 22
23 3.9806540 2 23
24 3.1025403 2 24
25 3.9119065 11122 25
26 1.5527653 11122 26
27 1.6687957 11122 27
28 8.0697456 1122 28
29 6.7871391 1122 29
30 9.8050713 111212 30
31 8.5226920 111212 31
32 3.6113379 111212 32
33 7.3184965 111211 33
34 8.6142984 111211 34
35 1.3550870 1211 35
36 8.3650639 12 36
37 4.6411446 112112 37
38 3.2985541 112112 38
39 12.2344148 1212 39
40 9.0290776 1212 40
41 1.3900249 1121 41
42 0.9261425 1122212 42
43 15.2522199 1122212 43
44 4.0253771 12222 44
45 8.7507678 11222 45
46 4.6289841 1122211 46
47 9.1799522 112 47
48 5.1293838 12221 48
49 1.1543080 12221 49
50 10.1014837 112222 50
The code to produce the plot is
g <- germinate(list(trunk.height=Trunk,
branches=Tree$TreeBranchID,
lengths=Tree$TreeBranchLength),
left='1', right='2', angle=30)
xy <- squirrels(g, Populations$PopulationBranchID, pos=Populations$PopulationPositionOnBranch,
left='1', right='2', pch=21, bg='white', cex=3, lwd=2)
text(xy$x, xy$y, labels=seq_len(nrow(xy)), font=1)
which produces the plot below.
As you can see in the plot, population 43 (blue arrow) lies outside the tree. The branch lengths in the plot do not seem to correspond to the data. For example, the branch holding populations 37 and 38 (left green arrow) is drawn longer than the one holding population 43 (right green arrow), which is not the case in the data. What am I doing wrong? Have I understood correctly how to use trees?

From studying the germinate function, it seems that the Tree values passed to it need to be sorted by the TreeBranchID field in ascending order.
The branch on which 43 is drawn is not actually branch 1122212.
Because of the order in which the values in Tree were supplied, the function misplaces the branch.
I was curious whether increasing the length of branch 1122212 would change the branch on which 43 is placed. It didn't. The branch that actually grew longer was the one holding 37 and 38.
This suggested that something was going wrong inside the germinate function. After further debugging, I got it to work with the code below.
Tree<-read.csv("treeBranch.csv")
Tree<-Tree[order(Tree$TreeBranchID),]
g <- germinate(list(trunk.height=15,
branches=Tree$TreeBranchID,
lengths=Tree$TreeBranchLength),
left='1', right='2', angle=30)
xy <- squirrels(g, Populations$PopulationBranchID,pos=Populations$PopulationPositionOnBranch,
left='1', right='2', pch=21, bg='white', cex=3, lwd=2)
text(xy$x, xy$y, labels=seq_len(nrow(xy)), font=1)
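The sorting step is the essential change. As a small illustration on a hypothetical subset of the Tree data, deliberately placed out of order, and assuming TreeBranchID is numeric (if the IDs were character strings, order would sort them alphabetically instead):

```r
# Hypothetical subset of the Tree data, deliberately out of order.
Tree <- data.frame(TreeBranchLength = c(9.028834, 10.004269, 8.551311, 7.994269),
                   TreeBranchID     = c(11, 1, 111, 2))

# order() returns the row permutation that sorts the IDs in ascending order;
# indexing the rows with it keeps each length paired with its branch ID.
Tree <- Tree[order(Tree$TreeBranchID), ]
print(Tree)
```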

how to fix "undefined columns selected" for network meta-analysis in R?

I am conducting a network meta-analysis in R with two packages, gemtc and rjags. However, when I type
Model <- mtc.model(network, linearModel="fixed")
R always returns:
Error in `[.data.frame`(data, sel1 | sel2, columns, drop = FALSE) :
  undefined columns selected
In addition: Warning messages:
1: In mtc.model(network, linearModel = "fixed") :
  Likelihood can not be inferred. Defaulting to normal.
2: In mtc.model(network, linearModel = "fixed") :
  Link can not be inferred. Defaulting to identity
How can I fix this problem? Thanks!
My code and data are below:
SAE <- read.csv(file.choose(),head=T, sep=",")
head(SAE)
network <- mtc.network(data.ab=SAE)
summary(network)
plot(network)
model.fe <- mtc.model (network, linearModel="fixed")
plot(model.fe)
summary(model.fe)
cat(model.fe$code)
model.fe$data
# run this model
result.fe <- mtc.run(model.fe, n.adapt=0, n.iter=50)
plot(result.fe)
gelman.diag(result.fe)
result.fe <- mtc.run(model.fe, n.adapt=1000, n.iter=5000)
plot(result.fe)
gelman.diag(result.fe)
The following is my data, SAE:
study treatment responder sample.size
1 1 3 0 76
2 1 30 2 72
3 2 3 99 1389
4 2 23 132 1383
5 3 1 6 352
6 3 30 2 178
7 4 2 6 106
8 4 30 3 95
9 5 3 49 393
10 5 25 18 198
11 6 1 20 65
12 6 22 10 26
13 7 1 1 76
14 7 30 3 76
15 8 3 7 441
16 8 26 1 220
17 9 2 1 47
18 9 30 0 41
19 10 3 10 156
20 10 30 9 150
21 11 1 4 85
22 11 25 5 85
23 11 30 4 84
24 12 3 6 152
25 12 30 5 160
26 13 18 4 158
27 13 21 8 158
28 14 1 3 110
29 14 30 2 111
30 15 3 3 83
31 15 30 1 92
32 16 1 3 124
33 16 22 6 123
34 16 30 4 125
35 17 3 236 1553
36 17 23 254 1546
37 18 6 5 398
38 18 7 6 403
39 19 1 64 588
40 19 22 73 584

The manual page ?mtc.model states the following:
Required columns [responders, sampleSize]
So your responder variable should be responders and your sample.size variable should be sampleSize.
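A minimal base R sketch of that renaming, using two rows of the question's data:

```r
# Two rows of the question's data with the original column names.
SAE <- data.frame(study = c(1, 1), treatment = c(3, 30),
                  responder = c(0, 2), sample.size = c(76, 72))

# Rename to the column names the manual lists as required.
names(SAE)[names(SAE) == "responder"]   <- "responders"
names(SAE)[names(SAE) == "sample.size"] <- "sampleSize"
print(names(SAE))
```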
Next, your plot(network) should help you determine that some comparisons cannot be made. In your data, the trials do not form a single connected network: treatments 18 and 21 (study 13) were compared only with each other, and so were treatments 6 and 7 (study 18). You can therefore only run a separate meta-analysis within each of those pairs, or a network meta-analysis of the rest.
network <- mtc.network(data.ab=SAE[!SAE$treatment %in% c(6, 7, 18, 21), ])
model.fe <- mtc.model(network, linearModel="fixed")
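Besides reading the plot, the connected groups of treatments can be checked in code. The following base R sketch is not part of gemtc, and the names label and arms_by_study are only for illustration. It runs on a subset of the question's data (studies 1, 2, 11, 13 and 18).

```r
# Subset of the question's data: studies 1, 2, 11, 13 and 18.
SAE <- data.frame(
  study     = c(1, 1, 2, 2, 11, 11, 11, 13, 13, 18, 18),
  treatment = c(3, 30, 3, 23, 1, 25, 30, 18, 21, 6, 7)
)

# Start with one label per treatment, then repeatedly give all treatments
# within a study the smallest label among them until nothing changes.
trts <- as.character(sort(unique(SAE$treatment)))
label <- setNames(seq_along(trts), trts)
arms_by_study <- split(as.character(SAE$treatment), SAE$study)
repeat {
  previous <- label
  for (arms in arms_by_study) {
    label[arms] <- min(label[arms])
  }
  if (identical(previous, label)) break
}

# Treatments that share a label can be compared, directly or indirectly.
groups <- split(names(label), label)
print(groups)
```

On this subset the sketch reports three groups. Any group other than the largest one holds treatments that would have to be dropped or analysed separately.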
