How to create a list of igraphs from ncol files? - r

I have a number of weighted graphs stored in *.ncol files. For the sake of the question, let's say I have 2 files,
ncol_1.ncol
21 53 1.0
5 52 1.0
32 52 1.0
119 119 0.5
119 85 0.5
87 0 1.0
36 116 1.0
85 87 1.0
116 5 1.0
4 52 1.0
115 4 1.0
53 115 1.0
52 36 0.3333333333333333
52 21 0.3333333333333333
52 119 0.3333333333333333
ncol_2.ncol
21 115 1.0
119 85 1.0
87 0 1.0
85 87 1.0
4 48 0.3333333333333333
4 20 0.6666666666666666
115 4 1.0
55 119 1.0
48 4 1.0
20 4 0.25
20 20 0.25
20 21 0.25
20 55 0.25
0 20 1.0
I would like to read these into a list of graphs, that is I would like to have a list where my_graphs[1] would give me the graph from ncol_1.ncol
(I've crossed over from Python, so my go-to data structure is a list; I am open to a better R solution if one exists.)
My attempt in R,
library(igraph)
f_list <- list.files(pattern = "\\.ncol$")
set.seed(123)
my_graphs <- list(length(f_list))
g_count = 1
for (f in f_list){
print(f)
my_graphs[g_count] <- read.graph(f, format = "ncol", directed = T)
g_count = g_count + 1
}
This is what I get as output,
[1] "r_test_1_0_100.ncol"
[1] "r_test_1_0_125.ncol"
Warning messages:
1: In my_graphs[g_count] <- read.graph(f, format = "ncol", directed = T) :
number of items to replace is not a multiple of replacement length
2: In my_graphs[g_count] <- read.graph(f, format = "ncol", directed = T) :
number of items to replace is not a multiple of replacement length
> my_graphs[1]
[[1]]
[1] 13
So what am I doing wrong here? Is the list initialization wrong? Is this something that just can't be done? Am I expecting Pythonic behaviour where none exists?

To assign a single element of a list, you should use double brackets:
my_graphs[[g_count]] <- read.graph(f, format = "ncol", directed = T)
or
my_graphs[g_count] <- list(read.graph(f, format = "ncol", directed = T))
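As a minimal sketch of the same idea without an explicit counter, the files can be read with lapply, which returns a list directly. The two tiny files written below are made-up stand-ins for the real data, and read_graph is assumed to be available as the newer spelling of read.graph in recent igraph versions.

```r
library(igraph)

# Write two small example files to a temporary directory (made-up data).
tmp <- tempfile("ncol_demo")
dir.create(tmp)
writeLines(c("21 53 1.0", "5 52 1.0", "52 21 0.5"), file.path(tmp, "ncol_1.ncol"))
writeLines(c("21 115 1.0", "119 85 1.0"), file.path(tmp, "ncol_2.ncol"))

f_list <- list.files(tmp, pattern = "\\.ncol$", full.names = TRUE)

# lapply returns a list with one graph per file.
my_graphs <- lapply(f_list, read_graph, format = "ncol", directed = TRUE)
names(my_graphs) <- basename(f_list)

# [[ ]] extracts the graph itself; [ ] would return a list of length 1.
g1 <- my_graphs[[1]]
ecount(g1)
```

Naming the list elements after the files also allows access such as my_graphs[["ncol_1.ncol"]].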

Related

Splitting a matrix into multiple matrices

There are two matrices:
Matrix with 2 columns: node name and node degree (k1):
Matrix with 1 column: degrees (ms):
I need to split the first matrix into multiple matrices, where every matrix contains the nodes of the same degree, and then write the matrices to CSV files. But my code is not working. How can I do this correctly?
k1<-read.csv2("VandD.csv", header = FALSE)
fnk1<-as.matrix(k1)
ms<-read.csv2("mas.csv", header = FALSE)
massive<-as.matrix(ms)
wlk<-1
varbl<-1
rtt<-list()
for (wlk in 1:384) {
rtt<-NULL
stepen<-massive[wlk]
for (varbl in 1:2154) {
if(fnk1[varbl,2]==stepen){
kapa<-fnk1[varbl,1]
rtt<-append(rtt,kapa)
}
}
namef<-paste("reslt",stepen,".csv",sep = "")
write.csv2(rtt, file=namef)
}
k1
V1 V2
1 UC7Ucs42FZy3uYzjrqzOIHsw 81
2 UCyWDmyZRjrGHeKF-ofFsT5Q 81
3 UCIZP6nCTyU9VV0zIhY7q1Aw 81
4 UCqk3CdGN_j8IR9z4uBbVPSg 81
5 UCjWzQkWu0l1yAhcBoavokng 81
6 UCRXiA3h1no_PFkb1JCP0yMA 81
7 UC2w9SdXpwq2Uq-MV4W4A8kw 81
8 UCdJqTQJZleoxZFReiyNvn8w 81
9 UC2Qw1dzXDBAZPwS7zm37g8g 81
10 UCTOovOHTf4efJOmGvJBxIQQ 81
ms
V1
1 81
2 82
3 83
4 84
5 85
6 86
7 87
8 88
9 89
10 90
It seems you need split. Note that R is case-sensitive, so the column is V2, not v2:
split(k1, k1$V2)
Alternatively, we can use group_split from dplyr:
library(dplyr)
k1 %>%
group_split(V2)
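To cover the "write to CSV files" part of the question, here is a rough sketch in base R. The small k1 data frame and the temporary output directory are made up for illustration; the file name pattern follows the asker's paste("reslt", ...) convention.

```r
# Made-up stand-in for the asker's k1 (node name, degree).
k1 <- data.frame(V1 = c("a", "b", "c", "d"),
                 V2 = c(81, 81, 82, 83))

# One data frame per distinct degree; list names are the degree values.
groups <- split(k1, k1$V2)

# Hypothetical output location; replace with a real directory as needed.
out_dir <- tempdir()

for (deg in names(groups)) {
  out_file <- file.path(out_dir, paste0("reslt", deg, ".csv"))
  write.csv2(groups[[deg]], file = out_file, row.names = FALSE)
}
```

Because split only creates groups for degrees that actually occur, no separate degree list (ms) is needed in this sketch.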

How to resample and remodel n times by vectorization?

Here's my for-loop version of resampling and refitting the model:
B <- 999
n <- nrow(butterfly)
estMat <- matrix(NA, B+1, 2)
estMat[B+1,] <- model$coef
for (i in 1:B) {
resample <- butterfly[sample(1:n, n, replace = TRUE),]
re.model <- lm(Hk ~ inv.alt, resample)
estMat[i,] <- re.model$coef
}
I tried to avoid the for loop:
B <- 999
n <- nrow(butterfly)
resample <- replicate(B, butterfly[sample(1:n, replace = TRUE),], simplify = FALSE)
re.model <- lapply(resample, lm, formula = Hk ~ inv.alt)
re.model.coef <- sapply(re.model,coef)
estMat <- cbind(re.model.coef, model$coef)
It worked but didn't improve efficiency. Is there any way to vectorize this?
Sorry, I'm not quite familiar with StackOverflow. Here's the dataset butterfly.
colony alt precip max.temp min.temp Hk
pd+ss 0.5 58 97 16 98
sb 0.8 20 92 32 36
wsb 0.57 28 98 26 72
jrc+jrh 0.55 28 98 26 67
sj 0.38 15 99 28 82
cr 0.93 21 99 28 72
mi 0.48 24 101 27 65
uo+lo 0.63 10 101 27 1
dp 1.5 19 99 23 40
pz 1.75 22 101 27 39
mc 2 58 100 18 9
hh 4.2 36 95 13 19
if 2.5 34 102 16 42
af 2 21 105 20 37
sl 6.5 40 83 0 16
gh 7.85 42 84 5 4
ep 8.95 57 79 -7 1
gl 10.5 50 81 -12 4
(Assuming butterfly$inv.alt <- 1/butterfly$alt)
You get the error because resample is not a list of resampled data.frames, which you can obtain with:
resample <- replicate(B, butterfly[sample(1:n, replace = TRUE),], simplify = FALSE)
Then the following should work:
re.model <- lapply(resample, lm, formula = Hk ~ inv.alt)
To extract coefficients from a list of models, re.model$coef does not work. The correct paths to the coefficients are: re.model[[1]]$coef, re.model[[2]]$coef, .... You can get all of them with the following code:
re.model.coef <- sapply(re.model, coef)
Then you can combine them with the observed coefficients:
estMat <- cbind(re.model.coef, model$coef)
In fact, you can put all of them into replicate:
re.model.coef <- replicate(B, {
bf.rs <- butterfly[sample(1:n, replace = TRUE),]
coef(lm(formula = Hk ~ inv.alt, data = bf.rs))
})
estMat <- cbind(re.model.coef, model$coef)
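On the efficiency question, one option that may help (an assumption on my part, not something benchmarked here) is to skip the formula interface and call lm.fit on a prebuilt design matrix, which avoids rebuilding the model frame in every replicate. This is still a loop under the hood rather than true vectorization. The sketch below uses only the alt and Hk columns from the posted data and a smaller B to keep it quick.

```r
# alt and Hk columns copied from the dataset in the question.
butterfly <- data.frame(
  alt = c(0.5, 0.8, 0.57, 0.55, 0.38, 0.93, 0.48, 0.63, 1.5,
          1.75, 2, 4.2, 2.5, 2, 6.5, 7.85, 8.95, 10.5),
  Hk  = c(98, 36, 72, 67, 82, 72, 65, 1, 40,
          39, 9, 19, 42, 37, 16, 4, 1, 4)
)
butterfly$inv.alt <- 1 / butterfly$alt

set.seed(1)
B <- 199                      # smaller than 999, just for the sketch
n <- nrow(butterfly)

# Build the design matrix once instead of parsing the formula B times.
X <- cbind(Intercept = 1, inv.alt = butterfly$inv.alt)
y <- butterfly$Hk

boot.coef <- replicate(B, {
  idx <- sample.int(n, replace = TRUE)         # resampled row indices
  lm.fit(X[idx, , drop = FALSE], y[idx])$coefficients
})

# Observed coefficients from the usual lm call, appended as the last column.
obs.coef <- coef(lm(Hk ~ inv.alt, data = butterfly))
estMat <- cbind(boot.coef, obs.coef)
dim(estMat)
```

Note that estMat here has one column per replicate (2 rows), as in the answer above; use t(estMat) to get the row-per-replicate layout of the original loop.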

Flip Every Nth Coin in R [duplicate]

My friend gave me a brain teaser that I wanted to try on R.
Imagine 100 coins in a row, with heads facing up for all coins. Now every 2nd coin is flipped (thus becoming tails). Then every 3rd coin is flipped. How many coins are now showing heads?
To create the vector, I started with:
flips <- rep('h', 100)
levels(flips) <- c("h", "t")
Not sure how to proceed from here. Any help would be appreciated.
Try this:
coins <- rep(1, 100) # 1 = Head, 0 = Tail
n = 3 # run till the time when you flip every 3rd coin
invisible(sapply(2:n, function(i) {indices <- seq(i, 100, i); coins[indices] <<- (coins[indices] + 1) %% 2}) )
which(coins == 1)
# [1] 1 5 6 7 11 12 13 17 18 19 23 24 25 29 30 31 35 36 37 41 42 43 47 48 49 53 54 55 59 60 61 65 66 67 71 72 73 77 78 79 83 84 85 89 90 91 95 96 97
sum(coins==1)
#[1] 49
If you run till n = 100, only the coins at the positions which are perfect squares will be showing heads.
coins <- rep(1, 100) # 1 = Head, 0 = Tail
n <- 100
invisible(sapply(2:n, function(i) {indices <- seq(i, 100, i); coins[indices] <<- (coins[indices] + 1) %% 2}) )
which(coins == 1)
# [1] 1 4 9 16 25 36 49 64 81 100
sum(coins==1)
# [1] 10
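The same procedure can also be sketched with a logical vector and a plain for loop, which avoids the global assignment (<<-). The function name flip_coins and its arguments are made up for this illustration.

```r
# TRUE = heads, FALSE = tails. Flips every i-th coin for i = 2, ..., n_max.
# Assumes 2 <= n_max <= n_coins.
flip_coins <- function(n_coins = 100, n_max = 3) {
  coins <- rep(TRUE, n_coins)
  for (i in 2:n_max) {
    idx <- seq(i, n_coins, by = i)   # positions i, 2i, 3i, ...
    coins[idx] <- !coins[idx]        # flip those coins
  }
  coins
}

sum(flip_coins(100, 3))      # 49 heads after flipping every 2nd and 3rd coin
which(flip_coins(100, 100))  # 1 4 9 16 25 36 49 64 81 100
```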

How to fit the dual exponential or double exponential in R language

I have some data to fit. The function has the form:
y = a*exp(b*x) + c*exp(d*x),
where "a", "b", "c" and "d" are the coefficients.
I want to use the gnm package to fit the dual exponential function. However, the result does not seem good.
Is there another package that can do this?
Can Java or another language do it?
library(gnm);
data = read.table("F:\\AP\\R\\data.txt", header = T);
x <- data$X1;
y <- data$Y1;
set.seed(1);
saved.fits <- list();
for(i in 1:60){
saved.fits[[i]] <- suppressWarnings(gnm(y ~ Exp(1+x, inst = 1)+ Exp(1+x, inst =2),verbose = FALSE))
}
table(round(unlist(sapply(saved.fits, deviance)), 4))
X1:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114
Y1:
9514.833 9463.002 9386.4277 9320.292 9252.0957 9187.4775 9122.5947 9068.1279 9013.3232 8930.418
8875.416 8789.1973 8727.9355 8649.0547 8600.0693 8529.3359 8490.0801 8421.7842 8371.4688 8303.041
8256.1719 8193.1416 8159.1553 8091.3022 8028.9263 7966.6748 7893.2056 7819.4702 7710.0962 7613.6069
5609.2266 5573.5923 5537.665 5501.6279 5477.5825 5450.0518 5435.9521 5402.5327 5379.1743 5348.1226
5320.5049 5282.2158 5263.5146 5236.125 5216.4038 5188.0493 5170.293 5142.6416 5114.8125 5087.1606
5059.5898 5032.0352 5001.5537 4979.8364 4951.5854 4932.1138 4903.7363 4888.1841 4869.7168 4854.7617
I also have a question about whether I can use Matlab instead of R on the web server, because I have to process some electrical signals with methods such as filtering, smoothing, fitting and so on. But I am afraid that the Matlab library will crash when the number of concurrent tasks increases.
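For the model y = a*exp(b*x) + c*exp(d*x), one possible alternative to gnm is nls from base R's stats package. The sketch below is only an illustration on synthetic data with made-up true values and starting values; it is not fitted to the data posted above, and sums of exponentials are often sensitive to the starting values, so convergence on real data is not guaranteed.

```r
set.seed(42)

# Synthetic data (made up): two well-separated decay rates plus small noise.
x <- seq(0, 50, by = 0.5)
y <- 5 * exp(-0.5 * x) + 2 * exp(-0.05 * x) + rnorm(length(x), sd = 0.01)

# Nonlinear least squares; the starting values are rough guesses near the truth.
fit <- nls(y ~ a * exp(b * x) + c * exp(d * x),
           start = list(a = 4, b = -0.4, c = 1.5, d = -0.04))

est <- coef(fit)   # named vector with elements a, b, c, d
est
```

If the two rates are close to each other or the data cover only a short range, the four parameters may be poorly identified, and a single exponential may describe the data about as well.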

R error type "Subscript out of bounds"

I am simulating a correlation matrix, where the 60 variables correlate in the following way:
more highly (0.6) for every two variables (1-2, 3-4... 59-60)
moderate (0.3) for every group of 12 variables (1-12,13-24...)
mc <- matrix(0,60,60)
diag(mc) <- 1
for (c in seq(1,59,2)){ # every pair of variables in order are given 0.6 correlation
mc[c,c+1] <- 0.6
mc[c+1,c] <- 0.6
}
for (n in seq(1,51,10)){ # every group of 12 are given correlation of 0.3
for (w in seq(12,60,12)){ # these are variables 11-12, 21-22 and such.
mc[n:n+1,c(n+2,w)] <- 0.2
mc[c(n+2,w),n:n+1] <- 0.2
}
}
for (m in seq(3,9,2)){ # every group of 12 are given correlation of 0.3
for (w in seq(12,60,12)){ # these variables are the rest.
mc[m:m+1,c(1:m-1,m+2:w)] <- 0.2
mc[c(1:m-1,m+2:w),m:m+1] <- 0.2
}
}
The first loop works well, but not the second and third ones. I get this error message:
Error in `[<-`(`*tmp*`, m:m + 1, c(1:m - 1, m + 2:w), value = 0.2) :
subscript out of bounds
Error in `[<-`(`*tmp*`, m:m + 1, c(1:m - 1, m + 2:w), value = 0.2) :
subscript out of bounds
I would really appreciate any hints, since I don't see the loop commands get to exceed the matrix dimensions. Thanks a lot in advance!
Note that : takes precedence over +. E.g., n:n+1 is the same as n+1. I guess you want n:(n+1).
The maximal value of w is 60:
w <- 60
m <- 1
m+2:w
#[1] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
#[49] 51 52 53 54 55 56 57 58 59 60 61
And 61 is out of bounds. You need to add a lot of parentheses.
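A short sketch of how the unparenthesized expressions from the question are parsed, next to the parenthesized versions that were probably intended (the values of n, m and w are picked arbitrarily for illustration):

```r
n <- 3
m <- 3
w <- 12

r1 <- n:n + 1      # parsed as (n:n) + 1, a single value: 4
r2 <- n:(n + 1)    # the two indices 3 and 4

r3 <- 1:m - 1      # parsed as (1:m) - 1, giving 0 1 2 (includes index 0)
r4 <- 1:(m - 1)    # 1 2

r5 <- m + 2:w      # parsed as m + (2:w), giving 5 ... 15 (runs past w)
r6 <- (m + 2):w    # 5 ... 12
```

With w = 60, the unparenthesized m + 2:w reaches m + 60, which is what pushes the subscript past the 60 columns of the matrix.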
