Calculating network indexes for 1000 simulated null matrices in R - r

I am trying to calculate network indexes (clustering, modularity, edge density, degree, centrality etc) from 1000 simulated null matrices using the igraph package in R. The data I'm using are mixed-species bird flock data, which I've used to generate the null matrices.
Here's the code:
## Construct null matrices ##
library(EcoSimR)
library(igraph)
# creating a 1000 empty matrices
fl_emp <- lapply(1:1000, function(i) data.frame())
# simulating 1000 matrices by randomization
fl_wp_n <- replicate(1000, sim5(fl_wp[,3:ncol(fl_wp)]),simplify = FALSE) #fl_wp is the raw data
#sim5 function is from the package 'EcoSimR'
for(i in 1:length(fl_emp))
{
fl_wp_ig <- graph_from_incidence_matrix(fl_wp_n[[i]]) #Creating new igraph object to convert the null matrices to igraph objects to calculate network indexes
fl_wp_cw <- cluster_walktrap(fl_wp_ig[[i]])
fl_wp_mod <- modularity(fl_wp_cw[[i]]) ##Network index, this does not work
}
Here's what the simulated matrices (fl_wp_n) look like:
[1]: https://i.stack.imgur.com/1Q0Na.png
It is basically a list of 1000 elements, where each element is a simulated 133x74 matrix where the rows represent flock ID and the columns represent Species ID.
This is the error I'm getting when I run the loop:
> for(i in 1:length(fl_emp))
+ {
+ fl_wp_ig <- graph_from_incidence_matrix(fl_wp_n[[i]])
+ fl_wp_cw <- cluster_walktrap(fl_wp_ig[[i]])
+ fl_wp_mod <- modularity(fl_wp_cw[[i]])
+ }
Error in cluster_walktrap(fl_wp_ig[[i]]) : Not a graph object!
It seems to be not recognizing fl_wp_ig as an igraph object. Any idea why?
Is there a better way to calculate indices for 1000 matrices in one loop?
Sorry if this is a dumb question, I'm new to igraph and R in general
Thanks a lot in advance!

If you have a look at the documentation for cluster_walktrap, you will see the function expects a graph object. As @Szabolcs pointed out, when you index fl_wp_ig[[i]] in the for-loop, you get the vertices adjacent to vertex i, not the graph itself. You should only index fl_wp_n[[i]], because that is the list you want to take one matrix from on each iteration; the other variables each hold a single object and should be passed whole.
So you could try:
list_outputs <- list()
for (i in seq_along(fl_wp_n))
{
# fl_wp_n[[i]] gets 1 matrix each iteration. Output -> graph object
fl_wp_ig <- graph_from_incidence_matrix(fl_wp_n[[i]])
# Use the whole graph object fl_wp_ig
fl_wp_cw <- cluster_walktrap(fl_wp_ig)
# Use the whole fl_wp_cw output
fl_wp_mod <- modularity(fl_wp_cw)
# NOTE: your original loop did not store the result of each iteration,
# it overwrote fl_wp_mod every time.
# Here an empty list is created before the for-loop and filled as we go.
list_outputs[[i]] <- fl_wp_mod
}
Also, if you find it difficult to see the whole picture, you could try to create a custom function and use apply methods instead of a for-loop.
# Custom function
cluster_modularity = function(null_matrix){
# takes only one matrix at a time
fl_wp_ig <- graph_from_incidence_matrix(null_matrix)
fl_wp_cw <- cluster_walktrap(fl_wp_ig)
# return the modularity value explicitly
modularity(fl_wp_cw)
}
# Iterate using lapply to store the outputs in a list - for example
list_outputs = lapply(fl_wp_n, cluster_modularity)
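The same pattern extends to several indexes at once. The following is only a sketch under stated assumptions: null_mats is a hypothetical stand-in for fl_wp_n (small random 0/1 matrices, not EcoSimR output), network_indices is an illustrative helper name, and the three indexes shown are just examples. Recent igraph releases may warn that graph_from_incidence_matrix has been renamed.

```r
library(igraph)

set.seed(1)
# Hypothetical stand-in for fl_wp_n: 5 random 0/1 flock-by-species matrices
null_mats <- replicate(5, matrix(rbinom(20 * 8, 1, 0.3), nrow = 20),
                       simplify = FALSE)

# Illustrative helper: several indexes for ONE matrix, as a named vector
network_indices <- function(m) {
  g  <- graph_from_incidence_matrix(m)  # bipartite graph from one matrix
  cw <- cluster_walktrap(g)             # community detection on that graph
  c(modularity  = modularity(cw),
    density     = edge_density(g),
    mean_degree = mean(degree(g)))
}

# One column per matrix from vapply; transpose to get one row per matrix
index_table <- as.data.frame(t(vapply(
  null_mats, network_indices,
  c(modularity = 0, density = 0, mean_degree = 0)
)))
```

Each row of index_table then corresponds to one null matrix, which makes it straightforward to compare the observed value of an index against its null distribution.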

Related

Sampling with list of rasters and locations using function in R

I am sweating over this piece of code. I previously received help building it here. In short, I have a list of three rasters that I sample randomly, once for each sample size in numberv. The output is therefore a list of four lists, each with three elements (one per raster). After I obtain the random point locations, I take the raster value at each location.
The problem I want to solve is this: I would like to take the second set of sample locations, i.e. sample.set[[1]][2], and obtain the raster value from rasters[1]. Then I would like to take sample.set[[1]][3] and obtain the raster value from rasters[2]. Then sample.set[[2]][2] with rasters[1], sample.set[[2]][3] with rasters[2], and so on. The result would be a list of 4 lists, each with 2 elements holding the sample xy values (locations) and the value from the previous raster.
Help will be much appreciated.
y <- matrix(1:150,50,3)
mv <- c(1,2,3)
rep = 20
valuematrix <- vector("list",ncol(y))
for (i in 1:ncol(y)) {
newmatrix <- replicate(rep,y[,i])
valuematrix[[i]] <- newmatrix
}
library(sp)
library(raster)
rasters <- setNames(lapply(valuematrix, function(x) raster(x)),
paste0('raster',1:length(mv)))
# Create a loop that will sample the rasters
library(dismo)
numberv = c(10,12,14,16) # sample number vector
# Function to sample using a given number (returns list of three)
sample.number <- function(x) {
rps <- lapply(rasters, function(y) randomPoints(raster(y),n=x))
setNames(rps,paste0('sample',1:length(mv)))
}
# Apply sample.number() to your numberv list
sample.set <- lapply(numberv,sample.number)
# Function to extract values from a given sample
sample.extract <- function(x) {
lapply(1:length(x),function(y) data.frame(x[[y]],
extract(rasters[[y]],x[[y]])))
}
# Apply sample.extract() to the set of samples (returns list of four lists)
sample.values <- lapply(sample.set,sample.extract)
Now I would like to use the sample values from the second element of the list sample.set to sample the 1st raster in the list rasters. I tried this, but with no success:
sample.extract.prev <- function(x) {
lapply(1:length(x),function(y) data.frame(x[[y]],
extract(rasters[[y]],x+1[[y]])))
}
sample.values.prev <- lapply(sample.set,sample.extract.prev)
Managed to solve this (big high five to myself ;)
Unfortunately I managed to do it with a loop, would be great to see an example of a function.
samplevaluesnext <- vector("list",length(sample.set))
## Look up values
for (j in 1:length(sample.set)) {
for (i in 1:(length(rasters)-1)) {
samplevaluesnext[[j]][[i]] <- data.frame(sample.set[[j]][[i+1]],
extract(rasters[[i]],
as.data.frame(sample.set[[j]][i+1])))
}
}
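For readers who want the function-based version of this loop, here is a minimal sketch, assuming the raster package is installed. The toy rasters and the random_xy helper are hypothetical stand-ins for the rasters and the dismo::randomPoints output in the question, and sample.extract.next is an illustrative name.

```r
library(raster)

set.seed(42)
# Toy stand-ins (hypothetical): three 10 x 10 rasters on the default 0..1 extent
rasters <- lapply(1:3, function(k) raster(matrix(runif(100) + k, 10, 10)))

# Hypothetical replacement for dismo::randomPoints(): n random xy locations
random_xy <- function(n) cbind(x = runif(n), y = runif(n))
sample.set <- lapply(c(10, 12),
                     function(n) lapply(1:3, function(k) random_xy(n)))

# For one set of samples, pair the locations of sample i + 1 with raster i
sample.extract.next <- function(samples) {
  lapply(seq_len(length(rasters) - 1), function(i) {
    pts <- samples[[i + 1]]
    data.frame(pts, value = raster::extract(rasters[[i]], pts))
  })
}

# Apply to every set of samples (one list of two data frames per sample size)
samplevaluesnext <- lapply(sample.set, sample.extract.next)
```

The inner lapply replaces the loop over rasters and the outer lapply replaces the loop over sample sets, so no list needs to be pre-allocated.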

Create an edge list from co-authorship network in R

I am trying to create an edge list with igraph for a co-authorship network analysis project. My data is stored so that all authors of a specific paper are listed in one row, meaning that every paper is an observation and the columns contain the authors of that paper.
Is it possible to use the combn function to create an edge list of every combination of authors within each paper?
I guess you will have to do it paper by paper, but you can put the results together using do.call('c',...)
library(utils)
library(igraph) ## needed for graph() and tkplot() below
## original data as a list
data.in = list(c(1,2,3),c(4,5),c(3),c(1,4,6))
## function that makes all pairs
f.pair.up <- function(x) {
n = length(x)
if (n<2) {
res <- NULL
} else {
q <- combn(n,2)
x2 <- x[q]
#dim(x2) <- dim(q)
res <- x2
}
return(res)
}
## for each paper create all author pairs (as a flat vector)
data.pairs.bypaper = lapply(data.in,f.pair.up)
## remove papers that contribute no edges
data.pairs.noedge = sapply(data.pairs.bypaper,is.null)
data.pairs2.bypaper <- data.pairs.bypaper[!data.pairs.noedge]
## combine all 'subgraphs'
data.pairs.all <- do.call('c',data.pairs2.bypaper)
## how many authors are there?
n.authors <- length(unique(data.pairs.all))
## make a graph
my.graph = graph(data.pairs.all,directed=FALSE,n=n.authors)
## plot it
tkplot(my.graph)
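A variation on the same idea keeps the pairs as a two-column edge list instead of a flat vector. This is a minimal base R sketch; the author labels and the names papers and pairs_for_paper are made up for illustration.

```r
# Hypothetical input: one character vector of authors per paper
papers <- list(c("A", "B", "C"), c("D", "E"), c("C"), c("A", "D", "F"))

# All unordered author pairs for one paper, one pair per row
pairs_for_paper <- function(authors) {
  if (length(authors) < 2) return(NULL)  # single-author papers add no edges
  t(combn(authors, 2))
}

# rbind() skips the NULL entries, so no separate filtering step is needed
edge_list <- do.call(rbind, lapply(papers, pairs_for_paper))
colnames(edge_list) <- c("from", "to")
edge_list
```

A character matrix like this can be handed to igraph's graph_from_edgelist(edge_list, directed = FALSE), which uses the labels as vertex names. Authors who appear together on several papers will show up as repeated rows.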

Running the same function multiple times and saving results with different names in workspace

So, I built a function called sort.song.
My goal with this function is to randomly sample the rows of a data.frame (DATA) and then filter it (DATA.NEW) to analyse it. I want to do this multiple times (let's say 10 times). In the end, I want each object (mantel.something) produced by this function to be saved in my workspace with a name that I can relate to each cycle (mantel.something1, mantel.something2...mantel.something10).
I have the following code, so far:
sort.song<-function(DATA){
require(ade4)
for(i in 1:10){ # Am I using for correctly here?
DATA.NEW <- DATA[sample(1:nrow(DATA),replace=FALSE),]
DATA.NEW <- DATA.NEW[!duplicated(DATA.NEW$Point),]
coord.dist<-dist(DATA.NEW[,4:5],method="euclidean")
num.notes.dist<-dist(DATA.NEW$Num_Notes,method="euclidean")
songdur.dist<-dist(DATA.NEW$Song_Dur,method="euclidean")
hfreq.dist<-dist(DATA.NEW$High_Freq,method="euclidean")
lfreq.dist<-dist(DATA.NEW$Low_Freq,method="euclidean")
bwidth.dist<-dist(DATA.NEW$Bwidth_Song,method="euclidean")
hfreqlnote.dist<-dist(DATA.NEW$HighFreq_LastNote,method="euclidean")
mantel.numnotes[i]<<-mantel.rtest(coord.dist,num.notes.dist,nrepet=1000)
mantel.songdur[i]<<-mantel.rtest(coord.dist,songdur.dist,nrepet=1000)
mantel.hfreq[i]<<-mantel.rtest(coord.dist,hfreq.dist,nrepet=1000)
mantel.lfreq[i]<<-mantel.rtest(coord.dist,lfreq.dist,nrepet=1000)
mantel.bwidth[i]<<-mantel.rtest(coord.dist,bwidth.dist,nrepet=1000)
mantel.hfreqlnote[i]<<-mantel.rtest(coord.dist,hfreqlnote.dist,nrepet=1000)
}
}
Could someone please help me to do it the right way?
I think I'm not assigning the cycles correctly for each mantel.something object.
Many thanks in advance!
The best way to implement what you are trying to do is through a list. You can even make it take two indices, the first for the iterations, the second for the type of analysis.
mantellist <- as.list(1:10) ## initiate list with some values
for (i in 1:10){
...
mantellist[[i]] <- list(numnotes=mantel.rtest(coord.dist,num.notes.dist,nrepet=1000),
songdur=mantel.rtest(coord.dist,songdur.dist,nrepet=1000),
hfreq=mantel.rtest(coord.dist,hfreq.dist,nrepet=1000),
...)
}
return(mantellist)
In this way you can index your specific analysis for each iteration in an intuitive way:
mantellist[[2]][['hfreq']]
mantellist[[2]]$hfreq ## alternative
EDIT by Mohr:
Just for clarification...
So, according to your suggestion the code should be something like this:
sort.song<-function(DATA){
require(ade4)
mantellist <- as.list(1:10)
for(i in 1:10){
DATA.NEW <- DATA[sample(1:nrow(DATA),replace=FALSE),]
DATA.NEW <- DATA.NEW[!duplicated(DATA.NEW$Point),]
coord.dist<-dist(DATA.NEW[,4:5],method="euclidean")
num.notes.dist<-dist(DATA.NEW$Num_Notes,method="euclidean")
songdur.dist<-dist(DATA.NEW$Song_Dur,method="euclidean")
hfreq.dist<-dist(DATA.NEW$High_Freq,method="euclidean")
lfreq.dist<-dist(DATA.NEW$Low_Freq,method="euclidean")
bwidth.dist<-dist(DATA.NEW$Bwidth_Song,method="euclidean")
hfreqlnote.dist<-dist(DATA.NEW$HighFreq_LastNote,method="euclidean")
mantellist[[i]] <- list(numnotes=mantel.rtest(coord.dist,num.notes.dist,nrepet=1000),
songdur=mantel.rtest(coord.dist,songdur.dist,nrepet=1000),
hfreq=mantel.rtest(coord.dist,hfreq.dist,nrepet=1000),
lfreq=mantel.rtest(coord.dist,lfreq.dist,nrepet=1000),
bwidth=mantel.rtest(coord.dist,bwidth.dist,nrepet=1000),
hfreqlnote=mantel.rtest(coord.dist,hfreqlnote.dist,nrepet=1000)
)
}
return(mantellist)
}
You can achieve your objective of repeating this exercise 10 (or more) times without using an explicit for-loop. Rather than have the function run the loop, write the sort.song function to run one iteration of the process, then use replicate to repeat that process however many times you desire.
It is generally good practice not to create a bunch of named objects in your global environment. Instead, you can hold the results of each iteration of this process in a single object. replicate will return an array (if possible), otherwise a list (in the example below, a list of lists). So, the list will have 10 elements (one for each iteration) and each element will itself be a list containing named elements corresponding to each result of mantel.rtest.
sort.song<-function(DATA){
DATA.NEW <- DATA[sample(1:nrow(DATA),replace=FALSE),]
DATA.NEW <- DATA.NEW[!duplicated(DATA.NEW$Point),]
coord.dist <- dist(DATA.NEW[,4:5],method="euclidean")
num.notes.dist <- dist(DATA.NEW$Num_Notes,method="euclidean")
songdur.dist <- dist(DATA.NEW$Song_Dur,method="euclidean")
hfreq.dist <- dist(DATA.NEW$High_Freq,method="euclidean")
lfreq.dist <- dist(DATA.NEW$Low_Freq,method="euclidean")
bwidth.dist <- dist(DATA.NEW$Bwidth_Song,method="euclidean")
hfreqlnote.dist <- dist(DATA.NEW$HighFreq_LastNote,method="euclidean")
return(list(
numnotes = mantel.rtest(coord.dist,num.notes.dist,nrepet=1000),
songdur = mantel.rtest(coord.dist,songdur.dist,nrepet=1000),
hfreq = mantel.rtest(coord.dist,hfreq.dist,nrepet=1000),
lfreq = mantel.rtest(coord.dist,lfreq.dist,nrepet=1000),
bwidth = mantel.rtest(coord.dist,bwidth.dist,nrepet=1000),
hfreqlnote = mantel.rtest(coord.dist,hfreqlnote.dist,nrepet=1000)
))
}
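require(ade4)
# simplify = FALSE keeps the result as a list of lists; the default
# would collapse the equal-length lists into a list-matrix instead
replicate(10, sort.song(DATA), simplify = FALSE)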
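To see what the list-of-lists result looks like and how to pull one statistic out across runs, here is a small base R sketch. The toy data frame and the one_run function are hypothetical stand-ins for DATA and sort.song; no Mantel test is run.

```r
set.seed(123)

# Hypothetical data: 5 points with 3 recordings each
toy <- data.frame(Point = rep(1:5, each = 3), Num_Notes = 1:15)

# One iteration: shuffle rows, keep the first recording per point, summarise
one_run <- function(d) {
  s <- d[sample(nrow(d)), ]
  s <- s[!duplicated(s$Point), ]
  list(n_points = nrow(s), mean_notes = mean(s$Num_Notes))
}

# simplify = FALSE guarantees a plain list with one element per iteration
runs <- replicate(10, one_run(toy), simplify = FALSE)

runs[[2]]$mean_notes  # result of one analysis in the second iteration

# Collect the same statistic from every iteration into a numeric vector
mean_notes <- vapply(runs, function(r) r$mean_notes, numeric(1))
```

With the real function, the same vapply pattern could extract, for example, the observed statistic or p-value from each mantel.rtest result, depending on which components that object exposes.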

using interp1 in R for matrix

I am trying to use the interp1 function in R for linearly interpolating a matrix without using a for loop. So far I have tried:
bthD <- c(0,2,3,4,5) # original depth vector
bthA <- c(4000,3500,3200,3000,2800) # original array of area
Temp <- c(4.5,4.2,4.2,4,5,5,4.5,4.2,4.2,4)
Temp <- matrix(Temp,2) # matrix for temperature measurements
# -- interpolating bathymetry data --
depthTemp <- c(0.5,1,2,3,4)
layerZ <- seq(depthTemp[1],depthTemp[5],0.1)
library(signal)
layerA <- interp1(bthD,bthA,layerZ);
# -- interpolate= matrix --
layerT <- list()
for (i in 1:2){
t <- Temp[i,]
layerT[[i]] <- interp1(depthTemp,t,layerZ)
}
layerT <- do.call(rbind,layerT)
So, here I have used interp1 on each row of the matrix in a for loop. I would like to know how I could do this without using a for loop. I can do this in matlab by transposing the matrix as follows:
layerT = interp1(depthTemp,Temp',layerZ)'; % matlab code
but when I attempt to do this in R
layerT <- interp1(depthTemp,t(Temp),layerZ)
it does not return a matrix of interpolated results, but a numeric array. How can I ensure that R returns a matrix of the interpolated values?
There is nothing wrong with your approach; I would probably avoid the intermediate t <- assignment.
If you want to feel R-ish, try
apply(Temp,1,function(t) interp1(depthTemp,t,layerZ))
You may have to wrap the whole call in t() (transpose) if you really need it in the original orientation.
Since this is a 3d-field, per-row interpolation might not be optimal. My favorite is interp.loess in package tgp, but for regular spacings other options might be available. That method does not work for your mini-example (which is fine for the question), because it requires a larger grid.
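The row-wise pattern can be checked end to end without extra packages. The sketch below swaps in base R's approx for signal::interp1 (a substitution made here for self-containment, not what the answer uses); for linear interpolation inside the data range the two should agree.

```r
depthTemp <- c(0.5, 1, 2, 3, 4)
Temp <- matrix(c(4.5, 4.2, 4.2, 4, 5, 5, 4.5, 4.2, 4.2, 4), 2)
layerZ <- seq(depthTemp[1], depthTemp[5], 0.1)

# apply() returns one COLUMN per input row, so transpose to get one
# interpolated profile per row, matching the MATLAB layout
layerT <- t(apply(Temp, 1, function(profile) {
  approx(depthTemp, profile, xout = layerZ)$y  # linear interpolation
}))

dim(layerT)  # rows = profiles, columns = depths in layerZ
```

The transpose is the only difference from the one-liner in the answer; without it the result has one column per temperature profile.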

make loop to create list of igraph objects in R

I'd like to create a list of igraph objects, with the data used for each igraph object determined by another variable.
This is how I create a single igraph object
netEdges <- NULL
for (idi in c("nom1", "nom2", "nom3")) {
netEdge <- net[c("id", idi)]
names(netEdge) <- c("id", "friendID")
netEdge$weight <- 1
netEdges <- rbind(netEdges, netEdge)
}
g <- graph.data.frame(netEdges, directed=TRUE)
For each unique value of net$community I'd like to make a new igraph object. Then I would like to calculate measures of centrality for each object and bring those measures back into my net dataset. Many thanks for your help!
Since the code you provide isn't completely reproducible, what follows is not guaranteed to run. It is intended as a guide for how to structure a real solution. If you provide example data that others can use to run your code, you will get better answers.
The simplest way to do this is probably to split net into a list with one element for each unique value of community and then apply your graph building code to each piece, storing the results for each piece in another list. There are several ways of doing this type of thing in R, one of which is to use lapply:
#Break net into pieces based on unique values of community
netSplit <- split(net,net$community)
#Define a function to apply to each element of netSplit
myFun <- function(dataPiece){
netEdges <- NULL
for (idi in c("nom1", "nom2", "nom3")) {
netEdge <- dataPiece[c("id", idi)]
names(netEdge) <- c("id", "friendID")
netEdge$weight <- 1
netEdges <- rbind(netEdges, netEdge)
}
g <- graph.data.frame(netEdges, directed=TRUE)
#This will return the graph itself; you could change the function
# to return other values calculated on the graph
g
}
#Apply your function to each subset (piece) of your data:
result <- lapply(netSplit,FUN = myFun)
If all has gone well, result should be a list containing a graph (or whatever you modified myFun to return) for each unique value of community. Other popular tools for doing similar tasks include ddply from the plyr package.
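The question also asks for centrality measures to be brought back into net. One way this might look is sketched below, assuming igraph is installed. The toy net data frame, the two nomination columns, and the helper name centrality_for are all hypothetical, and the sketch assumes every nomination is non-missing and stays within the community.

```r
library(igraph)

# Hypothetical toy data: two communities of three people, two nominations each
net <- data.frame(
  id        = c("a", "b", "c", "d", "e", "f"),
  community = c(1, 1, 1, 2, 2, 2),
  nom1      = c("b", "c", "a", "e", "f", "d"),
  nom2      = c("c", "a", "b", "f", "d", "e"),
  stringsAsFactors = FALSE
)

# Build the graph for one community and return per-vertex measures
centrality_for <- function(piece) {
  edges <- do.call(rbind, lapply(c("nom1", "nom2"), function(col) {
    data.frame(id = piece$id, friendID = piece[[col]], weight = 1)
  }))
  g <- graph_from_data_frame(edges, directed = TRUE)
  data.frame(id          = V(g)$name,
             indegree    = degree(g, mode = "in"),
             betweenness = betweenness(g))
}

# One table of measures per community, stacked, then joined back by id
measures <- do.call(rbind, lapply(split(net, net$community), centrality_for))
net <- merge(net, measures, by = "id")
```

Returning a data frame keyed by id from each piece is what makes the final merge possible; returning the graph itself, as in the answer above, is the better choice when the graphs are needed later.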
