R function for manipulative field experiment design - Plant ecology

I am a plant ecologist and I would like to design a field manipulative experiment. To achieve a good degree of randomisation in the disposition of plants inside experimental plots, I would like to use R (with which I'm familiar).
The picture is a schematisation of the design: in total there would be 6 plots, each containing 24 different plant species. In total I have 36 plant species, from which I would like to randomly sample 24 species per plot (with high variation between plots). Each species should be present in 4 out of the 6 plots but in different positions, twice along the border (in two of the warm colours) and twice in the core area (once per cold colour); in the picture I visually explain what I mean using 4 different species (A, B, C, D).
Could anyone suggest a way to do it or some insight to write the function?

I borrowed some insight from this thread and this thread. Basically, you make a circular matrix and shuffle groups.
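# Grid coordinates (row, column) of the 24 positions in a 6 x 4 plot,
# listed as six consecutive groups of four positions
coords <- list(c(1,1),c(1,2),c(2,1),c(3,1),
               c(1,3),c(1,4),c(2,4),c(3,4),
               c(2,2),c(3,2),c(2,3),c(3,3),
               c(4,1),c(5,1),c(6,1),c(6,2),
               c(4,2),c(5,2),c(4,3),c(5,3),
               c(4,4),c(5,4),c(6,3),c(6,4))
# 36 x 36 circular matrix of the 36 species labels (A-Z, 0-9): each row is the
# previous one shifted by one position (matrix() may warn about recycling here)
Matrix <- matrix(c(LETTERS,0:9)[1:36][matrix(1:36,36+1,36+1,byrow=TRUE)[c(1,36:2),1:36]],36,36)
# Keep every 6th row and the first 24 columns (one row per plot),
# then shuffle the columns within each group of four positions
PlotLayouts <- Matrix[(1:6*6),1:24][,unlist(lapply(split(1:24,rep(1:6,each=4)),sample,4))]
# Split into one vector per plot, assigning the plot numbers at random
PlotLayouts <- split(PlotLayouts,sample(1:6,6))
# Place each plot's 24 species on the 6 x 4 grid, following coords
Result <- lapply(PlotLayouts,function(Vector){
  Layout <- matrix(NA,nrow=6,ncol=4)
  for(i in 1:24){
    Layout[coords[[i]][1],coords[[i]][2]] <- Vector[i]
  }
  Layout
})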
#Species Counts
table(unlist(Result))
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Result
$`1`
[,1] [,2] [,3] [,4]
[1,] "I" "H" "M" "L"
[2,] "K" "R" "S" "N"
[3,] "J" "P" "Q" "O"
[4,] "U" "Z" "0" "2"
[5,] "V" "Y" "X" "1"
[6,] "T" "W" "4" "3"
$`2`
[,1] [,2] [,3] [,4]
[1,] "0" "Z" "4" "3"
[2,] "2" "9" "A" "5"
[3,] "1" "7" "8" "6"
[4,] "C" "H" "I" "K"
[5,] "D" "G" "F" "J"
[6,] "B" "E" "M" "L"
...
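The "circular" idea can also be written with modular arithmetic, which may be easier to follow. The sketch below is not the code above, only an illustration of the same principle: the 36 species sit on a circle and each plot takes 24 consecutive species, starting 6 positions further along than the previous plot. It checks only the species counts (24 per plot, each species in 4 of the 6 plots); it does not place species on the grid or check the border/core requirement. The names `species` and `plots` are made up for this example.

```r
species  <- c(LETTERS, 0:9)   # 36 labels: A-Z and 0-9
n_plots  <- 6
per_plot <- 24
step     <- length(species) / n_plots   # start offset between plots (6)

# Plot k holds 24 consecutive species on the circle, starting at (k - 1) * step
plots <- lapply(seq_len(n_plots), function(k) {
  idx <- ((k - 1) * step + seq_len(per_plot) - 1) %% length(species) + 1
  species[idx]
})

# Every species should appear in exactly 4 of the 6 plots
counts <- table(unlist(plots))
all(counts == 4)
```

Shuffling positions within each plot (as the answer does with `sample`) does not change these counts, because it only reorders species inside a plot.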

Related

give names to values in a matrix in R

I'm not sure whether this is possible, but it seems like it should be. I want a matrix whose elements have names, just as you can do with a vector:
v = 1:10
names(v) = LETTERS[1:10]
result:
A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10
I've tried to create a matrix and use the same syntax:
m = matrix(v, ncol=5, nrow=2)
names(m) = letters[1:10]
but the result is not what I hoped for.
result:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
attr(,"names")
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
I don't want the values and the names to be two separate entities. Is there a way to do this without any libraries, or at all?
Thank you
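For what it's worth, here is a minimal base R sketch of two things that might cover this need; it is only one possible reading of the question. The `names` attribute printed above is still usable for looking up a cell by name, and a second character matrix of the same shape (called `labels` here, a name made up for this example) can be printed next to the values.

```r
v <- 1:10
m <- matrix(v, nrow = 2, ncol = 5)
names(m) <- letters[1:10]

# A cell can be looked up by its name, exactly as with a named vector
m[["c"]]

# A character matrix with the same shape shows which name sits where
labels <- matrix(names(m), nrow = nrow(m), ncol = ncol(m))
labels
```

If only rows and columns need names rather than individual cells, `dimnames(m)` is the usual base R tool.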

Creating an edgelist from Patent data in R

I am trying to create an edgelist out of patent data of the form:
PatentID   InventorIDs   CoinventorIDs
1          A ; B         C,D,E ; F,G,H,C
2          J ; K ; L     M,O ; N ; P, Q
What I would like is the edgelist below showing the connections between inventors and patents. (the semicolons separate the coinventors associated with each primary inventor):
1 A B
1 A C
1 A D
1 A E
1 B F
1 B G
1 B H
1 B C
2 J K
2 J L
2 J M
2 J O
2 K N
2 L P
2 L Q
Is there an easy way to do this with igraph in R?
I'm confused by the edges going between the InventorIDs, but here is a brute-force function that you can apply by row. igraph is a massive library, so it may offer a better way, but once the data is in this form it should be simple to convert it to an igraph data structure.
Note that this leaves out the edges between primary inventors.
## A function to make the edges for each row
rowFunc <- function(row) {
  # split the inventor and coinventor fields on ';'
  tmp <- lapply(row[2:3], strsplit, '\\s*;\\s*')
  # split each coinventor group on ',' (dropping surrounding spaces)
  tmp2 <- lapply(tmp[[2]], strsplit, '\\s*,\\s*')
  # bind patent ID, primary inventor and coinventors; SIMPLIFY=FALSE keeps a
  # list even when every group happens to have the same size
  do.call(rbind, mapply(cbind, row[[1]], unlist(tmp[[1]]), unlist(tmp2, recursive=FALSE), SIMPLIFY=FALSE))
}
## Apply the function by row
do.call(rbind, apply(dat, 1, rowFunc))
# [,1] [,2] [,3]
# [1,] "1" "A" "C"
# [2,] "1" "A" "D"
# [3,] "1" "A" "E"
# [4,] "1" "B" "F"
# [5,] "1" "B" "G"
# [6,] "1" "B" "H"
# [7,] "1" "B" "C"
# [8,] "2" "J" "M"
# [9,] "2" "J" "O"
# [10,] "2" "K" "N"
# [11,] "2" "L" "P"
# [12,] "2" "L" "Q"
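To also get the edges between primary inventors, something like the following base R sketch might work. It assumes, from the desired output in the question, that only the first primary inventor is linked to the other primary inventors, and that there is one coinventor group per primary inventor. The example `dat` and the helper `patent_edges` are made up for illustration.

```r
# Example data in the shape shown in the question (an assumption)
dat <- data.frame(
  PatentID      = c(1, 2),
  InventorIDs   = c("A ; B", "J ; K ; L"),
  CoinventorIDs = c("C,D,E ; F,G,H,C", "M,O ; N ; P, Q"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all edges for one patent
patent_edges <- function(id, inventors, coinventors) {
  prim <- strsplit(trimws(inventors), "\\s*;\\s*")[[1]]
  # one character vector of coinventors per primary inventor
  co <- strsplit(strsplit(trimws(coinventors), "\\s*;\\s*")[[1]], "\\s*,\\s*")
  # first primary inventor -> every other primary inventor,
  # then each primary inventor -> their own coinventors
  from <- c(rep(prim[1], length(prim) - 1), rep(prim, lengths(co)))
  to   <- c(prim[-1], unlist(co))
  data.frame(patent = id, from = from, to = to, stringsAsFactors = FALSE)
}

edges <- do.call(rbind, Map(patent_edges, dat$PatentID, dat$InventorIDs, dat$CoinventorIDs))
rownames(edges) <- NULL
edges
```

For the two example patents this gives the 15 rows listed in the question. The `from` and `to` columns can then be passed to igraph as a two-column edge list.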

R: transposing and splitting a row with a delimiter.

I have a table
rawData <- as.data.frame(matrix(c(1,2,3,4,5,6,"a,b,c","d,e","f"),nrow=3,ncol=3))
1 4 a,b,c
2 5 d,e
3 6 f
I would like to convert to
1 2 3
4 5 6
a d f
b e
c
So far I can transpose the table and split the third column, but I'm lost as to how to reconstruct a new table in the format outlined above.
new = t(rawData)
for (e in 1:ncol(new)){
s<-strsplit(new[3:3,e], split=",")
print(s)
}
I tried creating new vectors for each iteration, but I'm not sure how to efficiently put each one back into a data frame. I would be grateful for any help. Thanks!
You can use stri_list2matrix from the stringi package:
library(stringi)
rawData <- as.data.frame(matrix(c(1,2,3,4,5,6,"a,b,c","d,e","f"),nrow=3,ncol=3),stringsAsFactors = FALSE)
d1 <- t(rawData[,1:2])
rownames(d1) <- NULL
d2 <- stri_list2matrix(strsplit(rawData$V3,split=','))
rbind(d1,d2)
# [,1] [,2] [,3]
# [1,] "1" "2" "3"
# [2,] "4" "5" "6"
# [3,] "a" "d" "f"
# [4,] "b" "e" NA
# [5,] "c" NA NA
You can also use cSplit from my "splitstackshape" package.
By default, it just creates additional columns after splitting the input:
library(splitstackshape)
cSplit(rawData, "V3")
# V1 V2 V3_1 V3_2 V3_3
# 1: 1 4 a b c
# 2: 2 5 d e NA
# 3: 3 6 f NA NA
You can just transpose that to get your desired output.
t(cSplit(rawData, "V3"))
# [,1] [,2] [,3]
# V1 "1" "2" "3"
# V2 "4" "5" "6"
# V3_1 "a" "d" "f"
# V3_2 "b" "e" NA
# V3_3 "c" NA NA
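If you would rather avoid extra packages, the same padding can be done in base R. This is a sketch of an alternative, not what either package does internally: indexing a vector past its end returns `NA`, which pads each split piece to a common length. The names `parts`, `padded` and `result` are made up for this example.

```r
rawData <- as.data.frame(matrix(c(1,2,3,4,5,6,"a,b,c","d,e","f"), nrow=3, ncol=3),
                         stringsAsFactors = FALSE)

# Split the third column into one character vector per row
parts <- strsplit(rawData$V3, ",", fixed = TRUE)
n <- max(lengths(parts))

# Indexing beyond a vector's length yields NA, so every piece becomes length n
padded <- do.call(cbind, lapply(parts, function(p) p[seq_len(n)]))

# Transpose the first two columns and stack the padded pieces underneath
result <- rbind(unname(t(rawData[, 1:2])), padded)
result
```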

R - Splitting a column text into 2 columns without delimiter

I need to manipulate the following data frame (data) so that the PATCH_CODE column is split into 2 resulting columns where the 1st column contains the letter of the string and the 2nd column contains the number as in the 2nd example dataframe below.
EDIT: PATCH_CODE is not always 2 characters; occasionally it is a single letter, in which case I need to force a 1 into the resulting CODE column.
initial data frame: head(data,4)
PATCH_CODE TERR PC1
A1 MENS_10 0.8629186
A3 MENS_10 -0.2703238
B1 MENS_10 0.9516067
B2 MENS_10 -0.1722446
resulting data frame:
PATCH CODE TERR PC1
A 1 MENS_10 0.8629186
A 3 MENS_10 -0.2703238
B 1 MENS_10 0.9516067
B 2 MENS_10 -0.1722446
I have seen examples of how to accomplish this when the column to be split has an identifiable text delimiter such as a comma by using colsplit in reshape but I have failed to find a solution for a structure like mine. Is this possible?
output of str(data)
'data.frame': 240 obs. of 3 variables:
$ PATCH_CODE: Factor w/ 42 levels "A","A1","A2",..: 2 3 4 7 8 12 13 16 17 18 ...
$ TERR : Factor w/ 19 levels "MENS_10","MENS_14",..: 1 1 1 1 1 1 1 1 1 1 ...
$ PC1 : num 0.548 1.228 0.273 5.548 3.853 ...
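You can use strsplit. Passing an empty string as the delimiter splits at every character, so this works when each code is exactly one letter followed by one digit. (If the column is a factor, as in the str() output, wrap it in as.character() first.)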
a <- c("A1", "B1", "C2", "D5", "R3")
strsplit(a, "")
[[1]]
[1] "A" "1"
[[2]]
[1] "B" "1"
[[3]]
[1] "C" "2"
[[4]]
[1] "D" "5"
[[5]]
[1] "R" "3"
If you want to put that in a matrix
> do.call(rbind, strsplit(a, ""))
[,1] [,2]
[1,] "A" "1"
[2,] "B" "1"
[3,] "C" "2"
[4,] "D" "5"
[5,] "R" "3"
By the sounds of your description, strsplit should work fine. If your data are a little more complicated, you can also look at a possible regex-based solution.
For this particular example, try:
# as.character() is needed if PATCH_CODE is a factor, as in the str() output
do.call(rbind, strsplit(as.character(mydf$PATCH_CODE),
                        split = "(?<=[a-zA-Z])(?=[0-9])",
                        perl = TRUE))
# [,1] [,2]
# [1,] "A" "1"
# [2,] "A" "3"
# [3,] "B" "1"
# [4,] "B" "2"
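Neither approach above covers the case from the EDIT, where a code is a single letter with no number. A minimal sketch for that case, assuming every code is one or more letters optionally followed by digits, is to strip each part with `sub` and fill in the missing numbers. The example vector `codes` is made up, not the asker's data.

```r
# Example codes, including one with no number (to be treated as 1)
codes <- factor(c("A1", "A3", "B", "B2"))
x <- as.character(codes)            # strsplit/sub need character, not factor

patch <- sub("[0-9]+$", "", x)      # drop trailing digits, keep the letters
code  <- sub("^[A-Za-z]+", "", x)   # drop leading letters, keep the digits
code[code == ""] <- "1"             # force a 1 where no number was given

result <- data.frame(PATCH = patch, CODE = as.integer(code),
                     stringsAsFactors = FALSE)
result
```

The two new columns can then be bound to the remaining columns of the original data frame with `cbind`.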

R/Igraph Display edge weights in an edge list?

Is there any way to display edge weights when viewing the graph object as an edge list?
I want to do something in the spirit of:
get.edgelist(graph, attr='weight')
so as to view the edge pairings with the weights listed alongside the nodes, but that does not seem to be allowed. The only way I know to view the weights is to view the network data as an adjacency matrix. I'm hoping that's not the only way.
Using the example in the help page for function get.edgelist in pkg:igraph:
> cbind( get.edgelist(g) , round( E(g)$weight, 3 ))
[,1] [,2] [,3]
[1,] "a" "b" "0.342"
[2,] "b" "d" "0.181"
[3,] "b" "e" "0.403"
[4,] "b" "f" "0.841"
[5,] "d" "f" "0.997"
[6,] "e" "g" "0.029"
[7,] "a" "h" "0.17"
[8,] "b" "j" "0.69"
[9,] "g" "j" "0.422"
Another option is to use get.data.frame() from the igraph package
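library(igraph)
# create a random graph with weighted edges
# (recent igraph releases also offer sample_gnp() in place of erdos.renyi.game())
g <- erdos.renyi.game(5, 5/10, directed = TRUE)
E(g)$weight <- runif(length(E(g)), 1, 5)
# pull nodes and edge weights (as_data_frame(g) in recent igraph releases)
get.data.frame(g)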
from to weight
1 1 5 4.716679
2 2 1 4.119414
3 1 2 4.535791
4 2 5 2.486553
5 3 2 4.932118
6 5 2 3.353693
7 1 3 3.003062
8 2 3 3.350118
9 1 4 2.929069
10 2 4 4.929474
11 5 4 4.333134
