incorrect number of dimensions even if the size of the array is the same - r

I have been working on this program for many days and decided to rewrite it today,
but this problem keeps bothering me.
I thought csm[1,] and Prank[1,] had the same dimensions.
Can anyone help me with this problem?
Prank<-read.csv("result.csv")
nrP<-nrow(Prank)
ncP<-ncol(Prank)
csm<-matrix(0,nrP*3,ncP)
ccsm<-matrix(0,nrP*3,ncP)
nrC<-nrow(csm)
ncC<-ncol(csm)
nrP
[1] 30
ncP
[1] 144
nrC
[1] 90
ncC
[1] 144
Prank[1,]
P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15 P16 P17 P18 P19 P20 P21 P22 P23 P24 P25 P26 P27 P28 P29 P30 P31 P32
1 4 2 3 1 4 2 3 1 4 2 3 1 3 1 4 2 4 2 3 1 4 1 3 2 4 1 3 2 4 2 3 1
P33 P34 P35 P36 P37 P38 P39 P40 P41 P42 P43 P44 P45 P46 P47 P48 P49 P50 P51 P52 P53 P54 P55 P56 P57 P58 P59 P60 P61
1 4 1 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
P62 P63 P64 P65 P66 P67 P68 P69 P70 P71 P72 P73 P74 P75 P76 P77 P78 P79 P80 P81 P82 P83 P84 P85 P86 P87 P88 P89 P90
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
P91 P92 P93 P94 P95 P96 P97 P98 P99 P100 P101 P102 P103 P104 P105 P106 P107 P108 P109 P110 P111 P112 P113 P114 P115
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
P116 P117 P118 P119 P120 P121 P122 P123 P124 P125 P126 P127 P128 P129 P130 P131 P132 P133 P134 P135 P136 P137 P138
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
P139 P140 P141 P142 P143 P144
1 0 0 0 0 0 0
csm[1,]
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[59] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[117] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
csm[1,]<-Prank[1,]
csm[1,]
Error in csm[1, ] : incorrect number of dimensions

The problem is that Prank[1, ] is a data.frame (i.e. a list) so when you try to assign it to the first row of csm, it has the unexpected side-effect of converting csm to a list. At that point, doing csm[1, ] does not make any sense (a list has a single dimension) hence the error.
A solution is to unlist Prank[1, ] before assigning:
csm[1,] <- unlist(Prank[1,])
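The behaviour described above can be reproduced with a small sketch. The three-column Prank below is made-up stand-in data (the real one comes from result.csv), and bad and good are illustrative names only.

```r
# Toy stand-in for the data read from result.csv (hypothetical values)
Prank <- data.frame(P1 = c(4, 1), P2 = c(2, 3), P3 = c(3, 2))
csm <- matrix(0, nrow(Prank) * 3, ncol(Prank))

# Assigning a one-row data.frame turns the matrix into a list
bad <- csm
bad[1, ] <- Prank[1, ]
is.list(bad)      # TRUE: no longer a numeric matrix

# Unlisting first keeps the matrix intact
good <- csm
good[1, ] <- unlist(Prank[1, ])
is.matrix(good)   # TRUE
good[1, ]         # 4 2 3
```

Working on copies (bad, good) leaves csm itself untouched, so both assignments start from the same all-zero matrix.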

read.csv() returns a data.frame, so Prank[1,] is a one-row data.frame rather than a numeric vector, and the assignment
csm[1,]<-Prank[1,]
will cause csm to be coerced to a list. You will want to make sure that what you assign is a numeric vector (check with is.numeric(); it returns FALSE for Prank[1,]).
Revised suggestion: take a look at the data.frame (head(Prank)) and it may be obvious that one or more columns are not numeric. To inspect the class of each field in Prank, you can use
lapply(Prank,class)
or
sapply(Prank,class)
If all the fields in Prank are integer or numeric, you can coerce them all to numeric via
Prank[] <- lapply(Prank,as.numeric)
If not all the fields are numeric, you will want to coerce the problem fields to numeric,
or remove the offending fields from Prank (e.g. Prank$ProblemField <- NULL) before the assignment.
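As a rough illustration of the inspection and clean-up steps, here is a sketch on a made-up data frame. The Label column and the name num_cols are hypothetical and stand in for whatever non-numeric field the real file might contain.

```r
# Hypothetical data frame with one non-numeric column
Prank <- data.frame(Label = c("a", "b"), P1 = c(4L, 1L), P2 = c(2.5, 3),
                    stringsAsFactors = FALSE)

# Class of every column
classes <- sapply(Prank, class)
classes

# Keep only the numeric (integer or double) columns, then assign the row
num_cols <- sapply(Prank, is.numeric)
csm <- matrix(0, nrow(Prank) * 3, sum(num_cols))
csm[1, ] <- unlist(Prank[1, num_cols])
csm[1, ]
```

Selecting columns with a logical vector keeps Prank itself unchanged, which may be preferable to setting fields to NULL if they are needed later.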

Related

Multiplying multiple columns with each other into a new dataframe in R

I want to multiply many of my binary variables into new columns, so called interactive variables. My dataset is structured like this:
YearCountry <- data.frame( Time = c("2000","2001", "2002", "2003",
"2000","2001", "2002", "2003",
"2000","2001", "2002", "2003"),
AL = c(1,1,1,1,0,0,0,0,0,0,0,0),
FR = c(0,0,0,0,1,1,1,1,0,0,0,0),
UK = c(0,0,0,0,0,0,0,0,1,1,1,1),
Y2000d = c(1,0,0,0,1,0,0,0,1,0,0,0),
Y2001d = c(0,1,0,0,0,1,0,0,0,1,0,0),
Y2002d = c(0,0,1,0,0,0,1,0,0,0,1,0),
Y2003d = c(0,0,0,1,0,0,0,1,0,0,0,1))
YearCountry
Time AL FR UK Y2000d Y2001d Y2002d Y2003d
1 2000 1 0 0 1 0 0 0
2 2001 1 0 0 0 1 0 0
3 2002 1 0 0 0 0 1 0
4 2003 1 0 0 0 0 0 1
5 2000 0 1 0 1 0 0 0
6 2001 0 1 0 0 1 0 0
7 2002 0 1 0 0 0 1 0
8 2003 0 1 0 0 0 0 1
9 2000 0 0 1 1 0 0 0
10 2001 0 0 1 0 1 0 0
11 2002 0 0 1 0 0 1 0
12 2003 0 0 1 0 0 0 1
I need to multiply the binary variable for each of the countries (AL,FR,UK) with each of the binary variables for a given year so that I get #country x #year new variables. In this case I have three countries and four years which gives 12 new variables. My full data contains 105 countries/regions and stretches over twenty years. I therefore need a general formula. I want data that looks like this
Interact <- data.frame(Time = c("2000","2001", "2002", "2003",
"2000","2001", "2002", "2003",
"2000","2001", "2002", "2003"),
Y2000xAL = c(1,0,0,0,0,0,0,0,0,0,0,0),
Y2001xAL = c(0,1,0,0,0,0,0,0,0,0,0,0),
Y2002xAL = c(0,0,1,0,0,0,0,0,0,0,0,0),
Y2003xAL = c(0,0,0,1,0,0,0,0,0,0,0,0),
Y2000xFR = c(0,0,0,0,1,0,0,0,0,0,0,0),
Y2001xFR = c(0,0,0,0,0,1,0,0,0,0,0,0),
Y2002xFR = c(0,0,0,0,0,0,1,0,0,0,0,0),
Y2003xFR = c(0,0,0,0,0,0,0,1,0,0,0,0),
Y2000xUK = c(0,0,0,0,0,0,0,0,1,0,0,0),
Y2001xUK = c(0,0,0,0,0,0,0,0,0,1,0,0),
Y2002xUK = c(0,0,0,0,0,0,0,0,0,0,1,0),
Y2003xUK = c(0,0,0,0,0,0,0,0,0,0,0,1))
Interact
Time Y2000xAL Y2001xAL Y2002xAL Y2003xAL Y2000xFR Y2001xFR Y2002xFR Y2003xFR Y2000xUK Y2001xUK Y2002xUK Y2003xUK
1 2000 1 0 0 0 0 0 0 0 0 0 0 0
2 2001 0 1 0 0 0 0 0 0 0 0 0 0
3 2002 0 0 1 0 0 0 0 0 0 0 0 0
4 2003 0 0 0 1 0 0 0 0 0 0 0 0
5 2000 0 0 0 0 1 0 0 0 0 0 0 0
6 2001 0 0 0 0 0 1 0 0 0 0 0 0
7 2002 0 0 0 0 0 0 1 0 0 0 0 0
8 2003 0 0 0 0 0 0 0 1 0 0 0 0
9 2000 0 0 0 0 0 0 0 0 1 0 0 0
10 2001 0 0 0 0 0 0 0 0 0 1 0 0
11 2002 0 0 0 0 0 0 0 0 0 0 1 0
12 2003 0 0 0 0 0 0 0 0 0 0 0 1
Here's an approach with dplyr::across. We can make the final result into a plain data.frame with purrr::invoke as demonstrated in this answer.
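library(dplyr)
library(purrr)
library(stringr)  # needed for str_replace()
YearCountry %>%
  # multiply each country column by the block of year columns
  mutate(across(AL:UK, ~ . * select(cur_data(), Y2000d:Y2003d))) %>%
  select(-(Y2000d:Y2003d)) %>%
  # flatten the nested data-frame columns into a plain data.frame
  invoke(.f = data.frame) %>%
  # drop the "." between the country and year parts of each name
  rename_with(~str_replace(.,"\\.",""))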
Time ALY2000d ALY2001d ALY2002d ALY2003d FRY2000d FRY2001d FRY2002d FRY2003d UKY2000d UKY2001d UKY2002d UKY2003d
1 2000 1 0 0 0 0 0 0 0 0 0 0 0
2 2001 0 1 0 0 0 0 0 0 0 0 0 0
3 2002 0 0 1 0 0 0 0 0 0 0 0 0
4 2003 0 0 0 1 0 0 0 0 0 0 0 0
5 2000 0 0 0 0 1 0 0 0 0 0 0 0
6 2001 0 0 0 0 0 1 0 0 0 0 0 0
7 2002 0 0 0 0 0 0 1 0 0 0 0 0
8 2003 0 0 0 0 0 0 0 1 0 0 0 0
9 2000 0 0 0 0 0 0 0 0 1 0 0 0
10 2001 0 0 0 0 0 0 0 0 0 1 0 0
11 2002 0 0 0 0 0 0 0 0 0 0 1 0
12 2003 0 0 0 0 0 0 0 0 0 0 0 1
1) model.matrix We split the names by the number of characters in them (the countries have 2 characters in their names and the years have 6) and paste pluses in each. (Alternately use Plus(grep("^..$", nms, value = TRUE)) to get the country names and use that in place of spl["2"] and similarly Plus(grep("^Y....d$", nms, value = TRUE)) in place of spl["6"].)
c(`2` = "AL+FR+UK", `6` = "Y2000d+Y2001d+Y2002d+Y2003d")
and from that the formula:
~(AL + FR + UK):(Y2000d + Y2001d + Y2002d + Y2003d) + 0
and then compute its model matrix.
The formula could also be expanded to one accepted by lm by modifying the sprintf format, so we might not even need to create the model matrix. For example, if we had a response vector R then we could write: s <- sprintf("R ~ (%s)*(%s)", spl["2"], spl["6"]); fo <- formula(s); lm(fo, YearCountry) to include all variables and the interactions of countries and years as well as an intercept.
Plus <- function(x) paste(x, collapse = "+")
nms <- names(YearCountry)[-1]
spl <- sapply(split(nms, nchar(nms)), Plus)
s <- sprintf("~ (%s):(%s)+0", spl["2"], spl["6"])
fo <- formula(s)
model.matrix(fo, YearCountry)
giving this matrix:
AL:Y2000d AL:Y2001d AL:Y2002d AL:Y2003d FR:Y2000d FR:Y2001d FR:Y2002d FR:Y2003d UK:Y2000d UK:Y2001d UK:Y2002d UK:Y2003d
1 1 0 0 0 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0 0 0
3 0 0 1 0 0 0 0 0 0 0 0 0
4 0 0 0 1 0 0 0 0 0 0 0 0
5 0 0 0 0 1 0 0 0 0 0 0 0
6 0 0 0 0 0 1 0 0 0 0 0 0
7 0 0 0 0 0 0 1 0 0 0 0 0
8 0 0 0 0 0 0 0 1 0 0 0 0
9 0 0 0 0 0 0 0 0 1 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0
12 0 0 0 0 0 0 0 0 0 0 0 1
attr(,"assign")
[1] 1 2 3 4 5 6 7 8 9 10 11 12
Alternately we can write it compactly like this:
Plus <- function(x) paste(x, collapse = "+")
nms <- names(YearCountry)
s <- sprintf("~ (%s):(%s)+0", Plus(nms[2:4]), Plus(nms[5:8]))
fo <- formula(s)
model.matrix(fo, YearCountry)
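As a quick check of what the formula computes, and one possible way to get from the model matrix to the column names asked for in the question, here is a self-contained sketch. The rep() calls rebuild the same YearCountry as in the question; the sub() pattern and the name mm are illustrative choices, not part of the answer above.

```r
# Same data as in the question, written compactly with rep()
YearCountry <- data.frame(
  Time   = rep(c("2000", "2001", "2002", "2003"), times = 3),
  AL     = rep(c(1, 0, 0), each = 4),
  FR     = rep(c(0, 1, 0), each = 4),
  UK     = rep(c(0, 0, 1), each = 4),
  Y2000d = rep(c(1, 0, 0, 0), times = 3),
  Y2001d = rep(c(0, 1, 0, 0), times = 3),
  Y2002d = rep(c(0, 0, 1, 0), times = 3),
  Y2003d = rep(c(0, 0, 0, 1), times = 3)
)

mm <- model.matrix(~ (AL + FR + UK):(Y2000d + Y2001d + Y2002d + Y2003d) + 0,
                   YearCountry)

# Each interaction column is the elementwise product of its two parents
all(mm[, "AL:Y2000d"] == YearCountry$AL * YearCountry$Y2000d)

# Rename "AL:Y2000d" to "Y2000xAL" and attach the Time column
colnames(mm) <- sub("^(..):(Y....)d$", "\\2x\\1", colnames(mm))
Interact <- cbind(Time = YearCountry$Time, as.data.frame(mm))
names(Interact)
```

The regular expression assumes two-character country codes and year names of the form Y....d, as in the example data; with other naming schemes it would need adjusting.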
2) eList Another approach is to use list comprehensions. With the eList package we can do this:
library(eList)
DF(for(i in YearCountry[2:4]) for(j in YearCountry[5:8]) i*j)
giving this data frame. Use as.matrix(...) on it if you want a matrix.
AL.Y2000d AL.Y2001d AL.Y2002d AL.Y2003d FR.Y2000d FR.Y2001d FR.Y2002d FR.Y2003d UK.Y2000d UK.Y2001d UK.Y2002d UK.Y2003d
1 1 0 0 0 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0 0 0
3 0 0 1 0 0 0 0 0 0 0 0 0
4 0 0 0 1 0 0 0 0 0 0 0 0
5 0 0 0 0 1 0 0 0 0 0 0 0
6 0 0 0 0 0 1 0 0 0 0 0 0
7 0 0 0 0 0 0 1 0 0 0 0 0
8 0 0 0 0 0 0 0 1 0 0 0 0
9 0 0 0 0 0 0 0 0 1 0 0 0
10 0 0 0 0 0 0 0 0 0 1 0 0
11 0 0 0 0 0 0 0 0 0 0 1 0
12 0 0 0 0 0 0 0 0 0 0 0 1
3) listcompr listcompr is another list comprehension package. Note that the development version of this package is needed in order to use bycol=. Replace gen.named.matrix with gen.named.data.frame if you want a data frame.
# devtools::install_github("patrickroocks/listcompr")
library(listcompr)
nms <- names(YearCountry)
gen.named.matrix("{nms[i]}.{nms[j]}", YearCountry[[i]] * YearCountry[[j]],
i = 2:4, j = 5:8, bycol = TRUE)

Standard deviation error for EcoTest.sample

I am using EcoTest.sample to compare rarefaction curves for 19 vegetation plots on two soil types (alluvial and canyon). The code below produces the following
warning (more than 50 times): "In cor(x > 0) : the standard deviation is zero".
The test still produces all the expected output. Should I be concerned about the warnings? Is it a result of my relatively small sample size?
rawdata<-read.table(text="Plot SiteType sp1 sp2 sp3 sp4 sp5 sp6 sp7 sp8 sp9 sp10 sp11 sp12 sp13 sp14 sp15 sp16 sp17 sp18 sp19 sp20 sp21 sp22 sp23 sp24 sp25 sp26 sp27 sp28 sp29 sp30 sp31 sp32 sp33 sp34 sp35
2 canyon 1 0 1 0 1 1 0 1 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0
3 alluvial 1 0 0 0 0 1 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
5 alluvial 1 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
6 alluvial 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0
7 alluvial 1 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
8 alluvial 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0
10 alluvial 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 1 1 1 0 0
11 canyon 1 1 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 0
12 canyon 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
13 canyon 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0
14 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
15 canyon 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0
16 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
17 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
18 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
19 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0
20 canyon 1 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1
22 alluvial 1 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 0
23 alluvial 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0
", header=T)
library(rareNMtests)  # provides EcoTest.sample()
data<-rawdata[,-1]
rownames(data)<-rawdata[,1]
test.data<-EcoTest.sample(data[,-1], by=data$SiteType, MARGIN=1, trace=F)
EDIT: Perhaps you need to set the nature of the index using q. For instance, if I use q=2 (the inverse Simpson index), I cannot reproduce your warning. As it stands you're using q=0, the species richness. Perhaps there is nothing to do other than use a different index. I'm not aware of the factors affecting index choice. I've read a thing or two here: http://www.tiem.utk.edu/~gross/bioed/bealsmodules/shannonDI.html and found this paper, which I didn't go through in much detail: https://dx.doi.org/10.1002%2Fece3.1155
Using Simpson's index: No warnings.
test.data<-EcoTest.sample(data[,-1], by=data$SiteType, MARGIN=1, trace=F,q=2)
Sample-based method
P(Obs <= null) = 0.205
As stated in this answer on SE, a standard deviation of zero will have an impact on the nature of the distribution. Therefore, any tests you perform that may have depended on a normal distribution will likely be erroneous. The p-values obtained say by a t-test may therefore be "insignificant."
When standard deviation is zero, your Gaussian (normal) PDF turns into Dirac delta function. You can't simply plug zero standard deviation into the conventional expression. For instance, if the PDF is plugged into some kind of numerical integration, this won't work. (Aksakal on SE)
https://stats.stackexchange.com/questions/233834/what-is-the-normal-distribution-when-standard-deviation-is-zero
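The warning text names the call cor(x > 0). What base R's cor() does when one column of its input is constant can be seen in isolation with a toy presence/absence matrix. The matrix below is made up for illustration, and the sketch says nothing about where inside EcoTest.sample the call occurs.

```r
# Toy presence/absence matrix (hypothetical); sp1 occurs in every plot
x <- cbind(sp1 = c(1, 1, 1, 1),
           sp2 = c(0, 1, 0, 1),
           sp3 = c(1, 0, 0, 1))

# Capture the warning text instead of printing it
msg <- tryCatch(cor(x > 0), warning = function(w) conditionMessage(w))
msg

# The result still comes back, with NA wherever the constant column is involved
res <- suppressWarnings(cor(x > 0))
res["sp1", "sp2"]
res["sp2", "sp3"]
```

A species that is present (or absent) in every plot of a resampled subset would give such a constant column, which may be why the warning appears often with few plots.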

How do I change the color of interactions based on interaction value using an ifelse() statement in a bipartite plotweb?

Hi, I am having trouble getting ifelse statements to work in the plotweb function (from bipartite) to color each interaction based on the quantity of interaction in the corresponding cell of the matrix. I had the same problem with the high bar colors, but since there were only a few values and one vector, it was easy to code manually.
Here is the code I am using. I want to color interactions greater than 15 as dark turquoise and keep the rest as the default grey (grey80).
I have tried many different statements, but I can't figure out what to put in the [,] so that the function goes through every individual cell and applies the statement instead of summing them; elem,elem doesn't seem to work either. Attached is a picture of the function's current output.
plotweb(LadyNet,
abuns.type='additional',
arrow="up.center",
text.rot=90,
col.low=c("olivedrab3"),
col.interaction =(ifelse(LadyNet[,] < 15,'grey80','darkturquoise')),
col.high = c("grey10","#FF0000","grey10","#FF0000","grey10","#FF0000","grey10","grey10","grey10"),
high.lab.dis = 0,
ybig=1.2,
y.width.high = .06,
high.spacing = 0.011,
y.lim = c(-1,2))
COCCAL COCSEP CYCPOL CYCSAN EXOFAS HIPCON PSYVIG SCY1 SCYMAR
Acmispon glaber 0 1 0 1 0 0 0 0 0
Ambrosia psilostachya 1 36 0 24 0 6 0 0 0
Artemisia douglasiana 0 0 0 1 0 1 0 0 0
Asclepias fascicularis 0 5 0 4 0 2 0 0 0
Avena fatua 6 10 0 0 0 4 0 0 0
Baccharis pilularis 9 76 0 38 0 27 0 1 0
Baccharis salicifolia 0 2 0 0 0 0 0 0 0
Bromus diandrus 1 8 0 0 0 4 0 0 0
Capsicum annuum 0 0 0 0 0 0 0 0 1
Chenopodium murale 0 1 0 0 0 0 0 0 0
Croton californicus 3 20 0 13 0 54 4 0 0
DEAD WOOD 0 1 0 0 0 0 0 0 0
Distichilis spicata 0 1 0 0 0 0 0 0 0
Echium candicans 0 1 0 3 0 0 0 0 0
Eleocharis acicularis 0 1 0 0 0 0 0 0 0
Encelia californica 1 1 0 3 0 2 0 0 0
Epilobium canum 0 0 0 1 0 0 0 0 0
Erigeron bonariensis 0 4 0 0 0 0 0 0 0
Erigeron canadensis 0 17 0 10 0 2 0 0 0
Erigeron sumatrensis 0 13 0 0 0 1 0 0 0
Eriophyllum confertiflorum 1 10 0 0 0 1 0 0 0
Fence 0 0 0 1 0 0 0 0 0
Festuca perennis 0 1 0 0 0 2 0 0 0
Gambelium speciosa 0 0 0 0 0 1 0 0 0
Geranium dissectum 0 0 0 3 0 0 0 0 0
GROUND 0 1 0 1 0 0 0 0 0
Helminthotheca echioides 0 1 2 17 0 1 0 0 0
Heterotheca grandiflora 2 92 0 12 0 7 1 0 0
Hirschfieldia incana 0 3 0 0 0 1 0 0 0
Juncus patens 0 1 0 0 0 0 0 0 0
Laennecia coulteri 1 65 0 2 0 3 0 0 0
Lobularia maritima 1 1 0 0 0 0 0 0 0
Morus sp. 0 0 0 1 0 0 0 0 0
NoPicture 4 3 0 3 3 2 3 0 0
Oxalis pes-caprae 4 6 0 0 0 2 0 0 0
Pennisetum clandestinum 1 5 0 0 0 0 0 0 0
Polygonum arenastrum 0 1 0 0 0 0 0 0 0
Raphanus sativus 0 1 0 0 0 0 0 0 0
ROCK 0 0 0 1 0 0 0 0 0
Rumex crispus 0 1 0 0 0 0 0 0 0
Rumex salicifolius 0 0 0 3 0 0 0 0 0
Salsola tragus 1 6 0 1 0 1 0 0 0
Salvia leucophylla 0 1 0 0 0 1 0 0 0
Schenoplectus americanus 0 1 0 0 0 0 0 0 0
Solanum nigrum 0 0 0 0 0 1 0 0 0
Sonchus arvensis 0 1 0 0 0 0 0 0 0
Spinacia oleracea 0 0 0 0 0 0 1 0 0
Stipa pulchra 0 1 0 0 0 0 0 0 0
Symphiotrichum subulatum 0 88 0 7 0 3 0 0 0
THATCH 1 3 0 0 0 4 0 0 0
Verbena lasiostachys 1 9 0 0 0 2 0 0 0
For reference, I have gotten the ifelse statement to work properly in the plotweb function when there was only one species in the lower level. An example is attached, along with the code:
plotweb(rnet,
abuns.type='additional',
arrow="down.center",
text.rot=90,
col.low=c("olivedrab3"),
col.interaction =(ifelse(rnet[1,] < 12,'grey80','darkturquoise')),
col.high = (ifelse(rnet[1,] < 12,'grey10','darkturquoise')),
high.lab.dis = 0,
ybig=1.2,
y.width.high = .06,
high.spacing = 0.011)
One thing to note is that the col.interaction color matrix should be transposed.
Here is an example that I trust you will find useful:
library(bipartite)
library(grDevices)
plotweb(df,
abuns.type='additional',
arrow="up.center",
text.rot=90,
col.low=c("olivedrab3"),
col.interaction = t(ifelse(df[,] < 15,
adjustcolor('grey80', alpha.f = 0.5), #add alpha to colors
adjustcolor('darkturquoise', alpha.f = 0.5))),
col.high = c("grey10",
"#FF0000",
"grey10",
"#FF0000",
"grey10",
"#FF0000",
"grey10",
"grey10",
"grey10"),
bor.col.interaction = NA, #remove the black border color
high.lab.dis = 0,
ybig=1.2,
y.width.high = .06,
high.spacing = 0.011,
y.lim = c(-1,2))
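To see why the ifelse() call already works cell by cell, it may help to run it on a small matrix outside plotweb. The matrix web below is made-up data with hypothetical species names; only base R is used, so the sketch shows the colour matrix and its transpose, not the plot itself.

```r
# Toy interaction matrix (hypothetical): 2 lower-level x 4 higher-level species
web <- matrix(c(0, 36, 1, 24, 6, 10, 76, 38), nrow = 2,
              dimnames = list(c("PlantA", "PlantB"),
                              c("SP1", "SP2", "SP3", "SP4")))

# ifelse() is applied to every cell and keeps the matrix shape
cols <- ifelse(web < 15, "grey80", "darkturquoise")
dim(cols)                # 2 4
cols["PlantB", "SP1"]    # "darkturquoise", since 36 is not below 15

# Transposed version, as suggested above for col.interaction
cols_t <- t(cols)
dim(cols_t)              # 4 2
```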

Turn a long data structure to a wide matrix structure

I do have the following data structure...
ID value
1 1 1
2 1 63
3 1 2
4 1 58
5 2 3
6 2 4
7 3 34
8 3 25
Now I want to turn it into a kind of dyadic data structure. All IDs that share the same value should have a relationship.
I tried several options, and:
df_wide <- dcast(df, ID ~ value)
... have brought me a long way down the road...
ID 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 39 40
1 1001 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 1006 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 1007 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0
4 1011 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 1018 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 1020 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
7 1030 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
8 1036 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Now my main problem is to turn it into a proper matrix to get an igraph object out of it.
df_wide_matrix <- data.matrix(df_wide)
df_aus_wide_g <- graph.edgelist(df_wide_matrix ,directed = TRUE)
... doesn't get me there.
I also tried to transform it into an adjacency matrix...
df_wide_matrix <- get.adjacency(graph.edgelist(as.matrix(df_wide), directed=FALSE))
... but it didn't work either
If you want to create an edge between all IDs with the same value, try something like this instead. First merge the data frame onto itself by the value. Then, remove the value column, and remove all (undirected) edges that are duplicate or just points. Finally, convert to a two-column matrix and create the edges.
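library(igraph)  # for graph.edgelist()
# pair up all IDs that share a value
res <- merge(df, df, by='value', all=FALSE)[,c('ID.x','ID.y')]
# keep each pair once and drop pairs of an ID with itself
res <- res[res$ID.x<res$ID.y,]
resg <- graph.edgelist(as.matrix(res))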
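The merge step can be tried on a tiny example where some IDs actually share values. The data frame below is made up for illustration, the unique() call is an extra step that is not in the answer above, and the igraph call is guarded so the sketch also runs where igraph is not installed.

```r
# Toy long-format data (hypothetical): IDs 1 to 3 share some values
df <- data.frame(ID    = c(1, 1, 2, 2, 3, 3),
                 value = c(10, 20, 10, 30, 20, 30))

# Pair up all IDs that share a value
res <- merge(df, df, by = "value", all = FALSE)[, c("ID.x", "ID.y")]
# Keep each undirected pair once and drop pairs of an ID with itself
res <- res[res$ID.x < res$ID.y, ]
# Extra step: drop pairs that appear several times via different shared values
res <- unique(res)
edges <- unname(as.matrix(res))
edges

# Build the graph only if igraph is available
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_edgelist(edges, directed = FALSE)
}
```

Here directed = FALSE is chosen because the pairs are treated as undirected; with the default, the edges would be directed from the smaller to the larger ID.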

Losing observations when I use reshape in R

I have this data set:
> head(pain_subset2, n= 50)
PatientID RSE SE SECODE
1 1001-01 0 0 0
2 1001-01 0 0 0
3 1001-02 0 0 0
4 1001-02 0 0 0
5 1002-01 0 0 0
6 1002-01 1 2a 1
7 1002-02 0 0 0
8 1002-02 0 0 0
9 1002-02 0 0 0
10 1002-03 0 0 0
11 1002-03 0 0 0
12 1002-03 1 1 1
> dim(pain_subset2)
[1] 817 4
> table(pain_subset2$RSE)
0 1
788 29
> table(pain_subset2$SE)
0 1 2a 2b 3 4 5
788 7 5 1 6 4 6
> table(pain_subset2$SECODE)
0 1
788 29
I want to create an n x 6 matrix (n: number of PatientIDs; columns: 6 levels of SE).
When I use reshape, I lose many observations:
> dim(p)
[1] 246 9
My code:
p <- reshape(pain_subset2, timevar = "SE", idvar = c("PatientID","RSE"),v.names = "SECODE", direction = "wide")
p[is.na(p)] <- 0
> table(p$RSE)
0 1
226 20
Compared with the table of RSE, I lost 9 patients with RSE = 1.
This is the output I get:
PatientID RSE SECODE.0 SECODE.2a SECODE.1 SECODE.5 SECODE.3 SECODE.2b SECODE.4
1 1001-01 0 0 0 0 0 0 0 0
3 1001-02 0 0 0 0 0 0 0 0
5 1002-01 0 0 0 0 0 0 0 0
6 1002-01 1 0 1 0 0 0 0 0
7 1002-02 0 0 0 0 0 0 0 0
10 1002-03 0 0 0 0 0 0 0 0
12 1002-03 1 0 0 1 0 0 0 0
13 1002-04 0 0 0 0 0 0 0 0
15 1003-01 0 0 0 0 0 0 0 0
18 1003-02 0 0 0 0 0 0 0 0
21 1003-03 0 0 0 0 0 0 0 0
24 1003-04 0 0 0 0 0 0 0 0
27 1003-05 0 0 0 0 0 0 0 0
30 1003-06 0 0 0 0 0 0 0 0
32 1003-07 0 0 0 0 0 0 0 0
35 1004-01 0 0 0 0 0 0 0 0
36 1004-01 1 0 0 0 1 0 0 0
40 1004-02a 0 0 0 0 0 0 0 0
Does anyone know what is happening? I would really appreciate any help.
Thanks, best.
Try:
library(dplyr)
library(tidyr)
pain_subset2 %>%
spread(SE, SECODE)
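For comparison, a base-R route to a PatientID by SE matrix is sketched below. It uses tapply(), which is a different technique from spread(), and aggregates repeated rows per patient instead of requiring them to be unique. The data are the first 12 rows printed in the question; the names d and m are illustrative.

```r
# First 12 rows of pain_subset2 as printed in the question
d <- data.frame(
  PatientID = c("1001-01", "1001-01", "1001-02", "1001-02", "1002-01", "1002-01",
                "1002-02", "1002-02", "1002-02", "1002-03", "1002-03", "1002-03"),
  RSE    = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1),
  SE     = c("0", "0", "0", "0", "0", "2a", "0", "0", "0", "0", "0", "1"),
  SECODE = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1),
  stringsAsFactors = FALSE
)

# Sum SECODE for every PatientID x SE combination
m <- tapply(d$SECODE, list(d$PatientID, d$SE), sum)
m[is.na(m)] <- 0                            # combinations that never occur
m <- m[, colnames(m) != "0", drop = FALSE]  # drop the SE == "0" column
m
```

With the full data this would give one row per patient and one column per non-zero SE level, so no patient is dropped even when several events are recorded for the same patient.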
