How to use the adespatial package to get the same results as the sPCA function in adegenet

I used the package adegenet with the function sPCA to understand if there are geographical patterns in my genetic data.
vcf<- read.table("AMZ.012") #samples per line
vcf_m<-as.matrix(vcf)
# Add coordinates of samples
xy <-read.table("CoordAMZ_m.csv", sep=",") #geo coordinates for each sample
The matrix "vcf" contains 0s and 1s (1 means that the information is present and 0 means no information). Each line is a different sample, as in the following example:
0 1 0 1
1 1 0 1
1 1 0 0
I ran sPCA using adegenet package in R, following the example:
mySpca <- spca(vcf_m, xy, ask=FALSE, type=5, scannf=FALSE)
The result was:
This function is now deprecated. Please use the 'multispati' function in the 'adespatial' package.
I tried to use this new function, but I have no idea how to use it the way sPCA was used and get similar results. I am expecting something like the output on page 7 of this pdf (http://adegenet.r-forge.r-project.org/files/tutorial-spca.pdf).
I would be very happy if someone could help me.
Thanks.

I am not sure if you are still interested but it is very simple. First have a look at the args(multispati) to see what is required.
The first argument required is a multi-variate ordination analysis using dudi in the ade4 package.
I created a dudi.pca.
Then a listw is required.
This is simply a weighted connection network similar to what you use in sPCA but in a different format.
You can use chooseCN just like in your sPCA.
Then convert your myCN to a listw using the nb2listw function.
Hope this helps!
Cheers,
Kat
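The steps in the answer above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated stand-ins for vcf_m and xy, the PCA settings (centred, unscaled) are a guess rather than a confirmed match for the old spca defaults, and spdep's dnearneigh is used here in place of chooseCN with type=5 to build the distance-based neighbour graph.

```r
library(ade4)        # dudi.pca
library(spdep)       # knearneigh, knn2nb, nbdists, dnearneigh, nb2listw
library(adespatial)  # multispati

set.seed(1)
# Stand-in data (assumption): 30 samples x 20 binary markers and random coordinates.
# Replace these with vcf_m and xy from the question.
vcf_m <- matrix(rbinom(30 * 20, 1, 0.5), nrow = 30)
xy <- cbind(runif(30), runif(30))

# Step 1: an ordinary PCA stored as a dudi object.
pca <- dudi.pca(as.data.frame(vcf_m), center = TRUE, scale = FALSE,
                scannf = FALSE, nf = 2)

# Step 2: a distance-based neighbour graph. The upper bound is set to the
# largest nearest-neighbour distance (plus a small tolerance) so that
# every sample has at least one neighbour.
d_max <- max(unlist(nbdists(knn2nb(knearneigh(xy, k = 1)), xy)))
nb <- dnearneigh(xy, d1 = 0, d2 = d_max * 1.001)

# Step 3: convert the nb object to a listw; multispati expects style "W".
lw <- nb2listw(nb, style = "W")

# Step 4: the spatially constrained analysis.
ms <- multispati(pca, lw, scannf = FALSE, nfposi = 2, nfnega = 0)
summary(ms)
```

The sample scores are in ms$li and the eigenvalues in ms$eig, which are the pieces usually plotted in sPCA-style figures.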

Related

Memory management in R ComplexUpset Package

I'm trying to plot a stacked barplot inside an upset plot using the ComplexUpset package. The plot I'd like to get looks something like this (where mpaa would be component in my example):
I have a dataframe of size 57244 by 21, where one column is ID, another is the type of recording, and the other 19 columns are components from 1 to 19:
ID component1 component2 ... component19 type
1 1 0 1 a
2 0 0 1 b
3 1 1 0 b
Ones and zeros indicate affiliation with a certain component. As shown in the example in the docs, I first convert these ones and zeros to logical, and then try to plot the basic upset plot. Here's the code:
df <- df %>% mutate(across(where(is.numeric), as.logical))
components <- colnames(df)[2:20]
upset(df, components, name='protein', width_ratio = 0.1)
But unfortunately after thinking for a while when processing the last line it spits out an error message like this:
Error: cannot allocate vector of size 176.2 Mb
I'm on a machine with 32 GB of RAM, so I'm sure I couldn't have filled the memory so much that 176 Mb can't be allocated. My guess is that I am somehow managing memory in R wrongly. Could you please explain what's faulty in my code, if possible?
I also know that the UpSetR package plots the same data, but as far as I know it provides no way to do stacked barplots.
Somehow, it works if you:
Tweak the min_size parameter so that the plot is not overloaded and makes a better impression.
Pass a sample of the data as the first argument of upset(); this also helps, even if your sample is the whole dataset.
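As a rough illustration of the min_size suggestion, here is a minimal sketch on simulated data. The data frame, the threshold of 10, and the use of only five components are assumptions for the example; the stacked bars follow the intersection_size(counts=FALSE, mapping=aes(fill=...)) pattern from the ComplexUpset documentation.

```r
library(ComplexUpset)
library(ggplot2)

set.seed(1)
n <- 500
components <- paste0("component", 1:5)

# Simulated stand-in for the real data: logical membership columns plus a type column.
memb <- matrix(runif(n * 5) < 0.3, nrow = n, dimnames = list(NULL, components))
df <- data.frame(ID = seq_len(n), memb, type = sample(c("a", "b"), n, replace = TRUE))

# min_size drops intersections with fewer than 10 rows, which keeps the
# number of plotted intersections (and the work done) much smaller.
p <- upset(
  df, components,
  name = "protein",
  width_ratio = 0.1,
  min_size = 10,
  base_annotations = list(
    "Intersection size" = intersection_size(
      counts = FALSE,
      mapping = aes(fill = type)   # stack the bars by recording type
    )
  )
)
p
```

With 19 components the number of possible intersections is far larger than with five, so a higher min_size may be needed on the real data.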

R - Making a ggplot while using survey package

I am stuck with a real problem.
My dataset comes from a survey and to make it usable to find statistics about the whole French population, I must weight it with weights.
For this purpose, I used the survey package, but the syntax is not really easy to use with R.
Is there a way to use ggplot while having weights?
To explain it a bit better, here is my dataset:
head(df)
Id Weight Var1
1 30 0
2 12.4 0
3 68.2 1
So my individual 1 accounts for 30 people in the French population.
I create a df_weighted dataset using the survey package.
How can I use ggplot now? df_weighted is a list!
I did something like this to try to escape the list problem, but it did not work at all...
df_weighted_ggplot$var1 <- svytable(~var1, df_weighted)
df_weighted_ggplot$var_fill <- svytable(~var_fill, df_weighted)
ggplot(df_weighted_ggplot, aes(fill = var_fill , x =var1)) + geom_bar(position = "fill")
I received this predictable error:
Erreur : `data` must be a data frame, or other object coercible by `fortify()`, not a list
Do you know of any other package that could help me? I have read many forums, and survey seems to be the most helpful one...
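One possible route, sketched here under assumptions, is to let survey compute the weighted table and convert that table (not the design object) to a data frame before handing it to ggplot. The three rows below are the ones shown in the question; the design with ids = ~1 is an assumption, since the question does not say how df_weighted was built.

```r
library(survey)
library(ggplot2)

# The rows shown in the question.
df <- data.frame(Id = 1:3, Weight = c(30, 12.4, 68.2), Var1 = c(0, 0, 1))

# Assumed design: no clustering, one weight per respondent.
df_weighted <- svydesign(ids = ~1, weights = ~Weight, data = df)

# svytable() returns a weighted contingency table; as.data.frame() turns it
# into an ordinary data frame with one row per category and a Freq column.
tab <- as.data.frame(svytable(~Var1, df_weighted))

# ggplot now receives a data frame, so the fortify() error does not occur.
p <- ggplot(tab, aes(x = Var1, y = Freq)) +
  geom_col()
p
```

A two-way table such as svytable(~Var1 + var_fill, df_weighted) can be converted the same way and plotted with aes(fill = var_fill) and geom_col(position = "fill").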

Circular-linear regression with covariates in R

I have data showing when an animal came to a survey station (an example csv file was linked in the original post). The first few lines of data look like this:
Site_ID DateTime HourOfDay MinTemp LunarPhase Habitat
F1 6/12/2013 14:01:00 14 -1 0 river
F1 6/12/2013 14:23:00 14 -1 0 river
F2 6/13/2013 1:21:00 1 3 1 upland
F2 6/14/2013 1:33:00 1 4 2 upland
F3 6/14/2013 1:48:00 1 4 2 river
F3 6/15/2013 11:08:00 11 0 0 river
I would like to perform a circular-linear regression in R to determine peak activity times. The dependent variable could be DateTime or HourOfDay, whichever is easier. I would like to incorporate the covariates Site_ID (random effect), plus MinTemp, LunarPhase, and Habitat into a mixed-effects model.
I have tried using the lm.circular function of the circular package, and have the following code:
data<-read.csv("StackOverflowExampleData.csv")
data$DateTime<-as.POSIXct(as.character(data$DateTime), format = "%m/%d/%Y %H:%M:%S")
data$LunarPhase<-as.factor(data$LunarPhase)
str(data)
library(circular)
y<-data$DateTime
y<-circular(y, units ="hours",template = "clock24",rotation = "clock")
x<-data[,c(1,4,5,6)]
lm.circular(y=y, x=x, init=c(1,1,1,1), type='c-l', verbose=TRUE)
I keep getting the error:
Error in Ops.POSIXt(x, 12) : '/' not defined for "POSIXt" objects
Apparently this is a known bug, but I was confused by this thread about it and could not determine an appropriate work-around. Suggestions?
Also, my ultimate goal with this data was to run a circular-linear version of a glm, and then test several models against one another using AIC or some other information theoretics method. The model I'm seeking would be a circular-linear version of something like this:
glmer(HourOfDay~MinTemp+LunarPhase+Habitat+(1|Site_ID),family=binomial,data=data)
Perhaps this is an inappropriate application of the circular package. If so, I'm open to other suggestions of models and/or graphics that would investigate peak activity using the data and covariates.
Note: I did search for related discussions and found this somewhat relevant thread, but it was never answered, did not request a solution in R, and was of a different scope.
The specific problem is caused by conversion.circular. There, a POSIXlt object is divided by 12. This is an operation that has a non-defined outcome:
> as.POSIXlt('2005-07-16') / 2
Error in Ops.POSIXt(as.POSIXlt("2005-07-16"), 2) :
'/' not defined for "POSIXt" objects
So, it seems that you cannot use data of this class as input for the circular package. I could not find any mention of POSIXlt data in the examples. Maybe you need to specify the timestamps simply as a number, not as a POSIXlt object.
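A minimal sketch of the "timestamps as a number" idea: convert each DateTime to decimal hours of the day first, then build the circular object from that numeric vector. The timestamps are the first three from the question; the UTC time zone is an assumption made only to keep the example reproducible.

```r
library(circular)

# First three timestamps from the question's example data.
raw <- c("6/12/2013 14:01:00", "6/12/2013 14:23:00", "6/13/2013 1:21:00")
dt <- as.POSIXct(raw, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Decimal hour of day as a plain numeric vector (no POSIXt class left).
lt <- as.POSIXlt(dt)
hours <- lt$hour + lt$min / 60 + lt$sec / 3600

# Build the circular object from the numeric hours instead of the timestamps.
y <- circular(hours, units = "hours", template = "clock24")
y
```

Whether lm.circular then accepts the covariates from the question is a separate matter; this only removes the POSIXt error at the conversion step.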

Spatial auto-correlation test for binary data in R

I want to test a species' presence / absence records for spatial autocorrelation. My data contain >130,000 grids in GIS and with about 700 species' presence records.
I have read that the normal Moran's $I$ can't deal with this kind of data, but that the join count method in package spdep can do it. However, I'm new to R and I still can't understand the information and code in the help for joincount.mc or joincount.test.
My data is like this:
gridnumber species
1 1
2 0
3 0
4 1
……
I know how to read a shapefile into R, and I know I must calculate the spatial weights for my data, but the subsequent steps with spdep are beyond my ability.
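For orientation, the basic spdep workflow for join counts might look like the sketch below. It uses a simulated 10 x 10 grid with random presence/absence, so the grid size, the queen contiguity, and the binary "B" weights are all assumptions; with polygons read from a shapefile, the neighbour list would typically come from poly2nb instead of cell2nb.

```r
library(spdep)

set.seed(1)

# Neighbour list for a regular 10 x 10 grid (queen contiguity).
nb <- cell2nb(10, 10, type = "queen")

# Binary spatial weights: each neighbour pair counts as one join.
lw <- nb2listw(nb, style = "B")

# Presence/absence must be a factor for the join count functions.
species <- factor(rbinom(100, 1, 0.3), levels = c(0, 1),
                  labels = c("absent", "present"))

# Analytical join count test, one result per factor level.
jc <- joincount.test(species, lw)

# Permutation version of the same test.
jc_mc <- joincount.mc(species, lw, nsim = 99)

jc
jc_mc
```

For each level the test compares the observed number of same-colour joins with its expectation under no spatial autocorrelation.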

I want to extract coordinate data from NetLogo using the RNetLogo package

I'm using the sample Flocking code as an example to play with, if anyone is familiar with it.
NLCommand("set population 1")
NLCommand("setup")
nruns <- 10
timedata <- list()
for (i in 1:nruns) {
  NLCommand("go")
  timedata[[i]] <- NLGetAgentSet(c("who", "xcor", "ycor"), "turtles",
                                 as.data.frame = TRUE,
                                 df.col.names = c("who", "xcor", "ycor"))
}
timedata
The problem is that it produces new headers for each model iteration. So I get the following instead of the header appearing only once:
[[1]]
who xcor ycor
1 0 34.56833 -26.47777
[[2]]
who xcor ycor
1 0 35.19765 -25.70063
Any help would be much appreciated.
There's good discussion and answers on this at http://groups.yahoo.com/neo/groups/netlogo-users/conversations/topics/15551 (where OP asked the same question). Jan Thiele, author of the R extension for NetLogo, writes:
If you really want to have all turtle coordinates in R, the more appropriate function is NLGetAgentSet and executing this in a loop over the ticks.
I have written a tutorial which comes with the RNetLogo package (see your RNetLogo installation directory). There is an example in chapter 11.5 (Time sliding visualization), where a similar thing is done. Adapting it to the Flocking model it could look like this: [...]
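Independently of the tutorial code elided above, the repeated headers come from printing a list of data frames. A small base R sketch of combining such a list into one data frame, with a tick column (a name chosen here for illustration) marking the iteration, is shown below; the two data frames reproduce the output from the question.

```r
# Stand-in for the list built in the loop: one data frame per iteration.
timedata <- list(
  data.frame(who = 0, xcor = 34.56833, ycor = -26.47777),
  data.frame(who = 0, xcor = 35.19765, ycor = -25.70063)
)

# Add the iteration number to each data frame, then stack them row-wise.
combined <- do.call(rbind, lapply(seq_along(timedata), function(i) {
  cbind(tick = i, timedata[[i]])
}))

combined
```

The result is a single data frame with columns tick, who, xcor and ycor, so the header is printed only once.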
