Change name of object variable R - r

Sorry for the poor wording, I'm hoping to change the name of an internal variable in a dgCMatrix. Specifically I want to change "Dimnames" to "dimnames" (I've attached a picture of the object variables for clarity), as I believe that may help with an error I'm getting (I'll post that at the bottom).
I've tried this, but to no avail
rename(emat#Dimnames, "dimnames")
The error I hope to fix with this:
> rvel.cd <- gene.relative.velocity.estimates(emat,nmat,deltaT=2,
+ kCells=10,
+ cell.dist=cell.dist,
+ fit.quantile=fit.quantile,
+ n.cores=2)
matching cells between cell.dist and emat/nmat ... done
calculating cell knn ... done
calculating convolved matrices ... Error in intI(i, n = d[1], dn[[1]], give.dn = FALSE) :
no 'dimnames[[.]]': cannot use character indexing
Reproducible data:
#Generate dgCMatrix
library(Matrix)
i <- c(1,3:8)
j <- c(2,9,6:10)
x <- 7 * (1:7)
emat <- sparseMatrix(i, j, x = x)
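A side note on the slot name, as a minimal sketch rather than a confirmed fix for the velocity call: in the Matrix package the slot is called Dimnames, and the ordinary dimnames(), rownames() and colnames() accessors read and write it, so renaming the slot should not be needed. The error text suggests the matrix simply has no row/column names to index by. The gene/cell names below are made up for illustration.

```r
library(Matrix)

# Same toy matrix as in the question (8 rows x 10 columns)
i <- c(1, 3:8)
j <- c(2, 9, 6:10)
x <- 7 * (1:7)
emat <- sparseMatrix(i, j, x = x)

# Without names, character indexing fails
res <- tryCatch(emat["gene1", ], error = function(e) e)
inherits(res, "error")

# Hypothetical names; real data would use actual gene and cell identifiers
rownames(emat) <- paste0("gene", seq_len(nrow(emat)))
colnames(emat) <- paste0("cell", seq_len(ncol(emat)))

# The names end up in the Dimnames slot and character indexing now works
str(emat@Dimnames)
emat["gene1", "cell2"]
```

If nmat and cell.dist are involved as well, their names would presumably have to match those of emat.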

Related

R function error, "no applicable method for 'predict' applied to an object of class "NULL"

I am trying to write a function that generates a series of KS-test p-values. I want to take the first M[i] rows of my aaa dataset, run a few processing steps to obtain the predicted values of a glm regression (dd), and finally compare each pair of consecutive fits with a KS test to get a p-value. My code does not work, though. Here it is, using simulated data:
set.seed(1)
aaa <- matrix(runif(8000*950), 8000, 950)
ph = sample(rep(0:1,l=950))
kst <- function(M){
N = length(M)
aa = list()
bb = list()
cc = list()
dd = list()
ee = list()
p = 0
for (i in 2:N){
aa[[i]] = aaa[1:M[i],] ## class(aaa) is matrix
bb[[i]] = rbind(ph, aa[[i]])
cc[[i]] = as.data.frame(t(bb[[i]]))
dd[[i]] = glm(data = cc[[i]], formula = ph~., family = binomial)
ee[[i]] = ks.test(predict(dd[[i-1]]), predict(dd[[i]]))
p[i] = ee[[i]]$p.value
}
return(p)
}
kst(M = seq(50,950,by=50))
but R said:
> kst(M = seq(50,950,by=50))
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "NULL"
I think my ks.test loop also has an error in it, but I do not know how to fix it.
Could anyone help me with this? Thanks in advance.
The issue is probably at for (i in 1:N-1)
R interprets 1:N-1 as sequence 1 to N, then subtract 1 from the whole vector.
Try 1:5-1 to see what I mean.
Try for (i in 1:(N-1))
Also in this line
ee[[i]] = ks.test(predict(dd[[i]]), predict(dd[[i+1]])) ## this line may also have problem.
dd[[i+1]] will not be defined yet, as your loop has only filled dd to i.
Not sure this is what you are trying to do but maybe:
if(i>1L)ee[[i]] = ks.test(predict(dd[[i-1]]), predict(dd[[i]]))
That way after you have filled the first entry of dd you start testing the current dd vs. the previous.
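Both points can be checked on a small example. This is only a sketch: the data sizes, the use of column subsets in M, and the names fits and p are assumptions picked so that it runs quickly, not the asker's actual setup.

```r
# ":" binds more tightly than binary "-"
N <- 5
1:N - 1      # 0 1 2 3 4  (the sequence 1..N, shifted down by one)
1:(N - 1)    # 1 2 3 4

# Small simulated data (assumed sizes)
set.seed(1)
n  <- 100
X  <- matrix(runif(n * 6), n, 6)
ph <- rbinom(n, 1, 0.5)
M  <- c(2, 4, 6)   # number of predictor columns used in each fit

fits <- vector("list", length(M))
p    <- rep(NA_real_, length(M))   # p[1] stays NA: nothing to compare with yet

for (i in seq_along(M)) {
  d <- data.frame(ph = ph, X[, 1:M[i], drop = FALSE])
  fits[[i]] <- glm(ph ~ ., data = d, family = binomial)
  # Only compare once a previous fit exists
  if (i > 1L) {
    p[i] <- ks.test(predict(fits[[i - 1]]), predict(fits[[i]]))$p.value
  }
}
p
```

Starting the comparison at i > 1 is what avoids calling predict() on an element that is still NULL.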

loop through column glmer

I am trying to run a glmer by looping through the columns of my dataset that contain the response variables (dat_prob). The code I am using is as follows, adapted from code I found in another Stack Overflow question (Looping through columns in R).
Their code:
dat_y<-(dat[,c(2:1130)])
dat_x<-(dat[,c(1)])
models <- list()
#
for(i in names(dat_y)){
y <- dat_y[i]
model[[i]] = lm( y~dat_x )
}
My code:
dat_prob<-(probs[,c(108:188)])
dat_age<-(probs[,c(12)])
dat_dist<-(probs[,c(20)])
fyearcap=(probs[,c(25)])
fstation=(probs[,c(22)])
fnetnum=(probs[,c(23)])
fdepth=(probs[,c(24)])
models <- list()
#
for(i in names(dat_prob)){
y <- dat_prob[i]
y2=as.vector(y)
model[[i]] = glmer( y ~ dat_age * dat_dist + (1|fyearcap) + (1|fstation)+
(1|fnetnum)+ (1|fdepth),family=binomial,REML=TRUE )
}
And I receive this error, similar to the error received in the hyperlinked question:
Error in model.frame.default(drop.unused.levels = TRUE, formula = y ~ :
invalid type (list) for variable 'y'
I have been working through this for hours and now can't see the forest through the trees.
Any help is appreciated.
y <- dat_prob[i] makes y a list (or data frame, whatever). Lists are vectors - try is.vector(list()), so even y2 = as.vector(y) is still a list/data frame (even though you don't use it).
class(as.vector(mtcars[1]))
# [1] "data.frame"
To extract a numeric vector from a data frame, use [[: y <- dat_prob[[i]].
class(mtcars[[1]])
# [1] "numeric"
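The difference shows up directly in a model call. A small sketch using the built-in mtcars data, with lm standing in for glmer so that no extra package is needed:

```r
y_df  <- mtcars["mpg"]     # single bracket: still a one-column data frame
y_vec <- mtcars[["mpg"]]   # double bracket: a plain numeric vector

class(y_df)
class(y_vec)

# The data-frame response triggers the "invalid type (list)" error
bad <- tryCatch(lm(y_df ~ mtcars$wt), error = function(e) e)
inherits(bad, "error")
conditionMessage(bad)

# The vector response fits without complaint
good <- lm(y_vec ~ mtcars$wt)
coef(good)
```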
Though I agree with Roman - using formulas is probably a nicer way to go. Try something like this:
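for(i in names(dat_prob)) {
  # Build "<response column> ~ fixed effects + random intercepts" as a formula
  my_formula = as.formula(paste(i,
    "~ dat_age * dat_dist + (1|fyearcap) + (1|fstation) + (1|fnetnum) + (1|fdepth)"
  ))
  # data = dat_prob lets glmer find the response column by name;
  # REML is dropped because it is an lmer argument, not a glmer one
  models[[i]] = glmer(my_formula, data = dat_prob, family = binomial)
}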
I'm also pretty skeptical of whatever you're doing trying 80 different response variables, but that's not your question...
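For reference, the same looping pattern in a form that runs without lme4. This is only a sketch on mtcars: lm stands in for glmer, and the response and predictor names are arbitrary choices for illustration.

```r
responses <- c("mpg", "hp", "qsec")   # columns to use as responses (arbitrary)
models <- list()

for (resp in responses) {
  # reformulate() builds resp ~ wt + cyl without pasting strings by hand
  f <- reformulate(c("wt", "cyl"), response = resp)
  models[[resp]] <- lm(f, data = mtcars)
}

names(models)
formula(models[["hp"]])
```

Passing data = to the fitting function is what lets each response be looked up by its column name.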

How to append bootstrapped values of cluster's (tree) nodes in NEWICK format in R

I want to make a tree (cluster) using the Interactive Tree of Life web-based tool (iTOL). As an input file (or string), this tool uses the Newick format, which is a way of representing graph-theoretical trees with edge lengths using parentheses and commas. Besides that, additional information can be included, such as bootstrap values for the cluster's nodes.
For example, here I created dataset for a cluster analysis using clusterGeneration package:
library(clusterGeneration)
set.seed(1)
tmp1 <- genRandomClust(numClust=3, sepVal=0.3, numNonNoisy=5,
numNoisy=3, numOutlier=5, numReplicate=2, fileName="chk1")
data <- tmp1$datList[[2]]
Afterwards I performed cluster analysis and assessed support for the cluster's nodes by bootstrap using pvclust package:
set.seed(2)
y <- pvclust(data=data,method.hclust="average",method.dist="correlation",nboot=100)
plot(y)
Here is the cluster and bootstrapped values:
In order to make a Newick file, I used ape package:
library(ape)
yy<-as.phylo(y$hclust)
write.tree(yy,digits=2)
The write.tree function prints the tree in Newick format:
((x2:0.45,x6:0.45):0.043,((x7:0.26,(x4:0.14,(x1:0.14,x3:0.14):0.0064):0.12):0.22,(x5:0.28,x8:0.28):0.2):0.011);
Those numbers represent branch lengths (cluster's edge lengths). Following instructions from iTOL help page ("Uploading and working with your own trees" section) I manually added bootstrapped values into my Newick file (bolded values below):
((x2:0.45,x6:0.45)74:0.043,((x7:0.26,(x4:0.14,(x1:0.14,x3:0.14)55:0.0064)68:0.12)100:0.22,(x5:0.28,x8:0.28)100:0.2)63:0.011);
It works fine when I upload the string into iTOL. However, I have a huge cluster and doing it by hand seems tedious...
QUESTION: What code could do this instead of typing the values in manually?
Bootstrap values can be obtained by:
(round(y$edges,2)*100)[,1:2]
Branch lengths used to form Newick file can be obtained by:
yy$edge.length
I tried to work out how the write.tree function operates by debugging it. However, it internally calls the function .write.tree2, and I couldn't see how to change the original code efficiently so that the bootstrap values end up in the appropriate positions in the Newick file.
Any suggestions are welcome.
Here is one solution: objects of class phylo have an optional element called node.label that, appropriately, holds the labels of the nodes. You can use it to store your bootstrap values. They will be written to your Newick file at the appropriate place, as you can see in the code of .write.tree2:
> .write.tree2
function (phy, digits = 10, tree.prefix = "")
{
brl <- !is.null(phy$edge.length)
nodelab <- !is.null(phy$node.label)
...
if (is.null(phy$root.edge)) {
cp(")")
if (nodelab)
cp(phy$node.label[1])
cp(";")
}
else {
cp(")")
if (nodelab)
cp(phy$node.label[1])
cp(":")
cp(sprintf(f.d, phy$root.edge))
cp(";")
}
...
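A quick way to see node.label in action is a three-tip toy tree. This is a minimal sketch; the tree and the label values are invented for illustration and have nothing to do with the pvclust output.

```r
library(ape)

# Toy tree: tips a, b, c; one internal node joining a and b, plus the root
tr <- read.tree(text = "((a:1,b:1):1,c:2);")

# One label per internal node; the first entry belongs to the root
tr$node.label <- c("100", "74")

nwk <- write.tree(tr)
nwk

# Reading the string back recovers the same labels
read.tree(text = nwk)$node.label
```

The labels appear right after the closing parenthesis of the clade they belong to, which is the position iTOL expects for bootstrap values.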
The real difficulty is to find the proper order of the nodes. I searched and searched but couldn't find a way to find the right order a posteriori.... so that means we will have to get that information during the transformation from an object of class hclust to an object of class phylo.
And luckily, if you look into the function as.phylo.hclust, there is a vector containing the nodes index in their correct order vis-à-vis the previous hclust object:
> as.phylo.hclust
function (x, ...)
{
N <- dim(x$merge)[1]
edge <- matrix(0L, 2 * N, 2)
edge.length <- numeric(2 * N)
node <- integer(N) #<-This one
...
This means we can write our own as.phylo.hclust with a nodenames parameter, as long as nodenames is in the same order as the nodes in the hclust object (which is the case in your example, since pvclust keeps a coherent order internally, i.e. the order of the nodes in the hclust object is the same as in the table from which you took the bootstraps):
# NB: in the following function definition I only modified the commented lines
as.phylo.hclust.with.nodenames <- function (x, nodenames, ...) #We add a nodenames argument
{
N <- dim(x$merge)[1]
edge <- matrix(0L, 2 * N, 2)
edge.length <- numeric(2 * N)
node <- integer(N)
node[N] <- N + 2L
cur.nod <- N + 3L
j <- 1L
for (i in N:1) {
edge[j:(j + 1), 1] <- node[i]
for (l in 1:2) {
k <- j + l - 1L
y <- x$merge[i, l]
if (y > 0) {
edge[k, 2] <- node[y] <- cur.nod
cur.nod <- cur.nod + 1L
edge.length[k] <- x$height[i] - x$height[y]
}
else {
edge[k, 2] <- -y
edge.length[k] <- x$height[i]
}
}
j <- j + 2L
}
if (is.null(x$labels))
x$labels <- as.character(1:(N + 1))
node.lab <- nodenames[order(node)] #Here we define our node labels
obj <- list(edge = edge, edge.length = edge.length/2, tip.label = x$labels,
Nnode = N, node.label = node.lab) #And you put them in the final object
class(obj) <- "phylo"
reorder(obj)
}
In the end, here is how you would use this new function in your case study:
bootstraps <- (round(y$edges,2)*100)[,1:2]
yy<-as.phylo.hclust.with.nodenames(y$hclust, nodenames=bootstraps[,2])
write.tree(yy,tree.names=TRUE,digits=2)
[1] "((x5:0.27,x8:0.27)100:0.24,((x7:0.25,(x4:0.14,(x1:0.13,x3:0.13)61:0.014)99:0.11)100:0.23,(x2:0.46,x6:0.46)56:0.022)61:0.027)100;"
#See the bootstraps ^^^ here for instance
plot(yy,show.node.label=TRUE) #To show that the order is correct
plot(y) #To compare with (here I used the yellow value)

How to debug "invalid subscript type 'list'" error in R (genalg package)

I am new to genetic algorithms and am trying a simple variable selection code based on the example on genalg package's documentation:
data(iris)
library(MASS)
X <- cbind(scale(iris[,1:4]), matrix(rnorm(36*150), 150, 36))
Y <- iris[,5]
iris.evaluate <- function(indices) {
result = 1
if (sum(indices) > 2) {
huhn <- lda(X[,indices==1], Y, CV=TRUE)$posterior
result = sum(Y != dimnames(huhn)[[2]][apply(huhn, 1,
function(x)
which(x == max(x)))]) / length(Y)
}
result
}
monitor <- function(obj) {
minEval = min(obj$evaluations);
plot(obj, type="hist");
}
woppa <- rbga.bin(size=40, mutationChance=0.05, zeroToOneRatio=10,
evalFunc=iris.evaluate, verbose=TRUE, monitorFunc=monitor)
The code works just fine on its own, but when I try to apply my dataset (here), I get the following error:
X <- reducedScaledTrain[,-c(541,542)]
Y <- reducedScaledTrain[,542]
ga <- rbga.bin(size=540, mutationChance=0.05, zeroToOneRatio=10,
evalFunc=iris.evaluate, verbose=TRUE, monitorFunc=monitor)
Testing the sanity of parameters...
Not showing GA settings...
Starting with random values in the given domains...
Starting iteration 1
Calucating evaluation values... Error in dimnames(huhn)[[2]][apply(huhn, 1, function(x) which(x == max(x)))] :
invalid subscript type 'list'
I am trying to perform feature selection on 540 variables (I've eliminated the variables with 100% correlation) using LDA. I've tried transforming my data into numeric or list, but to no avail. I have also tried entering the line piece by piece, and the 'huhn' line works just fine with my data. Please help, I might be missing something...
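One possible cause, offered as a guess since the data are not shown: apply() returns a list as soon as the function gives results of different lengths, and which(x == max(x)) does exactly that when a row of posteriors contains a tie (or no valid maximum). A sketch with a made-up posterior matrix:

```r
# Made-up posterior probabilities; row 2 is an exact tie
post <- rbind(c(0.7, 0.3),
              c(0.5, 0.5),
              c(0.2, 0.8))
colnames(post) <- c("a", "b")

# Results of length 1, 2 and 1: apply() can only return a list
idx_list <- apply(post, 1, function(x) which(x == max(x)))
is.list(idx_list)

# Using a list as a subscript reproduces the error from the question
err <- tryCatch(colnames(post)[idx_list], error = function(e) e)
conditionMessage(err)

# which.max() returns the first maximum for these rows,
# so the result is a plain integer vector
idx <- apply(post, 1, which.max)
colnames(post)[idx]
```

Whether ties or NaN posteriors actually occur in reducedScaledTrain would need checking, for example by inspecting huhn for a failing set of indices.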

R Custom proxy function in dtw for warping

I have two columns of data, Tm and Ts, and I want to apply the dtw algorithm with a different distance function. The proxy package provides this possibility, but I can't understand why it gives me an error.
I have two vectors of data with the same length:
Tm Ts
301.0607 300.6008
301.3406 300.6515
301.5912 300.7289
301.5777 300.8506
301.5996 301.0158
301.6414 301.2103
301.7181 301.4113
myDTW<-function(x,y)(diff(x,lag=1,difference=1)-diff(y,lag=1,difference=1))^2
pr_DB$set_entry(FUN = myDTW, names = c("test_myDTW", "myDTW"))
Alignment<-dtw(a$Ts,b$Tm,dist.method="test_myDTW",keep.internals=TRUE)
Error in do.call(".External", c(list(CFUN, x, y, pairwise,
if (!is.function(method)) get(method) else method), :
not a scalar return value
diff() changes the length of a vector from n to n-1, but both vectors are shortened in the same way, so I don't think the problem is a mismatch between vectors of different lengths.
Do you have any suggestions?
The error is explicit:
not a scalar return value
Your myDTW doesn't return a scalar. You need to define it as a valid distance function. If you change it to something like:
myDTW <- function(x,y){
res <- (diff(x,lag=1,difference=1)
-diff(y,lag=1,difference=1))^2
sum(res) ## I return the sum of square here
}
It will work. I think you also need to use modify_entry to update the method already stored in the registry.
dat <- read.table(text='Tm Ts
301.0607 300.6008
301.3406 300.6515
301.5912 300.7289
301.5777 300.8506
301.5996 301.0158
301.6414 301.2103
301.7181 301.4113',header=TRUE)
myDTW <- function(x,y){
res <- (diff(x,lag=1,difference=1)
-diff(y,lag=1,difference=1))^2
sum(res)
}
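library(dtw)  ## load dtw first: it attaches proxy, which provides pr_DB
## modify_entry updates the entry already registered with set_entry in the question
pr_DB$modify_entry(FUN = myDTW, names = c("test_myDTW", "myDTW"))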
## I change a and b to dat here
dtw(dat$Ts,dat$Tm,dist.method="test_myDTW",keep.internals=TRUE)
The result is:
DTW alignment object
Alignment size (query x reference): 7 x 7
Call: dtw(x = dat$Ts, y = dat$Tm, dist.method = "test_myDTW", keep.internals = TRUE)
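To see why the original function was rejected, it is enough to compare the lengths of what the two versions return. A base-R sketch using the first three rows of the data above; the names elementwise and scalar are just for illustration.

```r
Tm <- c(301.0607, 301.3406, 301.5912)
Ts <- c(300.6008, 300.6515, 300.7289)

# Original version: one squared difference per pair of consecutive points
elementwise <- (diff(Ts) - diff(Tm))^2
length(elementwise)   # n - 1 values, not a single number

# Corrected version: collapse to one number, as a distance function must
scalar <- sum(elementwise)
length(scalar)
```

Any reduction to a single number would satisfy the check; the sum of squares is simply the choice made in the answer.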

Resources