Cannot inspect S4 object after modification - r

I am having problems with my S4 object res after I appended a list of values to it. The object was created with the DESeq2 package via:
dds <- DESeqDataSetFromMatrix(countData = count.matrix,
colData = coldata,
design = ~ Condition)
dds <- DESeq(dds, test = "Wald")
res <- results(dds)
I did the following:
x <- qvalue(res@listData[["pvalue"]]) #calc qvalues based on pvalues from S4 object 'res'
res@listData[["qval"]] <- x[["qvalues"]] #append qvalues from x to 'res' as new col named "qval"
Now when I try to inspect the object with head() I get the following error:
> head(res)
Error in `rownames<-`(`*tmp*`, value = names(x)) :
invalid rownames length
The funny thing is that with View() I can inspect the S4 object in RStudio and I can see that adding the qvalues went fine. Does anyone know why this happens? Is there a way to avoid it?

To get the q-values, you can do this first:
library(qvalue)
library(DESeq2)
dds = makeExampleDESeqDataSet()
dds = DESeq(dds)
res = results(dds)
res$qvalue = qvalue(res$pvalue)$qvalue
I will follow up on why the error occurs; you need to look into how the object is constructed.
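One plausible mechanism, which this thread does not confirm, is that writing to a slot with @ skips the class's consistency checks, whereas an accessor such as $<- keeps the object's parts in sync. The sketch below uses a made-up S4 class (ToyFrame and add_column are hypothetical and are not DESeq2's or S4Vectors' actual classes or functions) to show how direct slot assignment can leave an object in a state that later methods reject:

```r
library(methods)

# Hypothetical toy class, NOT the real DESeqResults: a list of columns
# plus one metadata entry per column.
setClass("ToyFrame",
  slots = c(listData = "list", colMeta = "character"),
  validity = function(object) {
    if (length(object@listData) != length(object@colMeta))
      "colMeta must have one entry per column in listData"
    else TRUE
  })

# Accessor-style setter that updates both slots and re-validates.
add_column <- function(obj, name, values, meta = "") {
  obj@listData[[name]] <- values
  obj@colMeta <- c(obj@colMeta, meta)
  validObject(obj)
  obj
}

# Helper: TRUE if the object passes its validity check, FALSE otherwise.
is_valid <- function(obj) {
  tryCatch({ validObject(obj); TRUE }, error = function(e) FALSE)
}

tf <- new("ToyFrame", listData = list(pvalue = c(0.01, 0.2)), colMeta = "p-value")

# Direct slot assignment does not run the validity function,
# so the object silently becomes inconsistent.
bad <- tf
bad@listData[["qval"]] <- c(0.02, 0.2)

# Going through the setter keeps the slots consistent.
good <- add_column(tf, "qval", c(0.02, 0.2), meta = "q-value")

is_valid(bad)
is_valid(good)
```

If something similar applies to res, calling validObject(res) after the direct slot assignment may show which part of the object is out of sync.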

Related

passing arguments to igraph function within sapply

I am new to network analyses and ERGM models and have a list of lists which has 100 Erdos Renyi Models as content and was created with the code below
set.seed(666)
gs <- list()
for (x in seq_len(100L)) {
gs[[x]] <- erdos.renyi.game(374, 0.0084, type = "gnp", directed = F)
E(gs[[x]])$weight <- sample(1:5, ecount(gs[[x]]), T)
}
Now I would like to calculate the mean path length between two nodes as well as the average clustering across these 100 models.
For the mean path length I used the following code:
random_mean_paths <- sapply(gs, igraph::mean_distance, 1:100)
mean(random_mean_paths)
However, if I try the same with igraph::transitivity , i.e.
random_mean_clus <- sapply(gs, igraph::transitivity, 1:100)
I get the error
Error in match.arg(arg = arg, choices = choices, several.ok = several.ok) :
'arg' must be of length 1
and when trying to resolve this error by setting type = "global", i.e.
random_mean_clus <- sapply(gs, igraph::transitivity(type = "global"), 1:100)
I get the error argument "graph" is missing with no default
I cannot specify gs in the transitivity() function, since it is not an igraph object and I am stuck trying to pass the correct argument to this function.
Thanks in advance.
Either of
random_mean_clus <- sapply(gs, igraph::transitivity, type = "global", 1:100)
random_mean_clus <- sapply(gs, \(s) igraph::transitivity(s, type = "global", 1:100))
will solve the problem.
The first passes type as a named argument, which sapply forwards to the function; the second defines an anonymous function using the \(s) shorthand introduced in R 4.1.
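The two approaches can be tried without igraph. In this minimal sketch, summarise_vec is a hypothetical stand-in for a function that, like transitivity(), resolves its second argument with match.arg(); the data are made up:

```r
# Hypothetical stand-in for a function whose second argument is a choice string.
summarise_vec <- function(x, type = c("mean", "median"), trim = 0) {
  type <- match.arg(type)
  if (type == "mean") mean(x, trim = trim) else median(x)
}

xs <- list(a = c(1, 2, 3), b = c(10, 20, 60))

# An unnamed extra argument lands in the `type` position and match.arg() rejects it.
positional <- tryCatch(sapply(xs, summarise_vec, 1:2), error = function(e) e)

# Option 1: name the argument; sapply() forwards it through `...`.
via_dots <- sapply(xs, summarise_vec, type = "median")

# Option 2: wrap the call in an anonymous function
# (function(v) is used here so the code also runs on R versions before 4.1).
via_lambda <- sapply(xs, function(v) summarise_vec(v, type = "median"))

identical(via_dots, via_lambda)
```

Naming the argument is shorter; the anonymous function is easier to read once several arguments have to be fixed.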

subset data based on read counts for Seurat object (Error in FetchData)

I'm trying to subset a Seurat object (called dNSC_cells) based on counts of genes of interest. Specifically, I have a list of genes and I plan on looping through them to subset my data and do some Wilcox tests.
What I have so far looks like this:
pro_genes_list <- c("Bcl2", "Bid")
for (p in pro_genes_list) {
median_prog <- median(GetAssayData(dNSC_cells, slot = 'counts')[p,])
with_p <- colnames(subset(dNSC_cells, subset = as.name(p) > median_prog))
However, it blocks at the last line, with this Error:
Error in FetchData(object = object, vars = unique(x = expr.char[vars.use]), :
None of the requested variables were found:
I also tried using
subset = p > median_prog
but it gave the same Error.
Would be super grateful for any pointers!
The subset function does not support a gene symbol stored in a variable; you need to extract the expression values into a data frame first.
for (p in pro_genes_list) {
median_prog <- median(GetAssayData(dNSC_cells, slot = 'counts')[p,])
expr <- FetchData(object = dNSC_cells, vars = p)
with_p <- colnames(dNSC_cells[, which(expr > median_prog)])
}
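The same pattern, looking a column up by the string held in the loop variable and then indexing with the resulting logical vector, can be sketched in base R. The data frame below is made-up toy data standing in for the expression values, not output from Seurat:

```r
# Toy expression table (hypothetical values): rows are cells, columns are genes.
expr <- data.frame(
  Bcl2 = c(0, 3, 5, 1),
  Bid  = c(2, 2, 9, 0),
  row.names = paste0("cell", 1:4)
)

pro_genes_list <- c("Bcl2", "Bid")

# For each gene name, collect the cells whose value exceeds that gene's median.
cells_above_median <- lapply(setNames(pro_genes_list, pro_genes_list), function(p) {
  med <- median(expr[[p]])          # [[p]] uses the string stored in p
  rownames(expr)[expr[[p]] > med]
})

cells_above_median
```

Writing expr[[p]] works because the column is selected by the value of p, whereas an expression such as p > med compares the string itself.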

The function lda() throws an error when passing a subset argument

This error looks common, but I can't seem to get my head around it.
I've been given the following code (on a course, but the code is not graded) as a shortcut to doing LDA. Apparently it works on some computers but not on mine. I've upgraded R and RStudio and also the MASS library. Any ideas?
The error I get is:
Error in eval(expr, envir, enclos) : object 'training' not found
The code is
lda.valid <- function(formula,data,...,train.fraction=0.75){
grouping <- model.response(model.frame(formula,data))
tbl <- table(grouping,lda(formula,data,...,CV=TRUE)$class)
CV <- sum(diag(tbl))/sum(tbl)
n <- nrow(data)
training <- sample(1:n,n*train.fraction)
lda.training <- lda(formula,data,...,subset=training)
lda.pred <- predict(lda.training,data[-training,])
tbl <- table(grouping[-training],lda.pred$class)
VAL <- sum(diag(tbl))/sum(tbl)
c(CV=CV,VAL=VAL)
}
I run the following and get the error. Is it related to the "..." (ellipsis)?
lda.valid(Species~.,data=iris,prior=c(1/3,1/3,1/3),train.fraction=0.5)
I was looking at tryCatch to catch the error, but I don't see how I can print the stack trace.
Any hints or suggestions? I probably don't understand the stack trace at this point.
The error occurs where you call lda.training <- lda(...). This seems to be related to internals of the lda() function, and it's not clear to me why this happens.
However, the intent of this code seems to be to perform the LDA using only a training subset of the data.
This is easy enough to specify directly by subsetting the data in advance. So I suggest replacing the offending line with
lda.training <- lda(formula, data[training, ], ...)
Thus the complete function is:
library(MASS)
lda.valid <- function(formula, data, ..., train.fraction = 0.75){
  grouping <- model.response(model.frame(formula, data))
  tbl <- table(grouping, lda(formula, data, ..., CV = TRUE)$class)
  CV <- sum(diag(tbl))/sum(tbl)
  n <- nrow(data)
  training <- sample(1:n, n*train.fraction)
  lda.training <- lda(formula, data[training, ], ...) # <<<--- Changed
  lda.pred <- predict(lda.training, data[-training, ])
  tbl <- table(grouping[-training], lda.pred$class)
  VAL <- sum(diag(tbl))/sum(tbl)
  c(CV = CV, VAL = VAL)
}
lda.valid(Species~., data = iris, prior = c(1/3, 1/3, 1/3), train.fraction = 0.5)
This results in:
> lda.valid(Species~., data = iris, prior = c(1/3, 1/3, 1/3), train.fraction = 0.5)
CV VAL
0.98 0.96
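One likely explanation, which the answer above does not confirm, is that formula-based fitting functions usually build their data with model.frame(), and model.frame() evaluates subset in the data and then in the formula's environment, not in the function that made the call. Assuming lda's formula method follows this pattern, a formula created at top level cannot see the local variable training. The sketch below reproduces that behaviour with model.frame() alone; the wrapper names are hypothetical:

```r
# `training_rows` exists only inside the wrapper, but the formula is created
# by the caller, so model.frame() looks for the variable in the wrong place.
frame_with_subset <- function(formula, data) {
  training_rows <- 1:10
  model.frame(formula, data, subset = training_rows)
}

subset_error <- tryCatch(
  frame_with_subset(Sepal.Length ~ Species, iris),
  error = function(e) conditionMessage(e)
)

# Subsetting the data before the call avoids the lookup entirely.
frame_presubset <- function(formula, data) {
  training_rows <- 1:10
  model.frame(formula, data[training_rows, ])
}

mf <- frame_presubset(Sepal.Length ~ Species, iris)

subset_error
nrow(mf)
```

This is the same workaround as in the answer: index the data frame yourself instead of passing subset through.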

How to use a for loop with a function needing a string field?

I am using the smbinning R package to compute the variables information value included in my dataset.
The function smbinning() is pretty simple and it has to be used as follows:
result = smbinning(df= dataframe, y= "target_variable", x="characteristic_variable", p = 0.05)
So, df is the dataset you want to analyse, y the target variable and x is the variable of which you want to compute the information value statistics; I enumerate all the characteristic variables as z1, z2, ... z417 to be able to use a for loop to mechanize all the analysis process.
I tried to use the following for loop:
for (i in 1:417) {
result = smbinning(df=DATA, y = "FLAG", x = "DATA[,i]", p=0.05)
}
in order to be able to compute the information value for each variable corresponding to i column of the dataframe.
The class of DATA is "data.frame" while the class of result is "character".
So, my question is how to compute the information value of each variable and store that in the object denominated result?
Thanks! Any help will be appreciated!
No sample data is provided, so I can only hazard a guess that the following will work:
results_list = list()
for (i in 1:417) {
current_var = paste0('z', i)
current_result = smbinning(df=DATA, y = "FLAG", x = current_var, p=0.05)
results_list[[i]] = current_result$iv
}
You could try one of the apply functions, iterating over the z-names. The x value passed to smbinning should be the column name, not the column itself.
results = sapply(paste0("z",1:417), function(foo) {
smbinning(df=DATA, y = "FLAG", x = foo, p=0.05)
})
class(results) # should be "list"
length(results) # should be 417
names(results) # should be z1,...
results[[1]] # should be the first result, so you can also iterate by indexing
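The idea of iterating over column names rather than columns can be checked without smbinning. In this sketch, col_spread is a hypothetical stand-in for any function that expects a column name as a string, and the data frame is made-up toy data:

```r
# Hypothetical stand-in for a function that takes the column NAME as a string.
col_spread <- function(df, x) {
  stopifnot(is.character(x), length(x) == 1L, x %in% names(df))
  diff(range(df[[x]]))
}

# Toy data (made up for illustration).
DATA <- data.frame(
  FLAG = c(0, 1, 0, 1),
  z1   = c(1, 2, 3, 4),
  z2   = c(10, 30, 20, 50)
)

# Build the names "z1", "z2", ... and pass each name, not the column itself.
vars <- paste0("z", 1:2)
spreads <- vapply(vars, function(v) col_spread(DATA, v), numeric(1))

spreads
```

vapply keeps the variable names on the result, so each value can be looked up by its column name afterwards.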
Since you had not provided any data, I tried the following:
> XX=c("IncomeLevel","TOB","RevAccts01")
> res = sapply(XX, function(z) smbinning(df=chileancredit.train,y="FlagGB",x=z,p=0.05))
Warning message:
NAs introduced by coercion
> class(res)
[1] "list"
> names(res)
[1] "IncomeLevel" "TOB" "RevAccts01"
> res$TOB
...
HTH

I do not understand error "object not found" inside the function

I have roughly this function:
plot_pca_models <- function(models, id) {
library(lattice)
splom(models, groups=id)
}
and I'm calling it like this:
plot_pca_models(data.pca, log$id)
which results in this error:
Error in eval(expr, envir, enclos) : object 'id' not found
when I call it without the wrapping function:
splom(data.pca, groups=log$id)
it raises this error:
Error in log$id : object of type 'special' is not subsettable
but when I do this:
id <- log$id
splom(models, groups=id)
it behaves as expected.
Please can anybody explain why it behaves like this and how to correct it? Thanks.
btw:
I'm aware of similar questions here, eg:
Help understand the error in a function I defined in R
Object not found error with ddply inside a function
Object disappears from namespace in function
but none of them helped me.
edit:
As requested, there is full "plot_pca_models" function:
plot_pca_models <- function(data, id, sel=c(1:4), comp=1) {
# 'data' ... princomp objects
# 'id' ... list of samples id (classes)
# 'sel' ... list of models to compare
# 'comp' ... which pca component to compare
library(lattice)
models <- c()
models.size <- 1:length(data)
for(model in models.size) {
models <- c(models, list(data[[model]]$scores[,comp]))
}
names(models) <- 1:length(data)
models <- do.call(cbind, models[sel])
splom(models, groups=id)
}
edit2:
I've managed to make the problem reproducible.
require(lattice)
my.data <- data.frame(pca1 = rnorm(100), pca2 = rnorm(100), pca3 = rnorm(100))
my.id <- data.frame(id = sample(letters[1:4], 100, replace = TRUE))
plot_pca_models2 <- function(x, ajdi) {
splom(x, group = ajdi)
}
plot_pca_models2(x = my.data, ajdi = my.id$id)
which produces the same error as above.
The problem is that splom evaluates its groups argument in a nonstandard way. A quick fix is to rewrite your function so that it constructs the call with the appropriate syntax:
f <- function(data, id)
eval(substitute(splom(data, groups=.id), list(.id=id)))
# test it
ir <- iris[-5]
sp <- iris[, 5]
f(ir, sp)
log is a function in base R. Good practice is to not name objects after functions...it can create confusion. Type log$test into a clean R session and you'll see what's happening:
object of type 'special' is not subsettable
Here's a modification of Hong Oi's answer. First, I would recommend including id in the main data frame, i.e.
my.data <- data.frame(pca1 = rnorm(100), pca2 = rnorm(100), pca3 = rnorm(100), id = sample(letters[1:4], 100, replace = TRUE))
... and then
plot_pca_models2 <- function(x, ajdi) {
Call <- bquote(splom(x, group = x[[.(ajdi)]]))
eval(Call)
}
plot_pca_models2(x = my.data, ajdi = "id")
The cause of the confusion is the following line in lattice:::splom.formula:
groups <- eval(substitute(groups), data, environment(formula))
... whose only point is to be able to specify groups without quotation marks, that is,
# instead of
splom(DATA, groups="ID")
# you can now be much shorter, thanks to eval and substitute:
splom(DATA, groups=ID)
But of course, this makes splom (and other functions, e.g. substitute, which use "nonstandard evaluation") harder to use from within other functions, and goes against the philosophy that is "mostly" followed in the rest of R.
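The effect can be reproduced without lattice. In this minimal sketch, nse_groups is a hypothetical function that, loosely like the splom line quoted above, evaluates its groups argument in the data and then in a fixed environment (the global environment is used here as an assumption; lattice uses the formula's environment). The wrapper names and data are made up:

```r
# Hypothetical NSE function: `groups` is looked up in `data`, then in the
# global environment, but never in the frame of whoever called it.
nse_groups <- function(data, groups) {
  eval(substitute(groups), data, globalenv())
}

df <- data.frame(x = 1:3, ID = c("a", "b", "a"), stringsAsFactors = FALSE)

# At top level an unquoted column name works.
direct <- nse_groups(df, ID)

# Inside a wrapper, the argument name `local_id` is neither a column of
# `data` nor a global variable, so the lookup fails.
naive_wrapper <- function(data, local_id) nse_groups(data, groups = local_id)
naive_result <- tryCatch(naive_wrapper(df, df$ID),
                         error = function(e) conditionMessage(e))

# Fix: splice the VALUE into the call before evaluating it.
safe_wrapper <- function(data, local_id) {
  eval(substitute(nse_groups(data, groups = .id), list(.id = local_id)))
}
safe_result <- safe_wrapper(df, df$ID)

naive_result
safe_result
```

The substitute() call in safe_wrapper is the same trick as in the eval(substitute(...)) answer: by the time the NSE function sees the call, groups already holds the actual values, so no name lookup is needed.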
