I copied the function from the web:
# function used to predict Best Subset Selection Regression
predict.regsubsets = function(object, newdata, id, ...) {
  form = as.formula(object$call[[2]])
  mat = model.matrix(form, newdata)
  coefi = coef(object, id = id)
  mat[, names(coefi)] %*% coefi
}
However, when I call the function above from inside another function, I keep getting the following error:
library(leaps)
abc <- function(){
  regfit <- regsubsets(lpsa ~.,data = XTraining, nvmax = 8)
  predict.regsubsets(regfit, data = XTesting, id = 1)
}
abc()
Error in object$call[[2]] : subscript out of bounds
I have already read ?call in R, but it doesn't help me understand what went wrong here. In particular, what is $call[[2]]?
How can I edit the function above so that I don't get an error when I call it inside another function?
The culprit is the line
form = as.formula(object$call[[2]])
This implies that object (which is the variable you pass to the function, in your example regfit) has a member called call, which is a list with at least two elements. [[ ]] is the R operator used to take the elements of a list.
For instance:
> a <- list(1:10, 1:5, letters[15:20])
> a[[2]]
[1] 1 2 3 4 5
> a[[3]]
[1] "o" "p" "q" "r" "s" "t"
However
> a[[5]] # This does not work, as a only has three elements
Error in a[[5]] : subscript out of bounds
You should not check ?call but rather the help for the function that generates object, in your case regsubsets.
As you can see from ?regsubsets, or by using str(regfit), that function does not return an object with a member named call.
To get the formula from a regsubsets object you need to look at the obj member of the summary.
For instance you could use:
sm <- summary(regfit)
sm$obj$call
Your mistake is in the function abc. The argument of predict.regsubsets is called newdata, but you refer to it as data.
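To see why the misnamed argument matters, here is a small sketch. show_args is a made-up helper (not part of leaps) that only shares the signature of predict.regsubsets. An argument called data does not match the formal newdata, so it is swallowed by ... and newdata stays missing:

```r
# Hypothetical helper with the same signature as predict.regsubsets
show_args <- function(object, newdata, id, ...) {
  list(newdata_missing = missing(newdata),
       extra_args = names(list(...)))
}

wrong <- show_args(1, data = 2, id = 1)     # `data` ends up in `...`
right <- show_args(1, newdata = 2, id = 1)  # `newdata` is matched by name

wrong$newdata_missing   # TRUE
wrong$extra_args        # "data"
right$newdata_missing   # FALSE
```

So inside abc the call would presumably need newdata = XTesting rather than data = XTesting.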
The object is probably the output of a previous analysis (check the place you got the code from to see which function produced it). The line form = as.formula(object$call[[2]]) extracts the formula that was used to create the object and stores it in form. The next line uses that formula to build the model matrix of the new data, and the last lines use the matrix and the fitted coefficients to predict the new data.
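As an illustration of what $call[[2]] picks out, here is a sketch that uses lm from base R as a stand-in, because lm is known to store its matched call in the call component. It is only meant to show how a stored call is indexed, not how regsubsets builds its own call:

```r
# lm() stores the matched call, so it is a convenient stand-in here
fit <- lm(mpg ~ wt, data = mtcars)

fit$call            # lm(formula = mpg ~ wt, data = mtcars)
fit$call[[1]]       # the function being called
fit$call[[2]]       # the first argument, i.e. the formula
form <- as.formula(fit$call[[2]])

# A call without arguments has only one element, so there is no [[2]]
no_args <- quote(abc())
length(no_args)     # 1
```

Whenever the stored call has fewer than two elements, indexing it with [[2]] gives the "subscript out of bounds" error.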
I have written a function for GWR maps. When I run the code outside of a function it works well, but when I wrap it in a function I get an error. I was wondering if anyone could help, thank you!
# a = polygon shapefile
# b = dependent variable of the shapefile
# c = explanatory variable 1
# d = explanatory variable 2
GWR_map <- function(a,b,c,d){
  GWRbandwidth <- gwr.sel(a$b ~ a$c+a$d, a,adapt=T)
  gwr.model = gwr(a$b ~ a$c+a$d, data = a, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
  gwr.model
}
GWR_map(OA.Census,"Qualification", "Unemployed", "White_British")
The above code produces the following error:
Error in model.frame.default(formula = a$b ~ a$c + a$d, data = a, drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'a$b'
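You can't use function parameters with the $ operator: a$b looks for a component literally named b, not for the component whose name is stored in b. Try changing your function to use the [[x]] notation instead. It should look like this: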
GWR_map <- function(a,b,c,d){
  GWRbandwidth <- gwr.sel(a[[b]] ~ a[[c]]+a[[d]], a, adapt=TRUE)
  gwr.model = gwr(a[[b]] ~ a[[c]]+a[[d]], data = a, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE)
  gwr.model
}
The R help docs (section 6.2 on lists) explain this difference well:
Additionally, one can also use the names of the list components in double square brackets, i.e., Lst[["name"]] is the same as Lst$name. This is especially useful, when the name of the component to be extracted is stored in another variable as in
x <- "name"; Lst[[x]]
It is very important to distinguish Lst[[1]] from Lst[1]. ‘[[...]]’ is the operator used to select a single element, whereas ‘[...]’ is a general subscripting operator. Thus the former is the first object in the list Lst, and if it is a named list the name is not included. The latter is a sublist of the list Lst consisting of the first entry only. If it is a named list, the names are transferred to the sublist.
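A minimal sketch of the difference, using a small made-up data frame in place of the shapefile (the column names are borrowed from the question, the values are invented). It also shows reformulate from base R, which is one possible way to build a formula from column names held in variables:

```r
# Toy data standing in for the real shapefile attributes
df <- data.frame(Qualification = c(10, 20, 30),
                 Unemployed    = c(1, 2, 3),
                 White_British = c(5, 6, 7))

b <- "Qualification"

df$b      # NULL: `$` looks for a column literally named "b"
df[[b]]   # the Qualification column

# Build the formula from the names instead of from a$b-style terms
form <- reformulate(c("Unemployed", "White_British"), response = b)
form      # Qualification ~ Unemployed + White_British
```

A formula built this way refers to columns by name, so it can be passed together with data = a to modelling functions that accept a formula.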
I have a Seurat single-cell gene expression object, which has slots.
One of the slots is @meta.data, which is a matrix.
I'd like to create a column $orig.ident by assigning it the value of meta$brain.region as a factor. meta is my original table of metadata.
I'm doing this for a bunch of datasets and would like to make it generalizable.
The idea is that the user would only have to enter the name of the original object, and everything from then on would be called accordingly.
User prompt:
> dataset <- "path/to/gw14.RData"
> seurat.obj <- "gw14"
The workspace is then loaded, which includes the Seurat object gw14.
> load(dataset)
> seurat.obj.new <- paste0(seurat.obj, ".", 2)
I don't understand why using get here returns the error below:
> get(seurat.obj.new)@meta.data$orig.ident <- factor(meta$brain.region)
Error in get(seurat.obj.new)@meta.data$orig.ident = factor(meta$brain.region) :
could not find function "get<-"
Whereas using it here works as expected:
> assign(seurat.obj.new, CreateSeuratObject(raw.data = get(seurat.obj)@raw.data,
                                            min.cells = 0, min.genes = 0, project=age))
First, just write a function that assumes you pass in the actual data object and returns an updated data object. For example
my_fun <- function(x) {
  x@meta.data$orig.ident <- factor(meta$brain.region)
  x
}
Then normally you would call it like this
gw14.2 <- my_fun(gw14)
Note that functions in R should take a value and return an updated value. They should not have side effects like creating variables. That should be under the user's control.
If you did want to work with the data objects as strings, you could do
seurat.obj <- "gw14"
seurat.obj.new <- paste0(seurat.obj, ".", 2)
assign(seurat.obj.new, my_fun(get(seurat.obj)))
But this type of behavior is not consistent with how most R functions operate.
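Here is a self-contained sketch of that pattern. The Toy S4 class is a made-up stand-in for a Seurat object (it only has a meta.data slot), and add_ident takes the labels as an argument instead of reading a global meta table. The original attempt fails because an assignment of the form get(name)@slot <- value makes R look for a replacement function called get<-, which does not exist:

```r
library(methods)

# Toy S4 class standing in for a Seurat object (assumption for illustration)
setClass("Toy", representation(meta.data = "data.frame"))

# Take an object, return an updated copy; no side effects
add_ident <- function(x, values) {
  x@meta.data$orig.ident <- factor(values)
  x
}

gw14 <- new("Toy", meta.data = data.frame(cell = 1:3))

old_name <- "gw14"
new_name <- paste0(old_name, ".", 2)

# get() fetches by name, the function updates, assign() stores by name
assign(new_name, add_ident(get(old_name), c("cortex", "thalamus", "cortex")))

levels(get(new_name)@meta.data$orig.ident)
```

The original gw14 object is left untouched; only the newly assigned gw14.2 carries the orig.ident column.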
Normally I wonder where mysterious errors come from but now my question is where a mysterious lack of error comes from.
Let
numbers <- c(1, 2, 3)
frame <- as.data.frame(numbers)
If I type
subset(numbers, )
(so I want to take some subset but forget to specify the subset-argument of the subset function) then R reminds me (as it should):
Error in subset.default(numbers, ) :
argument "subset" is missing, with no default
However when I type
subset(frame,)
(so the same thing with a data.frame instead of a vector), it doesn't give an error but instead just returns the (full) dataframe.
What is going on here? Why don't I get my well deserved error message?
tl;dr: The subset function calls different functions (has different methods) depending on the type of object it is fed. In the example above, subset(numbers, ) uses subset.default while subset(frame, ) uses subset.data.frame.
R has a couple of object-oriented systems built-in. The simplest and most common is called S3. This OO programming style implements what Wickham calls a "generic-function OO." Under this style of OO, an object called a generic function looks at the class of an object and then applies the proper method to the object. If no direct method exists, then there is always a default method available.
To get a better idea of how S3 and the other OO systems work, you might check out the relevant portion of the Advanced R site. The procedure of finding the proper method for an object is referred to as method dispatch. You can read more about this in the help file ?UseMethod.
As noted in the Details section of ?subset, the subset function "is a generic function." This means that subset examines the class of the object in the first argument and then uses method dispatch to apply the appropriate method to the object.
The methods of a generic function are encoded as
< generic function name >.< class name >
and can be found using methods(<generic function name>). For subset, we get
methods(subset)
[1] subset.data.frame subset.default subset.matrix
see '?methods' for accessing help and source code
which indicates that if the object has the data.frame class, then subset calls the subset.data.frame method (function). It is defined as below:
subset.data.frame
function (x, subset, select, drop = FALSE, ...)
{
    r <- if (missing(subset))
        rep_len(TRUE, nrow(x))
    else {
        e <- substitute(subset)
        r <- eval(e, x, parent.frame())
        if (!is.logical(r))
            stop("'subset' must be logical")
        r & !is.na(r)
    }
    vars <- if (missing(select))
        TRUE
    else {
        nl <- as.list(seq_along(x))
        names(nl) <- names(x)
        eval(substitute(select), nl, parent.frame())
    }
    x[r, vars, drop = drop]
}
Note that if the subset argument is missing, the first lines
r <- if (missing(subset))
rep_len(TRUE, nrow(x))
produce a vector of TRUE values with one element per row of the data.frame, and the last line
x[r, vars, drop = drop]
feeds this vector into the row argument which means that if you did not include a subset argument, then the subset function will return all of the rows of the data.frame.
As we can see from the output of the methods call, subset does not have a method for atomic vectors. This means, as your error message
Error in subset.default(numbers, )
shows, that when you apply subset to a vector, R calls the subset.default method, which is defined as
subset.default
function (x, subset, ...)
{
    if (!is.logical(subset))
        stop("'subset' must be logical")
    x[subset & !is.na(subset)]
}
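The subset.default function has no missing() check, so is.logical(subset) tries to evaluate the missing subset argument, and R itself raises the "argument is missing, with no default" error.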
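The same behaviour can be reproduced with a tiny generic of your own. In this sketch, keep and its two methods are made-up names that mirror the structure of subset.default and subset.data.frame; they are not part of base R:

```r
# Hypothetical generic that dispatches on the class of x
keep <- function(x, rows) UseMethod("keep")

# Default method: uses `rows` directly, so a missing `rows` is an error
keep.default <- function(x, rows) {
  if (!is.logical(rows))
    stop("'rows' must be logical")
  x[rows & !is.na(rows)]
}

# data.frame method: checks missing() first and falls back to all rows
keep.data.frame <- function(x, rows) {
  if (missing(rows))
    rows <- rep_len(TRUE, nrow(x))
  x[rows & !is.na(rows), , drop = FALSE]
}

numbers <- c(1, 2, 3)
frame <- as.data.frame(numbers)

all_rows <- keep(frame, )   # dispatches to keep.data.frame, returns every row
vec_try <- tryCatch(keep(numbers, ), error = function(e) e)
inherits(vec_try, "error")  # TRUE: the default method cannot cope
```

The only difference between the two methods is the missing() check, which is exactly what separates subset.data.frame from subset.default.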
I wrapped up the caret functionality to use it with different random splits (variable i) and am currently stuck on a problem: I don't know how to loop over the methods. paste doesn't work for me.
methods <- c("svmLinear","svmRadial")
for (M in methods) {
  for (i in c(1:5)) {
    data_load(act_file = "act.txt",
              inact_file = "inact.txt")
    sets(Rand=i)
    mod_parms(k_folds=5)
    modeling_y_testing(method = M, metric='ROC',tuneLength=10)
    rm(list = ls())
  }
}
The error is the following:
object 'M' not found
In addition: Warning message:
In .local(x, ...) : Variable(s) `' constant. Cannot scale data.
Execution halted
I suppose I need to convert the variable in methods into some sort of character value that modeling_y_testing (a caret-based function) accepts, but I don't know how. Your help is much appreciated.
Your problem is
`rm(list = ls())`
This call removes every object in the workspace, including the variable M. So on the second iteration of the inner loop
`for(i in 1:5){`
we get the error
object `M` not found
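A small sketch that reproduces the problem. The loops are wrapped in functions (broken and fixed are made-up names) so that rm(list = ls()) only clears the function's local variables and not your own workspace, and paste stands in for the modelling call:

```r
# Clears all local variables at the end of each inner iteration
broken <- function() {
  for (M in c("svmLinear", "svmRadial")) {
    for (i in 1:2) {
      paste(M, i)       # needs M
      rm(list = ls())   # deletes M (and i) along with everything else
    }
  }
}

# Same loops without the rm() call
fixed <- function() {
  out <- character(0)
  for (M in c("svmLinear", "svmRadial")) {
    for (i in 1:2) {
      out <- c(out, paste(M, i))
    }
  }
  out
}

outcome <- tryCatch(broken(), error = function(e) "error")
outcome   # "error": M is gone when the inner loop reaches i = 2
fixed()   # all four method/split combinations
```

If some objects really need to be cleaned up between iterations, removing them by name (for example rm(some_object)) leaves the loop variables alone.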
i am struggling with an assignment and i would like your input.
note: this is a homework but when i tried to add the tag it said not to add it..
i don't want the resulting code, just suggestions on how to get this working :)
so, i have a t.test function as such:
my.t.test <- function(x,s1,s2){
  x1 <- x[s1]
  x2 <- x[s2]
  x1 <- as.numeric(x1)
  x2 <- as.numeric(x2)
  t.out <- t.test(x1,x2,alternative="two.sided",var.equal=T)
  out <- as.numeric(t.out$p.value)
  return(out)
}
a matrix of 30 cols x 12k rows called data, and an annotation file called dataAnn containing the col names and data on the columns
the first column of dataAnn contains a list of M (male) or F (female) corresponding to the samples (or cols) in data (which follow the same order as in dataAnn). i have to run a t.test comparing the two groups of samples and get the p values out
when i call
raw.pValue <- apply(data,1,my.t.test,s1=dataAnn[,1]=="M",s2=dataAnn[,1]=="F")
i get the error
Error in t.test(x1, x2, alternative = "two.sided", var.equal = T) :
unused argument(s) (alternative = "two.sided", var.equal = T)
i even tried to use
raw.pValue <- apply(data,1,my.t.test,s1=unlist(data[,1:18]),s2=unlist(data[,19:30]))
to divide the cols i want to compare but in this case i get the error
Error in x[s1] : invalid subscript type 'list'
i have been looking online, and i understand that the second error is caused by an index being a list... but this didn't really clarify it for me...
any input would be appreciated!!
You have overwritten the t.test function with one of your own. Give your own function a different name, such as my.t.test, or call the original explicitly as stats::t.test (this uses the one from the stats namespace). Remember that once you have overwritten a function, you need to rm your version from the workspace before you can use the original without specifying the namespace.
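A short sketch of the masking effect. The redefinition is done inside local() purely so the example does not leave a stray t.test in your workspace; the numbers are invented:

```r
res <- local({
  # A user-defined t.test masks stats::t.test within this scope
  t.test <- function(x, y) "masked version"

  list(
    masked = t.test(1:5, 2:6),   # calls the user-defined function
    real   = stats::t.test(1:5, 2:6,
                           alternative = "two.sided",
                           var.equal = TRUE)$p.value   # calls the original
  )
})

res$masked            # "masked version"
is.numeric(res$real)  # TRUE
```

The user-defined version accepts only x and y, so passing alternative or var.equal to it would give the same kind of "unused argument" error as in the question.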