Defining non-standard classes in a Reference Class object - R

Reference Classes only seem to accept the basic/standard object types as field classes. For instance, I want a field that holds a chron object, but this does not let me define it:
> newclass <- setRefClass("newclass",fields=list(time="chron"))
Error in refClassInformation(Class, contains, fields, methods, where) :
class "chron" for field 'time' is not defined
Is this a limitation, or is there a better way? I tried setting it in the initialize method instead, but apparently this is not the way to go either:
> newclass <- setRefClass("newclass",
+ fields=list(time="numeric"),
+ methods=list(initialize=function() time <<- as.chron(time)))
library(chron)
> x <- newclass(time=as.chron("2011-01-01"))
Error in .Object$initialize(...) : unused argument (time = 14975)

I think that you need to register your non-standard class using setOldClass first.
require(chron)
dts <- dates(c("05/20/13", "06/10/13"))
tms <- times(c("19:30:00", "22:30:05"))
setOldClass("chron")
newclass <- setRefClass("newclass",
fields = list(time = "chron"))
mydate <- newclass(time = chron(dates = dts, times = tms))
mydate$time
## [1] (05/20/13 19:30:00) (06/10/13 22:30:05)
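The same registration step can be tried without installing chron. The sketch below is only an illustration: the S3 class "stamp" and the generator Holder are made-up names standing in for "chron" and newclass.

```r
library(methods)

# Hypothetical S3 class standing in for "chron": a number tagged with a class
stamp <- structure(14975, class = "stamp")

# Register the S3 class so the Reference Class machinery knows about it
setOldClass("stamp")

# The field can now be declared with the registered class name
Holder <- setRefClass("Holder", fields = list(time = "stamp"))

h <- Holder$new(time = stamp)
inherits(h$time, "stamp")
```

If the setOldClass call is left out, setRefClass should stop with the same "is not defined" error shown in the question.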


Error in is.single.string(object) : argument "object" is missing, with no default

I want to parse the AAChange.refGene column and then use biomaRt R package to extract information. My code is raising Error in is.single.string(object) : argument "object" is missing, with no default even though the getSequence function is meant to accept multiple arguments.
library(tidyr)
variant_calls = read.delim("variant_calls.txt")
info = tidyr::separate(variant_calls["AAChange.refGene"], AAChange.refGene, c("Refseq ID", "cDNA level change", "Protein level change"), ":")
df = cbind(variant_calls["Gene.refGene"],info)
library(biomaRt)
ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl", host="https://grch37.ensembl.org", path="/biomart/martservice")
pep <- vector()
for(i in 1:length(df$`Refseq ID`)){
temp <- getSequence(id=df$`Refseq ID`[i],type='refseq_mrna',seqType='peptide', mart=ensembl)
temp <- sapply(temp$peptide, nchar)
temp <- sort(temp, decreasing = TRUE)
temp <- names(temp[1])
pep[i] <- temp
}
df$Sequence <- pep
Traceback:
Error in is.single.string(object) :
argument "object" is missing, with no default
I got the same error and found out (using ?getSequence) that it was a naming conflict between packages, specifically biomaRt and seqinr, which both provide a getSequence function. seqinr is used to handle the FASTA format, so the two are probably often loaded together.
My solution consisted in calling the function like this:
biomaRt::getSequence()
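To see how the namespace prefix gets around masking without installing either package, here is a small sketch that uses base R only. The user-defined sd() is an assumption for illustration; it plays the role of the package whose getSequence() shadows the one you want.

```r
# A user-defined sd() masks stats::sd() on the search path,
# much like one attached package masking another's getSequence()
sd <- function(object) stop("the masking sd() was called")

# The unqualified call reaches the mask and fails
masked <- tryCatch(sd(c(1, 2, 3)), error = function(e) conditionMessage(e))

# The namespace prefix bypasses the mask
qualified <- stats::sd(c(1, 2, 3))

# Remove the mask again, then ask which package sd() now comes from
rm(sd)
environmentName(environment(sd))
```

The same check, environmentName(environment(getSequence)), should tell you which package's getSequence is picked up by an unqualified call in your session.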

R - Parallel Processing and ldply error

I am trying to use the below code to make API calls in a parallel process to speed up the API calls. (I know this isn't the best way to speed up API calls but it works)
It only fails when I try to use parallel, otherwise it works. In the ldply function I am getting the below error:
Error in do.ply(i) :
task 1 failed - "object of type 'closure' is not subsettable"
In addition:
Warning messages:
1: : ... may be used in an incorrect context: ‘.fun(piece, ...)’
2: : ... may be used in an incorrect context: ‘.fun(piece, ...)’
Any help would be appreciated!
One <- 26
cl<-makeCluster(4)
registerDoSNOW(cl)
func.time <- Sys.time()
## API CALL ONE FOR "kline"
url <- "https://api.binance.com"
path <- paste("/api/v1/klines?symbol=",pairs[1],"&interval=1m&limit=1", sep = "")
raw.results <- GET(url = url, path = path)
text_content <- content(raw.results, as = "text", encoding = "UTF-8")
kline <- data.frame(text_content %>% fromJSON())
kline$symbol <- pairs[1]
## API FUNCTION TO BE APPLIED FOR REST
loopfunction <- function(i){
url <- "https://api.binance.com"
path <- paste("/api/v1/klines?symbol=",pairs[i],"&interval=1m&limit=1", sep = "")
raw.results <- GET(url = url, path = path)
text_content <- content(raw.results, as = "text", encoding = "UTF-8")
kline_temp <- data.frame(text_content %>% fromJSON())
kline_temp$symbol <- pairs[i]
kline <- rbind(kline,kline_temp)
return(kline)
}
## DPLY PARALLEL FUNCTION
kline2 <- data.frame(ldply(2:(One - 1), .fun = loopfunction, .parallel = T, .paropts = c("httr", "jsonlite", "dplyr"))) ##"ONE" is a list variable created earlier
stopCluster(cl)
func.end.time <- Sys.time()
func.tot.time <- func.end.time - func.time
Your question isn't fully reproducible, so the following is an educated guess.
Your loopfunction() references an object called pairs. It seems from your script that a variable called pairs is defined somewhere in your local environment. However, when loopfunction() is passed to ldply(), it no longer has access to that variable (ordinarily, it would, but parallelization requires fresh R environments to be created). Having failed to find an object called pairs in the environment, R continues searching, and finds a match in graphics::pairs(). This is a plotting function, not a subsettable object like a vector or data frame. Hence the error message, "object of type 'closure' is not subsettable".
I'm not especially familiar with how ldply implements parallel processing, but you could probably modify your function definition like this:
loopfunction <- function(i, pairs) {
...[body of function]...
}
And pass pairs as an extra parameter in your ldply call:
kline2 <- data.frame(ldply(2:(One - 1), .fun = loopfunction, pairs = pairs, .parallel = T, .paropts = list(.packages = c("httr", "jsonlite", "dplyr"))))
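The diagnosis can be checked without a cluster or any API calls. The sketch below is an assumption-laden illustration: the symbols vector and make_path() are hypothetical stand-ins for the asker's pairs data and loopfunction(), and no request is sent.

```r
# Reproduce the error: with no variable called pairs in scope,
# the name resolves to the plotting function graphics::pairs()
msg <- tryCatch(graphics::pairs[1], error = function(e) conditionMessage(e))

# Hypothetical stand-in for the asker's pairs vector
symbols <- c("BTCUSDT", "ETHUSDT")

# Make the data an explicit argument instead of a free variable,
# so the function does not depend on what the worker's environment contains
make_path <- function(i, pairs) {
  paste0("/api/v1/klines?symbol=", pairs[i], "&interval=1m&limit=1")
}

paths <- vapply(seq_along(symbols), make_path, character(1), pairs = symbols)
```

Since plyr hands .paropts to foreach, adding .export = "pairs" to that list may be another route, though I have not tested it with this script.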

S4 Classes for GTFS in R.....help subsetting?

As part of an assignment in college I am trying to make a small R package that provides some basic statistics and graphics on GTFS feeds.
I am using the files from https://github.com/ondrejivanic/131500/blob/master/gtfs.r.
I have to create a number of S4 classes as part of the assignment. I have created a separate class for each GTFS feed file. I am trying to make a list of service IDs to produce a graphic for the number of trips on a given day.
Here I define the class and create an object of it.
# Create the S4 Class for calendar_dates.txt
# calendar_dates.txt - service_id, date, exception_type
setClass("CalendarDates", representation(service_id = "factor", date = "POSIXct", exception_type = "numeric"))
# read calendar_dates from file
calendar_dates <- transform(
read.gtfs.file("calendar_dates.txt", "data"),
date = ymd(date)
)
# create S4 object of calendar dates
calendar_datesS4 <- new("CalendarDates", service_id = calendar_dates$service_id, date = calendar_dates$date, exception_type = calendar_dates$exception_type)
The part I cannot understand is how to perform this subset on an S4 object. The piece below works with a dataframe object:
calendar.dates <- calendar_datesS4
calendar.dates[calendar.dates$date == d & calendar.dates$exception_type == 1, c("service_id")]
[1] "daily_1" "daily_2" "daily_3" "daily_4"
Doing the following results in an error:
calendar.dates[calendar.dates@date == d & calendar.dates@exception_type == 1, c("service_id")]
Error in calendar.dates[dates == d & exceptions == 2, c("service_id")] :
object of type 'S4' is not subsettable
I have not found any questions elsewhere, where a condition must be met for the subset.
I really appreciate any help with this!
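One way a conditional subset like this could be written, as a sketch rather than a definitive answer: each slot is an ordinary vector, so the condition can be built from the slots and then used to index the service_id slot directly. The data below is invented for illustration and only the class definition is taken from the question.

```r
library(methods)

# Class definition as in the question
setClass("CalendarDates",
         representation(service_id = "factor",
                        date = "POSIXct",
                        exception_type = "numeric"))

# Invented example data
cd <- new("CalendarDates",
          service_id = factor(c("daily_1", "daily_2", "weekend_1")),
          date = as.POSIXct(c("2013-05-20", "2013-05-20", "2013-05-21"), tz = "UTC"),
          exception_type = c(1, 1, 2))

d <- as.POSIXct("2013-05-20", tz = "UTC")

# Build the logical condition from the slots, then index the slot of interest
keep <- cd@date == d & cd@exception_type == 1
ids <- as.character(cd@service_id[keep])
```

The object itself still cannot be indexed with [ unless a "[" method is defined for the class; the sketch sidesteps that by indexing the slot vectors.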

R refClass Methods

I am using the R refClass example below.
Person = setRefClass("Person",fields = list(name = "character", age = "numeric")
) ## Person = setRefClass("Person",
Person$methods = list(
increaseAge <- function(howMuch){
age = age + howMuch
}
)
When I store this program in a file called Person.R and source it, it does not show any errors. Now I instantiate a new object.
p = new("Person",name="sachin",age=40)
And I try to invoke the method increaseAge, using p$increaseAge(40), and it shows the following error
Error in envRefInferField(x, what, getClass(class(x)), selfEnv) :
"increaseAge" is not a valid field or method name for reference class "Person"
I cannot figure out why it says that the method increaseAge is not a valid method name when I have defined it.
To specify a method independent of class definition, invoke the methods() function on the generator. Also, use either <<- or .self$age = for the assignment.
Person$methods(increaseAge=function(howMuch) {
age <<- age + howMuch
## alternatively, .self$age = age + howMuch or .self$age <- age + howMuch
})
Remember that R works best on vectors, so think of a Persons class (modeling columns) representing all the individuals in your study, rather than a collection of Person instances (modeling rows).
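A minimal sketch of that column-oriented idea, assuming a class named Persons whose fields hold one element per individual (the second name and both ages are made up):

```r
library(methods)

# One object holds all individuals; each field is a vector (a column)
Persons <- setRefClass("Persons",
  fields = list(name = "character", age = "numeric"),
  methods = list(
    increaseAge = function(howMuch) {
      # <<- modifies the field; the arithmetic is vectorized over everyone
      age <<- age + howMuch
      invisible(.self)
    }
  )
)

ps <- Persons$new(name = c("sachin", "rahul"), age = c(40, 38))
ps$increaseAge(5)
ps$age
```

A single method call updates every individual at once, with no loop over separate Person objects.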
I get an error using your code. I would do something like this:
Person = setRefClass("Person",
fields = list(name = "character", age = "numeric"),
methods = list(
increaseAge = function(howMuch) age <<- age + howMuch
))
> p = new("Person",name="sachin",age=40)
> p$increaseAge(5)
> p$age
[1] 45

Why am I getting the message "node stack overflow" when the superclass is "VIRTUAL"?

I am getting the message
Error in parent.frame() : node stack overflow
Error during wrapup: node stack overflow
when I try to construct an object using the S4 command "as", but only when a superclass is declared "VIRTUAL".
The class hierarchy is as follows:
PivotBasic contains Pivot contains Model
The setClass commands for Pivot and PivotBasic and the constructor for PivotBasic are below. Class Pivot does not have a constructor. The Model constructor is too big to insert here.
This is really not a big deal (I think) because everything works fine if the "VIRTUAL" keyword is removed from the representation argument of setClass. But I am curious about the reason for the problem. Would anyone have insights on it?
Thanks,
Fernando Saldanha
setClass(Class = "Pivot",
representation = representation(
pivotName = "character",
pivotNames = "character",
pivotData = "data.frame",
"VIRTUAL"
),
contains = "Model"
)
setClass(Class = "PivotBasic",
representation = representation(),
contains = "Pivot"
)
pivotBasic <- function(
portfolio,
assets,
controlVariableList,
pivotData = NULL, # pivotName is ignored if pivotData is not null
pivotName = "N_WEEKDAY_3_6",
firstPredictionDate = as.Date(integer(), origin = "1970-01-01"),
name = NULL,
tags = "Event"
) {
if (missing(portfolio)) stop("[PivotBasic: pivotBasic] - Missing portfolio argument")
if (missing(assets)) stop("[PivotBasic: pivotBasic] - Missing assets argument")
if (missing(controlVariableList)) stop("[PivotBasic: pivotBasic] - Missing controlVariableList argument")
object <- model(
portfolio,
assets,
controlVariableList,
firstPredictionDate,
name,
tags)
# The error message happens when this command is executed
mdl <- as(object, "PivotBasic")
# Other code
mdl
} # end pivotBasic
Is this a minimal example that illustrates your problem?
.Model <- setClass(Class = "Model",
representation=representation(x="integer")
)
setClass(Class = "Pivot",
representation = representation("VIRTUAL"),
contains = "Model"
)
.PivotBasic <- setClass(Class = "PivotBasic",
contains = "Pivot"
)
This generates an error
> as(.Model(), "PivotBasic")
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> R.version.string
[1] "R version 3.0.0 Patched (2013-04-15 r62590)"
but might generate an error like you see under an earlier version of R. This thread on the R-devel mailing list is relevant, where a solution is to define a setIs method such as
setIs("PivotBasic", "Model",
coerce = function(from) .PivotBasic(x = from@x),
replace = function(from, value) {
from@x = value@x
from
}
)
I think of setIs as part of the class definition. If there are many slots needing copying, then a further work-around might be, in the replace function,
nms <- intersect(slotNames(value), slotNames(from))
for (nm in nms)
slot(from, nm) <- slot(value, nm)
from
but the underlying issue is really in S4's implementation. A cost to removing the "VIRTUAL" specification is that it compromises your class design, and presumably the formalism of the S4 system is what motivated your choice in the first place; maybe that's not such a bad cost when faced with the alternatives.
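The slot-copying loop can be tried on its own, outside any setIs call. In the sketch below the helper name copy_shared_slots and the class Report are hypothetical, and the two classes are deliberately unrelated by inheritance so that the VIRTUAL problem does not come into play.

```r
library(methods)

# Two illustrative classes that share some slot names
setClass("Model", representation(x = "integer", label = "character"))
setClass("Report", representation(x = "integer", label = "character",
                                  note = "character"))

# Copy every slot that exists in both objects from value into from
copy_shared_slots <- function(from, value) {
  nms <- intersect(slotNames(value), slotNames(from))
  for (nm in nms)
    slot(from, nm) <- slot(value, nm)
  from
}

filled <- copy_shared_slots(new("Report", note = "n"),
                            new("Model", x = 1L, label = "m"))
```

Slots that exist only in the target, such as note here, are left as they were.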
