I am using rpy2 to import a library from CRAN repository called "MatrixEQTL" to run in within Python using importr, here is my attempt:
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
robjects.r('install.packages("MatrixEQTL")')
mtrql = importr('MatrixEQTL')
I am trying to access both class fields as well as class methods but I failed here is what the class looks like when printing into the python shell:
# to view class SlicedData of MatrixEQTL R package
print(mtrql.SlicedData)
Generator for class "SlicedData":
Class fields:
Name: dataEnv nSlices1 rowNameSlices
Class: environment numeric list
Name: columnNames fileDelimiter fileSkipColumns
Class: character character numeric
Name: fileSkipRows fileSliceSize fileOmitCharacters
Class: numeric numeric character
Class Methods:
"Clear", "show#envRefClass", "nSlices", "getRefClass", "export",
"initialize", "CombineInOneSlice", "callSuper", "initFields", "nCols",
"getClass", "RowStandardizeCentered", "import", "SaveFile", "RowReorder",
"setSlice", "getSlice", "RowReorderSimple", "CreateFromMatrix", "nRows",
"ResliceCombined", "LoadFile", "ColumnSubsample", "SetNanRowMean",
"setSliceRaw", "getSliceRaw", "copy", "RowMatrixMultiply", "usingMethods",
"GetAllRowNames", "RowRemoveZeroEps", "field", ".objectParent",
"IsCombined", "untrace", "trace", "Clone", "GetNRowsInSlice",
".objectPackage", "show", "FindRow"
Reference Superclasses:
"envRefClass"
I want to access for instance class field say fileDelimiter and also class Methods say LoadFile but I couldn't, here is also my attempt to access class method LoadFile and the error message that generated:
# If I try to run this without "()"
data = mtrql.SlicedData
load_my_file = data.LoadFile(file)
data = data.LoadFile(file)
AttributeError: 'DocumentedSTFunction' object has no attribute 'LoadFile'
# And that's my attempt for when adding "()" to the class name
data = mtrql.SlicedData
load_my_file = data.LoadFile(file)
data = data.LoadFile(file)
AttributeError: 'RS4' object has no attribute 'LoadFile'
I tried to look for a solution to this issue but I wasn't successful, please help me understand and solve this issue.
Thank you so much in advance.
mtrql.SlicedData is a "Generator", that is a constructor. In R that means a function (that will return an instance of the class), which is why mtrql.SlicedData is an R function with rpy2.
Consider the following example from the R documentation (https://rdrr.io/r/methods/Introduction.html):
import rpy2.robjects as ro
Pos = ro.r("""
setClass("Pos", slots = c(latitude = "numeric", longitude = "numeric",
altitude = "numeric"))
""")
With rpy2, the object Pos is an R function:
>>> type(Pos)
rpy2.robjects.functions.SignatureTranslatedFunction
However that function object has an S3 class in R (there are several OOP systems in R, 2 of them coexisting in standard R unfortunately).
>>> tuple(Pos.rclass)
('classGeneratorFunction',)
A specific print function is implemented for this S3 class, and it will use the attributes package and className for the object (the R function-that-is-in-fact-a-Generator) to display all the info about the clas you are seeing.
Since this is a constructor, to create an instance you can do:
pos = Pos()
or
pos = Pos(latitude=0, longitude=0, altitude=-10)
The instance will have instance attributes (and methods)
>>> tuple(pos.list_attrs())
('latitude', 'longitude', 'altitude', 'class')
>>> tuple(pos.do_slot('altitude'))
(-10,)
With your specific example, you probably want something like:
slidata = mtrql.SlicedData()
slidata.do_slot('LoadFile')(file)
The documentation can provide more information about R S4 classes in rpy2, and how to make wrapper classes in Python that map attributes and methods.
https://rpy2.github.io/doc/v3.3.x/html/robjects_oop.html#s4-objects
Related
I'm following this tutorial that codes a sentiment analysis classifier using BERT with the huggingface library and I'm having a very odd behavior. When trying the BERT model with a sample text I get a string instead of the hidden state. This is the code I'm using:
import transformers
from transformers import BertModel, BertTokenizer
print(transformers.__version__)
PRE_TRAINED_MODEL_NAME = 'bert-base-cased'
PATH_OF_CACHE = "/home/mwon/data-mwon/paperChega/src_classificador/data/hugingface"
tokenizer = BertTokenizer.from_pretrained(PRE_TRAINED_MODEL_NAME,cache_dir = PATH_OF_CACHE)
sample_txt = 'When was I last outside? I am stuck at home for 2 weeks.'
encoding_sample = tokenizer.encode_plus(
sample_txt,
max_length=32,
add_special_tokens=True, # Add '[CLS]' and '[SEP]'
return_token_type_ids=False,
padding=True,
truncation = True,
return_attention_mask=True,
return_tensors='pt', # Return PyTorch tensors
)
bert_model = BertModel.from_pretrained(PRE_TRAINED_MODEL_NAME,cache_dir = PATH_OF_CACHE)
last_hidden_state, pooled_output = bert_model(
encoding_sample['input_ids'],
encoding_sample['attention_mask']
)
print([last_hidden_state,pooled_output])
that outputs:
4.0.0
['last_hidden_state', 'pooler_output']
While the answer from Aakash provides a solution to the problem, it does not explain the issue. Since one of the 3.X releases of the transformers library, the models do not return tuples anymore but specific output objects:
o = bert_model(
encoding_sample['input_ids'],
encoding_sample['attention_mask']
)
print(type(o))
print(o.keys())
Output:
transformers.modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions
odict_keys(['last_hidden_state', 'pooler_output'])
You can return to the previous behavior by adding return_dict=False to get a tuple:
o = bert_model(
encoding_sample['input_ids'],
encoding_sample['attention_mask'],
return_dict=False
)
print(type(o))
Output:
<class 'tuple'>
I do not recommend that, because it is now unambiguous to select a specific part of the output without turning to the documentation as shown in the example below:
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], return_dict=False, output_attentions=True, output_hidden_states=True)
print('I am a tuple with {} elements. You do not know what each element presents without checking the documentation'.format(len(o)))
o = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'], output_attentions=True, output_hidden_states=True)
print('I am a cool object and you can acces my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are; {} '.format(o.keys()))
Output:
I am a tuple with 4 elements. You do not know what each element presents without checking the documentation
I am a cool object and you can acces my elements with o.last_hidden_state, o["last_hidden_state"] or even o[0]. My keys are; odict_keys(['last_hidden_state', 'pooler_output', 'hidden_states', 'attentions'])
I faced the same issue while learning how to implement Bert. I noticed that using
last_hidden_state, pooled_output = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
is the issue. Use:
outputs = bert_model(encoding_sample['input_ids'], encoding_sample['attention_mask'])
and extract the last_hidden state using
output[0]
You can refer to the documentation here which tells you what is returned by the BertModel
I develop some package using rstan::stan(). I make a function whose return value is an S4 class object generated by rstan::stan(). To access estimates comfortablely or to add an information for data, I want to make a new S4 class object which inherits S4 class of rstan::stan()so that there is new slots.
Furthermore, the new S4class object also can be available for any functions in rstan such as rstan::traceplot().
fit <- rstan::stan( model_name=scr, data=data) # This is a fictitious code.
Suppose we get S4 (stanfit) object named fit .
Define an extended class of stanfit
InheritedClass <- setClass("InheritedClass",
# New slots
representation(slotA="character",
slotB="character",
slotC="numeric"),
contains = "stanfit"
)
To create an S4 object of the inherited class using an existing S4 class object ,i.e., fit,
so what I need is only to input values for added new slots, i.e., slotA, slotB, slotC.
Using the following code, we can convert the S4 object for old class to inherited class:
fit2 <- as(fit,"InheritedClass")
Using this we can edit slot like following:
fit2#slotA <- "aaaaaaaaaaaa"
See help(setClass). I believe it would be something like
setClass("classname", slots = c(foo = "numeric", bar = "character"),
contains = "stanfit")
And I'm sure you would have to include rstan in the Imports: line of the DESCRIPTION file in your package in order for it to find the S4 class definition for stanfit.
I can't figure out what I'm missing. Following Situation:
I wrote a R-Package lets call it "pkg_A"
It depends on "Rcpp" and I have a Module loaded and a class set up like :
Rcpp::loadModule("pkg_A_modul", what = "pkg_A_cppClass_A")
cppClass_A <- setRcppClass(
Class = "pkg_A_cppClass_A",
CppClass = "pkg_A_cppClass_A",
module = "pkg_A_modul",
fields = c(
remark = "character"
)
)
and a additional constructor
classA <- function(a,b){
# some stuff
tmpObj <- cppClass_A()
# some more stuff
return(tmpObj)
}
class(classA ) <- "classA "
The only Thing what I export to the NAMESPACE is the classA function/class.
That all works fine and I can build that package without any warnings and even can check it with "--as-cran" flag.
Now I want to build a second package on top of that package, lets call that "pkg_B" Therefore I list "pkg_A" in the DESCRIPTION Depends of "pkg_B"
as well as I load the class with importFrom(pkg_A, classA) in the NAMESPACE of "pkg_B".
Now I want to implement a class
classB <- setRefClass(
"classB",
contains = c("classA"),
fields = c( b = "numeric)
)
But when I now want to build "pkg_B" I get the error:
Error in getClass(what, where = where) : "classA" is not a defined
class
I also tried to use the "pkg_A_cppClass_A" instead or imprt the whole pkg_A or use pkg_A::classA. Nothing changed.
Hope the question is complete enough. If you missing some info let me know.
Thankful for any suggestions!
I'm currently programming an R-script which uses a java .jar that makes use of the java/lang/Vector class, which in this case uses a class in a method that is not native. In java source code:
public static Vector<ClassName> methodname(String param)
I found nothing in the documentation of rJava on how to handle a template class like vector and what to write when using jcall or any other method.
I'm currently trying to do something like this:
v <- .jnew("java/util/Vector")
b <- .jcall(v, returnSig = "Ljava/util/Vector", method = "methodname",param)
but R obviously throws an exception:
method methodname with signature (Ljava/lang/String;)Ljava/util/Vector not found
How do I work the template class into this command? Or for that matter, how do I create a vector of a certain class in the first place? Is this possible?
rJava does not know java generics, there is no syntax that will create a Vector of a given type. You can only create Vectors of Objects.
Why are you sticking with the old .jcall api when you can use the J system, which lets you use java objects much more nicely:
> v <- new( J("java.util.Vector") )
> v$add( 1:10 )
[1] TRUE
> v$size()
[1] 1
# code completion
> v$
v$add( v$getClass() v$removeElement(
v$addAll( v$hashCode() v$removeElementAt(
v$addElement( v$indexOf( v$retainAll(
v$capacity() v$insertElementAt( v$set(
v$clear() v$isEmpty() v$setElementAt(
v$clone() v$iterator() v$setSize(
v$contains( v$lastElement() v$size()
v$containsAll( v$lastIndexOf( v$subList(
v$copyInto( v$listIterator( v$toArray(
v$elementAt( v$listIterator() v$toArray()
v$elements() v$notify() v$toString()
v$ensureCapacity( v$notifyAll() v$trimToSize()
v$equals( v$remove( v$wait(
v$firstElement() v$removeAll( v$wait()
v$get( v$removeAllElements()
I have a function in R that looks somewhat like this:
setMethod('[', signature(x="stack"),definition=function(x,i,j,drop){
new('class', as(x, "SpatialPointsDataFrame")[i,]) })
I use it to get a single element out of a stacked object. For the package I'm building I need a .Rd file to document the function. I stored it as [.Rd but somehow the R CMD check does not see this. It returns:
Undocumented S4 methods: generic '[' and siglist 'MoveStack,ANY,ANY'
The [.Rd file starts with these lines:
\name{[}
\alias{[}
\alias{[,stack,ANY,ANY-method}
\docType{methods}
\title{Returns an object from a stack}
\description{Returning a single object}
\usage{
\S4method{\[}{stack,ANY,ANY}(x,i,y,drop)
}
Any idea how I make R CMD check aware of this file?
If you look at the source code of the sp package, for example SpatialPolygons-class.Rd, the Methods section:
\section{Methods}{
Methods defined with class "SpatialPolygons" in the signature:
\describe{
\item{[}{\code{signature(obj = "SpatialPolygons")}: select subset of (sets of) polygons; NAs are not permitted in the row index}
\item{plot}{\code{signature(x = "SpatialPolygons", y = "missing")}:
plot polygons in SpatialPolygons object}
\item{summary}{\code{signature(object = "SpatialPolygons")}: summarize object}
\item{rbind}{\code{signature(object = "SpatialPolygons")}: rbind-like method}
}
}
method for [ is defined.
Name and class of the file are
\name{SpatialPolygons-class}
\alias{[,SpatialPolygons-method}
If you look at the help page for ?SpatialPolygons you should see
> Methods
>
> Methods defined with class "SpatialPolygons" in the signature:
>
> [ signature(obj = "SpatialPolygons"): select subset of (sets of)
> polygons; NAs are not permitted in the row index
>
So I would venture a guess that if you specify a proper (ASCII named) file name, give it an alias as in the above example, you should be fine.