where do we use the methods __str__ and __repr__ in python3? [duplicate] - python-3.4

I really don't understand where __str__ and __repr__ are used in Python. I get that __str__ returns the string representation of an object, but why would I need that? In what use case? I also read about the usage of __repr__,
but what I don't understand is where I would use them.

__repr__
Called by the repr() built-in function and by string conversions (reverse quotes) to compute the "official" string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).
__str__
Called by the str() built-in function and by the print statement to compute the "informal" string representation of an object.
Use __str__ if you have a class and want informative, informal output whenever the object is used as part of a string. E.g. you can define __str__ methods for Django models, which then get rendered in the Django administration interface. Instead of something like <Model object> you'll get, for example, the first and last name of a person, or the name and date of an event.
__repr__ and __str__ are similar, in fact sometimes equal (Example from BaseSet class in sets.py from the standard library):
    def __repr__(self):
        """Return string representation of a set.
        This looks like 'Set([<list of elements>])'.
        """
        return self._repr()
    # __str__ is the same as __repr__
    __str__ = __repr__

The one place where you use them both a lot is in an interactive session. If you print an object then its __str__ method will get called, whereas if you just use an object by itself then its __repr__ is shown:
>>> from decimal import Decimal
>>> a = Decimal(1.25)
>>> print(a)
1.25 <---- this is from __str__
>>> a
Decimal('1.25') <---- this is from __repr__
The __str__ is intended to be as human-readable as possible, whereas the __repr__ should aim to be something that could be used to recreate the object, although it often won't be exactly how it was created, as in this case.
It's also not unusual for both __str__ and __repr__ to return the same value (certainly for built-in types).
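A minimal Python 3 sketch of that last point, using only built-in types: for an int the two representations coincide, while for a string repr() adds the quotes that would make it a valid literal.

```python
n = 42
s = "hello"

# For numbers, str() and repr() return the same text.
print(str(n), repr(n))   # 42 42

# For strings they differ: repr() includes the quotes.
print(str(s))            # hello
print(repr(s))           # 'hello'
```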

Building on the previous answers and showing some more examples. If used properly, the difference between str and repr is clear. In short, repr should return a string that can be copy-pasted to rebuild the exact state of the object, whereas str is useful for logging and observing debugging results. Here are some examples to see the different outputs for some known libraries.
Datetime
import datetime
print(repr(datetime.datetime.now())) # datetime.datetime(2017, 12, 12, 18, 49, 27, 134411)
print(str(datetime.datetime.now())) # 2017-12-12 18:49:27.134452
The str is good to print into a log file, whereas the repr can be repurposed if you want to run it directly or dump it as commands into a file.
x = datetime.datetime(2017, 12, 12, 18, 49, 27, 134411)
Numpy
import numpy as np
print(repr(np.array([1,2,3,4,5]))) # array([1, 2, 3, 4, 5])
print(str(np.array([1,2,3,4,5]))) # [1 2 3 4 5]
In NumPy the repr is again directly consumable.
Custom Vector3 example
class Vector3(object):
    def __init__(self, args):
        self.x = args[0]
        self.y = args[1]
        self.z = args[2]
    def __str__(self):
        return "x: {0}, y: {1}, z: {2}".format(self.x, self.y, self.z)
    def __repr__(self):
        return "Vector3([{0},{1},{2}])".format(self.x, self.y, self.z)
In this example, repr returns again a string that can be directly consumed/executed, whereas str is more useful as a debug output.
v = Vector3([1,2,3])
print(str(v)) # x: 1, y: 2, z: 3
print(repr(v)) # Vector3([1,2,3])
One thing to keep in mind: if __str__ isn't defined but __repr__ is, str() will automatically fall back to __repr__. So it's always good to define at least __repr__.
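The fallback only works in one direction, which a short sketch can show. The class names OnlyRepr and OnlyStr are made up for illustration.

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"

class OnlyStr:
    def __str__(self):
        return "only str"

a = OnlyRepr()
b = OnlyStr()

# str() falls back to __repr__ when __str__ is missing...
print(str(a))   # OnlyRepr()

# ...but repr() never falls back to __str__; it uses the default
# <... object at 0x...> form inherited from object.
print(str(b))   # only str
print(repr(b).startswith("<"))   # True
```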

Grasshopper, when in doubt go to the mountain and read the Ancient Texts. In them you will find that __repr__() should:
If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value.
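As a rough sketch of what "could be used to recreate an object" means in practice, the hypothetical Point class below returns a repr that eval() can turn back into an equal object. This is an illustration of the guideline, not something every class can or must satisfy.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Use !r so that nested values are shown with their own repr.
        return f"Point({self.x!r}, {self.y!r})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

p = Point(1, 2)
clone = eval(repr(p))   # evaluates the text "Point(1, 2)"
print(repr(p))          # Point(1, 2)
print(clone == p)       # True
```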

Let's have a class without a __str__ function.
class Employee:
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
emp1 = Employee('Ivan', 'Smith', 90000)
print(emp1)
When we print this instance of the class, emp1, this is what we get:
<__main__.Employee object at 0x7ff6fc0a0e48>
This is not very helpful, and certainly this is not what we want printed if we are using it to display (like in html)
So now, the same class, but with a __str__ function:
class Employee:
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
    def __str__(self):
        return f"The employee {self.first} {self.last} earns {self.pay}."
        # you can edit this and use any attributes of the class
emp2 = Employee('John', 'Williams', 90000)
print(emp2)
Now instead of printing that there is an object, we get what we specified with the return of the __str__ function:
The employee John Williams earns 90000.

str gives an informal, readable format, whereas repr gives the official object representation.
class Complex:
    # Constructor
    def __init__(self, real, imag):
        self.real = real
        self.imag = imag
    # "official" string representation of an object
    def __repr__(self):
        return 'Complex(%s, %s)' % (self.real, self.imag)
    # "informal" string representation of an object (readable)
    def __str__(self):
        return '%s + i%s' % (self.real, self.imag)
t = Complex(10, 20)
print(t) # the usual way to print the object; uses __str__
print(str(t)) # str representation of the object
print(repr(t)) # repr representation of the object
Output:
10 + i20 # usual representation (from __str__)
10 + i20 # str representation
Complex(10, 20) # repr representation

__str__ and __repr__ are both ways to represent an object as a string. You can define them when you are writing a class.
class Fraction:
    def __init__(self, n, d):
        self.n = n
        self.d = d
    def __repr__(self):
        return "{}/{}".format(self.n, self.d)
For example, when I print an instance of it, the returned string is shown.
print(Fraction(1, 2))
results in
1/2
while
class Fraction:
    def __init__(self, n, d):
        self.n = n
        self.d = d
    def __str__(self):
        return "{}/{}".format(self.n, self.d)
print(Fraction(1, 2))
also results in
1/2
But what if you write both of them? Which one does Python use?
class Fraction:
    def __init__(self, n, d):
        self.n = n
        self.d = d
    def __str__(self):
        return "str"
    def __repr__(self):
        return "repr"
print(Fraction(None, None))
This results in
str
So print() uses the __str__ method, not the __repr__ method, when both are defined. repr() and the interactive prompt still use __repr__.
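To see which method is picked in a few more situations, here is a small sketch reusing the Fraction idea from above, with a more conventional __repr__ chosen for illustration. It shows that formatting goes through __str__ by default, while containers and the !r conversion go through __repr__.

```python
class Fraction:
    def __init__(self, n, d):
        self.n = n
        self.d = d

    def __str__(self):
        return f"{self.n}/{self.d}"

    def __repr__(self):
        return f"Fraction({self.n}, {self.d})"

half = Fraction(1, 2)

print(half)              # 1/2             (print uses __str__)
print(f"{half}")         # 1/2             (f-strings use __str__ by default)
print(f"{half!r}")       # Fraction(1, 2)  (!r forces __repr__)
print([half])            # [Fraction(1, 2)] (containers show element reprs)
```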

Suppose you have a class and wish to inspect an instance. You'll see that print doesn't give much useful information:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
p1 = Person("John", 36)
print(p1) # <__main__.Person object at 0x7f9060250410>
Now see a class with __str__: print shows the instance information, and with __repr__ you don't even need print in an interactive session. Nice, no?
class Animal:
    def __init__(self, color, age, breed):
        self.color = color
        self.age = age
        self.breed = breed
    def __str__(self):
        return f"{self.color} {self.breed} of age {self.age}"
    def __repr__(self):
        return f"repr : {self.color} {self.breed} of age {self.age}"
a1 = Animal("Red", 36, "Dog")
a1 # repr : Red Dog of age 36
print(a1) # Red Dog of age 36

Related

Airflow 2 loosely coupling @task return values to receiving @task?

I'm trying to write two tasks that have no knowledge of the other. One task returns a dict (via XComArg) and I want to pass a single property of that object to the next task. If I pass the entire XComArg object, its value is populated as expected. But selecting a single property results in a None.
@dag(...)
def _dag():
    @task
    def A(**ctx):
        # ...
        return {'a': 42, 'b': 'B', 'c': 'C'}
    @task
    def B(a, _res, **ctx):
        print('A', a) # >>> A None
        print('RES', _res) # >>> RES {'a': 42, ...}
    res = A()
    B(res['a'], res)
dag = _dag()
Ideally, B doesn't know where the value for a comes from, nor how to get it. Yes, passing all of res and having B extract what it needs with res['a'] works, but my goal is loose coupling.
See example in https://airflow.apache.org/docs/apache-airflow/stable/tutorial_taskflow_api.html
You need to specify multiple_outputs=True in the @task decorator of task A, i.e. @task(multiple_outputs=True).

Avoid `self.` and variable boilerplate in classes

Question
What is the proper way to wrap data in Python? Is the __init__ and self. boilerplate needed?
As a class
Sometimes I wish to create classes that represent data, but don't have complex computational mechanics inside. For example:
class DataClass:
    def __init__(self, a, b, c):
        self.a = b
        self.b = b
        self.c = c
    def sum(self):
        return self.a + self.b + self.c
dc = DataClass(1,2,3)
print(dc.sum())
I dislike this code: the mathematics of a + b + c is hard to read due to the self. and the constructor is boilerplate. Moreover, I would create this class whenever I noticed that a tuple or dict around a, b and c is getting messy. In particular, tuples are ugly when the returned values change in amount or order, as happened with cv2.findContours between OpenCV versions 3 and 4 (returning 3 and 2 values, respectively). Lastly, I often copy-paste boilerplate stuff, easily allowing for painful mistakes, as is the case here with self.a = b.
As a function
Therefore I often do something like this:
def DataFunc(a, b, c):
    class Result:
        def sum(self):
            return a + b + c
    return Result()
df = DataFunc(1,2,3)
print(df.sum())
It works and it is significantly more transparent (imho), but also a bit weird w.r.t. types, i.e.
df2 = DataFunc(1,2,3)
print(type(df) == type(df2)) # -> False
Moreover, the data members are not accessible (df.a, df.b, etc).
As a class with a decorator
Finally, one can create a decorator to add the constructor:
def add_constructor(*vars):
    def f(class_):
        def init(self, *args, **kwargs):
            unassigned = set(vars[len(args):]) - set(kwargs.keys())
            if len(unassigned) > 0:
                raise ValueError(f"Missing argument(s): {' '.join(unassigned)}")
            for name, value in zip(vars, args):
                setattr(self, name, value)
            for name, value in kwargs.items():
                setattr(self, name, value)
        setattr(class_, '__init__', init)
        return class_
    return f
@add_constructor('a', 'b', 'c')
class DataAnnot:
    def sum(self):
        return self.a + self.b + self.c
da = DataAnnot(1,2,c=3)
print(da.sum())
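For comparison, the standard library's dataclasses module (Python 3.7+) generates a constructor in much the same spirit as the decorator above. The sketch below is one possible way to write the same example with it; the class name DataRecord is made up for illustration, and this does not remove the self. prefix inside methods.

```python
from dataclasses import dataclass

@dataclass
class DataRecord:  # hypothetical name, mirrors DataAnnot above
    a: int
    b: int
    c: int

    def sum(self):
        return self.a + self.b + self.c

dr = DataRecord(1, 2, c=3)
print(dr.sum())                   # 6
print(dr == DataRecord(1, 2, 3))  # True: __eq__ is generated as well
```

The generated __init__ assigns each field by name, so the copy-paste mistake self.a = b cannot occur, and the fields stay accessible as dr.a, dr.b and dr.c.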

Pyparsing: ParseAction not called

With a simple grammar, I have the problem that one of my parse actions is not called.
This seems strange to me, as the parse actions of a base symbol ("logic_oper") and a derived symbol ("cmd_line") are called correctly. Only "pa_logic_cmd" is not called. You can see this in the output, which is included at the end of the code.
As there is no exception on parsing the input string, I am assuming that the grammar is (basically) correct.
import io, sys
import pyparsing as pp
def diag(msg, t):
    print("%s: %s" % (msg , str(t)) )
def pa_logic_oper(t): diag('logic_oper', t)
def pa_operand(t): diag('operand', t)
def pa_ident(t): diag('ident', t)
def pa_logic_cmd(t): diag('>>>>>> logic_cmd', t)
def pa_cmd_line(t): diag('cmd_line', t)
def make_grammar():
    semi = pp.Literal(';')
    ident = pp.Word(pp.alphas, pp.alphanums).setParseAction(pa_ident)
    operand = (ident).setParseAction(pa_operand)
    op_and = pp.Keyword('A')
    op_or = pp.Keyword('O')
    logic_oper = (( op_and | op_or) + pp.Optional(operand))
    logic_oper.setParseAction(pa_logic_oper)
    logic_cmd = logic_oper + pp.Suppress(semi)
    logic_cmd.setParseAction(pa_logic_cmd)
    cmd_line = (logic_cmd)
    cmd_line.setParseAction(pa_cmd_line)
    grammar = pp.OneOrMore(cmd_line) + pp.StringEnd()
    return grammar
if __name__ == "__main__":
    inp_str = '''
A param1;
O param2;
A ;
'''
    grammar = make_grammar()
    print( "pp-version:" + pp.__version__)
    parse_res = grammar.parseString( inp_str )
'''USAGE/Output: python test_4.py
pp-version:2.0.3
operand: ['param1']
logic_oper: ['A', 'param1']
cmd_line: ['A', 'param1']
operand: ['param2']
logic_oper: ['O', 'param2']
cmd_line: ['O', 'param2']
logic_oper: ['A']
cmd_line: ['A']
'''
Can anybody give me a hint on this parseAction problem?
The problem is here:
cmd_line = (logic_cmd)
cmd_line.setParseAction(pa_cmd_line)
The first line assigns cmd_line to be the same expression as logic_cmd. You can verify by adding this line:
print("???", cmd_line is logic_cmd)
Then the second line calls setParseAction, which overwrites the parse action of logic_cmd, so the pa_logic_cmd will never get called.
Remove the second line, since you are already testing the calling of the parse action with pa_logic_cmd. You could change to using the addParseAction method instead, but to my mind that is an invalid test (adding 2 parse actions to the same pyparsing expression object).
Or, change the definition of cmd_line to:
cmd_line = pp.Group(logic_cmd)
Now you will have wrapped logic_cmd inside another expression, and you can then independently set and test the running of parse actions on the two different expressions.
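The aliasing described in this answer is plain Python behaviour and can be reproduced without pyparsing. The sketch below uses two hypothetical stand-in classes, Expr and Group, with a set_action method; they are assumptions made for illustration and are not pyparsing's API.

```python
class Expr:
    """Hypothetical stand-in for a parser element (not pyparsing)."""
    def __init__(self, name):
        self.name = name
        self.action = None

    def set_action(self, fn):
        self.action = fn   # replaces any earlier action
        return self

class Group(Expr):
    """Hypothetical wrapper: a distinct object holding another expression."""
    def __init__(self, inner):
        super().__init__("group")
        self.inner = inner

def first():
    return "first"

def second():
    return "second"

# Parentheses alone do not copy: both names refer to one object.
logic_cmd = Expr("logic_cmd").set_action(first)
cmd_line = (logic_cmd)
cmd_line.set_action(second)
print(cmd_line is logic_cmd)   # True
print(logic_cmd.action())      # second (the first action was overwritten)

# Wrapping creates a second object, so each keeps its own action.
inner = Expr("logic_cmd").set_action(first)
wrapped = Group(inner).set_action(second)
print(inner.action(), wrapped.action())   # first second
```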

rpy2 to subset RS4 object (expressionSet)

I'm building an ExpressionSet class using rpy2, following the relevant tutorial as a guide. One of the most common things I do with the Eset object is subsetting, which in native R is as straightforward as
eset2<-eset1[1:10,1:5] # first ten features, first five samples
which returns a new ExpressionSet object with subsets of both the expression and phenotype data, using the given indices. Rpy2's RS4 object doesn't seem to allow direct subsetting, or have rx/rx2 attributes unlike e.g. RS3 vectors. I tried, with ~50% success, adding a '_subset' function (below) that creates subsets of these two datasets separately and assigns them back to Eset, but is there a more straightforward way that I'm missing?
from rpy2 import (robjects, rinterface)
from rpy2.robjects import (r, pandas2ri, Formula)
from rpy2.robjects.packages import (importr,)
from rpy2.robjects.methods import (RS4,)
class ExpressionSet(RS4):
    # funcs to get the attributes
    def _assay_get(self): # returns an environment, use ['exprs'] key to access
        return self.slots["assayData"]
    def _pdata_get(self): # returns an RS4 object, use .slots("data") to access
        return self.slots["phenoData"]
    def _feats_get(self): # returns an RS4 object, use .slots("data") to access
        return self.slots["featureData"]
    def _annot_get(self): # slots returns a tuple, just pick 1st (only) element
        return self.slots["annotation"][0]
    def _class_get(self): # slots returns a tuple, just pick 1st (only) element
        return self.slots["class"][0]
    # funcs to set the attributes
    def _assay_set(self, value):
        self.slots["assayData"] = value
    def _pdata_set(self, value):
        self.slots["phenoData"] = value
    def _feats_set(self,value):
        self.slots["featureData"] = value
    def _annot_set(self, value):
        self.slots["annotation"] = value
    def _class_set(self, value):
        self.slots["class"] = value
    # funcs to work with the above to get/set the data
    def _exprs_get(self):
        return self.assay["exprs"]
    def _pheno_get(self):
        pdata = self.pData
        return pdata.slots["data"]
    def _exprs_set(self, value):
        assay = self.assay
        assay["exprs"] = value
    def _pheno_set(self, value):
        pdata = self.pData
        pdata.slots["data"] = value
    assay = property(_assay_get, _assay_set, None, "R attribute 'assayData'")
    pData = property(_pdata_get, _pdata_set, None, "R attribute 'phenoData'")
    fData = property(_feats_get, _feats_set, None, "R attribute 'featureData'")
    annot = property(_annot_get, _annot_set, None, "R attribute 'annotation'")
    exprs = property(_exprs_get, _exprs_set, None, "R attribute 'exprs'")
    pheno = property(_pheno_get, _pheno_set, None, "R attribute 'pheno'")
    def _subset(self, features=None, samples=None):
        features = features if features else self.exprs.rownames
        samples = samples if samples else self.exprs.colnames
        fx = robjects.BoolVector([f in features for f in self.exprs.rownames])
        sx = robjects.BoolVector([s in samples for s in self.exprs.colnames])
        self.pheno = self.pheno.rx(sx, self.pheno.colnames)
        self.exprs = self.exprs.rx(fx,sx) # can't assign back to exprs this way
When doing
eset2<-eset1[1:10,1:5]
in R, the R S4 method "[" with the signature ("ExpressionSet") is fetched and run using the parameter values you provided.
The documentation is suggesting the use of getmethod (see http://rpy2.readthedocs.org/en/version_2.7.x/generated_rst/s4class.html#methods ) to facilitate the task of fetching the relevant S4 method, but its behaviour seems to have changed after the documentation was written (resolution of the dispatch through inheritance is no longer done).
The following should do it though:
from rpy2.robjects.packages import importr
methods = importr('methods')
r_subset_expressionset = methods.selectMethod("[", "ExpressionSet")
With thanks to @lgautier's answer, here's a snippet of my above code, modified to allow subsetting of the RS4 object:
from multipledispatch import dispatch
@dispatch(RS4)
def eset_subset(eset, features=None, samples=None):
    """
    subset an RS4 eset object
    """
    features = features if features else eset.exprs.rownames
    samples = samples if samples else eset.exprs.colnames
    fx = robjects.BoolVector([f in features for f in eset.exprs.rownames])
    sx = robjects.BoolVector([s in samples for s in eset.exprs.colnames])
    esub=methods.selectMethod("[", signature="ExpressionSet")(eset, fx,sx)
    return esub

How can I use functools.partial on multiple methods on an object, and freeze parameters out of order?

I find functools.partial to be extremely useful, but I would like to be able to freeze arguments out of order (the argument you want to freeze is not always the first one) and I'd like to be able to apply it to several methods on a class at once, to make a proxy object that has the same methods as the underlying object except with some of its methods parameters being frozen (think of it as generalizing partial to apply to classes). And I'd prefer to do this without editing the original object, just like partial doesn't change its original function.
I've managed to scrape together a version of functools.partial called 'bind' that lets me specify parameters out of order by passing them by keyword argument. That part works:
>>> def foo(x, y):
...     print x, y
...
>>> bar = bind(foo, y=3)
>>> bar(2)
2 3
But my proxy class does not work, and I'm not sure why:
>>> class Foo(object):
...     def bar(self, x, y):
...         print x, y
...
>>> a = Foo()
>>> b = PureProxy(a, bar=bind(Foo.bar, y=3))
>>> b.bar(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bar() takes exactly 3 arguments (2 given)
I'm probably doing this all sorts of wrong because I'm just going by what I've pieced together from random documentation, blogs, and running dir() on all the pieces. Suggestions both on how to make this work and better ways to implement it would be appreciated ;) One detail I'm unsure about is how this should all interact with descriptors. Code follows.
import inspect
from copy import copy
from types import MethodType
class PureProxy(object):
    def __init__(self, underlying, **substitutions):
        self.underlying = underlying
        for name in substitutions:
            subst_attr = substitutions[name]
            if hasattr(subst_attr, "underlying"):
                setattr(self, name, MethodType(subst_attr, self, PureProxy))
    def __getattribute__(self, name):
        return getattr(object.__getattribute__(self, "underlying"), name)
def bind(f, *args, **kwargs):
    """ Lets you freeze arguments of a function be certain values. Unlike
    functools.partial, you can freeze arguments by name, which has the bonus
    of letting you freeze them out of order. args will be treated just like
    partial, but kwargs will properly take into account if you are specifying
    a regular argument by name. """
    argspec = inspect.getargspec(f)
    argdict = copy(kwargs)
    if hasattr(f, "im_func"):
        f = f.im_func
    args_idx = 0
    for arg in argspec.args:
        if args_idx >= len(args):
            break
        argdict[arg] = args[args_idx]
        args_idx += 1
    num_plugged = args_idx
    def new_func(*inner_args, **inner_kwargs):
        args_idx = 0
        for arg in argspec.args[num_plugged:]:
            if arg in argdict:
                continue
            if args_idx >= len(inner_args):
                # We can't raise an error here because some remaining arguments
                # may have been passed in by keyword.
                break
            argdict[arg] = inner_args[args_idx]
            args_idx += 1
        f(**dict(argdict, **inner_kwargs))
    new_func.underlying = f
    return new_func
Update: In case anyone can benefit, here's the final implementation I went with:
import inspect
from copy import copy
from types import MethodType
class PureProxy(object):
    """ Intended usage:
    >>> class Foo(object):
    ...     def bar(self, x, y):
    ...         print x, y
    ...
    >>> a = Foo()
    >>> b = PureProxy(a, bar=FreezeArgs(y=3))
    >>> b.bar(1)
    1 3
    """
    def __init__(self, underlying, **substitutions):
        self.underlying = underlying
        for name in substitutions:
            subst_attr = substitutions[name]
            if isinstance(subst_attr, FreezeArgs):
                underlying_func = getattr(underlying, name)
                new_method_func = bind(underlying_func, *subst_attr.args, **subst_attr.kwargs)
                setattr(self, name, MethodType(new_method_func, self, PureProxy))
    def __getattr__(self, name):
        return getattr(self.underlying, name)
class FreezeArgs(object):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
def bind(f, *args, **kwargs):
    """ Lets you freeze arguments of a function be certain values. Unlike
    functools.partial, you can freeze arguments by name, which has the bonus
    of letting you freeze them out of order. args will be treated just like
    partial, but kwargs will properly take into account if you are specifying
    a regular argument by name. """
    argspec = inspect.getargspec(f)
    argdict = copy(kwargs)
    if hasattr(f, "im_func"):
        f = f.im_func
    args_idx = 0
    for arg in argspec.args:
        if args_idx >= len(args):
            break
        argdict[arg] = args[args_idx]
        args_idx += 1
    num_plugged = args_idx
    def new_func(*inner_args, **inner_kwargs):
        args_idx = 0
        for arg in argspec.args[num_plugged:]:
            if arg in argdict:
                continue
            if args_idx >= len(inner_args):
                # We can't raise an error here because some remaining arguments
                # may have been passed in by keyword.
                break
            argdict[arg] = inner_args[args_idx]
            args_idx += 1
        f(**dict(argdict, **inner_kwargs))
    return new_func
You're "binding too deep": change def __getattribute__(self, name): to def __getattr__(self, name): in class PureProxy. __getattribute__ intercepts every attribute access and so bypasses everything that you've set with setattr(self, name, ... making those setattr bereft of any effect, which obviously's not what you want; __getattr__ is called only for access to attributes not otherwise defined so those setattr calls become "operative" & useful.
In the body of that override, you can and should also change object.__getattribute__(self, "underlying") to self.underlying (since you're not overriding __getattribute__ any more). There are other changes I'd suggest (enumerate in lieu of the low-level logic you're using for counters, etc) but they wouldn't change the semantics.
With the change I suggest, your sample code works (you'll have to keep testing with more subtle cases of course). BTW, the way I debugged this was simply to stick in print statements in the appropriate places (a jurassic-era approach but still my favorite;-).
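For readers on Python 3, where im_func and the three-argument MethodType no longer exist, the __getattr__ point can be sketched with functools.partial applied to bound methods. The Proxy class below is a hypothetical simplification, not the PureProxy from the question: frozen arguments are passed as a dict per method name.

```python
from functools import partial

class Foo:
    def bar(self, x, y):
        return (x, y)

    def baz(self):
        return "baz"

class Proxy:
    """Hypothetical Python 3 sketch of the proxy idea."""
    def __init__(self, underlying, **frozen):
        self.underlying = underlying
        for name, kwargs in frozen.items():
            # A bound method already carries self, so partial only
            # needs to freeze the named arguments.
            setattr(self, name, partial(getattr(underlying, name), **kwargs))

    def __getattr__(self, name):
        # Called only when normal lookup fails, so the attributes set
        # in __init__ take precedence over the underlying object.
        return getattr(self.underlying, name)

b = Proxy(Foo(), bar={"y": 3})
print(b.bar(2))   # (2, 3)  frozen method set on the proxy
print(b.baz())    # baz     delegated through __getattr__
```

Had Proxy overridden __getattribute__ instead, every lookup (including b.bar) would have gone to the underlying object and the frozen version would never be reached.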
