Python Attributes and Methods

Keywords: Python, attributes, methods

Copyright © 2005-2009 Shalabh Chaturvedi

All Rights Reserved.

About This Book

Explains the mechanics of object attribute access for new-style Python objects:

how functions become methods

how descriptors and properties work

determining method resolution order

New-style implies Python versions from 2.2 up to and including 3.x. There have been some behavioral changes across these versions, but all the concepts covered here remain valid.

This book is part of a series:

<a href="http://www.cafepy.com/article/python_types_and_objects/" target="_top">Python Types and Objects</a>

Python Attributes and Methods [you are here]

This revision: 

Some points you should note:

This book covers the new-style objects (introduced a long time ago in Python 2.2). Examples are valid for Python 2.5 and all the way to Python 3.x.

This book is not for absolute beginners. It is for people who already know Python (some Python at least), and want to know more.

Happy pythoneering!

What is an attribute? Quite simply, an attribute is a way to get from one object to another. Apply the power of the almighty dot - <code>objectname.attributename</code> - and voila! You now have the handle to another object. You also have the power to create attributes, by assignment: <code>objectname.attributename = anotherobject</code>.

Which object does an attribute access return, though? And where does the object set as an attribute end up? These questions are answered in this chapter.

Example 1.1. Simple attribute access

Attributes can be set on a class.

Or even on an instance of a class.

Both class and instance attributes are accessible from an instance.

Attributes really sit inside a dictionary-like <code>__dict__</code> in the object.

<code>__dict__</code> contains only the user-provided attributes.
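The session these notes describe can be sketched roughly as follows. This is a minimal Python 3 sketch, not the original listing; the names `C`, `cobj` and `classattr` follow later examples in this book, while `instattr` is an assumption made for illustration.

```python
class C:
    classattr = "attr on class"        # attribute set on the class

cobj = C()
cobj.instattr = "attr on instance"     # attribute set on the instance

# Both are accessible from the instance.
print(cobj.instattr)     # found in cobj.__dict__
print(cobj.classattr)    # found in C.__dict__

# Only the user-provided attribute sits in the instance's __dict__.
print(cobj.__dict__)
```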

Ok, I admit 'user-provided attribute' is a term I made up, but I think it is useful to understand what is going on. Note that <code>__dict__</code> is itself an attribute. We didn't set this attribute ourselves, but Python provides it. Our old friends <code>__class__</code> and <code>__bases__</code> (none of which appear to be in <code>__dict__</code> either) also seem to be similar. Let's call them Python-provided attributes. Whether an attribute is Python-provided or not depends on the object in question (<code>__bases__</code>, for example, is Python-provided only for classes).

We, however, are more interested in user-defined attributes. These are attributes provided by the user, and they usually (but not always) end up in the <code>__dict__</code> of the object on which they're set.

When an attribute is accessed (for example, <code>print(objectname.attributename)</code>), the following objects are searched in sequence for the attribute:

The object itself (<code>objectname.__dict__</code> or any Python-provided attribute of <code>objectname</code>).

The object's type (<code>objectname.__class__.__dict__</code>). Observe that only <code>__dict__</code> is searched, which means only user-provided attributes of the class. In other words <code>objectname.__bases__</code> may not return anything even though <code>objectname.__class__.__bases__</code> does exist.

The bases of the object's class, their bases, and so on. (<code>__dict__</code> of each of <code>objectname.__class__.__bases__</code>). More than one base does not confuse Python, and should not concern us at the moment. The point to note is that all bases are searched until an attribute is found.

If all this hunting around fails to find a suitably named attribute, Python raises an <code>AttributeError</code>. The type of the type (<code>objectname.__class__.__class__</code>) is never searched for attribute access on an object (<code>objectname</code> in the example).
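The search order above can be tried out with a small hierarchy. This is a hedged sketch for Python 3; all class and attribute names (`Meta`, `Base`, `meta_attr`, and so on) are hypothetical and chosen only to mark where each attribute lives.

```python
class Meta(type):
    meta_attr = "on the type of the type"

class Base:
    base_attr = "on a base class"

class C(Base, metaclass=Meta):
    class_attr = "on the class"

cobj = C()
cobj.inst_attr = "on the instance"

# Instance, class, then bases are searched in that order.
print(cobj.inst_attr, cobj.class_attr, cobj.base_attr, sep="\n")

# The type of the type (Meta) is never searched for access on the instance.
try:
    cobj.meta_attr
except AttributeError as exc:
    print("AttributeError:", exc)

# It is consulted when the attribute is accessed on the class itself.
print(C.meta_attr)
```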

The above section explains the general mechanism for all objects. Even for classes (for example accessing <code>classname.attrname</code>), with a slight modification: the bases of the class are searched before the class of the class (which is <code>classname.__class__</code> and for most types, by the way, is <code>&lt;type 'type'&gt;</code>).

Some objects, such as built-in types and their instances (lists, tuples, etc.) do not have a <code>__dict__</code>. Consequently user-defined attributes cannot be set on them.
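A quick way to see this for an instance of a built-in type (a small Python 3 sketch; the attribute name `color` is arbitrary):

```python
lst = [1, 2, 3]

print(hasattr(lst, "__dict__"))   # False: a plain list carries no __dict__

failed = False
try:
    lst.color = "red"             # nowhere to store a user-defined attribute
except AttributeError as exc:
    failed = True
    print("AttributeError:", exc)
```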

We're not done yet! This was the short version of the story. There is more to what can happen when setting and getting attributes. This is explored in the following sections.

Continuing our Python experiments:

Example 1.2. A function is more

Two innocent looking class attributes, a string 'classattr' and a function 'f'.

Accessing the string really gets it from the class's <code>__dict__</code>, as expected.

Not so for the function! Why?

Hmm, it does look like a different object. (A bound method is a callable object that calls a function (<code>C.f</code> in the example) passing an instance (<code>cobj</code> in the example) as the first argument in addition to passing through all arguments it was called with. This is what makes method calls on instances work.)

Here's the spoiler - this is what Python did to create the bound method. While looking for an attribute for an instance, if Python finds an object with a <code>__get__()</code> method inside the class's <code>__dict__</code>, instead of returning the object, it calls the <code>__get__()</code> method and returns the result. Note that the <code>__get__()</code> method is called with the instance and the class as the first and second arguments respectively.
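The experiment can be sketched like this in Python 3. The names `C`, `cobj`, `classattr` and `f` come from the notes above; the bodies are assumptions made for illustration.

```python
class C:
    classattr = "a class attribute"

    def f(self):
        return "function f"

cobj = C()

# The string really comes from the class's __dict__.
print(cobj.classattr is C.__dict__["classattr"])   # True

# Not so for the function: attribute access returns a different object.
print(cobj.f is C.__dict__["f"])                    # False
print(cobj.f)                                       # a bound method

# What Python did: call __get__() with the instance and the class.
bound = C.__dict__["f"].__get__(cobj, C)
print(bound())
```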

It is only the presence of the <code>__get__()</code> method that transforms an ordinary function into a bound method. There is nothing really special about a function object. Anyone can put objects with a <code>__get__()</code> method inside the class <code>__dict__</code> and get away with it. Such objects are called descriptors and have many uses.

Any object with a <code>__get__()</code> method, and optionally <code>__set__()</code> and <code>__delete__()</code> methods, accepting specific parameters is said to follow the descriptor protocol. Such an object qualifies as a descriptor and can be placed inside a class's <code>__dict__</code> to do something special when an attribute is retrieved, set or deleted. An empty descriptor is shown below.

Example 1.3. A simple descriptor

Called when the attribute is read (e.g. <code>print(objectname.attrname)</code>). Here <code>obj</code> is the object on which the attribute is accessed, and <code>cls</code> is the class of <code>obj</code>. If the attribute is accessed directly on the class (e.g. <code>print(classname.attrname)</code>), <code>obj</code> is <code>None</code> and <code>cls</code> is that class.

Called when the attribute is set on an instance (e.g. <code>objectname.attrname = 12</code>). Here <code>obj</code> is the object on which the attribute is being set and <code>val</code> is the object provided as the value.

Called when the attribute is deleted from an instance (e.g. <code>del objectname.attrname</code>). Here <code>obj</code> is the object on which the attribute is being deleted.
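A descriptor class matching the three notes above might look like the following sketch. The class name `Desc` and the parameter names `obj`, `cls` and `val` follow the text; recording calls in a `calls` list is an assumption added so the behaviour can be inspected.

```python
class Desc:
    """A descriptor that does nothing except record how it was called."""

    def __init__(self):
        self.calls = []

    def __get__(self, obj, cls=None):
        # Attribute read; obj is None when accessed on the class itself.
        self.calls.append(("get", obj, cls))

    def __set__(self, obj, val):
        # Attribute set on an instance.
        self.calls.append(("set", obj, val))

    def __delete__(self, obj):
        # Attribute deleted from an instance.
        self.calls.append(("delete", obj))
```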

What we defined above is a class that can be instantiated to create a descriptor. Let's see how we can create a descriptor, attach it to a class and put it to work.

Example 1.4. Using a descriptor

Now the attribute called <code>d</code> is a descriptor. (This uses <code>Desc</code> from previous example.)

Calls <code>d.__set__(cobj, "setting a value")</code>.

Sticking a value directly in the instance's <code>__dict__</code> works, but...

is futile. This still calls <code>d.__get__(cobj, C)</code>.

Calls <code>d.__delete__(cobj)</code>.

Calls <code>d.__get__(None, C)</code>.

Doesn't call anything. This replaces the descriptor with a new string object. After this, accessing <code>cobj.d</code> or <code>C.d</code> will just return the string <code>"setting a value on class"</code>. The descriptor has been kicked out of <code>C</code>'s <code>__dict__</code>.
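The sequence these notes walk through can be sketched as below (Python 3). The names `Desc`, `C`, `cobj` and `d` come from the text; the return values and the `log` list are assumptions added to make each call visible.

```python
log = []

class Desc:
    def __get__(self, obj, cls=None):
        return ("get", obj, cls)

    def __set__(self, obj, val):
        log.append(("set", obj, val))

    def __delete__(self, obj):
        log.append(("delete", obj))

class C:
    d = Desc()                      # the attribute d is a descriptor

cobj = C()

x = cobj.d                          # calls d.__get__(cobj, C)
cobj.d = "setting a value"          # calls d.__set__(cobj, "setting a value")
cobj.__dict__["d"] = "forced in"    # works, but...
y = cobj.d                          # ...is futile: still d.__get__(cobj, C)
del cobj.d                          # calls d.__delete__(cobj)
z = C.d                             # calls d.__get__(None, C)

C.d = "setting a value on class"    # no call: the descriptor is replaced
print(C.__dict__["d"])
```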

Note that when accessed from the class itself, only the <code>__get__()</code> method comes into the picture; setting or deleting the attribute will actually replace or remove the descriptor.

Descriptors work only when attached to classes. Sticking a descriptor in an object that is not a class gives us nothing.

In the previous section we used a descriptor with both <code>__get__()</code> and <code>__set__()</code> methods. Such descriptors, by the way, are called data descriptors. Descriptors with only the <code>__get__()</code> method are somewhat weaker than their cousins, and called non-data descriptors.

Repeating our experiment, but this time with non-data descriptors, we get:

Example 1.5. Non-data descriptors

Calls <code>d.__get__(cobj, C)</code> (just like before).

Puts <code>"setting a value"</code> in the instance itself (in <code>cobj.__dict__</code> to be precise).

Surprise! This now returns <code>"setting a value"</code>, that is picked up from <code>cobj.__dict__</code>. Recall that for a data descriptor, the instance's <code>__dict__</code> is bypassed.

Deletes the attribute <code>d</code> from the instance (from <code>cobj.__dict__</code> to be precise).

These function identically to a data descriptor.
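A sketch of the same experiment with a non-data descriptor, for Python 3. The class name `GetOnly` and its return value are hypothetical; `C`, `cobj` and `d` follow the text.

```python
class GetOnly:
    """A non-data descriptor: it defines __get__() only."""

    def __get__(self, obj, cls=None):
        return "from descriptor"

class C:
    d = GetOnly()

cobj = C()

first = cobj.d                # calls d.__get__(cobj, C), just like before
cobj.d = "setting a value"    # stored in cobj.__dict__
second = cobj.d               # picked up from cobj.__dict__
del cobj.d                    # removed from cobj.__dict__
third = cobj.d                # the descriptor is visible again

print(first, second, third, sep=" | ")
```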

Interestingly, not having a <code>__set__()</code> affects not just attribute setting, but also retrieval. What is Python thinking? If, on setting, the descriptor gets fired and puts the data somewhere, then it follows that only the descriptor knows how to get it back. Why even bother with the instance's <code>__dict__</code>?

Data descriptors are useful for providing full control over an attribute. This is what one usually wants for attributes used to store some piece of data. For example an attribute that gets transformed and saved somewhere on setting, would usually be reverse-transformed and returned when read. When you have a data descriptor, it controls all access (both read and write) to the attribute on an instance. Of course, you could still directly go to the class and replace the descriptor, but you can't do that from an instance of the class.

Non-data descriptors, in contrast, only provide a value when an instance itself does not have a value. So setting the attribute on an instance hides the descriptor. This is particularly useful in the case of functions (which are non-data descriptors) as it allows one to hide a function defined in the class by attaching one to an instance.

Example 1.6. Hiding a method

Calls the bound method returned by <code>f.__get__(cobj, C)</code>. Essentially ends up calling <code>C.__dict__['f'](cobj)</code>.

Calls <code>another_f()</code>. The function <code>f()</code> defined in <code>C</code> has been hidden.
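Hiding a method can be sketched as follows (Python 3). The names `C`, `cobj`, `f` and `another_f` come from the notes; the return values are assumptions for illustration.

```python
class C:
    def f(self):
        return "f defined in class"

cobj = C()
before = cobj.f()          # bound method from f.__get__(cobj, C)

def another_f():
    return "another f"

cobj.f = another_f         # lands in cobj.__dict__ and hides C.f
after = cobj.f()           # a plain function call: no instance is passed

print(before, after, sep=" | ")
```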

This is the long version of the attribute access story, included just for the sake of completeness.

When retrieving an attribute from an object (<code>print(objectname.attrname)</code>) Python follows these steps:

If <code>attrname</code> is a special (i.e. Python-provided) attribute for <code>objectname</code>, return it.

Check <code>objectname.__class__.__dict__</code> for <code>attrname</code>. If it exists and is a data-descriptor, return the descriptor result. Search all bases of <code>objectname.__class__</code> for the same case.

Check <code>objectname.__dict__</code> for <code>attrname</code>, and return if found. If <code>objectname</code> is a class, search its bases too. If it is a class and a descriptor exists in it or its bases, return the descriptor result.

Check <code>objectname.__class__.__dict__</code> for <code>attrname</code>. If it exists and is a non-data descriptor, return the descriptor result. If it exists, and is not a descriptor, just return it. If it exists and is a data descriptor, we shouldn't be here because we would have returned at point 2. Search all bases of <code>objectname.__class__</code> for same case.

Raise <code>AttributeError</code>

Note that Python first checks for a data descriptor in the class (and its bases), then for the attribute in the object's <code>__dict__</code>, and then for a non-data descriptor in the class (and its bases). These are points 2, 3 and 4 above.

The descriptor result above implies the result of calling the <code>__get__()</code> method of the descriptor with appropriate arguments. Also, checking a <code>__dict__</code> for <code>attrname</code> means checking if <code>__dict__["attrname"]</code> exists.
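Steps 2 to 5 can be emulated in plain Python for an ordinary instance. This is only a sketch: it skips step 1 and the special handling of classes, the helper names `find_in_class` and `lookup` are hypothetical, and treating an object with `__set__()` or `__delete__()` as a data descriptor is an assumption about how Python classifies descriptors.

```python
_MISSING = object()

def find_in_class(cls, name):
    """Return the raw entry for name from cls or its bases, without binding."""
    for klass in cls.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING

def lookup(obj, name):
    """Emulate steps 2 to 5 for an ordinary instance (not a class)."""
    cls = type(obj)
    found = find_in_class(cls, name)
    kind = type(found)
    is_descriptor = found is not _MISSING and hasattr(kind, "__get__")
    is_data = is_descriptor and (
        hasattr(kind, "__set__") or hasattr(kind, "__delete__"))

    if is_data:                                  # step 2: data descriptor
        return kind.__get__(found, obj, cls)
    if name in getattr(obj, "__dict__", {}):     # step 3: instance __dict__
        return obj.__dict__[name]
    if is_descriptor:                            # step 4: non-data descriptor
        return kind.__get__(found, obj, cls)
    if found is not _MISSING:                    # step 4: plain class attribute
        return found
    raise AttributeError(name)                   # step 5

class C:
    plain = "plain class attribute"

    def f(self):
        return "method f"

    @property
    def p(self):
        return "property p"

cobj = C()
cobj.__dict__["p"] = "ignored: the data descriptor wins"
cobj.__dict__["f"] = "instance value hides the function"

print(lookup(cobj, "p"), lookup(cobj, "f"), lookup(cobj, "plain"), sep="\n")
```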

Now, the steps Python follows when setting a user-defined attribute (<code>objectname.attrname = something</code>):

Check <code>objectname.__class__.__dict__</code> for <code>attrname</code>. If it exists and is a data-descriptor, use the descriptor to set the value. Search all bases of <code>objectname.__class__</code> for the same case.

Insert <code>something</code> into <code>objectname.__dict__</code> for key <code>"attrname"</code>.

Think "Wow, this was much simpler!"

What happens when setting a Python-provided attribute depends on the attribute. Python may not even allow some attributes to be set. Deletion of attributes is very similar to setting as above.
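The two setting steps can be emulated the same way. Again a sketch only: `assign` is a hypothetical helper, it handles ordinary instances with a `__dict__`, and the class `C` with its doubling setter exists just to show which path was taken.

```python
def assign(obj, name, value):
    """Emulate 'obj.name = value' for an ordinary instance."""
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            attr = klass.__dict__[name]
            if hasattr(type(attr), "__set__"):
                # Step 1: a data descriptor on the class handles the value.
                type(attr).__set__(attr, obj, value)
                return
            break
    # Step 2: otherwise the value goes into the instance's __dict__.
    obj.__dict__[name] = value

class C:
    def __init__(self):
        self._a = 0

    def _get_a(self):
        return self._a

    def _set_a(self, val):
        self._a = val * 2

    a = property(_get_a, _set_a)

cobj = C()
assign(cobj, "a", 5)     # routed through the property's setter
assign(cobj, "b", 7)     # stored directly in cobj.__dict__
print(cobj.a, cobj.__dict__)
```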

Before you rush to the mall and get yourself some expensive descriptors, note that Python ships with some very useful ones that can be found by simply looking in the box.

Example 1.7. Built-in descriptors

A property provides an easy way to call functions whenever an attribute is retrieved, set or deleted on the instance. When the attribute is retrieved from the class, the getter method is not called but the property object itself is returned. A docstring can also be provided which is accessible as <code>HidesA.a.__doc__</code>.

A classmethod is similar to a regular method, except that it passes the class (and not the instance) as the first argument to the function. The remaining arguments are passed through as usual. It can also be called directly on the class and it behaves the same way. The first argument is named <code>cls</code> instead of the traditional <code>self</code> to avoid confusion regarding what it refers to.

A staticmethod is just like a function outside the class. It is never bound, which means no matter how you access it (on the class or on an instance), it gets called with exactly the same arguments you pass. No object is inserted as the first argument.
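The three built-in descriptors can be sketched together as below (Python 3). The names `HidesA`, `a` and `get_a` appear in the text; `set_a`, `del_a`, `cm`, `sm` and the hidden attribute `b` are assumptions made for illustration.

```python
class HidesA:
    def get_a(self):
        return self.b - 1

    def set_a(self, val):
        self.b = val + 1

    def del_a(self):
        del self.b

    a = property(get_a, set_a, del_a, "docstring for a")

    @classmethod
    def cm(cls, *args):
        return (cls, args)      # receives the class, not the instance

    @staticmethod
    def sm(*args):
        return args             # receives exactly what was passed

h = HidesA()
h.a = 10                        # calls set_a(); the value is kept in h.b
print(h.a, h.b)                 # 10 11
print(HidesA.a.__doc__)         # on the class, the property itself is returned
print(h.cm(1) == HidesA.cm(1))  # same result from instance or class
print(h.sm(1), HidesA.sm(1))
```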

As we saw earlier, Python functions are descriptors too. They weren't descriptors in earlier versions of Python (as there were no descriptors at all), but now they fit nicely into a more generic mechanism.

A property is always a data-descriptor, but not all arguments are required when defining it.

Example 1.8. More on properties

Can be set, retrieved, or deleted.

Attempting to delete this attribute from an instance will raise <code>AttributeError</code>.

Attempting to set or delete this attribute from an instance will raise <code>AttributeError</code>.
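The three cases in these notes can be sketched as follows (Python 3). All names here (`VariousProperties`, `allok`, `undeletable`, `readonly`, `attempt`) are hypothetical.

```python
class VariousProperties:
    def get_p(self):
        return self._p

    def set_p(self, val):
        self._p = val

    def del_p(self):
        del self._p

    allok = property(get_p, set_p, del_p)   # can be set, retrieved, deleted
    undeletable = property(get_p, set_p)    # no deleter
    readonly = property(get_p)              # getter only

def attempt(action):
    """Run action and report whether it raised AttributeError."""
    try:
        action()
    except AttributeError:
        return "AttributeError"
    return "ok"

vp = VariousProperties()
vp.allok = 1

results = [
    attempt(lambda: delattr(vp, "undeletable")),
    attempt(lambda: setattr(vp, "readonly", 2)),
    attempt(lambda: delattr(vp, "readonly")),
]
print(results)
```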

The getter and setter functions need not be defined in the class itself; any function can be used. In any case, the functions will be called with the instance as the first argument. Note that where the functions are passed to the property constructor above, they are not bound functions anyway.

Another useful observation is that subclassing the class and redefining the getter (or setter) functions is not going to change the property. The property object is holding on to the actual functions provided. When kicked, it is going to say "Hey, I'm holding this function I was given, I'll just call this and return the result", and not "Hmm, let me look up the current class for a method called 'get_a' and then use that". If that is what one wants, then defining a new descriptor would be useful. How would it work? Let's say it is initialized with a string (i.e. the name of the method to call). On activation, it does a <code>getattr()</code> for the method name on the class, and uses the method found. Simple!
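One possible shape for such a descriptor is sketched below. The class `NamedGetter` and the attribute names `a_fixed` and `a_late` are hypothetical; only the idea (store the method name, look it up with `getattr()` on access) comes from the text.

```python
class NamedGetter:
    """Read-only descriptor that looks up its getter by name on each access."""

    def __init__(self, method_name):
        self.method_name = method_name

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        # Look the method up on the instance's actual class, then call it.
        return getattr(type(obj), self.method_name)(obj)

class Base:
    def get_a(self):
        return "Base.get_a"

    a_fixed = property(get_a)        # holds on to Base.get_a itself
    a_late = NamedGetter("get_a")    # finds 'get_a' at access time

class Derived(Base):
    def get_a(self):
        return "Derived.get_a"

d = Derived()
print(d.a_fixed)    # the property still calls Base.get_a
print(d.a_late)     # the new descriptor picks up the override
```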

Classmethods and staticmethods are non-data descriptors, and so can be hidden if an attribute with the same name is set directly on the instance. If you are rolling your own descriptor (and not using properties), it can be made read-only by giving it a <code>__set__()</code> method but raising <code>AttributeError</code> in the method. This is how a property behaves when it does not have a setter function.
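A read-only descriptor of the kind described here might look like this sketch. The class name `ReadOnly`, the stored `value` and the error message are assumptions made for illustration.

```python
class ReadOnly:
    """Data descriptor whose __set__() always refuses."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, cls=None):
        return self.value

    def __set__(self, obj, val):
        raise AttributeError("can't set attribute")

class C:
    x = ReadOnly(42)

cobj = C()

error = None
try:
    cobj.x = 1
except AttributeError as exc:
    error = exc

# Because __set__() exists, this is a data descriptor, so even a value
# forced into the instance's __dict__ cannot hide it.
cobj.__dict__["x"] = 1
print(cobj.x, "|", error)
```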

Why do we need Method Resolution Order? Let's say:

We're happily doing OO programming and building a class hierarchy.

Our usual technique to implement the <code>do_your_stuff()</code> method is to first call <code>do_your_stuff()</code> on the base class, and then do our stuff.

Example 2.1. Usual base call technique

We subclass a new class from two classes and end up having the same superclass being accessible through two paths.

Example 2.2. Base call technique fails

Figure 2.1. The Diamond Diagram

Now we're stuck if we want to implement <code>do_your_stuff()</code>. Using our usual technique, if we want to call both <code>B</code> and <code>C</code>, we end up calling <code>A.do_your_stuff()</code> twice. And we all know it might be dangerous to have <code>A</code> do its stuff twice, when it is only supposed to be done once. The other option would leave either <code>B</code>'s stuff or <code>C</code>'s stuff not done, which is not what we want either.
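The problem is easy to reproduce. The sketch below assumes the diamond is `B(A)`, `C(A)` and `D(B, C)`, and records calls in a hypothetical `calls` list so the double call is visible.

```python
calls = []

class A:
    def do_your_stuff(self):
        calls.append("A")

class B(A):
    def do_your_stuff(self):
        A.do_your_stuff(self)      # usual base call technique
        calls.append("B")

class C(A):
    def do_your_stuff(self):
        A.do_your_stuff(self)
        calls.append("C")

class D(B, C):
    def do_your_stuff(self):
        B.do_your_stuff(self)
        C.do_your_stuff(self)
        calls.append("D")

D().do_your_stuff()
print(calls)    # ['A', 'B', 'A', 'C', 'D'] - A did its stuff twice
```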

There are messy solutions to this problem, and clean ones. Python, obviously, implements a clean one which is explained in the next section.

Let's say:

For each class, we arrange all superclasses into an ordered list without repetitions, and insert the class itself at the start of the list. We put this list in a class attribute called <code>next_class_list</code> for our use later.

Example 2.3. Making a "Who's Next" list

We use a different technique to implement <code>do_your_stuff()</code> for our classes.

Example 2.4. Call next method technique

The interesting part is how we <code>find_out_whos_next()</code>, which depends on which instance we are working with. Note that:

Depending on whether we passed an instance of <code>D</code> or of <code>B</code>, <code>next_class</code> above will resolve to either <code>C</code> or <code>A</code>.

Using this technique, each method is called only once. It appears clean, but seems to require too much work. Fortunately for us, we neither have to implement <code>find_out_whos_next()</code> for each class, nor set the <code>next_class_list</code>, as Python does both of these things.
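Done by hand, the technique could look like the sketch below. The names `next_class_list`, `find_out_whos_next()` and `next_class` come from the text; the function's signature, the hand-written lists and the `calls` list are assumptions.

```python
calls = []

def find_out_whos_next(instance, current_class):
    """Return the class that follows current_class for this instance."""
    order = type(instance).next_class_list
    return order[order.index(current_class) + 1]

class A:
    def do_your_stuff(self):
        calls.append("A")                    # end of the line, nobody is next

class B(A):
    def do_your_stuff(self):
        next_class = find_out_whos_next(self, B)
        next_class.do_your_stuff(self)
        calls.append("B")

class C(A):
    def do_your_stuff(self):
        next_class = find_out_whos_next(self, C)
        next_class.do_your_stuff(self)
        calls.append("C")

class D(B, C):
    def do_your_stuff(self):
        next_class = find_out_whos_next(self, D)
        next_class.do_your_stuff(self)
        calls.append("D")

# The "who's next" lists, written out by hand.
A.next_class_list = [A]
B.next_class_list = [B, A]
C.next_class_list = [C, A]
D.next_class_list = [D, B, C, A]

D().do_your_stuff()
print(calls)    # ['A', 'C', 'B', 'D'] - each method ran exactly once
```

Inside `B.do_your_stuff()`, the next class is `C` for an instance of `D` and `A` for an instance of `B`.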

Python provides a class attribute <code>__mro__</code> for each class, and a type called <code>super</code>. The <code>__mro__</code> attribute is a tuple containing the class itself and all of its superclasses without duplicates in a predictable order. A <code>super</code> object is used in place of the <code>find_out_whos_next()</code> method.

Example 2.5. One super technique
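With `super`, the same diamond can be sketched as follows (Python 3). The hierarchy `D(B, C)` and the `calls` list are assumptions; the two-argument form `super(B, self)` is used because the text discusses what `super` does with `self`.

```python
calls = []

class A:
    def do_your_stuff(self):
        calls.append("A")

class B(A):
    def do_your_stuff(self):
        super(B, self).do_your_stuff()   # next class after B in the MRO
        calls.append("B")

class C(A):
    def do_your_stuff(self):
        super(C, self).do_your_stuff()
        calls.append("C")

class D(B, C):
    def do_your_stuff(self):
        super(D, self).do_your_stuff()
        calls.append("D")

D().do_your_stuff()
print([k.__name__ for k in D.__mro__])   # ['D', 'B', 'C', 'A', 'object']
print(calls)                             # ['A', 'C', 'B', 'D']
```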

If we're using a class method, we don't have an instance <code>self</code> to pass into the <code>super</code> call. Fortunately for us, <code>super</code> works even with a class as the second argument. Observe that above, <code>super</code> uses <code>self</code> only to get at <code>self.__class__.__mro__</code>. The class can be passed directly to <code>super</code> as shown below.

Example 2.6. Using super with a class method

This example is for classmethods (not instance methods).

Note we pass <code>cls</code> (the class and not the instance) to super().

This prints out:

A says hello

B says hello

This prints out (observe each method is called only once):

A says hello

C says hello

B says hello

D says hello
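A sketch of classmethods cooperating through `super` (Python 3). The method name `say_hello`, the helper `say` and the `out` list are assumptions chosen to match the printed lines; the diamond `D(B, C)` is assumed as before.

```python
out = []

def say(message):
    out.append(message)
    print(message)

class A:
    @classmethod
    def say_hello(cls):
        say("A says hello")

class B(A):
    @classmethod
    def say_hello(cls):
        super(B, cls).say_hello()    # pass cls, the class, not an instance
        say("B says hello")

class C(A):
    @classmethod
    def say_hello(cls):
        super(C, cls).say_hello()
        say("C says hello")

class D(B, C):
    @classmethod
    def say_hello(cls):
        super(D, cls).say_hello()
        say("D says hello")

B.say_hello()     # A, then B
D.say_hello()     # A, C, B, D: each method is called only once
```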

There is yet another way to use <code>super</code>:

Example 2.7. Another super technique

While using <code>super</code> we typically use only one <code>super</code> call in one method even if the class has multiple bases. Also, it is a good programming practice to use <code>super</code> instead of calling methods directly on a base class.

A possible pitfall appears if <code>do_your_stuff()</code> accepts different arguments for <code>C</code> and <code>A</code>. This is because, if we use <code>super</code> in <code>B</code> to call <code>do_your_stuff()</code> on the next class, we don't know if it is going to be called on <code>A</code> or <code>C</code>. If this scenario is unavoidable, a case specific solution might be required.

One question as yet unanswered is how Python determines the <code>__mro__</code> for a type. A basic idea behind the algorithm is provided in this section. This is not essential for just using <code>super</code>, or for reading the following sections, so you can jump to the next section if you want.

Python determines the precedence of types (or the order in which they should be placed in any <code>__mro__</code>) from two kinds of constraints specified by the user:

If <code>A</code> is a superclass of <code>B</code>, then <code>B</code> has precedence over <code>A</code>. Or, <code>B</code> should always appear before <code>A</code> in all <code>__mro__</code>s (that contain both). In short let's denote this as <code>B &gt; A</code>.

If <code>C</code> appears before <code>D</code> in the list of bases in a class statement (eg. <code>class Z(C,D):</code>), then <code>C &gt; D</code>.

In addition, to avoid being ambiguous, Python adheres to the following principle:

If <code>E &gt; F</code> in one scenario (or one <code>__mro__</code>), then it should be that <code>E &gt; F</code> in all scenarios (or all <code>__mro__</code>s).

We can satisfy the constraints if we build the <code>__mro__</code> for each new class <code>C</code> we introduce, such that:

All superclasses of <code>C</code> appear in the <code>C.__mro__</code> (plus <code>C</code> itself, at the start), and

The precedence of types in <code>C.__mro__</code> does not conflict with the precedence of types in <code>B.__mro__</code> for each <code>B</code> in <code>C.__bases__</code>.

Here the same problem is translated into a game. Consider a class hierarchy as follows:

Figure 2.2. A Simple Hierarchy

Since only single inheritance is in play, it is easy to find the <code>__mro__</code> of these classes. Let's say we define a new class as <code>class N(A,B,C)</code>. To compute the <code>__mro__</code>, consider a game using abacus style beads over a series of strings.

Figure 2.3. Beads on Strings - Unsolvable

Beads can move freely over the strings, but the strings cannot be cut or twisted. The strings from left to right contain beads in the order of <code>__mro__</code> of each of the bases. The rightmost string contains one bead for each base, in the order the bases are specified in the class statement.

The objective is to line up beads in rows, so that each row contains beads with only one label (as done with the <code>O</code> bead in the diagram). Each string represents an ordering constraint, and if we can reach the goal, we would have an order that satisfies all constraints. We could then just read the labels off rows from the bottom up to get the <code>__mro__</code> for <code>N</code>.

Unfortunately, we cannot solve this problem. The last two strings have <code>C</code> and <code>B</code> in different orders. However, if we change our class definition to <code>class N(A,C,B)</code>, then we have some hope.

Figure 2.4. Beads on Strings - Solved

We just found out that <code>N.__mro__</code> is <code>(N,A,C,B,object)</code> (note we inserted <code>N</code> at the head). The reader can try out this experiment in real Python (for the unsolvable case above, Python raises an exception). Observe that we even swapped the position of two strings, keeping the strings in the same order as the bases are specified in the class statement. The usefulness of this is seen later.
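Here is that experiment as a sketch in Python 3. It assumes the hierarchy in the figure is `A(object)`, `B(object)` and `C(B)`, which is consistent with the constraints and the result stated above.

```python
class A:
    pass

class B:
    pass

class C(B):
    pass

# The unsolvable case: the bases list says B before C, but C's own
# __mro__ says C before B.
unsolvable = False
try:
    class N(A, B, C):
        pass
except TypeError as exc:
    unsolvable = True
    print("TypeError:", exc)

# The solvable case.
class N(A, C, B):
    pass

print([k.__name__ for k in N.__mro__])   # ['N', 'A', 'C', 'B', 'object']
```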

Sometimes, there might be more than one solution, as shown in the figure below. Consider four classes <code>class A(object)</code>, <code>class B(A)</code>, <code>class C(object)</code> and <code>class D(C)</code>. If a new class is defined as <code>class E(B, D)</code>, there are multiple possible solutions that satisfy all constraints.

Figure 2.5. Multiple Solutions

Possible positions for <code>A</code> are shown as the little beads. The order can be kept unambiguous (more correctly, monotonic) if the following policies are followed:

Arrange strings from left to right in order of appearance of bases in the class statement.

Attempt to arrange beads in rows moving from bottom up, and left to right. What this means is that the MRO of <code>class E(B, D)</code> will be set to: <code>(E,B,A,D,C,object)</code>. This is because <code>A</code>, being left of <code>C</code>, will be selected first as a candidate for the second row from bottom.
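These policies correspond to the C3 linearization that Python uses for new-style classes. The sketch below is a simplified rendition, not CPython's actual code; the helper names `merge` and `linearize` are hypothetical, and the result is compared against the real `__mro__`.

```python
def merge(sequences):
    """Merge ordered sequences, always taking the leftmost acceptable head."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable if it is not in the tail of any sequence.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("cannot create a consistent order")
        result.append(head)
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result

def linearize(cls):
    """Return the class followed by the merged orders of its bases."""
    bases = list(cls.__bases__)
    return [cls] + merge([linearize(b) for b in bases] + [bases])

class A(object): pass
class B(A): pass
class C(object): pass
class D(C): pass
class E(B, D): pass

print([k.__name__ for k in linearize(E)])   # ['E', 'B', 'A', 'D', 'C', 'object']
print(tuple(linearize(E)) == E.__mro__)     # True
```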

This chapter includes usage notes that do not fit in other chapters.

In Python, we can use methods with special names like <code>__len__()</code>, <code>__str__()</code> and <code>__add__()</code> to make objects convenient to use (for example, with the built-in functions <code>len()</code>, <code>str()</code> or with the '<code>+</code>' operator, etc.)

Example 3.1. Special methods work on type only

Usually we put the special methods in a class.

We can try to put them in the instance itself, but it doesn't work.

This goes straight to the class (calls <code>C.__len__()</code>), not to the instance.

The same is true for all such methods: putting them on the instance we want to use them with does not work. If it did go to the instance then even something like <code>str(C)</code> (<code>str</code> of the class <code>C</code>) would go to <code>C.__str__()</code>, which is a method defined for an instance of <code>C</code>, and not <code>C</code> itself.
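A sketch of this behaviour in Python 3. The names `C` and `cobj` follow the text; `mylen` and the return values are assumptions for illustration.

```python
class C:
    def __len__(self):
        return 0               # the special method lives in the class

cobj = C()

def mylen():
    return 33

cobj.__len__ = mylen           # we can put one on the instance...

print(len(cobj))               # 0: len() goes straight to C.__len__()
print(cobj.__len__())          # 33: only explicit attribute access finds it
```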

A simple technique to allow defining such methods for each instance separately is shown below.

Example 3.2. Forwarding special method to instance

We call another method on the instance,

for which we provide a default implementation in the class.

But it can be overridden (or rather, hidden) by setting it on the instance.

This now calls <code>mylen()</code>.
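The forwarding technique can be sketched as follows (Python 3). The name `mylen` appears in the text; the forwarding method name `_mylen` and the return values are assumptions.

```python
class C:
    def __len__(self):
        return self._mylen()   # forward to an ordinary method on the instance

    def _mylen(self):
        return 0               # default implementation in the class

cobj = C()
default_length = len(cobj)

def mylen():
    return 33

cobj._mylen = mylen            # hidden by setting on the instance
print(default_length, len(cobj))
```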

Subclassing built-in types is straightforward. Actually we have been doing it all along (whenever we subclass <code>&lt;type 'object'&gt;</code>). Some built-in types (<code>types.FunctionType</code>, for example) are not subclassable (not yet, at least). However, here we talk about subclassing <code>&lt;type 'list'&gt;</code>, <code>&lt;type 'tuple'&gt;</code> and other basic data types.

Example 3.3. Subclassing &lt;type 'list'&gt;

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-1"></a>

A regular class statement.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-2"></a>

Define the method to be overridden. In this case we will convert all items passed through <code>append()</code> to integers.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-3"></a>

Upcall to the base if required. <code>list.append()</code> works like an unbound method, and is passed the instance as the first argument.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-4"></a>

Append a float and...

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-5"></a>

watch it automatically become an integer.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-6"></a>

Otherwise, it behaves like any other list.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#list-subtype-example-7"></a>

This doesn't go through <code>append()</code>. We would have to define <code>__setitem__()</code> in our class to massage such data. The upcall would be to <code>list.__setitem__(self, index, item)</code>. Note that special methods such as <code>__setitem__</code> exist on built-in types.
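The walkthrough above might look like the following sketch. <code>MyList</code> is the name used in the text; the sample values are assumptions. The <code>__setitem__()</code> override is not part of the walkthrough, only suggested by it, and this version handles plain indexes while passing slices through unchanged.

```python
class MyList(list):
    """A list that converts appended and assigned items to integers."""

    def append(self, item):
        # Upcall to the base type, passing the instance explicitly.
        list.append(self, int(item))

    def __setitem__(self, index, item):
        # Assumption: only single-index assignment is converted; slices pass through.
        if isinstance(index, slice):
            list.__setitem__(self, index, item)
        else:
            list.__setitem__(self, index, int(item))

l = MyList()
l.append(1.3)    # stored as 1
l.append(444)
l[1] = 1.2       # goes through __setitem__, stored as 1

print(l)         # [1, 1]
print(len(l))    # 2
```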

We can set attributes on our instance. This is because it has a <code>__dict__</code>.

Basic lists do not have <code>__dict__</code> (and so no user-defined attributes), but ours does. This is usually not a problem and may even be what we want. If we use a very large number of <code>MyList</code>s, however, we could optimize our program by telling Python not to create the <code>__dict__</code> for instances of <code>MyList</code>.
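The difference between a basic list and an instance of a subclass can be seen in a small sketch. The attribute name <code>color</code> is an arbitrary choice for illustration.

```python
class MyList(list):
    pass

plain = []
mine = MyList()

# The subclass instance has a __dict__, so arbitrary attributes can be set.
mine.color = 'red'
mine_has_dict = hasattr(mine, '__dict__')

# A basic list has no __dict__, so setting an attribute fails.
plain_has_dict = hasattr(plain, '__dict__')
try:
    plain.color = 'red'
except AttributeError:
    plain_rejected = True
else:
    plain_rejected = False

print(mine_has_dict, plain_has_dict, plain_rejected)  # True False True
```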

Example 3.4. Using <code>__slots__</code> for optimization

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#using-slots-example-1"></a>

The <code>__slots__</code> class attribute tells Python not to create a <code>__dict__</code> for instances of this type.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#using-slots-example-2"></a>

Setting any attribute on an instance of this type raises an exception.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#using-slots-example-3"></a>

<code>__slots__</code> can contain a list of strings. The instances still don't get a <code>__dict__</code>; instead, Python reserves space in the instance for the specified attributes and exposes each one through a descriptor on the class.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#using-slots-example-4"></a>

Now, if an attribute has space reserved, it can be used.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#using-slots-example-5"></a>

Otherwise, it cannot. This will raise an exception.

The purpose and recommended use of <code>__slots__</code> is optimization. After a type is defined, its slots cannot be changed. Also, every subclass must define <code>__slots__</code>, otherwise its instances will end up having a <code>__dict__</code>.
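The points about <code>__slots__</code> can be sketched as follows. The class names <code>MyList2</code> and <code>Sub</code> and the attribute names <code>color</code> and <code>weight</code> are hypothetical, chosen only for illustration.

```python
class MyList(list):
    __slots__ = []            # no __dict__, no user-defined attributes

class MyList2(list):
    __slots__ = ['color']     # space reserved for exactly one attribute

class Sub(MyList2):
    pass                      # no __slots__ here, so instances get a __dict__ again

def can_set(obj, name):
    """Return True if setting the attribute succeeds, False on AttributeError."""
    try:
        setattr(obj, name, 'x')
    except AttributeError:
        return False
    return True

empty_slots_ok = can_set(MyList(), 'color')     # False: nothing reserved
reserved_ok = can_set(MyList2(), 'color')       # True: space was reserved
unreserved_ok = can_set(MyList2(), 'weight')    # False: not listed in __slots__
subclass_ok = can_set(Sub(), 'weight')          # True: Sub instances have a __dict__

print(empty_slots_ok, reserved_ok, unreserved_ok, subclass_ok)
```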

We can also create a list by instantiating it like any other type: <code>list([1,2,3])</code>. This means <code>list.__init__()</code> accepts the same argument (i.e. any iterable) and initializes the list. We can customize initialization in a subclass by redefining <code>__init__()</code> and upcalling <code>__init__()</code> on the base.

Tuples are immutable and different from lists. Once an instance is created, it cannot be changed. Note that the instance of a type already exists when <code>__init__()</code> is called (in fact the instance is passed as the first argument). The <code>__new__()</code> static method of a type is called to create an instance of the type. It is passed the type itself as the first argument, followed by the other initial arguments (the same ones <code>__init__()</code> receives). We use this to customize immutable types like a tuple.

Example 3.5. Customizing creation of subclasses

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#customizing-creation-example-1"></a>

For a list, we massage the arguments and hand them over to <code>list.__init__()</code>.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#customizing-creation-example-2"></a>

For a tuple, we have to override <code>__new__()</code>.

<a href="http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html#customizing-creation-example-3"></a>

A <code>__new__()</code> should always return a value: it is supposed to return an instance of the type.

The <code>__new__()</code> method is not special to immutable types; it is used for all types. Python also converts it to a static method automatically (by virtue of its name).
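Both cases discussed above, a mutable list customized through <code>__init__()</code> and an immutable tuple customized through <code>__new__()</code>, might be sketched as below. The class names, the parameter names <code>itr</code> and <code>typ</code>, and the integer conversion are illustrative assumptions.

```python
class MyList(list):
    def __init__(self, itr):
        # The instance already exists; massage the arguments and let the base fill it.
        list.__init__(self, [int(x) for x in itr])

class MyTuple(tuple):
    def __new__(typ, itr):
        # The first argument is the type itself, not an instance.
        seq = [int(x) for x in itr]
        # __new__ must return the newly created instance.
        return tuple.__new__(typ, seq)

l = MyList([1.5, 2, '3'])
t = MyTuple([1.5, 2, '3'])

print(l)                 # [1, 2, 3]
print(t)                 # (1, 2, 3)
print(type(t).__name__)  # MyTuple

# No @staticmethod decorator was written, yet __new__ is stored as a static method.
new_is_static = isinstance(MyTuple.__dict__['__new__'], staticmethod)
print(new_is_static)     # True
```

Overriding <code>__init__()</code> on the tuple subclass instead would not work, because by the time it runs the tuple's contents are already fixed.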