Python's enum module source code analysis

Author: weapon, free to laugh Bliss involved cents a pen hanging end, Zhu Fu temporary body with the speaker but 23.

Blog: zhihu.com/people/hong-wei-peng

To read this part, you need to understand metaclass programming.

Member names are not allowed to be repeated

My first idea here was to control the keys of __dict__. That approach is not good, though: __dict__ has a broad scope and contains every attribute and method of the class, not just the enumeration's namespace. Reading the source code, I found that enum uses a different method. The __prepare__ magic method can return a dictionary-like instance that serves as the class namespace, so enum uses __prepare__ to supply a custom namespace in which each member name must be unique.

# A do-it-yourself implementation
class _Dict(dict):
    def __setitem__(self, key, value):
        if key in self:
            raise TypeError('Attempted to reuse key: %r' % key)
        super().__setitem__(key, value)

class MyMeta(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        d = _Dict()
        return d

class Enum(metaclass=MyMeta):
    pass

class Color(Enum):
    red = 1
    red = 1  # TypeError: Attempted to reuse key: 'red'
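To see that the class body really does write into the object returned by __prepare__, here is a minimal sketch. The names RecordingDict, RecordingMeta and Demo are made up for this illustration and are not part of enum:

```python
class RecordingDict(dict):
    """Namespace that remembers every name the class body assigns."""

    def __init__(self):
        super().__init__()
        self.assigned = []

    def __setitem__(self, key, value):
        self.assigned.append(key)
        super().__setitem__(key, value)


class RecordingMeta(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        # The class body is executed with this object as its namespace.
        return RecordingDict()

    def __new__(metacls, name, bases, namespace):
        cls = super().__new__(metacls, name, bases, namespace)
        # Keep a copy of the recorded names on the finished class.
        cls.assigned = list(namespace.assigned)
        return cls


class Demo(metaclass=RecordingMeta):
    red = 1
    blue = 2

    def describe(self):
        return "demo"


# Besides red, blue and describe, the list also holds names that the
# interpreter sets implicitly, such as __module__ and __qualname__.
print(Demo.assigned)
```

The implicit names are the reason a namespace class has to filter what it treats as a member.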

Now let's look at how the enum module itself implements this:

class _EnumDict(dict):
    def __init__(self):
        super().__init__()
        self._member_names = []
        ...

    def __setitem__(self, key, value):
        ...
        elif key in self._member_names:
            # descriptor overwriting an enum?
            raise TypeError('Attempted to reuse key: %r'% key)
        ...
        self._member_names.append(key)
        super().__setitem__(key, value)

class EnumMeta(type):
    @classmethod
    def __prepare__(metacls, cls, bases):
        enum_dict = _EnumDict()
        ...
        return enum_dict

class Enum(metaclass=EnumMeta):
    ...

The module's _EnumDict creates a _member_names list to store member names, because not everything in the namespace is an enumeration member. For example, magic methods such as __str__ and __new__ are not members, so __setitem__ needs to do some filtering:

def __setitem__(self, key, value):
    if _is_sunder(key):  # begins and ends with a single underscore, such as _order_
        raise ValueError('_names_ are reserved for future Enum use')
    elif _is_dunder(key):  # begins and ends with a double underscore, such as __new__
        if key == '__order__':
            key = '_order_'
    elif key in self._member_names:  # key has already been defined as a member
        raise TypeError('Attempted to reuse key: %r' % key)
    elif not _is_descriptor(value):  # value is not a descriptor, so it is a member
        self._member_names.append(key)
        self._last_values.append(value)
    super().__setitem__(key, value)

The module's handling is more thorough than our version.
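The three filter predicates are private helpers of enum. The sketch below shows roughly what such checks look like; the functions are written for this article under assumed names (no leading underscore), and the real helpers may differ in detail between Python versions:

```python
def is_descriptor(obj):
    """True if obj defines any part of the descriptor protocol."""
    return any(hasattr(obj, attr) for attr in ('__get__', '__set__', '__delete__'))


def is_dunder(name):
    """True for names such as __new__: exactly two underscores on each side."""
    return (len(name) > 4
            and name[:2] == name[-2:] == '__'
            and name[2] != '_'
            and name[-3] != '_')


def is_sunder(name):
    """True for names such as _order_: exactly one underscore on each side."""
    return (len(name) > 2
            and name[0] == name[-1] == '_'
            and name[1] != '_'
            and name[-2] != '_')


print(is_dunder('__new__'), is_sunder('_order_'), is_sunder('red'))
print(is_descriptor(lambda: None), is_descriptor(1))
```

Plain functions are descriptors because they define __get__, which is why a method defined in the class body is not recorded as a member.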

Each member has a name attribute and a value attribute

In our code above, Color.red is simply the value 1. In the enum module, however, every member of an enumeration class has a name attribute and a value attribute, and if you look carefully you will find that Color.red is an instance of Color. How is this achieved?

Once again it is done with a metaclass. The idea, implemented in the metaclass's __new__, is to create the target class first, then create an instance of that class for each member, and finally attach each instance to the target class as a class attribute via setattr. The pseudocode is as follows:

def __new__(metacls, cls, bases, classdict):
    # Create the enumeration class (cls is the name of the class being defined)
    enum_class = super().__new__(metacls, cls, bases, classdict)
    # Each member is an instance of enum_class, injected into it through setattr
    for name, value in members.items():  # members: names and values collected from classdict
        member = object.__new__(enum_class)
        member.name = name
        member.value = value
        setattr(enum_class, name, member)
    return enum_class

Here is a demo that can actually be run:

class _Dict(dict):
    def __init__(self):
        super().__init__()
        self._member_names = []

    def __setitem__(self, key, value):
        if key in self:
            raise TypeError('Attempted to reuse key: %r' % key)

        if not key.startswith("_"):
            self._member_names.append(key)
        super().__setitem__(key, value)

class MyMeta(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        d = _Dict()
        return d

    def __new__(metacls, cls, bases, classdict):
        __new__ = bases[0].__new__ if bases else object.__new__
        # Create an enumeration class
        enum_class = super().__new__(metacls, cls, bases, classdict)

        # Create member
        for member_name in classdict._member_names:
            value = classdict[member_name]
            enum_member = __new__(enum_class)
            enum_member.name = member_name
            enum_member.value = value
            setattr(enum_class, member_name, enum_member)

        return enum_class

class MyEnum(metaclass=MyMeta):
    pass

class Color(MyEnum):
    red = 1
    blue = 2

    def __str__(self):
        return "%s.%s" % (self.__class__.__name__, self.name)

print(Color.red) # Color.red
print(Color.red.name) # red
print(Color.red.value) # 1

The enum module uses the same idea to give every member a name attribute and a value attribute (I will not post the code). EnumMeta.__new__ is the heart of this module: almost all enumeration features are implemented in this function.
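The same behaviour can be checked against the standard library's Enum. This is only a small sanity check; the class name Color is reused here purely for illustration:

```python
from enum import Enum


class Color(Enum):
    red = 1
    blue = 2


# Each member is an instance of the enumeration class itself
# and carries its own name and value.
print(isinstance(Color.red, Color))
print(Color.red.name, Color.red.value)
print(Color.blue.name, Color.blue.value)
```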

When member values are the same, the second member is an alias of the first

From this section on, we stop illustrating with our own classes and instead explain the implementation by taking apart the code of the enum module. From the way the module behaves, we know that if two members have the same value, the later one is an alias of the earlier one:

from enum import Enum
class Color(Enum):
    red = 1
    _red = 1

print(Color.red is Color._red) # True

From this we can see that red and _red are the same object. How is this achieved?

The metaclass creates a _member_map_ attribute on the enumeration class to store the mapping from member names to members. If it finds that the value of a newly created member is already present in the mapping, it replaces the new member with the existing object from the mapping:

class EnumMeta(type):
    def __new__(metacls, cls, bases, classdict):
        ...
        # create our new Enum type
        enum_class = super().__new__(metacls, cls, bases, classdict)
        enum_class._member_names_ = [] # names in definition order
        enum_class._member_map_ = OrderedDict() # name->value map

        for member_name in classdict._member_names:
            enum_member = __new__(enum_class)

            # If another member with the same value was already defined, the
            # new member becomes an alias to the existing one.
            for name, canonical_member in enum_class._member_map_.items():
                if canonical_member._value_ == enum_member._value_:
                    enum_member = canonical_member # replace
                    break
            else:
                # Aliases don't appear in member names (only in __members__).
                enum_class._member_names_.append(member_name) # New member, add to _member_names_

            enum_class._member_map_[member_name] = enum_member
            ...

The code shows that even when two members have the same value, an object is created for each of them first, but the one created later is soon garbage collected (I think there is room for optimization here). By comparing against the _member_map_ mapping, the member that first took the value replaces any later member with that value. Both names are still stored in _member_map_: in the example, red and _red are both in the dictionary, but they point to the same object.

The _member_names_ attribute records only the first name, which matters for iteration over the enumeration.
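The difference between the two containers is visible through the public API: __members__ exposes the full name-to-member mapping, while iteration yields only canonical members. A short check against the standard library, where the alias name crimson is chosen just for this example:

```python
from enum import Enum


class Color(Enum):
    red = 1
    crimson = 1  # same value as red, so this becomes an alias
    blue = 2


# The alias is the very same object as the first member with that value.
print(Color.crimson is Color.red)
# The alias reports the canonical name.
print(Color.crimson.name)
# __members__ keeps every name, aliases included.
print(list(Color.__members__))
# Iteration skips aliases.
print([member.name for member in Color])
```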

Members can be obtained by member value

print(Color['red'])  # Color.red, looked up by member name
print(Color(1))  # Color.red, looked up by member value

Every member of an enumeration class is a singleton. The enumeration class created by the metaclass also maintains a mapping from values to members, _value2member_map_:

class EnumMeta(type):
    def __new__(metacls, cls, bases, classdict):
        ...
        # create our new Enum type
        enum_class = super().__new__(metacls, cls, bases, classdict)
        enum_class._value2member_map_ = {}

        for member_name in classdict._member_names:
            value = enum_members[member_name]
            enum_member = __new__(enum_class)

            enum_class._value2member_map_[value] = enum_member
            ...

Enum.__new__ then returns the singleton:

class Enum(metaclass=EnumMeta):
    def __new__(cls, value):
        if type(value) is cls:
            return value

        # Try to get from _value2member_map_
        try:
            if value in cls._value2member_map_:
                return cls._value2member_map_[value]
        except TypeError:
            # value is unhashable: fall back to a linear search of _member_map_
            for member in cls._member_map_.values():
                if member._value_ == value:
                    return member

        raise ValueError("%r is not a valid %s" % (value, cls.__name__))
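The lookup paths can be exercised with the standard library's Enum. The Shape class below is invented for this illustration; its list value is deliberately unhashable so that the lookup cannot go through a dictionary key:

```python
from enum import Enum


class Shape(Enum):
    circle = 1
    point = [0, 0]  # a list is unhashable, so it cannot be a dictionary key


# Passing a member returns that same member.
same = Shape(Shape.circle)
# A hashable value is found through the value-to-member mapping.
by_value = Shape(1)
# An unhashable value is still found, by comparing against each member.
by_unhashable = Shape([0, 0])

# A value that matches no member is rejected.
try:
    Shape(99)
    error = None
except ValueError as exc:
    error = exc

print(same, by_value, by_unhashable, error)
```

In every successful case the call returns the existing member object rather than creating a new one.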

Iterating over members

The enumeration class supports iteration over its members in definition order. If several members share a value, only the first of them is yielded, precisely because _member_names_ records only the first name:

class EnumMeta(type):
    # Defined on the metaclass, because it is the class itself that is iterated
    def __iter__(cls):
        return (cls._member_map_[name] for name in cls._member_names_)
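When a class object is iterated, as in `for member in Color`, Python looks up __iter__ on the type of that class, that is, on its metaclass. A minimal sketch of this rule, using the hypothetical names IterMeta and Palette:

```python
class IterMeta(type):
    def __iter__(cls):
        # Called for iter(Palette), i.e. when the class itself is iterated.
        return iter(cls._items)


class Palette(metaclass=IterMeta):
    _items = ('red', 'blue')


print(list(Palette))

# Instances do not become iterable, because methods defined on the
# metaclass are not visible through instances of the class.
try:
    iter(Palette())
    instance_iterable = True
except TypeError:
    instance_iterable = False

print(instance_iterable)
```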

Summary

This is how the core features of the enum module are implemented, almost all of them through metaclass black magic. Members cannot be compared by order (with < or >), but they can be compared for equality. There is not much to say about that: it is simply the behaviour inherited from object, a "feature" that comes without any extra work.

In short, the enum module is fairly self-contained and not much code. If you want to learn about metaclass programming, the singleton pattern and similar techniques, it is a textbook example and well worth reading.

Reference: https://cloud.tencent.com/developer/article/1379569 Python's enum module source code analysis-Cloud + Community-Tencent Cloud