Skip to content

Support field docstrings for dataclasses that use __slots__ #113878

Closed
@rhettinger

Description

@rhettinger

Feature or enhancement

Proposal:

A really nice __slots__ feature is the ability to provide a data dictionary in the form of docstrings. That is also available for property objects. Consider this example:

class Rocket:
    __slots__ = {
        'velocity': 'Speed relative to Earth in meters per second',
        'mass': 'Gross weight in kilograms',
    }
    @property
    def kinetic_energy(self):
        'Kinetic energy in Newtons'
        return self.mass * self.velocity ** 2

Running help(Rocket) shows all the field docstrings which is super helpful:

class Rocket(builtins.object)
 |  Readonly properties defined here:
 |
 |  kinetic_energy
 |      Kinetic energy in Newtons
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  mass
 |      Gross weight in kilograms
 |
 |  velocity
 |      Speed relative to Earth in meters per second

It would be nice is the same can be done with dataclasses that define __slots__:

@dataclass(slots=True)
class Rocket:
    velocity: float = field(doc='Speed relative to Earth in meters per second')
    weight: float = field(doc='Gross weight in kilograms')

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Linked PRs

Activity

sobolevn

sobolevn commented on Jan 10, 2024

@sobolevn
Member

The change is quite simple, it results in:

import dataclasses

@dataclasses.dataclass(slots=True)
class A:
    a: int
    b: int = dataclasses.field(doc='bb')

print(help(A))
Снимок экрана 2024-01-10 в 11 41 17

The change:

diff --git Lib/dataclasses.py Lib/dataclasses.py
index 2fba32b5ffb..356c3df5874 100644
--- Lib/dataclasses.py
+++ Lib/dataclasses.py
@@ -303,11 +303,12 @@ class Field:
                  'compare',
                  'metadata',
                  'kw_only',
+                 'doc',
                  '_field_type',  # Private: not to be used by user code.
                  )
 
     def __init__(self, default, default_factory, init, repr, hash, compare,
-                 metadata, kw_only):
+                 metadata, kw_only, doc):
         self.name = None
         self.type = None
         self.default = default
@@ -320,6 +321,7 @@ def __init__(self, default, default_factory, init, repr, hash, compare,
                          if metadata is None else
                          types.MappingProxyType(metadata))
         self.kw_only = kw_only
+        self.doc = doc
         self._field_type = None
 
     @_recursive_repr
@@ -335,6 +337,7 @@ def __repr__(self):
                 f'compare={self.compare!r},'
                 f'metadata={self.metadata!r},'
                 f'kw_only={self.kw_only!r},'
+                f'doc={self.doc!r},'
                 f'_field_type={self._field_type}'
                 ')')
 
@@ -402,7 +405,7 @@ def __repr__(self):
 # so that a type checker can be told (via overloads) that this is a
 # function whose type depends on its parameters.
 def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
-          hash=None, compare=True, metadata=None, kw_only=MISSING):
+          hash=None, compare=True, metadata=None, kw_only=MISSING, doc=None):
     """Return an object to identify dataclass fields.
 
     default is the default value of the field.  default_factory is a
@@ -414,7 +417,7 @@ def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
     comparison functions.  metadata, if specified, must be a mapping
     which is stored but not otherwise examined by dataclass.  If kw_only
     is true, the field will become a keyword-only parameter to
-    __init__().
+    __init__().  doc is a docstring for this field.
 
     It is an error to specify both default and default_factory.
     """
@@ -422,7 +425,7 @@ def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
     if default is not MISSING and default_factory is not MISSING:
         raise ValueError('cannot specify both default and default_factory')
     return Field(default, default_factory, init, repr, hash, compare,
-                 metadata, kw_only)
+                 metadata, kw_only, doc)
 
 
 def _fields_in_init_order(fields):
@@ -1156,7 +1159,7 @@ def _process_class(cls, init, repr, eq, order, unsafe_hash, frozen,
     if weakref_slot and not slots:
         raise TypeError('weakref_slot is True but slots is False')
     if slots:
-        cls = _add_slots(cls, frozen, weakref_slot)
+        cls = _add_slots(cls, frozen, weakref_slot, fields)
 
     abc.update_abstractmethods(cls)
 
@@ -1191,7 +1194,7 @@ def _get_slots(cls):
             raise TypeError(f"Slots of '{cls.__name__}' cannot be determined")
 
 
-def _add_slots(cls, is_frozen, weakref_slot):
+def _add_slots(cls, is_frozen, weakref_slot, defined_fields):
     # Need to create a new class, since we can't set __slots__
     #  after a class has been created.
 
@@ -1208,16 +1211,17 @@ def _add_slots(cls, is_frozen, weakref_slot):
     )
     # The slots for our class.  Remove slots from our base classes.  Add
     # '__weakref__' if weakref_slot was given, unless it is already present.
-    cls_dict["__slots__"] = tuple(
-        itertools.filterfalse(
+    cls_dict["__slots__"] = {
+        slot: getattr(defined_fields.get(slot), 'doc', None)
+        for slot in itertools.filterfalse(
             inherited_slots.__contains__,
             itertools.chain(
                 # gh-93521: '__weakref__' also needs to be filtered out if
                 # already present in inherited_slots
                 field_names, ('__weakref__',) if weakref_slot else ()
             )
-        ),
-    )
+        )
+    }
 
     for field_name in field_names:
         # Remove our attributes, if present. They'll still be

The only concern I have is that we change how __slots__ is defined: instead of tuple it would be a dict. I think that it should be fine, since it is a part of the language. But, some people might expect it to be only a tuple.

If others are fine with the change, I will add tests with docs and send a PR.

ericvsmith

ericvsmith commented on Jan 13, 2024

@ericvsmith
Member

I think this change is okay. I don't think it would be reasonable for anyone to assume that dataclasses' __slots__ would always be a tuple.

That said, the documentation at https://docs.python.org/3/reference/datamodel.html says of __slots__: "This class variable can be assigned a string, iterable, or sequence of strings with variable names used by instances.". Perhaps that needs to be updated for the dict case?

rhettinger

rhettinger commented on Jan 13, 2024

@rhettinger
ContributorAuthor

It would be nice to have the dict case specifically called out in the data model docs. Right now, all we have is the announcement in the 3.8 whatsnew.

ericvsmith

ericvsmith commented on Jan 13, 2024

@ericvsmith
Member

Created #114044 for the documentation issue.

ericvsmith

ericvsmith commented on Jan 13, 2024

@ericvsmith
Member

One question I have about this feature: should it be an error to specify doc if not using __slots__? It will be available in the Field object, so I guess that's of some utility. I guess if we got super excited about it, we could teach inspect about Field.doc if the class is a dataclass and __slots__ isn't a dict. Or maybe even if it is a dict, then we could skip changing __slots__ to a dict for dataclasses.

sobolevn

sobolevn commented on Jan 13, 2024

@sobolevn
Member

One question I have about this feature: should it be an error to specify doc if not using slots?

I think it should be ok, some other tool might also use this doc= field.

I guess if we got super excited about it, we could teach inspect about Field.doc if the class is a dataclass and slots isn't a dict. Or maybe even if it is a dict, then we could skip changing slots to a dict for dataclasses.

As a separate feature, I guess :)
Right now I don't understand the whole idea.

rhettinger

rhettinger commented on Jan 13, 2024

@rhettinger
ContributorAuthor

One question I have about this feature: should it be an error to specify doc if not using slots? It will be available in the Field object, so I guess that's of some utility.

That makes sense to me. In this regard, it would be a bit like the metadata field but for a dedicated purpose.

ericvsmith

ericvsmith commented on Jan 13, 2024

@ericvsmith
Member

Right now I don't understand the whole idea.

I'm not sure this answers your question, @sobolevn, but:

I mean that inspect.getdoc() could contain logic that says "if this is a dataclass, then show the documentation of the fields, using the Field.doc values, if they exist". It doesn't really fit into the "Data descriptors defined here" section, but we should be able to come up with something.

I'll grant you that this is an unrelated issue.

added a commit that references this issue on Jan 14, 2024

pythongh-113878: Add `doc` parameter to `dataclasses.field`

6a4b4fb
ericvsmith

ericvsmith commented on Jan 14, 2024

@ericvsmith
Member

My typical questions when adding a feature to dataclasses are:

  • Does attrs do anything similar? If so, we probably want to do it in a similar way.
  • Is there any impact on typing.dataclass_transform (PEP 681)?

11 remaining items

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

stdlibPython modules in the Lib dirtype-featureA feature request or enhancement

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions

    Support field docstrings for dataclasses that use __slots__ · Issue #113878 · python/cpython

    Follow Lee on X/Twitter - Father, Husband, Serial builder creating AI, crypto, games & web tools. We are friends :) AI Will Come To Life!

    Check out: eBank.nz (Art Generator) | Netwrck.com (AI Tools) | Text-Generator.io (AI API) | BitBank.nz (Crypto AI) | ReadingTime (Kids Reading) | RewordGame | BigMultiplayerChess | WebFiddle | How.nz | Helix AI Assistant