Description
Background
CPython currently exposes several internal implementation details via APIs such as:
sys._defer_refcount
sys._is_immortal
sys._is_interned
These APIs leak implementation-specific details and create implicit expectations of cross-version (e.g., 3.13, 3.14, 3.15) and cross-implementation (e.g., CPython, PyPy, RustPython) compatibility, even though they are not part of any formal public API contract.
For example, other Python implementations may not support or emulate these features, and yet their presence in CPython can create unintentional backward compatibility burdens when new releases are made.
Proposal
To address this, I would like to propose introducing two weak introspection APIs in the sys module:
sys.set_tags(obj, ["defer_refcount", "tag2"]) -> None
sys.get_tags(obj) -> tuple
sys.set_tags(obj, tags: Iterable[str]) -> None
- Sets optional "tags" on an object.
- Tags are hints for the Python implementation and are not guaranteed to be applied or have any effect.
- The implementation may accept or ignore any or all provided tags.
- These tags are advisory only, intended primarily for debugging, experimentation, and tooling used by Python implementation developers.
sys.get_tags(obj) -> tuple[str, ...]
- Returns the tags currently associated with the object.
- These reflect only the tags actually recognized and retained by the interpreter.
- For example:
sys.set_tags(o, ["defer_refcount", "tag2"])
print(sys.get_tags(o))  # May return: ('defer_refcount',)
- If the object is already immortal due to previous operations, you might see:
sys.get_tags(o) # May return: ('defer_refcount', 'immortal')
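For illustration only, the advisory semantics described above can be modeled in pure Python. This is a sketch, not the proposed implementation: the module-level `set_tags`/`get_tags` functions, the `_SUPPORTED_TAGS` set, and the `_registry` dictionary are all hypothetical stand-ins, and treating repeated calls as additive is an assumption the proposal does not state.

```python
from typing import Iterable

# Hypothetical: the tags this toy "implementation" chooses to recognize.
_SUPPORTED_TAGS = frozenset({"defer_refcount", "immortal"})

# Maps id(obj) -> (obj, tags). Keeping obj alive prevents its id from being reused.
_registry: dict[int, tuple[object, tuple[str, ...]]] = {}


def set_tags(obj: object, tags: Iterable[str]) -> None:
    """Request tags on obj; unrecognized tags are silently ignored."""
    _, current = _registry.get(id(obj), (obj, ()))
    accepted = [t for t in tags if t in _SUPPORTED_TAGS and t not in current]
    # dict.fromkeys() drops duplicates while preserving request order.
    _registry[id(obj)] = (obj, current + tuple(dict.fromkeys(accepted)))


def get_tags(obj: object) -> tuple[str, ...]:
    """Return only the tags that were actually recognized and retained."""
    entry = _registry.get(id(obj))
    return entry[1] if entry else ()


o = object()
set_tags(o, ["defer_refcount", "tag2"])
print(get_tags(o))  # ('defer_refcount',)
```

The caller cannot tell from `set_tags` alone whether a tag was accepted; only `get_tags` reflects what was retained.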
Goals and Non-Goals
Goals:
- Provide a mechanism to annotate or mark objects for introspection/debugging.
- Allow developers of Python implementations or advanced tools to experiment with internal object states in a controlled manner.
Non-Goals:
- These APIs are not intended to be stable or relied upon for program behavior.
- No tag is guaranteed to have any effect or to be preserved between runs, interpreter versions, or across implementations.
Documentation and Guarantees
We will clearly document that:
- These APIs are for Python implementation developers only.
- The presence or absence of any particular tag does not imply any behavioral guarantees.
- Tags may be implementation-specific, and unsupported tags will be silently ignored.
- It may be possible to document the Python-specific tags somewhere, but with a note that they are not guaranteed to stay the same across versions.
cc @ZeroIntensity @vstinner @Fidget-Spinner @colesbury
Activity
threading concurrency primitives #134762
Title changed from "Add sys.set_tags() and sys.get_tags() APIs for Debugging and Experimental Use" to "Add sys.set_tags() and sys.get_tags() APIs for debugging and experimental Use"
corona10 commented on May 28, 2025
FYI, I am even fine with sys._set_tags and sys._get_tags, but I believe that it would be better than providing a separate Python API for each implementation-specific feature.
corona10 commented on May 28, 2025
And please let me know if there are better names.
ZeroIntensity commented on May 28, 2025
I like the general idea, but I have a few notes/concerns:
- Should set_tags be plural? I'd think that in most cases, you'd want to set one tag only.
- set_tags seems too misleading if the interpreter is allowed to ignore it. If I see anything called "set", I'd expect it to actually set something upon calling it. How about request_tags?
- get_tags/set_tags only covers implementation details for objects themselves. They won't work for experimental APIs that need parameters.
An alternative could be to properly expose unstable APIs like we do with PyUnstable in the C API. Maybe something like sys.unstable_defer_refcount, or an unstable module (from unstable.sys import defer_refcount) could work.
corona10 commented on May 28, 2025
I no longer like adding more such APIs. The basic idea of this API is to stop exposing CPython-specific implementation details as APIs. Such APIs break other implementations and cause compatibility issues. unstable.sys will not solve the current situation.
Well, the API will not care whether the user adds multiple attributes or a single attribute anyway.
I don't care about the naming; I thought that get/set was the conventional naming. For me, this is a matter of documentation, and I still think that people should avoid using this API as much as possible.
Would you like to provide a concrete example? Currently, we only care about defer_refcount and immortal, so I didn't think about it. Well, we could change the signature of set_tags to be set_tag and make it receive parameters.
ZeroIntensity commented on May 28, 2025
I'm worried that get_tags isn't much better. If someone were to write something like this, it would not be portable to other implementations:
Couldn't other implementations just implement _is_interned or whatever as just return False?
What if we wanted to provide an API for object flags someday?
It's also not totally clear to me if get_tags/set_tags is supposed to cover general object implementation details (e.g., immortality and DRC), or something specific to a type (e.g., string interning).
corona10 commented on May 28, 2025
The key point is where the focus lies. If you care about portability, then you shouldn't rely on unstable or implementation-specific APIs, which may not exist in other versions or implementations. However, sys.get_tags itself will be available consistently across implementations and versions. As I mentioned earlier, we don't guarantee the specific output, and if a third-party library depends on certain tags being present, that's their responsibility.
Consider the case where we want to remove sys._is_immortal(). With sys.get_tags, we can simply stop returning the "immortal" tag: the code using it won't break; only the implementation detail changes, which is exactly what we want. On the other hand, sys._is_immortal() is a different story: in some cases, we might have to keep the API even if we no longer want to support it.
ZeroIntensity commented on May 28, 2025
Ok, that makes sense.
The place where I'm getting a little tripped up is that the whole point of the _ prefix was that we could remove it in any version: it's supposed to be a private API, we just document it and thus shift the maintenance responsibility to users. I don't see it as much different than using a private method (prefixed with _). Why doesn't that work?
corona10 commented on May 28, 2025
I'm open to making the API design more flexible, but we should still try to avoid exposing implementation details whenever possible. So, should we plan to support object flags in the future? The reason I mention this is that we can not cover all cases :)
How about sys.set_tag(obj, tag: str, *, options: dict[str, Any] = {}) -> None?
corona10 commented on May 28, 2025
See: #134762 (comment), this is a real-world example.
There are also several alternative Python implementations, such as PyPy, GraalPython, and RustPython, which often copy parts of the CPython implementation and adapt them to their own runtimes. Introducing this API would help reduce their catch up burden and make the CPython runtime less tied to specific implementation details :)
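To make the catch-up point concrete, here is a minimal sketch of what an implementation that supports no tags at all might provide, assuming the advisory semantics from the proposal. The plain functions below are hypothetical stand-ins for sys.set_tags and sys.get_tags, and `report` is an invented example of tooling code.

```python
from typing import Iterable


def set_tags(obj: object, tags: Iterable[str]) -> None:
    """Accept any request and ignore it, which the proposal permits."""
    return None


def get_tags(obj: object) -> tuple[str, ...]:
    """No tags are ever retained by this implementation."""
    return ()


def report(obj: object) -> str:
    """Tooling code that works whether or not the tag is honored."""
    set_tags(obj, ["defer_refcount"])
    if "defer_refcount" in get_tags(obj):
        return "request honored"
    return "request not honored"


print(report(object()))  # request not honored
```

The same `report` function would run unchanged on an implementation that does retain the tag; only the returned string would differ.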
ZeroIntensity commented on May 28, 2025
I was under the impression that it'd be totally fine to remove sys._getframe, we just won't in practice because frames are exposed in other public APIs (e.g., inspect.currentframe). I think we might just need some additional rules on when something is private (or "unstable") and not.