gh-128045: Mark unknown opcodes as deopting to themselves #128044

DinoV · 2024-12-17T19:46:06Z

When accessing co_code on a code object, we first run a de-optimization pass over the bytecode: https://github.com/python/cpython/blob/main/Objects/codeobject.c#L1706

This pass removes any unknown opcodes. Instead, the deopt table should map unknown opcodes to themselves, so that an extensible interpreter loop can consume them.

Issue: code objects remove unknown opcodes from the instruction stream when accessing co_code #128045
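
To illustrate the behaviour described above, here is a minimal Python sketch of a deopt table. It is a toy model, not CPython's implementation: the opcode numbers, `OPMAP`, `DEOPTS`, `build_deopt_table` and `deopt` are all hypothetical, and it assumes that table entries for undefined opcodes default to 0, as they would in a zero-initialised array.

```python
# Hypothetical toy opcode numbering; not CPython's real opcode map.
OPMAP = {
    "CACHE": 0,
    "LOAD_FAST": 1,
    "LOAD_ATTR": 2,
    "LOAD_ATTR_INSTANCE_VALUE": 3,  # stands in for a specialized form
}
# Specialized opcode name -> base opcode name (assumed for illustration).
DEOPTS = {"LOAD_ATTR_INSTANCE_VALUE": "LOAD_ATTR"}


def build_deopt_table(identity_for_unknown: bool) -> list[int]:
    """Build a 256-entry table mapping each opcode to its base opcode."""
    table = [0] * 256  # assumption: unset entries default to 0
    for name, op in OPMAP.items():
        table[op] = OPMAP[DEOPTS.get(name, name)]
    if identity_for_unknown:
        # The proposed change: undefined opcodes deopt to themselves.
        defined = set(OPMAP.values())
        for i in range(256):
            if i not in defined:
                table[i] = i
    return table


def deopt(code: bytes, table: list[int]) -> bytes:
    """Rewrite the opcode byte of every 2-byte (opcode, oparg) unit."""
    out = bytearray(code)
    for i in range(0, len(out), 2):
        out[i] = table[out[i]]
    return bytes(out)


# One specialized instruction followed by a custom opcode (200).
code = bytes([3, 0, 200, 7])
without_fix = deopt(code, build_deopt_table(identity_for_unknown=False))
with_fix = deopt(code, build_deopt_table(identity_for_unknown=True))
```

In this model, the table without identity entries rewrites the custom opcode 200 to 0, while the table with identity entries leaves it in place. In both cases the specialized opcode is still mapped back to its base form.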

iritkatriel · 2024-12-17T22:20:58Z

Tools/cases_generator/opcode_metadata_generator.py

    for name, deopt in sorted(deopts):
        out.emit(f"[{name}] = {deopt},\n")
+    defined = set(analysis.opmap.values())
+    for i in range(256):
+        if i not in defined:
+            out.emit(f"[{i}] = {i},\n")


Since we're not testing this at all in CPython, I'd suggest we at least add a couple of assertions to make sure this correctly covers the full range of opcodes:

Suggested change

-    for name, deopt in sorted(deopts):
-        out.emit(f"[{name}] = {deopt},\n")
-    defined = set(analysis.opmap.values())
-    for i in range(256):
-        if i not in defined:
-            out.emit(f"[{i}] = {i},\n")
+    defined = set(analysis.opmap.values())
+    for i in range(256):
+        if i not in defined:
+            deopts.append((f'{i}', f'{i}'))
+
+    assert len(deopts) == 256
+    assert len(set(x[0] for x in deopts)) == 256
+    for name, deopt in sorted(deopts):
+        out.emit(f"[{name}] = {deopt},\n")
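
The suggested logic can be tried outside the generator with a small stand-alone sketch. The opcode map and deopt pairs below are hypothetical toy data (the real ones come from `analysis`), `emit_deopt_entries` is an illustrative name, and `io.StringIO` stands in for the generator's `out` object.

```python
import io


def emit_deopt_entries(opmap: dict[str, int], deopts: list[tuple[str, str]]) -> list[str]:
    """Pad `deopts` with identity entries, check coverage, and emit the lines."""
    deopts = list(deopts)  # do not mutate the caller's list
    defined = set(opmap.values())
    for i in range(256):
        if i not in defined:
            deopts.append((f"{i}", f"{i}"))

    # Exactly one entry per possible opcode value, with no duplicate keys.
    assert len(deopts) == 256
    assert len(set(x[0] for x in deopts)) == 256

    out = io.StringIO()
    for name, deopt in sorted(deopts):
        out.write(f"[{name}] = {deopt},\n")
    return out.getvalue().splitlines()


# Hypothetical toy data, not CPython's real opcode map.
toy_opmap = {"CACHE": 0, "LOAD_ATTR": 1, "LOAD_ATTR_SLOT": 2}
toy_deopts = [
    ("CACHE", "CACHE"),
    ("LOAD_ATTR", "LOAD_ATTR"),
    ("LOAD_ATTR_SLOT", "LOAD_ATTR"),
]
lines = emit_deopt_entries(toy_opmap, toy_deopts)
```

In this sketch the assertions would fail if a defined opcode were missing from `deopts`, or if any entry appeared twice, which is the kind of coverage gap the review comment is concerned with.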

Summary: When CPython hands out the bytecode it will first do a de-opt on it: python/cpython#128044
Reviewed By: jbower-fb
Differential Revision: D67350914
fbshipit-source-id: 0073efab52da1be775272e7dd9ae5a46468ccb10

markshannon · 2025-01-07T12:41:06Z

I think we regard undefined instructions as an error.
If you want to pass custom bytecodes to your custom interpreter, a more robust approach might be needed.
We can do the deopt thing for now, but it seems fragile.

For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process.
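
A rough sketch of why custom instructions would not survive such a format: the decoder has to know, for every opcode, whether an oparg byte follows. The code below is only an illustration of the idea described above, not an existing CPython feature; the opcode numbers and the `HAS_ARG`, `pack` and `unpack` names are hypothetical.

```python
# Hypothetical toy opcodes: opcode number -> whether it takes an oparg.
HAS_ARG = {1: True, 2: False}


def pack(units: list[tuple[int, int]]) -> bytes:
    """Encode (opcode, oparg) units: 1 byte without an oparg, 2 bytes with one."""
    out = bytearray()
    for op, arg in units:
        out.append(op)
        if HAS_ARG[op]:
            out.append(arg)
    return bytes(out)


def unpack(data: bytes) -> list[tuple[int, int]]:
    """Decode the compact form back into full (opcode, oparg) units."""
    units = []
    i = 0
    while i < len(data):
        op = data[i]
        i += 1
        if op not in HAS_ARG:
            # The width of an unknown instruction cannot be determined.
            raise ValueError(f"unknown opcode {op}")
        arg = 0
        if HAS_ARG[op]:
            arg = data[i]
            i += 1
        units.append((op, arg))
    return units


packed = pack([(1, 5), (2, 0), (1, 9)])
restored = unpack(packed)
```

Here three instructions take five bytes instead of six, and decoding a stream that contains an opcode missing from `HAS_ARG` raises `ValueError`, because the decoder cannot tell how many bytes that instruction occupies.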

DinoV · 2025-04-02T17:28:26Z

Took me a while to get back to this but I'm finally back :) I applied the changes suggested by @iritkatriel.

> I think we regard undefined instructions as an error. If you want to pass custom bytecodes to your custom interpreter, a more robust approach might be needed. We can do the deopt thing for now, but it seems fragile.
>
> For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process.

I think that would be fine as long as we still have some way to construct a code object ourselves; we might just need to implement our own custom unmarshalling logic that supports our opcodes. Obviously that doesn't cover all the ways that things could change in the future, but we can try to figure out how to adapt to other potential changes :)

markshannon

Looks good

miss-islington-app · 2025-05-19T14:15:20Z

Thanks @DinoV for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

…onGH-128044) * Mark unknown opcodes as deopting to themselves (cherry picked from commit cc9add6) Co-authored-by: Dino Viehland <dinoviehland@meta.com>

bedevere-app · 2025-05-19T14:15:30Z

3.14 branch.

#134228) * GH-128044)

DinoV added the skip news label Dec 17, 2024

DinoV changed the title ~~Mark unknown opcodes as deopting to themselves~~ gh-128045: Mark unknown opcodes as deopting to themselves Dec 17, 2024

bedevere-app bot mentioned this pull request Dec 17, 2024

code objects remove unknown opcodes from the instruction stream when accessing co_code #128045

Closed

DinoV marked this pull request as ready for review December 17, 2024 20:12

DinoV requested a review from markshannon as a code owner December 17, 2024 20:12

bedevere-app bot added the awaiting core review label Dec 17, 2024

Mark unknown opcodes as deopting to themselves

e37182c

DinoV force-pushed the deopt_unknown_ops branch from 7048634 to a675bfa Compare April 2, 2025 17:28

Add extra assertions on number of opcodes emitted

efe6990

DinoV force-pushed the deopt_unknown_ops branch from a675bfa to efe6990 Compare April 2, 2025 17:36

bedevere-app bot added awaiting merge and removed awaiting core review labels May 19, 2025

DinoV added awaiting core review needs backport to 3.14 and removed awaiting merge labels May 19, 2025

DinoV merged commit cc9add6 into python:main May 19, 2025
59 checks passed

bedevere-app bot removed the awaiting core review label May 19, 2025

bedevere-app bot removed the needs backport to 3.14 label May 19, 2025

DinoV pushed a commit that referenced this pull request May 19, 2025

[3.14] gh-128045: Mark unknown opcodes as deopting to themselves (GH-…

c869898

#134228) * GH-128044)
