Skip to content

gh-128045: Mark unknown opcodes as deopting to themselves #128044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 19, 2025

Follow Lee on X/Twitter - Father, Husband, Serial builder creating AI, crypto, games & web tools. We are friends :) AI Will Come To Life!

Check out: eBank.nz (Art Generator) | Netwrck.com (AI Tools) | Text-Generator.io (AI API) | BitBank.nz (Crypto AI) | ReadingTime (Kids Reading) | RewordGame | BigMultiplayerChess | WebFiddle | How.nz | Helix AI Assistant

Conversation

DinoV
Copy link
Contributor

@DinoV DinoV commented Dec 17, 2024

When accessing co_code on a code object we'll first run through to do the de-opt https://github.com/python/cpython/blob/main/Objects/codeobject.c#L1706

This removes any unknown opcodes. Instead the deopt table should just recognize unknown opcodes as deopting to themselves, allowing the extensible interpreter loop to consume unknown opcodes.

@DinoV DinoV changed the title Mark unknown opcodes as deopting to themselves gh-128045: Mark unknown opcodes as deopting to themselves Dec 17, 2024
@DinoV DinoV marked this pull request as ready for review December 17, 2024 20:12
@DinoV DinoV requested a review from markshannon as a code owner December 17, 2024 20:12
iritkatriel
Comment on lines 251 to 256
for name, deopt in sorted(deopts):
out.emit(f"[{name}] = {deopt},\n")
defined = set(analysis.opmap.values())
for i in range(256):
if i not in defined:
out.emit(f"[{i}] = {i},\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're not testing this at all in cpython, I'd suggest we at least add a couple of assertions to make sure this is correctly covering the range of opcodes:

Suggested change
for name, deopt in sorted(deopts):
out.emit(f"[{name}] = {deopt},\n")
defined = set(analysis.opmap.values())
for i in range(256):
if i not in defined:
out.emit(f"[{i}] = {i},\n")
defined = set(analysis.opmap.values())
for i in range(256):
if i not in defined:
deopts.append((f'{i}', f'{i}'))
assert len(deopts) == 256
assert len(set(x[0] for x in deopts)) == 256
for name, deopt in sorted(deopts):
out.emit(f"[{name}] = {deopt},\n")

facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this pull request Dec 18, 2024
Summary:
When CPython hands out the bytecode it will first do a de-opt on it:

python/cpython#128044

Reviewed By: jbower-fb

Differential Revision: D67350914

fbshipit-source-id: 0073efab52da1be775272e7dd9ae5a46468ccb10
@markshannon
Copy link
Member

I think we regard undefined instructions an error.
If you want to pass custom bytecodes to your custom interpreter, a more robust approach might be needed.
We can do the deopt thing for now, but it seems fragile.

For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process.

@DinoV DinoV force-pushed the deopt_unknown_ops branch from 7048634 to a675bfa Compare April 2, 2025 17:28
@DinoV
Copy link
Contributor Author

DinoV commented Apr 2, 2025

Took me a while to get back to this but I'm finally back :) I applied the changes suggested by @iritkatriel.

I think we regard undefined instructions an error. If you want to pass custom bytecodes to your custom interpreter, a more robust approach might be needed. We can do the deopt thing for now, but it seems fragile.

For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process.

I think as long as we could still have some way to construct a code object ourselves that would be fine, we'd just may need to implement our own custom unmarshaling logic that could support our opcodes. Obviously that doesn't cover all the ways that things could change in the future but we can try and figure out how to adapt to other potential changes :)

@DinoV DinoV force-pushed the deopt_unknown_ops branch from a675bfa to efe6990 Compare April 2, 2025 17:36
markshannon
Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@DinoV DinoV merged commit cc9add6 into python:main May 19, 2025
59 checks passed
@miss-islington-app
Copy link

Thanks @DinoV for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 19, 2025
…onGH-128044)

* Mark unknown opcodes as deopting to themselves
(cherry picked from commit cc9add6)

Co-authored-by: Dino Viehland <dinoviehland@meta.com>
@bedevere-app
Copy link

bedevere-app bot commented May 19, 2025

3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label May 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

Follow Lee on X/Twitter - Father, Husband, Serial builder creating AI, crypto, games & web tools. We are friends :) AI Will Come To Life!

Check out: eBank.nz (Art Generator) | Netwrck.com (AI Tools) | Text-Generator.io (AI API) | BitBank.nz (Crypto AI) | ReadingTime (Kids Reading) | RewordGame | BigMultiplayerChess | WebFiddle | How.nz | Helix AI Assistant

3 participants

Follow Lee on X/Twitter - Father, Husband, Serial builder creating AI, crypto, games & web tools. We are friends :) AI Will Come To Life!

Check out: eBank.nz (Art Generator) | Netwrck.com (AI Tools) | Text-Generator.io (AI API) | BitBank.nz (Crypto AI) | ReadingTime (Kids Reading) | RewordGame | BigMultiplayerChess | WebFiddle | How.nz | Helix AI Assistant