-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-128045: Mark unknown opcodes as deopting to themselves #128044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Follow Lee on X/Twitter - Father, Husband, Serial builder creating AI, crypto, games & web tools. We are friends :) AI Will Come To Life!
Check out: eBank.nz (Art Generator) |
Netwrck.com (AI Tools) |
Text-Generator.io (AI API) |
BitBank.nz (Crypto AI) |
ReadingTime (Kids Reading) |
RewordGame |
BigMultiplayerChess |
WebFiddle |
How.nz |
Helix AI Assistant
Conversation
for name, deopt in sorted(deopts): | ||
out.emit(f"[{name}] = {deopt},\n") | ||
defined = set(analysis.opmap.values()) | ||
for i in range(256): | ||
if i not in defined: | ||
out.emit(f"[{i}] = {i},\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're not testing this at all in cpython, I'd suggest we at least add a couple of assertions to make sure this is correctly covering the range of opcodes:
for name, deopt in sorted(deopts): | |
out.emit(f"[{name}] = {deopt},\n") | |
defined = set(analysis.opmap.values()) | |
for i in range(256): | |
if i not in defined: | |
out.emit(f"[{i}] = {i},\n") | |
defined = set(analysis.opmap.values()) | |
for i in range(256): | |
if i not in defined: | |
deopts.append((f'{i}', f'{i}')) | |
assert len(deopts) == 256 | |
assert len(set(x[0] for x in deopts)) == 256 | |
for name, deopt in sorted(deopts): | |
out.emit(f"[{name}] = {deopt},\n") |
Summary: When CPython hands out the bytecode it will first do a de-opt on it: python/cpython#128044 Reviewed By: jbower-fb Differential Revision: D67350914 fbshipit-source-id: 0073efab52da1be775272e7dd9ae5a46468ccb10
I think we regard undefined instructions an error. For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process. |
Took me a while to get back to this but I'm finally back :) I applied the changes suggested by @iritkatriel.
I think as long as we could still have some way to construct a code object ourselves that would be fine, we'd just may need to implement our own custom unmarshaling logic that could support our opcodes. Obviously that doesn't cover all the ways that things could change in the future but we can try and figure out how to adapt to other potential changes :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
Thanks @DinoV for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…onGH-128044) * Mark unknown opcodes as deopting to themselves (cherry picked from commit cc9add6) Co-authored-by: Dino Viehland <dinoviehland@meta.com>
When accessing
co_code
on a code object we'll first run through to do the de-opt https://github.com/python/cpython/blob/main/Objects/codeobject.c#L1706This removes any unknown opcodes. Instead the deopt table should just recognize unknown opcodes as deopting to themselves, allowing the extensible interpreter loop to consume unknown opcodes.
co_code
#128045