Memory corruption error if a second subinterpreter imports a module that contains certain imports #116524

Closed
@Agomik

Description

Crash report

What happened?

I wrote single-threaded C++ code using the Python C API that starts two subinterpreters, Sub1 and Sub2, and then loads a module M in each.
If M imports certain modules (such as urllib.request or yaml), importing M in Sub2 triggers a memory corruption error when Sub1 has already imported it. The error does not occur if M imports only other modules (such as urllib, base64, os or sys).

Here is a snippet that reproduces the error:

    Py_Initialize();

    PyThreadState *tstate_main, *tstate_s1, *tstate_s2;
    
    tstate_main = PyThreadState_Get();
    std::cerr << "tstate_main: " << tstate_main << std::endl;

    //PyGILState_STATE gstate = PyGILState_Ensure();
    PyInterpreterConfig config_s1 = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 0,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
    };
    tstate_s1 = NULL;
    PyStatus status_s1 = Py_NewInterpreterFromConfig(&tstate_s1, &config_s1);
    std::cerr << "tstate_s1: " << tstate_s1 << std::endl;
    std::string sysPathCmd1 = "import sys\nsys.path.append('" + cwd + "')";
    PyRun_SimpleString(sysPathCmd1.c_str());
    PyObject* bytecode1 = Py_CompileString(module_code1, "test_module1", Py_file_input);
    PyObject* pModule1 = PyImport_ExecCodeModule("test_module1", bytecode1);
    if(!pModule1) {
        std::cerr << "Error on module import:" << std::endl;
        PyErr_Print();
        return -1;
    }
    PyObject* pFunc1 = PyObject_GetAttrString(pModule1, "test_call");
    PyObject* pData1 = PyUnicode_FromString("hello");
    PyObject* pArgs1 = PyTuple_Pack(1, pData1);
    PyObject* pResult1 = PyObject_CallObject(pFunc1, pArgs1);

    PyEval_RestoreThread(tstate_main);

    //PyGILState_STATE gstate = PyGILState_Ensure();
    PyInterpreterConfig config_s2 = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 0,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
    };
    tstate_s2 = NULL;
    PyStatus status_s2 = Py_NewInterpreterFromConfig(&tstate_s2, &config_s2);
    std::cerr << "tstate_s2: " << tstate_s2 << std::endl;
    std::string sysPathCmd2 = "import sys\nsys.path.append('" + cwd + "')";
    PyRun_SimpleString(sysPathCmd2.c_str());
    PyObject* bytecode2 = Py_CompileString(module_code2, "test_module2", Py_file_input);
    PyObject* pModule2 = PyImport_ExecCodeModule("test_module2", bytecode2);
    PyObject* pFunc2 = PyObject_GetAttrString(pModule2, "test_call");
    PyObject* pData2 = PyUnicode_FromString("");
    PyObject* pArgs2 = PyTuple_Pack(1, pData2);
    PyObject* pResult2 = PyObject_CallObject(pFunc2, pArgs2);
    
    PyEval_RestoreThread(tstate_main);

Here is the example Python code I used:

# import yaml # ERROR
# import urllib.request # ERROR
# import urllib # NO ERROR
# import os # NO ERROR
# import os.path # NO ERROR
# import base64 # NO ERROR

def test_call(data):
    print("Called")

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

No response

Activity

Label added: type-crash (a hard crash of the interpreter, possibly with a core dump), on Mar 8, 2024
ericsnowcurrently (Member) commented on Jul 8, 2024

Sorry for the long delay, @Agomik. I was able to reproduce the problem with the following code:

import _xxsubinterpreters as _interpreters
interpid1 = _interpreters.create()
interpid2 = _interpreters.create()
_interpreters.run_string(interpid1, 'import urllib.request')
_interpreters.run_string(interpid2, 'import urllib.request')

Output:

Debug memory block at address p=0x7f84adf92ef0: API ''
    2314885530296122877 bytes originally requested
    The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
        at p-7: 0xfd
        at p-6: 0xfd
        at p-5: 0xfd
        at p-4: 0xfd
        at p-3: 0xdd *** OUCH
        at p-2: 0xdd *** OUCH
        at p-1: 0xdd *** OUCH
    Because memory is corrupted at the start, the count of bytes requested
       may be bogus, and checking the trailing pad bytes may segfault.
    The 8 pad bytes at tail=0x20209fa4aef72ced are Segmentation fault (core dumped)
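The figures in this dump can be cross-checked with a little arithmetic. The sketch below uses only the values printed above and assumes the fill-byte conventions of CPython's debug allocator (0xfd for guard bytes, 0xdd for freed memory, as named in Objects/obmalloc.c); the variable names are illustrative.

```python
# Values copied from the debug dump above.
p = 0x7F84ADF92EF0
requested = 2314885530296122877
tail = 0x20209FA4AEF72CED

# Fill bytes used by CPython's debug allocator hooks.
PYMEM_DEADBYTE = 0xDD       # written over freed blocks
PYMEM_FORBIDDENBYTE = 0xFD  # guard bytes around a live block

# The "bytes originally requested" field, viewed as raw bytes.
size_bytes = requested.to_bytes(8, "big")
print(size_bytes.hex())      # 2020202000fdfdfd

# The reported tail address is simply p + requested.
print(hex(p + requested))    # 0x20209fa4aef72ced

# Pad bytes at p-7 .. p-1 as listed in the dump.
pad = [0xFD, 0xFD, 0xFD, 0xFD, 0xDD, 0xDD, 0xDD]
corrupted = [i - 7 for i, b in enumerate(pad) if b != PYMEM_FORBIDDENBYTE]
print(corrupted)             # [-3, -2, -1]
```

The "requested" size is not a plausible allocation size: its high bytes are four ASCII spaces (0x20), and the three bad pad bytes carry the freed-memory marker. That pattern is consistent with a block being touched after it was freed or reused, but this is a reading of the dump, not a confirmed root cause.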

I was also able to verify that the problem is gone in 3.13.

I'll look into what the fix was in 3.13 and whether there is any chance we could backport it to 3.12.
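To compare interpreter versions or candidate modules without taking down the calling process, the reproducer can be run in a child process and judged by its exit code. This is a minimal sketch, not an official test: `check_import` and `CHILD` are hypothetical names, the private module is assumed to be `_interpreters` on 3.13+ and `_xxsubinterpreters` on 3.12 and earlier, and `-X dev` is used on the assumption that the debug allocator hooks make the corruption visible on non-debug builds.

```python
import subprocess
import sys
import textwrap

# Child script: import the named module in two fresh subinterpreters in turn.
CHILD = textwrap.dedent("""
    import sys
    try:
        import _interpreters  # private module name assumed for 3.13+
    except ImportError:
        import _xxsubinterpreters as _interpreters  # name used in the report (3.12)
    module = sys.argv[1]
    for _ in range(2):
        interp = _interpreters.create()
        _interpreters.run_string(interp, f"import {module}")
""")


def check_import(module, timeout=60):
    """Run the reproducer for `module` in a child process.

    Returns the child's exit code. On POSIX, a negative value means the
    child was killed by a signal (for example -11 for SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "dev", "-c", CHILD, module],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode


if __name__ == "__main__":
    # Standard-library modules named in the report; yaml is third-party and omitted.
    for name in ["urllib.request", "urllib", "os", "os.path", "base64"]:
        print(name, check_import(name))
```

An exit code of 0 only shows that the child did not crash. Depending on the version, an ordinary import failure inside a subinterpreter may be returned by `run_string` rather than raised, so it would not show up in the exit code.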

Karnav123 commented on Mar 20, 2025

Is there any update on this issue?

ZeroIntensity (Member) commented on May 19, 2025

Confirmed that this doesn't reproduce anymore. 3.12 is security-only, so there's nothing more we can do.

Metadata

Assignees: No one assigned
Labels: 3.12 (only security fixes), topic-subinterpreters, type-crash (a hard crash of the interpreter, possibly with a core dump)
Status: Done
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests
