
Allow the developer to inject some metadata to the target process when they use sys.remote_exec at the same time #135360

Closed
@Zheaoli

Description

@Zheaoli

Feature or enhancement

Proposal:

For now, we emit an audit event when we execute the remote-injected script:

    if (0 != PySys_Audit("remote_debugger_script", "O", path)) {
        PyErr_FormatUnraisable(
            "Audit hook failed for remote debugger script %U", path);
        return;
    }

I think it may be worth emitting the remote PID at the same time. That would be more helpful when we want to collect security-related information.
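For context, here is a minimal sketch of how a process could observe this event from Python today, using `sys.addaudithook`. The event name is copied from the C snippet above and may differ in released CPython versions, so treat it as an assumption; `record_remote_script` and `seen_scripts` are hypothetical names, and the event is simulated with `sys.audit` rather than triggered by a real debugger.

```python
import os
import sys

# Event name copied from the C snippet above; released CPython versions may
# use a different (for example, prefixed) name, so treat it as an assumption.
SCRIPT_EVENT = "remote_debugger_script"

seen_scripts = []  # (pid, script_path) pairs recorded by the hook


def record_remote_script(event, args):
    # Audit hooks see every event, so filter early and keep the work small.
    if event != SCRIPT_EVENT:
        return
    # The hook runs inside the debugged process, so os.getpid() is the
    # target's own PID, not the PID of whoever injected the script.
    seen_scripts.append((os.getpid(), args[0]))


sys.addaudithook(record_remote_script)

# Simulate the event locally so the sketch runs without a real debugger.
sys.audit(SCRIPT_EVENT, "/tmp/example_script.py")
```

The only PID the hook can observe on its own is that of the process it runs in; the event carries just the script path.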

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Activity

vstinner

vstinner commented on Jun 11, 2025

@vstinner
Member

run_remote_debugger_script() is called in the debugged process, so the PID would be the process's own PID, which should already be known. I don't think that logging the PID would be useful.

Zheaoli

Zheaoli commented on Jun 11, 2025

@Zheaoli
ContributorAuthor

run_remote_debugger_script() is called in the debugged process, so the PID would be the process's own PID, which should already be known. I don't think that logging the PID would be useful.

@vstinner For me, I would like to record the PID of the debugger process when it attaches to the debugged process, so that the debugged process can emit the debugger PID.

vstinner

vstinner commented on Jun 11, 2025

@vstinner
Member
pablogsal

pablogsal commented on Jun 11, 2025

@pablogsal
Member

The debugged process doesn't know the debugger PID, so it cannot emit it. From the debugged process's perspective, something writes to the interface, but it doesn't know who.

Zheaoli

Zheaoli commented on Jun 11, 2025

@Zheaoli
ContributorAuthor

The debugged process doesn't know the debugger PID, so it cannot emit it. From the debugged process's perspective, something writes to the interface, but it doesn't know who.

Yes, this is the issue I want to solve. I propose injecting the debugger PID at the same time as we inject the script, so the debugged process can know the original debugger process and where the script came from.

pablogsal

pablogsal commented on Jun 11, 2025

@pablogsal
Member

The debugged process doesn't know the debugger PID, so it cannot emit it. From the debugged process's perspective, something writes to the interface, but it doesn't know who.

Yes, this is the issue I want to solve. I propose injecting the debugger PID at the same time as we inject the script, so the debugged process can know the original debugger process and where the script came from.

While I understand the motivation, I’d prefer not to include the debugger PID for the time being. The debugged process can’t verify it, and the debugger could lie—intentionally or not. I think it’s better to keep the debugged side minimal and avoid logging unverifiable data. If needed, the debugger side can log that information more reliably.
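As an illustration of logging on the debugger side, here is a minimal sketch of a wrapper that records its own PID, the target PID, and the script path before attaching. It assumes Python 3.14 or later, where `sys.remote_exec(pid, script)` is available; `logged_remote_exec`, the `log` list, and the record layout are hypothetical, and `exec_fn` exists only so the sketch can be exercised without a real target process.

```python
import json
import os
import sys
import time


def logged_remote_exec(target_pid, script_path, log, exec_fn=None):
    """Record who is attaching to whom, then inject the script."""
    record = {
        "debugger_pid": os.getpid(),  # known for certain on this side
        "target_pid": target_pid,
        "script": script_path,
        "time": time.time(),
    }
    # Write the record before attaching so a failed attach is still logged.
    log.append(json.dumps(record))
    if exec_fn is None:
        # sys.remote_exec is only available on Python 3.14 and later.
        exec_fn = sys.remote_exec
    exec_fn(target_pid, script_path)
    return record
```

Because the debugger knows its own PID first-hand, a record produced here does not depend on anything the debugged process would have to take on trust.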

Zheaoli

Zheaoli commented on Jun 11, 2025

@Zheaoli
ContributorAuthor

The debugged process doesn't know the debugger PID, so it cannot emit it. From the debugged process's perspective, something writes to the interface, but it doesn't know who.

Yes, this is the issue I want to solve. I propose injecting the debugger PID at the same time as we inject the script, so the debugged process can know the original debugger process and where the script came from.

While I understand the motivation, I’d prefer not to include the debugger PID for the time being. The debugged process can’t verify it, and the debugger could lie—intentionally or not. I think it’s better to keep the debugged side minimal and avoid logging unverifiable data. If needed, the debugger side can log that information more reliably.

I think we can split this discussion into two parts:

  1. verified/known-source tools/debugger
  2. attack

For part 2, yes, the debugged process can’t verify it. But the unverified message itself still means a lot. For example, suppose the audit event emits a PID of -1 and the log has been collected, but at the same time we cannot find that PID in the host agent's history. Then we know an unexpected attack has happened.
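The cross-check described here can be sketched as a simple set comparison. This is only an illustration: `unexplained_pids`, the sample PIDs, and the idea that the host agent exposes its history as a list of PIDs are all assumptions, not part of any existing API.

```python
def unexplained_pids(audited_pids, agent_known_pids):
    """Return reported debugger PIDs that the host agent never saw."""
    known = set(agent_known_pids)
    return sorted({pid for pid in audited_pids if pid not in known})


# Hypothetical data: PIDs reported in audit events vs. the agent's history.
audited = [4312, -1, 4312, 5120]
agent_history = [4312]

suspicious = unexplained_pids(audited, agent_history)
```

Any PID left in `suspicious` was claimed by an attaching process but has no matching record on the host, which is the signal the paragraph above relies on.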

BTW, here's a scenario from a real production environment. An enterprise may wrap a developer tool built on the remote_exec API and allow only specific processes to attach to the process. So code like the following could be used:

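import sys
import json

# Hypothetical callback: receives the script path and the injected metadata
def audit_callback(script, metadata):
    metadata = json.loads(metadata)
    # Terminate if the attaching tool did not identify its version
    if "version" not in metadata:
        sys.exit(120)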

So I propose to extend the remote_exec API:

  1. Allow the developer to inject some metadata at the same time. The maximum metadata length can be defined at compile time or at bootstrap time (maybe default to 256?).
  2. Emit the metadata in the audit event when the process has been attached to.
Zheaoli

Zheaoli commented on Jun 12, 2025

@Zheaoli
ContributorAuthor

The debugged process doesn't know the debugger PID, so it cannot emit it. From the debugged process's perspective, something writes to the interface, but it doesn't know who.

Yes, this is the issue I want to solve. I propose injecting the debugger PID at the same time as we inject the script, so the debugged process can know the original debugger process and where the script came from.

While I understand the motivation, I’d prefer not to include the debugger PID for the time being. The debugged process can’t verify it, and the debugger could lie—intentionally or not. I think it’s better to keep the debugged side minimal and avoid logging unverifiable data. If needed, the debugger side can log that information more reliably.

I think we can split this discussion into two parts:

  1. verified/known-source tools/debugger
  2. attack

For part 2, yes, the debugged process can’t verify it. But the unverified message itself still means a lot. For example, suppose the audit event emits a PID of -1 and the log has been collected, but at the same time we cannot find that PID in the host agent's history. Then we know an unexpected attack has happened.

BTW, here's a scenario from a real production environment. An enterprise may wrap a developer tool built on the remote_exec API and allow only specific processes to attach to the process. So code like the following could be used:

import sys
import json

# Hypothetical callback: receives the script path and the injected metadata
def audit_callback(script, metadata):
    metadata = json.loads(metadata)
    # Terminate if the attaching tool did not identify its version
    if "version" not in metadata:
        sys.exit(120)

So I propose to extend the remote_exec API:

  1. Allow the developer to inject some metadata at the same time. The maximum metadata length can be defined at compile time or at bootstrap time (maybe default to 256?).
  2. Emit the metadata in the audit event when the process has been attached to.

@pablogsal WDYT

vstinner

vstinner commented on Jun 12, 2025

@vstinner
Member

I don't think that it's a good idea to send the debugger PID. Pablo gave a good rationale for not doing that. I suggest closing this issue.

pablogsal

pablogsal commented on Jun 12, 2025

@pablogsal
Member

Thanks @Zheaoli for bringing this up and for the thoughtful discussion about security audit improvements.

I agree with @vstinner, and unfortunately I will decline for now, so I am closing this as not planned for a couple of key reasons:

  1. Passing additional metadata through the audit system introduces risk: the metadata is provided by an external process, and if it is interpreted as a Python object and is malformed, it could potentially crash the process.

  2. For the time being we prefer to keep the debugged side minimal and avoid adding complexity that could introduce new failure modes. I want to gain some experience from people using it to learn how to extend it, but for now it is too early to do so.

While I understand the security monitoring use case you've outlined, the potential risks of extending the API in this way outweigh the benefits at this time. As mentioned earlier in the discussion, the debugger side can log this information more reliably if needed.

Thanks again for the contribution and for thinking about Python's security posture!

pablogsal

pablogsal commented on Jun 12, 2025

@pablogsal
Member

Sorry, one correction: I reviewed your proposal further and noticed that passing the metadata as raw bytes could actually work and is not unsafe. But this still feels a bit heavyweight for now, because I can see people wanting the metadata not only in the audit system but also in the debugged process.
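To illustrate why raw bytes are the safer shape, here is a minimal sketch of how a consumer might treat such metadata defensively. No such metadata exists in the current API; `parse_metadata`, `MAX_METADATA_LEN`, and the choice of JSON are assumptions made for the example, with the limit echoing the 256 default floated earlier in the thread.

```python
import json

MAX_METADATA_LEN = 256  # hypothetical limit, not an existing CPython setting


def parse_metadata(raw, max_len=MAX_METADATA_LEN):
    """Return the metadata as a dict, or None if it is missing or malformed."""
    # Raw bytes carry no Python object structure, so nothing is trusted
    # until it has been length-checked, decoded, and parsed here.
    if not isinstance(raw, (bytes, bytearray)) or len(raw) > max_len:
        return None
    try:
        value = json.loads(bytes(raw).decode("utf-8"))
    except (ValueError, RecursionError):
        # Covers invalid UTF-8, invalid JSON, and pathological nesting.
        return None
    return value if isinstance(value, dict) else None
```

With this shape, malformed input from the attaching process results in `None` rather than an exception in the debugged process, and the consumer still decides what to do with well-formed but unverifiable content.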

Zheaoli

Zheaoli commented on Jun 12, 2025

@Zheaoli
ContributorAuthor

Sorry, one correction: I reviewed your proposal further and noticed that passing the metadata as raw bytes could actually work and is not unsafe. But this still feels a bit heavyweight for now, because I can see people wanting the metadata not only in the audit system but also in the debugged process.

Thanks for the explanation. Would you mind if I opened a discussion on discuss.python.org? Maybe we can collect more requirements about this.



      Allow the developer to inject some metadata to the target process when they use `sys.remote_exec` at the same time · Issue #135360 · python/cpython
