Description
Running this script under free-threading build:
import os
import tempfile
import threading
N=2
COUNT=100
def writer(file, barrier):
barrier.wait()
for _ in range(COUNT):
f.write("x")
def reader(file, stopping):
while not stopping.is_set():
for line in file:
assert line == ""
stopping = threading.Event()
with tempfile.NamedTemporaryFile("w+") as f:
reader = threading.Thread(target=reader, args=(f, stopping))
reader.start()
barrier = threading.Barrier(N)
writers = [threading.Thread(target=writer, args=(f, barrier))
for _ in range(N)]
for t in writers:
t.start()
for t in writers:
t.join()
stopping.set()
reader.join()
f.flush()
assert(os.stat(f.name).st_size == COUNT * N)
...results in a crash:
python: ./Modules/_io/textio.c:1751: _io_TextIOWrapper_write_impl: Assertion `self->pending_bytes_count == 0' failed.
Aborted (core dumped)
The textiowrapper_iternext
method is not protected by a critical section and calls _textiowrapper_readline
, which calls _textiowrapper_writeflush
, which relies on the GIL or critical section to synchronise access to its internal data. In a free-threading build this means iterating over lines in a text file is not thread-safe and can crash if it races with writes or other operations.
It looks like all other entry points that could call _textiowrapper_writeflush
are protected by the critical section, so the easiest fix is probably to just likewise protect textiowrapper_iternext
.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
No response
Activity
pythongh-134908: protect ``textiowrapper_iternext`` with critical sec…
textiowrapper_iternext
with critical section #134910duaneg commentedon May 30, 2025
Note that I found this while investigating #118138, however I think it is a separate bug that should be addressed under a separate issue.
CC @cmaloney who seems to be looking at related issues in this area already 👋
ZeroIntensity commentedon May 30, 2025
I'm pretty sure we don't support concurrent calls to
__next__
. You need to manually serialize the calls yourself. See gh-124397.duaneg commentedon May 30, 2025
That is a very interesting discussion, thanks for linking!
I totally agree that in general iterators should not be assumed to be thread-safe, in the broader sense. However they should not crash, which is happening here. In the description I reported an assertion failure, but the repro script will crash with assertions disabled too, just in more "interesting" ways.
ZeroIntensity commentedon May 30, 2025
You can get similar crashes from pure-Python by concurrently calling an object's
__init__
. I don't think there's much we can do without substantially hurting performance.duaneg commentedon May 30, 2025
I'm sure there are places where fixing crashes without substantially hurting performance might be difficult or impossible, but I don't think this is one of them. It should have no impact on GIL performance and minimal in free-threading.
The third bullet point in #124397 is:
I think this fix is exactly that...or at least, that was exactly what I intended when I wrote it. Of course, I may well have missed something!
ZeroIntensity commentedon May 30, 2025
I'll let others weigh in here, but my personal stance is that concurrent iteration without synchronization by the caller is not a real use-case. The example you sent would suffer from so much lock contention that it would be no better than calling it in a single thread :(
I think that last point is in regards to making the iterator thread-safe when others are calling separate methods on it, not necessarily concurrent calls to
__next__
. I think this needs clarification, though.colesbury commentedon May 30, 2025
From looking at the repro code, this looks like something we should fix.
ZeroIntensity commentedon May 30, 2025
Ok, thanks Sam. For clarity: when should we fix these cases?
colesbury commentedon May 30, 2025
This looks like concurrent operations on a NamedTemporaryFile. I think I/O should generally be internally synchronized. That's mostly true for Python already, it's usually true at the OS level, as well as in other language runtimes.
I get that there is a crash in iteration, but it doesn't look like there's an iterator shared between threads.
15 remaining items