Description
For a directory, SimpleHTTPRequestHandler generates an index.html page containing a list of files. It uses the filesystem encoding for the page, which is reasonable, because file names are encoded with that encoding. The problem is that the directory path, which is included in the title, can contain the query part of the URL, and that part may not be encodable with the filesystem encoding.
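The failure can be sketched outside the server as follows. This is a minimal illustration, not the code in `http/server.py`: the `try_encode` helper is hypothetical, the title string is a stand-in for the generated page, and `koi8-u` is hard-coded as an assumed filesystem encoding (it is the codec named in the traceback in this report).

```python
import urllib.parse

# The request target used by test_undecodable_parameter.
request_path = "/?x=%bb"

# %bb is not valid UTF-8 on its own, so unquote() with its default
# errors="replace" turns it into U+FFFD (the replacement character).
display_path = urllib.parse.unquote(request_path)
title = f"Directory listing for {display_path}"


def try_encode(text, encoding):
    """Hypothetical helper: return the encoded bytes, or the error raised."""
    try:
        return text.encode(encoding, "surrogateescape")
    except UnicodeEncodeError as exc:
        return exc


# koi8-u stands in for the filesystem encoding of the failing locale.
koi8_result = try_encode(title, "koi8-u")
print(type(koi8_result).__name__)
```

The `surrogateescape` handler only helps with lone surrogates produced from undecodable bytes; it does nothing for U+FFFD, which simply has no mapping in the charmap codec.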
This causes a test failure when running in a non-UTF-8 locale:
$ LC_ALL=uk_UA ./python -m test -vuall test_httpservers -m test_undecodable_parameter
...
test_undecodable_parameter (test.test_httpservers.SimpleHTTPServerTestCase.test_undecodable_parameter) ... ----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 48062)
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/socketserver.py", line 349, in process_request
    self.finish_request(request, client_address)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/socketserver.py", line 362, in finish_request
    self.RequestHandlerClass(request, client_address, self)
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/http/server.py", line 721, in __init__
    super().__init__(*args, **kwargs)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/socketserver.py", line 766, in __init__
    self.handle()
    ~~~~~~~~~~~^^
  File "/home/serhiy/py/cpython/Lib/http/server.py", line 485, in handle
    self.handle_one_request()
    ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/serhiy/py/cpython/Lib/http/server.py", line 473, in handle_one_request
    method()
    ~~~~~~^^
  File "/home/serhiy/py/cpython/Lib/http/server.py", line 725, in do_GET
    f = self.send_head()
  File "/home/serhiy/py/cpython/Lib/http/server.py", line 769, in send_head
    return self.list_directory(path)
           ~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/home/serhiy/py/cpython/Lib/http/server.py", line 874, in list_directory
    encoded = '\n'.join(r).encode(enc, 'surrogateescape')
  File "/home/serhiy/py/cpython/Lib/encodings/koi8_u.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
           ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 178: character maps to <undefined>
encoding with 'koi8-u' codec failed
----------------------------------------
ERROR
======================================================================
ERROR: test_undecodable_parameter (test.test_httpservers.SimpleHTTPServerTestCase.test_undecodable_parameter)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_httpservers.py", line 559, in test_undecodable_parameter
    response = self.request(self.base_url + '/?x=%bb').read()
               ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/test/test_httpservers.py", line 131, in request
    return self.connection.getresponse()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/serhiy/py/cpython/Lib/http/client.py", line 1430, in getresponse
    response.begin()
    ~~~~~~~~~~~~~~^^
  File "/home/serhiy/py/cpython/Lib/http/client.py", line 331, in begin
    version, status, reason = self._read_status()
                              ~~~~~~~~~~~~~~~~~^^
  File "/home/serhiy/py/cpython/Lib/http/client.py", line 300, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
                             " response")
http.client.RemoteDisconnected: Remote end closed connection without response
----------------------------------------------------------------------
I suspect that there may also be issues if some files in the directory have non-decodable names or if the path of the directory is non-decodable, but I have not tested this yet.
Linked PRs
- gh-133889: Encode file paths in utf-8 by default in list_directory #133975
- gh-133889: Improve tests for SimpleHTTPRequestHandler #134102
- [3.14] gh-133889: Improve tests for SimpleHTTPRequestHandler (GH-134102) #134121
- [3.13] gh-133889: Improve tests for SimpleHTTPRequestHandler (GH-134102) #134122
- gh-133889: Only show the path of the URL in the SimpleHTTPRequestHandler page #134135
- [3.14] gh-133889: Only show the path of the URL in the SimpleHTTPRequestHandler page (GH-134135) #134190
- [3.13] gh-133889: Only show the path of the URL in the SimpleHTTPRequestHandler page (GH-134135) #134191
Activity
latin-1 #133677
StanFromIreland commented on May 13, 2025
There is no good way that guarantees it will work with a simple locale, so why not encode in utf-8 instead? It is becoming the default in Python anyway, and I believe it is the default in the majority of web browsers. We could make it optional to use the system encoding, and default to the web standard?
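As a rough sketch of the two directions named in this thread (encoding the page in utf-8, and showing only the path of the URL, as the linked PR titles put it), the snippet below compares both on the failing request. The `listing_title` helper and its `only_path` parameter are hypothetical and are not part of `http.server`; `koi8-u` is again assumed as the locale encoding.

```python
import urllib.parse


def listing_title(request_path, only_path=False):
    """Hypothetical helper: build a page title from a request target."""
    if only_path:
        # Drop the query and fragment, keeping only the path component.
        request_path = urllib.parse.urlsplit(request_path).path
    return "Directory listing for " + urllib.parse.unquote(request_path)


# Direction 1: utf-8 can encode U+FFFD, so the full title is encodable.
utf8_page = listing_title("/?x=%bb").encode("utf-8", "surrogateescape")

# Direction 2: without the query part, the title is plain ASCII here and
# encodes fine even with a charmap codec such as koi8-u.
koi8_page = listing_title("/?x=%bb", only_path=True).encode(
    "koi8-u", "surrogateescape"
)

print(utf8_page)
print(koi8_page)
```

Neither variant addresses file names or directory paths that are themselves non-decodable, which the report leaves as untested.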