Trailing spaces at the end of file or directory names are not handled when extracting zip files on Windows. #94018

@Rygone

Description

Bug report

Trailing spaces at the end of file or directory names are not handled when extracting zip files on Windows. Extraction fails with:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Documents \\test.txt'

This can be reproduced with the attached Documents.zip
and this piece of code:

from zipfile import ZipFile

# "archive" avoids shadowing the built-in zip()
with ZipFile('Documents.zip', 'r') as archive:
    archive.extractall()
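
For readers without the attached Documents.zip, an equivalent archive can probably be built in a few lines. This is only a sketch: it assumes the attachment holds a single empty entry named `Documents /test.txt` (which is what the listings later in the thread show), and `make_repro_zip` is a hypothetical helper name, not part of the report.

```python
import io
from zipfile import ZipFile


def make_repro_zip():
    """Return the bytes of a zip whose directory name ends with a space."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as archive:
        # Assumed layout: one empty file inside a directory named "Documents ".
        archive.writestr("Documents /test.txt", "")
    return buffer.getvalue()


if __name__ == "__main__":
    data = make_repro_zip()
    with ZipFile(io.BytesIO(data)) as archive:
        # repr() makes the trailing space in the directory name visible.
        print([repr(name) for name in archive.namelist()])
```

Writing these bytes to Documents.zip and calling extractall() on Windows should then exercise the code path described above.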

Fix proposal

cpython/Lib/zipfile.py : 1690

# remove trailing spaces
def remove_end_spaces(x):
    for c in x[::-1]:
        if c == ' ':
            x = x[:-1]
        else:
            return x
    # every character was a space (or x was empty)
    return x
arcname = (remove_end_spaces(x) for x in arcname)

Your environment

  • CPython versions tested on: Python 3.9
  • Operating system and architecture: Windows 10 Professional 21H2 19044.1706

Activity

added the type-bug label (An unexpected behavior, bug, or error) on Jun 20, 2022

dignissimus commented on Jun 20, 2022

@dignissimus
Contributor

Removing trailing spaces can be done using str.rstrip.

I don't think this is an issue with the zipfile library. Testing the archive shows that the stored filename includes the trailing space, and unzip extracts the file into a directory whose name keeps that trailing space. I don't think this behaviour should be changed.

sam@samtop /tmp % zip -Tv Documents.zip 
	zip warning: undefined bits used in flags = 0x0808: Documents /test.txt
Archive:  Documents.zip
    testing: Documents /test.txt      OK
No errors detected in compressed data of Documents.zip.
test of Documents.zip OK
sam@samtop /tmp % 

Testing with p7zip shows the same:

sam@samtop /tmp % 7z t Documents.zip -bb3

7-Zip [64] 17.04 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.04 (locale=en_GB.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs x64)

Scanning the drive for archives:
1 file, 210 bytes (1 KiB)

Testing archive: Documents.zip
--
Path = Documents.zip
Type = zip
Physical Size = 210

T Documents /test.txt
Everything is Ok

Size:       0
Compressed: 210
sam@samtop /tmp % 
sam@samtop /tmp % 7z l Documents.zip -bb3

7-Zip [64] 17.04 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.04 (locale=en_GB.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs x64)

Scanning the drive for archives:
1 file, 210 bytes (1 KiB)

Listing archive: Documents.zip

--
Path = Documents.zip
Type = zip
Physical Size = 210

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2022-06-20 03:19:08 .....            0            2  Documents /test.txt
------------------- ----- ------------ ------------  ------------------------
2022-06-20 03:19:08                  0            2  1 files

Rygone commented on Jun 20, 2022

@Rygone
Contributor, Author

I completely agree about str.rstrip.

However, the problem occurs on Windows machines.
Windows Explorer extracts these entries without the trailing spaces, because file and directory names cannot end with a space on Windows.
That's why I propose making the change in _sanitize_windows_name().

This is already done for trailing dots (zipfile.py : 1688).

So, the new proposal:
cpython/Lib/zipfile.py : 1687

# remove trailing dots and spaces
arcname = (x.rstrip(' .') for x in arcname.split(pathsep))
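
To see the effect of the proposed line in isolation, the stand-alone sketch below mimics just the trailing-character step. It is not the actual `_sanitize_windows_name()` source: `strip_trailing_dots_and_spaces` is a hypothetical helper, and the illegal-character handling that the real method also performs is left out.

```python
def strip_trailing_dots_and_spaces(arcname, pathsep="/"):
    """Strip trailing dots and spaces from each path component."""
    parts = (part.rstrip(" .") for part in arcname.split(pathsep))
    # Rejoin, dropping components that became empty after stripping.
    return pathsep.join(part for part in parts if part)


print(strip_trailing_dots_and_spaces("Documents /test.txt"))
# Documents/test.txt
```

Passing both characters to a single rstrip call also covers mixed endings such as `"name. "`, which stripping dots and spaces in two separate passes could miss.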

dignissimus commented on Jun 20, 2022

@dignissimus
Contributor

OK, if it causes errors on Windows, then updating the sanitisation function for Windows sounds very reasonable.
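
On a Python version without such a change, one possible user-side workaround is to extract members manually and clean each path component first. The sketch below is an assumption-laden illustration, not what zipfile does internally: `extract_stripping_trailing` is a hypothetical helper, it only strips trailing dots and spaces (it does not replace other characters that are illegal on Windows), and it does not restore permissions or timestamps.

```python
import os
import shutil
from zipfile import ZipFile


def extract_stripping_trailing(zip_path, dest="."):
    """Extract all members, stripping trailing dots/spaces from each path part."""
    dest = os.path.abspath(dest)
    written = []
    with ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Zip member names always use "/" as the separator.
            parts = [p.rstrip(" .") for p in info.filename.split("/")]
            # Drop empty parts; "." and ".." also become empty after stripping.
            parts = [p for p in parts if p]
            if not parts:
                continue
            target = os.path.abspath(os.path.join(dest, *parts))
            # Refuse anything that would land outside dest.
            if os.path.commonpath([dest, target]) != dest:
                raise ValueError(f"unsafe member name: {info.filename!r}")
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Calling `extract_stripping_trailing('Documents.zip')` would then write `Documents/test.txt` under the current directory instead of failing on `Documents /test.txt`.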

added a commit that references this issue on Jun 30, 2022
added a commit that references this issue on May 1, 2024
176fd55
Metadata

Assignees: No one assigned
Labels: OS-windows, type-bug (An unexpected behavior, bug, or error)
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests
