Clarify base64.a85(en,de)code documentation for Adobe mode #134837
Bug report
Bug description:
It seems that base64.a85decode allows whitespace everywhere except after the end-of-data delimiter b'~>' in adobe mode:
>>> base64.a85decode(b"6#q'\\F`JTK<-N74;eT`QF!;`!@:O(oDf,~>", adobe=True)
b'Arthur "Two-Sheds" Jackson'
>>> base64.a85decode(b" 6 # q' \\ F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf,~>", adobe=True)
b'Arthur "Two-Sheds" Jackson'
>>> base64.a85decode(b" 6 # q' \\ F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf, ")
b'Arthur "Two-Sheds" Jackson'
>>> base64.a85decode(b" 6 # q' \\ F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf,~> ", adobe=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.11/base64.py", line 388, in a85decode
    raise ValueError(
ValueError: Ascii85 encoded byte sequences must end with b'~>'
While this behaviour may actually be intentional, it causes problems in practice because of the legacy of decades of ambiguous PDF standards, and of implementations that emit and accept extra whitespace because of those ambiguities.
A separate but related issue is that some very broken PDF implementations have even been known to insert whitespace between the ~ and > bytes. It may be useful for "Adobe" mode to tolerate this as well.
Also, PostScript doesn't care about extra whitespace after ~> in ASCII85 literal strings. (Note that the leading <~ is only accepted in PostScript and not in PDF.)
Because > is a valid ASCII85 digit, a bare > cannot be treated as the terminator on its own. An improved rule would be to accept only input that matches the regular expression ~\s*>\s* at the end of input in Adobe mode.
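As a rough illustration of the proposed rule, here is a user-side workaround sketch, not a patch to base64 itself. The helper name a85decode_tolerant and the compiled pattern are hypothetical. It assumes the input is meant to be Adobe-framed data, rewrites a trailing ~\s*>\s* to a plain ~>, and then defers to the existing base64.a85decode(..., adobe=True).

```python
import base64
import re

# Hypothetical pattern for the proposed rule: the end-of-data marker,
# tolerating ASCII whitespace between "~" and ">" and after ">".
_TOLERANT_EOD = re.compile(rb"~\s*>\s*\Z")


def a85decode_tolerant(data: bytes) -> bytes:
    """Decode Adobe-style Ascii85, tolerating whitespace around the ~> marker.

    This is an illustrative wrapper, not part of the standard library.
    """
    # Rewrite the tolerant trailer to the exact b"~>" that a85decode expects.
    normalized, count = _TOLERANT_EOD.subn(b"~>", data)
    if count == 0:
        # Same message as the standard library, for consistency.
        raise ValueError("Ascii85 encoded byte sequences must end with b'~>'")
    return base64.a85decode(normalized, adobe=True)


# The failing input from the report (trailing space after ~>) now decodes.
sample = b" 6 # q' \\ F`JTK<-N 7 4 ;eT`QF!;`!@:O(oDf,~> "
print(a85decode_tolerant(sample))  # b'Arthur "Two-Sheds" Jackson'
```

Anchoring the pattern at the end of input means a > inside the encoded data is never mistaken for part of the delimiter, since it only counts when preceded by ~ and followed by nothing but whitespace.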
CPython versions tested on:
3.11
Operating systems tested on:
Linux