tarfile: TarFile.offset attribute is not updated with the remainder when closing a TarFile #129255

@emontnemery

Bug report

Bug description:

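When a TarFile opened for writing is closed, close() pads the archive with NUL bytes up to a multiple of RECORDSIZE, but the TarFile.offset attribute is not advanced by the size of that padding.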

import io
import pathlib
import tarfile

tar = tarfile.open("test.tar", "w")
t = tarfile.TarInfo("foo")
t.size = 123
tar.addfile(t, io.BytesIO(b"a" * t.size))
tar.close()
assert len(pathlib.Path("test.tar").read_bytes()) == tar.offset  # Fails because tar.offset is 2048, although 10240 bytes have been written
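
The two numbers in the comment above can be reproduced from the module constants. The following is a minimal sketch, assuming an uncompressed archive with a single member whose header fits in one block. The helper `expected_archive_size` is hypothetical and not part of tarfile; it only mirrors the block and record rounding.

```python
import io
import tarfile


def expected_archive_size(payload_size: int) -> int:
    """Hypothetical helper: predicted size of a one-member, uncompressed tar."""
    block = tarfile.BLOCKSIZE      # 512
    record = tarfile.RECORDSIZE    # 10240
    header = block                                # one header block
    data = -(-payload_size // block) * block      # payload rounded up to a block
    eof = 2 * block                               # two zero blocks mark end of archive
    unpadded = header + data + eof                # 2048 for a 123-byte payload
    return -(-unpadded // record) * record        # rounded up to a full record


# Write the same member as in the report, but to an in-memory buffer.
buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
info = tarfile.TarInfo("foo")
info.size = 123
tar.addfile(info, io.BytesIO(b"a" * info.size))
tar.close()  # the caller-supplied buffer stays open after close()

written = len(buf.getvalue())
print(written, expected_archive_size(123))
```

The 2048 bytes counted before padding are what the report observes in tar.offset; the remaining 8192 bytes are the record padding that close() writes without counting.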

If this is intentional, a test case asserting the current behavior should be added.
If it is not intentional, the snippet below includes a suggested change that makes the offset match the number of bytes written, together with a test case asserting it.
In either case, I'd be happy to submit a PR.

diff --git a/Lib/tarfile.py b/Lib/tarfile.py
index a0fab46b24e..feafb88d2d3 100644
--- a/Lib/tarfile.py
+++ b/Lib/tarfile.py
@@ -2027,6 +2027,7 @@ def close(self):
                 blocks, remainder = divmod(self.offset, RECORDSIZE)
                 if remainder > 0:
                     self.fileobj.write(NUL * (RECORDSIZE - remainder))
+                    self.offset += (RECORDSIZE - remainder)
         finally:
             if not self._extfileobj:
                 self.fileobj.close()
diff --git a/Lib/test/test_tarfile.py b/Lib/test/test_tarfile.py
index 2549b6b35ad..4d1a2b2171b 100644
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -1333,6 +1333,18 @@ def test_eof_marker(self):
         with self.open(tmpname, "rb") as fobj:
             self.assertEqual(len(fobj.read()), tarfile.RECORDSIZE * 2)

+    def test_offset_on_close(self):
+        # Check the offset after calling close matches the total number of
+        # bytes written.
+        tar = tarfile.open(tmpname, self.mode)
+        t = tarfile.TarInfo("foo")
+        tar.addfile(t)
+        tar.close()
+
+        with self.open(tmpname, "rb") as fobj:
+            self.assertEqual(len(fobj.read()), tar.offset)
+

 class WriteTest(WriteTestBase, unittest.TestCase):
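
Until such a change is available, the final size can still be derived from the offset. This is a minimal sketch, assuming an uncompressed archive; `round_up_to_record` is a hypothetical helper, not a tarfile API. It gives the same result whether or not close() already added the padding to the offset.

```python
import io
import tarfile


def round_up_to_record(offset: int) -> int:
    """Hypothetical helper: round an offset up to a multiple of RECORDSIZE."""
    blocks, remainder = divmod(offset, tarfile.RECORDSIZE)
    if remainder > 0:
        blocks += 1
    return blocks * tarfile.RECORDSIZE


buf = io.BytesIO()
tar = tarfile.open(fileobj=buf, mode="w")
tar.addfile(tarfile.TarInfo("foo"))  # empty member, as in the proposed test
tar.close()

written = len(buf.getvalue())
final_offset = round_up_to_record(tar.offset)
print(written, final_offset)
```

Rounding an offset that is already a multiple of RECORDSIZE leaves it unchanged, so the check does not depend on which behavior the interpreter has.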

CPython versions tested on:

3.14, 3.13

Operating systems tested on:

Linux

Activity

changed the title from "tarfile: The offset does not match the number of bytes written after closing a tarfile which has been written to" to "tarfile: TarFile.offset attribute is not updated with the remainder when closing a TarFile" on Jan 24, 2025
StanFromIreland (Member) commented on Feb 26, 2025

I agree the offset should be updated. Are you still planning on opening a PR?

emontnemery (Contributor, Author) commented on Feb 28, 2025

Yes, sure, I'll open a PR 👍

      tarfile: TarFile.offset attribute is not updated with the remainder when closing a TarFile · Issue #129255 · python/cpython