From a56e6be84956c92ab6c96b57fa8e7e5cfa4d61c2 Mon Sep 17 00:00:00 2001
From: nahcmon
Date: Mon, 8 Jun 2026 23:42:06 +0200
Subject: [PATCH] gh-62944: Add performance note for out-of-order extraction from compressed archives

Extracting members in a different order than they appear in a compressed
tarfile requires re-decompressing from the beginning of the stream for each
backward seek. Add a note to tarfile.open() documenting this and recommending
in-order extraction or use of TarFile.extractall() for best performance.
---
 Doc/library/tarfile.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Doc/library/tarfile.rst b/Doc/library/tarfile.rst
index 29a329fdfeab15..b8ce813379ad7a 100644
--- a/Doc/library/tarfile.rst
+++ b/Doc/library/tarfile.rst
@@ -123,6 +123,17 @@ Some facts and figures:
    :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this. If a
    compression method is not supported, :exc:`CompressionError` is raised.
 
+   .. note::
+
+      Compressed archives opened with modes like ``'r:gz'``, ``'r:bz2'``,
+      ``'r:xz'``, or ``'r:zst'`` support random access, but seeking backwards
+      in the underlying compressed stream requires re-decompressing from the
+      beginning. Extracting members in a different order than they appear in
+      the archive can therefore be significantly slower, because each backward
+      seek costs time proportional to the amount of data preceding the target
+      member rather than to that member's size. For best performance, extract
+      members in archive order or use :meth:`TarFile.extractall`.
+
    If *fileobj* is specified, it is used as an alternative to a :term:`file object`
    opened in binary mode for *name*. It is supposed to be at position 0.
 
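
For reviewers, here is a minimal sketch of the in-order access pattern the note recommends. It is not part of the patch. The helpers `build_archive` and `read_selected` are hypothetical names invented for this illustration, and the archive is a tiny in-memory one. The idea is to iterate over the `TarFile` once and read each wanted member as it is encountered, so the compressed stream is only ever read forwards.

```python
import io
import tarfile


def build_archive(names):
    """Return the bytes of a small gzip-compressed tar archive (demo data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            payload = name.encode() * 100
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_selected(data, wanted):
    """Read the wanted members in a single forward pass over the archive."""
    wanted = set(wanted)
    results = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        # Iterating the TarFile yields members in archive order, so each
        # read happens at or after the current stream position.
        for member in tar:
            if member.name in wanted:
                with tar.extractfile(member) as fh:
                    results[member.name] = fh.read()
    return results


data = build_archive(["a.txt", "b.txt", "c.txt"])
# The caller asks for "c.txt" first, but members are read in archive order.
contents = read_selected(data, ["c.txt", "a.txt"])
```

Calling `tar.extractfile("c.txt")` followed by `tar.extractfile("a.txt")` on the same archive would instead force the second call to seek backwards, which is the slow case the note describes.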