hfutils.archive

Overview:

Utilities for packing and unpacking archive files.

Supported Formats:

Format    Extension Name
7z        .7z
bztar     .tar.bz2, .tbz2
gztar     .tar.gz, .tgz
rar       .rar
tar       .tar
xztar     .tar.xz, .txz
zip       .zip

Note

To enable support for the 7z and RAR formats, install hfutils with the corresponding extras:

pip install hfutils[7z]
pip install hfutils[rar]

Warning

Creating archives in the RAR format is not supported: hfutils relies on the rarfile library, which does not provide a way to create RAR files.

register_archive_type

hfutils.archive.register_archive_type(name: str, exts: List[str], fn_pack: Callable, fn_unpack: Callable, fn_writer: Callable[[str], ArchiveWriter])

Register a new archive type with its associated handlers and extensions.

This function allows for the registration of custom archive formats by providing the necessary functions for packing, unpacking, and creating archive writers.

Parameters:
  • name (str) – Identifier for the archive type (e.g., 'zip', 'tar').

  • exts (List[str]) – List of file extensions for this archive type (e.g., ['.zip']).

  • fn_pack (Callable) – Function to create archives of this type.

  • fn_unpack (Callable) – Function to extract archives of this type.

  • fn_writer (Callable[[str], ArchiveWriter]) – Function to create an archive writer instance.

Raises:

ValueError – If no file extensions are provided.

Example:
>>> def my_pack(directory, archive_file, **kwargs): pass
>>> def my_unpack(archive_file, directory, **kwargs): pass
>>> def my_writer(archive_file): return CustomWriter(archive_file)
>>> register_archive_type('custom', ['.cst'], my_pack, my_unpack, my_writer)

archive_pack

hfutils.archive.archive_pack(type_name: str, directory: str, archive_file: str, pattern: str | None = None, silent: bool = False, clear: bool = False)

Create an archive from a directory using the specified archive type.

Parameters:
  • type_name (str) – Name of the archive type to use.

  • directory (str) – Source directory to archive.

  • archive_file (str) – Output archive file path.

  • pattern (str, optional) – File pattern used to filter which files are packed (e.g., '*.txt').

  • silent (bool) – Whether to suppress warnings.

  • clear (bool) – Whether to remove existing files when packing.

Raises:

ValueError – If the archive type is not registered.

Example:
>>> archive_pack('zip', '/data', 'backup.zip', pattern='*.dat', silent=True)
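
As a rough illustration of what a pattern filter does during packing, here is a stdlib-only sketch that zips the files under a directory whose relative paths match a pattern. The helper `pack_zip` is hypothetical and relies on `fnmatch`; how `archive_pack` itself interprets `pattern` may differ.

```python
import fnmatch
import os
import tempfile
import zipfile
from typing import List, Optional


def pack_zip(directory: str, archive_file: str, pattern: Optional[str] = None) -> List[str]:
    """Zip files under `directory`, keeping only relative paths that match `pattern`."""
    added = []
    with zipfile.ZipFile(archive_file, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(directory):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Use forward slashes so archive names look the same on every platform.
                rel_path = os.path.relpath(full_path, directory).replace(os.sep, "/")
                if pattern is None or fnmatch.fnmatch(rel_path, pattern):
                    zf.write(full_path, rel_path)
                    added.append(rel_path)
    return added


with tempfile.TemporaryDirectory() as workdir:
    # Build a small source directory with two .dat files and one .txt file.
    source_dir = os.path.join(workdir, "data")
    os.makedirs(source_dir)
    for filename in ("a.dat", "b.dat", "notes.txt"):
        with open(os.path.join(source_dir, filename), "w") as f:
            f.write("example")

    # The archive is written outside the source directory, so it never packs itself.
    archive_path = os.path.join(workdir, "backup.zip")
    packed = pack_zip(source_dir, archive_path, pattern="*.dat")
    with zipfile.ZipFile(archive_path) as zf:
        names = sorted(zf.namelist())
    print(names)  # ['a.dat', 'b.dat']
```

Note that `fnmatch` lets `*` match path separators as well, so in this sketch `'*.dat'` would also select `.dat` files in subdirectories.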

archive_unpack

hfutils.archive.archive_unpack(archive_file: str, directory: str, silent: bool = False, password: str | None = None)

Extract an archive file to a directory.

Parameters:
  • archive_file (str) – Path to the archive file to extract.

  • directory (str) – Destination directory for extraction.

  • silent (bool) – Whether to suppress warnings.

  • password (str, optional) – Password for protected archives.

Raises:

ValueError – If the archive type is not recognized.

Example:
>>> archive_unpack('protected.zip', 'output_dir', password='secret')

archive_writer

hfutils.archive.archive_writer(type_name: str, archive_file: str) → ArchiveWriter

Create an archive writer for the specified archive type.

Parameters:
  • type_name (str) – Name of the archive type.

  • archive_file (str) – Path to the archive file to create.

Returns:

An archive writer instance.

Return type:

ArchiveWriter

Raises:

ValueError – If the archive type is not registered.

Example:
>>> with archive_writer('zip', 'output.zip') as writer:
...     writer.add('file.txt', 'docs/file.txt')

ArchiveWriter

class hfutils.archive.ArchiveWriter(archive_file: str)

Base class for creating and managing archive writers.

This class provides a context manager interface for handling archive files, allowing for safe resource management and consistent file addition operations. It serves as a template for specific archive format implementations.

Parameters:

archive_file (str) – Path to the archive file to be created or modified.

Example:
>>> with ArchiveWriter('output.zip') as writer:
...     writer.add('file.txt', 'archive_path/file.txt')

__enter__()

Context manager entry point.

Returns:

Self reference for use in context manager.

Return type:

ArchiveWriter

__exit__(exc_type, exc_val, exc_tb)

Context manager exit point.

Ensures proper cleanup of resources when exiting the context.

Parameters:
  • exc_type – Exception type if an error occurred.

  • exc_val – Exception value if an error occurred.

  • exc_tb – Exception traceback if an error occurred.

__init__(archive_file: str)

add(filename: str, arcname: str)

Add a file to the archive.

Parameters:
  • filename (str) – Path to the file to be added.

  • arcname (str) – Desired path within the archive.

close()

Close the archive and release resources.

This method ensures proper cleanup of resources and is automatically called when using the context manager.

open()

Open the archive for writing.

Initializes the archive handler if it hasn’t been created yet. This method is automatically called when using the context manager.
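
The interface described above (open, add, close, plus the context-manager hooks) can be mirrored in a few lines. The following sketch is a stand-alone class backed by the standard-library `zipfile` module. `ZipWriterSketch` is a hypothetical name; it does not subclass hfutils's `ArchiveWriter`, and the real base class may be structured differently internally.

```python
import os
import tempfile
import zipfile
from typing import Optional


class ZipWriterSketch:
    """Stand-alone writer mirroring the documented open/add/close interface."""

    def __init__(self, archive_file: str):
        self.archive_file = archive_file
        self._handler: Optional[zipfile.ZipFile] = None

    def open(self) -> None:
        # Create the underlying handler only once, on first use.
        if self._handler is None:
            self._handler = zipfile.ZipFile(self.archive_file, "w", zipfile.ZIP_DEFLATED)

    def add(self, filename: str, arcname: str) -> None:
        # Opening lazily lets add() work even outside a `with` block.
        self.open()
        self._handler.write(filename, arcname)

    def close(self) -> None:
        if self._handler is not None:
            self._handler.close()
            self._handler = None

    def __enter__(self) -> "ZipWriterSketch":
        self.open()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Always release the file handle, even if the body raised.
        self.close()


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "file.txt")
    with open(source, "w") as f:
        f.write("hello")

    target = os.path.join(workdir, "output.zip")
    with ZipWriterSketch(target) as writer:
        writer.add(source, "docs/file.txt")

    # Read the archive back to confirm the file landed under its archive path.
    with zipfile.ZipFile(target) as zf:
        stored_names = zf.namelist()
        stored_text = zf.read("docs/file.txt").decode()
    print(stored_names)  # ['docs/file.txt']
```

Closing in `__exit__` regardless of whether an exception occurred is what makes the `with` form the safer way to use a writer like this.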

get_archive_type

hfutils.archive.get_archive_type(archive_file: str) → str

Determine the archive type from a file’s extension.

Parameters:

archive_file (str) – Path to the archive file.

Returns:

Name of the detected archive type.

Return type:

str

Raises:

ValueError – If the file extension doesn’t match any registered type.

Example:
>>> type_name = get_archive_type('data.tar.gz')
>>> print(type_name)
gztar

get_archive_extname

hfutils.archive.get_archive_extname(type_name: str) → str

Retrieve the primary file extension for a registered archive type.

Parameters:

type_name (str) – Name of the archive type.

Returns:

Primary file extension for the archive type.

Return type:

str

Raises:

ValueError – If the archive type is not registered.

Example:
>>> ext = get_archive_extname('zip')
>>> print(ext)
.zip

archive_splitext

hfutils.archive.archive_splitext(filename: str) → Tuple[str, str]

Split a filename into root and extension, handling compound extensions.

Parameters:

filename (str) – The filename to split.

Returns:

Tuple of (root, extension).

Return type:

Tuple[str, str]

Example:
>>> root, ext = archive_splitext('data.tar.gz')
>>> print(root, ext)
data .tar.gz
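
Compound extensions are the reason a dedicated helper is useful: `os.path.splitext` strips only the final suffix. The sketch below shows the difference using the extensions from the Supported Formats table. `split_archive_ext` and `KNOWN_EXTS` are hypothetical stand-ins, and the fallback to `os.path.splitext` for unrecognized extensions is an assumption rather than documented behavior.

```python
import os
from typing import Tuple

# Extensions from the Supported Formats table, sorted longest first so that a
# longer extension is always tried before any shorter one that could also match.
KNOWN_EXTS = sorted(
    [".7z", ".tar.bz2", ".tbz2", ".tar.gz", ".tgz", ".rar", ".tar", ".tar.xz", ".txz", ".zip"],
    key=len,
    reverse=True,
)


def split_archive_ext(filename: str) -> Tuple[str, str]:
    """Split off a known archive extension, treating '.tar.gz' and similar as one unit."""
    lowered = filename.lower()
    for ext in KNOWN_EXTS:
        if lowered.endswith(ext):
            # Slice the original string so the caller's casing is preserved.
            return filename[: -len(ext)], filename[-len(ext):]
    # Assumed fallback for anything that is not a known archive extension.
    return os.path.splitext(filename)


print(os.path.splitext("data.tar.gz"))   # ('data.tar', '.gz')
print(split_archive_ext("data.tar.gz"))  # ('data', '.tar.gz')
print(split_archive_ext("notes.txt"))    # ('notes', '.txt')
```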