hfutils.utils.arrange

A module for managing and grouping files based on size and structure.

This module walks directories, groups the files it finds by configurable criteria, and manages file collections under size constraints. It is particularly useful for file organization, batch processing, and storage management.

Example usage:
>>> from hfutils.utils.arrange import walk_files_with_groups
>>> groups = walk_files_with_groups("./data", pattern="*.txt", max_total_size="1GB")
>>> for group in groups:
...     print(f"Group size: {group.size}, File count: {group.count}")

FileItem

class hfutils.utils.arrange.FileItem(file: str, size: int, count: int)

A data class representing a single file with its properties.

Parameters:
  • file (str) – Path to the file

  • size (int) – Size of the file in bytes

  • count (int) – Number of files this item represents (typically 1)

classmethod from_file(file: str, rel_to: str | None = None) -> FileItem

Create a FileItem instance from a file path.

Parameters:
  • file (str) – Path to the file

  • rel_to (Optional[str]) – Optional path to make the file path relative to

Returns:

A new FileItem instance

Return type:

FileItem

Raises:

FileNotFoundError – If the file does not exist
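
As a rough illustration of the behaviour described for from_file, the sketch below builds a comparable record using only the standard library. SketchFileItem and demo_file_item are hypothetical names, not part of hfutils, and the sketch assumes that rel_to only changes the stored path while the size is still read from the real file; the actual implementation may differ.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchFileItem:
    """Hypothetical stand-in for FileItem; not the hfutils implementation."""
    file: str
    size: int
    count: int

    @classmethod
    def from_file(cls, file: str, rel_to: Optional[str] = None) -> "SketchFileItem":
        # os.path.getsize raises FileNotFoundError for a missing path,
        # which matches the documented error behaviour.
        size = os.path.getsize(file)
        # Assumption: rel_to only affects the stored path, not which file is read.
        path = os.path.relpath(file, rel_to) if rel_to is not None else file
        return cls(file=path, size=size, count=1)


def demo_file_item() -> SketchFileItem:
    """Create a 5-byte file in a temporary directory and describe it."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "sub", "a.txt")
        os.makedirs(os.path.dirname(target))
        with open(target, "wb") as f:
            f.write(b"hello")
        return SketchFileItem.from_file(target, rel_to=root)
```

Storing the path relative to a root keeps the record portable when the same directory tree is later uploaded or archived under a different prefix.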

FilesGroup

class hfutils.utils.arrange.FilesGroup(files: List[str], size: int, count: int)

A data class representing a group of files with collective properties.

Parameters:
  • files (List[str]) – List of file paths in the group

  • size (int) – Total size of all files in the group

  • count (int) – Total number of files in the group

add(file: FileItem | FilesGroup) -> FilesGroup

Add a FileItem or another FilesGroup to this group.

Parameters:

file (Union[FileItem, FilesGroup]) – The item to add to the group

Returns:

The group itself, so that calls can be chained

Return type:

FilesGroup

Raises:

TypeError – If the input type is not FileItem or FilesGroup

classmethod new() -> FilesGroup

Create a new empty FilesGroup instance.

Returns:

A new empty FilesGroup

Return type:

FilesGroup
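
The accumulate-and-chain behaviour documented for add and new can be sketched with plain dataclasses. SketchItem and SketchGroup are hypothetical stand-ins written for this example, not the hfutils classes; the sketch assumes that adding an item or a group appends its paths and sums its size and count.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class SketchItem:
    """Hypothetical stand-in for FileItem."""
    file: str
    size: int
    count: int = 1


@dataclass
class SketchGroup:
    """Hypothetical stand-in for FilesGroup."""
    files: List[str] = field(default_factory=list)
    size: int = 0
    count: int = 0

    @classmethod
    def new(cls) -> "SketchGroup":
        # Start from an empty group with zeroed totals.
        return cls(files=[], size=0, count=0)

    def add(self, file: Union[SketchItem, "SketchGroup"]) -> "SketchGroup":
        if isinstance(file, SketchItem):
            self.files.append(file.file)
        elif isinstance(file, SketchGroup):
            self.files.extend(file.files)
        else:
            # Mirrors the documented TypeError for unsupported inputs.
            raise TypeError(f"Unsupported type: {type(file)!r}")
        self.size += file.size
        self.count += file.count
        return self  # returning self enables chaining


group = SketchGroup.new().add(SketchItem("a.txt", 10)).add(SketchItem("b.txt", 5))
merged = SketchGroup.new().add(group).add(SketchItem("c.txt", 1))
```

Because add returns the group, several items or whole groups can be merged in one expression, as the last two lines show.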

walk_files_with_groups

hfutils.utils.arrange.walk_files_with_groups(directory: str, pattern: str | None = None, group_method: str | int | None = None, max_total_size: str | float | None = None, silent: bool = False) -> List[FilesGroup]

Walk through a directory and group files based on specified criteria.

This function walks through a directory, collecting files that match the given pattern, and groups them according to the specified method while respecting size constraints.

Parameters:
  • directory (str) – Root directory to start walking from

  • pattern (Optional[str]) – Optional glob pattern to filter files

  • group_method (Optional[Union[str, int]]) – Method for grouping files (None for default, int for segment count)

  • max_total_size (Optional[Union[str, float]]) – Maximum total size for each group (can be string like “1GB”)

  • silent (bool) – If True, suppress the progress bar output

Returns:

List of file groups

Return type:

List[FilesGroup]

Raises:
  • ValueError – If the grouping parameters are invalid

  • OSError – If there are filesystem-related errors

Example:
>>> from hfutils.utils.arrange import walk_files_with_groups
>>> groups = walk_files_with_groups("./data", "*.txt", group_method=2, max_total_size="1GB")
>>> for group in groups:
...     print(f"Group contains {group.count} files, total size: {group.size} bytes")
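
To make the idea of a per-group size limit concrete, here is a minimal greedy packing sketch. It is not the algorithm hfutils uses, which this page does not describe; parse_size and pack_by_size are hypothetical helpers, and treating "KB", "MB" and "GB" as decimal units is an assumption made only for this example.

```python
from typing import Iterable, List, Tuple, Union

# Assumption: decimal units. The real library may interpret size strings differently.
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(value: Union[str, int, float]) -> float:
    """Convert a number or a string such as '1GB' to bytes (hypothetical helper)."""
    if isinstance(value, (int, float)):
        return float(value)
    text = value.strip().upper()
    # Check longer suffixes first so 'KB' is not mistaken for 'B'.
    for unit in sorted(_UNITS, key=len, reverse=True):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * _UNITS[unit]
    return float(text)


def pack_by_size(
    files: Iterable[Tuple[str, int]],
    max_total_size: Union[str, int, float],
) -> List[List[str]]:
    """Greedily pack (path, size) pairs into groups that stay within the limit."""
    limit = parse_size(max_total_size)
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        # Close the current group when the next file would exceed the limit.
        if current and current_size + size > limit:
            groups.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        groups.append(current)
    return groups


packed = pack_by_size([("a", 600), ("b", 500), ("c", 300), ("d", 2000)], "1KB")
```

In this sketch a single file larger than the limit ends up alone in its own group rather than being rejected; whether walk_files_with_groups behaves the same way is not stated here.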