hfutils.utils.arrange
A module for managing and grouping files based on size and structure.
This module provides functionality for walking through directories, grouping files based on various criteria, and managing file collections with size constraints. It’s particularly useful for tasks involving file organization, batch processing, and storage management.
- Example usage:
>>> groups = walk_files_with_groups("./data", pattern="*.txt", max_total_size="1GB")
>>> for group in groups:
...     print(f"Group size: {group.size}, File count: {group.count}")
FileItem
- class hfutils.utils.arrange.FileItem(file: str, size: int, count: int)[source]
A data class representing a single file with its properties.
- Parameters:
file (str) – Path to the file
size (int) – Size of the file in bytes
count (int) – Number of files this item represents (typically 1)
- classmethod from_file(file: str, rel_to: str | None = None) FileItem [source]
Create a FileItem instance from a file path.
- Parameters:
file (str) – Path to the file
rel_to (Optional[str]) – Optional path to make the file path relative to
- Returns:
A new FileItem instance
- Return type:
FileItem
- Raises:
FileNotFoundError – If the file does not exist
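As an illustration of the behaviour described for from_file, here is a minimal sketch built only on the standard library. FileItemSketch is a hypothetical stand-in, not the hfutils class, and the way rel_to is applied (via os.path.relpath) is an assumption based on the parameter description above.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileItemSketch:
    """Hypothetical stand-in for FileItem; not the hfutils implementation."""
    file: str
    size: int
    count: int

    @classmethod
    def from_file(cls, file: str, rel_to: Optional[str] = None) -> "FileItemSketch":
        # os.path.getsize raises FileNotFoundError when the path does not exist.
        size = os.path.getsize(file)
        # Assumption: rel_to only changes the stored path, not which file is read.
        path = os.path.relpath(file, rel_to) if rel_to is not None else file
        return cls(file=path, size=size, count=1)


# Build a small temporary tree and describe one file relative to its root.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "sub", "a.txt")
    os.makedirs(os.path.dirname(target))
    with open(target, "wb") as f:
        f.write(b"hello")
    item = FileItemSketch.from_file(target, rel_to=root)
    print(item)
```

The stored path is relative to root, while the size is still read from the real location on disk.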
FilesGroup
- class hfutils.utils.arrange.FilesGroup(files: List[str], size: int, count: int)[source]
A data class representing a group of files with collective properties.
- Parameters:
files (List[str]) – List of file paths in the group
size (int) – Total size of all files in the group
count (int) – Total number of files in the group
- add(file: FileItem | FilesGroup) FilesGroup [source]
Add a FileItem or another FilesGroup to this group.
- Parameters:
file (Union[FileItem, FilesGroup]) – The item to add to the group
- Returns:
Self reference for method chaining
- Return type:
FilesGroup
- Raises:
TypeError – If the input type is not FileItem or FilesGroup
- classmethod new() FilesGroup [source]
Create a new empty FilesGroup instance.
- Returns:
A new empty FilesGroup
- Return type:
FilesGroup
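The accumulation and chaining semantics documented for add and new can be sketched as follows. Both classes here are hypothetical stand-ins written for this example, and the merge behaviour (extending the file list and summing size and count) is an assumption inferred from the field descriptions, not taken from the hfutils source.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class FileItemSketch:
    """Hypothetical stand-in for FileItem."""
    file: str
    size: int
    count: int = 1


@dataclass
class FilesGroupSketch:
    """Hypothetical stand-in for FilesGroup."""
    files: List[str] = field(default_factory=list)
    size: int = 0
    count: int = 0

    @classmethod
    def new(cls) -> "FilesGroupSketch":
        # Start from an empty group with zeroed totals.
        return cls(files=[], size=0, count=0)

    def add(self, file: Union[FileItemSketch, "FilesGroupSketch"]) -> "FilesGroupSketch":
        if isinstance(file, FileItemSketch):
            self.files.append(file.file)
        elif isinstance(file, FilesGroupSketch):
            self.files.extend(file.files)
        else:
            raise TypeError(f"Unsupported type: {type(file)!r}")
        # Both accepted types expose size and count, so the totals update the same way.
        self.size += file.size
        self.count += file.count
        return self  # returning self enables method chaining


group = FilesGroupSketch.new().add(FileItemSketch("a.txt", 100)).add(FileItemSketch("b.txt", 250))
merged = FilesGroupSketch.new().add(group).add(FileItemSketch("c.txt", 50))
print(merged.files, merged.size, merged.count)
```

Returning self from add is what allows the chained calls shown in the last lines.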
walk_files_with_groups
- hfutils.utils.arrange.walk_files_with_groups(directory: str, pattern: str | None = None, group_method: str | int | None = None, max_total_size: str | float | None = None, silent: bool = False) List[FilesGroup] [source]
Walk through a directory and group files based on specified criteria.
This function walks through a directory, collecting files that match the given pattern, and groups them according to the specified method while respecting size constraints.
- Parameters:
directory (str) – Root directory to start walking from
pattern (Optional[str]) – Optional glob pattern to filter files
group_method (Optional[Union[str, int]]) – Method for grouping files (None for default, int for segment count)
max_total_size (Optional[Union[str, float]]) – Maximum total size for each group (can be string like “1GB”)
silent (bool) – If True, suppress the progress bar output.
- Returns:
List of file groups
- Return type:
List[FilesGroup]
- Raises:
ValueError – If the grouping parameters are invalid
OSError – If there are filesystem-related errors
- Example:
>>> groups = walk_files_with_groups("./data", "*.txt", group_method=2, max_total_size="1GB")
>>> for group in groups:
...     print(f"Group contains {group.count} files, total size: {group.size} bytes")
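To make the idea of a per-group size limit concrete, the sketch below packs (name, size) pairs into groups with a simple sequential greedy pass. It is not the algorithm used by walk_files_with_groups, which is not described here. The helpers parse_size and pack_by_size are hypothetical names, and treating "KB", "MB", and "GB" as decimal units is an assumption that may differ from how the library interprets size strings.

```python
import re
from typing import List, Tuple, Union

# Assumption: decimal units; the real library may interpret these differently.
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(value: Union[str, int, float]) -> int:
    """Convert a byte count or a string such as '1GB' into an integer number of bytes."""
    if isinstance(value, (int, float)):
        return int(value)
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", value, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])


def pack_by_size(items: List[Tuple[str, int]], max_total_size: Union[str, int, float]) -> List[List[str]]:
    """Greedily fill groups in input order, starting a new group when the limit would be exceeded."""
    limit = parse_size(max_total_size)
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in items:
        # Close the current group if adding this item would exceed the limit.
        if current and current_size + size > limit:
            groups.append(current)
            current, current_size = [], 0
        # A single item larger than the limit still gets a group of its own.
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups


files = [("a.txt", 400), ("b.txt", 500), ("c.txt", 300), ("d.txt", 1200), ("e.txt", 100)]
packed = pack_by_size(files, "1KB")
print(packed)
```

In this sketch an oversized file such as d.txt is placed alone rather than rejected; whether the library raises ValueError in that situation is not stated above.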