hfutils.index.make
tar_create_index
- hfutils.index.make.tar_create_index(src_tar_file, dst_index_file: str | None = None, chunk_for_hash: int = 1048576, with_hash: bool = True, silent: bool = False)
Create an index file for a tar archive file.
- Parameters:
src_tar_file (str) – The path to the source tar archive file.
dst_index_file (str, optional) – The path to save the index file, defaults to None.
chunk_for_hash (int, optional) – The chunk size in bytes used when hashing, defaults to 1 << 20 (1 MiB).
with_hash (bool, optional) – Whether to include file hashes in the index, defaults to True.
silent (bool, optional) – Whether to suppress progress bars and logging messages, defaults to False.
- Returns:
The path to the created index file.
- Return type:
str
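As a rough usage sketch, the snippet below builds a small tar archive with the standard library and then indexes it with tar_create_index. The helper names make_sample_tar and index_sample_tar, the file names, and the choice of writing the index next to the archive are illustrative assumptions, not part of hfutils. The hfutils import is done lazily inside the function, so only the second helper needs hfutils to be installed.

```python
import os
import tarfile


def make_sample_tar(workdir: str) -> str:
    """Create a tiny tar archive with two text files and return its path."""
    for name, text in [("a.txt", "hello"), ("b.txt", "world")]:
        with open(os.path.join(workdir, name), "w") as f:
            f.write(text)
    tar_path = os.path.join(workdir, "sample.tar")
    with tarfile.open(tar_path, "w") as tar:
        for name in ("a.txt", "b.txt"):
            # arcname keeps the member names relative inside the archive
            tar.add(os.path.join(workdir, name), arcname=name)
    return tar_path


def index_sample_tar(tar_path: str) -> str:
    """Index the archive with hfutils (requires hfutils to be installed)."""
    from hfutils.index.make import tar_create_index

    # Explicit destination chosen for this example (an assumption, not a
    # documented default): a .json file next to the archive.
    index_path = os.path.splitext(tar_path)[0] + ".json"
    return tar_create_index(tar_path, dst_index_file=index_path, silent=True)
```

Passing dst_index_file explicitly avoids relying on whatever location is chosen when it is left as None, which this page does not specify.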
hf_tar_create_index
- hfutils.index.make.hf_tar_create_index(repo_id: str, archive_in_repo: str, repo_type: Literal['dataset', 'model', 'space'] = 'dataset', revision: str = 'main', idx_repo_id: str | None = None, idx_file_in_repo: str | None = None, idx_repo_type: Literal['dataset', 'model', 'space'] | None = None, idx_revision: str | None = None, chunk_for_hash: int = 1048576, with_hash: bool = True, skip_when_synced: bool = True, hf_token: str | None = None)
Create an index file for a tar archive file in a Hugging Face repository.
- Parameters:
repo_id (str) – The identifier of the repository.
archive_in_repo (str) – The path to the tar archive file in the repository.
repo_type (RepoTypeTyping, optional) – The type of the Hugging Face repository, defaults to ‘dataset’.
revision (str, optional) – The revision of the repository, defaults to ‘main’.
idx_repo_id (str, optional) – The identifier of the index repository, defaults to None.
idx_file_in_repo (str, optional) – The path to save the index file in the index repository, defaults to None.
idx_repo_type (RepoTypeTyping, optional) – The type of the index repository, defaults to None.
idx_revision (str, optional) – The revision of the index repository, defaults to None.
chunk_for_hash (int, optional) – The chunk size in bytes used when hashing, defaults to 1 << 20 (1 MiB).
with_hash (bool, optional) – Whether to include file hashes in the index, defaults to True.
skip_when_synced (bool, optional) – Whether to skip index creation when the existing index is already in sync with the archive, defaults to True.
hf_token (str, optional) – The Hugging Face access token, defaults to None.
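A possible way to call hf_tar_create_index is sketched below. The helper names build_index_kwargs and create_remote_index are hypothetical, and reading the token from the HF_TOKEN environment variable is an assumption made for this example. The keyword names themselves are taken from the signature above. The hfutils import is lazy, so the keyword-building helper can be used and tested without hfutils or network access.

```python
import os
from typing import Optional


def build_index_kwargs(repo_id: str, archive_in_repo: str,
                       idx_repo_id: Optional[str] = None) -> dict:
    """Collect keyword arguments for hf_tar_create_index (hypothetical helper)."""
    kwargs = {
        "repo_id": repo_id,
        "archive_in_repo": archive_in_repo,
        "repo_type": "dataset",
        "revision": "main",
        "with_hash": True,
        "skip_when_synced": True,
        # Assumption: the access token is supplied through an environment variable.
        "hf_token": os.environ.get("HF_TOKEN"),
    }
    if idx_repo_id is not None:
        # Only set when the index should be stored in a separate repository.
        kwargs["idx_repo_id"] = idx_repo_id
    return kwargs


def create_remote_index(**kwargs):
    """Call hfutils with the prepared arguments (needs hfutils and network access)."""
    from hfutils.index.make import hf_tar_create_index

    return hf_tar_create_index(**kwargs)
```

For instance, create_remote_index(**build_index_kwargs("user/repo", "pack.tar")) would index pack.tar in a dataset repository; both names here are placeholders.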
tar_get_index_info
- hfutils.index.make.tar_get_index_info(src_tar_file, chunk_for_hash: int = 1048576, with_hash: bool = True, silent: bool = False)
Get the index information of a tar archive file.
Note
The return value of this function is used directly as the content of the index JSON file.
- Parameters:
src_tar_file (str) – The path to the source tar archive file.
chunk_for_hash (int, optional) – The chunk size in bytes used when hashing, defaults to 1 << 20 (1 MiB).
with_hash (bool, optional) – Whether to include file hashes in the index, defaults to True.
silent (bool, optional) – Whether to suppress progress bars and logging messages, defaults to False.
- Returns:
The index information of the tar archive file.
- Return type:
dict
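To illustrate the general idea of a tar index and the role of chunk_for_hash and with_hash, here is a minimal standard-library sketch. It is not the hfutils implementation, and the dictionary layout (the "files", "offset", "size" and "sha256" keys) is an assumption made for this example rather than the actual schema returned by tar_get_index_info. It records where each member's data starts inside an uncompressed tar and hashes the data in fixed-size chunks so that large members are never loaded into memory at once.

```python
import hashlib
import tarfile


def sketch_tar_index(tar_path: str, chunk_for_hash: int = 1 << 20,
                     with_hash: bool = True) -> dict:
    """Build an illustrative index of the regular files in an uncompressed tar."""
    files = {}
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links and other special members
            # offset_data is the byte position of the member's content in the tar
            entry = {"offset": member.offset_data, "size": member.size}
            if with_hash:
                sha = hashlib.sha256()
                src = tar.extractfile(member)
                while True:
                    chunk = src.read(chunk_for_hash)
                    if not chunk:
                        break
                    sha.update(chunk)  # feed the hash one chunk at a time
                entry["sha256"] = sha.hexdigest()
            files[member.name] = entry
    return {"files": files}
```

With offsets and sizes recorded like this, a single member can be read by seeking into the archive instead of scanning it from the start, which is the usual motivation for keeping such an index alongside the archive.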
hf_tar_create_from_directory
- hfutils.index.make.hf_tar_create_from_directory(repo_id: str, archive_in_repo: str, local_directory: str, repo_type: Literal['dataset', 'model', 'space'] = 'dataset', revision: str = 'main', chunk_for_hash: int = 1048576, with_hash: bool = True, silent: bool = False, hf_token: str | None = None)
Create a tar archive file from a local directory and upload it to a Hugging Face repository.
- Parameters:
repo_id (str) – The identifier of the repository.
archive_in_repo (str) – The path to save the tar archive file in the repository.
local_directory (str) – The path to the local directory to be archived.
repo_type (RepoTypeTyping, optional) – The type of the Hugging Face repository, defaults to ‘dataset’.
revision (str, optional) – The revision of the repository, defaults to ‘main’.
chunk_for_hash (int, optional) – The chunk size in bytes used when hashing, defaults to 1 << 20 (1 MiB).
with_hash (bool, optional) – Whether to include file hashes in the index, defaults to True.
silent (bool, optional) – Whether to suppress progress bars and logging messages, defaults to False.
hf_token (str, optional) – The Hugging Face access token, defaults to None.
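The following sketch shows one way hf_tar_create_from_directory might be used. The helper list_files_to_archive is a hypothetical preview step added for this example (it only lists what is in the local directory and does not reflect how hfutils selects files), and upload_directory_as_tar is a thin hypothetical wrapper. Reading the token from the HF_TOKEN environment variable is likewise an assumption. The hfutils import is lazy, so the preview helper runs without hfutils installed.

```python
import os


def list_files_to_archive(local_directory: str) -> list:
    """Return sorted, '/'-separated relative paths of all files under a directory."""
    paths = []
    for root, _dirs, files in os.walk(local_directory):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, local_directory)
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


def upload_directory_as_tar(repo_id: str, archive_in_repo: str, local_directory: str):
    """Pack and upload a directory via hfutils (needs hfutils, network and a token)."""
    from hfutils.index.make import hf_tar_create_from_directory

    return hf_tar_create_from_directory(
        repo_id=repo_id,
        archive_in_repo=archive_in_repo,
        local_directory=local_directory,
        repo_type="dataset",
        revision="main",
        silent=True,
        # Assumption: the access token is supplied through an environment variable.
        hf_token=os.environ.get("HF_TOKEN"),
    )
```

Checking the preview list before calling the upload wrapper is a cheap way to confirm that the directory contains what is expected.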