hfutils.operate.upload

upload_file_to_file

hfutils.operate.upload.upload_file_to_file(local_file, repo_id: str, file_in_repo: str, repo_type: Literal['dataset', 'model', 'space'] = 'dataset', revision: str = 'main', message: str | None = None, hf_token: str | None = None)

Upload a local file to a specified path in a Hugging Face repository.

Parameters:
  • local_file (str) – The local file path to be uploaded.

  • repo_id (str) – The identifier of the repository.

  • file_in_repo (str) – The file path within the repository.

  • repo_type (RepoTypeTyping) – The type of the repository (‘dataset’, ‘model’, ‘space’).

  • revision (str) – The revision of the repository (e.g., branch, tag, commit hash).

  • message (Optional[str]) – The commit message for the upload.

  • hf_token (str, optional) – Hugging Face token for the API client. If not assigned, the HF_TOKEN variable is used.
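
A minimal usage sketch, based only on the signature above. The helper names build_file_upload_kwargs and upload_report, the repository ID passed by the caller, and the path reports/latest.csv are hypothetical. The import is deferred into the function so that the module loads without hfutils installed; the actual upload needs hfutils, network access, and a valid token.

```python
from typing import Any, Dict, Optional


def build_file_upload_kwargs(
    local_file: str,
    repo_id: str,
    file_in_repo: str,
    message: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments that mirror the documented signature."""
    return {
        "local_file": local_file,
        "repo_id": repo_id,
        "file_in_repo": file_in_repo,
        "repo_type": "dataset",  # documented default
        "revision": "main",      # documented default
        "message": message,      # None is the documented default
    }


def upload_report(local_file: str, repo_id: str) -> None:
    """Upload one local file (hypothetical helper, not part of hfutils)."""
    # Imported lazily so this module can be loaded without hfutils.
    from hfutils.operate.upload import upload_file_to_file

    kwargs = build_file_upload_kwargs(
        local_file,
        repo_id,
        file_in_repo="reports/latest.csv",  # illustrative path in the repo
        message="Update latest report",
    )
    # hf_token is omitted, so the HF_TOKEN variable is used as documented.
    upload_file_to_file(**kwargs)
```

Separating the argument building from the call keeps the network-dependent part small and lets the arguments be checked offline.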

upload_directory_as_archive

hfutils.operate.upload.upload_directory_as_archive(local_directory, repo_id: str, archive_in_repo: str, repo_type: Literal['dataset', 'model', 'space'] = 'dataset', revision: str = 'main', message: str | None = None, silent: bool = False, hf_token: str | None = None)

Upload a local directory as an archive file to a specified path in a Hugging Face repository.

Parameters:
  • local_directory (str) – The local directory path to be uploaded.

  • repo_id (str) – The identifier of the repository.

  • archive_in_repo (str) – The archive file path within the repository.

  • repo_type (RepoTypeTyping) – The type of the repository (‘dataset’, ‘model’, ‘space’).

  • revision (str) – The revision of the repository (e.g., branch, tag, commit hash).

  • message (Optional[str]) – The commit message for the upload.

  • silent (bool) – If True, suppress progress bar output.

  • hf_token (str, optional) – Hugging Face token for the API client. If not assigned, the HF_TOKEN variable is used.
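
A hedged sketch of how this function might be called. The helpers archive_name_in_repo and upload_snapshot are hypothetical, and the choice of a .zip name under an archives/ prefix is an assumption: this section does not state which archive formats are supported or how the format is selected.

```python
import os


def archive_name_in_repo(local_directory: str, prefix: str = "archives") -> str:
    """Derive an archive path in the repo from the directory name.

    The ".zip" suffix is an assumption, not something this page confirms.
    """
    name = os.path.basename(os.path.normpath(local_directory))
    return f"{prefix}/{name}.zip"


def upload_snapshot(local_directory: str, repo_id: str) -> None:
    """Pack and upload a directory (hypothetical helper, not part of hfutils)."""
    # Imported lazily so this module can be loaded without hfutils.
    from hfutils.operate.upload import upload_directory_as_archive

    upload_directory_as_archive(
        local_directory=local_directory,
        repo_id=repo_id,
        archive_in_repo=archive_name_in_repo(local_directory),
        repo_type="dataset",
        silent=True,  # suppress progress bar output, as documented
    )
```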

upload_directory_as_directory

hfutils.operate.upload.upload_directory_as_directory(local_directory, repo_id: str, path_in_repo: str, repo_type: Literal['dataset', 'model', 'space'] = 'dataset', revision: str = 'main', message: str | None = None, time_suffix: bool = True, clear: bool = False, ignore_patterns: List[str] = <object object>, hf_token: str | None = None, operation_chunk_size: int | None = None, upload_timespan: float = 5.0)

Upload a local directory and its files to a specified path in a Hugging Face repository.

Parameters:
  • local_directory (str) – The local directory path to be uploaded.

  • repo_id (str) – The identifier of the repository.

  • path_in_repo (str) – The directory path within the repository.

  • repo_type (RepoTypeTyping) – The type of the repository (‘dataset’, ‘model’, ‘space’).

  • revision (str) – The revision of the repository (e.g., branch, tag, commit hash).

  • message (Optional[str]) – The commit message for the upload.

  • time_suffix (bool) – If True, append a timestamp to the commit message.

  • clear (bool) – If True, remove files in the repository not present in the local directory.

  • ignore_patterns (List[str]) – List of file patterns to ignore.

  • hf_token (str, optional) – Hugging Face token for the API client. If not assigned, the HF_TOKEN variable is used.

  • operation_chunk_size (Optional[int]) – Chunk size of the operations. When this is set, the operations are separated into multiple commits.

  • upload_timespan (float) – Minimum time interval between commits, in seconds, when chunked uploading is enabled.
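
The parameters above can be combined as in the following sketch. The helpers build_directory_upload_kwargs and sync_directory are hypothetical, as are the path images and the chunk size of 500. The pattern string "*.tmp" is only illustrative, since the matching syntax for ignore_patterns is not specified in this section.

```python
from typing import Any, Dict, List, Optional


def build_directory_upload_kwargs(
    local_directory: str,
    repo_id: str,
    path_in_repo: str,
    ignore_patterns: List[str],
    operation_chunk_size: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments that mirror the documented signature."""
    kwargs: Dict[str, Any] = {
        "local_directory": local_directory,
        "repo_id": repo_id,
        "path_in_repo": path_in_repo,
        "clear": True,  # remove repo files that are absent locally
        "ignore_patterns": list(ignore_patterns),
    }
    if operation_chunk_size is not None:
        # Chunked mode: several commits, spaced by upload_timespan.
        kwargs["operation_chunk_size"] = operation_chunk_size
        kwargs["upload_timespan"] = 30.0  # value recommended in the notes
    return kwargs


def sync_directory(local_directory: str, repo_id: str) -> None:
    """Mirror a directory into a repo (hypothetical helper, not part of hfutils)."""
    # Imported lazily so this module can be loaded without hfutils.
    from hfutils.operate.upload import upload_directory_as_directory

    kwargs = build_directory_upload_kwargs(
        local_directory,
        repo_id,
        path_in_repo="images",
        ignore_patterns=["*.tmp"],
        operation_chunk_size=500,
    )
    upload_directory_as_directory(**kwargs)
```

Leaving operation_chunk_size as None keeps the upload_timespan default untouched, since that parameter is only documented as relevant to chunked uploads.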

Note

When operation_chunk_size is set, multiple commits will be created. If any commit fails, the repository is rolled back to the commit it started from, using the hfutils.repository.hf_hub_rollback() function.

Warning

When operation_chunk_size is set, multiple commits will be created, but HuggingFace’s repository API cannot guarantee that they are applied to your data atomically. This function is therefore not thread-safe.

Note

The rate limit for commit creation on HuggingFace repositories is approximately 120 commits per hour. If you have a large number of chunks to create, set upload_timespan to a value no less than 30.0 so that your upload is not rate-limited.
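
The 30.0 figure follows from the rate limit itself, assuming upload_timespan is measured in seconds: one hour is 3600 seconds, and 3600 / 120 = 30. The sketch below reproduces that arithmetic and estimates how many commits a chunked upload would create. Both function names are hypothetical and are not part of hfutils.

```python
def min_upload_timespan(commits_per_hour: float = 120.0) -> float:
    """Smallest spacing in seconds that stays within the given commit rate."""
    return 3600.0 / commits_per_hour


def chunk_count(n_operations: int, operation_chunk_size: int) -> int:
    """Number of commits needed when each holds at most one chunk of operations."""
    if operation_chunk_size <= 0:
        raise ValueError("operation_chunk_size must be positive")
    # Integer ceiling division.
    return (n_operations + operation_chunk_size - 1) // operation_chunk_size
```

For example, 10,000 operations with a chunk size of 500 give 20 commits, which fits within an hour's limit at a 30-second spacing.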