Class HuggingFaceHub
java.lang.Object
smile.util.HuggingFaceHub
Utility for downloading files from the
Hugging Face Hub with local disk caching.
This class reproduces the on-disk cache layout of the official Python
function hf_hub_download, so files downloaded by this class can be
reused by the Python library and vice versa.
Cache layout
$HF_HOME/hub/
  models--{owner}--{repo}/
    blobs/
      {sha256}        ← actual file content
    snapshots/
      {commit_hash}/
        {filename}    ← relative symlink → ../../blobs/{sha256}
    refs/
      {revision}      ← text file containing the resolved commit hash
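To make the layout concrete, here is a minimal sketch that maps a repository id, commit hash, and file name onto the paths shown above. The class and method names (CacheLayoutSketch, repoFolderName, snapshotFile, blobFile, relativeLinkTarget) are hypothetical and not part of HuggingFaceHub. Only the models-- prefix documented above is handled, and the relative link target is shown for a top-level file.

```java
import java.nio.file.Path;

/** Illustrative only: builds paths that follow the documented cache layout. */
final class CacheLayoutSketch {
    /** "owner/repo" becomes "models--owner--repo". */
    static String repoFolderName(String repoId) {
        return "models--" + repoId.replace("/", "--");
    }

    /** Path of the snapshot entry (the symlink) for one file at one commit. */
    static Path snapshotFile(Path cacheRoot, String repoId, String commitHash, String filename) {
        return cacheRoot.resolve(repoFolderName(repoId))
                .resolve("snapshots")
                .resolve(commitHash)
                .resolve(filename);
    }

    /** Path of the content-addressed blob that holds the actual bytes. */
    static Path blobFile(Path cacheRoot, String repoId, String sha256) {
        return cacheRoot.resolve(repoFolderName(repoId)).resolve("blobs").resolve(sha256);
    }

    /** Link target relative to the directory that contains the snapshot entry. */
    static Path relativeLinkTarget(Path snapshotFile, Path blobFile) {
        return snapshotFile.getParent().relativize(blobFile);
    }

    public static void main(String[] args) {
        Path root = Path.of("cache", "hub");
        Path snapshot = snapshotFile(root, "owner/repo", "abc123", "config.json");
        Path blob = blobFile(root, "owner/repo", "deadbeef");
        System.out.println(snapshot);
        System.out.println(blob);
        System.out.println(relativeLinkTarget(snapshot, blob));
    }
}
```

Because blobs are keyed by content hash, several snapshots can link to the same blob without storing the file twice.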
Environment variables
HF_HOME – base directory for all Hugging Face data (default: ~/.cache/huggingface).
HUGGINGFACE_HUB_CACHE – override for the cache root (default: $HF_HOME/hub).
HF_ENDPOINT – Hugging Face endpoint (default: https://huggingface.co).
HF_TOKEN – API token for private repositories (also read from ~/.cache/huggingface/token).
Supported repo types
model (default)
dataset
space
-
Nested Class Summary
Nested Classes
static enum HuggingFaceHub.RepoType - Supported Hugging Face repository types.
Field Summary
Fields
DEFAULT_ENDPOINT
Method Summary
static void deleteRepoCache(String repoId, HuggingFaceHub.RepoType repoType, Path cacheDir)
    Deletes all cached data for a given repository.
static Path download(repoId, filename)
    Downloads a single file from a Hugging Face Hub repository and caches it locally.
static Path download(repoId, filename, token)
    Convenience overload that uses HuggingFaceHub.RepoType.MODEL and the "main" revision, and accepts an explicit token argument instead of reading environment variables.
static Path download(String repoId, String filename, HuggingFaceHub.RepoType repoType, String revision, String subfolder, Path cacheDir, boolean forceDownload, boolean localFilesOnly)
    Downloads a single file from a Hugging Face Hub repository and caches it locally.
static Path resolveCacheDir(Path cacheDir)
    Resolves the local cache root directory from, in order: the cacheDir argument (if non-null), the HUGGINGFACE_HUB_CACHE environment variable, the HF_HOME environment variable + "/hub", and ~/.cache/huggingface/hub.
static String resolveEndpoint()
    Returns the Hugging Face Hub API endpoint from the HF_ENDPOINT environment variable, falling back to DEFAULT_ENDPOINT.
static String resolveToken()
    Returns the API token to use for authenticated requests, checking in priority order: the HF_TOKEN environment variable, the HUGGING_FACE_HUB_TOKEN environment variable (legacy), and the ~/.cache/huggingface/token file (written by huggingface-cli login).
static Optional<Path> tryLoadFromCache(String repoId, String filename, HuggingFaceHub.RepoType repoType, String revision, Path cacheDir)
    Returns the expected local path for a cached file without making any network request.
-
Field Details
-
DEFAULT_ENDPOINT
-
-
Method Details
-
download
Downloads a single file from a Hugging Face Hub repository and caches it locally.
Parameters:
repoId - the repository identifier in owner/name format (e.g. "google/bert-base-uncased").
filename - the path of the file inside the repository (e.g. "config.json" or "data/train.csv").
Returns:
the local Path to the cached file.
Throws:
IOException - if a network or filesystem error occurs.
-
download
public static Path download(String repoId, String filename, HuggingFaceHub.RepoType repoType, String revision, String subfolder, Path cacheDir, boolean forceDownload, boolean localFilesOnly) throws IOException
Downloads a single file from a Hugging Face Hub repository and caches it locally. The API token is resolved as described for resolveToken: the HF_TOKEN environment variable first, then the legacy HUGGING_FACE_HUB_TOKEN environment variable, and finally the token file written by huggingface-cli login.
Parameters:
repoId - the repository identifier ("owner/name").
filename - the file path inside the repository.
repoType - the repository type (HuggingFaceHub.RepoType.MODEL, HuggingFaceHub.RepoType.DATASET, or HuggingFaceHub.RepoType.SPACE).
revision - the git revision to download from (branch, tag, or full commit SHA). Defaults to "main".
subfolder - an optional subdirectory prefix prepended to filename. May be null.
cacheDir - override for the local cache root directory. When null, the cache root is taken from the HUGGINGFACE_HUB_CACHE or HF_HOME environment variable.
forceDownload - when true, bypass the local cache and always re-download the file.
localFilesOnly - when true, throw an IOException instead of making any network request if the file is not already cached.
Returns:
the local Path to the cached file.
Throws:
IOException - if a network or filesystem error occurs, or if localFilesOnly is true and the file is not cached.
-
resolveCacheDir
Resolves the local cache root directory, respecting environment variables:
1. cacheDir argument (if non-null)
2. HUGGINGFACE_HUB_CACHE environment variable
3. HF_HOME environment variable + "/hub"
4. ~/.cache/huggingface/hub
Parameters:
cacheDir - explicit override; may be null.
Returns:
the resolved cache root path.
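The lookup order can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: CacheDirSketch and its resolve method are hypothetical, the environment is passed in as a map so the logic can be exercised without touching the real process environment, and treating blank values as unset is an assumption.

```java
import java.nio.file.Path;
import java.util.Map;

/** Illustrative only: mirrors the documented cache-root lookup order. */
final class CacheDirSketch {
    static Path resolve(Path cacheDir, Map<String, String> env, String userHome) {
        // 1. An explicit argument wins over everything else.
        if (cacheDir != null) {
            return cacheDir;
        }
        // 2. HUGGINGFACE_HUB_CACHE names the cache root directly.
        String hubCache = env.get("HUGGINGFACE_HUB_CACHE");
        if (hubCache != null && !hubCache.isBlank()) {   // blank == unset is an assumption
            return Path.of(hubCache);
        }
        // 3. HF_HOME is the base directory; the cache lives in its "hub" child.
        String hfHome = env.get("HF_HOME");
        if (hfHome != null && !hfHome.isBlank()) {
            return Path.of(hfHome, "hub");
        }
        // 4. Fall back to ~/.cache/huggingface/hub.
        return Path.of(userHome, ".cache", "huggingface", "hub");
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, System.getenv(), System.getProperty("user.home")));
    }
}
```

Passing the environment as a parameter is a design choice made for the sketch; it keeps the precedence rules easy to test in isolation.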
-
resolveEndpoint
Returns the Hugging Face Hub API endpoint from the HF_ENDPOINT environment variable, falling back to DEFAULT_ENDPOINT.
Returns:
the endpoint URL (no trailing slash).
-
resolveToken
Returns the API token to use for authenticated requests, checking in priority order:
1. HF_TOKEN environment variable
2. HUGGING_FACE_HUB_TOKEN environment variable (legacy)
3. ~/.cache/huggingface/token file (written by huggingface-cli login)
Returns:
the token string, or null if none is configured.
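A minimal sketch of this priority order follows. TokenSketch and its resolve method are hypothetical names, the environment and token file location are passed in as parameters, and both the trimming of surrounding whitespace and the handling of blank values are assumptions rather than documented behavior.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Illustrative only: mirrors the documented token lookup order. */
final class TokenSketch {
    static String resolve(Map<String, String> env, Path tokenFile) {
        // 1 and 2: environment variables, current name first, legacy name second.
        for (String name : new String[] {"HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"}) {
            String value = env.get(name);
            if (value != null && !value.isBlank()) {
                return value.strip();
            }
        }
        // 3: the token file, if one exists.
        if (tokenFile != null && Files.isRegularFile(tokenFile)) {
            try {
                String value = Files.readString(tokenFile).strip();
                if (!value.isEmpty()) {
                    return value;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Nothing configured.
        return null;
    }

    public static void main(String[] args) {
        Path tokenFile = Path.of(System.getProperty("user.home"), ".cache", "huggingface", "token");
        String token = resolve(System.getenv(), tokenFile);
        // Avoid printing the secret itself.
        System.out.println(token == null ? "no token configured" : "token found");
    }
}
```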
-
download
Convenience overload that uses HuggingFaceHub.RepoType.MODEL and the "main" revision, and accepts an explicit token argument instead of reading environment variables.
Parameters:
repoId - the repository identifier ("owner/name").
filename - the file path inside the repository.
token - the Bearer token for private repositories, or null.
Returns:
the local Path to the cached file.
Throws:
IOException - if a network or filesystem error occurs.
-
tryLoadFromCache
public static Optional<Path> tryLoadFromCache(String repoId, String filename, HuggingFaceHub.RepoType repoType, String revision, Path cacheDir)
Returns the expected local path for a cached file without making any network request. Returns an empty Optional if the file is not currently cached.
Parameters:
repoId - the repository identifier ("owner/name").
filename - the file path inside the repository.
repoType - the repository type.
revision - the git revision (branch, tag, or commit SHA).
cacheDir - explicit cache root override; null to use the default.
Returns:
an Optional containing the cached path, or empty.
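An offline lookup of this kind can be sketched against the documented cache layout: read refs/{revision} to get the commit hash, then check snapshots/{commit_hash}/{filename}. The sketch is an assumption about how such a lookup might work, not the library's code. CacheLookupSketch and lookup are hypothetical names, only model repositories are handled, and a revision with no refs entry is assumed to already be a commit hash.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Illustrative only: offline cache lookup following the documented layout. */
final class CacheLookupSketch {
    static Optional<Path> lookup(Path cacheRoot, String repoId, String revision, String filename) {
        Path repoDir = cacheRoot.resolve("models--" + repoId.replace("/", "--"));

        // A branch or tag is stored as a small text file holding the commit hash.
        String commit = revision;  // assumption: no refs entry means revision is a hash
        Path ref = repoDir.resolve("refs").resolve(revision);
        if (Files.isRegularFile(ref)) {
            try {
                commit = Files.readString(ref).strip();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Files.exists follows symlinks, so a dangling snapshot link counts as a miss.
        Path file = repoDir.resolve("snapshots").resolve(commit).resolve(filename);
        return Files.exists(file) ? Optional.of(file) : Optional.empty();
    }

    public static void main(String[] args) {
        Path root = Path.of(System.getProperty("user.home"), ".cache", "huggingface", "hub");
        System.out.println(lookup(root, "owner/name", "main", "config.json"));
    }
}
```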
-
deleteRepoCache
public static void deleteRepoCache(String repoId, HuggingFaceHub.RepoType repoType, Path cacheDir) throws IOException
Deletes all cached data for a given repository. This removes the entire repo cache directory, including all blobs, snapshots, and refs. Use with care.
Parameters:
repoId - the repository identifier ("owner/name").
repoType - the repository type.
cacheDir - explicit cache root override; null to use the default.
Throws:
IOException - if a filesystem error occurs.
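Removing a repo cache directory amounts to a recursive delete. The sketch below shows one common way to do that with java.nio; RepoCacheDeleteSketch and deleteTree are hypothetical names, and nothing here is taken from the library's implementation. Files.walk does not follow symbolic links by default, so snapshot links are removed as links and are not traversed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: recursive delete of one repo cache directory. */
final class RepoCacheDeleteSketch {
    static void deleteTree(Path repoDir) throws IOException {
        if (!Files.exists(repoDir, LinkOption.NOFOLLOW_LINKS)) {
            return;  // nothing cached for this repository
        }
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(repoDir)) {
            // Reverse order puts children before their parent directories.
            entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demonstrates the delete on a throwaway directory, never on a real cache.
        Path repoDir = Files.createTempDirectory("models--owner--repo");
        Files.createDirectories(repoDir.resolve("blobs"));
        Files.writeString(repoDir.resolve("blobs").resolve("deadbeef"), "content");
        deleteTree(repoDir);
        System.out.println("still exists: " + Files.exists(repoDir));
    }
}
```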
-