Running Gated Hugging Face Models with Token Authentication

Some models are gated: you must request access or accept a license agreement before you can download them. Downloading such a model therefore requires token authentication with Hugging Face.

The easiest way to authenticate is probably the huggingface-cli login command.

Alternatively, you can generate a "read-only public repositories" access token for your Hugging Face account and use it to download models directly from bazel:

Log in at https://huggingface.co/settings/tokens.
Click on "Create new token".
Give the token a name, e.g. zml_public_repos.
Under Repositories, grant the following permission: "Read access to contents of all public gated repos you can access".
At the bottom, click on "Create token".
Copy the token by clicking Copy. You won't be able to see it again.
The token looks something like hf_abCdEfGhijKlM.
Store the token on your machine (replace the placeholder with your actual token):

You can either set the HUGGINGFACE_TOKEN environment variable or write the token to its standard location:

mkdir -p "$HOME/.cache/huggingface/" && echo "<hf_my_token>" > "$HOME/.cache/huggingface/token"
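
On a shared machine you may not want other users to be able to read the token file. The following is a minimal sketch of the storage step with owner-only permissions. It is not part of ZML or the Hugging Face tooling: the function name store_hf_token is hypothetical, and restricting the file permissions is an assumption about what you might want, not something the tools require.

```shell
# Hypothetical helper (not provided by ZML or Hugging Face):
# writes a token to the standard location, readable by the owner only.
store_hf_token() {
  hf_token_value="$1"
  hf_token_dir="$HOME/.cache/huggingface"

  if [ -z "$hf_token_value" ]; then
    echo "usage: store_hf_token <token>" >&2
    return 1
  fi

  mkdir -p "$hf_token_dir" || return 1

  # Create the file under a restrictive umask (in a subshell, so the
  # caller's umask is untouched), then tighten permissions in case the
  # file already existed with looser ones.
  ( umask 077 && printf '%s\n' "$hf_token_value" > "$hf_token_dir/token" ) || return 1
  chmod 600 "$hf_token_dir/token"
}
```

Usage: store_hf_token "<hf_my_token>", with the placeholder replaced by your actual token.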

Now you're ready to download a gated model like Llama-3.2-1B-Instruct!

Example:

# requires token in $HOME/.cache/huggingface/token, as created by the
# `huggingface-cli login` command, or the `HUGGINGFACE_TOKEN` environment variable.
cd examples
bazel run @zml//tools:hf -- download meta-llama/Llama-3.2-1B-Instruct --local-dir "$HOME/Llama-3.2-1B-Instruct" --exclude='*.pth'
bazel run --config=release //llama -- --hf-model-path="$HOME/Llama-3.2-1B-Instruct" --prompt="What is the capital of France?"
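
Before starting a long download, it can help to check that a token is actually available in one of the two places mentioned above. The following is a minimal sketch of such a check. The function name resolve_hf_token is hypothetical, and looking at the environment variable before the token file is an assumption made for this sketch; the actual tools may use a different precedence.

```shell
# Hypothetical pre-flight check (not provided by ZML or Hugging Face):
# prints the token from HUGGINGFACE_TOKEN if it is set, otherwise from the
# standard token file. Returns non-zero if neither source yields a token.
resolve_hf_token() {
  hf_token_file="$HOME/.cache/huggingface/token"

  if [ -n "${HUGGINGFACE_TOKEN:-}" ]; then
    # Assumption: the environment variable is checked first.
    printf '%s\n' "$HUGGINGFACE_TOKEN"
  elif [ -s "$hf_token_file" ]; then
    # Strip whitespace, including the trailing newline written by echo.
    tr -d '[:space:]' < "$hf_token_file"
    printf '\n'
  else
    echo "No Hugging Face token found: set HUGGINGFACE_TOKEN or create $hf_token_file" >&2
    return 1
  fi
}
```

Usage: run resolve_hf_token > /dev/null || exit 1 at the top of a script that then invokes the bazel commands above.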