Quick Start#

This guide gets you from zero to running inference with a pretrained language model in a few minutes. No training required.

Prerequisites#

  • Completed Installation

  • At least one CUDA-enabled GPU with ≥ 8 GB VRAM

Step 1 — Activate the environment#

uv (activate venv):

source .venv/bin/activate

Or prefix every command with uv run to skip activation:

uv run python run.py hf-gen ...

conda:

conda activate new-compute

Step 2 — Generate text with a pretrained CLM#

We publish pretrained AICrossSim-CLM checkpoints on HuggingFace. The command below downloads AICrossSim/clm-60m automatically and runs text generation — no local training needed.

cd experiments/llm-digital/pretrain

CUDA_VISIBLE_DEVICES=0 python run.py hf-gen \
    --model_name AICrossSim/clm-60m \
    --prompt "London is" \
    --max_new_tokens 100 \
    --do_sample true \
    --temperature 0.6 \
    --top_k 50 \
    --top_p 0.9

Set CUDA_VISIBLE_DEVICES to the index of the GPU you want to use (e.g., 0 or 1). The first run downloads the checkpoint; after that, you should see generated text printed to stdout within a few seconds.
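To build intuition for what `--temperature`, `--top_k`, and `--top_p` do, here is a minimal, self-contained sketch of how these flags reshape a next-token distribution before sampling. The logit values are made up for illustration; this is the standard temperature/top-k/nucleus filtering recipe, not the project's actual implementation.

```python
import math

def sample_filter(logits, temperature=0.6, top_k=50, top_p=0.9):
    """Return next-token probabilities after temperature scaling,
    top-k truncation, and top-p (nucleus) truncation."""
    # Temperature: divide logits before softmax; values < 1 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    probs = [math.exp(l - m) for l in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    # Top-k: consider only the k most probable tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:top_k])
    # Top-p: further trim to the smallest prefix whose cumulative mass reaches top_p.
    cum, nucleus = 0.0, set()
    for i in order:
        if i not in keep:
            break
        nucleus.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Zero out everything outside the nucleus and renormalize.
    filtered = [p if i in nucleus else 0.0 for i, p in enumerate(probs)]
    z = sum(filtered)
    return [p / z for p in filtered]

# Toy 5-token vocabulary with illustrative logits.
probs = sample_filter([2.0, 1.0, 0.5, -1.0, -3.0], temperature=0.6, top_k=3, top_p=0.9)
```

With `--do_sample true`, the model samples from this truncated distribution instead of always taking the argmax, which trades some determinism for more varied text.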

Tip

Swap AICrossSim/clm-60m for AICrossSim/clm-200m, AICrossSim/clm-400m, or AICrossSim/clm-1.1b to use a larger model. Larger models require more VRAM.
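As a back-of-envelope check before switching checkpoints, model weights alone need roughly parameter count × bytes per parameter of VRAM (2 bytes per parameter in float16). The sketch below computes only the weight footprint; actual usage is higher because of activations and the KV cache, and the parameter counts are taken at face value from the checkpoint names.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights in GiB (2 bytes/param = float16)."""
    return num_params * bytes_per_param / 1024**3

# Parameter counts inferred from the checkpoint names (approximations).
sizes = {"clm-60m": 60e6, "clm-200m": 200e6, "clm-400m": 400e6, "clm-1.1b": 1.1e9}
for name, n in sizes.items():
    print(f"{name}: ~{weight_memory_gb(n):.2f} GiB of weights in fp16")
```

Even the 1.1b checkpoint's weights fit comfortably in 8 GB of VRAM at fp16, but long prompts and large `--max_new_tokens` grow the KV cache beyond this estimate.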

Step 3 — Run evaluation on a downstream task#

Evaluate the same checkpoint on wikitext using lm-eval-harness:

CUDA_VISIBLE_DEVICES=0 python run.py eval hf-lm-eval \
    AICrossSim/clm-60m \
    --tasks "['wikitext']" \
    --dtype float16
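The headline metric this task reports is perplexity, which is just the exponential of the mean per-token negative log-likelihood. The sketch below shows the arithmetic on made-up NLL values; it is illustrative only, not the harness's exact computation (lm-eval-harness also reports word- and byte-level variants for wikitext).

```python
import math

def perplexity(nlls):
    """exp of the mean negative log-likelihood over tokens."""
    return math.exp(sum(nlls) / len(nlls))

# Illustrative per-token NLLs, not real model output.
nlls = [2.1, 3.0, 1.4, 2.6]
print(f"perplexity = {perplexity(nlls):.2f}")
```

Lower is better: a perplexity of 1.0 would mean the model assigns probability 1 to every observed token.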

What’s Next#