# Quick Start
This guide gets you from zero to running inference with a pretrained language model in a few minutes. No training required.
## Prerequisites

- Completed Installation
- At least one CUDA-enabled GPU with ≥ 8 GB of VRAM
## Step 1 — Activate the environment

uv (activate the virtual environment):

```shell
source .venv/bin/activate
```

Or prefix every command with `uv run` to skip activation:

```shell
uv run python run.py hf-gen ...
```

conda:

```shell
conda activate new-compute
```
## Step 2 — Generate text with a pretrained CLM

We publish pretrained AICrossSim-CLM checkpoints on HuggingFace. The command below downloads `AICrossSim/clm-60m` automatically and runs text generation — no local training needed.
```shell
cd experiments/llm-digital/pretrain
CUDA_VISIBLE_DEVICES=0 python run.py hf-gen \
    --model_name AICrossSim/clm-60m \
    --prompt "London is" \
    --max_new_tokens 100 \
    --do_sample true \
    --temperature 0.6 \
    --top_k 50 \
    --top_p 0.9
```

Set `CUDA_VISIBLE_DEVICES` to the index of the GPU you want to use (e.g., `0` or `1`). You should see generated text printed to stdout within a few seconds.
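The sampling flags map onto standard decoding techniques: `--temperature` rescales the logits, `--top_k` keeps only the k most probable tokens, and `--top_p` keeps the smallest set of tokens whose cumulative probability reaches p. A minimal sketch of how the three interact (plain Python for illustration, not the project's actual implementation):

```python
import math

def top_k_top_p_filter(logits, top_k=50, top_p=0.9, temperature=0.6):
    """Return the renormalised sampling distribution after temperature
    scaling, top-k truncation, and nucleus (top-p) truncation."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Sort token indices by probability, most probable first.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    # Top-k: keep at most k candidates.
    candidates = order[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    cum, nucleus = 0.0, []
    for i in candidates:
        nucleus.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalise over the surviving tokens.
    mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / mass for i in nucleus}

# With a peaked 4-token distribution, only the top two tokens survive.
dist = top_k_top_p_filter([2.0, 1.0, 0.5, -1.0], top_k=2, top_p=0.9)
```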
> **Tip:** Swap `AICrossSim/clm-60m` for `AICrossSim/clm-200m`, `AICrossSim/clm-400m`, or `AICrossSim/clm-1.1b` to use a larger model. Larger models require more VRAM.
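As a rough rule of thumb for sizing, half-precision (fp16/bf16) weights take 2 bytes per parameter; activations and the KV cache add overhead on top of this lower bound. Assuming the model names reflect their parameter counts, a quick estimate:

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    # fp16/bf16 weights occupy 2 bytes per parameter; this is a lower
    # bound on VRAM, excluding activations and the KV cache.
    return num_params * bytes_per_param / 1024**3

# Parameter counts inferred from the checkpoint names (assumption).
for name, params in [("clm-60m", 60e6), ("clm-200m", 200e6),
                     ("clm-400m", 400e6), ("clm-1.1b", 1.1e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.2f} GB of weights")
```

Even the largest checkpoint fits comfortably in the 8 GB minimum from the prerequisites, with headroom left for activations.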
## Step 3 — Run evaluation on a downstream task

Evaluate the same checkpoint on WikiText using lm-eval-harness (the task list is quoted so the shell does not expand the brackets):

```shell
CUDA_VISIBLE_DEVICES=0 python run.py eval hf-lm-eval \
    AICrossSim/clm-60m \
    --tasks "['wikitext']" \
    --dtype float16
```
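The WikiText task reports perplexity-style metrics. As a reminder of what the number means, perplexity is the exponential of the mean per-token negative log-likelihood, so lower is better:

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean per-token negative log-likelihood in nats)."""
    return math.exp(sum(nlls) / len(nlls))

# A model with a uniform NLL of 3.0 nats/token has perplexity e^3.
ppl = perplexity([3.0, 3.0, 3.0])
```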
## What’s Next

- **LLM Pretraining & Evaluation** — pretrain your own CLM from scratch
- **Random Bitflip on CLM** — simulate random bitflip noise during pretraining
- **Bitflip-Aware LoRA Fine-Tuning** — LoRA fine-tuning of Llama-3.1-8B with bitflip noise
- **Optical Neural Networks on RoBERTa** — optical neural network experiments on RoBERTa
- **Processing-in-Memory on RoBERTa** — processing-in-memory simulation on RoBERTa