Mase-Triton#
Mase-triton is a PyTorch extension library providing efficient Triton-based implementations of operations used in simulating new compute paradigms. It underpins NewComputeBench’s bitflip, optical transformer, and PIM simulations.
Installation#
pip install mase-triton
Functionality#
Random Bitflip#
Simulate random bit flips in neural network computations.
Functional APIs (source):
- random_bitflip_fn — perform random bit flipping on tensors with configurable exponent and mantissa bit-flip probabilities (supports backward pass).
- calculate_flip_probability — calculate the flip probability from a number of halves.
- find_nearest_prob_n_halves — snap a probability to the nearest valid power-of-0.5 value.
Layers (source):
- RandomBitFlipDropout — random bit-flip layer with dropout functionality.
- RandomBitFlipLinear — linear layer with random bit flipping.
Note
The bitflip probability must be a power of 0.5 (e.g., 0.5^10 ≈ 9.77e-04).
The kernel snaps to the nearest valid value automatically.
The minimum supported probability is 0.5^24 ≈ 5.96e-08 due to the Philox PRNG.
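The snapping rule described in this note can be sketched in plain Python. The helper below mirrors the behavior described for find_nearest_prob_n_halves; it is an illustrative assumption, and the library's exact rounding and clamping rules may differ:

```python
import math

def nearest_prob_n_halves(p):
    """Snap a probability p to the nearest power of 0.5, as the note
    above describes. Illustrative sketch only: mase-triton's
    find_nearest_prob_n_halves may round and clamp differently."""
    assert 0.0 < p <= 0.5
    n = round(math.log(p, 0.5))   # nearest number of halvings
    n = max(1, min(n, 24))        # 0.5**24 is the Philox-imposed minimum
    return 0.5 ** n, n
```

For example, a requested probability of 1e-3 snaps to 0.5^10 ≈ 9.77e-04, and anything below 0.5^24 clamps to n = 24.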
Optical Transformer#
Simulate optical computing for transformer models (paper).
Functional APIs (source):
- ot_quantize_fn — quantize tensors for optical transformer operations.
- ot_qlinear_fn — quantized linear transformation for optical computing.
- ot_qmatmul_fn — quantized matrix multiplication for optical computing.
Layers (source):
OpticalTransformerLinear— linear layer with optical transformer quantization.
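As a rough illustration of the kind of fake-quantization a function like ot_quantize_fn performs, here is a generic symmetric quantize-dequantize in plain Python. The function name and scaling scheme are assumptions for illustration, not mase-triton's actual implementation:

```python
def quantize_dequantize_symmetric(x, bits=8):
    """Quantize a list of floats to signed `bits`-bit integer levels with a
    per-tensor scale, then dequantize back to floats. Generic sketch only,
    not mase-triton's exact ot_quantize_fn."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    amax = max(abs(v) for v in x) or 1.0  # avoid dividing by zero
    scale = amax / qmax
    # Round to the nearest integer level, clamp, and map back to floats.
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in x]
```

Lower bit widths give coarser levels; the worst-case round-trip error is half of one quantization step (amax / (2 · qmax)).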
MXFP (Microscaling Formats)#
Simulate MXFP on CPU and GPU using PyTorch and Triton.
Functional (source):
- extract_mxfp_components — extract MXFP components (shared exponent and elements).
- compose_mxfp_tensor — compose MXFP components back into standard floats.
- quantize_dequantize — quantize and dequantize tensors using the MXFP format.
- mxfp_linear / mxfp_matmul — linear and matmul with MXFP support.
Layers (source):
MXFPLinearPTQ— linear layer with MXFP for post-training quantization (no backpropagation support).
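The core microscaling idea — one shared power-of-two exponent per block of elements — can be sketched in plain Python. This is a simplified illustration (real MXFP elements are themselves small floats such as FP4/FP8, and mase-triton's extract/compose APIs operate on tensors); clamped signed integers stand in for the element format:

```python
import math

def mxfp_block_quantize(block, elem_bits=4):
    """Quantize a block of floats with one shared power-of-two scale,
    mimicking microscaling (MXFP) formats. Simplified sketch: elements
    are clamped signed integers rather than FP4/FP8 minifloats."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    qmax = 2 ** (elem_bits - 1) - 1        # e.g. 7 for 4-bit signed elements
    # Smallest power-of-two scale under which amax still fits in qmax.
    shared_exp = math.ceil(math.log2(amax / qmax))
    scale = 2.0 ** shared_exp
    # Each element becomes (clamped integer) * shared scale.
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in block]
```

Because the whole block shares one scale, values that are already small powers of two relative to the block maximum survive exactly, while others are rounded onto the shared grid.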
Minifloat#
Simulate minifloat formats on CPU and GPU.
Functional (source):
- extract_minifloat_component / compose_minifloat_component — component extraction and composition.
- quantize_dequantize — quantize and dequantize using the minifloat format.
- minifloat_linear / minifloat_matmul — linear and matmul with minifloat support.
Layers (source):
MinifloatLinearPTQ— linear layer with minifloat for post-training quantization.
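The essence of minifloat quantization — keeping only a few mantissa bits — can be sketched with math.frexp and math.ldexp. This is a deliberately incomplete illustration: it rounds the mantissa only, whereas a real minifloat format (and mase-triton's implementation) also constrains the exponent range and handles subnormals and specials:

```python
import math

def round_mantissa(x, man_bits=3):
    """Round x to man_bits explicit mantissa bits (plus the implicit
    leading 1), leaving the exponent unconstrained. Sketch only: real
    minifloat quantization also clamps the exponent and handles
    subnormals/specials."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)               # x = m * 2**e with 0.5 <= |m| < 1
    levels = 2 ** (man_bits + 1)       # implicit bit + man_bits fraction bits
    return math.ldexp(round(m * levels) / levels, e)
```

For instance, with 3 mantissa bits, 0.1 rounds to 13/16 · 2^-3 = 0.1015625, while values already representable (such as 0.5) pass through unchanged.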
Utilities#
manager.py —
- KernelManager: enable/disable Triton kernel autotuning. HuggingFace Trainer does not work with autotuned Triton kernels; autotuning is therefore disabled by default in mase-triton.

utils/ — utility functions for PyTorch modules, debugging, and training.