Mase-Triton#

PyPI | GitHub

Mase-Triton is a PyTorch extension library that provides efficient Triton-based implementations of operations used to simulate new compute paradigms. It underpins NewComputeBench's bitflip, optical transformer, and PIM simulations.

Installation#

pip install mase-triton

Functionality#

Random Bitflip#

Simulate random bit flips in neural network computations.

Functional APIs (source):

  • random_bitflip_fn — perform random bit flipping on tensors with configurable exponent and mantissa bit-flip probabilities (supports backward pass).

  • calculate_flip_probability — compute the flip probability 0.5^n from a given number of halves n.

  • find_nearest_prob_n_halves — snap a probability to the nearest valid power-of-0.5 value.
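Since only powers of 0.5 are representable (see the Note below), the two probability helpers can be sketched in plain Python. The function names are taken from the list above, but the bodies here are assumptions inferred from their descriptions, not the library's implementation:

```python
import math

def find_nearest_prob_n_halves(p: float) -> int:
    # Nearest integer n such that 0.5**n approximates p (compared in log space).
    return max(1, round(-math.log2(p)))

def calculate_flip_probability(n_halves: int) -> float:
    # The probability represented by n successive halvings: 0.5**n.
    return 0.5 ** n_halves

n = find_nearest_prob_n_halves(1e-3)
print(n, calculate_flip_probability(n))  # 1e-3 snaps to 0.5**10
```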

Layers (source):

  • RandomBitFlipDropout — random bit-flip layer with dropout functionality.

  • RandomBitFlipLinear — linear layer with random bit flipping.

Note

The bitflip probability must be a power of 0.5 (e.g., 0.5^10 ≈ 9.77e-04). The kernel snaps to the nearest valid value automatically. The minimum supported probability is 0.5^24 ≈ 5.96e-08 due to the Philox PRNG.
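To make the separate exponent/mantissa probabilities concrete, here is a minimal plain-Python sketch (not the library's Triton kernel) that flips one bit of a float32's IEEE-754 representation; flipping a mantissa bit perturbs the value slightly, while flipping an exponent bit rescales it by a power of two:

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    # Flip one bit of a float32 (bit 0 = mantissa LSB, bits 23-30 = exponent).
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (out,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return out

print(flip_bit(1.0, 0))   # mantissa LSB: tiny perturbation, ~1.0000001
print(flip_bit(1.0, 23))  # exponent LSB: value halves -> 0.5
```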

Optical Transformer#

Simulate optical computing for transformer models (paper).

Functional APIs (source):

  • ot_quantize_fn — quantize tensors for optical transformer operations.

  • ot_qlinear_fn — quantized linear transformation for optical computing.

  • ot_qmatmul_fn — quantized matrix multiplication for optical computing.

Layers (source):

  • OpticalTransformerLinear — linear layer with optical transformer quantization.
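The quantization these APIs perform can be illustrated with a generic symmetric fake-quantization round trip. This is a conceptual sketch only; the actual `ot_quantize_fn` may use a different scheme, and the function body below is an assumption, not the library's code:

```python
def quantize_dequantize(x, bits=8):
    # Symmetric uniform fake quantization: scale to +/- (2**(bits-1) - 1)
    # integer levels, round, then rescale back to floats.
    qmax = 2 ** (bits - 1) - 1
    amax = max(abs(v) for v in x)
    scale = amax / qmax if amax > 0 else 1.0
    return [round(v / scale) * scale for v in x]

print(quantize_dequantize([1.0, -0.5, 0.25], bits=8))
```

Each element lands within half a quantization step of its original value, which is the error the simulation introduces.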

MXFP (Microscaling Formats)#

Simulate MXFP on CPU and GPU using PyTorch and Triton.

Functional (source):

  • extract_mxfp_components — extract MXFP components (shared exponent and elements).

  • compose_mxfp_tensor — compose MXFP components back to standard floats.

  • quantize_dequantize — quantize and dequantize tensors using MXFP format.

  • mxfp_linear / mxfp_matmul — linear and matmul with MXFP support.

Layers (source):

  • MXFPLinearPTQ — linear layer with MXFP for post-training quantization (no backpropagation support).
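The core MXFP idea, one shared power-of-two exponent per block plus low-precision elements, can be sketched as follows. Integer elements stand in for the low-precision floating-point element formats MXFP actually uses, so this is a simplified illustration, not the library's `quantize_dequantize`:

```python
import math

def mxfp_quantize_dequantize(block, elem_bits=8):
    # One shared power-of-two scale per block; elements become small integers
    # (a stand-in for MXFP's low-precision FP element formats).
    amax = max(abs(v) for v in block)
    if amax == 0:
        return list(block)
    qmax = 2 ** (elem_bits - 1) - 1
    shared_exp = math.ceil(math.log2(amax / qmax))  # largest element must fit
    scale = 2.0 ** shared_exp
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in block]

print(mxfp_quantize_dequantize([3.0, -1.5, 0.5, 0.0]))
```

Values that are exact multiples of the shared scale survive the round trip unchanged; everything else is rounded to the nearest multiple.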

Minifloat#

Simulate minifloat formats on CPU and GPU.

Functional (source):

  • extract_minifloat_component / compose_minifloat_component — component extraction and composition.

  • quantize_dequantize — quantize and dequantize using minifloat format.

  • minifloat_linear / minifloat_matmul — linear and matmul with minifloat support.

Layers (source):

  • MinifloatLinearPTQ — linear layer with minifloat for post-training quantization.
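A minifloat format is defined by its exponent and mantissa widths; rounding a value to such a format can be sketched as below. This is a simplified, hypothetical illustration (it ignores saturation and subnormal handling) and is not the library's `quantize_dequantize`:

```python
import math

def minifloat_quantize(x, exp_bits=4, man_bits=3):
    # Round x to the nearest value representable with the given widths:
    # the spacing of representable values near x is 2**(e - man_bits),
    # where e is x's (clamped) binary exponent.
    if x == 0:
        return 0.0
    bias = 2 ** (exp_bits - 1) - 1
    e = math.floor(math.log2(abs(x)))
    e = max(min(e, bias), 1 - bias)  # clamp to the normal exponent range
    step = 2.0 ** (e - man_bits)
    return round(x / step) * step

print(minifloat_quantize(0.3))  # rounds to 0.3125 in an E4M3-like format
```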

Utilities#

  • manager.py — KernelManager: enable or disable Triton kernel autotuning. Autotuning is disabled by default in mase-triton because the HuggingFace Trainer does not work with autotuned Triton kernels.

  • utils/ — utility functions for PyTorch modules, debugging, and training.