ResApexQuant — Live Benchmark

KV cache quantization that beats the SOTA baseline (TurboQuant) by 12× at 1 bit.

Run the benchmark live below. Every algorithm is calibrated and tested on random unit vectors generated at run time: no cherry-picking, no pre-computed results.

Patent pending · SR-SE Research · Code on GitHub

Parameters

Three adjustable parameters, with ranges 32 to 512, 200 to 3000, and 50 to 500.

What this tests:

  • Real random unit vectors on S^{d-1}
  • Calibration → test → Recall@k
  • 4 algorithms: Uniform, TurboQuant (SOTA), ApexQuant, ResApexQuant
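
The test loop above can be sketched in a few lines of plain Python. This is only an illustration of the general idea (sample unit vectors, quantize them, measure Recall@k against exact search), not the benchmark's actual code. The function names, the sizes (200 keys, 50 queries, d = 32, k = 10), and the sign-based 1-bit quantizer are all assumptions made for the example. The quantizer is a generic stand-in and is not ApexQuant, ResApexQuant, or TurboQuant, and it skips the calibration step because it has no parameters to fit.

```python
import math
import random

def random_unit_vectors(n, d, rng):
    """Sample n points on the unit sphere S^{d-1} by normalizing Gaussian vectors."""
    vectors = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        vectors.append([x / norm for x in v])
    return vectors

def sign_quantize(v):
    """Hypothetical 1-bit stand-in: keep each coordinate's sign, rescaled to unit norm."""
    scale = 1.0 / math.sqrt(len(v))
    return [scale if x >= 0.0 else -scale for x in v]

def top_k(query, keys, k):
    """Indices of the k keys with the largest inner product with the query."""
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    return sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]

def recall_at_k(queries, keys, quantized_keys, k):
    """Mean fraction of the exact top-k recovered when searching the quantized keys."""
    total = 0.0
    for q in queries:
        exact = set(top_k(q, keys, k))
        approx = set(top_k(q, quantized_keys, k))
        total += len(exact & approx) / k
    return total / len(queries)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
keys = random_unit_vectors(200, 32, rng)
queries = random_unit_vectors(50, 32, rng)
quantized = [sign_quantize(v) for v in keys]
recall = recall_at_k(queries, keys, quantized, k=10)
print(f"Recall@10 with 1-bit sign quantization: {recall:.3f}")
```

Swapping `sign_quantize` for another quantizer, and adding a calibration pass on a separate sample before the test pass, would give the same kind of side-by-side comparison the demo reports.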

What stays secret:

  • Auto-calibration on real LLM KV vectors
  • CUDA/Triton optimized kernel
  • Production integration pipeline

ResApexQuant — Patent pending (INPI 2026) · SR-SE Research Unit

The method exploits a theoretical property of residual attention deltas (δ = h − x) combined with Fisher Saturation constraints (Sc = 0.769893). Full mathematical proof available under NDA.