ResApexQuant — Live Benchmark
KV cache quantization that beats SOTA by 12× at 1-bit.
Run the benchmark live below. All algorithms are calibrated and tested on real random unit vectors — no cherry-picking, no pre-computed results.
Patent pending · SR-SE Research · Code on GitHub
Parameters (min to max):
- 32 to 512
- 200 to 3000
- 50 to 500
What this tests:
- Real random unit vectors on S^{d-1}
- Calibration → test → Recall@k
- 4 algorithms: Uniform, TurboQuant (SOTA), ApexQuant, ResApexQuant
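The calibration → test → Recall@k loop listed above can be sketched in a few lines. This is a minimal illustration that covers only the Uniform baseline, since the ApexQuant and ResApexQuant algorithms are not described on this page. Every function name, the min/max calibration rule, the inner-product ranking, and the default sizes (`d=32`, `n_keys=200`, `n_queries=50`, `bits=4`, `k=10`) are assumptions made for the example, not the benchmark's actual code.

```python
import math
import random

def random_unit_vectors(n, d, rng):
    """Sample n points uniformly on S^{d-1} by normalizing Gaussian draws."""
    vectors = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        vectors.append([x / norm for x in v])
    return vectors

def calibrate_uniform(calibration, bits):
    """Fit the range of a uniform scalar quantizer (assumed rule: min/max of the calibration set)."""
    lo = min(min(v) for v in calibration)
    hi = max(max(v) for v in calibration)
    levels = 2 ** bits
    step = (hi - lo) / levels
    return lo, step, levels

def quantize(v, lo, step, levels):
    """Replace each coordinate by the midpoint of its quantization bin."""
    out = []
    for x in v:
        idx = int((x - lo) / step)
        idx = max(0, min(levels - 1, idx))  # clip values outside the calibrated range
        out.append(lo + (idx + 0.5) * step)
    return out

def top_k(query, keys, k):
    """Indices of the k keys with the largest inner product with the query."""
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    return sorted(range(len(keys)), key=lambda i: -scores[i])[:k]

def recall_at_k(queries, keys, quantized_keys, k):
    """Average fraction of the exact top-k that is recovered from the quantized keys."""
    hits = 0
    for q in queries:
        exact = set(top_k(q, keys, k))
        approx = set(top_k(q, quantized_keys, k))
        hits += len(exact & approx)
    return hits / (k * len(queries))

def run_benchmark(d=32, n_keys=200, n_queries=50, bits=4, k=10, seed=0):
    """Calibrate on one random set, then test on fresh keys and queries."""
    rng = random.Random(seed)
    calibration = random_unit_vectors(n_keys, d, rng)
    keys = random_unit_vectors(n_keys, d, rng)
    queries = random_unit_vectors(n_queries, d, rng)
    lo, step, levels = calibrate_uniform(calibration, bits)
    quantized = [quantize(v, lo, step, levels) for v in keys]
    return recall_at_k(queries, keys, quantized, k)

if __name__ == "__main__":
    print(f"Recall@10 at 4 bits (uniform baseline): {run_benchmark():.3f}")
```

Calibrating on one sample and testing on a separate one keeps the quantizer from being fitted to the vectors it is scored on. Lowering `bits` toward 1 shows how quickly a plain uniform quantizer loses recall.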
What stays secret:
- Auto-calibration on real LLM KV vectors
- CUDA/Triton optimized kernel
- Production integration pipeline
ResApexQuant — Patent pending (INPI 2026) · SR-SE Research Unit
The method exploits a theoretical property of residual attention deltas (δ = h − x) combined with Fisher Saturation constraints (Sc = 0.769893). Full mathematical proof available under NDA.