Tuesday, 30 September 2025

DeepSeek Launches Sparse Attention Model, Halves API Costs

DeepSeek has unveiled V3.2-exp, a sparse-attention model using a "lightning indexer" and fine-grained token selection to trim inference expenses. In long-context applications, this architecture can cut per-call API costs by up to 50%. The model is open-weight and publicly available on Hugging Face, enabling further third-party validation and adoption.
Read full story at TechCrunch
