Topic: #inference
2 articles exploring inference. Expert insights and analysis from our editorial team.
Articles
AI Infrastructure
IonRouter (YC W26): The Custom NVIDIA GH200 Runtime Targeting the LLM Inference Cost Crisis
IonRouter (YC W26) built IonAttention, a custom GH200 inference runtime that claims 50% cost reductions and 2x VLM throughput. Here's what the technology actually does.
AI Research
Executing Programs Inside Transformers: The Inference Breakthrough Nobody Expected
A new architecture from Percepta embeds a program interpreter directly into transformer weights, achieving logarithmic-time execution lookups that could reshape how AI agents handle deterministic computation—if the early claims survive scrutiny.