Thoughts & Insights

Exploring the frontiers of AI, engineering, and digital experiences.

3/15/2024

Optimizing Transformer Inference

A deep dive into KV-cache optimization and quantization techniques for serving LLMs at scale.

#Paper #LLM

2/10/2024

Exploring how autonomous agents will reshape software engineering workflows.

#Article #Agents