llama.cpp checkpoint fix speeds local coding agents
A still-open pull request in llama.cpp could eliminate the single biggest frustration for local AI coding agents: forced full re-processing of the prompt, which makes every tool call painfully slow. The fix targets prompt cache reliability, something hosted systems take for granted but open-source inference has struggled to match.