Coinbase halved its AI bill without restricting engineers and the playbook is worth stealing
Brian Armstrong cut Coinbase's AI bill by nearly 50% without capping usage, defaulting engineers to Chinese open-weight models like GLM 5.2 at $1.40 per million tokens versus Anthropic Opus at $5, while lifting cache hit rates from 5% to 60%. The playbook is replicable , but the regulatory question around Chinese-origin models in a federally registered financial firm is one Armstrong's X thread quietly skipped.