University of Toronto students hit 50,000 tokens per second on FPGA hardware, and the inference economics story is more important than the headline