University of Toronto students hit 50,000 tokens per second on FPGA hardware and the inference economics story is more important than the headline