AI: Consumer GPU hardware is closing the gap with cloud AI faster than anyone expected (5 min read, 176 views)
AI: PFlash claims a 10x prefill speedup over llama.cpp, and it points to where local AI inference is heading (5 min read, 1.9K views)