LLaMA Now Goes Faster on CPUs

justine.lol LLaMA Now Goes Faster on CPUs

I wrote 84 new matmul kernels to improve llamafile CPU performance.

My kernels go 2x faster than MKL for matrices that fit in L2 cache. That limit makes them a work in progress, since the speedup works best for prompts having fewer than 1,000 tokens.
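
For readers wondering why cache size matters for a matmul kernel at all, here is a minimal sketch of loop tiling (cache blocking), a generic textbook technique. It is not the code from the linked post. The function names `matmul_naive` and `matmul_tiled` and the `tile` parameter are hypothetical, chosen only for illustration. The idea is to work on small sub-blocks so the data being reused stays in fast memory. Pure Python will not show the speedup; the sketch only shows the loop structure.

```python
def matmul_naive(a, b):
    """Reference triple loop: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_tiled(a, b, tile=4):
    """Same product, computed block by block (hypothetical illustration).

    Each (i0, j0, p0) step touches at most a tile x tile block of A, B,
    and C, so a real kernel could keep that working set in cache.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for p0 in range(0, k, tile):
                # Multiply one block of A by one block of B, accumulating into C.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

Both functions return the same result for any tile size of at least 1. When the whole problem already fits in cache, blocking like this has little left to gain, which is one plausible reading of why the quoted speedup is tied to matrix size.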

Hacker News @lemmy.smeargle.fans bot @lemmy.smeargle.fans