Performance @programming.dev agilob @programming.dev 6 mo. ago

LLaMA Now Goes Faster on CPUs

justine.lol LLaMA Now Goes Faster on CPUs

I wrote 84 new matmul kernels to improve llamafile CPU performance.

My kernels go 2x faster than MKL for matrices that fit in L2 cache, which makes them a work in progress, since the speedup works best for prompts having fewer than 1,000 tokens.

LocalLLaMA @sh.itjust.works ylai @lemmy.ml 6 mo. ago

LLaMA Now Goes Faster on CPUs

justine.lol /matmul/

Hacker News @lemmy.smeargle.fans bot @lemmy.smeargle.fans BOT 6 mo. ago

LLaMA now goes faster on CPUs

justine.lol /matmul/

AI Companions @lemmy.world pavnilschanda @lemmy.world 6 mo. ago

[News] LLaMA Now Goes Faster on CPUs

justine.lol /matmul/

0 comments