actually-a-cat @ actuallyacat @sh.itjust.works Posts 2Comments 2Joined 2 yr. ago
You are supposed to manually set scale to 1.0 and base to 10000 when using Llama 2 with 4096 context. The automatic scaling assumes the model was trained for a 2048 context. Though as I say in the OP, that still doesn't work, at least with this particular fine-tune.
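A minimal sketch of what that looks like with llama.cpp's `main` binary, assuming a Llama 2 GGUF model (the model path here is hypothetical); the `--rope-freq-base` and `--rope-freq-scale` flags override the automatic RoPE scaling:

```shell
# Run a Llama 2 model at 4096 context with manual RoPE settings,
# instead of letting llama.cpp scale from the assumed 2048 training context.
./main -m ./llama-2-13b.Q4_K_M.gguf \
  -c 4096 \
  --rope-freq-base 10000 \
  --rope-freq-scale 1.0 \
  -p "Hello"
```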
Reddit has over 2,000 employees, most of whom are doing bullshit nobody using the site actually needs or wants. It's possible to run a lot leaner than that, like Reddit itself used to before it started burning hundreds of millions trying to compete with every other social media site at once instead of just being Reddit.