[Done] lemmy.world was upgraded to 0.18.3 today (2023-07-30)
Update
The upgrade was done, DB migrations took around 5 minutes. We'll keep an eye out for (new) issues but for now it seems to be OK.
Original message
We will upgrade lemmy.world to 0.18.3 today at 20:00 UTC+2 (check what this is in your timezone). Expect the site to be down for a few minutes. Edit: I was warned it could be more than a few minutes. The database update might even take 30 minutes or longer.
This version brings major optimizations to the database queries, which significantly reduces CPU usage. There is also a change to the way federation activities are stored, which reduces database size by around 80%.
Is it me or is the 80% figure just insane? Are there any benchmarks to see how fast this has become versus say Lemmy 0.18.2 on a very large instance?
Not really, you'd be surprised how often systems are bloated because of a single option, character, etc. Most developers don't start optimizing until much later in the software's lifecycle, and things like this are easily overlooked. That's why frequent code reviews with a fresh pair of eyes are needed.
Just to set the expectations, reducing database size or CPU usage does not necessarily mean it is faster but it does mean there's more free capacity on the servers to handle more users at the same performance.
More importantly, these changes may help reduce costs for the smaller indie instances, which won't need to buy larger servers.
Hopefully, we'll continue to see more of these optimizations.
I believe that if the backend doesn't have to write as much data, you'll have fewer I/O operations, so it should IMO have an impact on the overall speed of Lemmy (unless all of those operations are done asynchronously). Same for the reduced CPU usage: it could allow more work to run in parallel.
Speed/perf and capacity are two separate things. I/O has nothing to do with the size of the database. You can write 100TB per second into the database and choose to only store 1TB of content. That does not mean the app is writing 1TB per second; it is still writing 100TB per second.
As you can see, the issue here is that they were storing a lot of data in the activities table that is not needed; it was only meant for debugging. So they split the data in two and stopped storing the part that isn't needed; they're still writing the rest the same as before. One part is used to ensure they don't re-process the same data, but this is the same thing they were doing before this change.
In addition, they've limited how long the data is retained to 3 months, with a separate job that removes older data.
All of this has zero impact on the users using the app right now. The main benefit is for instance admins with limited storage. One might say the system slows down if there's not enough space, but that is still the case with this MR: it will still slow down once the disk fills up.
Wait, does that mean posts older than 3 months get automatically deleted? Isn't that kinda bad? Being able to find years-old posts is an important part of Reddit and pretty much all social media.
Funnily enough, this is the feature that can actually improve performance, by making fewer calls:
The federation code now includes a check for dead instances which is used when sending activities. This helps to reduce the amount of outgoing POST requests, and also reduce server load.
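A rough sketch of what such a dead-instance check could look like, in Python. This is not Lemmy's implementation: the `Instance` type, the `last_seen` field, the `/inbox` URL shape, and the 3-day threshold are all assumptions made up for the example. The point is only that filtering the target list before sending means no POST request is ever attempted for instances considered dead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Assumption: an instance counts as dead if it has not been reachable for 3 days.
DEAD_AFTER = timedelta(days=3)


@dataclass
class Instance:
    domain: str
    last_seen: Optional[datetime]  # None means it was never reachable


def is_dead(instance: Instance, now: datetime) -> bool:
    """An instance is dead if it was never seen or not seen recently enough."""
    return instance.last_seen is None or now - instance.last_seen > DEAD_AFTER


def inboxes_to_notify(instances: List[Instance], now: datetime) -> List[str]:
    """Return inbox URLs for live instances only; dead ones are skipped."""
    return [f"https://{i.domain}/inbox" for i in instances if not is_dead(i, now)]
```

With one live instance and two dead ones in the list, only one outgoing request would be made instead of three.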