Update on SJW upgrade to v19.3

TheDude@sh.itjust.works · edited 8 months ago

4am · edited 8 months ago

Speed is usually the reason. SSDs in general are faster, and enterprise SSDs are not only faster but also much more write-tolerant, so they last a very long time compared to consumer SSDs.

They can also (in many cases) do write caching at the speed of a DRAM buffer, making the bottleneck the SATA or SAS bus itself (SAS is like enterprise SATA, 12 Gb/s as opposed to 6 Gb/s). NVMe can be even faster. This means that programs that write data (i.e., Lemmy and its database) aren’t waiting around for the drive to acknowledge the write before they can move on to other things. Shaving off a few milliseconds per write can make a massive difference when you realize how many IOPS (input/output operations per second) a server may need to sustain under load. Low latency is everything in servers.
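
To put rough numbers on the latency point above, here is a back-of-the-envelope sketch. It assumes the worst case, where a single caller waits for each write to be acknowledged before issuing the next one. The function names and both latency figures are made up for illustration and are not measurements of any real drive.

```python
# Illustrative numbers only: neither latency below is a measurement
# from this thread or from any particular drive.

def serialized_write_time_s(num_writes: int, latency_ms: float) -> float:
    """Total time if every write must be acknowledged before the next one starts."""
    return num_writes * latency_ms / 1000.0


def max_serialized_writes_per_s(latency_ms: float) -> float:
    """Upper bound on writes per second for one caller waiting on each acknowledgement."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    writes = 1_000_000
    # Hypothetical acknowledgement latencies: a slow ack vs. a cached ack.
    for label, latency_ms in [("slow ack, 5 ms", 5.0), ("cached ack, 0.05 ms", 0.05)]:
        total = serialized_write_time_s(writes, latency_ms)
        rate = max_serialized_writes_per_s(latency_ms)
        print(f"{label}: {total:,.0f} s for {writes:,} writes, at most {rate:,.0f} writes/s")
```

Real workloads overlap many writes at once, so this overstates the absolute times, but the ratio between the two cases is the part that matters.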

When you are running a public service and requests are coming in constantly and at a high rate, you really really do not want storage latency to bottleneck you, as that is a problem that will compound extremely quickly. This is a big issue with HDDs as well, as even disk seek times add to the problem, let alone caching/buffering writes.

We could talk all day about whether four SSDs in a RAID 10 are optimal, but sometimes you have to think about budget and complexity as well. For the load that a popular Lemmy instance might currently draw, I’d make an educated guess that this might be sufficient for now. Room to expand was also mentioned, which is the second most important part of a storage plan.

_cnt0@sh.itjust.works · 8 months ago

I’d wager RAID 5 would be better, but with 4 SSDs it would require a special storage controller or hog the CPU.
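
One reason RAID 5 might look better here is usable capacity, though the comment does not say which sense of "better" is meant. The sketch below compares the two layouts for four drives using the standard capacity formulas. The 1 TB drive size and the function names are assumptions for illustration; the thread does not state the actual drive sizes.

```python
def raid10_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """RAID 10 stripes across mirrored pairs, so half of the raw capacity is usable."""
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drive_count / 2 * drive_size_tb


def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """RAID 5 spends one drive's worth of space on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_size_tb


if __name__ == "__main__":
    drives, size_tb = 4, 1.0  # hypothetical: four 1 TB SSDs
    print("RAID 10 usable TB:", raid10_usable_tb(drives, size_tb))
    print("RAID 5 usable TB:", raid5_usable_tb(drives, size_tb))
```

Capacity is only one axis. The trade-off is that RAID 5 has to update parity on every write, which is where the controller or CPU cost mentioned above comes from.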

burrito@sh.itjust.works · 8 months ago

Software RAID is much faster than you think, even in RAID 5. Many of the algorithms used in software RAID leverage special CPU instructions that can process the parity operations at a very fast rate. Reading the data, which is by far the most common operation in a Lemmy instance, uses even less computational power than writes.
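
For anyone curious what the parity operation actually is, here is a toy sketch of RAID 5 style XOR parity. This is not how any real implementation is written: production software RAID does the same XOR with vectorized CPU instructions over much larger blocks. The function names and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column, one byte position at a time.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing data block from the surviving blocks plus parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    # Three data blocks plus one parity block, as on one stripe of a four-drive array.
    data = [b"lemmy-01", b"lemmy-02", b"lemmy-03"]
    parity = xor_blocks(data)
    # Pretend the drive holding data[1] failed and rebuild it from the rest.
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

Note that a normal read never touches parity at all. It only matters on writes and when rebuilding after a failure, which is why reads are the cheap case.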

4am · 8 months ago

Yeah, ZFS rocks these days. Fast and rock solid for me, even on older hardware. I run my whole array as mirrored vdevs (so, basically a bunch of RAID 10) to keep resilver times down when I replace drives. No issues so far!

Task	Date	Expected Downtime
Migration to new server	Tuesday, February 27, 2024 @ 8:00 PM ET	90 minutes
Upgrade to v19.3	Thursday, February 29, 2024 @ 8:00 PM ET	Up to 120 minutes