Hi, I currently have almost no backups and I want to change that. I have a PC with Nextcloud on a 500gb ssd that I also use for gaming (1tb system drive). Nextcloud would be used to store/sync images, documents, contacts, and calendar from my phone and laptop. I also have an old pc that has 2x 80gb, 120gb, 320gb, and 500gb hdd. I want to use it for other backups like OS snapshots, programming projects, etc., but it's not one big hdd, just a lot of small ones. Should I store each backup on 2 drives? Can I automate this? Any suggestions would be helpful.
Don't use a synchronized folder as a backup solution (delete a file by mistake on your local replica -> the deletion gets replicated to the server -> you lose both copies).
old pc that has 2x 80gb, 120gb, 320gb, and 500gb hdd
You can make a JBOD array out of that using LVM (add all disks as PVs, create a single VG on top of them, create a single LV spanning that VG, format the LV as ext4, mount the filesystem somewhere, and access it over SFTP or another file transfer protocol).
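For anyone who wants the LVM steps above spelled out, here is a rough sketch that only builds and prints the commands rather than running them. The device paths, the volume group/logical volume names and the mount point are made-up placeholders, so check them against your own machine (pvcreate wipes whatever is on the disks).

```python
import shlex


def lvm_jbod_commands(devices, vg="vg_old", lv="lv_old", mountpoint="/mnt/old"):
    """Return the commands for the PV -> VG -> LV -> ext4 -> mount steps.

    All names here (vg_old, lv_old, /mnt/old) are hypothetical placeholders.
    """
    if not devices:
        raise ValueError("need at least one device")
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", *devices],                         # mark each disk as a physical volume
        ["vgcreate", vg, *devices],                     # pool all PVs into one volume group
        ["lvcreate", "-l", "100%FREE", "-n", lv, vg],   # one LV using all space in the VG
        ["mkfs.ext4", lv_path],                         # put an ext4 filesystem on the LV
        ["mkdir", "-p", mountpoint],
        ["mount", lv_path, mountpoint],
    ]


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before running as root.
    for cmd in lvm_jbod_commands(["/dev/sdb", "/dev/sdc", "/dev/sdd"]):
        print(shlex.join(cmd))
```

Printing instead of executing is deliberate: one wrong device path here destroys data, so it is worth reading the commands before pasting them into a root shell.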
But if the disks are old, I wouldn't trust them as reliable backup storage. You can use them to store data that will be backed up somewhere else. Or as an expendable TEMP directory (this is what I do with my old disks).
My advice: get a large disk for this PC and store backups on that. You don't necessarily need RAID (RAID is a high-availability mechanism, not a backup). Set up backup software on this old PC to pull automatic daily backups from your server (and possibly other devices/desktops; personally I don't bother with that, anything that is not on the server is expendable). I use rsnapshot for that: simple config file, basic deduplication (unchanged files are hard-linked between snapshots), plain filesystem-backed backups so I can access the files without any special software. It gets the job done. There are a few threads here with backup software recommendations.
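As a rough illustration of the hard-link idea behind rsnapshot-style snapshots: every snapshot looks like a full copy of the source, but files that did not change since the previous snapshot are hard links and take no extra space. This is a toy sketch of the concept, not rsnapshot's code (rsnapshot itself drives rsync), and the function name and snapshot directory names are made up.

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source, dest, previous=None):
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`.

    Returns (linked, copied) file counts. Symlinks and special files are
    not handled; this only illustrates the idea.
    """
    source, dest = Path(source), Path(dest)
    previous = Path(previous) if previous else None
    linked = copied = 0
    for root, _dirs, files in os.walk(source):
        rel = Path(root).relative_to(source)
        (dest / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(root) / name
            target = dest / rel / name
            old = previous / rel / name if previous else None
            if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, target)        # unchanged: share the data on disk
                linked += 1
            else:
                shutil.copy2(src, target)   # new or changed: store a real copy
                copied += 1
    return linked, copied
```

Usage would be something like snapshot("/srv/data", "/backups/daily.0", previous="/backups/daily.1"). Deleting an old snapshot directory is safe because the data only disappears once the last hard link to it is gone.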
In addition I make regular, manual, offsite copies of the backup server's backups/ directory to removable media (stash the drive somewhere where a disaster that destroys the backup server will not also destroy the offsite backup drive).
Prefer pull-based backup strategies, where hosts being backed up do not have write access to the backup server (else a compromised host could alter previous backups).
Monitor correct execution of backups (my simple solution is to have cron create/update a state file after each successful run, and have the netdata agent check the last modification date of this file. If it has not been modified in the last 24-25 hours, something is wrong and I get an alert).
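The state-file check described above is easy to reproduce even without netdata. A minimal sketch in plain Python, assuming the backup job calls mark_success() at the end of a good run and some separate scheduled check calls backup_is_stale(); both function names and the 25-hour threshold are just illustrative choices.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 25 * 3600  # daily backup plus one hour of slack


def mark_success(state_file):
    """Create the state file or bump its modification time after a good run."""
    Path(state_file).touch()


def backup_is_stale(state_file, max_age=MAX_AGE_SECONDS, now=None):
    """True if the last successful backup is missing or older than max_age."""
    path = Path(state_file)
    if not path.exists():
        return True  # never succeeded, or the file was removed
    now = time.time() if now is None else now
    return now - path.stat().st_mtime > max_age
```

The important detail is that the file is only touched on success, so a backup that crashes halfway looks exactly like a backup that never ran.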
JBOD here just means "show me this bunch of old drives as a single drive/partition". It's just a recommendation to at least get something out of these drives, but don't use this as backup storage: these drives are old, and if a single one fails, you lose access to the whole array.
If you're not sure what to do with them, just get a USB/SATA dock or adapter and treat them like old books: copy not-so-valuable stuff onto them, and store them on a bookshelf with labels such as Old movies, Wikipedia dumps 2015-2022...
Definitely get a good, new drive for backup storage. And possibly another one for offsite backups.
I mostly use it (Kopia) for cloud backups, but it works great for local/network storage as well.
It's really fast and efficient, supports cutting-edge encryption and compression algorithms, and the de-duplication and file-splitting features let you take frequent snapshots at minimal storage cost.
Snapshots are also effortless to mount and it even supports error correction to protect against bit-flipping and other long-term storage risks.
It's also cross-platform and FOSS.
De-duplication prevents duplicate bits of data from being stored twice, even if they sit under different file names or were synced from different systems.
The rolling hash/file-splitting means if you modify a 25GB file and only change a couple MB then only the changed couple MB will need to be stored. This means you can spend a month modifying small parts of a massive file thousands of times and avoid storing a new 25GB file thousands of times to archive those changes.
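To make the two ideas above a bit more concrete, here is a toy sketch: a rolling hash decides where chunks end based on the content itself, and each chunk is stored under its SHA-256 digest, so a chunk that was already seen costs nothing. This is not Kopia's actual algorithm or storage format; the gear-style hash, the chunk-size constants and the ChunkStore class are all assumptions made for illustration.

```python
import hashlib

# One pseudo-random 32-bit value per byte value, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:4], "big") for i in range(256)]
MASK = (1 << 11) - 1   # a boundary roughly every 2 KiB on average
MIN_CHUNK = 256        # avoid tiny chunks


def split_chunks(data):
    """Split bytes into chunks whose boundaries depend only on nearby content."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the 32-bit state, so h only "sees" recent bytes.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if i + 1 - start >= MIN_CHUNK and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class ChunkStore:
    """Content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.blobs = {}

    def add(self, data):
        """Store data; return (manifest of chunk digests, newly stored bytes)."""
        manifest, new_bytes = [], 0
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blobs:
                self.blobs[digest] = chunk
                new_bytes += len(chunk)
            manifest.append(digest)
        return manifest, new_bytes

    def restore(self, manifest):
        return b"".join(self.blobs[d] for d in manifest)
```

Because boundaries follow the content and not fixed offsets, inserting a few bytes in the middle of a file only changes the chunks around the edit; everything after it lines up with the old chunks again and is not stored a second time.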
Can second Kopia! The deduplication works like a charm.
I've recently started using Immich (I previously used Google Photos). And since I'd already backed up a recent Google Takeout archive (unzipped), backing up all of my images in Immich added just a couple hundred megabytes (over ~200GB of images).
Edit: also, don't entirely discount paying for some cloud storage for backups: I never wanted to do that since I wanted to host it myself, but there are multiple reasons to have one of your backup targets be cloud storage (yes, I know I'm in the selfhosted community):
it's definitely physically separate
most cloud storage has incredibly reliable storage (which is hard to replicate on most home-storage-budgets)
the cost can be very low even compared to buying disks (I pay $20/year for 1TB, which can hold all of my valuable data easily, obviously not my "bulk stuff").
How old are these disks? I wouldn't trust anything of value to an HDD that old (better to save it on a bunch of good-quality DVDs or Blu-ray discs than to rely on such old disks).
If I've learned something about selfhosting and backups, it is that you can trust HDDs to spin for 3-5 years and should still do backups. I myself do backups to HDDs that are only powered on for these backups. I'm still not sure if that's enough.
RAID is more for an always-on solution, but not great for safe backups. The drives might still fail at the same time, because you bought them at the same time, from the same vendor, and they have the same usage time.
One thing that RAID doesn't do is verify the integrity of your data on read. In other words: if you have silent data corruption somewhere you won't notice.
For many use cases that's acceptable, since it doesn't happen often, but personally I don't like it for any kind of archival/backups. That's why I picked ZFS, which stores and verifies checksums even on non-mirrored/non-RAID storage. I've added RAID-Z2 (similar to RAID 6, i.e. two disks' worth of parity) on top of it to be able to recover from checksum errors.
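If you're not on ZFS, you can get a crude file-level version of the same check by keeping a list of checksums next to the backup and re-verifying it now and then. A minimal sketch follows; the function names are made up, and unlike ZFS with redundancy this only detects corruption, it cannot repair anything.

```python
import hashlib
from pathlib import Path


def file_digest(path, block_size=1 << 20):
    """SHA-256 of a file, read in blocks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or no longer match."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            bad.append(rel)
    return sorted(bad)
```

Build the manifest right after the backup is written, store it somewhere else as well, and treat any non-empty result from verify() as a reason to restore that file from another copy.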
You can of course automate it. If you're running backups from Linux use anacron rather than cron because anacron tries to run when it can (when the machine turns on), whereas cron doesn't run again if the machine was off when it was time to run.
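The difference boils down to "run at a fixed time" versus "run if the last run is too long ago". The second one is simple enough to sketch, assuming the script is started at boot and/or every hour. This is only the idea and not how anacron is implemented; the function names and the stamp file are hypothetical.

```python
import time
from pathlib import Path


def is_due(stamp_file, period_seconds=24 * 3600, now=None):
    """True if the job never ran or last ran more than period_seconds ago."""
    now = time.time() if now is None else now
    path = Path(stamp_file)
    if not path.exists():
        return True
    last_run = float(path.read_text().strip())
    return now - last_run >= period_seconds


def run_if_due(job, stamp_file, period_seconds=24 * 3600, now=None):
    """Run job() if it is overdue and record the time; return True if it ran."""
    now = time.time() if now is None else now
    if not is_due(stamp_file, period_seconds, now):
        return False
    job()  # if this raises, the stamp is not updated and the job is retried next time
    Path(stamp_file).write_text(str(now))
    return True
```

A machine that was off at the scheduled time just runs the backup on the next check after it comes back up, which is the behaviour you want on a desktop or laptop.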
rsync is the most straightforward solution. Pros: it won't copy files again if they haven't changed; it can copy remotely over ssh. Con: it has a bit of a learning curve.
BorgBackup would be my next recommendation. It takes distinct backups but doesn't duplicate files between them. It has compression, encryption (optional) and you can run checks on the backups. Con: for remote use, borg also has to be installed on the target machine (the client talks to it over SSH). Another potential con is that it doesn't store the files in a directly usable format like rsync does. Borg archives are similar to a zip archive: you can list the files, you can extract them, you can even mount them somewhere and then access the files directly, but you can't get at them without borg.
I'm sure there are more elegant solutions out there, but here's my method:
I have an inexpensive hard drive dock connected to my NUC home server via USB (with UASP support). I rotate two large-capacity hard drives between work and home, ensuring that one is always off-site. The drives are wholly encrypted, so I manually decrypt and mount the drive, and run a backup script that pulls any changed data from all devices on the network. I then take that drive to work and bring the other one home.
I have a calendar reminder to do this each month, and I'll sometimes run a backup in between the usual schedule when we're working on important projects at home.
How much data are we talking about? I get confused. Is it 1.5TB or is it 2.5TB?
Then, how backed up do you want to be? Think about whether you REALLY need daily backups. RAID might be cool and flashy, but if you don't need it, running it only creates cost.
If you have about 2 TB of data, then I would just buy three external 2TB HDDs and replace them every 5 years. Then rotate them every time you do a backup.
Can you automate your backups? Not to the point of never having to do anything yourself, unless you choose to pay for 2 cloud storage providers that both let you store your backups in an immutable state.
I personally don't use automation, I just have a VeraCrypt volume for storing backups and do them manually. Rarely full-system, mostly just the home folder.