Home Server/Setting up your Storage: Difference between revisions
>DataHoarderAnon (Added Note on RAID concepts. we don't need to cover those here IMO. Added subsections to ZFS. Added some warnings and not recommended practices. Added ZFS basics (Very WIP) added external links) |
>DataHoarderAnon mNo edit summary |
{{Note|There are a lot of misconceptions about ZFS and ECC Ram. ECC Ram is '''NOT''' required for ZFS to operate. ZFS was made to protect data against degradation however, and not using ECC Ram to protect against memory errors (and thus data degradation) defeats the purpose of ZFS.}}
Revision as of 07:22, 23 December 2020
Setting up your RAID Solution
mdadm
ZFS
Basic Concepts
Physical Disks are grouped into Virtual devices (Vdevs). Vdevs are grouped into Zpools. Datasets reside in Zpools.
The actual file system portion of ZFS is a dataset which sits on top of the ZPool. This is where you store all of your data. There are also Zvols which are the equivalent of block devices (or LVM LVs). You can format these with other file systems like XFS, or use them as block storage, but for the most part we will be using just the standard ZFS file system. There are also Snapshots and Clones which we will talk about later.
Zpools stripe data across all included vdevs.
There are 7 types of vdevs.
- Disk: A single storage device. Adding multiple drives to the same pool without RAIDZ or Mirror is effectively Raid 0.
- Mirror: Same as a RAID 1 Mirror. Adding multiple mirrored vdevs is effectively Raid 10.
- RAIDZ: Parity based RAID similar to RAID 5. RAIDZ1, RAIDZ2, RAIDZ3 with Single, double, and triple parity respectively.
- Hot Spare: A hot spare, or standby drive that will replace a failed disk until it is replaced with a new one.
- File: A pre-allocated file.
- Cache: A cache device (typically an SSD) for L2ARC. It's generally not recommended to use this unless you absolutely need it.
- Log: Dedicated ZFS Intent Log (ZIL) device, also called a SLOG (Separate intent LOG SSD). Usually these are high performance, durable SLC or MLC SSDs.
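To make the layout above more concrete, here is a rough Python sketch of how usable space and redundancy fall out of the vdev types. It is only a back-of-the-envelope model: the `Vdev` and `Zpool` classes and the `kind` labels are made up for illustration (they are not part of any ZFS tooling), and the numbers ignore metadata, padding, and other overhead.

```python
from dataclasses import dataclass
from typing import List

# Parity disks per RAIDZ level (labels are hypothetical, chosen for this sketch).
PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


@dataclass
class Vdev:
    kind: str                    # "disk", "mirror", "raidz1", "raidz2" or "raidz3"
    disk_sizes_tb: List[float]   # sizes of the member disks in TB

    def usable_tb(self) -> float:
        """Approximate usable space, ignoring filesystem overhead."""
        smallest = min(self.disk_sizes_tb)
        if self.kind == "disk":
            return self.disk_sizes_tb[0]
        if self.kind == "mirror":
            # Every disk holds the same copy, so you get one disk's worth.
            return smallest
        # RAIDZ: every member counts as the smallest disk, minus parity.
        return smallest * (len(self.disk_sizes_tb) - PARITY[self.kind])

    def failures_tolerated(self) -> int:
        """How many member disks can die before the vdev is lost."""
        if self.kind == "disk":
            return 0
        if self.kind == "mirror":
            return len(self.disk_sizes_tb) - 1
        return PARITY[self.kind]


@dataclass
class Zpool:
    vdevs: List[Vdev]

    def usable_tb(self) -> float:
        # Data is striped across vdevs, so their capacities add up.
        return sum(v.usable_tb() for v in self.vdevs)

    def worst_case_failures_tolerated(self) -> int:
        # Losing any one vdev loses the pool, so the weakest vdev sets the limit.
        return min(v.failures_tolerated() for v in self.vdevs)


# Two mirrored pairs (effectively RAID 10) versus one six-disk RAIDZ2.
striped_mirrors = Zpool([Vdev("mirror", [4, 4]), Vdev("mirror", [4, 4])])
raidz2_pool = Zpool([Vdev("raidz2", [4, 4, 4, 4, 4, 4])])

print(striped_mirrors.usable_tb(), striped_mirrors.worst_case_failures_tolerated())
print(raidz2_pool.usable_tb(), raidz2_pool.worst_case_failures_tolerated())
```

Note that a pool of plain single-disk vdevs tolerates zero failures in this model, which is the "effectively Raid 0" case from the list.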
Snapshots and Clones
Should I use ZFS?
ZFS has a lot of really great features that make it a superb file system. It has file system level checksums for data integrity, file self healing which can correct silent disk errors, incremental snapshots and rollback, file deduplication, encryption, and more.
There are, however, some downsides to ZFS, notably inflexibility and the upfront cost. As of this revision, ZFS RAIDZ vdevs CANNOT BE EXPANDED after being created. Parity cannot be added either (you cannot change a RAIDZ1 to a RAIDZ2 later on). You cannot usefully mix differently sized disks (every disk in a vdev is treated as if it were the size of the smallest one), and you cannot bring in disks with data already on them, even disks formatted as ZFS, because adding a disk to a vdev wipes it. In other words, you need to buy ALL of the drives you plan on using in your RAIDZ array at the same time, because unlike other software RAID (or even hardware RAID), you won't be able to change it later. This inherently makes ZFS costly to use and thus unfriendly to more budget-oriented server builds.
Growing your storage is pricey too, as best practice is to either add identical vdevs to your existing Zpool or create an entirely new Zpool. This means that at a minimum you need to add two drives at a time to maintain proper redundancy (if using mirrored pairs). You can "vertically" expand your Zpool by replacing each disk in a RAIDZ array with a larger disk, but this requires resilvering the array after every single replacement, which for larger arrays can take weeks or even months, so it is not recommended.
Now add in the fact that running ZFS requires a hefty amount of RAM, preferably ECC RAM (which is expensive in and of itself), that it requires server hardware to utilize to its fullest, and that some of the fancy features like dedup require a good processor too. The price tag starts to add up really quickly.
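To see why mixed disk sizes and "vertical" expansion are such a poor deal, here is a small Python sketch of the arithmetic. It assumes a single RAIDZ vdev where every member counts as the size of the smallest disk, and it ignores overhead. The function names are made up for this example.

```python
from typing import List


def raidz_usable_tb(disk_sizes_tb: List[float], parity: int) -> float:
    """Approximate usable space of one RAIDZ vdev, ignoring overhead.

    Every member is treated as if it were the size of the smallest disk.
    """
    return min(disk_sizes_tb) * (len(disk_sizes_tb) - parity)


def vertical_expansion_steps(disk_sizes_tb: List[float],
                             new_size_tb: float,
                             parity: int) -> List[float]:
    """Usable capacity after each one-disk swap (each swap means one resilver)."""
    disks = list(disk_sizes_tb)
    capacities = []
    for i in range(len(disks)):
        disks[i] = new_size_tb          # replace one disk, then resilver
        capacities.append(raidz_usable_tb(disks, parity))
    return capacities


# Mixing sizes: the extra 4 TB on the big disk is simply wasted.
mixed = raidz_usable_tb([4, 4, 8], parity=1)
print(mixed)   # 8

# Swapping a 4x4 TB RAIDZ1 to 8 TB disks, one disk at a time.
steps = vertical_expansion_steps([4, 4, 4, 4], new_size_tb=8, parity=1)
print(steps)   # [12, 12, 12, 24]
```

In this model you pay for four new disks and sit through four resilvers, and the extra space only shows up after the very last swap.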
So when asking yourself "Should I use ZFS?" you really should be asking "Do I really need ZFS?" (Do I want long term data integrity and all those other fancy features?) and "Can I afford ZFS?". If the answer to both of those questions is "Yes", then you can and should use ZFS, otherwise use something else like Snapraid or mdadm.
Not Recommended:
- Running ZFS on old hardware.
- Running ZFS on ancient hardware.
- Running ZFS on consoomer motherboards.
- Running ZFS without ECC RAM. If you can afford ZFS you can afford to get ECC RAM. No excuses.
- Running ZFS on underqualified hardware (shitty little NAS boxes, SBCs, etc).
- Using "Mutt" pools (Zpools with differently sized vdevs).
- Growing your Zpool by replacing disks. Back up your data elsewhere, create a new pool, and transfer the data to the new pool instead. Much faster. (You could theoretically use a USB drive dock provided your array is 5 disks or less).
DO NOT
- Run ZFS on top of Hardware RAID.
- Run ZFS on top of other soft RAID.
- Run ZFS in a VM without taking the proper precautions.
- Run ZFS with SMR drives.
Btrfs
Snapraid
Hardware RAID
If you bought an old used server with a RAID controller already installed, or perhaps you don't feel like messing with software RAID solutions, you have the option of using hardware RAID rather than software RAID.
Choosing a file system
XFS
ext4
NTFS
If you are using Snapraid as your RAID solution, using NTFS formatted drives is perfectly fine. With Snapraid you are usually pulling out random drives you have lying around, which are most likely to be NTFS formatted. Otherwise, we do not recommend using NTFS unless you are running a Windows server for some reason. It does not have the same level of support on Linux and UNIX based systems as ext4 and XFS.
unRAID does not support NTFS for array disks. If you are using unRAID you will need to use XFS or Btrfs.