(2012-03-05, 22:36)TugboatBill Wrote: The odds are that if you have 2 drives fail at once one of them will be the parity drive. Why? Because it gets the most use as it is updated every time there's a change to any of the drives in the array.
The odds of a second drive failing at the same time are high, because often multiple DRIVES FAIL FOR THE SAME REASON. Take the nuclear reactor meltdown in Fukushima: the plant had backups, but all the backups failed because of the same tsunami. The same thing happens with other systems too.
Same thing with hard drives. Your power supply starts producing out-of-spec power, there is a surge, someone kicks over the server and it falls on its side, someone spills coffee on it, the operating temperature range of the drives is exceeded, or any of a million other things happens that could cause a drive to fail. In almost all of these cases it is highly likely that a second drive will be affected by the same thing that affected the first one.
The odds of an average, functional hard drive failing on its own, for no reason other than old age, are less than 5% per year (for some drives as low as 1%), so two drives failing at the same time for that reason alone is rarely going to happen. But this is far lower than the combined odds of all the other possible situations that could cause a failure. In reality two or more drives often do fail at the same time, for example when rebuilding an array and hitting bit errors. There are also situations where a failure already existed, but you only discover it at that moment because you only try to access the data when you need it.
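To put rough numbers on the "old age only" case, here is a quick Python sketch. It assumes drive failures are completely independent (which is exactly the assumption I'm arguing doesn't hold in practice) and uses the 5% per year figure from above. The 6-drive array size and the function name are just made up for illustration.

```python
from math import comb

def p_at_least_k_failures(n_drives, annual_rate, k=2):
    """Chance that at least k of n drives fail within a year.

    Assumes every drive fails independently with the same annual rate,
    i.e. no shared causes such as power surges, heat, or spilled coffee.
    """
    # Probability of seeing fewer than k failures (binomial distribution).
    p_fewer = sum(
        comb(n_drives, i) * annual_rate**i * (1 - annual_rate) ** (n_drives - i)
        for i in range(k)
    )
    return 1 - p_fewer

# Hypothetical 6-drive array at 5% per drive per year.
print(f"{p_at_least_k_failures(6, 0.05):.2%}")
```

For the hypothetical 6-drive array this comes out to a few percent per year for two or more failures, and that is for failures anywhere within the same year, not at the same moment. Shared causes are what push the real number up.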
In a carefully and professionally managed hosting environment, drive failures can be kept below 5% per year. But for other types of users, the odds of two drive failures are sky high (more than 10%).
So yes, a parity drive being accessed at the time of a drive failure slightly increases the risk of that drive failing too. But think of all the other risks combined that can cause data corruption or outright failure: they are even higher, much higher than you think.
We can easily predict the odds of one particular failure - but how well can you predict the overall risk, from all causes?
My point is that without a second parity drive option, a single parity drive just isn't enough once you have a lot of data (more than 10TB). When one drive fails and you need to rebuild, the odds of a second failure are astronomical; with only single parity, it's practically guaranteed that you will lose data.
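Here is a rough way to estimate the rebuild risk from bit errors alone, as a Python sketch. The unrecoverable read error rate of 1 per 1e14 bits is my assumption (a figure commonly quoted on consumer drive spec sheets, not something measured here), and the function name is made up. It also assumes errors are independent and that the whole amount of data has to be read back during the rebuild.

```python
from math import expm1, log1p

def p_read_error_during_rebuild(data_tb, ure_per_bit=1e-14):
    """Chance of hitting at least one unrecoverable read error
    while reading data_tb terabytes during a rebuild.

    ure_per_bit is an ASSUMED spec-sheet error rate (1 per 1e14 bits);
    real drives can be better or worse than this.
    """
    bits_read = data_tb * 1e12 * 8  # decimal terabytes -> bits
    # 1 - (1 - p)^n, computed via log1p/expm1 to avoid rounding loss
    # when p is tiny.
    return -expm1(bits_read * log1p(-ure_per_bit))

for size_tb in (2, 10, 20):
    print(size_tb, "TB:", f"{p_read_error_during_rebuild(size_tb):.0%}")
```

Under these assumptions, reading 10TB already gives better than even odds of hitting at least one read error, and the number keeps climbing with array size. That is before counting any of the shared-cause failures described above.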
So I really hope they add this feature. It's needed.