In addition to a 40-year career as a software engineer specializing in operating systems, I've also managed corporate and personal computer systems. It should be no surprise, then, that my primary home computer system has RAID 1 arrays for several critical file systems.
Some years ago, while far from home, I wanted to demonstrate the utility of RAID arrays to a client. As I explained the uptime benefits in case of a failure, I logged into this system remotely and displayed the array status. To my surprise, I found that one of the drives in the array had failed! The system continued to run, of course, because of the redundancy. This perfectly illustrated my point to my client.
RAID arrays need periodic checking to identify any errors that may have developed. For years this was kicked off by a cron job at 1 am local time on the first Sunday of each month. However, apparently with the adoption of systemd a few years ago, the start time changed. It now starts at a random time in the 24-hour period after 1 am.
Today it randomly started at 9:33 am, which was shortly after I sat down at my computer. It's now 12:30 pm, and these checks will run for another two hours. While they run, applications randomly freeze as they compete for access to the file systems.
What brain-dead idiot thought that starting this at a random time on a Sunday was a good idea?