Wednesday, January 19, 2011

Panic, Paranoia, and Planning

I said I was going to write about tar, zip, and rsync, but I decided this needed to go first.  I used to ask my clients, "How long can you afford to be down?"
They always said never.  Wrong answer.  The truth is, everyone can afford to be down for some period of time, as long as you know what that period is and plan ahead for it.
Whether it is 1 minute, 1 hour, or 1 day, every network admin or sysadmin schedules downtime sooner or later for a specific server, group of servers, or application.
Backups are for the unplanned times, when hardware fails, a worm attacks the system, or data gets accidentally overwritten.

Backups are an insurance policy.  You only need it when things go wrong.
Very large corporations use multiple server farms (cloud computing) to store and back up data.  Most of us use multiple servers, RAID systems, off-site data mirroring, and/or other methods.  Each of these is a form of backup.

So, let us start with a common sense approach. 
How much gross income did the company make in 2010?
There are approximately 250 work days in a year. 
If the gross was $250000, then the average is $1000 per day gross income.
If the gross was $1 million, then the average is $4000 per day gross income.
If it costs you about $4000 per day for your business server to be down, then you can easily justify a $4000 backup plan.
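
The arithmetic above can be sketched in a few lines of Python.  This is only an illustration of the rule of thumb, not a real costing tool; the function names are made up for this example, and the 250 work days is the same approximation used above.

```python
WORK_DAYS_PER_YEAR = 250  # approximation used in the post


def daily_gross(annual_gross, work_days=WORK_DAYS_PER_YEAR):
    """Average gross income per work day."""
    return annual_gross / work_days


def downtime_cost(annual_gross, days_down, work_days=WORK_DAYS_PER_YEAR):
    """Rough gross income lost while the business is down for days_down work days."""
    return daily_gross(annual_gross, work_days) * days_down


if __name__ == "__main__":
    for gross in (250_000, 1_000_000):
        print(f"${gross:,} per year is about ${daily_gross(gross):,.0f} per work day")
```

Plug in your own gross figure and compare the result with the price of the backup plan you are considering.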

So, let's start at the basic hardware level.
Does the company lose time and money if a certain hard drive fails?
If the answer is yes, then mirror that drive, or put it in a RAID array with parity.  (Striping alone gives no protection against a failed drive.)
If the drive is just a convenient temporary bucket for non-critical data,
the answer might be no.  Just keep a spare drive on hand.


What about the next level: whole servers?
If your server mainboard fails unexpectedly, how much would it cost you in downtime? (Real dollars!)
Maybe it is time to mirror that server with another complete server, on site or off site. Or will a copy of all critical data, updated every 24 hours to a secondary server, be enough to keep you going?


What about a failed router, web server, or email system?
What if your UPS battery fails, causes a short in the system, and downs everything attached to it? (Rare, but I have experienced it.)

And I have seen new hardware fail.  A box being new does not guarantee that it will work.


Write down (on paper and in red ink) the time lost in hours and days, and the cost in real dollars.  Be logical and think it through.  This will help you make your decision.
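
If you would rather start in a script before committing it to paper, here is a minimal sketch of such a worksheet.  Every number in it is a placeholder: the hour estimates are invented for illustration, the 8-hour work day is an assumption, and the $4000 daily gross is just the example figure for a $1 million company.  Substitute your own.

```python
HOURS_PER_WORK_DAY = 8  # assumption, adjust for your business

# Hypothetical downtime estimates in work hours; replace with your own.
scenarios = {
    "hard drive failure": 4,
    "server mainboard failure": 24,
    "router failure": 2,
    "UPS failure": 8,
}


def lost_dollars(hours_down, daily_gross, hours_per_day=HOURS_PER_WORK_DAY):
    """Gross income lost for a given number of work hours of downtime."""
    return hours_down / hours_per_day * daily_gross


def worksheet(scenarios, daily_gross):
    """Return (scenario, hours, dollars) rows, most expensive first."""
    rows = [
        (name, hours, lost_dollars(hours, daily_gross))
        for name, hours in scenarios.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for name, hours, cost in worksheet(scenarios, daily_gross=4000):
        print(f"{name:<26}{hours:>4} h   ${cost:>9,.0f}")
```

Sorting by cost puts the failure that hurts most at the top, which is usually the one to plan for first.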

It is much better to plan now than to be in the middle of a panic attack because a mission-critical server is down and there are no spare parts.
Plan for the best and worst case scenarios, and sleep well.

Jim
