[E-Business/IT] Smart Business: Recovering From a Site Crash

Hed: Site-Crash Recovery

Deck: Site Crashes Can Burn E-businesses.

Summary: The key to quick site recovery can be found in a number of one-key or remote solutions on the market.

Pull quote: "You know what? Spending $12,000 versus being out of business online for days could mean a big difference." -- Ken Burke, CEO, Multimedia Live

E-business sites can't afford to "disappear" for hours at a time. Just as in brick-and-mortar retailing, all the service and quality in the world won't matter if your door is closed.

Experts estimate that as many as 1,000 small business Web sites crash every day. The smart ones have backup systems and site recovery plans already in place.

When huge retail sites like eBay, Amazon and Microsoft.com go down, they have crisis teams ready for action. For instance, when eBay crashed for the third time in late 1999, its technicians swooped in to route Web traffic to a separate site with an apology and an explanation. Then they used diagnostic software to identify and fix the hardware failure that had caused the crash. They rebooted the system, verified that all databases were working, and put the site back online.

There was only one problem: eBay users couldn't post bids or list new items. A quick check told the techies that a portion of the network was still down. Engineers traced it to a corrupted system file and worked through the night to rebuild it. By morning the site was back online -- but the company had taken a $2 million hit.

The Cost of Downtime

Even for small e-tailers, unexpected downtime can mean a loss of precious income. And there's no way to set a price on the customers who try to log onto a site, can't get in, and take their business elsewhere -- permanently.
If you think you can't afford a Web site recovery plan, consider this: 30 percent of e-businesses experiencing a site crash go out of business within 18 months, reports Security Web Sites, a Maryland company helping businesses protect against unexpected site shutdowns.

Site crashes can be caused by anything from seasonal overload to attacks by evil hackers, hardware failure, careless programmers…or the floor manager kicking a power plug loose.

"There are a lot of pieces to the puzzle," says Blake Barthelmess, engineering manager for retailer Nordstrom.com. "You could have an issue with your carrier, the ISP where its services fail, or perhaps they made a change in their environment that gated out problems to everyone else."

A lock-up can even happen if your credit card transaction company's server goes down.

The Puzzle Pieces

Some crashes can be fixed simply by rebooting your server -- in which case your biggest problems are angry customers and sales you lost while offline. Others can cause fatal corruption in your operating or data files, and call for stronger preventative measures.

How do you tell the two types apart?

"Monitoring, either by a third party or by software packages such as Net IQ, Computer Associates, or Sitescope," says Barthelmess. "Also, the people who write (the software running on) your Web site should be able to tell you what symptoms are indicative of what problems. Inability to check out is usually a database problem, lag time (means) overload."

"Simple" site crashes (the ones that require only a restart to cure) are often caused by a sudden surge of orders that "overload pipes on the Web site and cause slowing," he says. The most common cure is simply to have your Web hosting service stop and restart your site software. If that doesn't work, ask to have the hosting service itself restarted.

Database corruption, on the other hand, is a nastier can of worms.

"Most database corruption is a result of some sort of physical error," Barthelmess says. "Once files are corrupted you may need an engineer or special software to find and repair the damage."

Investing an Ounce of Prevention

Ask experts how to recover from a crash and they usually start out with some straight talk about prevention.

"If something goes up in smoke," says Tate Dodge, a senior systems engineer at Nordstrom.com who has worked as a consultant to many smaller companies, "you have to have equipment standing by. Even if it means just holding onto your older technology, you have to have some kind of backup in place."

What kind of backup? Start by talking to an expert. Winetasting.com, which represents 50 Napa Valley area wineries on its site, called in Multimedia Live, of Petaluma, Calif., to help craft its recovery plan.

To begin, Multimedia worked with Winetasting's ISP so that in case of a crash, visitors would be sent to another site with an apology for technical difficulties.

"It's much better to have an 'under-construction' notice than the dreaded 404 error," says Joe Pierini, Multimedia network administrator. "If you can at least address that first step, your users won't go away thinking: 'Oh my gosh, they're not around any more.'"

Next, the company made sure Winetasting's ISP had backup power.

"One of the issues we're dealing with now is the rolling blackouts in California," says Multimedia CEO Ken Burke. "We looked for an ISP that had four-day backup power generators."

Finally, it had Winetasting.com plunk down $6,000 for a second dedicated server. Now, in case of a critical crash, it can switch nearly instantaneously to a duplicate site until the problem is repaired -- with no lost customers or cancelled transactions.

"Hosting systems rarely go down, but individual boxes within the hosting system can," Burke says. "That's why it's recommended to actually host your site at two boxes. They cost about $6,000 each, but you know what? Spending $12,000 versus being out of business online for days could mean a big difference."

Nordstrom's Personal Site Shopper

Some companies, like Nordstrom.com, take extreme precautions against site crashes, no matter the cost. For instance, Nordstrom maintains dozens of servers for different purposes, uses monitors and filters to protect those servers, and backs up several terabytes of information each day. Its high-level security team can put the dot-com back in business 10 hours after a total system meltdown.

But thanks to a contract with Boston co-host Akamai Technologies, all of this rebuilding is hopefully beside the point. Akamai stores Nordstrom.com's content on remote servers, so in case of a problem with the primary site, visitors will be transferred to Akamai's nearby servers without any hint that the primary site is down.

Akamai's clients include some of the world's largest companies -- FedEx, General Motors, Yahoo! -- but its service can be a boon to smaller businesses with limited financial resources. A monthly contract guarantees visitors will be unaware of site crashes, and can cost as little as $500 a year.

"It's your choice," says Akamai product manager Andrew Lickly. "You can make capital expenditures on hardware and add extra staff to accommodate seasonal traffic spikes a few times a year -- or you can pay a monthly fee to Akamai, at considerable cost savings."

A Pound of Cure

"Recovery" is the name of the game for the new crop of site reconstruction and recovery planning specialists. Services range from free online advice (such as Security Web Sites' 10-point questionnaire) to $39.95 (for a "Web Security" CD-ROM from SmartBooks.com), $595 (for ready-made recovery policies, a contingency management planner, and risk assessment tools from Security Web Sites) and thousands of dollars (co-hosting services where you run your software on computers maintained and backed up at remote sites dedicated to keeping you online).
To quickly recover from a site crash, experts recommend using one or more of the following strategies:

1) Create a backup of Web site HTML and data files on your Web server, and update it frequently. Cost: $0.00.

2) Maintain a separate server with a complete, up-to-date copy of all your HTML and data files. It's best to keep the backup server at a separate location to guard against disasters like fire or flood. Cost: $6,000.

3) Install a remote diagnostics package such as UnixWare 7 Air-Bag from UniTrends Software in Myrtle Beach, S.C. This package lets your technicians do off-site repair and troubleshooting via modem. Cost: $364 to $750.

"It will give you a nice little report," says UniTrends President Steve Schwartz. "It will tell you if the system recognizes the hard drive, if it sees all the memory, if there is any corruption in the files. It will check out the root directory and bootable directories to see if they're okay, and it will tell you if you have any critical files missing from your system that it needs to boot."

The package also has a totally automated restore feature that rebuilds things from the previous night's backup tape.

4) If a catastrophic failure wipes out all of your operating files, and you have no backups, there are companies that will restore hard drives -- but they aren't cheap. Prices range from $5,000 to $15,000. Check the phone book under "Data Recovery." Two biggies are Action Front Data Recovery Labs, in Atlanta, and Excalibur Data Recovery near Boston. Both have 24-hour toll-free hotlines for instant service.

But the bottom line is simply: Be prepared.

"When you're building your system," Dodge says, "you just have to keep asking, 'What if? What if?' Some things you can't plan for -- like if a backhoe takes out the fiber optic link in four states. But you have to be able to adapt to that. You control what you can, and even the things you can't control, you build those into your plan and know what you would do."
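For readers who want to see what the no-cost option (strategy 1, a frequently updated backup of the site's HTML and data files) might look like in practice, here is a minimal sketch in Python. It is an illustration only, not a method any of the sources describe: the function name `backup_site`, its parameters, and the folder layout are all hypothetical, and it assumes the site's files live in a single directory that can simply be copied.

```python
import shutil
import time
from pathlib import Path


def backup_site(site_dir: Path, backup_root: Path, keep: int = 7) -> Path:
    """Copy site_dir into a new timestamped folder under backup_root.

    Hypothetical helper: only the `keep` most recent copies (keep >= 1)
    are retained, so the backups do not slowly fill the disk.
    """
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"site-{stamp}"

    # If two backups start within the same second, add a numbered suffix.
    suffix = 1
    while target.exists():
        target = backup_root / f"site-{stamp}-{suffix:03d}"
        suffix += 1

    # Copy every HTML and data file, preserving the directory structure.
    shutil.copytree(site_dir, target)

    # Folder names sort chronologically, so the oldest copies come first.
    copies = sorted(
        p for p in backup_root.iterdir()
        if p.is_dir() and p.name.startswith("site-")
    )
    for old in copies[:-keep]:
        shutil.rmtree(old)
    return target
```

Run on a schedule (nightly, say), a script like this covers only the first strategy. Because the copies sit on the same machine as the site, it does nothing for the fire-or-flood scenario that the separate backup server in strategy 2 is meant to address.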
SIDEBAR/RELATED ARTICLE:

HED: Questions to ask your ISP

An experienced Internet service provider may be the site savior an e-business needs. Experts at Web developer Multimedia Live say that asking a few strategic questions up front can help you determine if the ISP has the security features you need to recover in case of a crash.

1. Who provides your Internet backbone -- the actual phone lines that connect the ISP or co-host to the Internet? Are there multiple backbones leading to that co-location center? "If someone is jack-hammering on the street and cuts the PacBell link -- and this happened four years ago in Santa Clara -- you still have an AT&T link to fall back on," says Multimedia CEO Ken Burke.

2. Does the ISP have backup generators in case of a power outage?

3. Are there people on-site 24 hours a day, seven days a week, who can respond immediately if there is a problem? "Make sure you can pick up the phone and talk to tech support any time of the day or night," says Joe Pierini, manager of information systems.

4. Does the ISP offer "managed services" -- an on-staff technical consultant who can address hardware and software problems?

5. Does the ISP have any "peering partners"? Big companies like Exodus Communications have peering partners -- a direct connection with other ISPs constituting a private network that can supercharge the speed of your site. "That's a big deal for small businesses, who often say their site is too slow," Pierini says.
Related Links

<a href="http://www.securitywebsites.com/TestQuestions/StaffTest.htm">Informal test from Security Web Sites to gauge employee awareness of recovery strategies</a>
<a href="http://www.akamai.com">Akamai</a>
<a href="http://www.securitywebsites.com/">Security Web Sites</a>
<a href="http://www.smartbooks.com/t-progsecurity.htm">Smart Books' "Web Security" software</a>
<a href="http://www.mmlive.com">Multimedia Live</a>
<a href="http://www.netiq.com/default.asp">Net IQ</a>
<a href="http://www.ca.com/services/">Computer Associates Services</a>
<a href="http://www.a-keyword-optimization.com/sitescope.html">Sitescope</a>
<a href="http://www.brs.ibm.com">IBM Business Recovery Services</a>
<a href="http://www.alw.nih.gov">Center for Information Technology</a>
<a href="http://www.actionfront.com/track-ip.html">Action Front Data Recovery Labs</a>
<a href="http://www.excaliburdatarecovery.com/">Excalibur Data Recovery</a>

SOURCES

Eprise Corp.
200 Crossing Blvd.
Framingham, MA 01702
Phone: 508-661-5200 or 888-658-7037

Jeremy Allaire, CTO and co-founder
C/o Jeannine McDonough, director of corporate communications
Allaire Corp.
Phone: 617-219-2026
Fax: 617-219-2008
E-mail: jmcdonough@allaire.com
Web site: http://www.allaire.com

Blake Barthelmess, engineering manager AND Tate Dodge, senior systems engineer
Nordstrom.com
Seattle, Wash.
Phone: (206) 215-7403, 215-7428

Daren Goebel, Web site operations manager - doesn't want name used
Nordstrom.com
Seattle, Wash.
Phone: (206) 215-7445

Donna Scott, VP of software infrastructure
GartnerGroup (Gartner Dataquest research)
56 Top Gallant Road
Stamford, Connecticut
Phone: (203) 964-0096
Fax: (203) 324-7901
Web site: http://www3.gartner.com/Init

David Peterschmidt, president and CEO
C/o Kevin Brown, media contact
Inktomi Corp., the largest network-cache provider in the world – WILL NO LONGER DEAL WITH SMALL BUSINESSES/ITS SMALLEST PACKAGE IS NOW $100,000
San Mateo, Calif.
Phone: (650) 653-2825 or ext. *84578#
E-mail: kevin@inktomi.com

Ken Burke, president and CEO AND Joe Pierini, manager information systems
Multimedia Live
Petaluma, CA
Phone: (707) 773-3434
Web site: mmlive.com

J. R. Hawkins Jr., owner
Security Web Sites Inc. (as little as $595)
Gaithersburg, Maryland
Office: (301) 756-4050, ext. 0858
Home: (301) 294-8784
Fax: (801) 749-4837
E-mail: information@securitywebsites.com
Web site: www.securitywebsites.com

Steven W. Schwartz, president
C/o Joelle McCullough, vice president of marketing
UniTrends Software Corp.
1601 Oak Street, Suite 201
Myrtle Beach, SC 29577
Phone: 843-626-2878, Ext. 717
E-mail: joelle@unitrends.com
Web site: http://www.unitrends.com

Jeff W. Young, director of public relations
Andrew Lickly, "one of our product managers"
Akamai Technologies, Inc.
Phone: (617) 250-3913
E-mail: jyoung@akamai.com