During the recovery efforts following Hurricane Ike, organizations that had conducted comprehensive tests of their recovery plans were operational in the allotted time with very few, if any, complications. One of the benefits observed within the organizations that tested was a sense of “operational familiarity” between the recovering organization and the Disaster Recovery (DR) provider.
A simple way to define operational familiarity: the personnel of two different organizations understand how to work best with each other and develop a sense of reliance and trust in each other’s abilities. In many situations, the operational familiarity between the organizations was just as important as the technical know-how of the personnel and properly functioning equipment.
Testing gave organizations and DR coordinators a “point of reference.” When issues arose, it was not uncommon for one to say to the other, “Let’s configure it like the last time we tested.” Having conducted comprehensive testing, both organizations were aware of each other’s capabilities and personalities, and knew how best to work with each other.
In addition to reducing recovery time, operational familiarity created a sense of purpose and experience among those who had already completed a test. Each department knew what was expected of it, knew that it could be done, and knew what the others were doing. Additionally, the company was aware of the full extent of support the DR provider could offer. However, several organizations with untested or insufficiently tested DR plans did not fare as well.
In an untested disaster recovery environment, hardware and software issues are probably easier to resolve than personality conflicts. Finding out in a disaster situation that a server isn’t functioning properly would be extremely disruptive to meeting a set recovery time. Just as disruptive, if not more so, is discovering that an individual responsible for an aspect of an organization’s recovery does not work well with others under pressure.
To illustrate the point: One organization completed a comprehensive test about a month before Hurricane Ike. During that test, the organization discovered some software issues that it was able to resolve. After Hurricane Ike, the organization declared a disaster and was fully operational within hours of a mobile recovery center being on site. Because of testing, the recovery was a success.
On the other hand, another organization never tested and then declared a disaster after Hurricane Ike. The mobile recovery center arrived on site and was ready to be populated; all of the DR provider’s hardware and software were in order within a few hours of deploying to the site. However, the company was experiencing a software error on its end. The glitch probably would have been revealed had prior testing occurred.
Instead of being able to install and bring up several workstations simultaneously, the IT director had to bring each one up individually, which took longer than the two hours projected for bringing up the workstations. Because no testing had taken place, the IT director was not comfortable accepting suggestions from the DR personnel. The recovering organization then sent another specialist, from a city 200 miles away, to the site to provide a new perspective on the problem.
The new specialist arrived on site and fixed the problem within the hour. Within four hours, the site was fully operational. The solution the new specialist implemented was the same solution the DR provider personnel had recommended.
Another observation: organizations that tested recovered faster, which freed up time and resources to solve logistical and personnel issues. Hurricane Ike severely hampered supply lines for everything from gasoline to food. Power was out for several weeks in some areas, and many places had to scramble to find portable toilets. Finding fuel for vehicles, let alone generators, was difficult. Because of testing and trust in their DR provider, these organizations did not have to worry about the technical side of business recovery.
To create operational familiarity, tests must be as realistic and stressful as a controlled setting allows; for example, a test might count as successful only if challenging timelines are met. The test should closely resemble an organization’s actual recovery plan. An organization’s leadership also has to support the testing and establish a serious tone for it.
If a DR provider only allows a customer to “test” in the DR provider’s parking lot, or does not test all parts of the recovery plan, several variables are left undetermined until a disaster occurs. Organizations should be able to “declare” a test, as they would an actual disaster. Additionally, if a DR provider makes simply testing a challenge, it raises the question of how responsive the DR provider will be in an actual disaster.
Testing enables an organization to verify that, in a disaster, its business operations will be taken care of, allowing the organization to focus on the actual impact of the disaster. Testing also fosters confidence, competence, communication and understanding (in short, operational familiarity) between the two organizations’ personnel. Waiting until a disaster to implement an untested plan could end up being the costliest “test” of a DR plan.
Published in Incident & Crisis Response