Business Continuity Planning - Part 3

Article by Tom Olzak, CISSP, published Jul 28, 2009

In this final installment in the BCP series, we walk through Steps 4 and 5 of the planning process: testing and managing test results.

BCP continued...

In Part 2 we saw how to develop BCP strategies and a plan for implementing them. The plans are useless if they are not tested via team training and hot site system recovery. Ensuring your plans have value is the topic of this final installment in the BCP series.

Step 4 - Test the Plan

Testing the plan is probably the most important part of BCP, and often the most neglected. Organizations that fail to conduct regular tests can’t reasonably expect their recovery teams to react quickly enough to an actual incident.

“The true measure of success from a business perspective is the pace of recovery. All of our business continuity plans and preparations are aimed at improving response and recovery times in order to reduce the impact on the business” (John Burtles, 2004).

To reach the measure of success defined by Burtles, your testing must target two primary objectives. First, your teams must be so familiar with the recovery process that management intervention is unnecessary except to help teams overcome external obstacles. Second, documentation should be free from inaccuracies, including those caused by mistakes or by changes in the business environment. It’s also a good idea to question all documented activities. Do they represent the most efficient path to recovery?

Testing should be a multi-step process. Without proper preparation, BCP tests will fall far short of your objectives. Effective testing has three elements: educate, maintain, and test.

Educate

Educate each team on the contents of the documents related to their areas of responsibility. Team education should result in:

  1. An understanding of team roles.
  2. Solid knowledge of the processes and steps contained in the documentation.
  3. Elimination of team member resistance and apathy. In many organizations, team members initially see business continuity activities as a waste of time that pulls them away from “real work.” The education process must address this issue by helping team members understand the importance of BCP.
  4. Recommendations from the teams on how to improve the recovery processes.
  5. A high-level view, for team leaders, of how their teams’ activities fit into the overall recovery plan.

Maintain

Between tests, the documentation must be properly maintained. The best way to accomplish this is through an effective change management process. Some of the deliverables of change management are updated configuration and build documentation for infrastructure components, updated process diagrams, and changes to forms required for manual processes. The responsibility for ensuring documentation changes are made and included in the BCP must be clearly defined.

Test

The purpose of testing is to ensure that the documentation is accurate and to increase the awareness of recovery teams. Other reasons to test include:

  1. Recording system recovery times
  2. Identifying and documenting system recovery dependencies – in some cases, systems must be recovered in a specific order to fully recover delivery systems
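Once recovery dependencies are documented, a recovery sequence can be derived from them mechanically. The following is a minimal sketch, not a prescribed method: the system names and the `dependencies` mapping are hypothetical, and it assumes each system lists only the systems that must be recovered before it. It uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical example data: each key maps to the set of systems
# that must be recovered BEFORE that system can be recovered.
dependencies = {
    "network": set(),
    "directory_service": {"network"},
    "database": {"network", "directory_service"},
    "payroll_app": {"database", "directory_service"},
}


def recovery_order(deps):
    """Return the systems in an order that respects every dependency."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # A cycle means the documented dependencies contradict each other.
        raise ValueError(f"circular recovery dependency: {exc.args[1]}") from exc


order = recovery_order(dependencies)
print(order)
```

A circular dependency surfaces as an error here, which is itself a useful finding: it points to a documentation inaccuracy that should be resolved before the next test.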

Once you’re confident your documentation is reasonably accurate, plan the test. You don’t have to successfully restore a system to have a successful test. Remember, testing is designed to raise the awareness of your teams and to identify inaccuracies and inefficiencies in your documentation. However, you should establish certain guidelines and test objectives for each test. The following steps will help with the test planning process:

  1. Establish a test strategy. This should include the type of test and the systems and processes you want to recover. There are three basic types of tests – checklist, walk-through, and hot site. A checklist test is performed by the individuals in your organization most familiar with the process or system being tested. The purpose of this test is to ensure the accuracy of the documentation. A walk-through is typically performed by one or more recovery teams sitting at a conference table. Walking through the recovery documents as though they were actually recovering from an incident increases awareness and helps identify roadblocks to recovery. A hot site test is an actual physical build of infrastructure and business processes.
  2. Clearly define the objectives of the test. Often, the objective is simply to measure how long it takes to recover one or more systems. Other times, you may need to demonstrate you can recover and run a specific task. For example, your objective may be to recover your payroll system and actually print checks. Whatever your objectives, everyone participating in the test should understand what it is they’re trying to accomplish.
  3. Define how the test is to be conducted. Test planning should include criteria governing how the test will be performed. This includes how the documentation should be used, the types of logs and reports each team must complete, and the sequence of events. One challenge you should address is the tendency for teams to recover processes and systems based on memory rather than using recovery documentation. This is usually a bad idea. Planning for worst case scenarios includes planning for recovery situations in which your internal staff might not be available. If the documentation is not tested through strict adherence to it during recovery tests, you probably won’t be able to rely on it in an actual declared disaster.
  4. Select the test team. Selecting the team for the test is relatively easy. The team responsible for the process or technology being tested should conduct the test. Ensure each member of the team understands the test strategy, objectives, and how the test is to be conducted.
  5. Test. Step through each phase of recovery, including initial notification of team members, immediate response through checklist implementation, and full system/process recovery. During a test, the following items should be documented in detail:
    1. Test start time
    2. Time at which each task in the plan is completed
    3. Actual elapsed time to complete each task
    4. Inaccuracies encountered in the documentation
    5. Recommendations for improvement
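The timing items in this list can be captured in a simple log and the elapsed times computed afterward. The sketch below is illustrative only: the `TaskRecord` structure, the task names, and the sample times are hypothetical, and it assumes tasks are performed one after another, so each task's elapsed time is measured from the completion of the previous task (or from the test start for the first task).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class TaskRecord:
    """One logged task from a recovery test (hypothetical structure)."""
    name: str
    completed_at: datetime
    inaccuracies: list = field(default_factory=list)


def summarize_test(test_start, records):
    """Return (task name, elapsed time, inaccuracies) in completion order.

    Assumes tasks ran sequentially, one at a time.
    """
    summary = []
    previous = test_start
    for record in sorted(records, key=lambda r: r.completed_at):
        elapsed = record.completed_at - previous
        summary.append((record.name, elapsed, record.inaccuracies))
        previous = record.completed_at
    return summary


# Sample data, invented for illustration.
test_start = datetime(2009, 7, 28, 9, 0)
records = [
    TaskRecord("Notify team members", datetime(2009, 7, 28, 9, 15)),
    TaskRecord(
        "Restore payroll database",
        datetime(2009, 7, 28, 11, 45),
        ["Restore procedure references a retired server name"],
    ),
]

summary = summarize_test(test_start, records)
total = sum((elapsed for _, elapsed, _ in summary), timedelta())

for name, elapsed, inaccuracies in summary:
    print(f"{name}: {elapsed} ({len(inaccuracies)} documentation issue(s))")
print(f"Total recovery time: {total}")
```

Keeping the inaccuracies alongside the timings means the same log can feed the review of test results without re-interviewing the team.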

Step 5 - Manage Test Results

Using the documentation generated during the test, conduct an After Action Review (AAR). The fundamental purpose of the AAR is to identify and address people, process, and technology issues related to efficient and effective recovery. The output of the AAR is an action plan that, at a minimum, should include the following activities:

  1. Documentation updates
  2. Modifications to agreements with recovery vendors
  3. Changes to processes
  4. Team restructuring

The results of the test, including the AAR action plan, should be communicated to management as soon as possible after the test.

The BCP produced by Steps 1 through 3 is not just a book for the auditors that sits unused on someone’s bookshelf. Regular testing followed by a remediation action plan, Steps 4 and 5, is the cornerstone of an effective business continuity program. This is an incremental, evolving process. Each time you execute the test-manage cycle, your team becomes a little more capable of responding to business continuity events in a way that prevents significant business impact.

BCP Series Summary

Business Continuity Planning is an essential part of ensuring the uninterrupted delivery of products and services. There are five phases to business continuity assurance – analyze your business, assess risks, develop a strategy and plan, test the plan, and manage the results.

Business Continuity Planning is not a one-time project. It is a continuous process that results in incremental improvements in your organization’s ability to effectively recover from unplanned business interruptions.
