Small Business IT Support Continuity Planning Guide
Any small business that relies on an IT service infrastructure must account for its resilience amid disasters and disruptions through the company’s Business Continuity Plan (BCP). The BCP ensures that business keeps running smoothly when hazards interfere with managed IT operations, mitigating or circumventing the impact on customers and revenue.
A BCP outlines procedures for establishing preventative controls, restoring business processes, and backing up data. Regular testing, training, and maintenance ensure plan effectiveness while preparing employees for emergency scenarios.
- What is a Business Continuity Plan (BCP)?
- BCP Process Components
- 1. Contingency Planning Policy Statement
- 2. Business Impact Analysis
- 3. Preventative Controls
- 4. Contingency Strategies
- 5. Information System Contingency Plan
- 6. Plan Testing
- System Impact Levels
- 7. Plan Maintenance
- Final Thoughts
What is a Business Continuity Plan (BCP)?
A Business Continuity Plan outlines preventative and disaster recovery measures in the event of a threat to a company’s internal systems or infrastructure. The ultimate goal of a BCP is to enable a business to operate as usual despite disruptions.
A BCP is essential to avoid revenue losses and keep processes moving smoothly. Before developing a BCP, it’s necessary to identify the potential risks that could harm the business. Then, a company will need to do the following:
- Identify how the risks can interfere with operations
- Select appropriate risk-mitigation procedures
- Establish testing guidelines to ensure the selected methods are effective
- Keep processes up-to-date
Resilience describes a company’s ability to adapt to external changes, disasters, and attacks. Some of the most significant risks a business can face are cyber threats, also called computer network attacks. The malicious code used in these attacks interferes with integral operational systems, causing damaging data loss or manipulation.
BCP Process Components
An effective BCP requires specific process components, based on clearly defined principles. Below are the seven elements of the BCP process:
- Develop the contingency planning policy statement. Having a formal policy in place provides the foundation necessary to create an effective contingency plan.
- Conduct the Business Impact Analysis (BIA). The BIA supports the organization’s business processes by identifying and prioritizing information systems and their essential components.
- Identify, implement, and maintain preventive controls. These proactive initiatives decrease the chance of disruptions and increase system availability. This step will help reduce the costs of the contingency life cycle.
- Create contingency strategies. A detailed recovery strategy ensures that the system is recovered quickly and efficiently after a disruption.
- Develop an information system contingency plan. This plan needs detailed outlines of procedures for restoring a disrupted system. The information system contingency plan should be tailored to a company’s unique recovery requirements and Impact Level.
- Ensure plan testing, training, and exercises. Testing is essential to ensure that the recovery process is functional. Additionally, staff requires training and regular exercises that prepare them for effectively executing the recovery plan.
- Ensure plan maintenance. Business Continuity is an ongoing process and requires frequent updates that stay consistent with organizational changes.
1. Contingency Planning Policy Statement
The contingency planning policy statement establishes a framework for a company’s continuity planning, defines requirements, and outlines the responsibilities of the system owners. The contingency planning policy should include the following:
- Backup and storage frequency
- Plan maintenance schedule and guidelines
- Resource requirements
- Roles and responsibilities of personnel and senior management
- Scope of the plan and the organizational functions it covers
- Training requirements
- Testing schedule
The contingency policy should align with corporate policies that affect system operations, information system security, emergency preparedness, human resources, and physical safety.
2. Business Impact Analysis
The Business Impact Analysis (BIA) identifies and prioritizes system components. The BIA correlates each system with the business processes it supports to determine the impact on the organization if the system were unavailable. There are three steps involved in completing a BIA:
- Determine recovery criticality. How critical is recovery for the business processes that the system supports? Identify which operations the system supports and how a system disruption would affect them, then estimate the tolerable downtime.
- Identify resource requirements. What is required to perform a recovery? Evaluate the time and resources necessary to restore business processes as fast as possible. Resources that should be identified include data files, equipment, facilities, personnel, software system elements, and vital records.
- Identify recovery priorities for system resources. Which business processes need to be restored first? Use the results from the first two steps to determine which processes are linked to critical operations. Establish priority levels to sequence recovery procedures most efficiently.
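The third step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BusinessProcess` class, its field names, and the sample processes and hours are hypothetical, and ordering purely by tolerable downtime is one simple way to sequence recovery, not a prescribed method.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessProcess:
    """One row of a hypothetical BIA worksheet."""
    name: str
    tolerable_downtime_hours: float                      # estimated in step 1
    resources: List[str] = field(default_factory=list)   # identified in step 2


def recovery_order(processes: List[BusinessProcess]) -> List[BusinessProcess]:
    """Step 3: restore the process with the least tolerable downtime first."""
    return sorted(processes, key=lambda p: p.tolerable_downtime_hours)


# Hypothetical sample data, for illustration only.
processes = [
    BusinessProcess("Payroll", 72, ["HR database"]),
    BusinessProcess("Online orders", 4, ["Web server", "Orders database"]),
    BusinessProcess("Email", 24, ["Mail server"]),
]

for rank, process in enumerate(recovery_order(processes), start=1):
    print(rank, process.name, process.resources)
```

In practice a business would likely weigh other factors too, such as dependencies between systems, but a sorted list like this is a reasonable starting point for the priority levels.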
Three measures describe how much downtime and data loss an information system can tolerate:
- Maximum Tolerable Downtime (MTD) is the maximum total amount of downtime that the system owner is willing to accept. This time helps determine suitable recovery methods and procedures.
- Recovery Time Objective (RTO) is the maximum amount of time that a system can remain down before there’s a negative impact on system resources and the corresponding business processes. This figure is essential for choosing the appropriate technology.
- Recovery Point Objective (RPO) is the point in time before a disruption to which data can be recovered, as determined by the most recent backup. For example, a company backs up its systems once a day, every evening. A disruption occurs in the afternoon, and all the changes made since the previous evening’s backup are lost. Companies need to determine how much data loss they can withstand during the recovery process.
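The backup example above can be worked through as a short Python sketch. The function names, dates, and thresholds are hypothetical. The check that the RTO should not exceed the MTD is an added assumption, included because a recovery target longer than the tolerable downtime would make little sense.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disruption: datetime) -> timedelta:
    """Work done since the last backup is what a disruption would lose."""
    return disruption - last_backup


def meets_rpo(last_backup: datetime, disruption: datetime, rpo: timedelta) -> bool:
    """True if the lost work fits within the acceptable data-loss window."""
    return data_loss_window(last_backup, disruption) <= rpo


def meets_rto(recovery_time: timedelta, rto: timedelta, mtd: timedelta) -> bool:
    """True if recovery finished within the RTO, and the RTO is within the MTD
    (the second condition is an assumption of this sketch)."""
    return recovery_time <= rto <= mtd


# Hypothetical timeline: backup at 8 p.m., disruption at 2 p.m. the next day.
last_backup = datetime(2024, 3, 4, 20, 0)
disruption = datetime(2024, 3, 5, 14, 0)

print(data_loss_window(last_backup, disruption))                 # 18:00:00
print(meets_rpo(last_backup, disruption, timedelta(hours=24)))   # True
print(meets_rpo(last_backup, disruption, timedelta(hours=4)))    # False
```

With a single nightly backup, the worst-case loss approaches 24 hours, so a business that can only tolerate four hours of lost work would need more frequent backups.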
3. Preventative Controls
It’s possible to prevent some of the risks identified in the BIA through preventative measures. Some examples of preventative controls include:
- Uninterruptible Power Supplies (UPS) that provide short-term backup power
- Generators that provide longer-term power backup
- Smoke, fire, and carbon monoxide detectors
- Water detectors in the computer control room
- Emergency off-switch for master system shutdown
- Off-site backup of electronic records and system documents
- Sufficient air conditioning to prevent system component overheating and failure
- Fire-resistant, waterproof containers for essential non-electronic records
- Advanced security controls like cryptographic key management
- Frequently-scheduled data backups
Always choose proactive initiatives that are cost-effective. Depending on the type of system and its configuration, some measures may be more feasible than others.
4. Contingency Strategies
Businesses develop contingency strategies to reduce planning risks associated with backup, testing, recovery, and maintenance. When creating and comparing approaches, it’s essential to consider the expenses, maximum downtimes, recovery priorities, integration with other contingency plans, and security.
There are five different types of disaster recovery sites:
- Cold Sites are backup facilities at a location separate from the primary place of business. A cold site is usually an office. These sites may not have all the equipment necessary to restore operations quickly.
- Warm Sites are equipped with data centers, but not with customer data. A business typically uses this site to restore its technology infrastructure by introducing customer information.
- Hot Sites have all the specific hardware, software, and equipment necessary to resume business as usual after a disruption.
- Mobile Sites are customizable, self-contained, and transportable sites.
- Mirrored Sites essentially create a real-time system copy.
The most effective disaster recovery site is the mirrored site, which ensures close to 100% availability. However, these sites are the most expensive to maintain. Hot sites are also ready to use immediately following a disaster but require the relocation of staff. Fixed sites should be located in areas that won’t be affected by the same disruptions as the primary business location.
Cold sites are the most affordable option but require considerable time and resources to install all the necessary equipment. Warm sites, which are partially equipped, demand fewer resources to set up than cold sites and are less expensive than hot or mirrored sites. Mobile sites can be delivered to any desired location but can take up to 24 hours to reach their destination.
5. Information System Contingency Plan
The Information System Contingency Plan contains detailed instructions and recovery procedures according to the severity of the disruption. The table below shows guidelines for identifying links between impact level, recovery priority, and the appropriate strategy.
|Impact Level|System Priority|Backup Method|Recovery Strategy|
|---|---|---|---|
|Low|An impact that causes minimal damage or disruption|Tape backup|Relocate or cold site|
|Moderate|A disaster that moderately affects an organization|Optical backup, WAN/VLAN duplication|Cold or warm site|
|High|An impact that causes significant damage to system operation|Disk duplication and mirrored systems|Hot site|
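The table's guidelines can also be written as a small lookup, which is handy when the impact level comes from another tool or spreadsheet. This is a sketch only: the dictionary and function names are hypothetical, and splitting each row into a backup method and a recovery site is one interpretation of the table.

```python
# Guidelines transcribed from the table above. The split into "backup" and
# "site" entries is an interpretation made for this sketch.
RECOVERY_GUIDELINES = {
    "low": {
        "backup": ["tape backup"],
        "site": ["relocate", "cold site"],
    },
    "moderate": {
        "backup": ["optical backup", "WAN/VLAN duplication"],
        "site": ["cold site", "warm site"],
    },
    "high": {
        "backup": ["disk duplication", "mirrored systems"],
        "site": ["hot site"],
    },
}


def guideline_for(impact_level: str) -> dict:
    """Return the backup and site guidance for an impact level."""
    key = impact_level.strip().lower()
    if key not in RECOVERY_GUIDELINES:
        raise ValueError(f"Unknown impact level: {impact_level!r}")
    return RECOVERY_GUIDELINES[key]


print(guideline_for("Moderate")["site"])  # ['cold site', 'warm site']
```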
Off-Site Storage Backup
Information systems require regular backups, and contingency policies should outline the frequency of backup procedures. The data backup policy will specify the location of the data storage, media rotation frequency, file-naming conventions, and data transportation practices. Some ways to back up data include network-attached storage, electronic vaulting, and tape library systems.
Storing backup data off-site is good business practice. Cold, warm, and hot sites have fixed storage locations. It’s essential to consider the following factors when selecting an off-site disaster recovery location:
- Geographic location – The site should be far enough away that there is no chance of the backup site being affected by the same disaster.
- Security – How secure are the shipping methods and storage facility? The site must meet data security requirements.
- Accessibility – How long does it take to receive the data from storage during regular operating hours?
- Cost – Evaluate the operational fees, cost of shipping, and other expenses related to disaster recovery services.
- Environment – The facility should be free from extreme temperatures, have fire safety features, and other critical environmental controls.
Commercial storage facilities will typically offer recovery and media transportation services.
When the information system is damaged at the primary business location, hardware and equipment will have to be set up quickly at the alternative site. The following are three strategies for preparing replacement equipment:
- Vendor Agreements – A Service Level Agreement (SLA) with hardware, software, and equipment vendors will outline the necessary steps to be taken following a disruption. These vendors provide temporary equipment leases in an emergency. The agreement should also include a priority status, in case multiple clients are affected by an outage simultaneously.
- Equipment Inventory – Backup equipment purchased and stored at an alternate location can quickly and easily be accessed after a disaster. However, there are significant upfront expenses, and the equipment may become outdated with time.
- Existing Compatible Equipment – Hot sites typically provide equipment similar or compatible with the equipment an organization is currently using.
While purchasing equipment as needed is more affordable, it’s important to note that there are significant recovery times for delivery and set up. On the other hand, storing unused equipment is costly, but allows for business processes to resume almost immediately.
Assembling a Recovery Team
Recovery teams need to have specific roles and responsibilities. The recovery teams will work interdependently to return operations to normal. For this process to work effectively, it’s necessary to have an Information System Contingency Plan (ISCP) Coordinator who makes authoritative decisions. Additionally, the following teams may be required for a comprehensive recovery strategy:
- Application recovery
- Database recovery
- Legal affairs
- LAN/WAN recovery
- Management (including the ISCP Coordinator)
- Media relations
- Network operations recovery
- Operating system administration
- Outage assessment
- Physical/personnel security
- Procurement (equipment and supplies)
- Server recovery
- Transportation and relocation
Ideally, these teams will serve functions similar or identical to their regular responsibilities. All recovery team members need to be aware of cross-team coordination procedures and the purpose of contingency plan processes.
6. Plan Testing
The ISCP needs to be in a constant state of readiness. Plan testing is critical for identifying shortcomings and updating system components. The ISCP coordinator will have to test and reevaluate the following procedures regularly:
- System recovery on an alternate backup platform
- Connectivity inside and outside the company
- System performance with different equipment
- Restoring operations and processes
For the most valuable test results, the coordinator should establish success criteria and test objectives through a test plan. The plan will allow for an adequate assessment of each recovery process. Additionally, the test plan should list the participants and include timelines for completing each scenario.
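A test plan of this kind can be kept as simple structured data. The sketch below is a hypothetical layout, not a standard format: the `RecoveryScenario` class, its fields, and the sample scenarios are assumptions, and it uses a time limit as the only success criterion for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RecoveryScenario:
    """One scenario in a hypothetical ISCP test plan."""
    name: str
    objective: str
    time_limit_minutes: int                  # success criterion for this sketch
    participants: List[str] = field(default_factory=list)
    actual_minutes: Optional[int] = None     # filled in after the test is run

    def result(self) -> str:
        if self.actual_minutes is None:
            return "not run"
        return "pass" if self.actual_minutes <= self.time_limit_minutes else "fail"


def summarize(plan: List[RecoveryScenario]) -> Dict[str, str]:
    """Map each scenario name to its outcome."""
    return {scenario.name: scenario.result() for scenario in plan}


# Hypothetical sample plan.
plan = [
    RecoveryScenario("Restore file server", "System recovery on an alternate platform",
                     240, ["Server recovery"], actual_minutes=195),
    RecoveryScenario("Reconnect branch office", "Connectivity inside and outside the company",
                     60, ["LAN/WAN recovery"], actual_minutes=90),
    RecoveryScenario("Restore order processing", "Restoring operations and processes",
                     480, ["Application recovery", "Database recovery"]),
]

print(summarize(plan))
```

Recording the target alongside the measured result makes shortcomings visible at a glance, which is the point of setting success criteria before the test.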
Tabletop exercises are discussion-based exercises that divide staff into groups to review their roles in emergencies. The event facilitator will present a disaster scenario to the group. Then, staff will have to answer targeted questions related to their roles, responsibilities, decision-making processes, and coordination between teams.
Functional exercises simulate emergencies in which staff have to perform their duties according to the recovery plan. They test specific teams and procedures in the contingency plan, such as communication or setting up emergency equipment. Full-scale exercises will address all elements of the recovery plan and replicate the circumstances of an actual disruption.
System Impact Levels
There are three different security objectives for information systems outlined in the Federal Information Processing Standard (FIPS) Publication:
- Confidentiality – Securing proprietary information and protecting personal privacy
- Availability – Reliable and timely access to data
- Integrity – Protection of stored information from destruction or modification
Three distinct impact levels, called FedRAMP levels, help business owners select the appropriate security measures: low, moderate, and high. These levels provide security categorizations for information systems.
Low-impact systems hold data that was intended for mass distribution. A loss of this type of information is not detrimental to a business and does not jeopardize its mission, finances, or reputation.
A tabletop exercise at an organizational level is sufficient for low-impact systems. The system owner should conduct the simulation, and it should include all the main points of the ISCP.
Moderate-impact systems contain data that is not widely available to the public. Disruptions on this level would have a mild impact on the organization’s ability to continue normal operations.
Moderate-impact systems should complete a functional exercise with the system owner. The scenario should cover all the points outlined in the ISCP, and include an element of system recovery from backup media.
High-impact systems have sensitive, unclassified data that requires protection by law, such as data used in legal, finance, healthcare, and emergency services. A breach in this type of system could have detrimental consequences and potentially shut down operations.
High-impact systems will have to conduct a full-scale functional exercise that includes setting up a backup system at an alternative location. Additional activities include staff response at the recovery location, recovery of a database from backup media, and off-site server processing.
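The exercise requirements for each impact level can be turned into a simple checklist. The sketch below is illustrative: the dictionary, the function name, and the wording of each activity are paraphrased from the descriptions above and are not official terms.

```python
# Exercise type and required activities per impact level, paraphrased from
# the descriptions above. Names and wording are assumptions of this sketch.
EXERCISE_REQUIREMENTS = {
    "low": {
        "type": "tabletop exercise",
        "activities": set(),
    },
    "moderate": {
        "type": "functional exercise",
        "activities": {"recover system from backup media"},
    },
    "high": {
        "type": "full-scale functional exercise",
        "activities": {
            "set up backup system at alternate location",
            "staff response at recovery location",
            "recover database from backup media",
            "off-site server processing",
        },
    },
}


def missing_activities(impact_level: str, completed: list) -> list:
    """List required activities that the exercise has not yet covered."""
    required = EXERCISE_REQUIREMENTS[impact_level]["activities"]
    return sorted(required - set(completed))


print(EXERCISE_REQUIREMENTS["moderate"]["type"])
print(missing_activities("high", ["recover database from backup media"]))
```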
7. Plan Maintenance
The Contingency Plan should be reviewed regularly for accuracy and updated to reflect significant organizational changes. Some elements will require more frequent updating, such as contact lists. Following are some of the aspects of an ISCP that will require regular maintenance:
- Essential records
- Hardware, software, equipment
- Off-site facility requirements
- Operational requirements
- Security requirements
- Team member contact information
- Technical procedures
- Vendor contact information
Copies of the ISCP should be stored with the recovery staff and at an alternate site with other backup media. The ISCP coordinator must maintain a record of the location of the plans and the names of the people who possess them.
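One lightweight way to keep maintenance on schedule is to record when each element was last reviewed and flag anything overdue. The sketch below assumes a hypothetical 180-day review interval and made-up dates. Neither comes from a standard, and elements such as contact lists may warrant a shorter interval.

```python
from datetime import date, timedelta


def overdue_items(last_reviewed: dict, today: date, max_age: timedelta) -> list:
    """Return the plan elements whose last review is older than max_age."""
    return sorted(
        name for name, reviewed in last_reviewed.items()
        if today - reviewed > max_age
    )


# Hypothetical review log, for illustration only.
last_reviewed = {
    "Team member contact information": date(2024, 1, 10),
    "Vendor contact information": date(2023, 6, 1),
    "Technical procedures": date(2023, 11, 20),
}

print(overdue_items(last_reviewed, today=date(2024, 3, 1), max_age=timedelta(days=180)))
# ['Vendor contact information']
```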
Final Thoughts
System disruptions can occur at any time, affecting an organization’s ability to provide services and perform operations. Business Continuity Planning is a vital business component that ensures the functionality of these core capabilities. A BCP that’s effectively communicated, integrated, and synchronized with other continuity initiatives will enhance a company’s resilience and ability to continue operations during a system disruption.