Enhancing Digital Operational Resilience: The PIE FARM Approach

By James Seaman and Michael Gioia | January 4th, 2023 (page timestamp: 2023-07-20T20:18:09+00:00)

In today’s interconnected and technology-driven world, businesses are increasingly reliant on digital systems to operate efficiently. However, with technological advancements come vulnerabilities that can disrupt operations. To mitigate these risks, organizations must develop robust Digital Operational Resilience Programs.

The European Union’s Digital Operational Resilience Act (EU DORA) (EUR-Lex[1]) introduces new regulatory requirements for in-scope financial services institutions, so developing an effective approach will become increasingly important. This article explores how the PIE FARM approach (as depicted in Figure 1) can enhance the development of an effective Digital Operational Resilience Program.

One approach gaining traction is the PIE FARM framework, a comprehensive methodology that encompasses planning, identification, engagement, fixing, assessment, reporting, and maintenance.

Figure 1: PIE FARM Model

Applying the PIE FARM methodology

The PIE FARM framework is a suggested approach that organizations can adapt to meet their specific needs and context. It is an integrated 7-stage model that helps organizations build effective Digital Operational Resilience programs, reduce their risks, and meet their regulatory obligations.

1. Plan & Prepare

The first step in the PIE FARM framework is to plan and prepare. This involves understanding the organization’s digital landscape, identifying critical systems, and defining resilience objectives. By conducting thorough risk assessments and creating contingency plans, organizations can proactively anticipate potential disruptions and develop strategies to minimize their impact.

Planning and preparing for an effective Digital Operational Resilience program involves several key steps. Here are some important considerations:

  • Define objectives: Clearly articulate the objectives of your Digital Operational Resilience program. This could include goals such as minimizing downtime, ensuring data security, and maintaining business continuity.
  • Risk assessment: Conduct a thorough assessment to identify potential risks and vulnerabilities within your digital operations. This could involve analysing your IT infrastructure, applications, networks, and data systems. Evaluate potential threats such as cyberattacks, hardware failures, and natural disasters.
  • Establish governance: Establish a governance framework to guide your Digital Operational Resilience program. Define roles, responsibilities, and decision-making processes. Assign a dedicated team or individual responsible for overseeing the program’s implementation and maintenance.
  • Develop incident response plans: Create detailed plans to respond to several types of incidents. Define steps for detecting, containing, mitigating, and recovering from disruptions. Test and refine these plans regularly to ensure their effectiveness.
  • Enhance cybersecurity measures: Implement robust cybersecurity measures to protect against cyber threats. This includes strong access controls, regular security assessments, encryption, intrusion detection systems, and employee awareness training.
  • Backup and recovery: Establish reliable backup and recovery mechanisms to ensure data availability and integrity. Regularly back up critical data and evaluate the restoration process to verify its effectiveness.
  • Establish monitoring and alerting systems: Implement monitoring tools and systems to continuously monitor your digital infrastructure. Establish alerts and notifications to promptly detect and respond to any anomalies or potential incidents.
  • Collaborate with stakeholders: Involve key stakeholders from different areas of your organization, such as IT, operations, legal, and compliance. Foster collaboration and communication to ensure a holistic and integrated approach to operational resilience.
  • Regular testing and simulation: Conduct regular testing and simulation exercises to evaluate the effectiveness of your Digital Operational Resilience program. This could involve running drills, tabletop exercises, or simulated incident scenarios to assess your preparedness.
  • Continual improvement: Foster a culture of continual improvement by learning from incidents and implementing lessons learned. Stay updated on emerging threats and technological advancements to adapt and enhance your program accordingly.

Remember, implementing an effective Digital Operational Resilience program is an ongoing process that requires regular review, refinement, and adaptation to address evolving risks and challenges.
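The backup and recovery consideration above asks organizations to verify that restoration actually works. As a minimal sketch, assuming a restored file can be compared against its source by checksum, the check might look like the following. The file names and contents are hypothetical stand-ins, and a real program would also need to cover databases, application state, and restore timing.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches_source(source: Path, restored: Path) -> bool:
    """True when the restored copy is byte-identical to the source."""
    return sha256_of(source) == sha256_of(restored)


# Temporary files stand in for real data (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "ledger.csv"
    good_restore = Path(tmp) / "ledger_restored.csv"
    bad_restore = Path(tmp) / "ledger_corrupt.csv"
    original.write_bytes(b"account,balance\n1001,250.00\n")
    good_restore.write_bytes(original.read_bytes())
    bad_restore.write_bytes(b"account,balance\n1001,25\n")
    good_ok = restore_matches_source(original, good_restore)
    bad_ok = restore_matches_source(original, bad_restore)

print("intact restore verified:", good_ok)
print("corrupt restore verified:", bad_ok)
```

A checksum comparison only shows that the bytes came back intact; it does not show that the restore completed within an acceptable time.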

2. Identify & Isolate

The next phase of the PIE FARM framework is identifying and isolating vulnerabilities within the digital infrastructure. This step involves conducting comprehensive assessments of potential threats, including cyber risks, system failures, and external dependencies. By pinpointing weaknesses, organizations can implement appropriate security measures and isolation protocols to prevent cascading failures and limit the impact of disruptions.

The identification and isolation stage of building an effective digital operational resilience program involves several key steps. Here are some important considerations:

  • Asset identification: Identify and inventory all digital assets within your organization. This includes hardware, software, data, networks, and applications. Understand their criticality and dependencies to prioritize your resilience efforts.
  • Risk assessment: Perform a comprehensive risk assessment to identify potential threats, vulnerabilities, and impacts on your digital assets. Evaluate risks such as cyberattacks, system failures, natural disasters, and human errors. Prioritize risks based on their likelihood and potential impact.
  • Dependency mapping: Map out the dependencies and interconnections between your digital assets. Understand how they rely on each other to operate effectively. This includes assessing dependencies on third-party vendors, cloud services, and other external systems.
  • Criticality assessment: Determine the criticality of each digital asset and its importance to your business operations. Classify assets based on their impact on revenue generation, customer experience, regulatory compliance, and brand reputation.
  • Segmentation and isolation: Implement segmentation and isolation measures to protect critical digital assets. This involves separating assets into different security zones or segments based on their criticality and sensitivity. Apply appropriate access controls, network segmentation, and security system configurations to prevent unauthorized access.
  • Redundancy and backups: Establish redundancy and backup mechanisms for critical digital assets. This includes implementing backup systems, off-site data storage, and replication strategies to ensure availability and timely recovery in case of failures.
  • Incident response planning: Develop incident response plans that outline specific actions to be taken when a disruption occurs. Define procedures to isolate affected systems, contain the impact, and restore operations. Ensure clear roles, responsibilities, and escalation paths are established.
  • Security measures: Implement robust cybersecurity measures to protect your digital assets from threats. This includes deploying firewalls, intrusion detection and prevention systems, encryption, multi-factor authentication, and security monitoring tools.
  • Testing and validation: Regularly assess and validate the effectiveness of your isolation measures and incident response plans. Conduct penetration testing, vulnerability assessments, and tabletop exercises to identify gaps and weaknesses. Update and refine your strategies based on the findings.
  • Compliance and regulatory considerations: Consider applicable compliance and regulatory requirements when designing your isolation measures. Ensure that your program adheres to industry standards and regulatory guidelines relevant to your business.
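The risk assessment consideration above suggests prioritizing risks by likelihood and potential impact. One common way to do this, sketched here under assumptions, is a simple likelihood × impact score. The 1 to 5 scales, the tie-break rule, and the example risks are all hypothetical and are not prescribed by the PIE FARM approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (almost certain)
    impact: int      # hypothetical scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        """Likelihood multiplied by impact."""
        return self.likelihood * self.impact


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Sort highest score first; ties are broken by the higher impact."""
    return sorted(risks, key=lambda r: (r.score, r.impact), reverse=True)


# Illustrative register entries (hypothetical values).
risks = [
    Risk("Ransomware on file servers", likelihood=3, impact=5),
    Risk("Data centre flood", likelihood=1, impact=5),
    Risk("Operator misconfiguration", likelihood=4, impact=3),
    Risk("Disk failure on a single node", likelihood=4, impact=2),
]

ranked = prioritize(risks)
for risk in ranked:
    print(f"{risk.score:>2}  {risk.name}")
```

A multiplicative score is easy to explain to stakeholders, but it can understate rare, severe events, so some organizations review high-impact risks separately regardless of score.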

Remember, the identification and isolation stage is crucial for understanding the landscape of your digital assets, their vulnerabilities, and developing strategies to protect and isolate critical components. It sets the foundation for building a resilient digital infrastructure that can withstand disruptions and maintain operational continuity.
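Dependency mapping lends itself to a small worked example. The sketch below assumes the asset inventory can be expressed as a map from each asset to the assets it directly depends on, and then walks that map to find everything affected when one component fails. The asset names are hypothetical.

```python
from collections import deque

# Hypothetical inventory: each asset lists what it directly depends on.
depends_on = {
    "online-banking": ["auth-service", "payments-api"],
    "payments-api": ["core-ledger-db", "cloud-provider"],
    "auth-service": ["identity-db"],
    "reporting": ["core-ledger-db"],
    "core-ledger-db": [],
    "identity-db": [],
    "cloud-provider": [],
}


def impacted_by(failed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every asset that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk from a dependency to its dependents.
    dependents: dict[str, list[str]] = {}
    for asset, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(asset)

    impacted: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    impacted.discard(failed)  # guards against cycles that lead back to the start
    return impacted


print(sorted(impacted_by("core-ledger-db", depends_on)))
print(sorted(impacted_by("identity-db", depends_on)))
```

The size of the impacted set gives a rough indication of an asset's criticality, which can feed the criticality assessment and the choice of isolation boundaries.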

3. Engage, Explain & Evaluate

This stage involves being open and transparent with the key areas of the business whose existing practices will be evaluated. It is not designed to ‘catch them out’ but, rather, to help them understand what is expected of them regarding their capabilities. Consequently, this stage involves the following two elements:

Engage & Explain

Effective communication and stakeholder engagement are crucial for developing a resilient digital operational framework. In this phase, organizations must educate employees about potential risks and their role in maintaining resilience. Additionally, regular evaluations should be conducted to assess existing processes’ effectiveness, identify improvement areas, and ensure ongoing compliance with regulatory requirements.

The “engage and explain” stage of an effective Digital Operational Resilience program involves several important steps. Here are some key considerations:

  • Stakeholder engagement: Identify and engage relevant stakeholders across your organization. This includes executives, department heads, IT personnel, operations teams, and other key individuals or teams involved in digital operations. Foster open communication channels and establish a shared understanding of the program’s goals and objectives.
  • Education and awareness: Provide training and educational materials to ensure stakeholders clearly understand the importance of Digital Operational Resilience. Explain the potential risks, impacts, and benefits associated with the program. Increase awareness about the role of each stakeholder in maintaining operational resilience.
  • Clear communication: Establish effective communication channels to keep stakeholders informed about the program’s progress, updates, and any changes. Regularly communicate the program’s objectives, strategies, and key milestones. Encourage feedback and address any concerns or questions raised by stakeholders.
  • Business impact analysis: Conduct a business impact analysis (BIA) to assess the potential consequences of disruptions on your organization’s operations, reputation, and financials. Engage relevant stakeholders in this analysis to gain insights into critical processes, dependencies, and recovery time objectives.
  • Explain risk tolerance: Clearly articulate the organization’s risk tolerance level and how it relates to Digital Operational Resilience. Discuss the potential trade-offs between investing in resilience measures and the potential impacts of disruptions. Ensure stakeholders understand the rationale behind decisions related to risk management.
  • Collaboration and coordination: Encourage collaboration and coordination among stakeholders to align efforts towards operational resilience. Foster a cross-functional approach where different teams and departments work together to identify and address resilience challenges. Encourage sharing of best practices, lessons learned, and insights gained from incidents or tests.
  • Obtain buy-in and support: Engage with senior leadership and key decision-makers to obtain buy-in and support for the Digital Operational Resilience program. Demonstrate the program’s value proposition, return on investment, and its alignment with strategic goals. Address concerns, provide evidence, and present a clear business case for the program.
  • Policy development: Develop and communicate policies and guidelines that support Digital Operational Resilience. This includes policies on incident response, security measures, data protection, and employee responsibilities. Ensure stakeholders understand and comply with these policies.
  • Performance metrics: Establish performance metrics and key performance indicators (KPIs) to measure the effectiveness of the Digital Operational Resilience program. Communicate these metrics to stakeholders and regularly report on the program’s progress and achievements.
  • Continuous engagement: Maintain ongoing engagement with stakeholders throughout the program’s lifecycle. Seek feedback, address evolving needs, and adapt strategies, as necessary. Continuously communicate the importance of Digital Operational Resilience to ensure it remains a priority across the organization.

Remember, the “engage and explain” stage is crucial for building awareness, obtaining support, and fostering collaboration among stakeholders. It ensures a shared understanding of the program’s goals, benefits, and responsibilities, setting the stage for the successful implementation and adoption of Digital Operational Resilience measures.

The Evaluate Stage

The “evaluate” stage of an effective digital operational resilience program involves several key steps to assess the program’s performance and effectiveness. Here are some important considerations:

  • Performance measurement: Define and measure key performance indicators (KPIs) to evaluate the effectiveness of your digital operational resilience program. These KPIs could include metrics such as mean time to detect incidents, mean time to respond, recovery time objectives, and overall system availability. Regularly track and analyse these metrics to assess program performance.
  • Incident analysis: Analyse past incidents and disruptions to identify trends, root causes, and areas for improvement. Conduct post-incident reviews to understand what went well, what could have been done better, and identify any gaps in your resilience strategies. Use these insights to refine your incident response plans and enhance your program’s effectiveness.
  • Testing and exercises: Continually test and validate your resilience measures through simulations, exercises, and testing scenarios. This could involve running tabletop exercises, red teaming, penetration testing, or conducting vulnerability assessments. Assess the effectiveness of your program’s response and recovery capabilities, identify weaknesses, and implement necessary improvements.
  • Technology assessment: Regularly assess the technology infrastructure and systems that support your digital operations. Evaluate the reliability, scalability, and security of your IT systems and applications. Identify areas where technology upgrades or enhancements are needed to strengthen operational resilience.
  • Compliance evaluation: Review the program’s compliance with relevant industry standards, regulations, and best practices. Assess whether your digital operational resilience program aligns with legal and regulatory requirements and make any necessary adjustments to ensure compliance.
  • Feedback collection: Gather feedback from stakeholders, including employees, customers, and partners, regarding their experiences and perceptions of your organization’s operational resilience. Conduct surveys, interviews, or focus groups to understand their perspectives on the program’s effectiveness and identify areas for improvement.
  • Lessons learned: Continuously capture and document lessons learned from incidents, tests, and real-world experiences. Share these insights with relevant stakeholders to drive continuous improvement. Use lessons learned to update policies, procedures, and training materials to enhance resilience measures.
  • Risk reassessment: Periodically reassess and update your risk assessments to account for changes in the digital landscape, emerging threats, or evolving business priorities. Consider new technologies, industry trends, and potential risks that may have emerged since the initial assessment.
  • External benchmarking: Compare your digital operational resilience program with industry benchmarks and best practices. Seek external assessments, audits, or certifications to validate the effectiveness and maturity of your program. Learn from other organizations’ experiences and incorporate relevant strategies into your program.
  • Continuous improvement: Use the findings from evaluations and assessments to drive continuous improvement of your digital operational resilience program. Implement necessary enhancements, update policies and procedures, and communicate changes to stakeholders. Foster a culture of learning, adaptability, and ongoing refinement.

The evaluate stage ensures that your digital operational resilience program remains effective, adaptive, and aligned with changing business needs and emerging threats. By regularly assessing performance, identifying areas for improvement, and implementing necessary changes, you can enhance your organization’s ability to withstand disruptions and maintain operational continuity.
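The performance measurement consideration names mean time to detect, mean time to respond, and system availability as candidate KPIs. As a rough sketch, these could be derived from incident records as shown below. The record fields and sample incidents are hypothetical. The sketch also assumes that "time to respond" is measured from detection to restoration and that incidents do not overlap; organizations define these terms differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    started: datetime   # when the disruption began
    detected: datetime  # when it was detected
    resolved: datetime  # when service was restored


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    return sum(durations, timedelta()) / len(durations)


def resilience_kpis(incidents: list[Incident], period: timedelta) -> dict:
    """Compute MTTD, MTTR, and availability over a reporting period."""
    if not incidents:
        raise ValueError("at least one incident is required")
    # Total downtime assumes incidents do not overlap in time.
    downtime = sum((i.resolved - i.started for i in incidents), timedelta())
    return {
        "mttd": mean_duration([i.detected - i.started for i in incidents]),
        "mttr": mean_duration([i.resolved - i.detected for i in incidents]),
        "availability": 1 - downtime / period,
    }


# Two illustrative incidents in a 30-day reporting period (hypothetical data).
incidents = [
    Incident(datetime(2023, 3, 1, 9, 0), datetime(2023, 3, 1, 9, 30),
             datetime(2023, 3, 1, 11, 30)),
    Incident(datetime(2023, 3, 15, 14, 0), datetime(2023, 3, 15, 14, 10),
             datetime(2023, 3, 15, 15, 10)),
]
kpis = resilience_kpis(incidents, period=timedelta(days=30))
print("MTTD:", kpis["mttd"])
print("MTTR:", kpis["mttr"])
print(f"Availability: {kpis['availability']:.4%}")
```

Tracking the same definitions from one reporting period to the next matters more than the exact formula, because the trend is what shows whether the program is improving.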

4. Fix

The “Fix” stage of the PIE FARM approach involves taking corrective actions to address identified vulnerabilities and improve system resilience. This includes implementing security patches, upgrading infrastructure, enhancing incident response capabilities, and adopting best practices to protect against emerging threats. Organizations should prioritize timely fixes and establish protocols for continuous monitoring and improvement.

The fix stage of building an effective digital operational resilience program involves taking corrective actions to address identified vulnerabilities and enhance the organization’s ability to withstand disruptions. This phase focuses on implementing appropriate measures to mitigate risks and strengthen the digital infrastructure. Here are the key aspects involved in the fix stage:

  • Vulnerability Remediation: Identified vulnerabilities and weaknesses in the digital infrastructure must be addressed promptly. This includes applying security patches, updates, and fixes to software, systems, and network components. Vulnerability management processes, such as regular vulnerability scanning and penetration testing, help identify and prioritize areas that require immediate attention.
  • Infrastructure Upgrades: Outdated or unsupported systems and technologies pose significant risks to operational resilience. The fix stage involves upgrading critical infrastructure components, including hardware, software, and networking equipment, to ensure compatibility, performance, and security. This may involve replacing legacy systems with modern and more robust solutions.
  • Incident Response Enhancements: Strengthening incident response capabilities is vital to minimize the impact of disruptions. Organizations should review and update incident response plans, ensuring they align with current threats and operational requirements. This includes defining clear roles and responsibilities, establishing communication channels, conducting drills and simulations, and integrating incident response with broader business continuity and disaster recovery plans.
  • Security Controls Implementation: Robust security controls are essential for protecting against cyber threats and minimizing vulnerabilities. The fix stage involves implementing appropriate security measures, such as access controls, encryption, intrusion detection and prevention systems, firewalls, and security information and event management (SIEM) solutions. This helps detect, prevent, and respond to security incidents effectively.
  • Best Practice Adoption: Organizations should adopt industry best practices and frameworks, such as ISO/IEC 27001:2022 (International Organization for Standardization[2]), NIST Cybersecurity Framework (CSF Tools[3]), or CIS 18 Critical Security Controls (Center for Internet Security[4]). These frameworks provide guidance on implementing comprehensive security measures and resilience strategies. Organizations can ensure a systematic and structured approach to improving operational resilience by aligning with recognized standards.
  • Employee Awareness and Training: Employees play a vital role in maintaining operational resilience. The fix stage involves providing training and awareness programs to educate employees about cybersecurity risks, incident reporting procedures, and best practices for secure behaviour. This helps create a security-conscious culture and reduces the likelihood of human error leading to disruptions.
  • Continuous Monitoring and Improvement: The fix stage is not a one-time effort but an ongoing process. Organizations should establish continuous monitoring mechanisms to identify new vulnerabilities, emerging threats, and areas for improvement. This includes implementing real-time monitoring systems, conducting periodic risk assessments, and engaging in proactive threat intelligence gathering.
  • Supplier and Third-Party Risk Management: Organizations should assess and mitigate risks associated with suppliers and third-party vendors. The fix stage involves implementing processes to evaluate the security posture of external partners, conducting due diligence during vendor selection, and establishing contractual agreements that ensure adherence to security and resilience standards.

By focusing on these aspects during the fix stage, organizations can proactively address vulnerabilities, enhance incident response capabilities, and strengthen the overall operational resilience of their digital infrastructure. This iterative and proactive approach helps organizations adapt to evolving threats and ensure their ability to operate effectively despite disruptions.
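Timely remediation is easier to manage when each finding has a target fix date. The sketch below assumes the organization sets a remediation window per severity level and flags open findings that have passed their deadline. The windows, identifiers, and dates are hypothetical and are not taken from EU DORA or any cited standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical remediation windows in days; set these to match your own policy.
REMEDIATION_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


@dataclass(frozen=True)
class Finding:
    identifier: str
    severity: str
    discovered: date
    fixed: bool = False


def overdue(findings: list[Finding], today: date) -> list[tuple[str, int]]:
    """Return (identifier, days overdue) for open findings, most overdue first."""
    late = []
    for finding in findings:
        deadline = finding.discovered + timedelta(
            days=REMEDIATION_DAYS[finding.severity]
        )
        if not finding.fixed and today > deadline:
            late.append((finding.identifier, (today - deadline).days))
    return sorted(late, key=lambda item: item[1], reverse=True)


# Illustrative findings (hypothetical identifiers and dates).
findings = [
    Finding("VULN-001", "critical", date(2023, 6, 10)),
    Finding("VULN-002", "high", date(2023, 6, 1)),
    Finding("VULN-003", "medium", date(2023, 2, 1)),
    Finding("VULN-004", "critical", date(2023, 5, 1), fixed=True),
]

late = overdue(findings, today=date(2023, 6, 25))
for identifier, days in late:
    print(f"{identifier}: {days} days overdue")
```

Sorting by days overdue is one simple choice; ranking by severity first may suit organizations that treat any overdue critical finding as the top priority.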

5. Assess & Report

The “Assess & Report” phase emphasizes the importance of ongoing monitoring, measurement, and reporting of digital operational resilience. Organizations should conduct regular assessments to evaluate the effectiveness of implemented measures, identify gaps, and measure their overall resilience. These findings should be documented and communicated through comprehensive reports to key stakeholders, including senior management and relevant regulatory bodies. Transparent reporting facilitates a clear understanding of the organization’s resilience posture and enables informed decision-making.

6. The Report Stage

This stage of building an effective digital operational resilience program involves generating and disseminating comprehensive reports to key stakeholders. This phase focuses on capturing and communicating critical information regarding the organization’s digital operational resilience. Here are some key aspects involved in the report stage:

  • Documentation: The report stage begins with documenting the findings and outcomes of assessments, evaluations, and measurements conducted throughout the resilience program. This includes information related to identified vulnerabilities, implemented controls, incident response capabilities, and overall resilience posture.
  • Data Analysis: The collected data is analysed to identify trends, patterns, and areas of improvement. This analysis helps in assessing the effectiveness of existing measures, identifying gaps, and evaluating the organization’s ability to withstand disruptions.
  • Key Performance Indicators (KPIs): Defining and tracking relevant KPIs is crucial in assessing the organization’s digital operational resilience. These KPIs can include metrics related to incident response time, recovery point objectives (RPO), recovery time objectives (RTO), system availability, and other performance indicators that demonstrate resilience capabilities.
  • Stakeholder Engagement: Reports should be tailored to the needs of different stakeholders, such as senior management, board members, regulatory bodies, and relevant internal departments. Engaging stakeholders through clear and concise reporting helps them understand the organization’s resilience posture, make informed decisions, and allocate resources appropriately.
  • Regulatory Compliance: If applicable, reports should address regulatory requirements specific to the organization’s industry or jurisdiction. Compliance-related information should be included to demonstrate adherence to relevant regulations and guidelines. 
  • Recommendations: The report stage provides an opportunity to present recommendations for further enhancements and improvements to the digital operational resilience program. These recommendations should be based on the findings and analysis conducted during the assessment and evaluation processes.
  • Communication and Dissemination: Reports should be communicated effectively, considering the appropriate channels and formats for different stakeholders. This can involve formal presentations, executive summaries, dashboards, or detailed written reports, depending on the needs and preferences of the intended audience.
  • Iterative Process: The report stage is not a one-time event but rather a recurring process. Regular reporting ensures ongoing monitoring, measurement, and reporting of the organization’s digital operational resilience. It allows for the tracking of progress, the identification of emerging risks, and the continuous improvement of resilience strategies.

By engaging in comprehensive reporting, organizations can gain insights into their resilience capabilities, promote transparency, and facilitate informed decision-making. These reports play a crucial role in demonstrating the effectiveness of the digital operational resilience program and fostering a culture of continuous improvement.
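The recovery point objective (RPO) and recovery time objective (RTO) named among the KPIs can be checked against the record of a real or simulated outage. The following is a minimal sketch, assuming RPO is compared with the time between the last good backup and the failure, and RTO with the time from failure to restoration. The objectives and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_report(
    last_backup: datetime,
    failure: datetime,
    restored: datetime,
    rpo: timedelta,
    rto: timedelta,
) -> dict:
    """Compare one outage against the recovery point and time objectives."""
    data_loss_window = failure - last_backup  # data written after the backup is at risk
    recovery_time = restored - failure        # how long the service was unavailable
    return {
        "data_loss_window": data_loss_window,
        "recovery_time": recovery_time,
        "meets_rpo": data_loss_window <= rpo,
        "meets_rto": recovery_time <= rto,
    }


# Illustrative outage with hypothetical objectives of 4 hours (RPO) and 2 hours (RTO).
report = recovery_report(
    last_backup=datetime(2023, 6, 25, 8, 0),
    failure=datetime(2023, 6, 25, 11, 0),
    restored=datetime(2023, 6, 25, 13, 30),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=2),
)
for key, value in report.items():
    print(f"{key}: {value}")
```

In this example the backup schedule satisfies the RPO, while the recovery overruns the RTO by half an hour, which is the kind of gap a report should surface for stakeholders.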

7. Maintain

Maintaining digital operational resilience is an ongoing effort. The “Maintain” phase of the PIE FARM framework focuses on continuous improvement and adaptation. Organizations should establish a culture of proactive monitoring, periodic audits, and regular updates to ensure the resilience program remains effective and aligned with evolving threats and business requirements. By embracing a cycle of continuous improvement, organizations can strengthen their digital operational resilience over time.

The maintenance stage of an effective Digital Operational Resilience program involves the ongoing activities and processes required to sustain and continuously improve the resilience of the organization’s digital infrastructure. This focuses on monitoring, assessing, and optimizing resilience measures to ensure they remain effective over time. Here are the key aspects involved in the maintenance stage:

  • Continuous Monitoring: Organizations should establish a systematic process for continuously monitoring the digital infrastructure, including networks, systems, applications, and data. This involves utilizing monitoring tools, conducting regular vulnerability assessments, analysing security logs, and tracking key performance indicators (KPIs) related to operational resilience.
  • Incident Response Refinement: The maintenance stage includes regularly reviewing and refining the incident response plans and processes. This may involve updating contact lists, improving incident categorization and escalation procedures, incorporating lessons learned from past incidents, and conducting tabletop exercises and simulations to assess the effectiveness of the response capabilities.
  • Patch and Update Management: Keeping software, applications, and systems updated with the latest security patches and updates is crucial for maintaining a resilient digital infrastructure. The maintenance stage involves implementing a robust patch and update management process to ensure the timely application of security fixes and enhancements.
  • Employee Training and Awareness: Ongoing employee training and awareness programs are vital in maintaining operational resilience. Organizations should provide regular cybersecurity training to employees, covering topics such as phishing awareness, secure password practices, social engineering, and data handling procedures. This helps reinforce a culture of security and resilience throughout the organization.
  • Risk Assessments and Audits: Periodic risk assessments and audits are essential for evaluating the effectiveness of the resilience program and identifying new risks. The maintenance stage involves conducting comprehensive risk assessments, both internal and external, and performing audits to ensure compliance with regulatory requirements and industry standards.
  • Documentation and Reporting: Maintaining accurate and up-to-date documentation is critical for effectively maintaining the digital operational resilience program. This includes updating policies, procedures, and incident response documentation. Additionally, generating regular reports for senior management and relevant stakeholders helps track progress, communicate resilience posture, and identify areas for improvement.
  • Compliance and Regulatory Updates: Organizations need to stay informed about changes in regulatory requirements and evolving cybersecurity standards. The maintenance stage involves monitoring relevant regulations and guidelines, implementing necessary changes to remain compliant, and adjusting resilience measures accordingly.
  • Continuous Improvement: The maintenance stage emphasizes the importance of continuous improvement. Organizations should foster a culture of learning from incidents and near misses, analysing root causes, and implementing corrective actions. This includes conducting post-incident reviews, seeking stakeholder feedback, and actively seeking opportunities to enhance resilience capabilities.

By prioritizing these aspects during the maintenance stage, organizations can ensure the ongoing effectiveness of their Digital Operational Resilience program. Regular monitoring, refining incident response, keeping systems updated, providing employee training, conducting assessments, and embracing a culture of continuous improvement are key to maintaining a resilient digital infrastructure in the face of evolving threats and challenges.

Conclusion

Developing an effective Digital Operational Resilience Program is critical for businesses to withstand and recover from disruptions. The PIE FARM framework offers a systematic and comprehensive approach, encompassing planning, identification, engagement, fixing, assessment, reporting, and maintenance. By utilizing this approach, organizations can enhance their understanding of vulnerabilities, improve incident response capabilities, and foster a resilient digital infrastructure. Embracing the PIE FARM methodology will contribute to organizations’ long-term sustainability and success in today’s dynamic digital landscape.

For further insights, check out Security Risk Management – The Driving Force for Operational Resilience: The Firefighting Paradox (Seaman and Gioia[5]).

[1] EUR-Lex. “EUR-Lex – 32022R2554 – EN – EUR-Lex.” Eur-Lex.europa.eu, 27 Dec. 2022, eur-lex.europa.eu/eli/reg/2022/2554/oj. Accessed 25 June 2023.

[2] International Organization for Standardization. “Information Security, Cybersecurity and Privacy Protection — Information Security Management Systems — Requirements.” Iso.org, 2022, www.iso.org/obp/ui/#iso:std:iso-iec:27001:ed-3:v1:en. Accessed 25 June 2023.

[3] CSF Tools. “CSF Version 1.1.” CSF Tools, 13 Aug. 2020, www.csf.tools/reference/nist-cybersecurity-framework/v1-1. Accessed 25 June 2023.

[4] Center for Internet Security. “The 18 CIS Controls.” Center for Internet Security (CIS), 2023, www.cisecurity.org/controls/cis-controls-list. Accessed 25 June 2023.

[5] Seaman, Jim, and Michael Gioia. Security Risk Management – the Driving Force for Operational Resilience. CRC Press, 1 Aug. 2024.


About the Author: James Seaman and Michael Gioia

James Seaman is a highly imaginative and creative individual with a talent for solving problems in unconventional ways. He honed his skills and knowledge through a successful career in the RAF Police. For 22 years, he was responsible for ensuring adequate protective security of mission-critical assets, working as a Police Dog handler, Security Commander, Aviation Security Specialist, and Counter-Intelligence operative. On retiring from military service, he transitioned to the corporate environment, where he has fulfilled numerous protective security roles and responsibilities. During this career, he achieved an MSc in Security Management, as well as various industry Information Security & Risk qualifications.
