In this article, we address transparent Damage Quarantine and Recovery (DQR), an important problem facing many mission-, life-, and/or business-critical applications and information systems that must manage risk, business continuity, and assurance in the presence of severe cyber attacks. Today, these critical applications still stand a good chance of suffering a major hit from such attacks. Because of data sharing, interdependencies, and interoperability, a hit can greatly amplify its damage by causing catastrophic cascading effects, which may force an application to halt for hours or even days before it is recovered. We first discuss in depth the limitations of traditional fault tolerance and failure recovery techniques in solving the DQR problem. We then present a systematic review of how the DQR problem is being solved. Finally, we point out some remaining research issues in fully solving the DQR problem.
All Science Journal Classification (ASJC) codes
- Computer Science(all)