Optimal recovery from large-scale failures in IP networks

Qiang Zheng, Guohong Cao, Thomas F. La Porta, Ananthram Swami

Research output: Contribution to conference > Paper

12 Citations (Scopus)

Abstract

Quickly recovering IP networks from failures is critical to enhancing Internet robustness and availability. Due to their serious impact on network routing, large-scale failures have received increasing attention in recent years. We propose an approach called Reactive Two-phase Rerouting (RTR) for intra-domain routing to quickly recover from large-scale failures with the shortest recovery paths. To recover a failed routing path, RTR first forwards packets around the failure area to collect information on failures. Then, in the second phase, RTR calculates a new shortest path and forwards packets along it through source routing. RTR can deal with large-scale failures associated with areas of any shape and location, and is free of permanent loops. For any failure area, the recovery paths provided by RTR are guaranteed to be the shortest. Extensive simulations based on ISP topologies show that RTR can find the shortest recovery paths for more than 98.6% of failed routing paths with reachable destinations. Compared with prior works, RTR achieves better performance for recoverable failed routing paths and uses much less network resources for irrecoverable failed routing paths.
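The second phase described in the abstract (recompute a shortest path once the failures are known, then source-route along it) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `shortest_path`, the adjacency-dict graph layout, the toy topology, and the idea that phase one has already produced a `failed_links` set are all hypothetical. The abstract does not specify how failure information is collected, so that step is not modeled here.

```python
import heapq


def shortest_path(graph, src, dst, failed_links=frozenset()):
    """Dijkstra over `graph`, skipping every link listed in `failed_links`.

    graph: dict mapping node -> {neighbor: link cost}; undirected links are
           listed in both directions (assumed layout, not from the paper).
    failed_links: set of frozenset({u, v}) pairs, assumed to be the failure
           information gathered during the first phase.
    Returns the node list of a shortest surviving path, or None when the
    destination is unreachable.
    """
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, cost in graph[u].items():
            if frozenset((u, v)) in failed_links:
                continue  # link is known to be down
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    # Walk predecessors back from dst; the result is the explicit hop list
    # that a source-routed packet would carry.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical five-node topology with unit link costs.
graph = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "E": 1},
    "D": {"B": 1, "E": 1},
    "E": {"C": 1, "D": 1},
}

# Node B finds its link to D down and, with that failure known,
# recomputes a route to D over the surviving links.
failed = {frozenset(("B", "D"))}
recovery_route = shortest_path(graph, "B", "D", failed)
print(recovery_route)
```

When every link into the destination has failed, the function returns `None`. This corresponds to the irrecoverable case, in which a packet can be dropped early instead of consuming further network resources.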

Original language: English (US)
Pages: 295-304
Number of pages: 10
DOI: https://doi.org/10.1109/ICDCS.2012.47
State: Published - Oct 5 2012
Event: 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012 - Macau, China
Duration: Jun 18 2012 - Jun 21 2012

Other

Other: 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012
Country: China
City: Macau
Period: 6/18/12 - 6/21/12

Fingerprint

Recovery
Network routing
Topology
Availability
Internet

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Zheng, Q., Cao, G., La Porta, T. F., & Swami, A. (2012). Optimal recovery from large-scale failures in IP networks. 295-304. Paper presented at 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012, Macau, China. https://doi.org/10.1109/ICDCS.2012.47
Zheng, Qiang ; Cao, Guohong ; La Porta, Thomas F. ; Swami, Ananthram. / Optimal recovery from large-scale failures in IP networks. Paper presented at 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012, Macau, China. 10 p.
@conference{77c4a97000f9448b971fa713e8917734,
title = "Optimal recovery from large-scale failures in IP networks",
abstract = "Quickly recovering IP networks from failures is critical to enhancing Internet robustness and availability. Due to their serious impact on network routing, large-scale failures have received increasing attention in recent years. We propose an approach called Reactive Two-phase Rerouting (RTR) for intra-domain routing to quickly recover from large-scale failures with the shortest recovery paths. To recover a failed routing path, RTR first forwards packets around the failure area to collect information on failures. Then, in the second phase, RTR calculates a new shortest path and forwards packets along it through source routing. RTR can deal with large-scale failures associated with areas of any shape and location, and is free of permanent loops. For any failure area, the recovery paths provided by RTR are guaranteed to be the shortest. Extensive simulations based on ISP topologies show that RTR can find the shortest recovery paths for more than 98.6{\%} of failed routing paths with reachable destinations. Compared with prior works, RTR achieves better performance for recoverable failed routing paths and uses much less network resources for irrecoverable failed routing paths.",
author = "Qiang Zheng and Guohong Cao and {La Porta}, {Thomas F.} and Ananthram Swami",
year = "2012",
month = "10",
day = "5",
doi = "10.1109/ICDCS.2012.47",
language = "English (US)",
pages = "295--304",
note = "32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012 ; Conference date: 18-06-2012 Through 21-06-2012",

}

Zheng, Q, Cao, G, La Porta, TF & Swami, A 2012, 'Optimal recovery from large-scale failures in IP networks' Paper presented at 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012, Macau, China, 6/18/12 - 6/21/12, pp. 295-304. https://doi.org/10.1109/ICDCS.2012.47

Optimal recovery from large-scale failures in IP networks. / Zheng, Qiang; Cao, Guohong; La Porta, Thomas F.; Swami, Ananthram.

2012. 295-304. Paper presented at 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012, Macau, China.

TY - CONF

T1 - Optimal recovery from large-scale failures in IP networks

AU - Zheng, Qiang

AU - Cao, Guohong

AU - La Porta, Thomas F.

AU - Swami, Ananthram

PY - 2012/10/5

Y1 - 2012/10/5

N2 - Quickly recovering IP networks from failures is critical to enhancing Internet robustness and availability. Due to their serious impact on network routing, large-scale failures have received increasing attention in recent years. We propose an approach called Reactive Two-phase Rerouting (RTR) for intra-domain routing to quickly recover from large-scale failures with the shortest recovery paths. To recover a failed routing path, RTR first forwards packets around the failure area to collect information on failures. Then, in the second phase, RTR calculates a new shortest path and forwards packets along it through source routing. RTR can deal with large-scale failures associated with areas of any shape and location, and is free of permanent loops. For any failure area, the recovery paths provided by RTR are guaranteed to be the shortest. Extensive simulations based on ISP topologies show that RTR can find the shortest recovery paths for more than 98.6% of failed routing paths with reachable destinations. Compared with prior works, RTR achieves better performance for recoverable failed routing paths and uses much less network resources for irrecoverable failed routing paths.

UR - http://www.scopus.com/inward/record.url?scp=84866886649&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84866886649&partnerID=8YFLogxK

U2 - 10.1109/ICDCS.2012.47

DO - 10.1109/ICDCS.2012.47

M3 - Paper

AN - SCOPUS:84866886649

SP - 295

EP - 304

ER -

Zheng Q, Cao G, La Porta TF, Swami A. Optimal recovery from large-scale failures in IP networks. 2012. Paper presented at 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012, Macau, China. https://doi.org/10.1109/ICDCS.2012.47