Detecting and localizing large-scale router failures using active probes

Qiang Zheng, Guohong Cao, Thomas F. La Porta, Ananthram Swami

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Detecting the occurrence of large-scale router failures and localizing the failed routers are critical to enhancing network reliability. We propose a two-phase approach for detecting and localizing large-scale router failures using traceroute-like active probes. The detection phase is invoked periodically to probe all routers. When it detects a large-scale router failure, the localization phase is triggered to identify the failed routers. We reduce the probing cost by avoiding three types of useless probes. For routers whose status cannot be identified by probes, we develop a distance-based method to estimate their failure probability. Experimental results based on ISP topologies show that the accuracy of our approach is higher than 96.5%, even when only 10% of routers are connected to end systems for probing. Compared with prior work, the proposed approach achieves much higher accuracy with lower probing cost.
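
The abstract does not spell out the probing or estimation rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes probe paths are known in advance, that a traceroute-like probe stops at the first failed router on its path, and that the failure probability of an unobserved router decays geometrically with its hop distance to the nearest router observed down. The function names, the `decay` parameter, and the scoring rule are all hypothetical.

```python
from collections import deque


def run_probes(paths, failed):
    """Simulate traceroute-like probes over known paths (assumed model).

    Hops answer until the probe reaches a failed router, so routers before
    the first failure are observed up, the first failed hop is observed
    down, and hops beyond it stay unobserved on this path.
    """
    up, down = set(), set()
    for path in paths:
        for router in path:
            if router in failed:
                down.add(router)
                break
            up.add(router)
    return up, down


def hop_distances(graph, sources):
    """Multi-source BFS: hop count from each router to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def estimate_unknown(graph, up, down, decay=0.5):
    """Score routers that no probe could classify.

    Assumed rule (not the paper's formula): probability decay**d, where d is
    the hop distance to the nearest router observed down; 0.0 if unreachable.
    """
    dist = hop_distances(graph, down)
    unknown = set(graph) - up - down
    return {r: decay ** dist[r] if r in dist else 0.0 for r in unknown}


# Line topology a-b-c-d-e, probed from a; routers c and d have failed.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
up, down = run_probes([["a", "b", "c", "d", "e"]], failed={"c", "d"})
estimates = estimate_unknown(graph, up, down)
```

In this toy run the probe confirms `a` and `b` as up and `c` as down, while `d` and `e` cannot be observed and receive scores of 0.5 and 0.25 under the assumed decay rule.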

Original language: English (US)
Title of host publication: 2010 Military Communications Conference, MILCOM 2010
Pages: 1170-1175
Number of pages: 6
DOI: https://doi.org/10.1109/MILCOM.2011.6127458
State: Published - Dec 1 2011
Event: 2011 IEEE Military Communications Conference, MILCOM 2011 - Baltimore, MD, United States
Duration: Nov 7 2011 to Nov 10 2011

Publication series

Name: Proceedings - IEEE Military Communications Conference MILCOM

Other

Other: 2011 IEEE Military Communications Conference, MILCOM 2011
Country: United States
City: Baltimore, MD
Period: 11/7/11 to 11/10/11

Fingerprint

Routers
Costs
Topology

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering

Cite this

Zheng, Q., Cao, G., La Porta, T. F., & Swami, A. (2011). Detecting and localizing large-scale router failures using active probes. In 2010 Military Communications Conference, MILCOM 2010 (pp. 1170-1175). [6127458] (Proceedings - IEEE Military Communications Conference MILCOM). https://doi.org/10.1109/MILCOM.2011.6127458
Zheng, Qiang ; Cao, Guohong ; La Porta, Thomas F. ; Swami, Ananthram. / Detecting and localizing large-scale router failures using active probes. 2010 Military Communications Conference, MILCOM 2010. 2011. pp. 1170-1175 (Proceedings - IEEE Military Communications Conference MILCOM).
@inproceedings{ee8bf98966ac439ab00f1c597831eaa5,
title = "Detecting and localizing large-scale router failures using active probes",
abstract = "Detecting the occurrence of large-scale router failures and localizing the failed routers are critical to enhancing network reliability. We propose a two-phase approach for detecting and localizing large-scale router failures using traceroute-like active probes. The detection phase is invoked periodically to probe all routers. When it detects a large-scale router failure, the localization phase is triggered to identify the failed routers. We reduce the probing cost by avoiding three types of useless probes. For routers whose status cannot be identified by probes, we develop a distance-based method to estimate their failure probability. Experimental results based on ISP topologies show that the accuracy of our approach is higher than 96.5{\%}, even when only 10{\%} of routers are connected to end systems for probing. Compared with prior work, the proposed approach achieves much higher accuracy with lower probing cost.",
author = "Qiang Zheng and Guohong Cao and {La Porta}, {Thomas F.} and Ananthram Swami",
year = "2011",
month = "12",
day = "1",
doi = "10.1109/MILCOM.2011.6127458",
language = "English (US)",
isbn = "9781467300810",
series = "Proceedings - IEEE Military Communications Conference MILCOM",
pages = "1170--1175",
booktitle = "2010 Military Communications Conference, MILCOM 2010",

}

Zheng, Q, Cao, G, La Porta, TF & Swami, A 2011, Detecting and localizing large-scale router failures using active probes. in 2010 Military Communications Conference, MILCOM 2010, 6127458, Proceedings - IEEE Military Communications Conference MILCOM, pp. 1170-1175, 2011 IEEE Military Communications Conference, MILCOM 2011, Baltimore, MD, United States, 11/7/11. https://doi.org/10.1109/MILCOM.2011.6127458

Detecting and localizing large-scale router failures using active probes. / Zheng, Qiang; Cao, Guohong; La Porta, Thomas F.; Swami, Ananthram.

2010 Military Communications Conference, MILCOM 2010. 2011. p. 1170-1175 6127458 (Proceedings - IEEE Military Communications Conference MILCOM).

TY - GEN
T1 - Detecting and localizing large-scale router failures using active probes
AU - Zheng, Qiang
AU - Cao, Guohong
AU - La Porta, Thomas F.
AU - Swami, Ananthram
PY - 2011/12/1
Y1 - 2011/12/1
AB - Detecting the occurrence of large-scale router failures and localizing the failed routers are critical to enhancing network reliability. We propose a two-phase approach for detecting and localizing large-scale router failures using traceroute-like active probes. The detection phase is invoked periodically to probe all routers. When it detects a large-scale router failure, the localization phase is triggered to identify the failed routers. We reduce the probing cost by avoiding three types of useless probes. For routers whose status cannot be identified by probes, we develop a distance-based method to estimate their failure probability. Experimental results based on ISP topologies show that the accuracy of our approach is higher than 96.5%, even when only 10% of routers are connected to end systems for probing. Compared with prior work, the proposed approach achieves much higher accuracy with lower probing cost.
UR - http://www.scopus.com/inward/record.url?scp=84856976110&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84856976110&partnerID=8YFLogxK
U2 - 10.1109/MILCOM.2011.6127458
DO - 10.1109/MILCOM.2011.6127458
M3 - Conference contribution
AN - SCOPUS:84856976110
SN - 9781467300810
T3 - Proceedings - IEEE Military Communications Conference MILCOM
SP - 1170
EP - 1175
BT - 2010 Military Communications Conference, MILCOM 2010
ER -

Zheng Q, Cao G, La Porta TF, Swami A. Detecting and localizing large-scale router failures using active probes. In 2010 Military Communications Conference, MILCOM 2010. 2011. p. 1170-1175. 6127458. (Proceedings - IEEE Military Communications Conference MILCOM). https://doi.org/10.1109/MILCOM.2011.6127458