Case Studies on Intra-Domain Routing Instability
Zhang Shu
Communications Research Laboratory, Japan
(To be renamed to National Institute of Information and
Communications Technology)
APAN17 – Engineering Session
1/30/2004, Hawaii
Overview

What is routing instability?
Methodology of the measurement
Case study 1: WIDE Internet
Case study 2: APAN Tokyo-XP
Conclusion and future work
Routing Instability

Routing instability
• Also called route flaps
• Unexpected topology change

Negative effects
• Packet loss
• Increased router load
• Wasted bandwidth

Causes
• Link failures, software bugs

Types of routing instability
• Inter-domain
• Intra-domain
Methodology

Methodology
• Use “tcpdump” to collect link-state routing messages
• Then analyze the routing messages with our own tools
  - Ospfanaly
  - Some other scripts, including a Perl CGI script for viewing the statistical results on the web
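The analysis step can be sketched as follows. This is only an illustration, not the Ospfanaly tool itself: the function name `count_lsa_changes`, the record layout, and the rule that a "change" means a new instance of an LSA whose body differs from the previous one are all assumptions. The records are assumed to have been decoded already from a capture, for example one taken with tcpdump and the filter `ip proto 89` (OSPF).

```python
from collections import Counter
from datetime import datetime, timezone


def count_lsa_changes(records):
    """Count LSA content changes per UTC day.

    records: iterable of (unix_time, lsa_key, body), sorted by time.
    lsa_key is assumed to be (ls_type, link_state_id, advertising_router)
    and body the raw LSA body as bytes (hypothetical layout).
    """
    last_body = {}       # most recent body seen for each LSA
    changes = Counter()  # day string -> number of changes
    for ts, key, body in records:
        previous = last_body.get(key)
        # A periodic refresh repeats the same body, so it is not counted.
        # The first sighting of an LSA is not counted either.
        if previous is not None and previous != body:
            day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
            changes[day] += 1
        last_body[key] = body
    return changes
```

Keying on the body rather than on the sequence number is a design choice made for this sketch: refreshes raise the sequence number even when the topology has not changed.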
OSPF

Open Shortest Path First
• A widely deployed intra-domain link state
routing protocol
• OSPFv2 and OSPFv3

Link state advertisements (LSAs)
• OSPFv2
  - Router-LSA
  - Network-LSA
  - Network-summary-LSA
  - ASBR-summary-LSA
  - AS-external-LSA
• OSPFv3
  - Seven kinds of LSAs defined in RFC 2740
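To make the OSPFv2 LSA types concrete, here is a minimal sketch that decodes the 20-byte LSA header laid out in RFC 2328 and maps the LS type code to the names used above. The function name `parse_lsa_header` and the returned dictionary keys are chosen for illustration and are not taken from the measurement tools.

```python
import socket
import struct

# OSPFv2 LS type codes (RFC 2328), using the names from this slide.
LSA_TYPES = {
    1: "Router-LSA",
    2: "Network-LSA",
    3: "Network-summary-LSA",
    4: "ASBR-summary-LSA",
    5: "AS-external-LSA",
}


def parse_lsa_header(data):
    """Decode the fixed 20-byte OSPFv2 LSA header from raw bytes."""
    if len(data) < 20:
        raise ValueError("an OSPFv2 LSA header is 20 bytes long")
    # LS age, options, LS type, link state ID, advertising router,
    # LS sequence number, LS checksum, length (network byte order).
    age, options, ls_type, ls_id, adv_router, seq, checksum, length = struct.unpack(
        "!HBB4s4sIHH", data[:20]
    )
    return {
        "age": age,
        "options": options,
        "type": ls_type,
        "type_name": LSA_TYPES.get(ls_type, "unknown"),
        "link_state_id": socket.inet_ntoa(ls_id),
        "advertising_router": socket.inet_ntoa(adv_router),
        "sequence": seq,  # kept as an unsigned value for display
        "checksum": checksum,
        "length": length,
    }
```

The triple (type, link state ID, advertising router) identifies one LSA, which makes it a natural key when tracking changes over time.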
Case Study One: WIDE Internet

WIDE Internet
• WIDE Project
  - http://www.wide.ad.jp
• Connects hundreds of organizations

NARA-NOC
• Located at the Nara Institute of Science and Technology, Japan
• The measurement machine is attached to an Ethernet segment of the NARA-NOC network
Measurement Result of WIDE Internet (OSPFv2)
[Figure: number of LSA changes and number of LSAs, plotted against date (year/month)]
The Case of OSPFv3
[Figure: number of LSA changes and number of LSAs, plotted against date (year/month)]
Other Findings during the Analysis

Serious LSA oscillation sometimes occurred
• Changes recur at intervals of 10 to 200 seconds
• Usually lasts for hours, sometimes for days

Oscillation of router-LSA
• Most of the observed oscillation was caused by router interfaces repeatedly going up and down
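One way to flag such oscillation in the collected data is sketched below. The function name `find_oscillations` and both thresholds are assumptions: the 200-second gap follows the upper end of the interval reported above, and the one-hour minimum duration is an arbitrary reading of "lasts for hours".

```python
def find_oscillations(change_times, max_gap=200.0, min_duration=3600.0):
    """Return (start, end) pairs for runs of closely spaced LSA changes.

    change_times: change timestamps in seconds for a single LSA.
    A run continues while consecutive changes are at most max_gap apart;
    it is reported only if it spans at least min_duration seconds.
    """
    episodes = []
    start = prev = None
    for t in sorted(change_times):
        if prev is not None and t - prev <= max_gap:
            prev = t  # still inside the current run
            continue
        # The gap was too long (or this is the first change): close the run.
        if start is not None and prev - start >= min_duration:
            episodes.append((start, prev))
        start = prev = t
    # Close the final run.
    if start is not None and prev - start >= min_duration:
        episodes.append((start, prev))
    return episodes
```

Running this per LSA, for example per router-LSA, would show which router was flapping and for how long, which is the kind of data that helps when searching for a cause.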
The Causes of the Flaps

The causes isolated so far
• Congestion
  - DDoS attacks
• Operational mistakes
  - Misconfiguration of router ID
• Software/hardware bugs
  - Zebra routing daemon
  - Cisco’s OSPF bug
  - Foundry switch

The causes of many flaps are still unknown
• The flaps occur randomly

Why have the flaps decreased in recent months?
• Changes in routing protocol implementation style
  - Special processing of routing messages
• Bandwidth
Case Study Two: APAN Tokyo-XP

APAN Tokyo-XP
• Located in Otemachi, Tokyo
• Seven routers in the backbone area
• Data collected on a FreeBSD box connected to an Ethernet segment
Measurement Result of APAN Tokyo-XP (OSPFv2)
[Figure: number of LSA changes and number of LSAs, plotted against date (year/month)]
Although most of the updates are due to router maintenance, some remain unexplained.
Conclusion

Our investigation of the WIDE Internet
• OSPF LSA oscillation can occur frequently
• Serious oscillation was sometimes observed
• It is difficult to determine what caused the flaps

Similar phenomena may be found on other networks, so it is important to deploy measurement systems on different networks
Future Work

Conduct more measurements on other networks
• Abilene of Internet2

Improve our monitoring system

Isolate the causes
• When oscillation is detected, obtain helpful data for troubleshooting
If you would like to conduct a
routing instability measurement on
your own network, please contact
Zhang Shu
[email protected]
Thank you for your attention!