Analysis of RIP captures
Fulvio Risso, Politecnico di Torino

RIP in Cisco routers and the DV theory
- Unfortunately, real implementations may differ from the theory
- Often we have to rely on reverse engineering to understand what is really going on
- We are not always sure of the reason for some dynamic behavior
- We can suggest some explanations, but we cannot be sure that they are 100% correct

Hold Down in Cisco RIP
- Most notable difference from the RFC (btw, implemented in the Cisco way, not as the theory suggests)
- When an update whose cost increases is received from the next hop router toward a destination, that route is placed in Hold Down
- Hold Down does not apply when a router has a fault on a connected network
- During the Hold Down period the network is advertised with the old cost
- No further updates for that route are accepted until
  - The Hold Down timer expires (180s)
  - An update packet is received with a metric that is better than the original metric
- RIP captures seem to suggest that when a second update packet, coming from the next hop router, is received, the HoldDown timer is reset

Route Poisoning in Cisco RIP
- Route poisoning begins when a router notices that a connected route is no longer valid
- The router then advertises that route out all its interfaces with cost = 16
- All the other routers consider the metric infinite and the route invalid

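The Hold Down rules listed above can be sketched as a small state machine. This is only a minimal illustration of the textbook rules as stated on the slide, not Cisco's actual code: the `Route` class, the `process_update` function and its return labels are hypothetical names, and the timer reset that the captures seem to suggest is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Optional

INFINITY = 16            # RIP metric meaning "unreachable"
HOLD_DOWN_SECONDS = 180  # Hold Down duration quoted on the slide


@dataclass
class Route:
    next_hop: str
    metric: int
    hold_down_until: Optional[float] = None  # None = Hold Down never started


def process_update(route: Route, sender: str, metric: int, now: float) -> str:
    """Apply one received entry to the route and return the decision taken."""
    timer_set = route.hold_down_until is not None
    if timer_set and now < route.hold_down_until:
        if metric < route.metric:
            # Only a metric better than the original one ends Hold Down early.
            route.next_hop, route.metric, route.hold_down_until = sender, metric, None
            return "accepted-better"
        return "ignored-hold-down"
    if sender == route.next_hop and metric > route.metric:
        if timer_set:
            # The timer has expired: the worse cost is finally accepted.
            route.metric, route.hold_down_until = metric, None
            return "accepted-after-hold-down"
        # Cost increase from the next hop: freeze the route. The stored metric
        # is left untouched, so the old cost keeps being advertised.
        route.hold_down_until = now + HOLD_DOWN_SECONDS
        return "entered-hold-down"
    if metric < route.metric:
        route.next_hop, route.metric, route.hold_down_until = sender, metric, None
        return "accepted-better"
    return "ignored"


# Example: the next hop R1 poisons the route (cost 16) at time 0.
r3_route = Route(next_hop="R1", metric=2)
print(process_update(r3_route, "R1", INFINITY, now=0.0))
```

Keeping the old metric while the timer runs is what makes the router keep advertising the old cost during Hold Down.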
Network topology
Router  Interface  Address            Connected to
R1      Fe0        192.168.10.1/24    LAN1
R1      Fe1        192.168.13.1/24    R3
R1      Fe2        192.168.12.1/24    R2
R2      Fe0        192.168.100.2/24   LAN2
R2      Fe1        192.168.23.1/24    R3
R2      Fe2        192.168.12.2/24    R1
R3      Fe0        192.168.100.3/24   LAN2
R3      Fe1        192.168.13.2/24    R1
R3      Fe2        192.168.23.2/24    R2

Typical configuration (R1)
R1# configure terminal
R1(config)# interface FastEthernet0
R1(config-if)# no shutdown
R1(config-if)# no ip split-horizon
R1(config-if)# ip address 192.168.10.1 255.255.255.0
R1(config-if)# exit
R1(config)# interface FastEthernet1
R1(config-if)# no shutdown
R1(config-if)# no ip split-horizon
R1(config-if)# ip address 192.168.13.1 255.255.255.0
R1(config-if)# exit
R1(config)# interface FastEthernet2
R1(config-if)# no shutdown
R1(config-if)# no ip split-horizon
R1(config-if)# ip address 192.168.12.1 255.255.255.0
R1(config-if)# exit
R1(config)# router rip
R1(config-router)# version 2
R1(config-router)# no auto-summary
R1(config-router)# network 192.168.10.0
R1(config-router)# network 192.168.12.0
R1(config-router)# network 192.168.13.0
R1(config-router)# end
R1#

RIP in the steady state (without Split Horizon)
- Please have a look at the encapsulation (UDP port 520)
- Next Hop is not NULL
  - E.g., pkt 4 (from R1), net 192.168.23.0: next hop = 192.168.12.2
- Timings: DVs are generated every (approx) 30 seconds
- Announcements are the same on all the interfaces (5 networks, pkts 1-9)

DV (R1 -> LAN1), Pkt 1
192.168.10.0/24, 1
192.168.12.0/24, 1
192.168.13.0/24, 1
192.168.23.0/24, 2
192.168.100.0/24, 2

DV (R1 -> R2), Pkt 4
192.168.10.0/24, 1
192.168.12.0/24, 1
192.168.13.0/24, 1
192.168.23.0/24, 2
192.168.100.0/24, 2

Capture file: rip-update-nosplithorizon.acp

Debugging RIP in the steady-state (without SH)
Router1#debug ip rip
RIP protocol debugging is on
*Jan 23 08:48:28.252: RIP: sending v2 update to 224.0.0.9 via Vlan13 (192.168.13.1)
*Jan 23 08:48:28.252: RIP: build update entries
*Jan 23 08:48:28.252:     192.168.10.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.252:     192.168.12.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.252:     192.168.13.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.252:     192.168.23.0/24 via 192.168.13.2, metric 2, tag 0
*Jan 23 08:48:28.252:     192.168.100.0/24 via 192.168.13.2, metric 2, tag 0
*Jan 23 08:48:28.396: RIP: sending v2 update to 224.0.0.9 via FastEthernet0 (192.168.10.1)
*Jan 23 08:48:28.396: RIP: build update entries
*Jan 23 08:48:28.396:     192.168.10.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.396:     192.168.12.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.396:     192.168.13.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.396:     192.168.23.0/24 via 0.0.0.0, metric 2, tag 0
*Jan 23 08:48:28.396:     192.168.100.0/24 via 0.0.0.0, metric 2, tag 0
*Jan 23 08:48:28.560: RIP: sending v2 update to 224.0.0.9 via Vlan12 (192.168.12.1)
*Jan 23 08:48:28.560: RIP: build update entries
*Jan 23 08:48:28.560:     192.168.10.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.560:     192.168.12.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.560:     192.168.13.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 23 08:48:28.560:     192.168.23.0/24 via 192.168.12.2, metric 2, tag 0
*Jan 23 08:48:28.560:     192.168.100.0/24 via 192.168.12.2, metric 2, tag 0
*Jan 23 08:48:40.632: RIP: received v2 update from 192.168.12.2 on Vlan12
*Jan 23 08:48:40.632:     192.168.12.0/24 via 0.0.0.0 in 1 hops
*Jan 23 08:48:40.632:     192.168.23.0/24 via 0.0.0.0 in 1 hops
*Jan 23 08:48:40.632:     192.168.100.0/24 via 0.0.0.0 in 1 hops
*Jan 23 08:48:45.064: RIP: received v2 update from 192.168.13.2 on Vlan13
*Jan 23 08:48:45.064:     192.168.13.0/24 via 0.0.0.0 in 1 hops
*Jan 23 08:48:45.064:     192.168.23.0/24 via 0.0.0.0 in 1 hops
*Jan 23 08:48:45.064:     192.168.100.0/24 via 0.0.0.0 in 1 hops
Router1#no debug all
All possible debugging has been turned off
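
Each line of the debug output above (prefix, next hop, metric, tag) corresponds to one 20-byte route entry in a RIPv2 message carried over UDP port 520. The sketch below packs such a message following the RIPv2 entry layout (address family, route tag, address, mask, next hop, metric). The helper name `build_response` and the tuple layout of its input are assumptions made for illustration, and the sample entries are the ones R1 sends via Vlan12.

```python
import socket
import struct

RIP_PORT = 520               # UDP port noted on the slide
RIP_MULTICAST = "224.0.0.9"  # destination address seen in the debug output
AFI_IP = 2                   # address family identifier used by RIP for IP


def build_response(entries):
    """Pack a RIPv2 Response; entries are (prefix, mask, next_hop, metric)."""
    packet = struct.pack("!BBH", 2, 2, 0)  # command=2 (response), version=2, unused
    for prefix, mask, next_hop, metric in entries:
        packet += struct.pack(
            "!HH4s4s4sI",
            AFI_IP,
            0,                           # route tag ("tag 0" in the debug output)
            socket.inet_aton(prefix),
            socket.inet_aton(mask),
            socket.inet_aton(next_hop),  # 0.0.0.0 means "use the sender"
            metric,
        )
    return packet


# The update R1 sends via Vlan12 without Split Horizon.
MASK = "255.255.255.0"
vlan12_update = build_response([
    ("192.168.10.0", MASK, "0.0.0.0", 1),
    ("192.168.12.0", MASK, "0.0.0.0", 1),
    ("192.168.13.0", MASK, "0.0.0.0", 1),
    ("192.168.23.0", MASK, "192.168.12.2", 2),
    ("192.168.100.0", MASK, "192.168.12.2", 2),
])
print(len(vlan12_update), "bytes for", RIP_MULTICAST, "port", RIP_PORT)
```

The fourth and fifth entries carry a non-NULL Next Hop field, which is the detail the steady-state slide points out.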
Adding a new router in RIP (without SH) (1)
- We begin with R2 and R3
  - Pkts 1-6: steady state
  - Since only R2 and R3 are active, each DV contains 4 routes
- R1 is turned on
- Pkts 7, 8, 10: RIP Requests from R1 on its three interfaces
- Pkts 9, 11: RIP Updates (i.e., answers from R2 and R3 to the previous requests)
  - Each DV contains 4 networks (the ones currently known by R2 and R3)
  - Please note that those updates are sent in unicast
- Pkts 12-14: new DVs, as calculated by R1
  - They contain 5 routes (network 10.0 plus the 4 known through the previous RIP Updates)

Adding a new router in RIP (without SH) (2)
- Pkts 15-17: triggered updates generated by R2, containing only the newly learned network (192.168.10.0/24)
- Pkts 18-20: triggered updates generated by R3, containing only the newly learned network (192.168.10.0/24)
- Pkts 21-end: DVs in the steady state
  - Each DV contains 5 routes
  - DVs are sent (approx) every 30 seconds

Capture file: rip-request-nosplithorizon.acp

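The RIP Requests sent by a router that has just been turned on ask the neighbors for their whole routing table. In the RIP specification such a request carries exactly one entry, with address family 0 and metric 16. The sketch below builds that message; the function name is hypothetical, and whether the requests in this capture are byte-for-byte identical has not been checked here.

```python
import struct


def build_full_table_request(version=2):
    """Pack a RIP Request asking a neighbor for its entire routing table."""
    header = struct.pack("!BBH", 1, version, 0)  # command=1 (request)
    # Single entry: AFI=0, tag=0, address=0, mask=0, next hop=0, metric=16.
    entry = struct.pack("!HHIIII", 0, 0, 0, 0, 0, 16)
    return header + entry


request = build_full_table_request()
print(request.hex())
```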
Fault on RIP (without Split Horizon) (1)
- Pkts 1-10: normal behavior (no fault)
- We shut down the interface toward LAN1 on R1
- Pkts 11-12: triggered update with Route Poisoning
  - The DV contains only the faulty network 192.168.10.0/24, announced with cost = 16
  - Why does R1 not use the alternate path toward 192.168.10.0/24 contained in the DVs that arrived from R2 and R3 (and further reinforced in Pkt 14, which arrives at R1 after the fault)?
  - Possible reasons
    - Those DVs contain R1 as next hop for network 192.168.10.0/24, hence R1 knows that those routes do not represent valid alternatives
    - Route poisoning: Cisco says that if a connected route goes down, it must be advertised with cost = infinity

Fault on RIP (without Split Horizon) (2)
- Pkts 13-15: traditional DVs, generated by R2 and R3 (contain 5 routes) according to the standard timing of the update timer
  - They do not contain any update related to the network 192.168.10.0/24
  - Two possible reasons (which one is correct?)
    - The HoldDown timer blocks any update for that route
    - The route update has not been activated yet (remember the little delay before applying new routes, in order to wait for new updates?)
- Pkt 16: traditional DV, generated by R1 toward R2 according to the standard timing of the update timer (previous DV was pkt 5)
  - R2 receives a new update for network 192.168.10.0/24 from its next hop router, confirming the same cost announced in the previous update
  - Apparently, this packet resets the Hold Down timer

Capture file: rip-fault-nosplithorizon.acp

Fault on RIP (without Split Horizon) (3)
- Pkts 17-19: router R2 updates its routing table; now 192.168.10.0/24 is recognized as unreachable and a triggered update is sent
- Pkt 20: the same as pkt 16, but toward R3
- Pkt 21: traditional DV, generated by R2 according to the standard timing of the update timer; includes all the 5 routes (192.168.10.0/24 with cost 16)
- Pkts 22-24: the same as pkts 17-19, but related to R3

Fault on RIP (without Split Horizon) (4)
- Pkt 36: this DV is the first coming from R1 that no longer includes network 10.0/24 (the DV contains only 4 destinations)
  - Please note that this packet is generated approx 60 sec after the fault, which means that network 10.0 was declared invalid and R1 had to wait 60 sec (the difference between the invalid and flush timers) before clearing up its routing table
- Pkt 39: last pkt coming from R3 that contains 5 destinations
- Pkt 40: first pkt coming from R2 that contains only 4 destinations
- Pkt 41: first pkt coming from R3 that contains only 4 destinations

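The 60-second gap mentioned above follows from simple timer arithmetic. The sketch below assumes the usual Cisco IOS default RIP timers (update 30 s, invalid 180 s, hold down 180 s, flush 240 s); the slides themselves only quote the 180 s values and the 60 s difference, and the function name is hypothetical.

```python
# Assumed Cisco IOS default RIP timers, in seconds.
UPDATE, INVALID, HOLD_DOWN, FLUSH = 30, 180, 180, 240


def route_timeline(last_refresh):
    """Return when a route that stops being refreshed is invalidated and flushed."""
    return {
        "invalid_at": last_refresh + INVALID,
        "flushed_at": last_refresh + FLUSH,
    }


# A connected route that fails is invalid at once, so it is flushed
# FLUSH - INVALID seconds after the fault.
print(FLUSH - INVALID)
print(route_timeline(0))
```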
Fault on RIP with count-to-infinity (without SH) (1)
- Pkts 1-10: normal behaviour
- We shut down the interface toward LAN1 on R1
- Pkts 11-12: triggered updates with Route Poisoning
  - The DV contains only the faulty network 192.168.10.0/24, announced with cost = 16
- Pkts 13-15: traditional DVs, generated by R2 and R3 (contain 5 routes) according to the standard timing of the update timer
  - As seen in the previous example, they simply ignore the previous triggered updates and still propagate network 192.168.10.0/24 with the previous cost

Fault on RIP with count-to-infinity (without SH) (2)
- The count to infinity begins
  - From this point on, the capture diverges compared to the previous example
  - The reason is (apparently) the different timing of the update packets
- Pkt 15: traditional DV generated by R1 toward R3, according to the standard timing of the update timer (previous DV on that interface was pkt 6)
  - Apparently, this packet resets the HoldDown timer in R3
  - R3 receives for the second time an update coming from its next hop router confirming that network 192.168.10.0/24 is unreachable
  - This acts as “confirmation” that the network is dead and that the original router was not able to find other paths; so, R3 will start looking for other alternatives

Capture file: rip-fault-nosplithorizon-infinity.acp

Fault on RIP with count-to-infinity (without SH) (3)
- R3 waits a few seconds before sending a triggered update for network 192.168.10.0
  - (Remember that triggered updates are delayed a few seconds)
- Pkt 16: in the meanwhile, R2 generates a standard DV toward R3, according to the standard update timer
  - This DV arrives at R3 when the network 192.168.10.0/24 is no longer in HoldDown state
  - Hence, R3 will accept modifications for that network
  - R3 will then accept the new path: 192.168.10.0/24 is reachable through R2 at cost 3

Fault on RIP with count-to-infinity (without SH) (4)
- Pkts 17-19: finally, R3 generates the triggered updates (on its 3 interfaces), confirming that it accepted the new cost advertised by R2
  - The update contains the network 192.168.10.0/24 at cost 3
  - It represents the beginning of the count to infinity
- Pkt 17: R1 receives an update that contains a better route toward 192.168.10.0/24, hence it updates its routing table
  - Please note that HoldDown apparently does not apply to the router that discovered the fault
  - It should apply only to routers that receive an update from their next hop router
- Pkt 20: R1 generates a new triggered update containing the new cost (i.e., 4) for network 192.168.10.0/24
  - The count to infinity has started

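The mechanics of the count to infinity can be shown with a deliberately simplified model: two routers only, no Split Horizon, no Hold Down and no timers. It is not a replay of the three-router capture; the function name and the alternating update order are assumptions. Each value is the metric a router currently advertises for the failed network.

```python
INFINITY = 16  # RIP metric meaning "unreachable"


def count_to_infinity(stale_metric):
    """Return the (a, b) advertised metrics after each update exchange.

    Router a has just lost the route; router b still holds a stale route
    that points back to a.
    """
    a, b = INFINITY, stale_metric
    history = [(a, b)]
    while a < INFINITY or b < INFINITY:
        a = min(b + 1, INFINITY)  # a believes b's stale advertisement
        history.append((a, b))
        b = min(a + 1, INFINITY)  # b's next hop is a, so b follows the increase
        history.append((a, b))
    return history


for a_metric, b_metric in count_to_infinity(stale_metric=2):
    print(a_metric, b_metric)
```

The metrics climb one step per exchange until both routers reach 16, which is why the process terminates at all: RIP's small "infinity" bounds the loop.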
How does HoldDown actually work? (1)
- Some (unfortunate) considerations on the HoldDown timer, based on the previous example
- Let’s concentrate on router R3
- Let’s now analyze the packets generated/sent by R3
  - #16 (R2 -> R3): 10.0 @ cost 2
  - #17 (R3 -> R1): 10.0 @ cost 3
    - It means that R3 knows how to reach 10.0, and that the route goes toward R2 (which just announced it in #16)
    - As further confirmation, #17 has the NextHop field set to NULL, which means that R1 is not the next hop for that route
  - #21 (R1 -> R3): 10.0 @ cost 4
    - R3 should not accept that route, as it comes from a router that is not its next hop for that destination (in fact, the next hop is R2) and it contains a higher cost than the current one

How does HoldDown actually work? (2)
- #24 (R2 -> R3): 10.0 @ cost 5
  - This announcement comes from R3’s next hop router. Hence it should activate the HoldDown timer
  - If so, R3 should still propagate the cost of network 10.0 as 4
- #25 (R3 -> R1): 10.0 @ cost 6
  - Something has gone wrong. In fact, R3 now propagates network 10.0 at cost 6, which means that R3 accepted the modification that came from R2 in the previous packet
  - Apparently, the HoldDown had no effect here

RIP in the steady state (with Split Horizon)
- Differences compared to the no-split-horizon case
  - Announcements do not include the reverse route
  - NextHop always NULL (in the current topology)

DV (R1 -> LAN1), Pkt 1
192.168.12.0/24
192.168.13.0/24
192.168.23.0/24
192.168.100.0/24

DV (R1 -> R2), Pkt 5
192.168.10.0/24
192.168.13.0/24

Capture file: rip-update-splithorizon.acp

Debugging RIP in the steady-state (with SH)
Router1#debug ip rip
RIP protocol debugging is on
*Jan 24 07:01:35.623: RIP: received v2 update from 192.168.13.2 on Vlan13
*Jan 24 07:01:35.623:     192.168.23.0/24 via 0.0.0.0 in 1 hops
*Jan 24 07:01:35.627:     192.168.100.0/24 via 0.0.0.0 in 1 hops
*Jan 24 07:01:39.631: RIP: sending v2 update to 224.0.0.9 via Vlan13 (192.168.13.1)
*Jan 24 07:01:39.631: RIP: build update entries
*Jan 24 07:01:39.631:     192.168.10.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:01:39.631:     192.168.12.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:01:41.379: RIP: received v2 update from 192.168.12.2 on Vlan12
*Jan 24 07:01:41.379:     192.168.23.0/24 via 0.0.0.0 in 1 hops
*Jan 24 07:01:41.379:     192.168.100.0/24 via 0.0.0.0 in 1 hops
*Jan 24 07:01:46.899: RIP: sending v2 update to 224.0.0.9 via Vlan12 (192.168.12.1)
*Jan 24 07:01:46.899: RIP: build update entries
*Jan 24 07:01:46.899:     192.168.10.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:01:46.899:     192.168.13.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:01:49.803: RIP: sending v2 update to 224.0.0.9 via FastEthernet0 (192.168.10.1)
*Jan 24 07:01:49.803: RIP: build update entries
*Jan 24 07:01:49.803:     192.168.12.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:01:49.803:     192.168.13.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:01:49.803:     192.168.23.0/24 via 0.0.0.0, metric 2, tag 0
*Jan 24 07:01:49.803:     192.168.100.0/24 via 0.0.0.0, metric 2, tag 0
*Jan 24 07:02:04.391: RIP: received v2 update from 192.168.13.2 on Vlan13
*Jan 24 07:02:04.391:     192.168.23.0/24 via 0.0.0.0 in 1 hops
*Jan 24 07:02:04.395:     192.168.100.0/24 via 0.0.0.0 in 1 hops
*Jan 24 07:02:06.407: RIP: sending v2 update to 224.0.0.9 via Vlan13 (192.168.13.1)
*Jan 24 07:02:06.407: RIP: build update entries
*Jan 24 07:02:06.407:     192.168.10.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:02:06.407:     192.168.12.0/24 via 0.0.0.0, metric 1, tag 0
*Jan 24 07:02:09.703: RIP: received v2 update from 192.168.12.2 on Vlan12
Router1#no debug all
All possible debugging has been turned off
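
The per-interface announcements in the debug output above can be reproduced with a very small Split Horizon filter. This is a sketch under assumptions: the `ROUTES` table and `build_dv` are hypothetical names, metrics are the advertised ones, and networks 192.168.23.0/24 and 192.168.100.0/24 are assumed to be learned on both Vlan12 and Vlan13, since R1 receives them at 1 hop from both neighbors.

```python
# R1's routes: prefix -> (advertised metric, interfaces the route is attached to
# or was learned from). Insertion order matches the debug output.
ROUTES = {
    "192.168.10.0/24": (1, {"FastEthernet0"}),
    "192.168.12.0/24": (1, {"Vlan12"}),
    "192.168.13.0/24": (1, {"Vlan13"}),
    "192.168.23.0/24": (2, {"Vlan12", "Vlan13"}),
    "192.168.100.0/24": (2, {"Vlan12", "Vlan13"}),
}


def build_dv(out_interface, split_horizon=True):
    """Return the (prefix, metric) pairs announced on out_interface."""
    dv = []
    for prefix, (metric, via) in ROUTES.items():
        if split_horizon and out_interface in via:
            continue  # never announce a route back where it came from
        dv.append((prefix, metric))
    return dv


for interface in ("Vlan13", "Vlan12", "FastEthernet0"):
    print(interface, build_dv(interface))
```

With `split_horizon=False` the same function returns all 5 networks on every interface, matching the earlier no-split-horizon steady state.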
Adding a new router in RIP (with SH) (1)
- We begin with R2 and R3
  - Pkts 1-6: steady state
  - Only R2 and R3 are active
  - Some DVs contain 2 routes, others contain 3 routes
- R1 is turned on
- Pkts 7, 8, 10: RIP Requests from R1 on its three interfaces
- Pkts 9, 11: RIP Updates (i.e., answers from R2 and R3 to the previous requests)
  - Each DV contains 3 networks
    - The ones currently known by R2 and R3, excluding the network in common with R1
  - Please note that those updates are sent in unicast
- Pkts 12-14: new DVs, as calculated by R1
  - The DV on the LAN contains 4 networks
  - The others contain 2 networks

Adding a new router in RIP (with SH) (2)
- Pkts 15-16: triggered updates generated by R2, containing only the newly learned network (192.168.10.0/24)
  - Please note that the triggered update is not sent on the interface toward R1, as there is nothing new to propagate
- Pkts 17-28: triggered updates generated by R3, containing only the newly learned network (192.168.10.0/24)
  - Please note that the triggered update is not sent on the interface toward R1, as there is nothing new to propagate
- Pkts 191-end: DVs in the steady state
  - Each DV contains a different number of routes
  - DVs are sent (approx) every 30 seconds

Capture file: rip-request-splithorizon.acp

Fault on RIP (with Split Horizon) (1)
- Pkts 1-10: normal behavior (no fault)
- We shut down the interface toward LAN1 on R1
- Pkts 11-12: R1 generates a triggered update with Route Poisoning
  - The DV contains only the faulty network 192.168.10.0/24, announced with cost = 16

Fault on RIP (with Split Horizon) (2)
- Pkts 13-18: the triggered update with Route Poisoning spreads across the whole network
  - R2 and R3 propagate the news about 192.168.10.0/24 immediately, on all the interfaces
  - Please note that the Route Poisoning mechanism is not affected by Split Horizon and messages are sent back to R1 as well
    - The objective is to spread the knowledge of a missing network as soon as possible, hence Route Poisoning is not subject to the Split Horizon rule
  - It is not really clear why the HoldDown timer does not come into play here
    - Apparently, the situation is the same as in the no-split-horizon case, but the behavior recorded on the network is very different

Capture file: rip-fault-splithorizon.acp

Fault on RIP (with Split Horizon) (3)
- Pkts 19-34: traditional DVs, generated by the three routers, according to the standard timing of the update timer
  - Include the network 192.168.10.0/24 at cost 16
  - This phase lasts 1 minute (the Flush Time interval)
- Pkts 35-end
  - Traditional DVs, generated by the three routers, according to the standard timing of the update timer
  - Those DVs no longer include the network 192.168.10.0/24

Turning off a router (with Split Horizon) (1)
- Let’s simulate a fault on router R1
  - We turn off the RIP routing process on R1, so that the links toward R2 and R3 will still be active
  - We force R2, then R3, to re-calculate the routing table by typing the following administrative command
    clear ip route *
- Result
  - R3 immediately recognizes that network 192.168.10.0/24 is down
  - R2 has to wait for the HoldDown timer to expire

Turning off a router (with Split Horizon) (2)
Why?
- R2 was reset first, hence it asks R3 for a routing table (RIP Request message)
- R3 still had the “old” routing table (it does not know that R1 is no longer present), hence it also returns a route toward 192.168.10.0/24
- R3 will ask R2 for the routing table later (since it was reset second), but Split Horizon prevents R2 from sending network 192.168.10.0/24 to R3
- R2 will converge when the Invalid Time (180s) expires, and it purges that entry from the routing table 60 sec later (Flush Timer)

Turning off a router (with Split Horizon) (3)
Router2#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route
Gateway of last resort is not set
C    192.168.12.0/24 is directly connected, Vlan12
R    192.168.13.0/24 [120/1] via 192.168.100.3, 00:00:29, FastEthernet0
                     [120/1] via 192.168.23.2, 00:00:02, Vlan23
R    192.168.10.0/24 [120/2] via 192.168.100.3, 00:02:29, FastEthernet0
                     [120/2] via 192.168.23.2, 00:02:29, Vlan23
C    192.168.23.0/24 is directly connected, Vlan23
C    192.168.100.0/24 is directly connected, FastEthernet0
...
C    192.168.12.0/24 is directly connected, Vlan12
R    192.168.13.0/24 [120/1] via 192.168.100.3, 00:00:14, FastEthernet0
                     [120/1] via 192.168.23.2, 00:00:10, Vlan23
R    192.168.10.0/24 is possibly down, routing via 192.168.100.3, FastEthernet0
C    192.168.23.0/24 is directly connected, Vlan23
C    192.168.100.0/24 is directly connected, FastEthernet0
...
C    192.168.12.0/24 is directly connected, Vlan12
R    192.168.13.0/24 [120/1] via 192.168.100.3, 00:00:28, FastEthernet0
                     [120/1] via 192.168.23.2, 00:00:23, Vlan23
C    192.168.23.0/24 is directly connected, Vlan23
C    192.168.100.0/24 is directly connected, FastEthernet0

Lessons learned (1)
- Open specs are useful, open source code is even better
- Our “reverse engineering” work was definitely frustrating
  - Several sources of info are available in books and on the Internet, partially different, and in any case not matching real behavior
  - We tried to define the actual behavior of Cisco RIP, but some question marks are still there
  - The final result is that, despite our efforts, we were not able to predict exactly how Cisco RIP behaves
  - Btw, have you noticed that nobody on the Internet (or in books) discusses real captures? Everything looks just fine when drawing some pictures on paper…

Lessons learned (2)
- The protocol is robust enough to converge anyway
  - Even if we do not know exactly why
  - Even if different implementations have different behaviors
  - It’s a matter of time; if you wait long enough, RIP will find a way to converge
- This is the result everybody is interested in
  - It works, that is enough
  - Although… we, as engineers, are a little bit disappointed