On Tue, Sep 13, 2011 at 10:17 PM, Dennis Yusupoff dyr@smartspb.net wrote:
All of you, I think, know about the connectivity problem with Yandex services on 19 Aug 2011. According to a comment from their IT team, it happened when all BGP routes were mistakenly redistributed into OSPF, which led to memory exhaustion on the routers, spreading like a snowball from one to the next. As far as I know, there is only one way to avoid it: limiting the maximum number of routes redistributed into OSPF (Cisco example) at the ASBR, which is explicitly a Single Point of Failure (SPoF). What happens if it fails, you've just read above.
I wouldn't call it a 'SPoF'; otherwise we would have to call almost any BGP policy a SPoF as well. Needless to say, one can break the whole network with 1-2 BGP configuration commands ;) We have seen it many times before. The only real SPoF in any network is a human being. There are many ways to deal with this problem (limiting the failure domain, for example).
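To make the route limit mentioned above concrete, here is a minimal Python sketch of the idea, not any vendor's implementation. The class name `RedistributionGuard`, its fields, and the generated prefixes are all hypothetical and exist only for illustration; a real ASBR enforces such a limit in its own redistribution code.

```python
from dataclasses import dataclass, field


@dataclass
class RedistributionGuard:
    """Hypothetical model of a max-prefix limit on BGP-to-OSPF redistribution."""
    max_prefixes: int
    accepted: set = field(default_factory=set)
    rejected: int = 0

    def offer(self, prefix: str) -> bool:
        """Accept a prefix unless doing so would exceed the configured limit."""
        if prefix in self.accepted:
            return True
        if len(self.accepted) >= self.max_prefixes:
            self.rejected += 1
            return False
        self.accepted.add(prefix)
        return True


def leak_full_table(guard: RedistributionGuard, count: int):
    """Simulate a mistaken redistribution of `count` distinct BGP routes."""
    for i in range(count):
        # Made-up /24 prefixes, unique for count <= 65536.
        guard.offer(f"10.{i // 256 % 256}.{i % 256}.0/24")
    return len(guard.accepted), guard.rejected


if __name__ == "__main__":
    guard = RedistributionGuard(max_prefixes=100)
    print(leak_full_table(guard, 1000))  # (100, 900)
```

The guard only bounds what this one router injects into the IGP; if the guard itself is misconfigured or absent, nothing downstream stops the flood, which is the concern raised in the original message.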
So would the respected community consider requesting a change to the RFC to avoid such a SPoF in the future, making it possible to limit the maximum number of OSPF (or, more broadly, any IGP) routes on any OSPF router?
All routers in an OSPF area have an identical link-state database (and corresponding graph) as a loop prevention mechanism. Are you suggesting removing this fundamental requirement (which, IMHO, means creating a completely different, new routing protocol)?
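A small Python sketch can illustrate why the identical database matters. It assumes a made-up four-router topology (A, B, C, D with arbitrary link costs) and a simplified model in which each router runs Dijkstra on its own copy of the database and forwards to the resulting first hop. The function names `next_hop` and `forward` are illustrative only, and the sketch is not a model of real OSPF flooding or LSA handling.

```python
import heapq

# Hypothetical topology: router -> {neighbor: link cost}.
FULL = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 5},
    "D": {"B": 1, "C": 5},
}
# The same topology as seen by a router that dropped the B-D link.
TRUNCATED = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1},
    "C": {"A": 1, "D": 5},
    "D": {"C": 5},
}


def next_hop(lsdb, src, dst):
    """Run Dijkstra on one router's LSDB copy; return the first hop toward dst."""
    dist = {src: 0}
    heap = [(0, src, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node == dst:
            return hop
        for nbr, cost in lsdb.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr, nbr if node == src else hop))
    return None  # dst unreachable in this view


def forward(views, src, dst):
    """Forward hop by hop; each router consults only its own LSDB copy."""
    path = [src]
    while path[-1] != dst:
        hop = next_hop(views[path[-1]], path[-1], dst)
        path.append(hop)
        if hop is None or path.count(hop) > 1:
            break  # unreachable, or the packet revisited a router (loop)
    return path


consistent = {router: FULL for router in FULL}
diverged = dict(consistent, B=TRUNCATED)  # only B has a smaller database

if __name__ == "__main__":
    print(forward(consistent, "A", "D"))  # ['A', 'B', 'D']
    print(forward(diverged, "A", "D"))    # ['A', 'B', 'A']
```

With identical databases the packet reaches D; once B alone works from a truncated database, A and B each consider the other the best next hop and the packet bounces between them.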
I beg your pardon if this subject is far from ENOG or my networking skills aren't enough to make such a proposal or conclusion.
If you are interested in OSPF development/improvement, there is an IETF WG for OSPF: http://datatracker.ietf.org/wg/ospf/charter/