Re: SCTP failover

From: Randall Stewart <rrs@cisco.com>
Date: Fri Feb 11 2005 - 16:04:21 EST
IMO considering Armando's work and all.. we really need:

1) 2 levels of failure.
     a) When you hit a Retransmit limit (set to say 1) you
        switch to the alternate
     b) When you hit a Max retransmit you mark the destination
        as down.
     c) If the HB sent to the place the t-o happened is successful
        then you clear your count and the primary stays where it
        was... Note you should send a HB right after a RTO on the
        place where the RTO happened IMO

2) Always send T-O's to the alternate.. not FR's since Armando's
    work shows this is not a good idea... and besides you don't
    switch off due to one loss.. after all you still have an
    ACK clock or you would not be FR'ing.

3) CMT, which is what Jana is working on, may be an even better
    method assuming we can find shared bottlenecks.
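
A rough sketch of the two-level scheme in 1), plus the "T-O's go to
the alternate" rule from 2), might look like the following. This is
only an illustration for a two-path association: every name here
(Path, Association, SWITCH_LIMIT, PATH_MAX_RETRANS) is made up for the
example and is not taken from any stack, and the heartbeat is just
recorded in a list rather than sent. It also assumes "hit" means
"reaches" for the switch limit and "exceeds" for the max-retransmit
limit.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
SWITCH_LIMIT = 1      # level (a): consecutive T-Os before data moves to the alternate
PATH_MAX_RETRANS = 5  # level (b): path is marked down once the count exceeds this


@dataclass
class Path:
    name: str
    error_count: int = 0
    down: bool = False


class Association:
    """Two-path association: one primary, one alternate (an assumption)."""

    def __init__(self, primary, alternate):
        self.primary = primary
        self.alternate = alternate
        self.log = []  # stand-in for packets we would actually send

    def current_destination(self):
        """Where new data goes. The primary setting itself never changes."""
        p = self.primary
        if p.down or p.error_count >= SWITCH_LIMIT:
            return self.alternate
        return p

    def on_timeout(self, path):
        """Handle a T-O on `path`; return where the retransmission should go."""
        path.error_count += 1
        if path.error_count > PATH_MAX_RETRANS:
            path.down = True
        # Probe the path that timed out right away, as suggested in 1c).
        self.log.append(("HB", path.name))
        # Timeout retransmissions always go to the other path.
        return self.alternate if path is self.primary else self.primary

    def on_heartbeat_ack(self, path):
        """A successful HB clears the count, so data returns to the primary."""
        path.error_count = 0
        path.down = False
```

With these assumed values, one T-O on the primary moves new data to
the alternate, and a single HB-ACK from the primary moves it back
without the primary ever being reassigned.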

R

Anatoly Khusid wrote:
>>If you need faster failover, you could change the
>>Path.Max.Retrans setting to limit the number of consecutive timeouts
>>that will trigger the failover.
>
>
> If you modify SCTP provisioning to be something other than the proposed
> defaults, I thought this might cause significant performance impacts?
>
> -----Original Message-----
> From: Ryan W Bickhart [mailto:bickhart@cis.udel.edu]
> Sent: Friday, February 11, 2005 3:23 PM
> To: David Lehmann
> Cc: sctp-impl@external.cisco.com
> Subject: Re: SCTP failover
>
>
> I'm not sure switching over to the alternate after seeing a single FR is
> the best approach for all cases though. Imagine a situation where the
> alternate is significantly slower or less desirable than the primary. A
> random FR on the primary may not necessarily be grounds for abandoning
> it right away. If you need faster failover, you could change the
> Path.Max.Retrans setting to limit the number of consecutive timeouts
> that will trigger the failover. Rather than 1 + 2 + ... + 32 = 63
> seconds, you could set PMR to 0 or 1 and fail over immediately or in
> about 1 second if desired. I think the suggested change to the wording
> in 6.4 is actually already covered by the ability to set the PMR.
>
> ---Ryan
>
>
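
For reference, the 63-second figure quoted above can be reproduced
with a few lines. This is a sketch under stated assumptions: the RTO
starts at 1 second, doubles on every timeout, is capped at 60 seconds,
and the path is declared failed once the number of consecutive
timeouts exceeds PMR. The function name and parameters are
hypothetical, picked for this example only.

```python
def seconds_until_down(pmr, rto_initial=1.0, rto_max=60.0):
    """Total time spent in consecutive timeouts before the path is failed.

    Assumes the path fails when the timeout count exceeds `pmr`,
    i.e. after pmr + 1 timeouts, with the RTO doubling each time.
    """
    total = 0.0
    rto = rto_initial
    for _ in range(pmr + 1):
        total += rto
        rto = min(rto * 2, rto_max)  # exponential backoff, capped
    return total


print(seconds_until_down(5))  # 1 + 2 + 4 + 8 + 16 + 32 = 63.0
```

Lowering PMR shortens the sum by dropping the largest terms first,
which is why it has such a large effect on failover time.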
> On Fri, 2005-02-11 at 14:33 -0500, David Lehmann wrote:
>
>>David Lehmann wrote:
>>
>>>Randall Stewart wrote:
>>>
>>>
>>>>In any event the only way to keep the network failover time
>>>>down is to set RTO.Max to a lower value.. that would make things
>>>>faster... To have a 1 second failover I would imagine that
>>>>Ulticom's stack is setting both RTO.Min and RTO.Max to a
>>>>lower value... aka that adds up to a total of 1 second..
>>>>I.e. something like 50ms RTO.Min and 400ms RTO.Max
>>>
>>>
>>>Nope. When the error count is greater than 0 on the
>>>primary destination, Ulticom uses an alternate destination,
>>>until the error is cleared on the primary destination.
>>>So, if the user is using the default values, the user
>>>will only see a blip for one second.
>>>(...assuming an RTO of 1 second)
>>
>>I need to correct my former post. Assuming a steady flow
>>of data chunks, the switch-over with Ulticom's stack will
>>actually be in a fraction of a second since the FR will
>>notice the problem long before the RTO.
>>
>>
>>>IMHO, waiting around for 1 minute while data trickles through
>>>the association is not acceptable and does not give the user
>>>a sense of transparent network redundancy.
>>>
>>>IMHO, the language in section 6.4 should be changed from:
>>> By default, an endpoint SHOULD always transmit to the primary path,
>>> unless the SCTP user explicitly specifies the destination transport
>>> address (and possibly source transport address) to use.
>>>to something like:
>>> By default, an endpoint SHOULD always transmit to the primary path,
>>> unless the error count on the destination is greater than zero or the
>>> SCTP user explicitly specifies the destination transport address (and
>>> possibly source transport address) to use.
>>>
>>
>>
>

-- 
Randall Stewart
ITD
803-345-0369 <or> 815-342-5222
Received on Fri Feb 11 16:07:52 2005
