good show
Jay R. Ashworth
jra at baylink.com
Thu Jul 29 11:29:40 PDT 2004
On Wed, Jul 28, 2004 at 11:55:32PM -0400, Bill Vermillion wrote:
> > "Secondary MX" is the common name for a combination of techniques
> > intended to reduce mail delivery failures.
>
> > To have this really be reliable, you need (at least) two things, in
> > each of two categories:
>
> > You need for the master DNS zone for your domain to be served
> > from at least 2 machines, and preferably 3 or 4, *on different
> > backbones and uplink providers*. This way, mail will never
> > bounce with "can't resolve domain", which is a soft bounce (the
> > sending SMTP server will usually retry for up to 5 days).
>
> It all depends upon where your machines are and just how reliable
> you must be. Five 9's is easily doable - six 9's [ approximately
> 30 seconds downtime per year ] gets to be a bit expensive.
A nice table is at
http://www.eventhelix.com/RealtimeMantra/FaultHandling/reliability_availability_basics.htm
Along with some other info on the topic.
And remember, these are usually *system* reliability numbers, not
component. Engineering high system availability with common
components is the Holy Grail.
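For anyone who wants to check the "nines" arithmetic, here is a minimal sketch. It assumes a 365.25-day year and, for the redundancy part, that component failures are independent (which shared power, links, or buildings will violate). The function names are mine, not from any library.

```python
# Rough availability arithmetic; a sketch, not a reliability model.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average year length


def downtime_seconds_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (5 -> 99.999%)."""
    unavailability = 10.0 ** -nines
    return unavailability * SECONDS_PER_YEAR


def series_availability(components):
    """System needs every component up: availabilities multiply."""
    result = 1.0
    for a in components:
        result *= a
    return result


def parallel_availability(components):
    """System needs any one component up; assumes independent failures."""
    all_down = 1.0
    for a in components:
        all_down *= (1.0 - a)
    return 1.0 - all_down


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, "nines:", round(downtime_seconds_per_year(n), 1), "s/year")
    # Two ordinary 99% boxes, either of which can carry the load:
    print("two 99% boxes in parallel:", parallel_availability([0.99, 0.99]))
```

Five nines comes out a little over five minutes a year and six nines a little over thirty seconds. The parallel case is why redundancy with common components is attractive: two 99% machines give roughly 99.99%, but only if they really do fail independently.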
> Five 9's is about 5 minutes per year. I was averaging that for the
> past 2 years until 2AM Monday morning when a Cisco 7120 decided to
> get finicky. We lost a total of about 3 hours connection time
> from 2AM to 6AM when I configured a machine to act as a router.
> That's 4 hours total outage since March 2000. Some of that lost
> time was bringing the Cisco back up and then watching it fall over
> again - while I was on the phone to tech support - in Australia.
:-)
> > You need at least one extra machine to actually *receive mail*
> > for your domain. These machines must have public, static IP
> > addresses, and properly administered SMTP mail systems.
> > You configure them in your DNS zone as additional MX records,
> > with higher numbers in their MX records (and therefore lower
> > priority).
>
> > If a sending system tries to get mail to you, and for some reason
> > cannot contact your primary MX server, it will try your secondaries in
> > descending priority (ascending numerical) order. Hopefully, *one* of
> > them will be accessible. As usual, the optimal situation is to have
> > your secondaries in physically separate locations, on different
> > backbones, just like your DNS servers.
>
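The MX ordering described above can be sketched in a few lines. This is only an illustration: the hostnames are hypothetical, the `is_reachable` callback stands in for a real SMTP connection attempt, and real mail servers also shuffle among records that share the same preference value, which this sketch does not do.

```python
# Sketch of how a sending server walks a domain's MX records.
# Records are (preference, hostname); a LOWER number means HIGHER priority.


def delivery_order(mx_records):
    """Return hostnames in the order a sender would try them."""
    return [host for _pref, host in sorted(mx_records, key=lambda r: r[0])]


def first_reachable(mx_records, is_reachable):
    """Return the first host that accepts a connection, or None if all fail."""
    for host in delivery_order(mx_records):
        if is_reachable(host):
            return host
    return None  # sender queues the message and retries later


if __name__ == "__main__":
    # Hypothetical zone: one primary and two secondaries.
    records = [
        (20, "mx2.example.net"),
        (10, "mail.example.com"),
        (30, "mx3.example.org"),
    ]
    down = {"mail.example.com"}  # pretend the primary is unreachable
    print(delivery_order(records))
    print(first_reachable(records, lambda h: h not in down))
```

With the primary marked down, the sender falls through to the preference-20 host, which is the behaviour the secondaries exist for.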
> Optimal can be expensive. And it depends on your needs and
> it depends on your backbone. I said I now totalled about 4
> hours of downtime in 4.5 years. The backbone I'm connected to has
> essentially no downtime. There have been moments when they were
> reconfiguring - and I had advance notice - and while they said
> they expected the network to be unavailable for up to 5 minutes, I
> never saw that much. And that last notice like that was two years
> ago. There are more phone companies in that building than I can
> count. And the last time I was in the carrier side the line
> of Lucent Ascend devices stretched for many many feet. I made a
> rough estimate of 30,000 dial in connections at that time - and
> they've probably added more since then.
I'll bet.
> I'm on a 40Gb/s global backbone - Level 3. Fibre comes into the
> building from three separate locations. The battery room has
> almost as much square footage as my house. Those will keep
> everything running for 6 to 8 hours. And the ONLY reason those are
> there is in case the diesel generator doesn't start. The diesel
> turns on in seconds after any power failure. It has a 6000 gallon
> tank and puts out 1,250,000 watts - Caterpillar unit.
I'll tell you what I told Mark:
Sure, the AT&T 5E in the 6th subbasement of WTC2 kept running until
almost 1600 on 9/11, but it didn't matter much, did it?
> If you are a huge company - then having secondaries and DNS in
> separate locations - may be a requisite. But depending on your
> needs and what you use for a backbone, a separate backbone may
> not be necessary.
Oh sure. But I was speaking pedagogically; not telling John what *he*
needed.
> > These secondary servers are configured to accept the mail
> > for your domain, but not try local delivery -- they then
> > attempt the delivery to your primary server themselves, for
> > however long your secondaries are configured to try -- which,
> > hopefully, you're in control of.
>
> I handle secondary MX for a colo client with a flock of domains.
> His machines get swamped at times - and I have been inside his [he
> has given me access] and he's woefully short on memory. So he can
> get his mail server bogged down and things will come over to my
> secondary MX machines until his machines start breathing normally
> again.
>
> If you don't control your secondaries you should at least be able
> to specify how long you want things to be held. A decent provider
> should do that for you. However I've seen some places just totally
> nuke the queues on a daily basis. Those are usually smaller ISPs.
Yikes. If they do, IMHO, they're not *providing* secondary MX service.
They're playing games.
> > Worst case, if your machine is running but your link has suffered
> > backhoe fade, you might be able to sneakernet the mail spool from a
> > secondary to the primary for delivery.
>
> Only if they are close by and if you don't have a huge amount of
> mail to deliver.
Or you have a cablemodem at home, and a CD burner.
> > The highest bandwidth data transport known to mankind is a FedEx plane
> > full of DVD-ROM's. (This used to be a station wagon full of magtape,
> > when someone at Duke coined it about Usenet; I've clearly updated.)
>
> The problem with that is that it has a huge bandwidth but a very
> poor temporal timeframe. And when the station wagon full of mag
> tapes analogy was made most backbones were in the 56K range as I
> recall.
Yeah, the lagtime is horrible, but the bandwidth is *still* high. :-)
> 25,000 units and $5,000,000,000 later the new unit - CRS-1 [Carrier
> Router System] is the one that will replace those. Bandwidth needs
> have grown faster than anyone had imagined.
Yeah. Good ghod...
> While a plane full of DVD-ROMs may have a higher aggregate
> bandwidth the time consumed burning those DVDs is one thing.
> And if you had to ship it overnight, those should be ready to go no
> later than 8PM for 10AM delivery.
>
> That's 14 hours, or 50400 seconds. In that time frame the CRS-1
> will be able to move over 4.6 exabits, or about 580 petabytes.
>
> 580 petabytes, in the standard terminology, is about 580 quadrillion
> bytes.
>
> So it might be a close race between the plane and the data
> providing you already have the DVDs made. :-)
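For the record, the race can be checked with back-of-envelope arithmetic. This sketch takes the 14-hour window and the 4.6 exabit figure from the message above as given; the 4.7 GB single-layer DVD capacity is my assumption, and the plane's payload is left out because nobody in the thread gave one.

```python
# Back-of-envelope check of the plane-versus-router race.
WINDOW_SECONDS = 14 * 3600        # 8PM pickup to 10AM delivery
EXABITS_MOVED = 4.6               # figure quoted in the thread

bits_moved = EXABITS_MOVED * 1e18
bytes_moved = bits_moved / 8
petabytes_moved = bytes_moved / 1e15

# Line rate implied by moving that much in the window, in terabits/second.
implied_rate_tbps = bits_moved / WINDOW_SECONDS / 1e12

# Assumption: single-layer DVD-ROM at 4.7 GB (decimal gigabytes).
DVD_BYTES = 4.7e9
dvds_needed = bytes_moved / DVD_BYTES

if __name__ == "__main__":
    print("window:", WINDOW_SECONDS, "seconds")
    print("moved: about", round(petabytes_moved), "petabytes")
    print("implied rate: about", round(implied_rate_tbps, 1), "Tb/s")
    print("DVDs to match: about", round(dvds_needed / 1e6), "million")
```

So 4.6 exabits is about 575 petabytes, and the plane would need something over a hundred million discs on board to keep up.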
Nice to know you're still on the ball, Bill.
Cheers,
-- jra
--
Jay R. Ashworth jra at baylink.com
Designer Baylink RFC 2100
Ashworth & Associates The Things I Think '87 e24
St Petersburg FL USA http://baylink.pitas.com +1 727 647 1274
"You know: I'm a fan of photosynthesis as much as the next guy,
but if God merely wanted us to smell the flowers, he wouldn't
have invented a 3GHz microprocessor and a 3D graphics board."
-- Luke Girardi