Dedicated server hosting? -- sanity checks

Michael T. Halligan michael at halligan.org
Wed May 28 14:32:36 PDT 2003


= Your test scenario also covers well why I tend to fear using most
= Intel boxes and often Linux in a colo.

I agree with that.  In an ideal world, I'd have all of my customers
using REAL hardware.  But when I can build 8 dual-proc Intel
boxes for the price of one 420R, guess who wins?


= PCWeasel or Compaq's "Lights Out"  stuff is imperative for remote
= management.

Motherboards which support Ethernet Management Protocol are a nice
solution as well.

= Serial console access is imperative.
= "the screen says..." question is silly.  Where would you have a screen?
= Why?  If you can't tell it to boot from disk 2 (where you keep a spare
= root partition for that moment), then you'll be driving a fair bit.
= You ssh to your terminal server and YOU see "LI" on the serial port.
= Another port on the TS may handle power cycling your box.

To me, a better design philosophy is to avoid hard drives as boot media
on small servers.  I'm debating making all of my dedicated servers boot
via pxelinux, with mirrored hard drives in each server used solely for
customer data.  That data would be backed up via rsync or amanda, or kept
in sync with a "massive" central file server using something like AFS or
Coda.  The root file system would be a series of symlinks to NFS mounts,
a linkfarm if you will.  That way I can perform upgrades without
affecting my managed users, then, once I'm satisfied that adequate
testing has been done, just send a reboot command to their boxes.
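
To make the linkfarm idea concrete, here is a rough sketch in Python.
It is not what I actually run.  The function name, the directory names
and the /mnt/nfs/release-N layout are all made up for illustration.  It
just points a set of top-level directories at an NFS-mounted release
tree, and re-points them when a new release is ready.

```python
import os
from pathlib import Path


def build_linkfarm(root, nfs_base, names):
    """Make root/<name> a symlink to nfs_base/<name> for each name.

    Hypothetical helper: returns the list of links it created or
    re-pointed.  Links that already point at the right target are left
    alone, and real files or directories are never overwritten.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    changed = []
    for name in names:
        link = root / name
        target = Path(nfs_base) / name
        if link.is_symlink():
            if os.readlink(link) == str(target):
                continue  # already correct, nothing to do
            link.unlink()  # stale link from an older release
        elif link.exists():
            raise FileExistsError(f"{link} exists and is not a symlink")
        os.symlink(target, link)  # target may dangle until NFS is mounted
        changed.append(link)
    return changed
```

Calling build_linkfarm("/", "/mnt/nfs/release-2", ["usr", "opt"]) on a
test box would swing those directories over to the new tree.  The
unlink-then-symlink step is not atomic, which is one more reason to
follow it with a reboot rather than trust a live switch.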

= I'm moving some stuff local because I have a guy down the street
= with fractional T3 access (via 802.11a and a big freaking antenna
= to a colo 7 miles away in Oakland).  But my house terminal server
= is still the main access method (heaven forfend that I can't handle
= it while on a trip or even just upstairs).

What street do you live on? :)

On that note, there's an ISP in my datacenter that happens to have a
warehouse about 1/2 mile line of sight from mine.  I'm thinking of doing
the same.

= I don't really want NOC people to be touching my machines.  I
= CLEARLY mark the power and ethernet.  I'll tape over unused ports
= (parallel, USB, etc).  If it can be plugged in wrong, it will be.

When I was at (mid-sized horrible company) hosting with (large, 400 lb
gorilla hosting provider) we had 2 Cisco 7200s.  We had turned one of
them off while waiting for new routers to get approved, because it had
a bad port which brought down our connectivity.  Once a month for 3
months, some genius NOC monkey would plug it back in, even though it
was unplugged from the console server, power, and uplink.  That only
stopped when we put a large "If you touch this, you will be fired" sign
on the router.

= The win of a colo is bursts of HUGE bandwidth - like a store at a
= mall gets bursts of infinite parking spaces - but high on-average
= uses will still cost.  It's just that now the Colo is a ready drop
= point for one of several providers.

Bursting is bad network design, and even worse budget design, especially
if your provider has an N*PRICE rate for bursting beyond commit rates.
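
A quick back-of-the-envelope sketch of why.  The numbers and the
function name are hypothetical, and I'm assuming the simplest possible
model: you always pay for the commit, and anything billed above it
costs N times the per-Mbps price.  How a given provider measures the
billed rate is not covered here.

```python
def monthly_bill(commit_mbps, billed_mbps, price_per_mbps, burst_multiplier):
    """Monthly cost under an assumed commit-plus-overage model.

    The commit is paid in full even if unused; every Mbps billed above
    the commit is charged at burst_multiplier * price_per_mbps.
    """
    overage = max(0, billed_mbps - commit_mbps)
    return commit_mbps * price_per_mbps + overage * price_per_mbps * burst_multiplier


if __name__ == "__main__":
    # Made-up figures: $100 per Mbps, bursting billed at 3x.
    print(monthly_bill(10, 15, 100, 3))  # 10 Mbps commit, billed at 15
    print(monthly_bill(15, 15, 100, 3))  # same traffic, 15 Mbps commit
```

With those made-up figures, bursting 5 Mbps over a 10 Mbps commit comes
to 2500, while simply committing to 15 Mbps comes to 1500.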

= DIVERSITY:
= Previous work used Level3 because we could hook in in SF and NYC
= and in Europe.  Our "intraoffice" packets went over their network
= and could have bandwidth guarantees.  Didn't have, but should have,
= machines in geographically diverse places.  The Bay moves.  That's
= gonna suck one day.  My little SPARC 10 hosts DNS for 200 domains
= and backup MX for a bunch of domains which I trade with another
= guy who's got good connectivity in the Bay.

A side note: secondary.com does a GREAT job of providing redundant
secondary DNS, and offers decent prices for large numbers of
servers.

= PRESUME the machine will crash.  Presume that the NOC staff just
= barely graduated 8th grade and were turned down at the McTacoKing.
= Presume that moving parts will fail (fans, disks).  Presume it
= will happen during an earthquake whose only damage is to wreck
= your car. 
= -How much does 3 days down cost you?
= -Can you change DNS from another place to get packets routed to
=  a low-bandwidth desperate recovery site (even just a page that
=  says "down for maintenance, back on monday")?
= 
= Now build your box.

-------------------
Michael T. Halligan
Chief Geek
Halligan Infrastructure Designs.
http://www.halligan.org/
2250 Jerrold Ave #11
San Francisco, CA 94124-1012
(415) 824.4453 - Home/Office
(415) 724.7998 - Mobile



