Dedicated server hosting? -- sanity checks

Alvin Oga alvin at maggie.linux-consulting.com
Wed May 28 15:39:32 PDT 2003


hi ya chuck

On Wed, 28 May 2003, Chuck Yerkes wrote:

> You've got to start using whole sentences with verbs and nouns and
> articles.  I can understand people with strong chinese and indian
> accepts, but I can't read half of your posts...
^^^^^^^^^^

I'm sorry, i couldn't resist :-)
didn't know if it was a test or intentional :-)

> Your test scenario also covers well why I tend to fear using most
> Intel boxes and often Linux in a colo.

which also eliminates the colo itself, as i wanna be able to fix
it "asap" and not wait around for them
 	- i can't wait for fedex either ( just my requirement )

yes, a sparc is a lot better in that it has remote console access
but some of the newer intel/amd motherboards now have those remote
console features

> In 4 years, the hard drive (old) died.  I replaced it via fedex and
> a guy there swapping.

some disk drives are rated at 1,000,000 hours MTBF ...
most cpu's are rated at 30,000 hours MTBF ...
	- a system's life span is cut in half for every 10C increase
	in operating temp above an ambient 25C

- things should NOT be dying that soon ... something else is wrong
  besides temp problems ...  ( bad/flaky drives ?? )
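
a rough sketch of the halve-per-10C rule of thumb above, in python. it reads the rated figures as hours and treats anything at or below 25C as "no derating" (both are assumptions of this sketch, not vendor data), and the function name is made up for illustration:

```python
HOURS_PER_YEAR = 24 * 365


def derated_life_hours(rated_hours, operating_c, ambient_c=25.0):
    """Halve the rated life for every 10C the operating temp is above ambient.

    Rule of thumb only. Temperatures at or below ambient are assumed to
    leave the rated figure unchanged (an assumption of this sketch).
    """
    excess_c = max(0.0, operating_c - ambient_c)
    return rated_hours * 0.5 ** (excess_c / 10.0)


if __name__ == "__main__":
    # 1,000,000 hours is the disk rating quoted in the post
    for temp_c in (25, 35, 45, 55):
        hours = derated_life_hours(1_000_000, temp_c)
        print(f"{temp_c}C: {hours:,.0f} hours (~{hours / HOURS_PER_YEAR:.1f} years)")
```

so a box running 20C hotter than ambient would, by this rule, get about a quarter of the rated figure.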

> I have no BIOS to fight, I have an "oh shit" backup of net booting
> if I really needed it.

that's the linux oh $%^@#$ and sometimes freebsd too
and a good admin should be able to eliminate most of the surprises

> PCWeasel or Compaq's "Lights Out"  stuff is imperative for remote
> management.
> 
> Serial console access is imperative.

yes, but i don't like (or allow) people hitting reset/power switches
	- that's where lots of problems start from

	- one should be able to fix things without rebooting

> "the screen says..." question is silly. 
> Where would you have a screen?

yuppers and very time consuming

or use a webcam for "seeing the screen" and point it at the
screen/machine under test...

> Why?  If you can't tell it to boot from disk 2 (where you keep a spare
> root partition for that moment), then you'll be driving a fair bit.
> You ssh to your terminal server and YOU see "LI" on the serial port.
> Another port on the TS may handle power cycling your box.

that'd be nice ... to be able to see bios/console messages from
silly PCs ...

not many seem to have those gizmos set up at the colos :-)

> I don't really want NOC people to be touching my machines.  I

:-)

> CLEARLY mark the power and ethernet.  I'll tape over unused ports
> (parallel, USB, etc).  If it can be plugged in wrong, it will be.
> 
> A tape drive may be useful.  Easy to swap drives IS useful.
> Mirroring (for cheap) or real external hardware RAID is compulsory
> (internal RAID cards can be a hazard for remote production machines).

yuppers ... and installing a redundant and mirrored machine
that costs an extra $500 - $1000 will save your butt one day
( cheap insurance policy for the "oops" that will happen one day )

i don't use tape drives ...
 
> BANDWIDTH:
..
> The win of a colo is bursts of HUGE bandwidth - like a store at a
> mall gets bursts of infinite parking spaces - but high on-average
> uses will still cost.  It's just that now the Colo is a ready drop
> point for one of several providers.

colo question ... if one goes over one's allotted 95th percentile,
how much extra are you charged for going over your allotted limit ??
	- he.net has big surprises at the low end of the bw scale
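
for reference, a minimal sketch of how 95th percentile billing is commonly computed: sort the bandwidth samples for the month, throw away the top 5%, and bill on the highest one left. the nearest-rank method, the function names, and the per-Mbps overage rate are all assumptions for illustration; each provider's contract spells out its own sampling interval and rates.

```python
def percentile_95(samples_mbps):
    """Return the 95th percentile sample using the nearest-rank method."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # integer form of ceil(0.95 * n), avoids float rounding surprises
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def overage_charge(samples_mbps, committed_mbps, rate_per_mbps):
    """Charge only for the part of the 95th percentile above the commit.

    rate_per_mbps is a hypothetical price, not any provider's real rate.
    """
    billable_mbps = percentile_95(samples_mbps)
    over_mbps = max(0.0, billable_mbps - committed_mbps)
    return over_mbps * rate_per_mbps


if __name__ == "__main__":
    # 100 samples: bursts in only 5% of them get discarded
    short_bursts = [1.0] * 95 + [50.0] * 5
    # bursts in 6% of the samples push the billable figure up
    longer_bursts = [1.0] * 94 + [10.0] * 6
    print(overage_charge(short_bursts, committed_mbps=2.0, rate_per_mbps=100.0))
    print(overage_charge(longer_bursts, committed_mbps=2.0, rate_per_mbps=100.0))
```

the point of the two sample sets: bursting in 5% of the samples costs nothing extra, bursting in 6% of them gets billed at the burst rate.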

> DIVERSITY:
> Previous work used Level3 because we could hook in in SF and NYC
> and in Europe.

yuppers .. if availability is required regardless of "disasters"

> PRESUME the machine will crash.  Presume that the NOC staff just
> barely graduated 8th grade and were turned down at the McTacoKing.
> Presume that moving parts will fail (fans, disks).  Presume it
> will happen during an earthquake whose only damage it to wreck
> your car. 
> -How much does 3 days down cost you?
> -Can you change DNS from another place to get packets routed to
>  a low-bandwidth desperate recovery site (even just a page that
>  says "down for maintainance, back on monday")?
> 
> Now build your box.

"boxes" .. and network infrastructure and backups

and when the disk or cpu fan or power supply fan  dies
at 3:00am ... how long does it take to fix it ??
	- pay up front in prevention/workarounds
	- or pay when the machine dies in downtime ..
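
the tradeoff above as simple arithmetic, sketched in python. the $1000 spare and the 3 days down come from this thread; the function name is made up, and the per-hour cost of downtime is whatever your own number is:

```python
def break_even_hourly_cost(spare_cost, hours_down):
    """Hourly downtime cost at which one avoided outage pays for the spare."""
    if hours_down <= 0:
        raise ValueError("hours_down must be positive")
    return spare_cost / hours_down


if __name__ == "__main__":
    # $1000 mirrored machine vs. one 3-day (72 hour) outage
    per_hour = break_even_hourly_cost(1000.0, 72.0)
    print(f"spare pays for itself if downtime costs more than ${per_hour:.2f}/hour")
```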

have fun
alvin 



