network thruput

Alvin Oga alvin at Mail.Linux-Consulting.com
Wed Jun 1 14:21:47 PDT 2005


hi ya michael

On Wed, 1 Jun 2005, Michael T. Halligan wrote:

> One quick suggestion. quadruple your # of switches. I've built beowulf 
> clusters where each server has

changing hw is not an option .. 
	esp since reconfiguring the network or switches is not
	occurring either

> 4-8 nics, each connected to a different switch, and then bonded, 
> creating a 400 or 800mb (in your case
> 4gb or 8gb) network.

yes.. the suggestion was to channel bond to go faster.. but
it's not much of an improvement, since not all machines
are bonded

>  If you have the spare hardware,

no spares ... they run a 24x7 environment .. no test machines, no
hot-swap live backup servers either .. yikes

> When you say they're getting a paltry 5-10MB/s at best, are you saying 
> all of the servers are at the
> same time, or at any given time?

that's the average thruput at any random given time between the
supposedly high-performance cluster nodes
	- measured by copying 100MB or 500MB files between
	any two random nodes at any random time

	- to get rid of disk latency issues, we used node1:/dev/loop
	copying into nodexx:/dev/loop and it's the same ... which
	means the ultra-360 disks are fast enough to keep up with the
	gigE lan
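the loop-device trick above (taking the disks out of the path) can also be
done entirely in memory; a minimal sketch of a raw TCP throughput test,
run here against localhost for illustration -- in a real test the server
half would run on the remote cluster node (hostnames/ports are placeholders):

```python
import socket
import threading
import time

CHUNK = 1 << 20          # push 1 MB per send
TOTAL_MB = 100           # like the 100MB file copies, but no disk involved

def _drain(srv):
    # accept one connection and discard everything the client sends
    conn, _ = srv.accept()
    while conn.recv(65536):
        pass
    conn.close()

def measure_throughput(host="127.0.0.1"):
    # listen on an ephemeral port; on a cluster this half lives on nodexx
    srv = socket.socket()
    srv.bind((host, 0))
    srv.listen(1)
    port = srv.getsockname()[1]
    t = threading.Thread(target=_drain, args=(srv,))
    t.start()

    cli = socket.create_connection((host, port))
    buf = b"\0" * CHUNK
    start = time.time()
    for _ in range(TOTAL_MB):
        cli.sendall(buf)
    cli.close()
    t.join()
    srv.close()
    return TOTAL_MB / (time.time() - start)   # MB/s, memory to memory

if __name__ == "__main__":
    print("%.1f MB/s" % measure_throughput())
```

on a healthy gigE lan you'd expect this to report somewhere near wire
speed; 5-10MB/s here would point straight at the network, not the disks.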

> Beyond that, the stacked switch setup could be bad if that means switch 
> 10 has to traverse all of the
> other switches in order to get to switch 9.

that's the stack i am trying to break up ... to get rid of all the netbios
packets from the cluster ( there is nothing a windoze box needs to do
on the cluster )
	- netbios packets are about 90% of all packets on the wire

> Another thing I'd do is collect some good stats to show to the PHB's .. 
> Setup NTOP for a week and
> show them that it's windows chatter eating up all the bandwidth. If 

already showed the traffic pattern ... but to no avail ... :-)

hard to convince a PhD with managerial authority that they're not
quite up to snuff on network design and topology issues
	- push too hard, and one is on the streets ya know

> they're manageable switches,
> setup cacti to graph them via snmp.

cacti seems too complicated for me ... :-)

i like something simple ... just enough to show what is clogging the network

	90% netbios packets
	5%  tcpip ( data )  not dns, arp, http, smtp, etc..
	5%  misc
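that kind of simple breakdown can be pulled straight out of a capture; a
minimal sketch, assuming "tcpdump -n" style one-line summaries as input
(the sample lines below are made up just to show the output shape):

```python
import re
from collections import Counter

# netbios name/datagram/session service ports, numeric or symbolic
NETBIOS = {"137", "138", "139", "netbios-ns", "netbios-dgm", "netbios-ssn"}

def classify(line):
    # pull src-port and dst-port out of a "host.port > host.port:" summary
    m = re.search(r"(\S+)\.(\S+) > (\S+)\.(\S+):", line)
    if not m:
        return "misc"            # arp, icmp, anything without ports
    if m.group(2) in NETBIOS or m.group(4) in NETBIOS:
        return "netbios"
    return "tcpip"

def breakdown(lines):
    counts = Counter(classify(l) for l in lines)
    total = sum(counts.values()) or 1
    return {k: 100.0 * v / total for k, v in counts.items()}

# made-up sample capture lines for illustration
sample = [
    "IP 10.0.0.5.138 > 10.0.0.255.138: UDP, length 201",
    "IP 10.0.0.5.137 > 10.0.0.255.137: UDP, length 50",
    "IP 10.0.0.7.34512 > 10.0.0.9.22: Flags [P.], length 1448",
    "arp who-has 10.0.0.9 tell 10.0.0.7",
]

if __name__ == "__main__":
    for proto, pct in sorted(breakdown(sample).items()):
        print("%-8s %5.1f%%" % (proto, pct))
```

feed it a day's worth of `tcpdump -n` output and the netbios percentage
speaks for itself, no cacti required.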

> Might also be worth digging in to 
> see if you're having any
> type of arp or broadcast storms, perhaps a screwed up vlan.

i was hoping to see dns/arp issues but that's not the case here ..
 
> $150k? Ouch.

they're very proud to own that $150K tape library...
that i will not touch ... not even for $500/hr... no way ...

tapes are a disaster waiting to happen in my book and i'd rather
not be restoring from tape or making tape backups, and besides,
they have someone else to take care of that for them

> For $20k nowadays you can get a 40 tape lto2 library that 
> has 200GB (uncompressed)

we're looking at 3TB of data ..  still pretty small systems actually

> I'm starting to give up on Tape to be honest.

:-) congrats .. :-)

i think after one or a few full restores from tapes that someone
else did, one will no longer be "tape happy" and will prefer
a more reliable way to restore from full backups ( bare metal restore ),
especially when you have to restore in 5 seconds because the whole company
is shut down until it is back up and online ...
	- i will always prefer to have live warm-swap backup systems
	even if i have to bring in my own 2GB - 5GB of disks
	for those that are willing to pay my fees w/o discounts
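a warm-swap spare in that spirit is basically a tree mirror that only
copies what changed since the last run; a minimal local sketch (a real
warm spare would sync to the other box over the network, e.g. with rsync):

```python
import os
import shutil

def mirror(src, dst):
    """copy the src tree into dst, skipping files whose size and mtime
    already match -- a crude rsync-style sync, so repeated runs are cheap.
    (src/dst are placeholder paths for illustration.)"""
    copied = 0
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.join(dst, rel) if rel != "." else dst
        os.makedirs(target, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target, name)
            st = os.stat(s)
            if os.path.exists(d):
                dt = os.stat(d)
                if dt.st_size == st.st_size and int(dt.st_mtime) == int(st.st_mtime):
                    continue          # unchanged since last sync, skip it
            shutil.copy2(s, d)        # copy2 preserves mtime for next run
            copied += 1
    return copied
```

run it from cron every few minutes and the spare box stays warm; restoring
is just pointing clients at the spare instead of waiting on a tape robot.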

> The value of tape and disk always goes back and forth,

yes... depending on the situation


fun stuff...

c ya
alvin




More information about the Baylisa mailing list