So what's everyone using to push databases around these days?

Ray Wong rayw at rayw.net
Fri Feb 15 19:42:58 PST 2008



Oh, the whole point of this was that the cycle is just big enough that
I can't rely on replication to be fast enough. :)

On Fri, Feb 15, 2008 at 06:50:02PM -0800, Ray Wong wrote:
> 
> 
> Hey, so with ancient reference points like DFS, the Google File System,
> and multicast rsync in mind, what's everyone using to copy decent-sized
> directories of data around these days?
> 
> I've got a site with a semi-typical data cycle (one or more updates per
> day) filling a couple hundred gigs of space (MySQL in several database
> dirs).  It goes out to a few dozen distributed targets; let's constrain
> the parallelism requirement to between 5 and 50 hosts.
> 
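
A side note before the transport question: getting a consistent source
copy of those MySQL dirs is its own problem.  A rough sketch, assuming
LVM under the datadir (the volume group "vg0", volume "lv_mysql", and
mountpoint "/mnt/snap" are made-up names): hold a read lock just long
enough to snapshot, then push from the snapshot rather than the live
datadir.

    # crude but workable: keep one mysql session open via a fifo so
    # the read lock survives across the snapshot
    mkfifo /tmp/mysql.pipe
    mysql < /tmp/mysql.pipe &
    exec 3>/tmp/mysql.pipe
    echo 'FLUSH TABLES WITH READ LOCK;' >&3
    sleep 2    # crude: give the client time to actually take the lock
    lvcreate --snapshot --size 2G --name mysql_snap /dev/vg0/lv_mysql
    echo 'UNLOCK TABLES;' >&3
    exec 3>&-
    rm /tmp/mysql.pipe

    mount -o ro /dev/vg0/mysql_snap /mnt/snap
    # ...push /mnt/snap to the targets, then...
    umount /mnt/snap && lvremove -f /dev/vg0/mysql_snap
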
> What I'd really like is a multicast scp that just blasts the data to
> all targets (some bandwidth limiting from ssh would be a nice option
> too), but something like mrsync is entirely too fragile for my liking:
> the first target receives all of the data, yet it's also the host
> consulted to decide what to send everywhere else on a re-sync, so it
> may be the only host that already has the files in place, which
> guarantees no one else ever gets them.
> 
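
On the bandwidth-limit point: rsync can already cap each stream with
--bwlimit (in KB/s), and GNU xargs -P bounds the fan-out, so an interim
unicast push might look like the sketch below.  The host list file and
paths are made up.

    # one rsync per target, at most 10 in flight, ~5 MB/s each
    xargs -P 10 -I{} rsync -a --bwlimit=5000 -e ssh \
        /data/mysql/ {}:/data/mysql/ < targets.txt
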
> rdist over ssh is okay-ish, but the lack of a multicast option concerns
> me.  I could certainly handle doing a unicast recovery for any host that
> missed out on a multicast copy, but with no multicast at all it seems
> like I'd lose a lot of throughput at the data originator.
> 
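
The blast-then-repair model is roughly what udpcast (udp-sender /
udp-receiver) gives you: multicast the bulk copy, then do a cheap
unicast pass to fix any host that dropped packets.  A sketch, with
made-up paths and host list:

    # originator: stream the tree to all listening receivers at once
    tar -C /data/mysql -cf - . | udp-sender --min-receivers 40

    # each target: unpack the multicast stream
    udp-receiver | tar -C /data/mysql -xf -

    # originator again: unicast repair pass (rsync moves deltas only)
    while read h; do
        rsync -a /data/mysql/ "$h":/data/mysql/
    done < targets.txt
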
> And rsync, well, it's just painfully slow. :)
> 
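
Part of rsync's slowness is its per-file stat and comparison chatter;
for a full push where there's no old copy to delta against, a plain
tar-over-ssh pipe often wins.  Host and paths here are made up.

    tar -C /data/mysql -cf - . | ssh target1 'tar -C /data/mysql -xf -'
    # compress (ssh -C, or z in the tar flags) only if the wire, not
    # the CPU, is the bottleneck
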
> So, what's everyone else doing?  I haven't felt like I was breaking any
> new ground since Postini or UltraDNS, so I'll bet this is old hat to
> quite a few of you.  Is there an obvious solution I'm missing?
