From dannyman at toldme.com Wed Jun 4 15:10:23 2014
From: dannyman at toldme.com (Daniel Howard)
Date: Wed, 4 Jun 2014 15:10:23 -0700
Subject: [Baylisa] Q: Estimating DC Power
Message-ID:

Hello,

I have inherited an environment and some nice ServerTech PDUs. I've run stress tests on some servers and measured the power draw through the PDUs. In this way I can come up with a value, in Watts, of what a server will draw at peak. I can then figure out how many of what size circuits I need to power a given rack . . .

When doing the measurements, I'll observe, say, the *average* peak draw averaging out to say, 4.4A, but with *spikes* recorded at 5A, &c. (The peak draw in that case being 88% of the highest measurement..)

We are running Hadoop. So, most of the machines in a given rack will hit peak load around the same time. This is somewhat different from other environments I have managed with more variable workloads across a rack. So, I feel more pressure to Get This Right. :)

As best I can tell, the Commonly Accepted Best Practice is that you aspire to not exceed 80% capacity on any given circuit. But then Management wants to know why. And my answer is:

1) Estimating power consumption is not a precise science
2) Power fluctuates, you want some room for error
3) Getting it wrong means blowing a fuse on the PDU, losing a rack, and a prolonged outage...
4) ... But, hey, I honestly don't understand electricity ...

Back to 1: I figure that for the past century, if a guy is standing there watching a power meter, he's going to see the meter fluctuating around a certain high value under peak draw. The little needle will jump higher and lower, but where the needle rests is his 80% of circuit capacity.

So, I have competing strategies:

1) Most Conservative: Take the *max* momentary measurement observed as your peak power consumption. That's your 80% baseline.

2) Take the *average* peak consumption as your peak consumption. That's your 80% baseline.

OR...
3) Take the max momentary measurement as your 100% of circuit capacity, or 90% ... as long as your average peak doesn't exceed 80% of circuit capacity ...

I am of course comfortable with Most Conservative, but I'm not the one writing the checks, and I'd rather not spend money we don't need to spend ... how do you folks estimate power needs?

Thanks,
-danny

--
http://dannyman.toldme.com

From npc at GangOfOne.com Wed Jun 4 17:33:29 2014
From: npc at GangOfOne.com (Nick Christenson)
Date: Wed, 4 Jun 2014 17:33:29 -0700 (PDT)
Subject: [Baylisa] Q: Estimating DC Power
In-Reply-To:
Message-ID: <201406050033.s550XToE066308@bellerophon.gangofone.com>

> As best I can tell, the Commonly Accepted Best Practice is that you aspire to not exceed 80% capacity on any given circuit.

A lot of people will tell you that this is enshrined in the National Electrical Code. It isn't true. What the NEC says is that you can't have a single appliance on a branch circuit with more than one outlet that draws more than 80% of the circuit's rated load. One appliance on a single outlet circuit is allowed to draw 100% of the load, or multiple appliances are allowed to draw an aggregate of 100% of the load, as long as no one device exceeds 80%.

Note, the loading rules are not designed to protect the circuit breaker, they're designed to protect the wiring. The problem is that a circuit breaker rated for 20A isn't going to be rated for continuous duty at 20A forever. So, while the electrical code will allow you to run 19.9A off a 20A breaker, we can't be shocked if that breaker eventually trips or fails.

So, at what continuous load is it safe to run a 20A circuit? Nobody really knows. There certainly isn't a hard cutoff. However, because just about everyone follows the 80% rule of thumb, collectively we have a lot of experience that tells us that it's safe to run circuits for long periods of time at 80% load.
I've run circuits at 90-95% load for moderately long periods of time (several months) without problems, but that was years ago, it's a small sample size, and it doesn't make me comfortable.

Note also you need to deal with the power factor of the equipment. That is, because the power draw of the equipment can be non-linear, the effective volts*amps of an AC appliance may differ from its measured power draw in Watts. This is long and complicated, but the upshot is that you need to make sure your measured current load divided by your power factor doesn't exceed the rating of your circuit breaker.

Twenty years ago, a lot of computer gear had power factors of 0.8 or even lower, but I see power draws these days that are much closer to linear, with most modern data center servers having a power factor in excess of 0.9. If you're working with a generous margin for error, you can probably get away with ignoring this, but if you're dipping into those margins, you want to be a little more careful.

> But then Management wants to know why. And my answer is:
> 1) Estimating power consumption is not a precise science

This is true. Also, as equipment (power supplies, for example) ages, it's possible that it could start drawing more power.

> 2) Power fluctuates, you want some room for error

Yeah, but if you're measuring at peak consumption, you're good. On the other hand, sometimes you add an extra hard drive to servers, upgrade their RAM, or replace power supplies with ones that aren't exactly like what was in there before, and it's good to have a little slack. If you're running identical machines that will never be individually upgraded on a maintenance contract, then I'd be willing to run with less margin.

> 3) Getting it wrong means blowing a fuse on the PDU, losing a rack, and a prolonged outage...

Yes. Getting it wrong on the low end costs a little money in unused capacity. Getting it wrong on the high end causes serious downtime and maybe hardware damage.
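To put rough numbers on the power factor point above, here is a minimal sketch, not something from the thread itself. It uses the standard single-phase relation: current equals real power divided by (voltage times power factor). The function name amps_from_watts, the 900 W server, and the 208 V circuit are all made up for illustration.

```python
def amps_from_watts(watts: float, volts: float, power_factor: float) -> float:
    """Current (A) drawn by a single-phase AC load, given its real power (W).

    Apparent power in volt-amps is watts / power_factor, and current is
    apparent power / volts.
    """
    if not 0 < power_factor <= 1:
        raise ValueError("power factor must be in the range (0, 1]")
    return watts / (volts * power_factor)


if __name__ == "__main__":
    # Hypothetical server: 900 W at peak on a 208 V circuit.
    for pf in (1.0, 0.9, 0.8):
        amps = amps_from_watts(900, 208, pf)
        print(f"power factor {pf:.1f}: {amps:.2f} A")
```

The same wattage costs more amps as the power factor drops, which is why a margin that looks fine in Watts can be thinner than it appears once you are close to the breaker rating.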
> So, I have competing strategies:
> 1) Most Conservative: Take the *max* momentary measurement observed as your peak power consumption. That's your 80% baseline.

This is the way I see most people setting up their gear.

> 2) Take the *average* peak consumption as your peak consumption. That's your 80% baseline.

I wouldn't do this one. Measure your real peak draw on the circuit and use that as your baseline.

> 3) Take the max momentary measurement as your 100% of circuit capacity, or 90% ... as long as your average peak doesn't exceed 80% of circuit capacity ...

Under restricted conditions, I'd be willing to do this up to 90% capacity: (1) Management explicitly signs off on the risks, (2) there is no way anyone would do something like plug another piece of gear, even if it's something as small as a laptop or phone charger, into those circuits, (3) I've measured the power factor of the equipment in question and am confident that it exceeds 0.9, and (4) the gear is uniform and won't be individually upgraded without remeasuring and rebalancing circuits at that time.

> I am of course comfortable with Most Conservative, but I'm not the one writing the checks, and I'd rather not spend money we don't need to spend ... how do you folks estimate power needs?

There isn't a lot of widespread knowledge about running this close to the margins, and what is known is situationally dependent. Like anything else, there are risks that are hard to quantify. The big factor is not so much the cost of an outage, but what value you assign to the cost of being wrong about your assumptions.

Generally, as you run anything closer to the red line, MTTF goes down and systems require more hand-holding, which increases costs. Balancing these isn't easy, but I get nervous around folks who try to push these margins without acknowledging they're taking on risk.
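The three strategies discussed above can be compared side by side with a little arithmetic. This is only a sketch under stated assumptions: the function name servers_per_circuit is invented here, the 4.4A average and 5A spike are the example figures from the original question, and the 90% spike ceiling for the third strategy is the limit suggested in the reply, not a rule from any code or standard.

```python
import math


def servers_per_circuit(breaker_amps, avg_peak_amps, max_spike_amps,
                        load_fraction=0.8, spike_fraction=0.9):
    """Count how many identical servers fit on one circuit per strategy.

    "max":     budget every server at its highest momentary reading,
               kept within load_fraction of the breaker rating.
    "average": budget every server at its average peak draw,
               kept within load_fraction of the breaker rating.
    "mixed":   spikes may reach spike_fraction of the rating, as long as
               the average peak still stays within load_fraction.
    """
    budget = load_fraction * breaker_amps
    by_max = math.floor(budget / max_spike_amps)
    by_avg = math.floor(budget / avg_peak_amps)
    by_spike_cap = math.floor(spike_fraction * breaker_amps / max_spike_amps)
    return {"max": by_max, "average": by_avg, "mixed": min(by_spike_cap, by_avg)}


if __name__ == "__main__":
    for breaker in (20, 30):
        print(breaker, "A circuit:", servers_per_circuit(breaker, 4.4, 5.0))
```

With these figures all three strategies give the same answer on a 20A circuit, and the less conservative ones only gain a server on a 30A circuit, so it is worth running the numbers before arguing about which strategy to adopt.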

I'd quantify the known costs, explain the issues with the unknown risks, and present these to management and let them decide and sign off on what they want to do.

Hope this helps.

--
Nick Christenson
npc at gangofone.com

From Brent at GreatCircle.COM Wed Jun 4 17:50:53 2014
From: Brent at GreatCircle.COM (Brent Chapman)
Date: Wed, 4 Jun 2014 17:50:53 -0700
Subject: [Baylisa] Q: Estimating DC Power
In-Reply-To: <201406050033.s550XToE066308@bellerophon.gangofone.com>
References: <201406050033.s550XToE066308@bellerophon.gangofone.com>
Message-ID:

This is great advice.

Another argument that might resonate with your management is this: what is the cost of downtime (and recovery from downtime) caused by unexpected circuit breaker trips? The closer you run to the limits, the more frequent these unexpected breaker trips are going to be. You have to consider both the cost of the downtime itself (how long does it take for someone to get to the data center, diagnose the problem, reset the circuit breaker, and bring everything back online?), and the cost of the recovery from the downtime (repairing databases and filesystems that didn't get a clean shutdown, replacing disk and memory that didn't survive the unexpected power cycle, etc.).

How many times does a breaker need to trip, causing unexpected downtime and the associated recovery costs, before it would have been worth it to simply install more power in the first place so that you weren't running so close to the limits?

As Nick says, you're never quite sure exactly where the limit is; circuit breakers are nominally rated for a certain capacity, but it's always +/- a certain margin. What load they trip at (versus what load they're _supposed_ to trip at) varies with factors like heat, age, and how many times they've tripped previously (usually, the more times a breaker trips, the "weaker" it gets; effectively, each time it trips, the point at which it will trip again gets a little bit lower).
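The "how many trips before more power would have been worth it" question above is a simple break-even calculation. Here is a rough sketch of it; the function name trips_to_break_even and every dollar figure are hypothetical placeholders, and you would substitute your own estimates.

```python
import math


def trips_to_break_even(extra_power_cost, downtime_cost_per_trip,
                        recovery_cost_per_trip):
    """Number of breaker trips after which extra power would have paid off.

    extra_power_cost:        one-time cost of installing more circuits
    downtime_cost_per_trip:  cost of the outage itself, per trip
    recovery_cost_per_trip:  cost of repairs and replacements, per trip
    """
    cost_per_trip = downtime_cost_per_trip + recovery_cost_per_trip
    if cost_per_trip <= 0:
        raise ValueError("a breaker trip must have a positive cost")
    return math.ceil(extra_power_cost / cost_per_trip)


if __name__ == "__main__":
    # Hypothetical figures: $6,000 for extra circuits, and each trip costs
    # $1,500 in downtime plus $2,500 in recovery work.
    print(trips_to_break_even(6000, 1500, 2500))
```

This ignores the point that breakers tend to weaken with each trip, so in practice later trips may come sooner than the first one did.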

Cutting your power margins too close can be a case of "penny wise and pound foolish"; you may save a little bit of installation cost up front, but you're biting off more operating cost as time goes by.

-Brent

From rnovak at indyramp.com Sun Jun 8 14:58:34 2014
From: rnovak at indyramp.com (Robert Novak)
Date: Sun, 8 Jun 2014 14:58:34 -0700
Subject: [Baylisa] O'Reilly Velocity 20% discount code
Message-ID:

Hi folks,

Our friends at O'Reilly Media have provided a 20% discount code USRG to save you a few bucks on attendance at Velocity in Santa Clara (http://velocityconf.com/velocity2014) June 24-26. You'll save $100 as well by registering on or before June 23, but on-site registration is available.

Their new east coast event, Velocity New York (http://velocityconf.com/velocityny2014/), is September 15-17, with registration prices going up on June 26. If you're likely to be on the other coast and/or if it fits your schedule and budget better, it's worth considering that event as well.

We'll also have some booklets and books to share at the next couple of meetings, courtesy of O'Reilly.

The update for this month's general meeting will be going out soon. We have Evident.io talking cloud security, and a rescheduled IO-Switch as well, with pizza expected to be in attendance as well. Hope you can make it!

Robert Novak
BayLISA President