Power and Cooling

  • The Cloud Physical Plant

    Looking further, I now think that completing the physical plant as part of the layered model (shown below) may not be the best use of our time. Our efforts are better served by treating it as an independent model that is simply referenced from this stack.

    image

    As such I have decided to fill out the complete physical plant model (shown here) for this layer and let it serve as our discussion vehicle as we move forward.

    image

    We have discussed the overall power system to a great degree but still need to look at approaches that can be used for internal power distribution. This model is built around the concept of A.C. distribution. There have been many arguments describing the benefits of D.C. as a distribution vehicle, and there is a well-known study from Lawrence Berkeley National Laboratory (http://whitepapers.silicon.com/publisher/39038243/lawrence-berkeley-national-laboratory.htm) describing the advantages of D.C. While it is an excellent study, it compares D.C. with high-efficiency D.C. power supplies against A.C. with poor-efficiency A.C. power supplies. With A.C. power supplies in the 86% to 92% efficiency range today, these advantages are eliminated, and A.C. solutions are generally less expensive. In all, there are a few simple rules to remember:

    · Keep transformations to a minimum

    · Make emergency power support components operate in a “bypass” mode where they do not contribute to losses or reduce overall efficiency during normal operation

    · Keep distribution voltages as high as safely and cost effectively possible

    · Avoid needless items in the power path that contribute to efficiency loss

    · Test all distribution advantage claims against your specific model and make sure you have not missed anything before making revolutionary decisions

    These will help you decide for yourself which is the best approach.
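
    To make the first and last rules concrete, here is a minimal sketch of how stage efficiencies multiply along a power path. The stage values are hypothetical and only illustrative (the power supply figure is picked from the 86% to 92% range mentioned above); substitute measured numbers from your own model before drawing conclusions.

```python
from math import prod


def path_efficiency(stage_efficiencies):
    """Overall efficiency of a power path: the product of every stage in series."""
    return prod(stage_efficiencies)


def upstream_loss_kw(it_load_kw, stage_efficiencies):
    """Power lost in the path while delivering it_load_kw to the IT equipment."""
    return it_load_kw / path_efficiency(stage_efficiencies) - it_load_kw


# Hypothetical stage efficiencies, for illustration only.
# Path A: transformer, double-conversion UPS, second transformer, server power supply.
path_a = [0.98, 0.92, 0.98, 0.90]
# Path B: one transformer, UPS resting in bypass, server power supply.
path_b = [0.98, 0.99, 0.90]

if __name__ == "__main__":
    for name, path in (("A", path_a), ("B", path_b)):
        eff = path_efficiency(path)
        loss = upstream_loss_kw(1000, path)
        print(f"Path {name}: efficiency {eff:.3f}, loss at 1000 kW IT load {loss:.0f} kW")
```

    Because the stages multiply, each extra transformation lowers the ceiling for the whole path, which is why removing one stage often matters more than polishing the efficiency of another.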

    Let’s now turn our focus to cooling. One of the key principles on which this scheme is built is containment. Generally speaking, hot aisle (or hot air) containment provides some distinct advantages because it reduces the overall area where hot temperatures exist. If you are getting the most out of your cooling dollar, the exhaust air will be pretty warm. In fact, if your inlet air is about 30°C (roughly 86°F), you can expect the exhaust to be 45°C or more (about 115°F to 120°F), so keeping it contained makes the most sense.

    Now, this is not conventional cooling; it is built around using outside air, or “free” cooling, as much as possible. The idea is that if your exhaust air is hotter than your outside air, you are better off starting with the cooler outside air than expending the energy to “recool” the exhaust air. (The effectiveness of this varies by geography, and you need a wet bulb temperature of less than 85°F for it to work effectively.)

    There are a couple of things that must be considered. There must be filtration to clean the air to the point where it is usable (and you will need sensors to detect clogged filters), and there are times when the outside air becomes unusable. During such times (in winter or in the presence of pollutants), the inside air must be recirculated and used for cooling. In the figure, you will see a cooled water system and a heat exchanger for “re-cooling” inside air. You can also see the use of evaporative cooling (or air-side economizers) and water-side economizers. Using the proper combination of these approaches (again based on your geography and particular model), you can achieve PUEs lower than 1.10.
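
    The decision logic described above can be sketched roughly as follows. The function names and the single `outside_air_usable` flag are my own simplifications (a real control system would look at filter state, humidity, and more); the 85°F wet bulb limit comes from the text, and PUE is computed in the usual way as total facility power divided by IT power.

```python
WET_BULB_LIMIT_F = 85.0  # limit quoted in the text for effective "free" cooling


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def choose_air_source(outside_c, exhaust_c, wet_bulb_f, outside_air_usable=True):
    """Return "outside" when free cooling makes sense, otherwise "recirculate".

    outside_air_usable is a hypothetical flag covering the cases named in the
    text: winter conditions, pollutants, or clogged filters.
    """
    if not outside_air_usable:
        return "recirculate"
    if wet_bulb_f >= WET_BULB_LIMIT_F:
        return "recirculate"
    # Start from whichever air stream is cooler.
    return "outside" if outside_c < exhaust_c else "recirculate"


def pue(total_facility_kw, it_kw):
    """Power Usage Effectiveness: total facility power over IT power."""
    return total_facility_kw / it_kw


if __name__ == "__main__":
    print(choose_air_source(outside_c=25, exhaust_c=45, wet_bulb_f=70))
    print(choose_air_source(outside_c=25, exhaust_c=45, wet_bulb_f=70,
                            outside_air_usable=False))
    print(f"PUE: {pue(total_facility_kw=1100, it_kw=1000):.2f}")
```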

    From here, we will begin looking at what I think is the optimum approach for the hardware so stay tuned.

    The complete model is shown below. In some models (and certainly the least expensive, Tier 0), there is simply no backup power at all. This relies entirely on the utility provider and the general utility practices (GUP) for power, without any additional means of support. I don’t know anyone who is doing this yet, but there are a couple of folks who are looking at it. (Note: their availability is managed at a higher level, with duplicate, geographically separated data centers that can take over and deliver their service if one of the data centers goes down … and yes, there may be a reduction in their service, but it will stay up.) If your business model allows you to do this, you can see some very interesting advantages. Your centers can be very low frills, which can save you a lot of money, and the duplication can provide not only the availability but also a data backup function, fulfilling another major SLA. (I believe this will become a standard practice in the future as building blocks become even less expensive … More on this later.)
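
    As a rough illustration of why duplication can stand in for backup power, here is a small sketch of the combined availability of duplicate sites. It assumes site outages are independent and that any one surviving site can carry the service, which real deployments only approximate; the 99% per-site figure is hypothetical.

```python
HOURS_PER_YEAR = 8760


def combined_availability(site_availability, sites=2):
    """Availability of a service that stays up while at least one site is up.

    Assumes independent site failures (an idealization).
    """
    return 1 - (1 - site_availability) ** sites


def downtime_hours_per_year(availability):
    """Expected hours per year that the service is unavailable."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one low-frills site
    dual = combined_availability(single, sites=2)
    print(f"One site:  {downtime_hours_per_year(single):.1f} h/yr down")
    print(f"Two sites: {downtime_hours_per_year(dual):.2f} h/yr down")
```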

    (A good reference for all this is the Tier definition from the Uptime Institute, which can be found at http://uptimeinstitute.org.)

    The next step in availability is type N (Tier 1), N+1 (Tier 2), etc. (This is probably a rehash of Power Availability 101 for some, but it is kind of nice to see it all put together.) Here, there is a backup source of power (sized to what is considered the critical power part of N … and this is probably not everything) in the event that the utilities go away. Most folks in this space seem to be providing some form of N+1 (Tier 2: a backup source plus one additional source in case part of the primary backup fails). Depending on your circumstances and your utility provider, you may be able to create a model for the maximum duration of backup power that doesn’t require you to cover a prolonged outage. (This can save a lot of money, especially if you can shift load to an alternate location as I mentioned above.) Your planning should look something like this:

    1. Determine the amount of power needed to back up only the essential systems. (BTW, isolate this from the general power, which will likely include some things you don’t want to pay to back up, but don’t forget about cooling. We’ll discuss that in the next installment.) This is “Pcritical KVA”, the critical load, and you will need to provide this amount of alternate power. If you have the space, I favor a simple diesel or LP generator for the backup power (GenerationDiesel). (Also note that this may be the longest lead-time item you need to procure when building a data center; in some cases it approaches 48 weeks.) Depending on the size (usually bigger is better), these retail for about $150 to $200 per kVA. (Note: this is just a rule of thumb for materials and does not include installation cost. Actual prices from your suppliers may vary.) For example, if your Pcritical is 3 MVA, you could use four 1000 kVA generators (M x UPSKVA). Three cover your Pcritical and the fourth is an additional unit, giving you N+1. For each generator, you will need a transfer switch, which moves your load from utility power to your generators. As a rule of thumb, these retail for about $20 to $30 per kVA. At the upper end of both ranges, this example would cost somewhere around $800K for the generators and $120K for switching equipment (four 1000 kVA switches at $30 per kVA), or roughly $920K for materials, not including installation.
    Generator sets require two things that you need to consider: on-site fuel storage and monthly maintenance. Fuel storage determines “TBUP (Hrs)”, the length of time backup power will last without intervention. (With today’s environmental rules, this is not something that can be taken lightly.) Once a month or so, the generators need to be checked to ensure they are working properly and are set up correctly for the current season. There is a lot of automation available here, but it is pretty expensive, and some of it can be avoided by simply having a “trusted” human :o) perform these regular checks.

    2. Now let’s focus on the UPS. Some form of UPS is required to hold the load up while the generators are becoming operational. The sequence of events goes something like this: it can take about 4.5 seconds for a generator to crank up and become stable. (This is one of the main reasons for monthly maintenance.) If it fails, there may be a pause / purge of about 3 seconds followed by an additional 4.5 second start-up time. That is about 12 seconds in total, comfortably under 15 seconds, to bring them up. Most folks add some margin here, but this is one place where time is money, and 30 seconds for “TUPS (Sec)”, the duration of power provided by the UPS, seems appropriate. By this point, if you are not up, you have a bigger problem. If you are planning to use a conventional UPS with VRLA-type batteries (UPSVRLA), you should expect to pay something around $50 to $75 per kVA for each 30 seconds of backup time. For the example configuration of 3 MVA, if you use 500 kVA UPS systems, you will need 6 of them (M x UPSKVA), and at $75 per kVA this will cost you somewhere around $225K for the UPS systems alone. (Note: This is a very soft estimate and not all sizes and durations are available, so you will have to work within the constraints of your vendor when you are sizing and estimating the UPS. All estimates are just to give you an idea of some of the costs. These will vary as technology changes and with specific vendors.)
    There is a lot of other material required to hook this up. I would add 10% to the total to give you a good rule of thumb for materials cost. At this point, our stack and model look like the following, with configuration guidelines to come.
    There are a couple of other good resources I would refer to at this point, and both come from the Uptime Institute. They are “Cost Model: Dollars per kW plus Dollars per Square Foot of Computer Floor” and “A Simple Model for Determining True Total Cost of Ownership for Data Centers”. Note: these were created along the lines of a conventional data center, but they do contain some good information.
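
    The two planning steps above can be pulled together in a short sketch. It uses the upper end of each rule-of-thumb rate from the text and the 3 MVA example; the function names are mine, and two choices are assumptions rather than anything stated: the transfer switch rate is applied to all four 1000 kVA units, and the 10% adder is applied to the sum of generators, switches, and UPS. Treat the result as a soft materials-only estimate, not a quote.

```python
import math

# Rule-of-thumb materials rates from the text (USD per kVA, upper end of each range).
GEN_RATE = 200      # generators: about $150 to $200 per kVA
SWITCH_RATE = 30    # transfer switches: about $20 to $30 per kVA
UPS_RATE = 75       # VRLA UPS: about $50 to $75 per kVA per 30 s of backup
MISC_ADDER = 0.10   # 10% for the other material needed to hook everything up


def units_needed(critical_kva, unit_kva, spares=0):
    """Units required to cover the critical load, plus optional spare units."""
    return math.ceil(critical_kva / unit_kva) + spares


def worst_case_start_seconds(crank_s=4.5, purge_s=3.0):
    """One failed start: crank, pause/purge, then a second crank."""
    return crank_s + purge_s + crank_s


def materials_estimate(critical_kva, gen_kva, ups_kva, ups_seconds=30):
    """Soft materials-only estimate for N+1 generators and N UPS systems."""
    gens = units_needed(critical_kva, gen_kva, spares=1)  # N+1 generators
    ups_units = units_needed(critical_kva, ups_kva)       # N UPS systems
    blocks = math.ceil(ups_seconds / 30)                  # UPS is priced per 30 s
    gen_cost = gens * gen_kva * GEN_RATE
    switch_cost = gens * gen_kva * SWITCH_RATE            # one switch per generator
    ups_cost = ups_units * ups_kva * UPS_RATE * blocks
    subtotal = gen_cost + switch_cost + ups_cost
    return {
        "generators": gens,
        "ups_units": ups_units,
        "generator_cost": gen_cost,
        "switch_cost": switch_cost,
        "ups_cost": ups_cost,
        "total_with_adder": subtotal * (1 + MISC_ADDER),
    }


if __name__ == "__main__":
    print(f"Worst-case generator start: {worst_case_start_seconds()} s")
    for key, value in materials_estimate(3000, gen_kva=1000, ups_kva=500).items():
        print(f"{key}: {value:,.0f}")
```

    Changing the rates to the lower end of each range, or the unit sizes to what your vendor actually offers, shows quickly how sensitive the total is to those choices.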

     

    image

    Figure 1 – Cloud Computing Layered Model – Power

     

    image

    Figure 2 – High Level Schematic w/ Power

  • High-Density & Co-Location

    Earlier this year I had the opportunity to tour a recently commissioned, carrier-neutral co-location data center. Located on the outskirts of a major city, and run by one of the major co-location providers in the U.S., it was clearly a state-of-the-art facility with Tier III uptime, impressive physical security, and access to dozens of telecommunication carriers and ISPs.

    What struck me most about this particular facility was its power density – only 4kW/rack. That’s not to say the provider would turn away a customer with racks pulling 16kW of IT load. In such a situation, the provider would lease higher power circuits and “phantom racks” of white space to bridge the power and cooling gaps. Pretty soon, that one 16kW rack looks a lot like four 4kW racks in terms of cost and footprint.

    With blades and high-density compute solutions pushing IT loads well beyond 4kW/rack, I was perplexed as to why a co-location provider would invest in a facility that appears to ignore all of the industry trends around the growth in IT load over time. Yet this particular provider is not atypical. Unless a customer is willing to invest in a wholesale lease of a data center, it’s a challenge to find co-location providers capable of supporting power loads greater than 4kW/rack.

    Clearly, the economics of co-location play a role. According to Gartner, 60-70% of a data center’s CAPEX lies in the mechanical and electrical infrastructure that defines the total supportable IT load. Building a data center capable of supporting 8kW/rack and then having customers deploy racks pulling less than 8kW is an opportunity cost the provider cannot afford. To increase NPV on the facility investment, providers design to the lowest common denominator of IT load and charge for density via circuits and white space.

    While co-location providers are acting rationally in building such facilities, innocuous decisions on the part of a customer can yield counterintuitive cost increases. Using quotes from some of the major co-location providers, I ran a simple case study for a customer replacing five 2kW racks with four 5kW racks as part of a refresh and consolidation effort. Between annual fee increases (which have been steeper than usual lately), setup, circuit premiums, and incremental white space, the customer’s 20% reduction in rack count yielded a 155% Y/Y increase in co-location costs.
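
    The footprint side of this effect can be sketched in a few lines. This is only an illustration of the “phantom rack” arithmetic, assuming the provider rounds each rack up to whole positions at the facility’s 4kW/rack design density; it does not reproduce the provider quotes or the 155% figure, which also include fees, setup, and circuit premiums.

```python
import math


def positions_needed(rack_kw, facility_kw_per_rack=4.0):
    """Rack positions (real plus "phantom") billed for one rack of a given load.

    Assumes the provider rounds up to whole positions at the design density.
    """
    return max(1, math.ceil(rack_kw / facility_kw_per_rack))


def footprint(racks_kw, facility_kw_per_rack=4.0):
    """Total billed positions for a list of per-rack loads in kW."""
    return sum(positions_needed(kw, facility_kw_per_rack) for kw in racks_kw)


if __name__ == "__main__":
    before = [2.0] * 5  # five 2kW racks
    after = [5.0] * 4   # four 5kW racks
    print("16kW rack occupies", positions_needed(16.0), "positions")
    print("Before refresh:", footprint(before), "positions for", len(before), "racks")
    print("After refresh: ", footprint(after), "positions for", len(after), "racks")
```

    Under that rounding assumption, fewer physical racks can still mean a larger billed footprint, which is one plausible contributor to the cost increase described above.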

    For companies that lack the critical mass to build out their own data centers, and depend upon an expanding footprint of enterprise hardware to grow their business (e.g. start-up Web 2.0 companies), these are a challenging set of market dynamics. We’d certainly like to hear from firms that have experienced something similar first-hand, and what strategies can help mitigate the impact to the bottom line.

  • More Conversations: Dell Launches Cloud Computing Blog

    For those keeping score, we launched Direct2Dell back in July 2006. IdeaStorm roared onto the scene in February last year. From there, we began expanding into other languages: Direct2Dell Chinese in March 2007, Spanish in May last year, and Norwegian in September, and there will be more in the future. Most recently, our Investor Relations blog called DellShares went live in November 2007.

    From the beginning, the purpose of Direct2Dell has been to educate and to support our customers on a wide variety of topics that they care about. This blog has grown since those early days. And that growth has encouraged more Dell folks to want to have conversations with our customers. Up to now, I've added more categories on Direct2Dell to expand the topics of discussion. That strategy has worked to a point, but now it's time to evolve.

    Starting today, members from our Data Center Solutions (DCS) team will support a group blog called In the Clouds. It will focus on cloud computing and the backend server, storage and architecture required to make it work. If you're not familiar with the concept of cloud computing, think of using web-based e-mail from Yahoo, Google or AOL (see link for their slick integration with Silverlight), or uploading videos to YouTube, pictures to Flickr, or microblogging with Twitter. When you do those kinds of things you aren't storing them on your local device... you're storing them "in the clouds," or at a remote location on the Internet.

    So, why start with Cloud Computing? The short answer is there's a lot happening in this space right now. Take a look at what Adobe's doing with their AIR product (go Twhirl!) that they recently brought to market. Google continues to surge forward with their Google document apps (Spreadsheet Forms and Google Calendar synch are two recent enhancements that rock), and this week at MIX08, Microsoft is rolling out some cool stuff with Silverlight 2.0 and Internet Explorer 8.

    What this all means is that we're at the beginning stages of a shift from the model of the past where applications and all the content created for them were stored locally. This shift has the potential to increase the types of Internet-connected devices we use to consume and create content (check out the good discussion Scoble has going about the battle for web-based content on mobile phones).  

    So, what does all this have to do with Dell and the kind of content you can expect to see in the cloud computing blog? These web-based activities require reams of server and storage hardware architected around complex custom networks. As such, these environments differ from traditional server/storage environments. Our DCS team's purpose is to help customers make sense of that complexity—see this PDF, or www.dell.com/cloudcomputing for more context. That's the kind of content you can expect from reading Dell's Cloud Computing blog.

    If this sounds interesting, I encourage you to subscribe to the Cloud Computing RSS feed. If you'd rather access it directly, go here:

    www.direct2dell.com/cloudcomputing