May 2008 - Posts

  • Within Layer 1 - Power

    As we begin to look at power, the first thing we need to do is determine the desired availability.  In some models (and certainly the least expensive, Tier 0) there simply isn't any provision for it: you rely on the utility provider and general utility practices (GUP) for power, with no additional means of support.  I don't know anyone who is doing this yet, but there are a couple of folks that are looking at it.  (Note: their availability is managed at a higher level, with duplicate, geographically separated data centers that can take over and deliver the service if one of the data centers goes down ... and yes, there may be a reduction in service, but it will stay up.)  If your business model allows you to do this, you can see some very interesting advantages.  Your centers can be very low frills, which can save you a lot of money, and the duplication not only provides the availability but also serves as a data backup, fulfilling another major objective. (I believe this will become a standard practice in the future as building blocks become even less expensive ... More on this later.)

    (A good reference for all this is the Tier definition from the Uptime Institute, which can be found at http://uptimeinstitute.org/.)
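    To put a rough number on the duplicate-data-center idea, here is a minimal sketch. It assumes the sites fail independently (geographic separation helps, but this is still an assumption) and uses a made-up 99% availability for a single utility-only site; neither figure comes from a real deployment.

```python
HOURS_PER_YEAR = 8760


def combined_availability(site_availability: float, sites: int) -> float:
    """Probability that at least one of `sites` independent sites is up."""
    return 1.0 - (1.0 - site_availability) ** sites


# Hypothetical availability of one Tier 0 (utility-only) site.
single = 0.99
dual = combined_availability(single, 2)

for label, a in (("one site", single), ("two sites", dual)):
    downtime_h = (1.0 - a) * HOURS_PER_YEAR
    print(f"{label}: {a:.4%} available, about {downtime_h:.1f} h/yr down")
```

    The point is only the shape of the math: every extra independent site multiplies the chance of a total outage by the single-site downtime fraction, which is why very low frills sites can still add up to a respectable service.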

    The next step in availability is type N (Tier 1), N+1 (Tier 2), etc.  (This is probably a rehash of Power Availability 101 for some, but it is kind of nice to see it all put together.)  Here, there is a backup source (equal to what is considered the critical power part of N ... and this is probably not everything) for power in the event that the utilities go away.  Most folks in this space seem to be providing some form of N+1 (Tier 2 - a backup source and 1 additional source in the event that some of their primary backup fails).  You may be able, depending on your circumstances and your utility provider, to create a model for the maximum duration of backup power that doesn't require you to cover a prolonged outage. (This can save a lot of money, especially if you can shift load to an alternate location as I mentioned above.)  Your planning should look something like this:

    • 1. Determine the amount of power needed to back up only the essential systems. (BTW, don't forget to isolate this from the general power, which will likely include some things you don't want to pay to back up, but don't forget about cooling. We'll discuss that in the next installment.) This is "Pcritical KVA" or the critical load, and you will need to outfit this amount of alternate power. If you have the space, I favor a simple diesel or LP generator for the backup power (GenerationDiesel). (Also note, this may be the longest lead time item you need to procure as you are building a data center, and in some cases it approaches 48 weeks.) Depending on the size (usually bigger is better), these retail for about $150 to $200 / KVA. (Note: this is just a rule of thumb for the materials and does not include installation cost. Plus, the actual prices may vary from your suppliers.) For example, if your Pcritical is 3MVA, you could use four 1000KVA generators (M x UPSKVA). This covers your Pcritical and provides an additional unit, giving you N+1. For each generator, you will need a transfer switch which will move your load from the utility power to your generators. As a rule of thumb, these retail for about $20 - 30 per KVA. This example would cost you somewhere around $800K for the generators and $30K for switching equipment (not including installation).

      Generator sets require two things that you need to consider. The first is on-site fuel storage, which determines "TBUP (Hrs)", the duration that backup power will last without intervention. (With today's environmental rules, this is not something that can be taken lightly.) The second is monthly maintenance. Once a month or so, the generators need to be checked to ensure they are working properly and are set up correctly for the current season. There is a lot of automation available here, but it is pretty expensive, and some of it can be avoided by simply having a "trusted" human :o) perform these regular checks.

    • 2. Now let's focus on the UPS. Some form of UPS is required to hold the load up while the generators are becoming operational. The sequence of events goes something like this: It can take about 4.5 seconds for a generator to crank up and become stable. (This is one of the main reasons for monthly maintenance.) If it fails, there may be a pause / purge of about 3 seconds followed by an additional 4.5 second start-up time. That adds up to 12 seconds, just under 15, to bring them up. Most folks add some margin here, but this is one place where time is money, and 30 seconds for "TUPS (Sec)", the duration of power provided by the UPS, seems appropriate. By this point, if you are not up, you have a bigger problem. If you are planning to use a conventional UPS with VRLA type batteries (UPSVRLA), you should expect to pay something around $50 to $75 per KVA for each 30 seconds of backup time. For the example configuration of 3MVA, if you use 500KVA UPS systems, you will need 6 of them (M x UPSKVA) and at $75 / KVA, this will cost you somewhere around $225K for the UPS systems alone. (Note: This is a very soft estimate and not all sizes and durations are available, so you will have to work within the constraints of your vendor when you are sizing/estimating the UPS. All estimates are just to give you an idea of some of the costs. These will vary as technology changes and with specific vendors.)

      There is a lot of other material required to hook this up. I would add 10% to the total to give you a good rule of thumb for materials cost. At this point, our stack and model look like the following, with configuration guidelines to come.

      There are a couple of other good resources I would refer to at this point, and both come from the Uptime Institute. They are: Cost Model: Dollars per kW plus Dollars per Square Foot of Computer Floor and A Simple Model for Determining True Total Cost of Ownership for Data Centers. Note: These are written along the lines of a conventional data center, but they do contain some good information.
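    For anyone who wants to rerun the 3MVA example with their own numbers, the planning steps above can be sketched roughly as follows. The unit sizes, per-KVA prices, timings and the $30K switching figure are the rules of thumb quoted in this post; the function and variable names are mine, and the choice to total the upper-end estimates before adding 10% is just one reading of "add 10% to the total".

```python
import math


def units_needed(load_kva: float, unit_kva: float, spares: int = 0) -> int:
    """Units required to carry load_kva, plus any redundant spares."""
    return math.ceil(load_kva / unit_kva) + spares


P_CRITICAL_KVA = 3000  # the 3MVA critical load from the example

# Generators: 1000KVA units, N+1, $150-$200 per KVA (materials only).
gen_count = units_needed(P_CRITICAL_KVA, 1000, spares=1)
gen_kva = gen_count * 1000
gen_cost_low, gen_cost_high = gen_kva * 150, gen_kva * 200

# UPS: 500KVA units sized to the critical load,
# $50-$75 per KVA for each 30 seconds of backup time.
ups_count = units_needed(P_CRITICAL_KVA, 500)
ups_kva = ups_count * 500
ups_cost_low, ups_cost_high = ups_kva * 50, ups_kva * 75

# Switching equipment: figure quoted in the post, not recomputed here.
switch_cost = 30_000

# Worst-case start: first crank, pause / purge, second crank.
start_sequence_s = 4.5 + 3.0 + 4.5
T_UPS_S = 30
margin_s = T_UPS_S - start_sequence_s

# Upper-end materials estimate with the 10% adder for hookup material.
subtotal_high = gen_cost_high + ups_cost_high + switch_cost
total_high = round(subtotal_high * 1.10)

print(f"generators: {gen_count} x 1000KVA, ${gen_cost_low:,} to ${gen_cost_high:,}")
print(f"UPS       : {ups_count} x 500KVA, ${ups_cost_low:,} to ${ups_cost_high:,}")
print(f"start-up  : {start_sequence_s} s worst case, {margin_s} s of UPS margin")
print(f"materials : about ${total_high:,} at the high end, incl. 10% adder")
```

    Installation is still excluded, and real unit sizes and durations depend on what your vendors actually offer, so treat the output as a first-pass budget number only.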

     

  • XS23 Cloud Server

    There has been some recent press around some of the equipment we’ve developed in our cloud computing group. The core of our business is essentially a consulting and design service and developing new products for customers is a big part of the fun. Because these aren’t mainstream PowerEdge systems, we don’t get the chance to show them off as much as we’d like. Our group has been talking for some time about “optimized designs” for cloud and hyperscale computing without showing what that can really mean, so it’s time to unveil something that’s come out of the lab.  Pictured here is one of our favorites: the XS23.

    image

    XS23 front – twelve 3.5” SAS or SATA drives; 3 per server

    This product was designed for a customer that needed maximum compute density, a healthy amount of local disk and, of course, lowest power draw possible. Our architecture team threw all that in the blender and out came a 2U standard rack mount chassis that houses four dual-socket servers and twelve 3.5” hot plug drives.

    image

    XS23 exploded view: two dual-socket servers mounted in chassis bottom; two in a mezzanine above. Industry standard rack-mount chassis.

    Density of this type is certainly not unheard of (half depth or twin 1U’s), but by going to a 2U chassis we were able to fit it with larger, more efficient fans and stack 3 rows of full 3.5” drives across the front. So, even with a 25% higher density than general purpose blades, it provides three local spindles of 3.5” SAS/SATA disk to each server. Of course there are tradeoffs. This was expressly designed for an environment with high node failure tolerance - a cloud application. By designing out a lot of the capabilities that weren’t required (like redundant power) we were able to deliver the performance and power profile required. Efficiencies are gained by shared resources - as seen in a lot of general purpose designs available today. We think the key to designing the perfect cloud server is knowing where to stop and also what not to build in. This is a function of each customer’s unique design goals. Applications truly capable of forgoing high availability in hardware are somewhat rare, but customers in this space have it – as well as a laser focus on their business levers. So in this case we took the problem statement and made the tradeoffs to yield highest efficiency and density within the performance parameters of the application.
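    The density arithmetic is easy to check. The XS23 figures below are from this post; the blade baseline is an assumption on my part (16 blades in a 10U enclosure), since the post does not say which blade chassis the 25% comparison is based on.

```python
def servers_per_u(servers: int, rack_units: int) -> float:
    """Server density expressed as servers per rack unit."""
    return servers / rack_units


# XS23: four dual-socket servers and twelve 3.5" drives in 2U.
xs23_density = servers_per_u(4, 2)
drives_per_server = 12 // 4

# Hypothetical general purpose blade enclosure: 16 blades in 10U.
blade_density = servers_per_u(16, 10)

advantage = xs23_density / blade_density - 1.0

print(f"XS23  : {xs23_density:.1f} servers/U, {drives_per_server} drives per server")
print(f"blades: {blade_density:.1f} servers/U (assumed baseline)")
print(f"XS23 density advantage: {advantage:.0%}")
```

    With a different blade enclosure the percentage will move, so swap in your own baseline before quoting it.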

    It’s important for me to emphasize that the XS23 is not generally available. This system is qualified and supported for only a handful of specific customer applications and locations; it’s not completely productized to bear a PowerEdge badge. I hope you’ll watch this space for more unique designs and the discussion on cloud taxonomy and architecture that Jimmy's leading.

  • Layer 1....

    I’d like to continue on our journey and build out the model that we have described, starting at the bottom and moving towards the top. The first thing we should do is change the name of layer 1. Some have pointed out to me that while facilities are an important element, this block is going to cover a lot more than just the facilities and we should change its name. I’d like to propose physical plant (which is a very familiar term to facilities folks) and see if this encompasses what lies ahead.

    image

    Figure 1 – Cloud Computing Layered Model

    The first aspect to consider as part of this layer is what I am going to call “macroscopic containment” or MC for short. Most folks would simply refer to this as the building, but I want to make a distinction here as there are many functions we can get from the MC.

    · The simplest form of MC is of course NONE. This is the case for equipment where the cabinetry is designed to sit out in the open. We see this in the telecom and perhaps the military industries, but not in this space. (Although there are some interesting discussions ahead and a debate about where “container” based solutions should go.)

    · Next we find a very simple MC or what I am going to refer to as temporary devices. The best example of this is a tent. Not very practical in most cases (in fact it almost sounds like a joke), but I know there are people considering them for areas where all they need is a bit of protection from the elements and some light physical security.

    · The next level is a fairly major transition to an actual building. This is probably where we are going to see most cloud installations and is what I think will ultimately prove to be most cost effective. I will refer to this as a utility building, which is best described as a simple shell with a concrete floor (no raised floor). It provides controlled separation between the IT environment and the outside environment. (I’ve seen these for about $38/sq ft. depending on the way you want the building finished out.)

    · The final MC type is more along the lines of conventional data centers with raised floors and the works. This provides a very clean and well controlled solution and is probably overkill for most cloud environments. A reasonable rule of thumb for this type of MC is about $500 per sq ft.
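    To see what the two per-square-foot figures mean in practice, here is a quick comparison. The $38 and $500 rates are the rules of thumb from the list above; the 10,000 sq ft floor area is purely a hypothetical size picked for illustration, and the tent and "none" options are left out because I have not given a cost for them.

```python
# Rule-of-thumb shell costs in dollars per square foot (from the list above).
MC_COST_PER_SQFT = {
    "utility building": 38,
    "conventional data center": 500,
}

FLOOR_AREA_SQFT = 10_000  # hypothetical footprint, for illustration only

mc_costs = {
    mc_type: rate * FLOOR_AREA_SQFT
    for mc_type, rate in MC_COST_PER_SQFT.items()
}

for mc_type, cost in mc_costs.items():
    print(f"{mc_type:<25} ${cost:>10,}")

ratio = mc_costs["conventional data center"] / mc_costs["utility building"]
print(f"conventional costs about {ratio:.0f}x the utility building")
```

    The ratio does not depend on the floor area chosen, which is really the argument for the utility building in one number.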

    We may want to add something describing this as owned, leased, or co-located space, but I have omitted this for now. I have also added MC to the schematic model we are going to build, but it isn’t much to look at. We’ll have to get a bit further in the definition for it to start having meaning.

    As always, your comments are welcomed. Next up, Utilities!