Welcome to the GCPI Blog

11/06/2008 08:28 am
Matt Reilly  |  1 comment

Welcome to what I hope will be an interesting and relevant ongoing discussion of how to provide objective energy efficiency information to guide HPC investment decisions. Energy efficiency has certainly risen to the forefront as a key issue. From a global perspective, data centers in the US consume the equivalent of the output of five large nuclear power stations. From a local perspective, a large number of the data center people I speak to are running out of power in their facilities or are being asked to foot the substantial bill for their electricity consumption.

There are no real tools available to bring power efficiency into the purchasing process. The nearest thing we have is Wu-chun Feng’s well-known Green500 list, a great ranking of the energy efficiency of the fastest computers. This list is interesting, and raises the visibility of the issue. However, the Green500 draws its systems from the Top500 list, and most large computer systems in the world won't appear there. How can we compare the types of systems that we see in labs, workshops, and machine rooms every day? I believe we need an objective tool to guide the rest of us in our purchase decisions, burning through the “marketing fog” that rolls in from computer vendors. Yes, SiCortex is a computer vendor too…

I've talked about this issue in my own computing blog and I look forward to participating in a lively discussion at Wu-chun Feng and Kirk Cameron's Green500 Birds of a Feather meeting during SC08.

In the meantime, apparently an earlier blog post of mine inspired a group of my colleagues at SiCortex to join forces with others in our HPC community and propose just such a tool. The Green Computing Performance Index (GCPI) was created to provide a structured and rational method for assessing the energy efficiency of computer systems. We created the methodology from Jack Dongarra’s well-respected HPCC benchmark suite, leveraging its breadth of measurement and the fact that there is publicly available information for most of the leading HPC systems. The data from computer installations posted at the HPC Challenge website provided the performance “numerator” for the GCPI. We derived the energy consumption “denominator” from published sources to provide a snapshot of sample comparisons. We provide this “snapshot” in three forms:

  1. For those who prefer to compare benchmark results from different systems “as is,” we are providing a simple matrix of every HPCC benchmark result per kW for each of the reference systems.
  2. For managers and decision makers who are looking for the “Holy Grail,” a single number to rank and compare systems, we have created an index based on the full benchmark suite. The index is normalized to the balanced Cray XT3 system and consolidated via weighted averages. A favorite of my marketing colleague here at SiCortex, but certainly not the choice of everyone.
  3. For the data miners out there, we will shortly be providing a tool to enable you to create your own index. Select the benchmarks that are relevant to your work, determine your own weighting, and go. Out comes your own index. We suspect this will be especially useful for analyzing and evaluating systems that will spend most of their time running a small set of extremely well characterized applications. Check the GCPI micro site in the next week.

The first two are available now on the GCPI micro site, and the third will follow there.
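For readers who think better in code, here is a rough sketch of how a single-number index of this kind could be computed: divide each benchmark result by power, normalize to a reference system, and take a weighted average. This is my own illustration, not the GCPI tool itself. The function names, benchmark labels, power figures, and weights are all hypothetical placeholders, and the sketch assumes every benchmark is "higher is better."

```python
# Illustrative sketch only: every benchmark figure, power number, and weight
# below is a made-up placeholder, not a measured or published result.

def per_kw(results, power_kw):
    """Divide each benchmark result by the system's power draw in kW."""
    return {name: value / power_kw for name, value in results.items()}


def weighted_index(system_per_kw, reference_per_kw, weights):
    """Normalize each per-kW result to the reference system, then take a
    weighted average of the ratios. Assumes higher is better for every
    benchmark listed in `weights`."""
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weight * system_per_kw[name] / reference_per_kw[name]
        for name, weight in weights.items()
    )
    return weighted_sum / total_weight


# Hypothetical reference system (stands in for the normalization baseline).
reference = per_kw({"hpl_tflops": 10.0, "stream_gbs": 2000.0}, power_kw=100.0)
# Hypothetical system under evaluation.
candidate = per_kw({"hpl_tflops": 4.0, "stream_gbs": 1200.0}, power_kw=20.0)

# Two hypothetical weightings: equal, and one that favors compute.
equal_weights = {"hpl_tflops": 1.0, "stream_gbs": 1.0}
compute_heavy = {"hpl_tflops": 3.0, "stream_gbs": 1.0}

reference_index = weighted_index(reference, reference, equal_weights)
balanced_index = weighted_index(candidate, reference, equal_weights)
custom_index = weighted_index(candidate, reference, compute_heavy)

print(reference_index, balanced_index, custom_index)
```

The reference system scores 1.0 against itself by construction, and swapping in a different weights dictionary is all that a build-your-own index amounts to in this sketch. Benchmarks where lower is better, such as latency, would need their ratio inverted before averaging.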

Any new metric invites debate. That's what we're facilitating with the GCPI blog. We don't intend the methodology we're introducing today to be the last word. In fact, we want it to spark a wider discussion. Is HPCC the "right" basis set? How should we measure power? kVA? Watts? Nameplate rating? Given the absence of I/O in the HPCC suite, what can we say about I/O-intensive systems?

So, let's start the debate here and now. Are we on the right track here? Specifically, for the initial topic, is HPCC the right basis set? Weigh in, friends.

Matt

Comments

One thing to keep in mind when you look at the GCPI, and when you look at the HPC Challenge results generally, is the scale of the system under test. In the bandwidth tests, for example, it is a lot easier to get a high Random Ring Bandwidth result in a smaller system with a single level of, say, InfiniBand switches than in a large-scale system with thousands of cores.

If your application is embarrassingly parallel, this doesn't matter, but for large-scale problems, it makes sense to check the GCPI of systems with similar scale.
Larry Stewart (SiCortex Engineering) | 11/11/2008 12:21 pm