Archive for July, 2008

Fresh Bits: Attention all OpenMP and MPI Programmers!

July 30, 2008

The latest preview release of Sun’s compiler and tools suite for C, C++, and FORTRAN users is now available for free download. Called Sun Studio Express 07/08, this release of Sun Studio marks an important advance for HPC customers and for any customer interested in extracting high performance from today’s multi-threaded and multi-core processors. In addition to numerous compiler performance enhancements, the release includes beta-level support for the latest OpenMP standard, OpenMP 3.0. It also includes some nice Performance Analyzer enhancements that support simple and intuitive performance analysis of MPI jobs. More detail on both of these below.

As the industry-standard approach for achieving parallel performance on multi-CPU systems, OpenMP has long been a mainstay of the HPC developer community. Version 3.0, which is supported in this new Sun Studio preview release, is a major enhancement to the standard. Most notably it includes support for tasking, a major new feature that can help programmers achieve better performance and scalability with less effort than previous approaches using nested parallelism. There are a host of other enhancements as well. The OpenMP expert will find the latest specification useful. If you are new to parallelism and have stumbled into a maze of twisty passages, all alike, you may find Using OpenMP: Portable Shared Memory Parallel Programming to be a useful introduction to parallelism and OpenMP.

A parallel quicksort example, written using the new OpenMP tasking feature supported in Sun Studio Express 07/08
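
For readers who want a feel for what tasking looks like in source form, here is a minimal sketch of a task-based quicksort in C. It is not the code from the figure captioned above; the function names (`parallel_quicksort`, `quicksort_tasks`, `partition`) and the `TASK_CUTOFF` value are illustrative assumptions, and only standard OpenMP 3.0 directives (`parallel`, `single`, `task`, `taskwait`) are used.

```c
#include <stddef.h>

/* Assumed, untuned threshold: ranges with this many elements or fewer
   are sorted without deferring new tasks. */
#define TASK_CUTOFF 1000

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Hoare-style partition around the middle element.
   Returns j such that v[lo..j] <= pivot <= v[j+1..hi]. */
static long partition(int *v, long lo, long hi)
{
    int pivot = v[lo + (hi - lo) / 2];
    long i = lo - 1, j = hi + 1;
    for (;;) {
        do { i++; } while (v[i] < pivot);
        do { j--; } while (v[j] > pivot);
        if (i >= j) return j;
        swap_int(&v[i], &v[j]);
    }
}

/* Sort v[lo..hi]; each half becomes an OpenMP task. */
static void quicksort_tasks(int *v, long lo, long hi)
{
    if (lo >= hi) return;
    long p = partition(v, lo, hi);

    /* The if() clause makes small ranges run immediately instead of
       being queued, which limits task-creation overhead. */
    #pragma omp task firstprivate(v, lo, p) if (hi - lo > TASK_CUTOFF)
    quicksort_tasks(v, lo, p);

    #pragma omp task firstprivate(v, p, hi) if (hi - lo > TASK_CUTOFF)
    quicksort_tasks(v, p + 1, hi);

    /* Wait for both child tasks before returning to the caller. */
    #pragma omp taskwait
}

/* Entry point: one thread seeds the recursion, the rest of the team
   picks up the tasks it generates. */
void parallel_quicksort(int *v, long n)
{
    #pragma omp parallel
    {
        #pragma omp single nowait
        quicksort_tasks(v, 0, n - 1);
    }
}
```

Compiled without OpenMP enabled, the pragmas are ignored and the same code runs as an ordinary serial quicksort, which makes it easy to check correctness before measuring any speedup.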

Sun Studio Express 07/08 also includes enhancements for programmers of parallel, distributed applications who use MPI. With this release of Sun Studio Express we have introduced tighter integration with Sun’s MPI library (Sun HPC ClusterTools). Sun’s Performance Analyzer can now examine the performance of MPI jobs by displaying information about message transfers and messaging performance using a variety of visualization methods. This extends Analyzer’s already-sophisticated on-node performance analysis capabilities. The screenshots below give some idea of the types of information that can be viewed. Note in particular the idea of viewing “MPI states” (e.g. MPI Wait and MPI Work) to get a high-level view of the performance of the MPI portion of an application: understanding how much time is spent doing actual work versus sitting in a wait state can yield useful insights into the performance of these parallel, distributed codes.

A source code viewer window augmented with several MPI-specific capabilities, one of which is illustrated here: the ability to quickly see how much work (or waiting) is performed within a function.

In addition to supporting direct viewing of specific MPI performance issues within an application, Analyzer now also supports a range of visualization tools useful for understanding the messaging portion of an MPI code. Zoomable timelines with MPI events are supported, as is the ability to map various metrics against the X and Y axes of a plotting area to display interesting characteristics of the MPI run, as shown below.

Just one example of Sun Studio’s new MPI charting capabilities. Shown here is a display showing the volume of messages transferred between communicating pairs of MPI processes during an application run.

This blog entry has barely scratched the surface of the new OpenMP and MPI capabilities available in this release. If you are a Solaris or Linux HPC programmer, please take these new capabilities for a test drive and let us know what you think. I know the engineering teams are excited by what they’ve accomplished and I hope you will share their enthusiasm once you’ve tried these new capabilities.

Sun Studio Express 07/08 is available for Solaris 9 & 10, OpenSolaris 2008.05, and Linux (SLES 9, RHEL 4) and can be downloaded here.

The Deep Blue Sea: Technology in the Service of Safety

July 28, 2008

My friends Jamie and Lori left Sunday on their annual month-long sailing trip from Boston to the Canadian maritime provinces. As usual, the trip begins with an open ocean sail across the Gulf of Maine directly from Boston to Cape Sable, Nova Scotia.

This year they are carrying a Spot satellite messenger on board. This neat little device can report its location every ten minutes, allowing others to track their progress over the course of the trip. It can also transmit a 911 emergency message, if needed. It is quite a nifty device and surprisingly inexpensive given its capabilities. Jim Gray should have had one of these on his boat last year when he went missing. Of course, boating is only one application–I can imagine this would be useful in any number of situations in which people may need to be rescued.

Here is a screenshot I took this morning of their progress towards Nova Scotia:

You can also view the live interface here.

Two MEASURED TeraFLOPs in a Box: Now THAT is Big Iron!

July 18, 2008

I love the smell of Big Iron in the morning.

We just announced new versions of our M-series midrange and high-end SMPs, the M4000, M5000, M8000, and M9000 systems, that sport the latest Fujitsu quad-core, dual-threaded SPARC64 VII processor. These systems, a co-development effort between Sun and Fujitsu, are traditionally viewed as high-end enterprise-class systems. With up to 64 quad-core processors, up to 2 TBytes of memory, and up to 288 PCIe or PCI-X IO slots, these systems are clearly high-end datacenter workhorses. But they kick butt on HPC workloads as well. No surprise given the tight coupling of compute and memory in such an SMP system, which is especially valuable for computations involving large amounts of very fine-grained communication between cooperating parallel processes.

We’ve published world record benchmark numbers on a standard OpenMP benchmark, besting the competition by considerable margins. We’ve also shown new world record benchmarks on a prominent standard floating-point benchmark. My favorite result, however, is a LINPACK score of over 2 TeraFLOPs with a single M9000 system using Solaris 10 and our latest compilers, Sun Studio 12. This result is almost 2X higher with the new 2.52 GHz SPARC64 VII processor than with the previous 2.4 GHz SPARC64 VI processor. Impressive, and yet another example of why shopping based on processor clock speeds is an increasingly bad idea. In any case, you can read more details about these benchmark results and others here and here.
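
To make the clock-speed point concrete, here is a rough back-of-the-envelope sketch in C. It computes theoretical peak rates, not the measured LINPACK scores quoted above. Two inputs are assumptions that do not come from this post: that the SPARC64 VI is a dual-core part, and that each core on either processor can retire 4 floating-point operations per cycle. The helper names are hypothetical.

```c
/* Theoretical peak in GFLOPs:
   sockets * cores per socket * flops per cycle per core * clock in GHz. */
double peak_gflops(int sockets, int cores_per_socket,
                   int flops_per_cycle, double ghz)
{
    return (double)sockets * cores_per_socket * flops_per_cycle * ghz;
}

/* Ratio of new peak to old peak for a fully populated 64-socket system.
   Core counts and flops/cycle for the older chip are assumptions. */
double peak_ratio_vii_over_vi(void)
{
    double vii = peak_gflops(64, 4, 4, 2.52); /* quad-core SPARC64 VII */
    double vi  = peak_gflops(64, 2, 4, 2.40); /* SPARC64 VI, assumed dual-core */
    return vii / vi;
}

/* Ratio of the clock rates alone. */
double clock_ratio_vii_over_vi(void)
{
    return 2.52 / 2.40;
}
```

Under those assumptions the clock ratio is only 1.05 while the peak ratio is 2.1, so nearly all of the gain comes from doubling the core count rather than from the 5% faster clock.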

Solaris InfiniBand: A Big Day!

July 14, 2008

Yesterday, the Sun InfiniBand engineering team released Solaris 10 driver support for ConnectX (a.k.a. Hermon), the latest generation of InfiniBand silicon from Mellanox. This is important news both for Solaris HPC customers and for enterprise customers interested in the best bandwidth and latencies available for applications like Oracle RAC. Congratulations to the team!

In addition to the driver, the update also includes a new flash updating tool for ConnectX, a uDAPL update, and several additional components, all of which are described in the documentation.

The specific ConnectX-related Sun part numbers supported by this release are: X4217A-Z HCA card, X4216A-Z EM, and X5196A-Z, the 24 Port NEM for the SunBlade 6048 family of servers. It also supports third-party cards based on the following Mellanox chips: MT25408, MT25418, and MT25428.

The release, called “Solaris InfiniBand Updates 2,” is available for free download here.

Innovation@Sun Conference

July 14, 2008

Each year Sun holds two internal technical conferences that bring together all of Sun’s most senior engineers (Principal Engineers, Distinguished Engineers, and Fellows) at an offsite location for two days of technical presentations and networking. The conferences alternate between a smaller and larger venue. The smaller conference (Technology@Sun) accommodates just the PEs, DEs, and Fellows. At the larger conference (Innovation@Sun) we are able to include approximately 50 other attendees from around Sun. This year, we will choose them based on the merits of their poster or demo proposals.

Submissions closed last Friday, and we received a total of 227 technical proposals from around the company. The program committee will now review all proposals and decide who will be invited to attend. We are also busy planning the rest of the technical content and events for the conference, which promises to be at least as interesting and useful as prior events.