Way to Go, Faceplant

May 12, 2010

Thanks to Monty for sharing this link to a nicely-done graphic showing the evolution (devolution?) of Facebook’s default privacy policy.

If you use Facebook, this is required viewing. Seriously. Go change your privacy settings!

A Bad Apple Week

May 12, 2010

Those following this blog will know that I like Apple products, and I especially like Apple Service, having had to avail myself of their expertise more times than I care to count. My early MacBook Pro went in for repairs on enough occasions that Apple eventually replaced it.

Aside from a recent replacement of my logic board due to an nVidia flaw, my Apple experience has been without drama for the past year or so. Until about a week ago.

First, my iPhone 3G started to randomly freeze and reboot. After several visits to the Genius Bar in Cambridge, MA, indications were that this was a hardware problem. By the time I arrived in Palo Alto on business, I was sure it was hardware. A visit to the Apple Store and $199 fixed that problem: I opted to replace my current 16 GB 3G with the same model to maintain the option of applying a discount to the much-rumored next-gen iPhone, which I hope will appear soon.

The next morning my MacBook Pro beachballed, nothing I tried helped, and I eventually had to power cycle it. It then refused to boot. It then also refused to boot from an installation disc. And refused to work correctly in FireWire target disk mode. So…back to the Apple Store, where the diagnosis was that the disk was dead. Since my cloned backup (CCC, I love you!) was back in Boston, I opted to wait till I got home before taking any action on the MBP. I used my iPhone and a loaner Windows laptop to make it through my week of meetings in Palo Alto.

The disk was dead. Disk Utility could see the disk, but couldn’t repair, erase, or partition it. Note to anyone in a similar circumstance — it sometimes takes a very long time for a system with a dead disk to boot from the installation disc. Wait a lot longer than you’d expect — like five minutes or more — before giving up. Anyway, long story short, I bought a new 500 GB Hitachi 7200 rpm drive, installed it, restored from my clone and all was well.

Until this morning. When my wife dropped her MacBook on the floor. I’ll be stopping at the local Apple Store on my way home tonight to pick up her machine, which has a new disk installed and a fresh OS load.

They say bad things come in threes, so I’m hoping we are out of the woods for now.


Tap, tap…is this thing on?

May 12, 2010

It becomes more difficult to post a blog entry the further into the past my last post recedes. It’s now been three months since my last entry and I guess I’ve been thinking I need some Big, Interesting, Awesome Post to restart my blogging habit. And the bar keeps rising. Not a recipe for success.

So. I declare the hiatus over and will now return you to regularly scheduled programming. Except I am still figuring out how I want to handle my work-related blogging — whether to do it here under an HPC or Virtualization category or put it elsewhere. Stay tuned for that.

On to a New Adventure!

February 12, 2010

Today is my last day working at Sun Microsystems. Having declined my Oracle job offer, I am very excited to be moving to VMware to lead a new HPC effort, reporting to VMware’s CTO, Steve Herrod. I’ll be working internally across VMware to promote HPC technical requirements, and externally with customers and others in the HPC community to explain why virtualization is going to be the next big trend in HPC. I’ve covered some of this previously (here, here, here, and here) but there is much more to say, so stay tuned.

While I am going to miss my many friends and colleagues of thirteen or more years, I am totally psyched to be joining VMware!

Rest in Peace

January 15, 2010
Marguerite Handfield Simons
09/17/1937 – 09/11/2009
Julie Simons Droney
10/27/1967 – 12/06/2009

It has been an especially bad time for my family over the last few months with the loss of both my mother and my sister. Thank you everyone for your support.

Barbie’s Next Career

January 15, 2010

While I don’t follow her myself, I’m told Barbie has had over 120 “careers” since her introduction in 1959. Well, it is time for her to choose another, and Mattel wants to hear from you. Please vote for Computer Engineer Barbie! That is clearly much cooler than any of the other choices offered. Vote here.

Igniting the Earth’s Atmosphere

January 15, 2010

As part of background research for a blog entry I’m working on, I went looking for the name of the Manhattan Project scientist who was tasked with calculating whether an atomic detonation could ignite the Earth’s atmosphere and burn everyone on the planet to cinders. His name was Hans Bethe and he apparently concluded the bomb would not ignite the atmosphere. But according to the Wikipedia article on the Manhattan Project, Edward Teller co-authored a paper that also examined this question.

That paper, Ignition of the Atmosphere with Nuclear Bombs, was declassified in the 1970s and it is available as a PDF for your perusal here. I recommend reading the Abstract on Page 3 and the three concluding paragraphs on Page 18. The final paragraph, which I hereby nominate as a monumental understatement, reads as follows:

“One may conclude that the arguments of this paper make it unreasonable to expect that the N + N reaction could propagate. An unlimited propagation is even less likely. However, the complexity of the argument and the absence of satisfactory experimental foundations makes further work on the subject highly desirable.”

Apparently, the “satisfactory experimental foundations” were achieved at the Trinity site. Had that gone wrong, it would have brought an entirely new meaning to the term “test coverage.”

[This just gets worse: As my friend Monty points out, the paper is dated August 1946. The Trinity detonation occurred a year earlier, in July 1945.]

Virtualization for HPC: The Heterogeneity Issue

January 15, 2010

I’ve been advocating for a while now that virtualization has much to offer HPC customers (see here). In this blog entry I’d like to focus on one specific use case, heterogeneity. It’s an interesting case because while heterogeneity is either desirable or to be avoided, depending on your viewpoint, virtualization can help in either case.

The diagram above depicts a typical HPC cluster installation with each compute node running whichever distro was chosen as that site’s standard OS. Homogeneity like this eases the administrative burden, but it does so at the cost of flexibility for end-users. Consider, for example, a shared compute resource like a national supercomputing center or a centralized cluster serving multiple departments within a company or other organization. Homogeneity can be a real problem for end-users whose applications only run on either other versions of the chosen cluster OS or, worse, on completely different operating systems. These users are generally not able to use these centralized facilities unless they can port their application to the appropriate OS or convince their application provider to do so.

The situation with respect to heterogeneity for software providers, or ISVs (independent software vendors), is quite different. These providers have been wrestling with expenses and other difficulties related to heterogeneity for years. For example, while ISVs typically develop their applications on a single platform (OS 0 above), they must often port and support their application on several operating systems in order to address the needs of their customer base. Even if the ISV decides correctly which operating systems should be supported to maximize revenue, it must still incur considerable expenses to continually qualify and re-qualify its application on each supported operating system version. It must also maintain a complex, multi-platform testing infrastructure and in-house expertise to support these efforts.

Imagine instead a virtualized world, as shown above. In such a world, cluster nodes run hypervisors on which pre-built and pre-configured software environments (virtual machines) are run. These virtual machines include the end-user’s application and the operating system required to run that application. So far as I can see, everyone wins. Let’s look at each constituency in turn:

  • End-users — End-users have complete freedom to run any application using any operating system because all of that software is wrapped inside a virtual machine whose internal details are hidden. The VM could be supplied by an ISV, built by an open-source application’s community, or created by the end-user. Because the VM is a black box from the cluster’s perspective, the choice of application and operating system need no longer be restricted by cluster administrators.
  • Cluster admins — In a virtualized world, cluster administrators are in the business of launching and managing the lifecycle of virtual machines on cluster nodes and no longer need deal with the complexities of OS upgrades, configuring software stacks, handling end-user special software requests, etc. Of course, a site might still opt to provide a set of pre-configured “standard” VMs for end-users who do not have a need for the flexibility of providing their own VMs. (If this all sounds familiar — it should. Running a shared, virtualized HPC infrastructure would be very much like running a public cloud infrastructure like EC2. But that is a topic for another day.)
  • ISVs — ISVs can now significantly reduce the complexity and cost of their business. Since ISV applications would be delivered wrapped within a virtual machine that also includes an operating system and other required software, ISVs would be free to select a single OS environment for developing, testing, AND deploying their application. Rather than basing their operating system choice on market share considerations, the decision could be made based on the quality of the development environment, or perhaps the stability or performance levels achievable with a particular OS, or perhaps on the ability to partner closely with an OS vendor to jointly deliver a highly-optimized, robust, and completely supported experience for end-customers.
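
To make the black-box idea a bit more concrete, here is a minimal sketch in Python of what the cluster side might look like. Everything in it (the VMImage and Node types, the launch function, the first-fit placement rule) is hypothetical and invented for illustration; it is not how any particular hypervisor or resource manager works. The point is only that placement can be decided from resource requirements alone, without the cluster ever looking at the guest OS or application inside the VM.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VMImage:
    """An opaque virtual machine image (hypothetical type, for illustration)."""
    name: str
    guest_os: str      # chosen by whoever built the image; never read by the cluster
    application: str   # likewise hidden from the cluster
    cores: int         # the only detail the placement logic needs


@dataclass
class Node:
    """A cluster node running a hypervisor (hypothetical type)."""
    name: str
    free_cores: int
    running: list = field(default_factory=list)


def launch(image: VMImage, nodes: list) -> Node:
    """Place a VM on the first node with enough free cores.

    Assumption: simple first-fit placement. The decision uses only the
    core count; guest_os and application are never inspected.
    """
    for node in nodes:
        if node.free_cores >= image.cores:
            node.free_cores -= image.cores
            node.running.append(image.name)
            return node
    raise RuntimeError(f"no node has {image.cores} free cores for {image.name}")


if __name__ == "__main__":
    nodes = [Node("node0", free_cores=8), Node("node1", free_cores=8)]
    images = [
        VMImage("isv-solver", guest_os="OS 0", application="solver", cores=6),
        VMImage("user-code", guest_os="OS 2", application="in-house code", cores=4),
    ]
    for image in images:
        print(image.name, "->", launch(image, nodes).name)
```

Two VMs with different guest operating systems land on the same cluster without the placement code caring which OS is inside either one. A real system would of course also have to deal with memory, interconnects, and VM lifecycle, all of which this sketch ignores.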

Sun Grid Engine: Still Firing on All Cylinders

January 14, 2010

The Sun Grid Engine team has just released the latest version of SGE, humbly called Sun Grid Engine 6.2 update 5. It’s a yawner of a name for a release that actually contains some substantial new features and improvements to Sun’s distributed resource management software, among them Hadoop integration, topology-aware scheduling at the node level (think NUMA), and improved cloud integration and power management capabilities.

You can get the bits directly here. Or you can visit Dan’s blog for more details first. And then get the bits.

Sun HPC Consortium Videos Now Available

December 21, 2009

Thanks to Rich Brueckner and Deirdré Straughan, videos and PDFs are now available from the Sun HPC Consortium meeting held just prior to Supercomputing ’09 in Portland, Oregon. Go here to see a variety of talks from Sun, Sun partners, and Sun customers on all things HPC. Highlights for me included Dr. Happy Sithole’s presentation on Africa’s largest HPC cluster (PDF|video), Marc Parizeau’s talk about CLUMEQ’s Colosse system and its unique datacenter design (PDF|video), and Tom Verbiscer’s talk describing Univa UD’s approach to HPC and virtualization, including some real application benchmark numbers illustrating the viability of the approach (PDF|video).

My talk, HPC Trends, Challenges, and Virtualization (PDF|video) is an evolution of a talk I gave earlier this year in Germany. The primary purposes of the talk were to illustrate the increasing number of common challenges faced by enterprise, cloud, and HPC users and to highlight some of the potential benefits of this convergence to the HPC community. Virtualization is specifically discussed as one such opportunity.