If you use Facebook, this is required viewing. Seriously. Go change your privacy settings!
Those following this blog will know that I like Apple products, and I especially like Apple service, having had to avail myself of their expertise more times than I care to count. My early MacBook Pro went in for repairs on enough occasions that Apple eventually replaced it.
Aside from a recent replacement of my logic board due to an nVidia flaw, my Apple experience has been without drama for the past year or so. Until about a week ago.
First, my iPhone 3G started to randomly freeze and reboot. After several visits to the Genius Bar in Cambridge, MA, indications were that this was a hardware problem. By the time I arrived in Palo Alto on business, I was sure it was hardware. A visit to the Apple Store and $199 fixed that problem: I opted to replace my current 16 GB 3G with the same model to maintain the option of applying a discount to the much-rumored next gen iPhone, which I hope will appear soon.
The next morning my MacBook Pro beachballed. Nothing I tried helped, and I eventually had to power cycle it. It then refused to boot. It also refused to boot from an installation disc, and refused to work correctly in FireWire target disk mode. So…back to the Apple Store, where the diagnosis was that the disk was dead. Since my cloned backup (CCC, I love you!) was back in Boston, I opted to wait till I got home before taking any action on the MBP. I used my iPhone and a loaner Windows laptop to make it through my week of meetings in Palo Alto.
The disk was dead. Disk Utility could see the disk, but couldn’t repair, erase, or partition it. Note to anyone in a similar circumstance — it sometimes takes a very long time for a system with a dead disk to boot from the installation disc. Wait a lot longer than you’d expect — like five minutes or more — before giving up. Anyway, long story short, I bought a new 500 GB Hitachi 7200 rpm drive, installed it, restored from my clone and all was well.
Until this morning, when my wife dropped her MacBook on the floor. I’ll be stopping at the local Apple Store on my way home tonight to pick up her machine, which has a new disk installed and a fresh OS load.
They say bad things come in threes, so I’m hoping we are out of the woods for now.
It becomes more difficult to post a blog entry the further into the past my last post recedes. It’s now been three months since my last entry and I guess I’ve been thinking I need some Big, Interesting, Awesome Post to restart my blogging habit. And the bar keeps rising. Not a recipe for success.
So. I declare the hiatus over and will now return you to regularly scheduled programming. Except I am still figuring out how I want to handle my work-related blogging — whether to do it here under an HPC or Virtualization category or put it elsewhere. Stay tuned for that.
Today is my last day working at Sun Microsystems. Having declined my Oracle job offer, I am very excited to be moving to VMware to lead a new HPC effort, reporting to VMware’s CTO, Steve Herrod. I’ll be working internally across VMware to promote HPC technical requirements, and externally with customers and others in the HPC community to explain why virtualization is going to be the next big trend in HPC. I’ve covered some of this previously (here, here, here, and here) but there is much more to say, so stay tuned.
While I am going to miss my many friends and colleagues of thirteen or more years, I am totally psyched to be joining VMware!
Marguerite Handfield Simons
09/17/1937 – 09/11/2009
Julie Simons Droney
10/27/1967 – 12/06/2009
It has been an especially bad time for my family over the last few months with the loss of both my mother and my sister. Thank you everyone for your support.
While I don’t follow her myself, I’m told Barbie has had over 120 “careers” since her introduction in 1959. Well, it is time for her to choose another, and Mattel wants to hear from you. Please vote for Computer Engineer Barbie! That is clearly much cooler than any of the other choices offered. Vote here.
As part of background research for a blog entry I’m working on, I went looking for the name of the Manhattan Project scientist who was tasked with calculating whether an atomic detonation could ignite the Earth’s atmosphere and burn everyone on the planet to cinders. His name was Hans Bethe and he apparently concluded the bomb would not ignite the atmosphere. But according to the Wikipedia article on the Manhattan Project, Edward Teller co-authored a paper that also examined this question.
That paper, Ignition of the Atmosphere with Nuclear Bombs, was declassified in the 1970s and it is available as a PDF for your perusal here. I recommend reading the Abstract on Page 3 and the three concluding paragraphs on Page 18. The final paragraph, which I hereby nominate as a monumental understatement, reads as follows:
“One may conclude that the arguments of this paper make it unreasonable to expect that the N + N reaction could propagate. An unlimited propagation is even less likely. However, the complexity of the argument and the absence of satisfactory experimental foundations makes further work on the subject highly desirable.”
Apparently, the “satisfactory experimental foundations” were achieved at Trinity site. Had that gone wrong, it would have brought an entirely new meaning to the term “test coverage.”
[This just gets worse: As my friend Monty points out, the paper is dated August 1946. The Trinity detonation occurred a year earlier, in July 1945.]
I’ve been advocating for a while now that virtualization has much to offer HPC customers (see here). In this blog entry I’d like to focus on one specific use case: heterogeneity. It’s an interesting case because, depending on your viewpoint, heterogeneity is either desirable or something to be avoided, and virtualization can help in either case.
The diagram above depicts a typical HPC cluster installation with each compute node running whichever distro was chosen as that site’s standard OS. Homogeneity like this eases the administrative burden, but it does so at the cost of flexibility for end-users. Consider, for example, a shared compute resource like a national supercomputing center or a centralized cluster serving multiple departments within a company or other organization. Homogeneity can be a real problem for end-users whose applications run only on other versions of the chosen cluster OS or, worse, on completely different operating systems. These users generally cannot use such centralized facilities unless they can port their application to the appropriate OS or convince their application provider to do so.
The situation with respect to heterogeneity is quite different for software providers, or ISVs (independent software vendors). These providers have been wrestling with expenses and other difficulties related to heterogeneity for years. For example, while ISVs typically develop their applications on a single platform (OS 0 above), they must often port and support those applications on several operating systems to address the needs of their customer base. Even assuming an ISV correctly decides which operating systems to support to maximize revenue, it must still incur considerable expense to continually qualify and re-qualify its application on each supported operating system version, and it must maintain a complex, multi-platform testing infrastructure and the in-house expertise to support these efforts.
Imagine instead a virtualized world, as shown above. In such a world, cluster nodes run hypervisors on which pre-built and pre-configured software environments (virtual machines) are run. These virtual machines include the end-user’s application and the operating system required to run that application. So far as I can see, everyone wins. Let’s look at each constituency in turn:
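To make the decoupling concrete, here is a minimal sketch of the idea in Python. All names and data here are hypothetical illustrations, not any real scheduler’s API: the point is simply that once every node runs a hypervisor, job placement can match a job to a VM image providing its required OS, independent of which physical node it lands on.

```python
# Hypothetical sketch: placing jobs in a virtualized cluster.
# VM images available at the site, keyed by the OS they provide.
vm_images = {
    "rhel5": "images/rhel5-app-env.vmdk",
    "sles10": "images/sles10-app-env.vmdk",
    "windows2008": "images/win2008-app-env.vmdk",
}

def place_job(job, free_nodes):
    """Pick any free hypervisor node plus the VM image matching the job's OS.

    In the homogeneous (non-virtualized) case, a job could only run if its
    required OS equaled the single site-wide OS; here, any free node will do.
    """
    image = vm_images.get(job["required_os"])
    if image is None:
        raise ValueError(f"no VM image for OS {job['required_os']!r}")
    node = free_nodes.pop()  # any node works: each one just runs a hypervisor
    return {"node": node, "image": image, "app": job["app"]}

jobs = [
    {"app": "cfd_solver", "required_os": "sles10"},
    {"app": "eda_tool", "required_os": "windows2008"},
]
free = ["node01", "node02", "node03"]
placements = [place_job(j, free) for j in jobs]
```

In this toy model, both the SLES-only solver and the Windows-only EDA tool run side by side on the same cluster, which is exactly the end-user win described above.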
The Sun Grid Engine team has just released the latest version of SGE, humbly called Sun Grid Engine 6.2 update 5. It’s a yawner of a name for a release that actually contains some substantial new features and improvements to Sun’s distributed resource management software, among them Hadoop integration, topology-aware scheduling at the node level (think NUMA), and improved cloud integration and power management capabilities.
Thanks to Rich Brueckner and Deirdré Straughan, videos and PDFs are now available from the Sun HPC Consortium meeting held just prior to Supercomputing ’09 in Portland, Oregon. Go here to see a variety of talks from Sun, Sun partners, and Sun customers on all things HPC. Highlights for me included Dr. Happy Sithole’s presentation on Africa’s largest HPC cluster (PDF|video), Marc Parizeau’s talk about CLUMEQ’s Colossus system and its unique datacenter design (PDF|video), and Tom Verbiscer’s talk describing Univa UD’s approach to HPC and virtualization, including some real application benchmark numbers illustrating the viability of the approach (PDF|video).
My talk, HPC Trends, Challenges, and Virtualization (PDF|video) is an evolution of a talk I gave earlier this year in Germany. The primary purposes of the talk were to illustrate the increasing number of common challenges faced by enterprise, cloud, and HPC users and to highlight some of the potential benefits of this convergence to the HPC community. Virtualization is specifically discussed as one such opportunity.