Supercomputing: An Interview with Henry Neeman

By Bonnie Bracey Sutton
Editor, Policy Issues

Introduction: “Dr. Henry Neeman is the Director of the OU Supercomputing Center for Education & Research and an adjunct assistant professor in the School of Computer Science at the University of Oklahoma. . . . In addition to his own teaching and research, Dr. Neeman collaborates with dozens of research groups, applying High Performance Computing techniques in fields such as numerical weather prediction, bioinformatics and genomics, data mining, high energy physics, astronomy, nanotechnology, petroleum reservoir management, river basin modeling and engineering optimization. . . . Dr. Neeman’s research interests include high performance computing, scientific computing, parallel and distributed computing and computer science education” (Oklahoma Supercomputing Symposium 2011).

Henry Neeman

ETCJ: In plain English, what is Supercomputing (SC)?

Henry Neeman: Supercomputing is the biggest, fastest computing in the world right this minute, and likewise a supercomputer is one of the biggest, fastest computers right this minute. The reason we say “right this minute” is that computers are always getting bigger and faster, so if something is a supercomputer today, it won’t be a supercomputer a few years from now. In fact, the biggest, fastest supercomputer of 15 years ago would be a laptop today, and the biggest, fastest computer of 25 years ago would be a cell phone today. Here’s an example: In 2002, OU got our first big cluster supercomputer. It could do slightly over a trillion calculations per second, which made it one of the fastest supercomputers in the world. It took up 132 square feet of floor space, weighed 5 tons, and cost almost a million dollars. Today, that same computing speed can be had in two graphics cards — and next year, it’ll be less than one.
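To put rough numbers on how quickly a supercomputer becomes an ordinary computer, here is a back-of-the-envelope sketch in Python; the two-year doubling time and the 1,000x speed gap are illustrative assumptions, not figures from the interview.

    # Back-of-the-envelope sketch: how long until commodity hardware
    # catches up with a machine that is some factor faster today?
    # The ~2-year doubling time is an illustrative assumption.
    import math

    def years_to_catch_up(speed_ratio, doubling_years=2.0):
        """Years until a doubling trend closes a given speed gap."""
        return doubling_years * math.log2(speed_ratio)

    # A machine 1,000x faster than a desktop today would be matched
    # by desktops in roughly 20 years at a 2-year doubling time:
    print(round(years_to_catch_up(1000)))   # -> 20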

Often, supercomputing is used for Computational Science, or more broadly Computational and Data-Enabled Science and Engineering (CDESE), which is the use of computing to conduct experiments. Sometimes CDESE is used for understanding results observed in the real world, but often it’s used to study phenomena that are too big (like a galaxy), too small (like an atom), too slow (like the entire lifetime of the universe), too fast (like collisions of subatomic particles), too expensive (testing dozens of airplane prototypes to build one new product) or just plain too dangerous (running around inside a tornado) to do in the real world.
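As a purely illustrative example of using computing as a stand-in for a physical experiment (not a project mentioned in the interview), here is a toy Python simulation of a falling object:

    # Toy "virtual experiment": simulate a falling object instead of
    # dropping one in a lab. Purely illustrative.
    dt, g = 0.01, 9.81            # time step (s), gravity (m/s^2)
    height, velocity, t = 100.0, 0.0, 0.0

    while height > 0.0:           # simple Euler time-stepping
        velocity += g * dt
        height -= velocity * dt
        t += dt

    print(f"The object hits the ground after about {t:.1f} seconds")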

ETCJ: How is SC being used by “scientists and engineers with modest computing experience”?

HN: Many science and engineering researchers are very sophisticated about their area of research, but not very sophisticated about computing, especially large scale computing. At OU, we teach them the basic concepts of supercomputing, then sit down with them and help them do the work they need to do on our supercomputer. More often than not, that just means helping them download a free “community code” off the web, install it, and get it up and running for their research. But sometimes it means helping them write software, or modify software so it can run on a supercomputer. Either way, our goal is to make them productive as quickly as possible.
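As a rough sketch of what “modifying software to run on a supercomputer” can look like, here is a minimal parallel example using MPI via the mpi4py package; the package, the toy problem and the launch command are illustrative assumptions, not tools named in the interview.

    # Minimal MPI sketch (assumes mpi4py is installed on the cluster).
    # Each copy of the program handles its own slice of the work;
    # typically launched with something like:
    #     mpirun -np 64 python sum_squares.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()        # which copy am I?
    size = comm.Get_size()        # how many copies are running?

    n = 1_000_000
    local = sum(i * i for i in range(rank, n, size))

    # Combine the partial sums on rank 0 and print the answer there.
    total = comm.reduce(local, op=MPI.SUM, root=0)
    if rank == 0:
        print("sum of squares:", total)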

ETCJ: Is SC being used by K-12 and college-level educators to conduct research in teaching and learning? If yes, can you share some examples?

HN: In fact, there’s a new online journal about teaching supercomputing, the Journal of Computational Science Education. And there are a number of organizations focused on research in both the teaching of supercomputing and CDESE, and the use of supercomputing and CDESE in teaching. Probably the world leader in both of these is the Shodor Education Foundation. They’ve been both doing this and studying it for over a decade.

ETCJ: Tell us about OU’s Condor pool, or massive grid of PCs. How is this being used by researchers? Can other schools and colleges create similar pools and how might they be used by educators and students?

HN: OU Information Technology manages a good number of PC labs all over campus — in the library, the residence halls, the Union, various departments — that collectively add up to about 800 PCs. When those PCs are idle — which in practice is about 80% of the time — they’re available for researchers to do number crunching on. The biggest consumer of our Condor pool is our High Energy Physics group, the ones who are banging tiny particles together at unbelievably high speed. But we make the Condor pool available to anyone at OU who wants it. Recently, we had a grad student do a project where he had to run his software 60,000 times. Doing that on his desktop would have taken years, maybe decades; on the Condor pool, it took a matter of weeks. Anyone can start a Condor pool of their own; you can run it on just a single PC or on thousands. And all the software is free.
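To give a feel for that kind of workload, here is a small Python sketch of a “run the same program many times” sweep. The task function is a made-up placeholder, and on a real Condor pool each task would be submitted as its own job rather than run in a local process pool.

    # Illustrative stand-in for a 60,000-run sweep. On a Condor pool,
    # each run_one(task_id) would be a separate job; a local process
    # pool is used here only to show the structure.
    from concurrent.futures import ProcessPoolExecutor

    def run_one(task_id):
        """One independent run, keyed by its task number (placeholder)."""
        # A real task would read its own inputs and write its own outputs.
        return task_id * 0.5

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            results = list(pool.map(run_one, range(60_000)))
        print("finished", len(results), "independent runs")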

ETCJ: Tell us about your NSF project, “Oklahoma Optical Initiative.”

HN: The Oklahoma Optical Initiative (OCII, pronounced “Okie”) is a project funded by the National Science Foundation’s Inter-Campus and Intra-Campus Cyber Connectivity program. It’s a collaboration among OU, Oklahoma State U, U Tulsa, Langston U and the Samuel Roberts Noble Foundation. It has four pieces to it:

  1. Transforming Oklahoma’s statewide ring to the latest high end optical hardware, which will allow Oklahoma institutions to run dedicated circuits across the state at minimal cost and labor.
  2. Upgrading the connectivity at several Oklahoma institutions, with improvements ranging from a factor of 5 to a factor of 100: OU (10 times), Oklahoma State U (10 times), U Tulsa (5 times), Langston U (100 times, for Oklahoma’s only Historically Black University) and the Samuel Roberts Noble Foundation (22 times), a nonprofit research foundation in rural Oklahoma. We’re also making improvements for some rural communities and some Tribal colleges. Lately we’ve been working closely with the College of the Muscogee Nation in Okmulgee.
  3. The Oklahoma Telepresence Initiative, where we’re providing telepresence capability to institutions across Oklahoma.
  4. The Oklahoma Networking Mentorship Program, where we’re providing workforce development to institutions across Oklahoma. Specifically, we arrange for networking professionals to give presentations to networking courses at institutions across the state, talking about what it’s like to do networking for a living. And we provide job shadowing opportunities so students can see the work up close. We do this at PhD-granting, masters-granting and bachelors-granting institutions, community colleges, career techs, and even a high school. Many of these institutions have high populations of underrepresented minorities — this is a great way to reach an audience that often doesn’t get exposed to the cutting edge. The ONMP has been so well received that, at the urging of our external evaluators, we’re expanding the ONMP into the OITMP, covering other areas of Information Technology. This semester, we’ve added security, and we’ll add more areas of IT as we move forward.

ETCJ: What is CI, or the Cyberinfrastructure Initiative?

HN: The Oklahoma Cyberinfrastructure Initiative is an agreement between OU and Oklahoma State U to help everyone in the state learn to use Cyberinfrastructure (supercomputing, high performance networking, CDESE and so on) by providing our centrally owned Cyberinfrastructure resources to institutions statewide, not just academic but government, nonprofit and even commercial — and for most of them, it’s free. It includes not only our resources but also our services, especially “Supercomputing in Plain English,” our very successful supercomputing education endeavor, and the Oklahoma Supercomputing Symposium.

ETCJ: Tell us more about the “new class of HPC users who aren’t experts in high performance computing, but are able to generate results that rival those of an expert.” In the future, will SC be a resource that’s available to the general public? How will people, in general, use SC in their everyday lives and pursuits?

HN: You may not see the supercomputers, but every single day supercomputing is making our lives better. Everything from the cars we drive to the weather forecast on TV to the movies we watch to the detergent bottles in our laundry rooms is made, or made better, by supercomputing. Today, there are a number of ways for citizens to access supercomputing. Often, these are known as “science gateways,” and they provide a simple interface to a complicated back end. An example is nanoHUB, which K-12 and postsecondary students can use to do nanotechnology simulations. In fact, the nanoHUB website has curricula and teaching materials that any teacher can put to work in their classroom.

ETCJ: In the future, how will SCs change the educational landscape? That is, how will they change teaching and learning?

HN: Teachers will have ways not just to show students phenomena that they previously couldn’t study in a classroom, but to have their students experiment with them.

ETCJ: Are schools and colleges doing enough to educate students about computers? If not, what more needs to be done?

HN: Ironically, the more ubiquitous and usable computers become, the less urgency people feel about learning about computing. And, as testing has become an increasing focus of schools nationwide, subjects outside the tests have fallen off the radar. So it’s no surprise that Computer Science has become less common in the classroom. At the same time, schools are increasingly discovering the value of computers in teaching their core subject matter. It may be that the best way to attract students into computing careers is to have them using computing everywhere.

ETCJ: In the future, how will SCs change social networking?

HN: Social networking services, both by accident and by design, collect mountains, oceans of data. And it turns out that all this data can be incredibly valuable. A recent article in Communications of the Association for Computing Machinery highlighted the mining of Twitter data — Tweets — to track flu outbreaks. Like other companies, social networks are discovering the value of supercomputing for finding hidden patterns in the data they’ve collected.
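As a toy illustration of that kind of pattern mining (with invented messages and keywords, and none of the statistical care a real study would need), here is a short Python sketch that counts flu-related messages per day:

    # Toy sketch: count messages per day that mention flu-related terms.
    # The messages and keyword list are invented for illustration.
    from collections import Counter

    FLU_TERMS = {"flu", "fever", "cough"}

    messages = [
        ("2011-01-03", "home sick with the flu and fever all night"),
        ("2011-01-03", "great game last night"),
        ("2011-01-04", "this cough just will not quit"),
    ]

    mentions_per_day = Counter()
    for day, text in messages:
        if set(text.lower().split()) & FLU_TERMS:
            mentions_per_day[day] += 1

    print(mentions_per_day)   # Counter({'2011-01-03': 1, '2011-01-04': 1})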

ETCJ: Will SCs play a role in finding cures for cancer? Bringing about world peace? Improving education? Reducing poverty?

HN: All of the above, and in fact supercomputing is already being used in all of these areas. Take cancer, for example; much of what’s happening in oncology research today involves what are collectively called “the omics”: genomics, proteomics, metabolomics and so on. These are large-scale data mining problems, looking for patterns in big data collections that show how various genetic, protein and metabolism phenomena lead to structures that cause diseases like cancer. World peace is, in an odd way, one of the oldest supercomputing applications. Specifically, the National Nuclear Security Administration, part of the US Department of Energy, was one of the earliest adopters of supercomputing, as a means of simulating nuclear explosions, especially after the signing of the Nuclear Test Ban Treaty. Since we can’t blow up our bombs, we test them virtually, via supercomputing. That way, both we and our rivals have confidence that an attack on us really would be the end of the world — so no one tries. For reducing poverty, supercomputing is increasingly important in the social sciences, especially in economics, where we’re learning more and more about how economies work in the real world, both by running simulations and by data mining.

ETCJ: Tell us about your passion for ballroom dancing.

HN: I got into ballroom dancing in grad school, for all the wrong reasons — namely, to meet girls. But I discovered that I loved it, a lot, and by my third semester I was already teaching it. I’ve been dancing for 22 years and teaching it for 20, and I still love both. In fact, my wife and I teach ballroom dancing twice a week and go social dancing most weekends. With two little kids at home, it’s pretty much the only hobby time we have time for.

ETCJ: Anything else you’d like to tell us that we haven’t covered?

HN: Nah, you hit the high points.

10 Responses

  1. This is fascinating, but – due to my ignorance of computing – there are a few things I don’t quite understand:

    – Cluster computer v. grid of computers: do they both involve harnessing the power of several computers?

    – If so, do the 2 different terms indicate a difference in the way the individual computers are linked?

    And perhaps off-topic, but there seems to be an analogy: in his March 2011 TED talk, “The Birth of a Word,” Deb Roy explains how his MIT team used computers to analyze years of video taken by webcams in his home, tracking how his son started using words according to his interactions with his parents and nanny.

    And in the second part, he explains how his team then applied the same analysis tools to TV broadcasts and to reactions to them on social networks like Twitter and Facebook. This must require a very powerful computer, but is it one that works like the one at the University of Oklahoma?

    • Both cluster computers and grids involve harnessing the power of several computers, but in different ways. Roughly speaking, think of a cluster as local — typically a cluster will have a dedicated internal network that isn’t accessible from outside the cluster — while grids are more broadly distributed, communicating over wide area (and often commodity) networks.

      In practice, this means that clusters and grids are used for different things. Grids are good for “loosely coupled” applications, where the various pieces can advance independently of one another; a classic example is Monte Carlo simulation, where a phenomenon is studied by randomly generating many examples of it and then averaging their properties (there’s a short sketch of this below). Clusters are great for “tightly coupled” problems, where all of the pieces have to advance in lockstep; a good example is weather forecasting, where the weather at the eastern edge of Oklahoma has to line up, not just in space but in time, with the weather at the western edge of Arkansas.

      With respect to Deb Roy’s discussion, while I’m not familiar with the specific example, it wouldn’t surprise me if he used a large group of computers to mine his 90,000 hours of video data and 140,000 hours of audio data, because if he had that much data, then analyzing one file at a time would take much much longer than analyzing many of them at the same time. Whether that was on a cluster, a grid or just a roomful of laptops, I couldn’t say — but in principle all those solutions would have roughly the same outcome, a huge decrease in runtime in the real world, and therefore a huge increase in the practicality of the project.
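      The short sketch mentioned above: a minimal Monte Carlo example in Python, using an estimate of pi as a stand-in problem (an example chosen for illustration, not one from the interview). Each chunk uses its own random seed and needs no communication with the others until the final averaging, which is why this kind of work suits a grid.

          # Loosely coupled Monte Carlo: each chunk is independent, so the
          # chunks could run on different grid machines; only the final
          # averaging needs any coordination.
          import random

          def chunk_estimate(samples, seed):
              """Estimate pi from one independent chunk of random draws."""
              rng = random.Random(seed)
              hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
                         for _ in range(samples))
              return 4.0 * hits / samples

          # On a grid, each call below would be a separate, independent job.
          estimates = [chunk_estimate(100_000, seed) for seed in range(10)]
          print("pi is roughly", sum(estimates) / len(estimates))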

  2. […] this interview from the Education Technology & Change blog, Henry Neeman from the University of Oklahoma […]

  3. Henry has helped us in K-12 by coming to conferences and sharing his knowledge. I would say he is out of the silo.

  4. Companies were shifting from large minicomputers to small, powerful microcomputers, and many people realized that this would lead to large-scale waste of computing power as computing resources became more and more fragmented. However, the same organizations also face huge computation-intensive problems and need great computing power to remain competitive; hence the steady demand for supercomputing solutions, which are largely built on cluster computing concepts. Many vendors offer commercial cluster computing solutions.

  5. […] Henry Neeman  and Scott Lathrop who chairs the Supercomputing Conference reach out to help create a […]

  6. […] year we have found a few others to bring along on our journey. No one is in charge of the team.  Henry Neeman and Scott Lathrop are our main contacts within the SC […]

Leave a comment