Enabling HPC E-Science via Integrated Grid Infrastructure
ABSTRACT High Performance Computing (HPC) E-Science has numerous requirements well beyond those of an environment that suffices for routine computation. The data requirements in particular may be in the multi-terabyte regime, with transfer rates of several Gb/s or more. Such data capabilities may be available at only a single location. On the other hand, understanding the data via data mining and visualization may require completely different facilities, while the actual data production may be possible only at yet a third location. A Global File System with multi-Gb/s speeds and hundreds of terabytes of capacity can help scientific researchers, but the simultaneous use of several systems also requires a reasonably sophisticated co-scheduling capability. In this paper, we show how the TeraGrid is combining massive computational clusters, a Global File System, very powerful visualization systems, and a co-scheduler to enable massive E-Science in a coordinated and usable manner.