Data-Intensive Computing Initiative (DICI)
Multithreaded Architectures for Data-Intensive Applications
Technical Contacts: J. Nieplocha, A. Marquez, D. Chavarria, C. Scherrer, G. Chin, N. Beagley, and V. Tipparaju
Executive Summary
Multithreaded (MT) programming and execution models are starting to permeate both high-end and mainstream computing. This trend is driven by the need to increase processor utilization and to cope with the memory-processor speed gap. Recent and upcoming architectures that fit this profile include Cray's Eldorado, IBM's Cyclops, and several simultaneous multithreading (SMT) processors from Sun (Niagara/UltraSPARC T1), IBM (Power5+, Power6), and Intel (Xeon).
The push for higher processor utilization reflects a shifting mix of metrics that weigh performance improvements alongside power and cost budgets. PNNL has been investigating multithreaded architectures for solving large data-intensive problems.
Accomplishments / Highlights
Cray Eldorado
- Eldorado is a third-generation Cray/Tera multithreaded system.
- First massively parallel multithreaded architecture.
- PNNL is an early adopter of one of the first systems.
- Relies on technology of existing systems:
  - Cray Red Storm (XT3) infrastructure
  - Cray/Tera MTA-2
- Flat shared memory programming model.
- Based on MTA-2 language extensions and compilers.
Recent Publications
Nieplocha J, A Marquez, J Feo, D Chavarría-Miranda, G Chin, Jr, C Scherrer, and N Beagley. 2007. "Evaluating the Potential of Multithreaded Platforms for Irregular Scientific Applications." In ACM Computing Frontiers.
Scherrer C, N Beagley, J Nieplocha, A Marquez, J Feo, and D Chavarría-Miranda. 2007. "Probability Convergence in a Multithreaded Counting Application." In Workshop on Multithreaded Architectures and Applications.
Graph Signature Computation: Subquadratic Triad Census
- Highly irregular code and data structure.
- The algorithm traverses large sparse networks with small maximum degree.
- Processor utilization of the main routine measured on up to 6 MTA-2 processors, using a combination of implicit and explicit parallelism.
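As a rough illustration of the kind of computation a triad census involves, the sketch below counts triads in an undirected graph by number of edges (0, 1, 2, or 3). It is a simplified stand-in, not the PNNL or MTA-2 code: the full census distinguishes 16 directed triad types, and the function name `triad_census_undirected` and its input format are assumptions made for this example. The counting scheme visits each connected pair once and inspects only the neighbors of that pair, which is the idea behind the pair-based subquadratic approach described by Batagelj and Mrvar; whether the code summarized above is organized the same way is an assumption.

```python
def triad_census_undirected(n, edges):
    """Count triads of an undirected graph by edge count.

    Hypothetical helper for illustration: nodes are 0..n-1 and edges is an
    iterable of (u, v) pairs. Returns [c0, c1, c2, c3], where ck is the
    number of node triples that contain exactly k edges.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    counts = [0, 0, 0, 0]
    for v in range(n):
        for u in adj[v]:
            if u <= v:
                continue  # handle each connected pair once, with v < u
            # Nodes adjacent to at least one endpoint of the pair.
            s = (adj[v] | adj[u]) - {u, v}
            # Every other node forms a triad whose only edge is (v, u).
            counts[1] += n - len(s) - 2
            for w in s:
                # Canonical-pair test so each connected triad is counted once.
                if u < w or (v < w < u and v not in adj[w]):
                    k = 1 + (w in adj[v]) + (w in adj[u])
                    counts[k] += 1

    # Triads with no edges are obtained by subtraction from the total.
    total = n * (n - 1) * (n - 2) // 6
    counts[0] = total - counts[1] - counts[2] - counts[3]
    return counts
```

For a triangle on nodes 0, 1, 2 plus one isolated node, `triad_census_undirected(4, [(0, 1), (1, 2), (0, 2)])` returns `[0, 3, 0, 1]`: one closed triad and three triads with a single edge. The outer loops over connected pairs are independent of one another, which is the irregular, fine-grained parallelism that multithreaded hardware is meant to exploit.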
Anomaly Detection and Network Traffic Analysis
- Original sequential algorithm developed at PNNL to support categorical data analysis.
- Latest MTA-2 version developed in collaboration with John Feo at Cray.
- Linear speedups up to 32 processors on MTA-2 for trees with more than 100M nodes (network traffic data).
Power System State Estimation
- Code developed at PNNL to support the Northwest Center for Electric Power Technologies program.
- Makes intensive use of sparse matrices.
- In-house PNNL conjugate gradient (CG) solver outperforms the SuperLU solver on an SGI Altix.
- Speedups up to 16 processors on MTA-2 exceed published results.
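To make the sparse-matrix workload concrete, here is a minimal sketch of a conjugate gradient iteration over a matrix stored in compressed sparse row (CSR) form. It is not the in-house PNNL solver: the function names, the CSR argument layout (`indptr`, `indices`, `data`), and the tolerance defaults are assumptions for illustration, and the sketch assumes a symmetric positive definite system with no preconditioning.

```python
def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector x."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        # Only the stored nonzeros of row i are visited.
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc
    return y


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A.

    Illustrative, unpreconditioned CG; stops when the residual 2-norm
    drops below tol or after max_iter iterations.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # initial search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = csr_matvec(indptr, indices, data, p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

The sparse matrix-vector product dominates each iteration, and its indirect accesses through `indices` are the irregular memory references whose latency multithreaded processors aim to hide.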
Collaboration
- Near Real-Time Situation Awareness from Massive Sensor Data.
- PNNL's Electricity Infrastructure Operations Center using real data to provide real solutions to real problems.
- Co-hosting, with Cray, the Workshop on Multithreaded Architectures and Applications at the International Parallel and Distributed Processing Symposium.
- Multithreaded Architecture Consortium leadership.
- Graph analysis for the Data-Intensive Computing for Complex Biological Systems program.
Demonstration
This work will contribute to the Decision Support and Control and Scientific Insight and Discovery demonstrations.
Impacts
Multithreaded architectures are expected to play an increasing role in the future of network analysis due to their ability to hide memory access latency. Engagement is underway with multiple industrial partners to create a consortium around the development of vendor-agnostic libraries.
