Data Intensive Computing
Evaluating Multithreaded Architectures for Irregular Data Intensive Applications
Principal Investigator: Andrès Márquez
Challenge
Today's standard hardware architectures have well-defined memory hierarchies, but domains that produce large-graph solutions, such as biology and predictive analytics, rely on data intensive applications with highly irregular memory access patterns.
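To make "irregular access patterns" concrete, the sketch below records the order in which a breadth-first traversal touches per-vertex state and compares it with a dense array sweep of the same length. The toy graph, the function names, and the use of vertex ids as stand-ins for memory locations are all assumptions made for illustration; none of them come from the project itself.

```python
from collections import deque


def bfs_access_trace(adjacency, source):
    """Return the vertex ids whose state BFS reads, in order (a proxy for memory accesses)."""
    visited = {source}
    queue = deque([source])
    trace = []
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            trace.append(w)  # checking visited[w] touches a data-dependent location
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return trace


def mean_stride(trace):
    """Average absolute jump between consecutively accessed indices."""
    jumps = [abs(b - a) for a, b in zip(trace, trace[1:])]
    return sum(jumps) / len(jumps)


# Hypothetical toy graph; vertex ids stand in for memory locations.
graph = {
    0: [7, 3],
    3: [9, 1],
    7: [2, 8],
    9: [4],
    1: [6],
    2: [5],
    8: [], 4: [], 6: [], 5: [],
}

irregular = bfs_access_trace(graph, 0)
regular = list(range(len(irregular)))  # dense sweep of the same length

print("graph traversal stride:", mean_stride(irregular))
print("dense sweep stride:    ", mean_stride(regular))
```

On a graph this small the effect is only suggestive. On graphs with millions of vertices, data-dependent jumps of this kind tend to defeat caches and prefetchers that assume locality.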
Cray XMT
- Scalable
- ~2,000 processors
- Can be programmed at a higher level, within reach of domain scientists
Mercury Cell
- Multicore (8 SPEs plus a conventional processor)
- Each core runs 1≤ threads
- Brute force parallelism
- Impressive speed
Nvidia QuadroPlex IV
- GPUs
- New programming model for non-graphics applications
- Similar to Cell: well-established memory hierarchy, but hides latency
Approach
- Multithreaded architectures improve latency tolerance; to benefit, applications must be irregular and operate on a large data set.
- Increase our expertise with these architectures and encourage their adoption
- Demonstrate that applications in the HPC community can exploit unconventional architectures in a data intensive framework.
- Statistical Kernels: Dynamic PDTree on XMT
- Biological Networks: SAT Solvers on MTA or XMT
- Video Analysis: Social Analytics on Cluster/Cell/GPU/XMT Chain
- Automatic Classification: Latent Dirichlet Allocation on GPU
- Subsurface Simulation: Smoothed Particle Hydrodynamics on GPU
- Cyber Security: Triads on XMT
- Cyber Security: Support Vector Machines on XMT
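The Approach list opens with the role multithreading plays in dealing with memory latency. One common back-of-the-envelope way to reason about that is a utilization model in which each thread alternates a short burst of computation with a long memory wait, and the processor switches to another ready thread while it waits. The sketch below is that textbook model only, not a measurement or a description of any specific machine named here; the cycle counts and function names are hypothetical.

```python
def utilization(threads, compute_cycles, memory_latency):
    """Fraction of cycles spent on useful work under a simple interleaving model.

    Each thread computes for compute_cycles, then waits memory_latency cycles
    for a memory request. While one thread waits, other ready threads run.
    """
    if threads < 1 or compute_cycles <= 0 or memory_latency < 0:
        raise ValueError("threads >= 1, compute_cycles > 0, memory_latency >= 0 required")
    return min(1.0, threads * compute_cycles / (compute_cycles + memory_latency))


def threads_to_saturate(compute_cycles, memory_latency):
    """Smallest integer thread count that keeps the processor fully busy in this model."""
    total = compute_cycles + memory_latency
    return -(-total // compute_cycles)  # ceiling division on integers


# Hypothetical numbers: 10 cycles of work per 90-cycle memory access.
for n in (1, 2, 5, 10, 20):
    print(n, "threads ->", utilization(n, 10, 90))
print("threads needed:", threads_to_saturate(10, 90))
```

In this model a single thread leaves the processor idle most of the time. That is why the approach targets applications with enough irregular, independent work on a large data set to keep many threads in flight.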
Impact
Hybrid architectures can make viable a set of applications that is not feasible today, which in turn makes the technology more attractive.
Collaborations
- Sandia
- Georgia Tech
- Data Intensive Computing projects (import apps to Nvidia), PNNL subsurface group
Accomplishments
- PNNL first to adopt Cray XMT
- Cray transitioned to external funding
- IPDPS multithreaded workshop
