U.S. Department of Energy
Data Intensive Computing

Developing a Pipeline for SMART Mass Spectrometry

Challenge:

Many high-throughput biological experiments are conducted routinely to understand biological processes at a systems level. The instruments treat every experiment as an independent entity, which produces a data explosion containing large amounts of unwanted redundancy.

Approach:

We demonstrate an online (real-time) pipeline that combines data-intensive analytical software modules, visualization software, and very large databases within a single integration framework (MeDICi). Incoming spectra from a simulated mass spectrometer are analyzed in real time, and the course of processing for each sample is determined by comparing the spectra to an existing, continuously updated database of observed mass and time values. The same spectra are visualized within a central software component, along with additional results of the analytical processing. Feedback based on those results is sent back to the instrument, which uses it to decide whether the samples have already been fragmented.
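The comparison step can be illustrated with a minimal Python sketch. It is not the MeDICi pipeline itself: the `Feature` and `ObservedFeatureStore` names, the in-memory list standing in for the database, and the tolerance values (10 ppm in mass, 0.02 in normalized elution time) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One detected feature; both fields are illustrative assumptions."""
    mass: float  # observed mass in daltons
    time: float  # normalized elution time


class ObservedFeatureStore:
    """In-memory stand-in for the database of observed mass and time values."""

    def __init__(self, mass_tol_ppm=10.0, time_tol=0.02):
        # Hypothetical tolerances; real values depend on the instrument.
        self.mass_tol_ppm = mass_tol_ppm
        self.time_tol = time_tol
        self._features = []

    def contains(self, feature):
        """Return True if a stored feature matches within both tolerances."""
        for seen in self._features:
            mass_err_ppm = abs(feature.mass - seen.mass) / seen.mass * 1e6
            time_err = abs(feature.time - seen.time)
            if mass_err_ppm <= self.mass_tol_ppm and time_err <= self.time_tol:
                return True
        return False

    def add(self, feature):
        self._features.append(feature)


def decide(store, feature):
    """Feedback for the instrument: skip known features, fragment new ones."""
    if store.contains(feature):
        return "skip"
    # New feature: record it so later occurrences are recognized.
    store.add(feature)
    return "fragment"
```

In this sketch the store is updated as soon as a new feature is seen, so a repeat of the same feature in a later spectrum yields `"skip"`. A linear scan is used only for brevity; a very large database would need an indexed lookup.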

Impact:

This capability provides a spectrum of benefits:

  • Processing of already analyzed features is avoided, which allows more efficient instrument usage and reduces the generation of redundant data. This improves data richness and delivers results to the end user faster.
  • Without the smart instrument control method, experimental results of interest are usually the hardest to acquire. The described method will lead to more intelligent data gathering, which will improve analysis quality, reduce costs, and increase knowledge of the biological systems being studied.

In a broader sense, this method could be applied to many fields where existing hierarchical analytical schemes move from nearly random detection events to increasingly complex and specific detections (e.g., radiation, explosives, bio-threat detection, and medical diagnostics).

Figure: Architecture for Pipeline
