Natural Organic Matter Workflow
Workflow Overview
Direct Infusion Fourier Transform mass spectrometry (DI FT-MS) data undergo signal processing and molecular formula assignment using EMSL’s CoreMS framework. Raw time-domain data are transformed into the m/z domain using a Fourier transform followed by the Ledford equation. The spectra are then denoised, peaks are picked, the m/z axis is recalibrated against an external reference list of known compounds, and the peaks are searched against a dynamically generated molecular formula library with a defined molecular search space. Confidence scores for all molecular formula candidates are calculated from mass accuracy and fine isotopic structure, and the candidate with the highest score is assigned to each peak.
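The frequency-to-m/z conversion and the mass accuracy term mentioned above can be sketched as follows. This is a minimal illustration, not the CoreMS implementation: the function names `ledford_mz` and `ppm_error` and the calibration constants are assumptions chosen for the example. The Ledford equation is taken in its usual two-term form, m/z = A/f + B/f².

```python
def ledford_mz(freq_hz: float, a_term: float, b_term: float) -> float:
    """Convert a cyclotron frequency (Hz) to m/z using the two-term
    Ledford equation: m/z = A/f + B/f**2."""
    return a_term / freq_hz + b_term / freq_hz ** 2


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Mass error of a measured m/z relative to a theoretical m/z, in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical calibration constants, for illustration only; real values
# come from the instrument calibration stored with the raw data.
A_TERM = 1.0e8
B_TERM = -1.0e9

mz = ledford_mz(1.0e5, A_TERM, B_TERM)
error = ppm_error(500.0005, 500.0)
print(f"m/z = {mz:.4f}, mass error = {error:.2f} ppm")
```

A smaller absolute ppm error means better agreement between the measured peak and a candidate formula, which is one of the two inputs to the confidence score described above.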
Workflow Availability
The workflow is available on GitHub: https://github.com/microbiomedata/enviroMS
The container is available on Docker Hub (microbiomedata/enviroms): https://hub.docker.com/r/microbiomedata/enviroms
The Python package is available on PyPI: https://pypi.org/project/enviroMS/
Requirements for Execution
- Docker container runtime, or
- Python environment (>= 3.10), and
- Python dependencies listed in requirements.txt
Execution Details
Please refer to:
https://github.com/microbiomedata/enviroMS#enviroms-installation
Hardware Requirements
- To run this application, you need a processor running at 2.0 GHz or faster, 8 GB of RAM, and 10 GB of free hard disk space
Workflow Dependencies
Software
- CoreMS (2-clause BSD) 
- Click (BSD 3-Clause “New” or “Revised” License) 
Database
- CoreMS dynamic molecular database search and generator 
- The database is generated at runtime during workflow execution based on selected parameters 
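To illustrate what generating a formula library from a defined search space at runtime can look like, here is a minimal sketch that enumerates C/H/O formulas and keeps those whose deprotonated ion matches a target m/z within a ppm tolerance. It is not the CoreMS generator: the function name `formula_candidates`, the element ranges, and the simple H <= 2C + 2 filter are assumptions made for this example.

```python
from itertools import product

# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MASS = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON = 0.00054857990907


def formula_candidates(target_mz, ppm_tol=1.0, max_c=20, max_h=40, max_o=15):
    """Return (formula, theoretical m/z, ppm error) tuples for [M-H]- ions
    within ppm_tol of target_mz, sorted by absolute mass error."""
    hits = []
    for c, h, o in product(range(1, max_c + 1),
                           range(1, max_h + 1),
                           range(0, max_o + 1)):
        if h > 2 * c + 2:  # assumed filter: skip over-saturated formulas
            continue
        neutral = c * MASS["C"] + h * MASS["H"] + o * MASS["O"]
        ion_mz = neutral - MASS["H"] + ELECTRON  # loss of H+ gives [M-H]-
        err = (target_mz - ion_mz) / ion_mz * 1e6
        if abs(err) <= ppm_tol:
            name = f"C{c}H{h}" + (f"O{o}" if o else "")
            hits.append((name, ion_mz, err))
    return sorted(hits, key=lambda hit: abs(hit[2]))


for name, ion_mz, err in formula_candidates(179.05611):
    print(f"{name}: m/z {ion_mz:.5f}, error {err:+.3f} ppm")
```

Widening the element ranges or adding heteroatoms enlarges the search space and the number of candidates per peak, which is why the selected parameters determine the size of the generated database.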
Test datasets
Inputs
- Supported formats for Direct Infusion FT-MS data:
  - Thermo raw file (.raw)
  - Bruker raw file (.d)
  - Generic mass list in profile and/or centroid mode (inclusive of all delimiter types and Excel formats)
- Calibration File:
  - Molecular Formula Reference (.ref)
  - SRFA.ref should be used for SRFA data acquisition only
  - Hawkes.ref contains a list of 2000 common NOM molecular formulas and should be the default calibration list for NOM samples acquired in negative mode
- Parameters:
  - CoreMS Parameter File (.json)
  - EnviroMS Parameter File (.json)
Outputs
- Molecular Formula Data-Table, containing m/z measurements, peak height, peak area, molecular formula identification, ion type, confidence score, etc.:
  - CSV, tab-separated TXT
  - HDF: CoreMS HDF5 format
  - XLSX: Microsoft Excel
- Workflow Metadata:
  - JSON
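As a sketch of downstream use, the CSV data table can be filtered to keep only peaks with an assigned formula above a confidence threshold. The column names (`Molecular Formula`, `Confidence Score`), the helper `assigned_rows`, and the sample rows below are assumptions made for illustration; check the header of an actual output file before relying on them.

```python
import csv
import io

# Hypothetical excerpt of an output table; the values are invented.
SAMPLE = """m/z,Peak Height,Molecular Formula,Ion Type,Confidence Score
179.05611,15230.0,C6 H12 O6,de-protonated,0.93
255.23295,8020.5,C16 H32 O2,de-protonated,0.41
301.10012,450.2,,,
"""


def assigned_rows(handle, min_score=0.5):
    """Return rows that carry a formula and a score of at least min_score."""
    kept = []
    for row in csv.DictReader(handle):
        score = row.get("Confidence Score") or ""
        if row.get("Molecular Formula") and score and float(score) >= min_score:
            kept.append(row)
    return kept


rows = assigned_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row["m/z"], row["Molecular Formula"], row["Confidence Score"])
```

For a real file, the same function can be given an open file handle, for example `open(path, newline="")`, instead of the in-memory sample.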
 
Version History
- 4.1.5 
Point of contact
Package maintainer: Yuri E. Corilo <corilo@pnnl.gov>