Natural Organic Matter Workflow
Workflow Overview
Direct Infusion Fourier Transform mass spectrometry (DI FT-MS) data undergoes signal processing and molecular formula assignment leveraging EMSL’s CoreMS framework. Raw time domain data is transformed into the m/z domain using Fourier Transform and Ledford equation. Data is denoised followed by peak picking, recalibration using an external reference list of known compounds, and searched against a dynamically generated molecular formula library with a defined molecular search space. The confidence scores for all the molecular formula candidates are calculated based on the mass accuracy and fine isotopic structure, and the best candidate assigned as the highest score.
Workflow Availability
The workflow is available in GitHub: https://github.com/microbiomedata/enviroMS
The container is available at Docker Hub (microbiomedata/metaMS): https://hub.docker.com/r/microbiomedata/enviroms
The python package is available on PyPi: https://pypi.org/project/enviroMS/
Requirements for Execution
Docker Container Runtime or
Python Environment >= 3.10 and
Python Dependencies are listed on requirements.txt
Execution Details
Please refer to:
https://github.com/microbiomedata/enviroMS#enviroms-installation
Hardware Requirements
To run this application, you need a processor with at least 2.0 GHz speed, 8GB of RAM, 10GB of free hard disk space
Workflow Dependencies
Software
CoreMS (2-clause BSD)
Click (BSD 3-Clause “New” or “Revised” License)
Database
CoreMS dynamic molecular database search and generator
The database is generated at runtime during workflow execution based on selected parameters
Test datasets
Inputs
Supported format for Direct Infusion FT-MS data:
Thermo raw file (.raw)
Bruker raw file (.d)
Generic mass list in profile and/or centroid mode (inclusive of all delimiters types and Excel formats)
Calibration File:
Molecular Formula Reference (.ref)
SRFA.ref should be used for SRFA data acquisition only
Hawkes.ref contains a list of 2000 common NOM molecular formulas and should be the default calibration list for NOM samples acquired in negative mode
Parameters:
CoreMS Parameter File (.json)
EnviroMS Parameter File (.json)
Outputs
Molecular Formula Data-Table, containing m/z measuments, Peak height, Peak Area, Molecular Formula Identification, Ion Type, Confidence Score, etc.
CSV, TAB-SEPARATED TXT
HDF: CoreMS HDF5 format
XLSX : Microsoft Excel
Workflow Metadata:
JSON
Version History
4.1.5
Point of contact
Package maintainer: Yuri E. Corilo <corilo@pnnl.gov>