Chemometrics for comprehensive GC-MS data evaluation
Deconvolution and data handling
V2.1 (free download) use with NIST98 [LINK]
- Mass Frontier from HighChem and Thermo/Finnigan (demo on request) [LINK]
- Massfinder [LINK]
- IST for GC-MS (demo available) [LINK]
- MassLIB (demo available) [LINK]
- ACD/MS Manager (demo available) [LINK]
- RIZA GC/MS Database (free download) [LINK]
The free RIZA GC/MS Database allows a coupled search of mass spectra and retention indices from AMDIS-processed GC-MS files. Previously processed data thus stay available in the current workflow.
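The idea of a coupled search can be sketched in a few lines: a candidate only counts as a hit if its retention index lies within a tolerance window and its spectrum is similar enough. This is a minimal illustration, not the RIZA or AMDIS implementation. The cosine score, the function names, the default `ri_window` and `min_score` values, and the toy library entries are all assumptions made for this example.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def coupled_search(query_spec, query_ri, library, ri_window=20.0, min_score=0.7):
    """Return (name, score) hits that pass both the RI and the spectral filter."""
    hits = []
    for entry in library:
        # Retention index filter: discard candidates outside the window.
        if abs(entry["ri"] - query_ri) > ri_window:
            continue
        # Spectral filter: keep only sufficiently similar spectra.
        score = cosine_similarity(query_spec, entry["spectrum"])
        if score >= min_score:
            hits.append((entry["name"], score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy library (made-up names, RI values and spectra).
LIBRARY = [
    {"name": "compound A", "ri": 1200.0, "spectrum": {57: 100, 71: 60, 85: 40}},
    {"name": "compound B", "ri": 1500.0, "spectrum": {57: 100, 71: 60, 85: 40}},
    {"name": "compound C", "ri": 1210.0, "spectrum": {91: 100, 65: 20}},
]

hits = coupled_search({57: 100, 71: 55, 85: 45}, 1205.0, LIBRARY)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

In the toy data, compound B has the same spectrum as compound A but is rejected by its retention index, which is the practical benefit of coupling both criteria.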
Mass Spectral Databases
There are two large mass spectral databases, which were assembled from different sources. The Wiley 5th edition and NIST98 had an overlap of 36,847 spectra. For fast and reliable identification of unknown spectra you need both libraries.
- Wiley 7th edition (338,000 spectra) (demo available) [LINK]
- NIST98 (129,136 spectra) (demo available) [LINK]
- NIST02 (175,214 spectra) [LINK]
- Palisade 600k Edition (606,000 spectra) [LINK]
- NIST05 MS Database (190,000 curated spectra) [LINK]
GC Retention Indices
Almost 50 years after the introduction of the retention index concept, there is now a retention index database assembled by NIST. More than 25,000 RI values are included in NIST05. Other resources are:
- NIST05 RI Database [LINK]
- Pro ezGC (demo available) [LINK] or the
- ACD/GC Simulator (demo available) [LINK]
For common congener analytes (PCBs, PCNs, dioxins), published RI tables exist. You may also use small libraries like the
- Agilent PCB Congener GC/MS RTL Database [LINK] or the
- Agilent/NMSLAB Forensic Toxicology GC/MS RTL Database [LINK]
- Tobacco/Smoke Mass Spectral Library and Retention Time Database [LINK] [PDF]
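To compare your own data with the retention index resources above, retention times have to be converted to indices first. The following is a minimal sketch of the linear retention index for temperature-programmed runs (van den Dool and Kratz), which interpolates between the two bracketing n-alkanes. The function name and the alkane retention times are made up for illustration. For isothermal runs the Kovats definition uses logarithms of adjusted retention times instead.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index from n-alkane retention times.

    rt         -- retention time of the analyte
    alkane_rts -- {carbon number: retention time} for the n-alkane ladder,
                  with retention times increasing with carbon number
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if len(times) < 2:
        raise ValueError("need at least two n-alkanes")
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time is outside the n-alkane ladder")
    # Index of the last alkane eluting at or before the analyte.
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    # Interpolate linearly between the bracketing alkanes.
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Hypothetical alkane ladder: C10, C11 and C12 with retention times in minutes.
ALKANES = {10: 8.0, 11: 10.0, 12: 12.5}

print(linear_retention_index(11.0, ALKANES))
```

An analyte eluting exactly with an n-alkane gets 100 times the carbon number of that alkane, which is a quick sanity check for any implementation.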
Programs - Chemicals and Properties
- EPA EPI Suite (free download) [LINK] [LINK2]
The free EPI Suite contains some 103,000 structures with CAS numbers (SMILECAS database) and around 25,000 compounds with some experimental data such as logP, boiling point and Henry's law constant (PHYSPROP database). You can also calculate many physico-chemical properties (boiling point, logP, simple toxicity).
- Molgen-MS (demo available) [LINK]
Evaluation of low-resolution electron impact mass spectra without database search. Modules:
MSclass (mass spectra classification)
ElCoCo (elemental composition computation)
MOLGEN (molecular structure generator)
ReNeGe (reaction network generator)
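To give a feeling for what an elemental composition computation involves, here is a brute-force sketch that lists C/H/N/O formulas for a nominal molecular ion mass. It is not the ElCoCo algorithm, only an assumed illustration. The function name, the restriction to C, H, N and O, and the `max_n` and `max_o` limits are choices made for this example. A real tool would also use isotope patterns and further elements.

```python
def chno_compositions(nominal_mass, max_n=4, max_o=6):
    """Enumerate (C, H, N, O, DBE) tuples matching a nominal molecular mass.

    Uses nominal masses C=12, H=1, N=14, O=16 and keeps only formulas that
    are chemically plausible for a neutral molecule.
    """
    results = []
    for c in range(1, nominal_mass // 12 + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                h = nominal_mass - 12 * c - 14 * n - 16 * o
                if h < 0:
                    break  # more oxygen only makes the remainder smaller
                # A saturated acyclic molecule has at most 2C + N + 2 hydrogens.
                if h > 2 * c + n + 2:
                    continue
                # Double bond equivalents: DBE = C - H/2 + N/2 + 1.
                twice_dbe = 2 * c - h + n + 2
                if twice_dbe % 2 != 0:
                    continue  # non-integer DBE, not a neutral molecule
                results.append((c, h, n, o, twice_dbe / 2))
    return results


for c, h, n, o, dbe in chno_compositions(78):
    print(f"C{c}H{h}N{n}O{o}  DBE={dbe}")
```

Even for a small mass such as 78 the list contains several candidates besides benzene (C6H6), which shows why classification of the spectrum and structure generation are needed as additional steps.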
Because we work in a living world, metabolites and breakdown products of chemicals are always present. For a thorough identification you need metabolite databases like:
- Metabolite database from MDL (8,590 parent compounds) [LINK]
- Metabolism Database from Accelrys [LINK]
- Environmental Fate Database (EPA and Syracuse Research) (free) [LINK]
- University of Minnesota Biocatalysis/Biodegradation Database (free) [LINK]
If you want to identify metabolites of new or unknown chemicals you can use expert algorithms like:
- MetabolExpert from CompuDrug (demo) [LINK]
- Meteor from Uni Leeds [LINK]
- CATABOL from University Burgas [LINK]
- statistic program [LINK]
Multivariate analysis, t-tests, ANOVAs, regression, PCA, cluster analysis, factor analysis, discriminant analysis, canonical correlation, survival analysis
- Datalab (Lohninger) [LINK]
Univariate linear and parabolic regression, rank correlation, multiple linear regression, principal component analysis, neural networks, KNN-modelling, hierarchical clustering (dendrograms), etc.
- ADE-4 (Uni Lyon) [LINK]
PCA (Principal Components Analysis and extensions), COA (COrrespondence Analysis and extensions), HTA (Homogeneous Tables Analysis), MCA (Multiple Correspondence Analysis and extensions), DDUtil (Complements to basic analyses) and MatAlg (Matrix computations).
UniVarReg, OrthoVar, LinearReg (Linear regressions), Discrimin (Discriminant analyses, between and within class analyses), Projectors (Principal Component Analysis with respect to Instrumental Variables) and CCA (Canonical Correspondence Analysis).
DMAUse (multiple distance matrices analysis and biodiversity measures), Clusters (cluster analysis) and Dendrograms (cluster analysis graphics).
Remember that the CAS service today knows around 30 million organic compounds, while we know only around 0.5 million mass spectra. This is a very large gap. Therefore a smart chromatographer needs to know which compounds are possible at all.
Q: Why not use CAS? A: Database politics, e.g. in Germany, is a very delicate subject. Most German universities have no money for a full CAS service (without major restrictions). The annual wrangling over a new subscription fails in most cases because there is no money. Database politics in science is a difficult matter in general. Apart from some exceptions, it is not reasonable to pay a disproportionate amount of money for one's own re-engineered results. Policy-makers do not pay attention to this important topic.
Wouldn't it be better if all OECD countries shared the annual costs of CAS (~$30 million) and gained free access?
- PubChem Database [LINK]
The upcoming shooting star.
- ChemIDPlus - [LINK]
Toxline also contains literature on analytical chemistry (coverage 1945 to today). You need the MDLI Chime plug-in for copy/paste from ISIS Draw and other applications [LINK]
- Enhanced NCI (CACTVS) database browser - [LINK]
Very flexible open database concept with multiple input and output options.
- SMILECAS database - [LINK]
Contains about 103,000 structures with names and CAS numbers and serves as the backbone of the EPI Suite.
- Available Chemicals Directory (ACD) - [LINK]
Contains around 400,000 unique compounds from 700 different suppliers. Trial within chemweb.com.