A Cross-Validation Study to Select a Classification Procedure for Clinical Diagnosis Based on Proteomic Mass Spectrometry

  • Dirk Valkenborg, Suzy Van Sanden, Dan Lin, Adetayo Kasim, Qi Zhu, Philippe Haldermans, Ivy Jansen, Ziv Shkedy and Tomasz Burzykowski
Published/Copyright: March 24, 2008

We present an approach to construct a classification rule based on the mass spectrometry data provided by the organizers of the "Classification Competition on Clinical Mass Spectrometry Proteomic Diagnosis Data." Before constructing a classification rule, we attempted to pre-process the data and to select features of the spectra that were likely due to true biological signals (i.e., peptides/proteins). As a result, we selected a set of 92 features. To construct the classification rule, we considered eight methods for selecting a subset of the features, combined with seven classification methods. The performance of the resulting 56 combinations was evaluated by using a cross-validation procedure with 1000 re-sampled data sets. The best result, as indicated by the lowest overall misclassification rate, was obtained by using the whole set of 92 features as the input for a support-vector machine (SVM) with a linear kernel. This method was therefore used to construct the classification rule. For the training data set, the total error rate for the classification rule, as estimated by using leave-one-out cross-validation, was equal to 0.16, with the sensitivity and specificity equal to 0.87 and 0.82, respectively.
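As a rough illustration of how a leave-one-out estimate of the total error rate, sensitivity, and specificity can be computed, the following sketch holds out each sample once and tallies the predictions. It is not the authors' procedure: the data are synthetic, the nearest-centroid classifier is a simple stand-in for the linear-kernel SVM, and all function names (`nearest_centroid_predict`, `leave_one_out`, `summarize`) are hypothetical.

```python
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (not the SVM from the abstract):
    assign x to the class whose mean feature vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(X, y, predict):
    """Hold out each sample once, train on the rest, and count
    (tp, fn, tn, fp), where label 1 = case and label 0 = control."""
    tp = fn = tn = fp = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        pred = predict(train_X, train_y, X[i])
        if y[i] == 1:
            if pred == 1:
                tp += 1
            else:
                fn += 1
        else:
            if pred == 0:
                tn += 1
            else:
                fp += 1
    return tp, fn, tn, fp


def summarize(tp, fn, tn, fp):
    """Total error rate, sensitivity, and specificity from the counts."""
    total = tp + fn + tn + fp
    return {
        "error_rate": (fn + fp) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Synthetic data (an assumption for illustration only):
# 20 controls and 20 cases with 5 features each.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(20)]
X += [[rng.gauss(1.5, 1.0) for _ in range(5)] for _ in range(20)]
y = [0] * 20 + [1] * 20

counts = leave_one_out(X, y, nearest_centroid_predict)
metrics = summarize(*counts)
print(metrics)
```

Any classifier with the same `predict(train_X, train_y, x)` shape could be swapped in. If features were selected from the data, that selection would normally be repeated inside each held-out fold so that the held-out sample does not influence it.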

©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston

Downloaded on 25.9.2025 from https://www.degruyterbrill.com/document/doi/10.2202/1544-6115.1363/html