Developing a Discrimination Rule between Breast Cancer Patients and Controls Using Proteomics Mass Spectrometric Data: A Three-Step Approach

  • A. Geert Heidema and Nico Nagelkerke
Published/Copyright: February 8, 2008

To discriminate between breast cancer patients and controls, we used a three-step approach to obtain our decision rule. First, we ranked the mass/charge values using random forests, because this method generates importance indices that take possible interactions into account. We observed that the top-ranked variables consisted of highly correlated contiguous mass/charge values, which we grouped into new variables in the second step. Finally, these newly created variables were used as predictors to find a suitable discrimination rule. In this last step, we compared three methods: classification and regression trees (CART), logistic regression, and penalized logistic regression. Logistic regression and penalized logistic regression performed equally well, and both had a higher classification accuracy than CART. We chose the model obtained with penalized logistic regression because we hypothesized that it would provide a better classification accuracy in the validation set. The resulting rule performed well on the training set, with a classification accuracy of 86.3%, a sensitivity of 86.8%, and a specificity of 85.7%.
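
The second step and the reported performance measures can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the mean intensity as the summary of each group, and all example numbers are assumptions made for illustration, and the sketch groups variables by adjacency only, without checking their correlation.

```python
from statistics import mean


def group_contiguous(indices, max_gap=1):
    """Group integer positions into runs of contiguous values.

    `indices` are positions of top-ranked mass/charge values on the
    measurement grid (hypothetical input; the ranking itself is not shown).
    """
    groups = []
    for i in sorted(set(indices)):
        if groups and i - groups[-1][-1] <= max_gap:
            groups[-1].append(i)   # extends the current run
        else:
            groups.append([i])     # starts a new run
    return groups


def summarize_groups(spectrum, groups):
    """Build one new variable per group: the mean intensity over the group.

    Averaging is an assumption; the abstract does not say how the grouped
    values were combined.
    """
    return [mean(spectrum[i] for i in g) for g in groups]


def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of patients correctly classified
    specificity = tn / (tn + fp)   # fraction of controls correctly classified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical example: six top-ranked positions on a 50-point spectrum.
top_ranked = [12, 13, 14, 40, 41, 7]
groups = group_contiguous(top_ranked)

spectrum = [float(i) for i in range(50)]   # made-up intensities
features = summarize_groups(spectrum, groups)
```

The new variables in `features` would then serve as predictors for the classifiers compared in the third step. Raising `max_gap` merges runs that are separated by a few unselected positions, which may or may not be appropriate for a given spectrum.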

©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston

Downloaded on 23.9.2025 from https://www.degruyterbrill.com/document/doi/10.2202/1544-6115.1341/html