
Special Issue on Data-Adaptive Statistical Inference

  • Antoine Chambaz, Alan Hubbard and Mark J. van der Laan
Published/Copyright: May 26, 2016

The concomitant emergence of big data, the explosion of ubiquitous computational resources and the democratization of access to more powerful computing make it both necessary and possible to rethink the practice of statistics pragmatically. While numerous machine learning methods provide ever easier access to data-mining tools and sophisticated prediction, there is a growing realization that ad hoc, non-prespecified approaches to high-dimensional problems lend themselves to a proliferation of “findings” of dubious reproducibility. This period of fast-paced evolution is thus a blessing for statistics. It is a golden opportunity to build upon more than a century of methodological research in statistics and five decades of methodological research in machine learning to bend the course of statistics in a new direction, away from the misuse of parametric models and the reporting of non-robust inference, and toward tackling rigorously the challenges that we, as a community, are confronted with.

The foundation of statistics consists in incorporating knowledge about the data-generating experiment through the definition of a statistical model (a set of laws); formalizing the question of interest through the definition of an estimand, seen as the value of a statistical parameter (a functional mapping the model to a parameter set) at the true law of the experiment; and inferring the estimand based on data yielded by the experiment. Typically, one would construct an estimator of (a collection of key features of) the true law and evaluate the statistical parameter at its value. The present special issue broadly focuses on the inference of various statistical parameters in situations where the data-generating law, the statistical parameter, or both are data-adaptively defined and/or estimated. Statistical theory has advanced in sync with scientific computing, so that the practical implementation of the resulting computationally challenging estimators is now possible.
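To fix ideas, the setup described above can be formalized as follows; the notation is a generic sketch and is not drawn from any particular article in the issue. Data $O_1, \dots, O_n$ are drawn from a true law $P_0$ known to belong to the statistical model $\mathcal{M}$; the question of interest is encoded by a statistical parameter $\Psi : \mathcal{M} \to \mathbb{R}^d$; and the estimand is the value
$$\psi_0 = \Psi(P_0)$$
of that parameter at the true law. A typical estimator first constructs $\hat{P}_n$, an estimate of (the relevant features of) $P_0$, and then evaluates the parameter at it, yielding $\hat{\psi}_n = \Psi(\hat{P}_n)$.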

We asked researchers currently engaged in cutting-edge research on data-adaptive inferential methods to share their views with us. The result is a compelling collection of advances in statistical theory and practice.

The special issue consists of 19 articles. Its theoretical spectrum is wide: semiparametric models and inference, empirical process theory and machine learning are the three major subfields explored in the articles. Across this special issue, the word inference is understood broadly. It covers the estimation of finite-dimensional parameters and the construction of confidence regions for them; the estimation of infinite-dimensional features (either as an endgame or as a means to an end); the testing of hypotheses (for the sake of making discoveries); the identification of particular subgroups in a population; the selection of (groups or clusters of) significant variables; and the comparison of data-adaptive predictors. Cross-validating, decomposing a task into a series of sub-tasks (by partitioning or by relying on a recurrence), fluctuating and weighting are the recurring technical concepts. Most articles are motivated by applications arising from medicine (analyzing neuroimages, comparing treatments, inferring optimal individualized treatment rules). The others address challenging theoretical questions. They shed light on delicate theoretical problems, offer guidance for better practice, and open exciting new territories to explore.

We hope you will enjoy perusing the special issue and that it will serve as a useful pivot towards methods that can address new challenges in data science.

We warmly thank the De Gruyter team for its unconditional scientific and technical support, in particular Theresa Haney, Spencer McGrath and John Wolfe.

Published Online: 2016-5-26
Published in Print: 2016-5-1

©2016 by De Gruyter

Articles in this issue

  1. Statistical Inference for Data Adaptive Target Parameters
  2. Evaluations of the Optimal Discovery Procedure for Multiple Testing
  3. Addressing Confounding in Predictive Models with an Application to Neuroimaging
  4. Model-Based Recursive Partitioning for Subgroup Analyses
  5. The Orthogonally Partitioned EM Algorithm: Extending the EM Algorithm for Algorithmic Stability and Bias Correction Due to Imperfect Data
  6. A Sequential Rejection Testing Method for High-Dimensional Regression with Correlated Variables
  7. Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference
  8. Testing the Relative Performance of Data Adaptive Prediction Algorithms: A Generalized Test of Conditional Risk Differences
  9. A Case Study of the Impact of Data-Adaptive Versus Model-Based Estimation of the Propensity Scores on Causal Inferences from Three Inverse Probability Weighting Estimators
  10. Influence Re-weighted G-Estimation
  11. Optimal Spatial Prediction Using Ensemble Machine Learning
  12. AUC-Maximizing Ensembles through Metalearning
  13. Selection Bias When Using Instrumental Variable Methods to Compare Two Treatments But More Than Two Treatments Are Available
  14. Doubly Robust and Efficient Estimation of Marginal Structural Models for the Hazard Function
  15. Data-Adaptive Bias-Reduced Doubly Robust Estimation
  16. Optimal Individualized Treatments in Resource-Limited Settings
  17. Super-Learning of an Optimal Dynamic Treatment Rule
  18. Second-Order Inference for the Mean of a Variable Missing at Random
  19. One-Step Targeted Minimum Loss-based Estimation Based on Universal Least Favorable One-Dimensional Submodels