Moving towards statistical machines in health sciences

A Berkeley Public Health Brown Bag Talk

Recent years have seen a rapid rise in the use of data-adaptive machine learning (ML) algorithms in the health sciences. Their use has seen some success, but there is also reason for caution about how they can be misused. Some of this has to do with the idiosyncratic manner in which ML is deployed, guided less by theory and more by practical metrics (e.g., the prediction performance of the algorithm on a test set). Recently developed methodology and theory suggest the potential of true statistical machines: algorithms into which the required information is entered, including the desired statistical summaries, a button is pressed, and estimation and inference are optimized automatically with little input or “experimentation” by the analyst. The methods for doing so represent a break with traditional statistical analysis and its teaching, which requires a rethink of how we apply statistical methodology in public health and of how to change the curriculum to better train students to understand these developments. The talk will present background on the development of statistical machines, possible future developments, and how such machines can help improve the reliability of public health science, always with an eye on public health goals.

The research of Alan Hubbard, UC Berkeley professor of biostatistics, focuses on the application of statistics to population studies, with emphasis on semi-parametric models in causal inference, as well as applications in high-dimensional biology. His applied work ranges across the molecular biology of aging, wildlife biology, epidemiology, and infectious disease modeling, but most of his work has focused on semi-parametric estimation and inference with high-dimensional data. He is particularly interested in harnessing machine learning algorithms and advances in semi-parametric causal inference to build machines that optimize the estimation of parameters related to causal inference and variable importance.


Moving towards statistical machines in health sciences © 2020 by UC Berkeley School of Public Health is licensed under CC BY-NC-ND 4.0 (Creative Commons). Credit must be given to the creator. Only noncommercial use is permitted. No derivatives or adaptations are permitted.