
Many issues in the health, medical and biological sciences are addressed by collecting and exploring relevant data. The development and application of techniques to better understand such data is a fundamental concern of our program.

This program offers training in the theory of statistics and biostatistics, computer implementation of analytic methods and opportunities to use this knowledge in areas of biological/medical research. The resources of Berkeley Public Health and the UC Berkeley Department of Statistics, together with those of other university departments, offer a broad set of opportunities to satisfy the needs of individual students. Furthermore, the involvement of UCSF faculty from the Department of Biostatistics and Epidemiology also enriches instructional and research activities.

Curriculum

A PhD degree in Biostatistics requires a program of courses selected from biostatistics, statistics, and at least one other subject area (such as environmental health, epidemiology, or genomics), an oral qualifying examination, and a dissertation. Courses cover traditional topics as well as recent advances in biostatistics and statistics. Those completing the PhD will have acquired a deep knowledge and understanding of the MA subject areas. Since graduates with doctorates often assume academic research and teaching careers, a high degree of mastery in research design, theory, methodology, and execution is expected, as well as the ability to communicate and present concepts in a clear, understandable manner.

The PhD degree program requires 4-6 semesters of coursework and completion of the qualifying examination and dissertation; in total, a minimum of four semesters of registration is required. Since there are no formal course requirements for the PhD, a program of courses appropriate to a student’s background and interests may be developed with a graduate adviser.

Qualifications

A Master’s degree in Biostatistics or a related field is recommended but not required for admission to the PhD program. Strongly recommended prerequisite courses are calculus, linear algebra, and statistics. Applicants admitted without a Master’s degree may be required to go through the Biostatistics MA curriculum; students can concurrently earn that degree with no additional cost or time to degree. Normative time to degree is 5 years.

Students entering with a relevant master’s degree in biostatistics or statistics must have a faculty advisor from the Biostatistics Graduate Group who commits funding and mentorship support.

GRE Exemption Criteria

GRE General Test scores are required for admission to the Biostatistics PhD program; however, applicants are exempted from the requirement if they meet all of the following criteria:

  • Completed two semesters of calculus for a letter grade and earned a grade of “B” or higher.
  • Completed one semester of linear algebra for a letter grade and earned a grade of “B” or higher.
  • Completed one semester of statistics for a letter grade and earned a grade of “B” or higher.
  • Cumulative undergraduate GPA of 3.0 or higher.
  • Overall quantitative/math GPA of 3.0 or higher.
  • For students with a Master’s in Biostatistics or a related field, graduate GPA of 3.0 or higher.
  • For international students: TOEFL score of 100 or higher OR IELTS score of 7.0 or higher.

Berkeley Public Health also exempts applicants who already hold a doctoral-level degree from the GRE requirement. You can find more information on the application instructions page. There is a program page in the Berkeley Graduate Application where you can indicate that you meet the criteria for GRE exemption. Applicants who are exempted from the GRE are not at a disadvantage in the application review process.

Employment

Many doctoral graduates accept faculty positions in schools of public health, medicine, and statistics and/or math departments at colleges and universities, both in the United States and abroad. Some graduates take research positions, including with pharmaceutical companies, hospital research units, non-profits, and within the tech sector.

Funding and fee remission

Prospective students who are US citizens or permanent residents can find more information about applying for an application fee waiver for the Berkeley Graduate Application. Fees will be waived based on financial need or participation in selected programs described on the linked website. International applicants (those who are neither US citizens nor permanent residents) are not eligible for application fee waivers.

All PhD students are fully funded (including tuition and fees and a stipend or salary) with the exception of Non-Resident Supplemental Tuition (NRST) for the second year, if applicable. NRST is typically waived after the first year of study for PhD students when they advance to candidacy. Information on applying to GSI positions for biostatistics students can be found in the student handbook.

Tuition and fees change each academic year. To view the current tuition and fees, see the fee schedule on the Office of the Registrar website (in the Graduate: Academic section).

Please contact biostat@berkeley.edu if you have any questions about funding opportunities for the biostatistics programs.

Diversity, Equity and Inclusion

The Division of Biostatistics is committed to challenging systemic inequities in the areas of health, medical, and biological sciences, and to advancing the goals of diversity, equity, and inclusivity in Biostatistics and related fields.

Diversity, Equity and Inclusion in Biostatistics

Admissions Statistics

8.6% Admissions Ratio (8/93)
3.9 Average GPA of admitted applicants
81% Average Verbal GRE percentile
92% Average Quantitative GRE percentile
25 Average age upon admission
4 Average years of relevant professional/research experience

Faculty

Clinical Faculty

Emeritus

Faculty Associated in Biostatistics Graduate Group

  • Peter Bickel PhD
    Statistics
  • David R. Brillinger PhD
    Statistics
  • Perry de Valpine PhD
    Environmental Science, Policy, and Management
  • Haiyan Huang PhD
    Statistics
  • Michael J. Klass PhD
    Statistics
  • Priya Moorjani PhD
    Molecular & Cell Biology
  • Rasmus Nielsen PhD
    Integrative Biology and Statistics
  • Elizabeth Purdom PhD
    Statistics
  • Sophia Rabe-Hesketh PhD
    Education
  • John Rice PhD
    Statistics
  • Yun S. Song PhD
    Statistics; Electrical Engineering and Computer Sciences
  • Bin Yu PhD
    Statistics

Alumni Directory

Katherine Pollard
Director, Gladstone Institute of Data Science & Biotechnology and Professor & Chief, Division of Bioinformatics, Department of Epidemiology & Biostatistics, UCSF
Biostatistics PhD, Fall 1998–Spring 2003
Dissertation chair: Mark van der Laan
Dissertation title: Computationally intensive statistical methods for analysis of gene expression data

Yan Wang
Director, Oncology Biostatistics, Gilead Sciences
Biostatistics PhD, Fall 2001–Spring 2006
Dissertation chair: Sandrine Dudoit
Dissertation title: Statistical Methods for Evaluating Linkage Disequilibrium and Its Patterns Using Length of Haplotype Sharing

Maya Petersen
Associate Professor and Head of the Division of Biostatistics, UC Berkeley School of Public Health
Biostatistics PhD, Fall 2002–Spring 2007
Dissertation chair: Mark van der Laan
Dissertation title: Application of causal inference methods to improve the treatment of antiretroviral-resistant HIV infection

Merrill Birkner
Vice President, Portfolio Strategy and Analytics at Gilead Sciences
Biostatistics PhD, Fall 2003–Spring 2006
Dissertation chair: Mark van der Laan
Dissertation title: Statistical Hypothesis Testing and Application to Biological Data

Houston Gilbert
VP, Biometrics and Data Management, Arcus Biosciences
Biostatistics PhD, Spring 2009
Dissertation chair: Sandrine Dudoit
Dissertation title: Multiple Hypothesis Testing: Methodology, Software Implementation, and Applications to Genomics

Kasper Hansen
Associate Professor, Department of Biostatistics, Johns Hopkins University
Biostatistics PhD, Fall 2009
Dissertation chair: Sandrine Dudoit
Dissertation title: Analyses of High-Throughput Gene Expression Data

Eric Polley
Associate Professor, University of Chicago
Biostatistics PhD, Fall 2005–Fall 2010
Dissertation chair: Mark van der Laan
Dissertation title: Super Learner

Sherri Rose
Associate Professor, Stanford University
Biostatistics PhD, Fall 2007–Spring 2011
Dissertation chair: Mark van der Laan
Dissertation title: Causal Inference for Case-Control Studies

Iván Díaz
Assistant Professor of Biostatistics, Weill Cornell Medical College
Biostatistics PhD, Fall 2009–Fall 2013
Dissertation chair: Mark van der Laan
Dissertation title: Non-parametric causal effects for continuous exposures

Laura Balzer
Assistant Professor of Biostatistics at University of Massachusetts, Amherst
Biostatistics PhD, Fall 2010–Spring 2015
Dissertation chairs: Maya Petersen, Mark van der Laan
Dissertation title: Design and Analysis of Cluster Randomized Trials with Application to HIV Prevention and Treatment

Erin LeDell (Twitter, Github)
Chief Machine Learning Scientist at H2O.ai
Biostatistics PhD, Fall 2011–Spring 2015
Dissertation chairs: Maya Petersen, Mark van der Laan
Dissertation title: Scalable Ensemble Learning and Computationally Efficient Variance Estimation

Alexander Luedtke
Assistant Professor of Statistics at University of Washington
Biostatistics PhD, Fall 2012–Spring 2016
Dissertation chair: Mark van der Laan
Dissertation title: Evaluating Optimal Individualized Treatment Rules

Robin Mejia
Special Faculty, Department of Statistics and Data Science at Carnegie Mellon University
Biostatistics PhD, Fall 2012–Spring 2016
Dissertation chair: Nicholas Jewell
Dissertation title: Estimating the size of unobserved populations in human rights: Problems in Syria and El Salvador

Oleg Sofrygin
Research Scientist (Biostatistician) at Kaiser Permanente
Biostatistics PhD, Fall 2011–Spring 2016
Dissertation chair: Mark van der Laan
Dissertation title: Semi-Parametric Estimation in Network Data and Tools for Conducting Complex Simulation Studies in Causal Inference

Linh Tran
Scientist, Google / Lecturer, Stanford University
Biostatistics PhD, Fall 2009–Spring 2016
Dissertation chair: Maya Petersen
Dissertation title: Comparative Causal Effect Estimation and Robust Variance for Longitudinal Data Structures with Applications to Observational HIV Treatment

Inna Gerlovina
Postdoctoral Scholar, UCSF
Biostatistics PhD, Fall 2008–Fall 2016
Dissertation chair: Alan Hubbard
Dissertation title: Small sample inference

Jeremy Coyle
Founder, Magnolia Data Science
Biostatistics PhD, Fall 2011–Spring 2017
Dissertation chair: Alan Hubbard
Dissertation title: Computational Considerations for Targeted Learning

Marla Johnson
Bioinformatics Scientist at Veracyte, Inc.
Biostatistics PhD, Fall 2010–Fall 2017
Dissertation chair: Elizabeth Purdom
Dissertation title: Clustering of mRNA-Seq Data for Detection of Alternative Splicing Patterns

Cheng Ju
Research Scientist at Netflix
Biostatistics PhD, Fall 2014–Spring 2018
Dissertation chair: Mark van der Laan
Dissertation title: Variable and Model Selection for Propensity Score Estimators in Causal Inference

Kelly Street
Research Fellow at Dana-Farber Cancer Institute
Biostatistics PhD, Fall 2014–Fall 2018
Dissertation chair: Sandrine Dudoit
Dissertation title: Trajectory Inference and Analysis in Single-Cell Genomics

Jonathan Levy
Statistical Scientist, Genentech
Biostatistics PhD, Fall 2014–Spring 2019
Dissertation chair: Mark van der Laan
Dissertation title: Targeted Learning in Estimating Heterogeneous Effects and Transporting Direct and Indirect Effects

Courtney Schiffman
Senior Statistical Scientist at Genentech
Biostatistics PhD, Spring 2016–Spring 2019
Dissertation chair: Sandrine Dudoit
Dissertation title: The Role of Exploratory Data Analysis and Pre-processing in Omics Studies

Chi Zhang
Product Manager, Databricks
Biostatistics PhD, Fall 2015–Spring 2019
Dissertation chair: Mark van der Laan
Dissertation title: Targeted Maximum Likelihood Estimation and Ensemble Learning for Community-Level Data and Healthcare Claims Data

Weixin Cai
Deep Learning Researcher, Microsoft
Biostatistics PhD, Fall 2017–Summer 2019
Dissertation chair: Mark van der Laan
Dissertation title: Targeted Learning of High-dimensional Parameters and Its Finite Sample Inference

Suzanne Dufault
Postdoctoral Researcher for UC Berkeley School of Public Health and the World Mosquito Program
Biostatistics PhD, Fall 2017–Spring 2020
Dissertation chair: Nicholas Jewell
Dissertation title: The Analysis of Cluster-Randomized Test-Negative Designs: Eliminating Dengue

Alejandra Benitez
Statistical Scientist, Genentech
Biostatistics PhD, Fall 2017–Summer 2020
Dissertation chair: Maya Petersen
Dissertation title: Targeted machine learning approaches for leveraging data in resource-constrained settings

Chris Kennedy
Postdoc in biomedical informatics, Harvard Medical School
Biostatistics PhD, Fall 2017–Summer 2020
Dissertation chair: Alan Hubbard
Dissertation title: Innovations in machine learning: interval latent variables, causal exposure mixtures, and clinical predictive modeling

Lina Maria Montoya
Postdoctoral fellow at UNC Chapel Hill (supervisor: Michael Kosorok, PhD) and UC Berkeley (supervisor: Jennifer Skeem, PhD)
Biostatistics PhD, Fall 2017–Fall 2020
Dissertation chair: Maya Petersen
Dissertation title: Estimation and Evaluation of the Optimal Dynamic Treatment Rule: Practical Considerations, Performance Illustrations, and Application to Criminal Justice Interventions

Aurelien Bibaut (Google Scholar)
Senior Research Scientist at Netflix
Biostatistics PhD, Summer 2015–Spring 2021
Dissertation chair: Mark van der Laan
Dissertation title: Statistical methods for causal inference from sequentially collected data and sequential decision making

Yue You
Data scientist at Facebook
Biostatistics PhD, Fall 2016–Spring 2021
Dissertation chairs: Mark van der Laan, Alan Hubbard
Dissertation title: Targeted learning for capture recapture models and treatment effect estimation

Hector Roux de Bezieux
Biostatistics PhD, Fall 2017–Spring 2021
Dissertation chair: Sandrine Dudoit
Dissertation title: Inference in high dimensions with applications to the analysis of single-cell transcriptomic and bacterial genetic data

Nima Hejazi
Postdoctoral research fellow (Weill Cornell Medicine)
Biostatistics PhD, Fall 2017–Summer 2021
Dissertation chairs: Mark van der Laan, Alan Hubbard
Dissertation title: Nonparametric Causal Inference for Stochastic Interventions

Yuting Ye
Biostatistics PhD, Fall 2015–Fall 2021
Dissertation chairs: Haiyan Huang, Peter J. Bickel
Dissertation title: Decision Making on Noisy Data with Additional Knowledge

Student Directory

Kevin Benac

benac@berkeley.edu

Advisor: Peng Ding

LinkedIn

Philippe Boileau

philippe_boileau@berkeley.edu

Advisor: Sandrine Dudoit

Website

Github

I’m a PhD student from Montreal under the supervision of Professor Sandrine Dudoit. My research revolves around the development of statistical learning methods and their application to high-dimensional datasets. I also collaborate with epidemiologists and biologists, guiding experimental design and analyzing data generated via next-generation sequencing experiments.

David Chen

Mary Combs

Maryac330@gmail.com

Google Scholar

Advisor: Mark van der Laan

Mary Combs, MA, is a PhD student in Biostatistics at the University of California, Berkeley under Mark van der Laan. Mary’s research interests include developing targeted estimators for parameters of interest in semi- and non-parametric models within the framework of causal inference. Her doctorate thesis proposes, with minimal assumptions, a novel approach to risk estimation in complex longitudinal studies in the presence of competing risks.

Lauren Eyler Dang

lauren.eyler@berkeley.edu

Advisors: Alan Hubbard, Mark van der Laan

Github

Lauren Eyler Dang, MD, MPH is currently a PhD student in Biostatistics at the University of California, Berkeley. Her research focuses on targeted maximum likelihood-based methods and constrained optimization. Her applied work addresses measurement of health disparities, cancer risk prediction for populations with limited data, and healthcare systems development in low-resource settings.

James Duncan

jpduncan@berkeley.edu

Advisor: Bin Yu

Zhiyue Hu

Partow Imani

Yunwen Ji

jiyunwen@berkeley.edu

Advisor: Alan Hubbard

Haodong Li

haodong_li@berkeley.edu

Advisors: Alan Hubbard and Mark van der Laan

A Super-enthusiastic Learner interested in the methodology and applications of targeted learning and causal inference.

Yi Li

yi_li@berkeley.edu

Advisor: Jingshen Wang

I am interested in causal inference and targeted learning.

Lauren Liao

ldliao@berkeley.edu

Advisors: Alan Hubbard and Yeyi Zhu

Github

LinkedIn

My research interests lie in causal inference, experimental design and analysis. I am interested in cognitive neuroscience, maternal health, and a wide range of public health issues.

Aidan McLoughlin

aidan_mcloughlin@berkeley.edu

Advisors: Haiyan Huang, Lexin Li

LinkedIn

I am interested in developing machine learning methods for exploration of single cell sequencing data and spatiotemporal brain data. I am also interested in reinforcement learning for problems involving these data.

Sara Moore

Maxwell Murphy

murphy2122@berkeley.edu

Advisors: Rasmus Nielsen / Bryan Greenhouse (UCSF), Mark van der Laan

Github

Rachael Phillips

rachaelvphillips@berkeley.edu

Advisor: Mark van der Laan

Twitter

Github

Rachael has an MA in Biostatistics, a BS in Biology, and a BA in Mathematics. She is a student of targeted learning and causal inference, and her research integrates personalized medicine, human-computer interaction, experimental design, and regulatory policy.

George Shan

Junming (Seraphina) Shi

junming_shi@berkeley.edu

Advisor: Haiyan Huang

Lei Shi

leishi@berkeley.edu

Advisor: Lexin Li

Website

I’m interested in high-dimensional statistics, causal inference, and modern ML/DL theory, together with their applications in public health.

Hao Wang

hao_wang@berkeley.edu

Advisor: Elizabeth Purdom

My research interests are in the fields of statistical genomics and computational biology. I’m currently working on single-cell RNA sequencing (scRNA-seq) projects, advised by Professor Elizabeth Purdom. I enjoy cooking, playing the harp and imagining I have a cat 🙂

Yutong Wang

ytwang@berkeley.edu

Advisor: Yun S. Song

Website

I am broadly interested in statistical machine learning, probabilistic modeling, and causal inference, with applications in high-dimensional genomics and metagenomics data.

Waverly (Linqing) Wei

linqing_wei@berkeley.edu

Advisors: Jingshen Wang, Alan Hubbard

Website

My research interests lie in causal inference and adaptive design.

Yulun (Rayn) Wu

yulun_wu@berkeley.edu

Advisor: James Bentley “Ben” Brown

Github

My research interests include Bayesian networks, reinforcement learning, causal inference, semiparametric estimation and statistical computing. Outside of work, I love basketball, billiards, skating, biking, gaming and various kinds of water sports.

Mingrui Zhang

mingrui_zhang@berkeley.edu

Advisor: Lexin Li

Wenxin Zhang

wenxin_zhang@berkeley.edu

Advisor: Mark van der Laan

Xin Zhou

xinzhou@berkeley.edu

Advisor: Lexin Li

Yunzhe Zhou

ztzyz615@berkeley.edu

Advisor: Alan Hubbard

LinkedIn

I am interested in deep learning, statistical learning, graphical models, and network analysis.

Postdoc Directory

Aaron Hudson

awhudson@berkeley.edu

Advisors: Maya Petersen and Mark van der Laan

Website

I recently completed my graduate studies in the Department of Biostatistics at the University of Washington, where I was advised by Dr. Ali Shojaie. I am broadly interested in developing methods for nonparametric statistical inference with applications in public health and medicine.

Yogita Sharma

yogita.sharma@berkeley.edu

Advisor: John Marshall

LinkedIn

Suzanne Dufault

sdufault@berkeley.edu

Website

Advisor: Nicholas P. Jewell

As a postdoctoral scholar, I work in close collaboration with the World Mosquito Program to evaluate the potential of novel interventions to eliminate mosquito-borne diseases such as dengue. Statistical methods for novel trial designs, spatial-temporal analyses of disease spread, and randomization-based inference are three of the areas my work explores. Through consultancies and additional graduate research opportunities, I have also had the opportunity to work with occupational cohort data, survey data, and biomedical big data.

Zeyi Wang

wangzeyi@berkeley.edu

Advisors: Mark van der Laan, Maya Petersen

My research interests include methods development with targeted maximum likelihood estimation in causal inference, computerized efficient estimation, longitudinal and survival data analysis, as well as reproducibility and clinical trials with brain functional connectivity.