Statistical Guarantees for Black-Box Models
Anastasios Angelopoulos
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2024-226
December 20, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-226.pdf
Reliability has emerged as one of the most important challenges facing AI deployments. One difficulty is that standard theoretical tools cannot guarantee strong performance of modern AI systems, owing to their complexity and ever-changing training and development pipelines. Here, we take a different approach: ensuring the reliability of a black-box model, one where we have access only to the inputs and outputs and no knowledge of the mapping between the two. The only way to ensure the reliability of such models is to surround them with a statistical infrastructure for measurement and calibration.
This thesis develops statistical guarantees for black-box AI models in the domains of prediction and inference. The first part deals with prediction: guaranteeing reliability on a per-input basis. In particular, I focus on a line of work extending the model-agnostic guarantees of conformal prediction to the realm of risk control and decision-making. The second part deals with inference: aggregating predictions to produce estimators, confidence intervals, and p-values that teach us about the broader world. There, I focus on Prediction-Powered Inference, a tool for trustworthy AI-driven science, and its application to the automated evaluation of AI algorithms.
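To make the first part concrete, here is a minimal sketch of split conformal prediction for regression, the kind of model-agnostic wrapper the thesis builds on: the model is a black box queried only through its predictions, and a held-out calibration set converts its errors into a prediction interval with finite-sample marginal coverage when the calibration and test points are exchangeable. The absolute-residual score and the scikit-learn-style .predict interface are illustrative assumptions, not the thesis's own code.

    import numpy as np

    def conformal_interval(model, X_cal, y_cal, x_test, alpha=0.1):
        """Prediction interval for x_test with >= 1 - alpha marginal coverage,
        treating `model` as a black box queried only via model.predict."""
        # Nonconformity scores on the held-out calibration set.
        scores = np.abs(y_cal - model.predict(X_cal))
        n = len(scores)
        # Conformal quantile with the finite-sample (n + 1) correction.
        q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        qhat = np.quantile(scores, q_level, method="higher")
        pred = model.predict(np.atleast_2d(x_test))[0]
        return pred - qhat, pred + qhat

For the second part, a minimal sketch of prediction-powered inference for a population mean under a normal approximation: model predictions on a large unlabeled set supply statistical power, while a small labeled set estimates and removes the model's bias (the "rectifier"). The function name and interface below are assumptions for illustration.

    import numpy as np
    from scipy.stats import norm

    def ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled, alpha=0.05):
        """PPI point estimate and (1 - alpha) confidence interval for E[Y]."""
        n, N = len(y_labeled), len(yhat_unlabeled)
        rectifier = y_labeled - yhat_labeled              # corrects the model's bias
        theta = yhat_unlabeled.mean() + rectifier.mean()  # PPI point estimate
        # Normal-approximation standard error: prediction spread plus rectifier spread.
        se = np.sqrt(yhat_unlabeled.var(ddof=1) / N + rectifier.var(ddof=1) / n)
        z = norm.ppf(1 - alpha / 2)
        return theta, (theta - z * se, theta + z * se)

In both sketches the guarantee comes from the held-out labeled data rather than from any assumption about the model, which is what makes the approach black-box.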
Advisors: Jitendra Malik and Michael Jordan
BibTeX citation:
@phdthesis{Angelopoulos:EECS-2024-226,
    Author = {Angelopoulos, Anastasios},
    Title = {Statistical Guarantees for Black-Box Models},
    School = {EECS Department, University of California, Berkeley},
    Year = {2024},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-226.html},
    Number = {UCB/EECS-2024-226},
}
EndNote citation:
%0 Thesis
%A Angelopoulos, Anastasios
%T Statistical Guarantees for Black-Box Models
%I EECS Department, University of California, Berkeley
%D 2024
%8 December 20
%@ UCB/EECS-2024-226
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-226.html
%F Angelopoulos:EECS-2024-226