Xinyang Geng

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2023-219

August 11, 2023

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-219.pdf

Black-box model-based optimization problems, where the goal is to find a design input that maximizes an unknown objective function, are ubiquitous in a wide range of domains, such as the design of proteins, DNA sequences, aircraft, and robots. Solving model-based optimization problems typically requires actively querying the unknown objective function on design proposals, which means physically building the candidate molecule, aircraft, or robot and testing it to obtain the result. This process can be expensive and time-consuming, and one might instead prefer to optimize for the best design using only the data one already has. This setting, called offline model-based optimization (MBO), poses substantial algorithmic challenges that differ from those of the more commonly studied online techniques. In this thesis, I will cover how to build benchmarks and algorithms to tackle these challenges. In particular, I will first define the offline MBO problem formally and identify the common challenging properties associated with real-world offline MBO problems. I will then present Design-Bench, a benchmark for evaluating offline MBO methods with a suite of diverse and realistic tasks derived from real-world optimization problems. With the benchmark set up, I will describe conservative objective models (COMs), a surprisingly simple but effective method for tackling offline MBO problems. Finally, I will cover applications of offline MBO in computational chemistry and synthetic biology to demonstrate how variants of COMs can be applied to solve real-world scientific problems.

Advisor: Sergey Levine


BibTeX citation:

@phdthesis{Geng:EECS-2023-219,
    Author= {Geng, Xinyang},
    Title= {Offline Data-Driven Optimization: Benchmarks, Algorithms and Applications},
    School= {EECS Department, University of California, Berkeley},
    Year= {2023},
    Month= {Aug},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-219.html},
    Number= {UCB/EECS-2023-219},
    Abstract= {Black-box model-based optimization problems, where the goal is to find a design input that maximizes an unknown objective function, are ubiquitous in a wide range of domains, such as the design of proteins, DNA sequences, aircraft, and robots. Solving model-based optimization problems typically requires actively querying the unknown objective function on design proposals, which means physically building the candidate molecule, aircraft, or robot and testing it to obtain the result. This process can be expensive and time-consuming, and one might instead prefer to optimize for the best design using only the data one already has. This setting, called offline model-based optimization (MBO), poses substantial algorithmic challenges that differ from those of the more commonly studied online techniques. In this thesis, I will cover how to build benchmarks and algorithms to tackle these challenges. In particular, I will first define the offline MBO problem formally and identify the common challenging properties associated with real-world offline MBO problems. I will then present Design-Bench, a benchmark for evaluating offline MBO methods with a suite of diverse and realistic tasks derived from real-world optimization problems. With the benchmark set up, I will describe conservative objective models (COMs), a surprisingly simple but effective method for tackling offline MBO problems. Finally, I will cover applications of offline MBO in computational chemistry and synthetic biology to demonstrate how variants of COMs can be applied to solve real-world scientific problems.},
}

EndNote citation:

%0 Thesis
%A Geng, Xinyang
%T Offline Data-Driven Optimization: Benchmarks, Algorithms and Applications
%I EECS Department, University of California, Berkeley
%D 2023
%8 August 11
%@ UCB/EECS-2023-219
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-219.html
%F Geng:EECS-2023-219