Neural-Backed Generators for Program Synthesis

Rohan Bavishi

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2019-82
May 17, 2019

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-82.pdf

Developers nowadays have to contend with a growing number of APIs. While in the long-term they are very useful to developers, many modern APIs, with their hundreds of functions handling many arguments, obscure documentation, and frequently changing semantics, have an incredibly steep learning curve. For APIs that perform data transformations, novices can often provide an I/O example demonstrating the desired transformation, but are stuck on how to translate it to the API. Our goal is to build a programming-by-example synthesis engine that takes such I/O examples and directly produces programs in the target API. This presents unique challenges due to the breadth of real-world APIs, and the often-complex constraints over function arguments. We present a generator-based synthesis approach to contend with these problems. This approach uses a program candidate generator, which encodes basic constraints on the space of programs. We introduce neural-backed operators which can be seamlessly integrated into the program generator. To improve the efficiency of the search, we simply use these operators at non-deterministic decision points, instead of relying on domain-specific heuristics. We implement this technique for the Python pandas library in AutoPandas. AutoPandas supports 119 pandas dataframe transformation functions. We evaluate AutoPandas on 26 real-world benchmarks and find it solves 65% of them.

Advisor: Koushik Sen


BibTeX citation:

@mastersthesis{Bavishi:EECS-2019-82,
    Author = {Bavishi, Rohan},
    Title = {Neural-Backed Generators for Program Synthesis},
    School = {EECS Department, University of California, Berkeley},
    Year = {2019},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-82.html},
    Number = {UCB/EECS-2019-82},
    Abstract = {Developers nowadays have to contend with a growing number of APIs. While in the long-term they are very useful to developers, many modern APIs, with their hundreds of functions handling many arguments, obscure documentation, and frequently changing semantics, have an incredibly steep learning curve. For APIs that perform data transformations, novices can often provide an I/O example demonstrating the desired transformation, but are stuck on how to translate it to the API. Our goal is to build a programming-by-example synthesis engine that takes such I/O examples and directly produces programs in the target API. This presents unique challenges due to the breadth of real-world APIs, and the often-complex constraints over function arguments. We present a generator-based synthesis approach to contend with these problems. This approach uses a program candidate generator, which encodes basic constraints on the space of programs. We introduce neural-backed operators which can be seamlessly integrated into the program generator. To improve the efficiency of the search, we simply use these operators at non-deterministic decision points, instead of relying on domain-specific heuristics. We implement this technique for the Python pandas library in AutoPandas. AutoPandas supports 119 pandas dataframe transformation functions. We evaluate AutoPandas on 26 real-world benchmarks and find it solves 65% of them.}
}

EndNote citation:

%0 Thesis
%A Bavishi, Rohan
%T Neural-Backed Generators for Program Synthesis
%I EECS Department, University of California, Berkeley
%D 2019
%8 May 17
%@ UCB/EECS-2019-82
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-82.html
%F Bavishi:EECS-2019-82