Recognizing Functions in Binaries with Neural Networks

Richard Shin, Dawn Song and Reza Moazzezi

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2017-200
December 12, 2017

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-200.pdf

Binary analysis facilitates many important applications like malware detection and automatically fixing vulnerable software. In this paper, we propose to apply artificial neural networks to solve important yet difficult problems in binary analysis. Specifically, we tackle the problem of function identification, a crucial first step in many binary analysis techniques. Using a dataset from prior work, we show that recurrent neural networks can identify functions in binaries with greater accuracy and efficiency than the state-of-the-art machine-learning-based method. We can train the model an order of magnitude faster and evaluate it on binaries hundreds of times faster. Furthermore, it halves the error rate on six out of eight benchmarks, and performs comparably on the remaining two.
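As a rough illustration of the problem framing only, function identification can be viewed as labeling each byte of a code section with the probability that a function begins there. The sketch below is an untrained, single-direction recurrent tagger in plain Python. The class name `TinyRNNTagger`, the hidden size, the random weights, and the per-byte sigmoid output are all assumptions made for this example; it is not the architecture or code from the report, and its outputs carry no meaning until the weights are trained.

```python
import math
import random


class TinyRNNTagger:
    """Hypothetical, untrained Elman-style RNN: one score per input byte."""

    def __init__(self, hidden_size=8, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = init(hidden_size, 256)           # byte value -> hidden
        self.w_rec = init(hidden_size, hidden_size)  # hidden -> hidden
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def forward(self, code):
        """Return one value in (0, 1) per byte of `code`."""
        n = self.hidden_size
        h = [0.0] * n
        probs = []
        for byte in code:
            # With a one-hot input, the input product reduces to selecting
            # column `byte` of w_in.
            h = [
                math.tanh(self.w_in[i][byte]
                          + sum(self.w_rec[i][j] * h[j] for j in range(n)))
                for i in range(n)
            ]
            logit = sum(w * v for w, v in zip(self.w_out, h))
            probs.append(1.0 / (1.0 + math.exp(-logit)))
        return probs


# x86 bytes for: push ebp; mov ebp, esp; ret
code = bytes.fromhex("5589e5c3")
probs = TinyRNNTagger().forward(code)
```

In a trained model, bytes whose score exceeds a chosen threshold would be reported as function starts. Here the weights are random, so the scores only demonstrate the input and output shapes.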

Advisor: Dawn Song


BibTeX citation:

@mastersthesis{Shin:EECS-2017-200,
    Author = {Shin, Richard and Song, Dawn and Moazzezi, Reza},
    Title = {Recognizing Functions in Binaries with Neural Networks},
    School = {EECS Department, University of California, Berkeley},
    Year = {2017},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-200.html},
    Number = {UCB/EECS-2017-200},
    Abstract = {Binary analysis facilitates many important applications like malware detection and automatically fixing vulnerable software. In this paper, we propose to apply artificial neural networks to solve important yet difficult problems in binary analysis. Specifically, we tackle the problem of function identification, a crucial first step in many binary analysis techniques. Using a dataset from prior work, we show that recurrent neural networks can identify functions in binaries with greater accuracy and efficiency than the state-of-the-art machine-learning-based method. We can train the model an order of magnitude faster and evaluate it on binaries hundreds of times faster. Furthermore, it halves the error rate on six out of eight benchmarks, and performs comparably on the remaining two.}
}

EndNote citation:

%0 Thesis
%A Shin, Richard
%A Song, Dawn
%A Moazzezi, Reza
%T Recognizing Functions in Binaries with Neural Networks
%I EECS Department, University of California, Berkeley
%D 2017
%8 December 12
%@ UCB/EECS-2017-200
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-200.html
%F Shin:EECS-2017-200