Towards Automated Online Schema Evolution

Yu Zhu

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2017-218
December 14, 2017

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-218.pdf

Schema evolution studies the issue of moving a database from one version of its schema to a new updated schema. Traditionally, database administrators perform these tasks offline and they involve large amounts of manual labor and custom scripting. In today’s world where databases power many 24/7 online services, schema evolution can no longer be an offline process. Furthermore, application requirements change much more rapidly today, causing more frequent changes in database schemas. Because of these trends, it is critical for database administrators to have automated tools to evolve database schemas in an online fashion that does not disrupt the foreground services.

This thesis attempts to explore ways an administrator might automate this process and provide some insight into building tools to help make this process easier, faster and more reliable. The thesis makes the following contributions. First, it provides a complete system implementation, Ratchet, that a database administrator can use to perform efficient schema evolution on supported platforms (PostgreSQL). The system uses various techniques such as improved fine-grained locking and a delayed-copy strategy to improve its schema evolution performance. Second, it analyzes the characteristics of schema evolution for a five-year period for Wikimedia, one of the most widely used websites. Third, using Ratchet, the thesis recreates five years of schema evolution automatically. Finally, the thesis provides a mechanism of rollback in schema evolution.

Advisor: Eric Brewer


BibTeX citation:

@phdthesis{Zhu:EECS-2017-218,
    Author = {Zhu, Yu},
    Title = {Towards Automated Online Schema Evolution},
    School = {EECS Department, University of California, Berkeley},
    Year = {2017},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-218.html},
    Number = {UCB/EECS-2017-218},
    Abstract = {Schema evolution studies the issue of moving a database from one version of its schema to a new updated schema. Traditionally, database administrators perform these tasks offline and they involve large amounts of manual labor and custom scripting. In today’s world where databases power many 24/7 online services, schema evolution can no longer be an offline process. Furthermore, application requirements change much more rapidly today, causing more frequent changes in database schemas. Because of these trends, it is critical for database administrators to have automated tools to evolve database schemas in an online fashion that does not disrupt the foreground services.

This thesis attempts to explore ways an administrator might automate this process and provide some insight into building tools to help make this process easier, faster and more reliable. The thesis makes the following contributions. First, it provides a complete system implementation, Ratchet, that a database administrator can use to perform efficient schema evolution on supported platforms (PostgreSQL). The system uses various techniques such as improved fine-grained locking and a delayed-copy strategy to improve its schema evolution performance. Second, it analyzes the characteristics of schema evolution for a five-year period for Wikimedia, one of the most widely used websites. Third, using Ratchet, the thesis recreates five years of schema evolution automatically. Finally, the thesis provides a mechanism of rollback in schema evolution.}
}

EndNote citation:

%0 Thesis
%A Zhu, Yu
%T Towards Automated Online Schema Evolution
%I EECS Department, University of California, Berkeley
%D 2017
%8 December 14
%@ UCB/EECS-2017-218
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-218.html
%F Zhu:EECS-2017-218