Margaret Helen Butler

EECS Department, University of California, Berkeley

Technical Report No. UCB/CSD-88-401

, 1988

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-401.pdf

Considerable recent activity has been directed toward providing data management facilities for artificial intelligence and computer-aided design. A major characteristic of the data in these environments is that data objects may contain references to other data objects. These interobject references introduce many issues that are studied in this dissertation. Since Lisp is a common language with interobject references, it has been used to study their characteristics when stored in a database management system. First, this dissertation reports on a persistent Lisp prototype in which Lisp stores application data in the database management system Ingres. Two Lisp applications were converted to use this prototype and their performance is analyzed. <p>Initial testing of a data intensive application displayed a slowdown of a factor of three over the performance when all data was stored in virtual memory. The retrieval rate from the database was thirty-five objects per second. In contrast, the update rate was only 10 objects per second. Therefore, applications with many retrieves but few updates will perform fairly well. <p>Three techniques were studied for their impact on performance, namely, caching data in virtual memory, prefetching data, and clustering data in the database. If an object is cached in virtual memory after it is fetched from the database, the performance of the example persistent applications improved by a factor of 25. Subsequent experiments incorporated a mechanism that predicted what data would be accessed next. The performance of persistent application improves up to 30% when data is prefetched. <p>Clustering was implemented by storing more than one object in a single database tuple. Various forms of clustering were analyzed and a kind that has proven effective for Lisp data was implemented and tested. This clustering technique improves performance up to 40%. <p>Once interobject references are stored in a database, it becomes possible to alter a reference such that an object stored in the database will no longer be accessible. In virtual memory environments, automatic storage reclaimers, or garbage collectors, are utilized to find and delete unreachable objects. Existing storage reclamation algorithms are analyzed to determine their disk I/O characteristics. The features that appear to be necessary include being able to reclaim data a little at a time, and reclustering the data. Algorithms that do not recluster data cause applications performance to degrade over time. In addition, algorithms that scavenge the entire memory at once require an application remain idle for unrealistic amounts of time.

Advisors: Michael R. Stonebraker


BibTeX citation:

@phdthesis{Butler:CSD-88-401,
    Author= {Butler, Margaret Helen},
    Title= {Persistent LISP: Storing Interobject References in a Database},
    School= {EECS Department, University of California, Berkeley},
    Year= {1988},
    Month= {Nov},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5872.html},
    Number= {UCB/CSD-88-401},
    Abstract= {Considerable recent activity has been directed toward providing data management facilities for artificial intelligence and computer-aided design.  A major characteristic of the data in these environments is that data objects may contain references to other data objects.  These interobject references introduce many issues that are studied in this dissertation.  Since Lisp is a common language with interobject references, it has been used to study their characteristics when stored in a database management system.  First, this dissertation reports on a persistent Lisp prototype in which Lisp stores application data in the database management system Ingres.  Two Lisp applications were converted to use this prototype and their performance is analyzed.  <p>Initial testing of a data intensive application displayed a slowdown of a factor of three over the performance when all data was stored in virtual memory.  The retrieval rate from the database was thirty-five objects per second.  In contrast, the update rate was only 10 objects per second.  Therefore, applications with many retrieves but few updates will perform fairly well.  <p>Three techniques were studied for their impact on performance, namely, caching data in virtual memory, prefetching data, and clustering data in the database.  If an object is cached in virtual memory after it is fetched from the database, the performance of the example persistent applications improved by a factor of 25. Subsequent experiments incorporated a mechanism that predicted what data would be accessed next.  The performance of persistent application improves up to 30% when data is prefetched. <p>Clustering was implemented by storing more than one object in a single database tuple.  Various forms of clustering were analyzed and a kind that has proven effective for Lisp data was implemented and tested.  This clustering technique improves performance up to 40%. <p>Once interobject references are stored in a database, it becomes possible to alter a reference such that an object stored in the database will no longer be accessible.  In virtual memory environments, automatic storage reclaimers, or garbage collectors, are utilized to find and delete unreachable objects.  Existing storage reclamation algorithms are analyzed to determine their disk I/O characteristics.  The features that appear to be necessary include being able to reclaim data a little at a time, and reclustering the data.  Algorithms that do not recluster data cause applications performance to degrade over time. In addition, algorithms that scavenge the entire memory at once require an application remain idle for unrealistic amounts of time.},
}

EndNote citation:

%0 Thesis
%A Butler, Margaret Helen 
%T Persistent LISP: Storing Interobject References in a Database
%I EECS Department, University of California, Berkeley
%D 1988
%@ UCB/CSD-88-401
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5872.html
%F Butler:CSD-88-401