Ali Sinan Koksal

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2018-49

May 11, 2018

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-49.pdf

Cell signaling controls basic cellular activities and coordinates cell actions, such as cell differentiation, division and growth. Consequently, errors in cellular signaling are responsible for diseases such as cancer, autoimmunity, and diabetes.

Executable biology describes mechanistic models of biological processes in a formal language that is dynamic and executable by a computer. Models in executable biology are able to capture complex behaviors of biological systems, such as time and concurrency. In addition, discrete modeling enables efficient algorithms to exhaustively explore spaces of models.

This thesis introduces tools to automatically infer executable models at different levels of abstraction from varied types of experimental data. In each case, we investigate identifiability of models when the provided experimental evidence and prior knowledge are varied. We make the following individual contributions:
  * TPS: A framework for the automated inference of signed directed graphs modeling protein signaling networks, using time series data.
  * SBL: A modeling language embedded in Scala for the automated synthesis of concurrent programs modeling cell fate decision using mutation experiments.
  * Karme: A framework for investigating identifiability of asynchronous Boolean network models from single-cell gene expression data.

To evaluate our work, we apply our tools to in vivo, in vitro, and in silico data sets on cellular differentiation and protein signaling. We show that, through explicit characterization of ambiguities in input specifications, our approaches make unambiguous predictions supported by experimental evidence, and suggest new experiments that help disambiguate alternative explanations. Applied to epidermal growth factor signaling response data, TPS exhaustively explores all models that are consistent with the input, and makes predictions that are unambiguous across the model space. These predictions are supported by further experimental validation. Using SBL, we synthesize valid models of cell fate decision in C. elegans vulval precursor cells, fixing a bug in previous modeling. We show the existence of internally different models that are behaviorally equivalent under all mutation experiments. One of the inferred models expresses a previously unknown biological hypothesis. Finally, we use Karme to synthesize models of myeloid cell differentiation from simulated noisy single-cell data, and demonstrate that experimental design can reveal key genes in the system.

Advisors: Ras Bodik and Nir Yosef


BibTeX citation:

@phdthesis{Koksal:EECS-2018-49,
    Author= {Koksal, Ali Sinan},
    Title= {Program Synthesis for Systems Biology},
    School= {EECS Department, University of California, Berkeley},
    Year= {2018},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-49.html},
    Number= {UCB/EECS-2018-49},
    Abstract= {Cell signaling controls basic cellular activities and coordinates cell actions, such as cell differentiation, division and growth. Consequently, errors in cellular signaling are responsible for diseases such as cancer, autoimmunity, and diabetes.

Executable biology describes mechanistic models of biological processes in a formal language that is dynamic and executable by a computer. Models in executable biology are able to capture complex behaviors of biological systems, such as time and concurrency. In addition, discrete modeling enables efficient algorithms to exhaustively explore spaces of models.

This thesis introduces tools to automatically infer executable models at different levels of abstraction from varied types of experimental data. In each case, we investigate identifiability of models when the provided experimental evidence and prior knowledge are varied. We make the following individual contributions:
  * TPS: A framework for the automated inference of signed directed graphs modeling protein signaling networks, using time series data.
  * SBL: A modeling language embedded in Scala for the automated synthesis of concurrent programs modeling cell fate decision using mutation experiments.
  * Karme: A framework for investigating identifiability of asynchronous Boolean network models from single-cell gene expression data.

To evaluate our work, we apply our tools to in vivo, in vitro, and in silico data sets on cellular differentiation and protein signaling. We show that, through explicit characterization of ambiguities in input specifications, our approaches make unambiguous predictions supported by experimental evidence, and suggest new experiments that help disambiguate alternative explanations. Applied to epidermal growth factor signaling response data, TPS exhaustively explores all models that are consistent with the input, and makes predictions that are unambiguous across the model space. These predictions are supported by further experimental validation. Using SBL, we synthesize valid models of cell fate decision in C. elegans vulval precursor cells, fixing a bug in previous modeling. We show the existence of internally different models that are behaviorally equivalent under all mutation experiments. One of the inferred models expresses a previously unknown biological hypothesis. Finally, we use Karme to synthesize models of myeloid cell differentiation from simulated noisy single-cell data, and demonstrate that experimental design can reveal key genes in the system.},
}

EndNote citation:

%0 Thesis
%A Koksal, Ali Sinan
%T Program Synthesis for Systems Biology
%I EECS Department, University of California, Berkeley
%D 2018
%8 May 11
%@ UCB/EECS-2018-49
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-49.html
%F Koksal:EECS-2018-49