FlowFusion: Optimizing Cloud Workflows Through Fusion and Parallelization

Nithin Tatikonda

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2025-80
May 16, 2025

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-80.pdf

Currently, workflow services such as Google Cloud Workflows, AWS Step Functions, Azure Durable Orchestrations, and Airflow execute workflows exactly as the user specifies them. This is not ideal, as we want users to program for readability and programmability without worrying about the impact on performance. We introduce FlowFusion, a tool for programmatically combining or rearranging the separate tasks (task fusion) of a workflow into an optimized workflow with fewer workflow tasks, fewer database operations, and/or increased parallelism. FlowFusion works through a three-step process: profiling, task fusion, and task parallelization. Profiling executes the original workflow and measures the durations of tasks and read/write operations. Task fusion determines which tasks to combine into a single task to reduce the cost of scheduling new tasks and transferring data from task to task. Task parallelization determines if and how a data-parallel task should be parallelized to minimize execution time. In implementing these three phases, we account for the “quirks” of cloud functions, cloud workflows, and database operations. Our tool weighs task fusion against parallelism: fusion inherently reduces parallelism, which could increase execution time, but for some workflows the task invocation overhead and task spin-up time make fusion the optimal choice. Our tool also considers failure rates and retries. Some workflow tasks have multiple retries enabled, meaning a task may be re-executed until it succeeds or the retry limit is exceeded, and the optimizer takes this behavior into account. Overall, our evaluation shows that FlowFusion achieves significantly lower execution times for most workflows, with up to a 4× improvement.
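The trade-off between fusion and parallelism described above can be illustrated with a toy cost model. The sketch below is not FlowFusion's actual optimizer; the `Task` fields, the function names, and the single fixed `overhead` per task invocation are all hypothetical. It also assumes that a failed attempt costs a full task duration and that the slowest branch can be approximated by the maximum of the expected branch durations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    duration: float      # profiled time of one attempt, in seconds (assumed)
    failure_rate: float  # probability that a single attempt fails, 0 <= p < 1
    max_retries: int     # retries allowed after the first attempt


def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected attempts when retrying until success or the retry limit."""
    # Attempt k+1 only runs if the first k attempts all failed (prob. p**k).
    return sum(failure_rate ** k for k in range(max_retries + 1))


def expected_duration(task: Task) -> float:
    # Assumption: every attempt, failed or not, costs the full duration.
    return task.duration * expected_attempts(task.failure_rate, task.max_retries)


def fused_time(head: Task, branches: List[Task], overhead: float) -> float:
    """One fused task: pay the invocation overhead once, run everything serially."""
    return overhead + expected_duration(head) + sum(expected_duration(b) for b in branches)


def parallel_time(head: Task, branches: List[Task], overhead: float) -> float:
    """Head task, then all branches as separate tasks running side by side."""
    # Simplification: the fan-out stage is as slow as its slowest expected branch.
    return (overhead + expected_duration(head)) + overhead + max(
        expected_duration(b) for b in branches
    )


def choose_plan(head: Task, branches: List[Task], overhead: float) -> str:
    fused = fused_time(head, branches, overhead)
    parallel = parallel_time(head, branches, overhead)
    return "fuse" if fused <= parallel else "parallelize"
```

Under this model, short branches favor fusion because the second invocation overhead outweighs the time saved by running them side by side, while long branches favor parallelization. For example, with `overhead=1.0` and a 2-second head task, two 0.5-second branches yield `"fuse"`, whereas two 5-second branches yield `"parallelize"`.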

Advisor: Alvin Cheung


BibTeX citation:

@mastersthesis{Tatikonda:EECS-2025-80,
    Author = {Tatikonda, Nithin},
    Title = {FlowFusion: Optimizing Cloud Workflows Through Fusion and Parallelization},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-80.html},
    Number = {UCB/EECS-2025-80},
    Abstract = {Currently, workflow services such as Google Cloud Workflows, AWS Step Functions, Azure Durable Orchestrations, and Airflow workflows execute literally as dictated by the user. This is not ideal as we want users to program for readability and programmability without worrying about impacts on performance. We introduce FlowFusion, a tool for programmatically combining or rearranging the separate tasks (task fusion) of a workflow into an optimized workflow with fewer workflow tasks, fewer database operations, and/or increased parallelism. FlowFusion works through a three-step process: profiling, task fusion, and task parallelization. Profiling involves executing the original workflow and determining the durations of tasks and read/write operations. Task fusion involves determining which tasks to combine into a single task to reduce the cost of scheduling new tasks and transferring data from task to task. Task parallelization involves determining if and how a data parallel task should be parallelized in order to minimize execution time. In implementing these three phases of our optimization tool, the “quirks” of cloud functions, cloud workflows, and database operations are considered. Our tool considers task fusion versus parallelism. Fusion inherently reduces parallelism, which could increase execution time. On the other hand, for some workflows, the task invocation overhead and task spin-up time could make fusion the optimal choice. Our tool also considers failure rate and retries for tasks. Some workflow tasks will have multiple retries enabled, meaning tasks will sometimes need to be executed until success is achieved or the number of retries is exceeded, and this behavior is also taken into consideration by the optimizer. Overall, our evaluation shows that FlowFusion achieves significantly lower execution times for most workflows, achieving up to a 4× improvement.}
}

EndNote citation:

%0 Thesis
%A Tatikonda, Nithin
%T FlowFusion: Optimizing Cloud Workflows Through Fusion and Parallelization
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 16
%@ UCB/EECS-2025-80
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-80.html
%F Tatikonda:EECS-2025-80