Hari Prasanna Das

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2023-198

August 6, 2023

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-198.pdf

Climate change and pandemics are two of the most pressing threats facing humanity today. Addressing these urgent threats requires immediate mitigative action. In the US, buildings are responsible for 40% of primary energy consumption, 73% of electricity use, and 40% of greenhouse gas emissions, the primary cause of global warming, and such high levels are now rapidly spreading across the rest of the world. At the same time, buildings are integral to human lives: we spend most of our time in them, which substantially affects our health and productivity. So, for climate change mitigation, it is essential to optimize energy use in buildings while ensuring human comfort. For pandemic mitigation, on the other hand, it is crucial to diagnose and better understand a new disease in a time-sensitive manner. Over the years, Machine Learning (ML) has been widely used for both of these efforts. However, buildings and pandemic-specific healthcare systems share a number of data-specific challenges that hinder robust ML implementations.

We present three major research works that tackle these challenges with generative modeling and transfer learning. The first is on conditional synthetic data generation, where the focus is to generate synthetic data conditionally for classes with infrequent data points. Its applications include tackling class imbalance in healthcare data and privacy-preserving data sharing. The second is on improved pre-processing methods for tabular data (a common data type in smart buildings) to enable seamless use by many ML algorithms. To improve the generalizability and scalability of the models, the third is on a transfer learning-based adversarial domain adaptation method, with applications in adapting personal thermal comfort models in buildings from one occupant to another without using any data labels for the target occupant. With this method, the time- and resource-intensive task of acquiring multiple labels for the target environment in a building can be avoided.

Advisor: Costas J. Spanos


BibTeX citation:

@phdthesis{Das:EECS-2023-198,
    Author= {Das, Hari Prasanna},
    Title= {Data-Centric Machine Learning for Human-Centric Applications},
    School= {EECS Department, University of California, Berkeley},
    Year= {2023},
    Month= {Aug},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-198.html},
    Number= {UCB/EECS-2023-198},
    Abstract= {Climate change and pandemics are two of the most pressing threats facing humanity today. Addressing these urgent threats requires immediate mitigative action. In the US, buildings are responsible for 40% of primary energy consumption, 73% of electricity use, and 40% of greenhouse gas emissions, the primary cause of global warming, and such high levels are now rapidly spreading across the rest of the world. At the same time, buildings are integral to human lives: we spend most of our time in them, which substantially affects our health and productivity. So, for climate change mitigation, it is essential to optimize energy use in buildings while ensuring human comfort. For pandemic mitigation, on the other hand, it is crucial to diagnose and better understand a new disease in a time-sensitive manner. Over the years, Machine Learning (ML) has been widely used for both of these efforts. However, buildings and pandemic-specific healthcare systems share a number of data-specific challenges that hinder robust ML implementations.

We present three major research works that tackle these challenges with generative modeling and transfer learning. The first is on conditional synthetic data generation, where the focus is to generate synthetic data conditionally for classes with infrequent data points. Its applications include tackling class imbalance in healthcare data and privacy-preserving data sharing. The second is on improved pre-processing methods for tabular data (a common data type in smart buildings) to enable seamless use by many ML algorithms. To improve the generalizability and scalability of the models, the third is on a transfer learning-based adversarial domain adaptation method, with applications in adapting personal thermal comfort models in buildings from one occupant to another without using any data labels for the target occupant. With this method, the time- and resource-intensive task of acquiring multiple labels for the target environment in a building can be avoided.},
}

EndNote citation:

%0 Thesis
%A Das, Hari Prasanna
%T Data-Centric Machine Learning for Human-Centric Applications
%I EECS Department, University of California, Berkeley
%D 2023
%8 August 6
%@ UCB/EECS-2023-198
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-198.html
%F Das:EECS-2023-198