Xuaner Zhang

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2020-228

December 18, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-228.pdf

Having a compact, casual pocket camera always within reach is a delight. It opens up the opportunity to capture spontaneous moments and casual events. While users appreciate the convenience of the mobile experience, their craving for professional visual quality is hard to satisfy. Because of hardware limitations and a lack of control over suboptimal conditions in the environment, casual photos and videos suffer from noise, lack of sharpness, unflattering lighting, wrong focus, distracting obstructions, etc. The desire is to make cameras see as the human visual system does: to understand the world and produce photographs that are perceptually pleasing and meaningful. Professional studio photography and cinematography come closest to delivering high-quality photos and videos, by incorporating intricate hardware and gathering a professional crew. Casual imaging, on the other hand, is still nowhere close.

In this thesis, I argue that it is key for a camera to understand the semantics of the scene -- the context -- presented in its viewfinder in order to intelligently capture and process sensor data. The way to bring in such contextual information is through machine learning. Thankfully, modern mobile cameras are integrated with fast image processors and even dedicated machine learning chips that drive the development of computational capabilities. Machine-learning-driven computational photography algorithms are more practical than ever before. Throughout the thesis, I discuss the challenges of casual imaging and how its quality can benefit from professional photography and cinematography principles. The thesis focuses on quality enhancement in three aspects -- perceptual quality, lighting and focus. We propose a number of learning-based methods to lift these limitations and produce unprecedented results, and show a potential direction that integrates machine learning and imaging systems to bring casual photos and videos closer to professional quality.

Advisor: Ren Ng


BibTeX citation:

@techreport{Zhang:EECS-2020-228,
    Author= {Zhang, Xuaner},
    Title= {Bridging Machine Learning and Computational Photography to Bring Professional Quality into Casual Photos and Videos},
    Year= {2020},
    Month= {Dec},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-228.html},
    Number= {UCB/EECS-2020-228},
    Abstract= {Having a compact, casual pocket camera always within reach is a delight. It opens up the opportunity to capture spontaneous moments and casual events. While users appreciate the convenience of the mobile experience, their craving for professional visual quality is hard to satisfy. Because of hardware limitations and a lack of control over suboptimal conditions in the environment, casual photos and videos suffer from noise, lack of sharpness, unflattering lighting, wrong focus, distracting obstructions, etc. The desire is to make cameras see as the human visual system does: to understand the world and produce photographs that are perceptually pleasing and meaningful. Professional studio photography and cinematography come closest to delivering high-quality photos and videos, by incorporating intricate hardware and gathering a professional crew. Casual imaging, on the other hand, is still nowhere close.

In this thesis, I argue that it is key for a camera to understand the semantics of the scene -- the context -- presented in its viewfinder in order to intelligently capture and process sensor data. The way to bring in such contextual information is through machine learning. Thankfully, modern mobile cameras are integrated with fast image processors and even dedicated machine learning chips that drive the development of computational capabilities. Machine-learning-driven computational photography algorithms are more practical than ever before. Throughout the thesis, I discuss the challenges of casual imaging and how its quality can benefit from professional photography and cinematography principles. The thesis focuses on quality enhancement in three aspects -- perceptual quality, lighting and focus. We propose a number of learning-based methods to lift these limitations and produce unprecedented results, and show a potential direction that integrates machine learning and imaging systems to bring casual photos and videos closer to professional quality.},
}

EndNote citation:

%0 Report
%A Zhang, Xuaner
%T Bridging Machine Learning and Computational Photography to Bring Professional Quality into Casual Photos and Videos
%I EECS Department, University of California, Berkeley
%D 2020
%8 December 18
%@ UCB/EECS-2020-228
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-228.html
%F Zhang:EECS-2020-228