Improve Model Inference Cost with Image Gridding
Shreyas Krishnaswamy
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2023-182
May 19, 2023
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-182.pdf
The overwhelming success of AI has spurred the rise of Machine Learning as a Service (MLaaS), where companies develop, maintain, and serve general-purpose models such as object detectors and image classifiers for users who pay a fixed rate per inference. As more organizations incorporate AI technologies into their operations, the MLaaS market is set to expand, necessitating cost optimization for these services, particularly in high-volume applications. We explore how a simple yet effective method of increasing model efficiency, aggregating multiple images into a grid before inference, can significantly reduce the number of inferences required to process a batch of images, at the cost of varying drops in accuracy. To counter the decrease in object detection accuracy, we introduce ImGrid, a technique that decides when to reprocess gridded images at a higher resolution based on model confidence and bounding box area. Experiments on open-source and commercial models show that ImGrid reduces inferences by 50% while maintaining minimal impact on mean Average Precision (mAP) for the Pascal VOC object detection task.
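The report page itself contains no code, but the abstract describes two concrete mechanisms: tiling several images into one grid so a batch can be handled with a single inference, and selectively re-running tiles at full resolution when detections look unreliable. The sketch below is a minimal illustration of that idea, not the report's implementation; the function names (`make_grid`, `split_detections`, `needs_reprocessing`, `run_model`) and the confidence and area thresholds are assumptions chosen for the example.

```python
import numpy as np


def make_grid(images, tile_size=(320, 320)):
    """Downscale four images and tile them into a single 2x2 grid image."""
    th, tw = tile_size
    tiles = []
    for img in images:  # each img: HxWx3 uint8 array
        h, w = img.shape[:2]
        # naive nearest-neighbor resize to keep the example dependency-free
        ys = np.linspace(0, h - 1, th).astype(int)
        xs = np.linspace(0, w - 1, tw).astype(int)
        tiles.append(img[ys][:, xs])
    top = np.concatenate(tiles[:2], axis=1)
    bottom = np.concatenate(tiles[2:], axis=1)
    return np.concatenate([top, bottom], axis=0)


def split_detections(detections, tile_size=(320, 320)):
    """Assign grid-space detections back to the original image index.

    Each detection is a dict with 'box' = (x1, y1, x2, y2) in grid pixels,
    a 'score', and a 'label'; boxes are translated into tile-local coordinates.
    """
    th, tw = tile_size
    per_image = {0: [], 1: [], 2: [], 3: []}
    for det in detections:
        x1, y1, x2, y2 = det['box']
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        col, row = int(cx >= tw), int(cy >= th)
        idx = row * 2 + col
        local = (x1 - col * tw, y1 - row * th, x2 - col * tw, y2 - row * th)
        per_image[idx].append({**det, 'box': local})
    return per_image


def needs_reprocessing(dets, conf_thresh=0.5, min_area=32 * 32):
    """Flag a tile for full-resolution re-inference if any detection is
    low-confidence or very small (small objects suffer most from the
    downscaling that gridding implies). Thresholds here are placeholders."""
    for det in dets:
        x1, y1, x2, y2 = det['box']
        if det['score'] < conf_thresh or (x2 - x1) * (y2 - y1) < min_area:
            return True
    return False


# Hypothetical usage with some detector `run_model(image) -> list of detections`:
# grid = make_grid(batch_of_four)
# grid_dets = run_model(grid)                  # one inference instead of four
# for idx, dets in split_detections(grid_dets).items():
#     if needs_reprocessing(dets):
#         dets = run_model(batch_of_four[idx])  # fall back to full resolution
```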
Advisor: Joseph Gonzalez
BibTeX citation:
@mastersthesis{Krishnaswamy:EECS-2023-182,
    Author = {Krishnaswamy, Shreyas},
    Title = {Improve Model Inference Cost with Image Gridding},
    School = {EECS Department, University of California, Berkeley},
    Year = {2023},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-182.html},
    Number = {UCB/EECS-2023-182},
    Abstract = {The overwhelming success of AI has spurred the rise of Machine Learning as a Service (MLaaS), where companies develop, maintain, and serve general-purpose models such as object detectors and image classifiers for users that pay a fixed rate per inference. As more organizations incorporate AI technologies into their operations, the MLaaS market is set to expand, necessitating cost optimization for these services, particularly in high-volume applications. We explore how a simple yet effective method of increasing model efficiency, aggregating multiple images into a grid before inference, can significantly reduce the required number of inferences for processing a batch of images with varying drops in accuracy. To counter the slight decrease in object detection accuracy, we introduce ImGrid, an innovative technique that decides when to reprocess gridded images at a higher resolution based on model confidence and bounding box area assessments. Experiments on open-source and commercial models show that ImGrid reduces inferences by 50%, while maintaining low impact on mean Average Precision (mAP) for the Pascal VOC object detection task.}
}
EndNote citation:
%0 Thesis
%A Krishnaswamy, Shreyas
%T Improve Model Inference Cost with Image Gridding
%I EECS Department, University of California, Berkeley
%D 2023
%8 May 19
%@ UCB/EECS-2023-182
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2023/EECS-2023-182.html
%F Krishnaswamy:EECS-2023-182