Faculty Publications - Kurt Keutzer

Books

  • D. Chinnery and K. Keutzer, Closing the power gap between ASIC & custom: tools and techniques for low power design, Springer Science & Business Media, 2008.
  • D. Chinnery and K. Keutzer, Closing the Gap Between ASIC & Custom: Tools and Techniques for Low Power ASIC Design, New York, NY: Springer Business+Media, LLC, 2007.
  • M. Gries and K. W. Keutzer, Eds., Building ASIPs: The MESCAL Methodology, New York: Springer, 2005.
  • P. Chen, D. A. Kirkpatrick, and K. W. Keutzer, Static Crosstalk-Noise Analysis: For Deep Sub-Micron Digital Designs, Norwell, MA: Kluwer Academic Publishers, 2004.
  • D. Chinnery and K. Keutzer, Closing the Gap Between ASIC & Custom: Tools and Techniques for High-Performance ASIC Design, Boston, MA: Kluwer Academic Publishers, 2002.

Book chapters or sections

  • M. Anderson, B. Catanzaro, J. Chong, E. Gonina, K. Keutzer, C. Lai, M. W. Moskewicz, M. Murphy, and B. Su, "PALLAS: Mapping Applications onto Manycore," in Multiprocessor System-on-Chip: Hardware Design and Tool Integration, Springer, 2010, pp. 89-114.
  • K. Keutzer and K. Ravindran, "Technology mapping," in Encyclopedia of Algorithms, M. Y. Kao, Ed., Springer Reference, Berlin, Germany: Springer, 2008, pp. 944-946.

Articles in journals or magazines

Articles in conference proceedings

  • S. Shen, Z. Yao, A. Gholami, M. Mahoney, and K. Keutzer, "PowerNorm: Rethinking batch normalization in transformers," in International Conference on Machine Learning, 2020, pp. 8741-8751.
  • Y. Cai, Z. Yao, Z. Dong, A. Gholami, M. W. Mahoney, and K. Keutzer, "ZeroQ: A novel zero shot quantization framework," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 13169-13178.
  • Y. You, J. Li, S. Reddi, J. Hseu, S. Kumar, S. Bhojanapalli, X. Song, J. Demmel, K. Keutzer, and C. Hsieh, "Large batch optimization for deep learning: Training BERT in 76 minutes," in International Conference on Learning Representations, 2020.
  • P. Jain, A. Jain, A. Nrusimha, A. Gholami, P. Abbeel, K. Keutzer, I. Stoica, and J. Gonzalez, "Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization," in Proceedings of Machine Learning and Systems 2020, Machine Learning and Systems, 2020, pp. 497-511.
  • S. Shen, Z. Dong, J. Ye, L. Ma, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer, "Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT," in AAAI, 2020, pp. 8815-8821.
  • S. Zhao, B. Li, X. Yue, Y. Gu, P. Xu, R. Hu, H. Chai, and K. Keutzer, "Multi-source domain adaptation for semantic segmentation," in Advances in Neural Information Processing Systems, 2019, pp. 7287-7300.
  • X. Yue, Y. Zhang, S. Zhao, A. L. Sangiovanni-Vincentelli, K. Keutzer, and B. Gong, "Domain randomization and pyramid consistency: Simulation-to-real generalization without accessing target domain data," in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 2100-2110.
  • Z. Dong, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer, "HAWQ: Hessian aware quantization of neural networks with mixed-precision," in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 293-302.
  • Z. Yao, A. Gholami, P. Xu, K. Keutzer, and M. W. Mahoney, "Trust region based adversarial attack on neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 11350-11359.
  • B. Wu, X. Dai, P. Zhang, Y. Wang, F. Sun, Y. Wu, Y. Tian, P. Vajda, Y. Jia, and K. Keutzer, "FBNet: Hardware-aware efficient convnet design via differentiable neural architecture search," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 10734-10742.
  • B. Wu, X. Zhou, S. Zhao, X. Yue, and K. Keutzer, "SqueezeSegV2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a LiDAR point cloud," in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 4376-4382.
  • A. Gholami, K. Kwon, B. Wu, Z. Tai, X. Yue, P. Jin, S. Zhao, and K. Keutzer, "SqueezeNext: Hardware-aware neural network design," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018, pp. 1638-1647.
  • Y. You, Z. Zhang, C. Hsieh, J. Demmel, and K. Keutzer, "ImageNet training in minutes," in Proceedings of the 47th International Conference on Parallel Processing, 2018, pp. 1-10.
  • A. Gholami, A. Azad, P. Jin, K. Keutzer, and A. Buluç, "Integrated Model, Batch, and Domain Parallelism in Training Neural Networks," in SPAA'18: 30th ACM Symposium on Parallelism in Algorithms and Architectures, 2018.
  • P. Jin, K. Keutzer, and S. Levine, "Regret minimization for partially observable deep reinforcement learning," in International Conference on Machine Learning, 2018, pp. 2342-2351.
  • S. Zhao, G. Ding, Q. Huang, T. Chua, B. W. Schuller, and K. Keutzer, "Affective Image Content Analysis: A Comprehensive Survey," in IJCAI, 2018, pp. 5534-5541.
  • A. Gholami, A. Azad, P. Jin, K. Keutzer, and A. Buluç, "Integrated model, batch, and domain parallelism in training neural networks," in Proceedings of the 30th Symposium on Parallelism in Algorithms and Architectures, 2018, pp. 77-86.
  • B. Wu, A. Wan, X. Yue, P. Jin, S. Zhao, N. Golmant, A. Gholaminejad, J. Gonzalez, and K. Keutzer, "Shift: A zero FLOP, zero parameter alternative to spatial convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9127-9135.
  • B. Wu, A. Wan, X. Yue, and K. Keutzer, "SqueezeSeg: Convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud," in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 1887-1893.
  • F. Iandola and K. Keutzer, "Small neural nets are beautiful: enabling embedded systems with small deep-neural-network architectures," in 2017 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2017, pp. 1-10.
  • B. Wu, F. Iandola, P. H. Jin, and K. Keutzer, "SqueezeDet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 129-137.
  • F. N. Iandola, M. W. Moskewicz, K. Ashraf, and K. Keutzer, "FireCaffe: near-linear acceleration of deep neural network training on compute clusters," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2592-2600.
  • E. Gonina, G. Friedland, H. Cook, and K. Keutzer, "Fast Speaker Diarization using a High-Level Scripting Language," in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, ASRU'11, 2011.
  • J. Chong, E. Gonina, K. You, and K. Keutzer, "Exploring Recognition Network Representations for Efficient Speech Inference on Highly Parallel Platforms," in Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010, pp. 1489-1492.
  • N. Sundaram, T. Brox, and K. Keutzer, "Dense point trajectories by GPU-accelerated large displacement optical flow," in European Conference on Computer Vision, 2010, pp. 438-451.
  • D. Kolossa, J. Chong, S. Zeiler, and K. Keutzer, "Efficient Manycore CHMM Speech Recognition for Audiovisual and Multistream Data," in Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010, pp. 2698-2701.
  • J. Chong, E. Gonina, and K. Keutzer, "Monte Carlo Methods," in 2nd Annual Conference on Parallel Programming Patterns (ParaPLoP'10), 2010.
  • M. Dixon, J. Chong, and K. Keutzer, "Acceleration of Market Value-at-Risk Estimation," in Proceedings of the 2nd Workshop on High Performance Computational Finance, WHPCF '09, New York, NY, USA: ACM, 2009, pp. 5:1-5:8.
  • K. You, J. Chong, Y. Yi, E. Gonina, C. Hughes, Y. Chen, W. Sung, and K. Keutzer, "Parallel Scalability in Speech Recognition: Inference engine in large vocabulary continuous speech recognition," in IEEE Signal Processing Magazine, Vol. 26, 2009, pp. 124-135.
  • J. Chong, E. Gonina, Y. Yi, and K. Keutzer, "A Fully Data Parallel WFST-based Large Vocabulary Continuous Speech Recognition on a Graphics Processing Unit," in Proceedings of the 10th Annual Conference of the International Speech Communication Association (InterSpeech), 2009, pp. 1183–1186.
  • B. C. Catanzaro, S. A. Kamil, Y. Lee, K. Asanović, J. Demmel, K. Keutzer, J. Shalf, K. A. Yelick, and A. Fox, "SEJITS: Getting productivity and performance with selective embedded JIT specialization," in Proceedings First Workshop on Programming Models for Emerging Architectures, 2009.
  • B. Catanzaro, N. Sundaram, and K. Keutzer, "Fast support vector machine training and classification on graphics processors," in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 104-111.
  • B. C. Catanzaro, N. Sundaram, and K. Keutzer, "Fast support vector machine training and classification on graphics processors," in Proc. 25th Intl. Conf. on Machine Learning (ICML 2008), A. McCallum and S. Roweis, Eds., ACM International Conference Proceeding Series, Vol. 307, New York, NY: The Association for Computing Machinery, Inc., 2008, pp. 104-111.
  • J. Chong, Y. Yi, A. Faria, N. Satish, and K. Keutzer, "Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors," in Proceedings of the 1st Annual Workshop on Emerging Applications and Many Core Architecture, 2008, pp. 23-35.
  • B. Catanzaro, N. Sundaram, and K. Keutzer, "Fast support vector machine training and classification on graphics processors," in ICML '08: Proceedings of the 25th International Conference on Machine Learning, New York, NY, USA: ACM, 2008, pp. 104-111.
  • B. Catanzaro, K. Keutzer, and B. Y. Su, "Parallelizing CAD: A timely research agenda for EDA," in Proc. 45th ACM/IEEE Design Automation Conf. (DAC 2008), New York, NY: The Association for Computing Machinery, Inc., 2008, pp. 12-17.
  • S. Sapatnekar, E. Haritan, K. Keutzer, A. Devgan, D. A. Kirkpatrick, S. Meier, D. Pryor, and T. Spyrou, "Reinventing EDA with manycore processors," in Proc. 45th ACM/IEEE Design Automation Conf. (DAC 2008), New York, NY: The Association for Computing Machinery, Inc., 2008, pp. 126-127.
  • J. Chong, Y. Yi, A. Faria, N. Satish, and K. Keutzer, "Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors," in Proceedings of the 1st Annual Workshop on Emerging Applications and Many Core Architecture (EAMA), 2008, pp. 23-35.
  • J. Chong, N. R. Satish, B. C. Catanzaro, K. Ravindran, and K. Keutzer, "Efficient parallelization of H.264 decoding with macro block level scheduling," in Proc. 2007 Intl. Conf. on Multimedia and Expo (ICME 2007), Piscataway, NJ: IEEE Press, 2007, pp. 1874-1877.
  • F. Bacchini, G. Spirakis, J. A. Carballo, A. de Geus, F. C. Hsu, K. Keutzer, and K. Yamada, "Megatrends and EDA 2017 (Panel Session)," in Proc. 44th Design Automation Conf. (DAC 2007), New York, NY: The Association for Computing Machinery, Inc., 2007, pp. 21-22.
  • N. R. Satish, K. Ravindran, and K. Keutzer, "A decomposition-based constraint optimization approach for statically scheduling task graphs with communication delays to multiprocessors," in Proc. 10th Design, Automation and Test in Europe Conf. and Exhibition (DATE 2007), R. Lauwereins and J. Madsen, Eds., San Jose, CA: EDA Consortium, 2007, pp. 57-62.
  • Y. Jin, N. R. Satish, K. Ravindran, and K. Keutzer, "An Automated Exploration Framework for FPGA-based Soft Multiprocessor Systems," in Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES), ACM Press, 2005, pp. 273-278.
  • K. Ravindran, N. R. Satish, Y. Jin, and K. Keutzer, "An FPGA-based Soft Multiprocessor for IPv4 Packet Forwarding," in 15th International Conference on Field Programmable Logic and Applications (FPL), 2005, pp. 487-492.
  • K. Keutzer, S. Malik, and A. R. Newton, "From ASIC to ASIP: The next design discontinuity," in Proc. 2002 IEEE Conf. on Computer Design, Los Alamitos, CA: IEEE Computer Society Press, 2002, pp. 84-90.
  • K. Keutzer and A. R. Newton, "The MARCO/DARPA Gigascale Silicon Research Center (Plenary Talk)," in Proc. 1999 Intl. Conf. on Computer Design (ICCD '99), Los Alamitos, CA: IEEE Computer Society Press, 1999, pp. 14-19.
  • D. Sylvester and K. Keutzer, "Getting to the bottom of deep submicron," in 1998 IEEE/ACM Intl. Conf. on Computer-Aided Design. Digest of Technical Papers, New York, NY: ACM, 1998, pp. 203-211.
  • K. Keutzer, A. R. Newton, and N. V. Shenoy, "The future of logic synthesis and physical design in deep-submicron process geometries," in Proc. 1997 Intl. Symp. on Physical Design, New York, NY: ACM, Inc., 1997, pp. 218-224.
  • A. Ghosh, S. Devadas, K. Keutzer, and J. White, "Estimation of average switching activity in combinational and sequential circuits," in Proc. 29th ACM/IEEE Conf. on Design Automation, Los Alamitos, CA: IEEE Computer Society Press, 1992, pp. 253-259.
  • K. Keutzer, "DAGON: Technology binding and local optimization by DAG matching," in Proc. 24th ACM/IEEE Conf. on Design Automation, Piscataway, NJ: IEEE, 1987, pp. 341-347.

Technical Reports

Talks or presentations

  • J. Cong, K. Keutzer, and G. Martin, "High-level CAD and architecture (Invited Talk)," presented at Pre-Conference Workshop on Grand Challenges in FPGA Research: 15th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2007), Monterey, CA, Feb. 2007.

Ph.D. Theses

Masters Reports