Other - Colloquium on Artificial Intelligence Research and Optimization
SLIDE: Commodity Hardware is All You Need for Large-Scale Deep Learning
Anshumali Shrivastava, Rice University
Virtual (Zoom), details TBD
March 17, 2021 - 01:00 pm
Current Deep Learning (DL) architectures are growing larger to learn from complex datasets. The trends show that the only sure-shot way of surpassing prior accuracy is to increase the model size, supplement it with more data, and follow up with aggressive fine-tuning. However, training and tuning astronomically sized models is time-consuming and stalls progress in AI. As a result, industries are increasingly investing in specialized hardware and deep learning accelerators like GPUs to scale up the process. It is taken for granted that a commodity CPU is incapable of outperforming powerful accelerators such as V100 GPUs in a head-to-head comparison of training large DL models. However, GPUs come with additional concerns: they require expensive infrastructural changes, are hard to virtualize, and have main memory limitations.
Anshumali Shrivastava is an assistant professor in the computer science department at Rice University. His broad research interests include randomized algorithms for large-scale machine learning. In 2018, Science News named him one of the Top-10 scientists under 40 to watch. He is a recipient of the National Science Foundation CAREER Award, a Young Investigator Award from the Air Force Office of Scientific Research, and a machine learning research award from Amazon. He has won numerous paper awards, including the Best Paper Award at NIPS 2014 and the Most Reproducible Paper Award at SIGMOD 2019. IEEE Spectrum describes his work on scaling up deep learning as,