Artificial Intelligence (AI) is transforming our lives much as the Internet and cellular phones did. AI is revolutionizing healthcare through the analysis of complex medical data, making self-driving cars a reality, and beating humans at strategy games such as Go. However, training the underlying neural networks takes thousands of CPUs and GPUs and many weeks. Over the last six years, the compute power this requires has doubled every 3.5 months. Traditional CPUs, GPUs, and even neuromorphic electronics (IBM TrueNorth and Google TPU) have improved both the energy efficiency and the speed of learning (inference) tasks. However, electronic architectures face fundamental limits as Moore's law slows down. Furthermore, moving data electronically over metal wires is fundamentally limited in bandwidth and energy efficiency, which remains a critical challenge for deep learning hardware accelerators.