Lately, I have spent much of my time on Deep Learning and Neural Networks, either with customers or in our lab. One of the most common questions I get is about model training that underperforms in terms of “wall clock time”. The cause is usually a focus on only one aspect of the architecture, say GPUs. So I will spend a little time writing about the three fundamental tenets of a successful Deep Learning architecture: compute, file access, and bandwidth. Hopefully this will resonate and give those customers something to think about on their journey.
Deep Learning (DL) is certainly all the rage. I am defining DL as a type of Machine Learning (ML) built on a deep hierarchy of layers, with each layer solving a different piece of a complex problem. These layers are interconnected into a “neural network”.
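To make the "hierarchy of layers" idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not how a production framework does it: the layer sizes, the random weights, and the helper names (`dense`, `make_layer`, `forward`) are all assumptions chosen for the example, and nothing is trained.

```python
import random


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then ReLU."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU keeps only positive signals
    return outputs


def make_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights and zero biases for one layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


rng = random.Random(0)
layer_sizes = [4, 8, 8, 2]  # assumed sizes: 4 inputs, two hidden layers, 2 outputs
layers = [
    make_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
output = forward([0.5, -0.2, 0.1, 0.9], layers)
print(len(layers), "layers produced", len(output), "output values")
```

Each call to `dense` is one layer, and `forward` is the interconnection: the output of one layer becomes the input of the next. Real networks differ mainly in scale and in how the weights are learned, which is where compute, file access, and bandwidth start to matter.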
The use cases that I am presented with continue to grow rapidly, with very compelling financial returns on investment. Whether it is Convolutional Neural Networks (CNNs) for Computer Vision, Recurrent Neural Networks (RNNs) for Natural Language Processing (NLP), or Deep Belief Networks (DBNs) built from stacked Restricted Boltzmann Machines (RBMs), Deep Learning has many architectural structures and acronyms. There is some great Neural Network information out there. Pic 1 is a good representation of the structural layers for Deep Learning on Neural Networks:
Orchestration tools like BlueData, Kubernetes, Mesosphere, or Spark Cluster Manager are the top of the layer cake of …