H2O: Holistic Hyper-Parameter Optimization for Large-Scale Deep Neural Network Training