

A stochastic optimal control framework for quantifying and reducing uncertainties in deep learning


Sponsor: U.S. Department of Energy, Advanced Scientific Computing Research, Applied Mathematics

Project description:
 

We propose to develop a stochastic optimal control framework for quantifying and reducing uncertainties in deep learning by exploiting the connection between probabilistic network architectures and the optimal control of stochastic dynamical systems. Although neural networks have achieved impressive results in many machine learning tasks, current network models often produce unrealistic decisions because existing uncertainty quantification (UQ) methods are computationally intractable for measuring the uncertainties of very deep networks. As UQ is increasingly important to the safe use of deep learning in decision making for scientific applications, the computing capability developed in this effort will significantly advance the reliability of machine-learning-assisted scientific predictions for DOE applications.

Funding period: 2019–2021


Accomplishments

Accelerating Reinforcement Learning with a Directional-Gaussian-Smoothing Evolution Strategy

Evolution strategies (ES) have shown great promise in many challenging reinforcement learning (RL) tasks, rivaling other state-of-the-art deep RL methods. Yet there are two limitations in current ES practice that may hinder its further capabilities. First, most existing methods rely on Monte Carlo gradient estimators to suggest a search direction, with the policy parameters perturbed by random sampling. Due to the low accuracy of such estimators, RL training can converge slowly and require many iterations to reach an optimal solution. Second, the landscape of the reward function can be deceptive and contain many local maxima, causing ES algorithms to converge prematurely and leaving them unable to explore other parts of the parameter space with potentially greater rewards. In this work, we employ a Directional Gaussian Smoothing Evolution Strategy (DGS-ES) to accelerate RL training; it is well suited to address these two challenges through its ability to (i) provide gradient estimates with high accuracy, and (ii) find nonlocal search directions that emphasize large-scale variation of the reward function and disregard local fluctuations. Through several benchmark RL tasks demonstrated herein, we show that DGS-ES is highly scalable, achieves superior wall-clock time, and attains reward scores competitive with popular policy-gradient and ES approaches.
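To make the first limitation concrete, below is a minimal sketch of the Monte Carlo gradient estimator that standard ES methods use (the objective name `episode_return`, standing for a full policy rollout, and all hyperparameter values are illustrative assumptions, not taken from the paper). The estimator averages randomly perturbed returns, and its O(1/sqrt(N)) sampling error is the accuracy bottleneck that DGS-ES is designed to overcome:

```python
import numpy as np

def mc_es_gradient(episode_return, theta, sigma=0.1, n_samples=32, rng=None):
    """Standard Monte Carlo ES gradient of the Gaussian-smoothed return.

    Perturbs the policy parameters theta with isotropic Gaussian noise
    and averages noise-weighted returns; the estimate's error decays
    only like 1/sqrt(n_samples), which slows RL training.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((n_samples, theta.size))
    returns = np.array([episode_return(theta + sigma * e) for e in eps])
    return (eps * returns[:, None]).mean(axis=0) / sigma
```

DGS-ES replaces this random sampling with deterministic Gauss-Hermite quadrature along a set of orthonormal directions, with a large smoothing radius supplying the nonlocal view of the reward landscape (a sketch of that estimator appears in the next accomplishment).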

Figure: The Hopper control problem in OpenAI Gym, where the goal is to make a one-legged robot hop forward as fast as possible. (Left) The untrained agent cannot make the robotic hopper jump forward. (Middle) The DGS-ES trained agent makes the hopper jump forward quickly. (Right) Comparison of our DGS-ES method with six baseline methods on the Hopper control problem; the agent trained by our method achieves the highest reward (the black curve).

Publication: Jiaxin Zhang, Hoang Tran, and Guannan Zhang, Accelerating Reinforcement Learning with a Directional Gaussian Smoothing Evolution Strategy, arXiv:2002.09077 (https://arxiv.org/abs/2002.09077).

Other Support: ORNL AI Initiative (https://www.ornl.gov/ai-initiative).



A Scalable Evolution Strategy for High-Dimensional Blackbox Optimization

We developed an Evolution Strategy with Directional Gaussian Smoothing (DGS-ES) that exploits nonlocal searching to maximize or minimize high-dimensional non-convex blackbox functions. The main contributions of this effort are: (i) development of a new DGS-gradient operator and its Gauss-Hermite estimator, which introduces, for the first time, an accurate nonlocal searching technique into the Evolution Strategy (ES) family; (ii) theoretical analysis verifying that the scalability of the DGS-ES method, i.e., the number of iterations needed for convergence, is independent of the dimension for convex functions; (iii) demonstration of the DGS-ES method on high-dimensional non-convex benchmark optimization problems, as well as on a real-world material design problem for rocket-shell manufacture; and (iv) massive parallelization: the DGS-ES method scales readily to a large number of parallel workers, since all function evaluations within each iteration can be executed fully in parallel and each worker returns only a scalar to the master, keeping the communication cost minimal.
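As a rough illustration of contributions (i) and (iv), the sketch below estimates the DGS gradient with Gauss-Hermite quadrature along a fixed orthonormal basis (here simply the coordinate axes; the basis, smoothing radius, and number of quadrature points are illustrative assumptions). Each one-dimensional smoothed directional derivative uses a handful of deterministic quadrature nodes rather than random samples, and every function evaluation within an iteration is independent, which is what permits the massive parallelization described above:

```python
import numpy as np

def dgs_gradient(f, x, sigma=1.0, n_quad=5):
    """Directional-Gaussian-Smoothing gradient via Gauss-Hermite quadrature.

    For each direction xi_i of an orthonormal basis, the derivative at
    t = 0 of the 1-D Gaussian-smoothed slice t -> E[f(x + (t + sigma*u) xi_i)]
    is approximated with n_quad Gauss-Hermite nodes.
    """
    d = x.size
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    basis = np.eye(d)                      # orthonormal search directions
    dgs_grad = np.zeros(d)
    for i in range(d):
        # the d * n_quad evaluations below are independent -> parallelizable
        vals = np.array([f(x + np.sqrt(2.0) * sigma * v * basis[i])
                         for v in nodes])
        dgs_grad[i] = (np.sqrt(2.0) / (sigma * np.sqrt(np.pi))) \
            * np.sum(weights * nodes * vals)
    return basis.T @ dgs_grad              # assemble in original coordinates
```

A large sigma makes the estimator respond to large-scale trends of f rather than local fluctuations, which is the nonlocal search behavior illustrated in the figure below.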

Figure: (Left) Illustration of the nonlocal search: the DGS gradient (red arrow) points toward the global minimum (black star), while the local gradient (blue arrow) points in a wrong direction. (Right) The number of iterations needed for convergence is independent of the dimension for a highly non-convex Rastrigin function from 10D to 1000D.

Publication: Jiaxin Zhang, Hoang Tran, Dan Lu, and Guannan Zhang, A Scalable Evolution Strategy with Directional Gaussian Smoothing for Blackbox Optimization, arXiv:2002.03001 (https://arxiv.org/abs/2002.03001).

Other Support: ORNL AI Initiative (https://www.ornl.gov/ai-initiative).



Learning nonlinear level sets for dimensionality reduction in function approximation

We developed a Nonlinear Level-set Learning (NLL) method for dimensionality reduction in high-dimensional function approximation with small data. This work is motivated by a variety of design tasks in real-world engineering applications, where practitioners replace their computationally intensive physical models (e.g., high-resolution fluid simulators) with fast-to-evaluate predictive machine learning models to accelerate the engineering design process. There are two major challenges in constructing such predictive models: (a) high-dimensional inputs (e.g., many independent design parameters) and (b) small training data, generated by running extremely time-consuming simulations. Reducing the input dimension is therefore critical to alleviating the overfitting caused by data insufficiency. Existing methods, including sliced inverse regression and active subspace approaches, reduce the input dimension by learning a linear coordinate transformation; our main contribution is to extend this transformation approach to the nonlinear regime. Specifically, we exploit reversible networks (RevNets) to learn nonlinear level sets of a high-dimensional function and to parameterize those level sets in a low-dimensional space. A new loss function uses samples of the target function's gradient to encourage the transformed function to be sensitive to only a few transformed coordinates.
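The two ingredients can be sketched compactly in PyTorch (a minimal sketch: the single additive coupling block, its hidden width, and the choice of one active coordinate are illustrative assumptions, not the paper's actual architecture). The coupling block gives an invertible map z = g(x) that is exactly invertible by construction, and the loss uses samples of the target function's gradient together with Jacobian-vector products of the inverse map to suppress the transformed function's sensitivity to the inactive coordinates:

```python
import torch

class CouplingLayer(torch.nn.Module):
    """One additive coupling block: invertible by construction (RevNet-style)."""
    def __init__(self, d, hidden=32):
        super().__init__()
        self.d1 = d // 2
        self.F = torch.nn.Sequential(torch.nn.Linear(d - self.d1, hidden),
                                     torch.nn.Tanh(),
                                     torch.nn.Linear(hidden, self.d1))
        self.G = torch.nn.Sequential(torch.nn.Linear(self.d1, hidden),
                                     torch.nn.Tanh(),
                                     torch.nn.Linear(hidden, d - self.d1))

    def forward(self, x):                       # z = g(x)
        x1, x2 = x[..., :self.d1], x[..., self.d1:]
        z1 = x1 + self.F(x2)
        z2 = x2 + self.G(z1)
        return torch.cat([z1, z2], dim=-1)

    def inverse(self, z):                       # x = g^{-1}(z), exact
        z1, z2 = z[..., :self.d1], z[..., self.d1:]
        x2 = z2 - self.G(z1)
        x1 = z1 - self.F(x2)
        return torch.cat([x1, x2], dim=-1)

def sensitivity_loss(layer, x, grad_f, n_active=1):
    """Drive d(f o g^{-1})/dz_j = <d g^{-1}/dz_j, grad f(x)> toward zero
    for every inactive transformed coordinate j >= n_active."""
    z = layer(x).detach()    # evaluate sensitivities at the transformed samples
    d = x.shape[-1]
    loss = x.new_zeros(())
    for j in range(n_active, d):
        e_j = torch.zeros_like(z)
        e_j[..., j] = 1.0
        # Jacobian-vector product of the inverse map along coordinate j
        _, jvp = torch.autograd.functional.jvp(layer.inverse, z, e_j,
                                               create_graph=True)
        loss = loss + ((jvp * grad_f).sum(dim=-1) ** 2).mean()
    return loss
```

After training, the target function effectively depends on only the first few transformed coordinates, so a regression model can be fit in that low-dimensional space with far less data.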


Figure 1: Comparison of neural network regression with and without our dimensionality reduction (DR) method. (Left column) Our DR method completely avoids overfitting, achieving a relative error of 1.6%. (Middle column) Direct neural network regression suffers severe overfitting, with a 35.1% relative error. (Right column) Direct neural network regression with more hidden neurons overfits even more severely, with a 51.3% relative error.

Publication: Guannan Zhang, Jiaxin Zhang, and Jacob Hinkle, Learning nonlinear level sets for dimensionality reduction in function approximation, Advances in Neural Information Processing Systems (NeurIPS), 32, pp. 13199-13208, 2019.

Other Support: ORNL AI Initiative (https://www.ornl.gov/ai-initiative).




Scalable Machine-Learning-based Optimal Design for Additively Manufactured Materials

We developed a scalable deep-learning-based optimal design method that exploits SUMMIT to significantly accelerate the composite material design process, with up to 85% cost reduction. The new method addresses three grand challenges in optimal design: (i) a high-dimensional design space, (ii) computationally expensive multi-physics models, and (iii) non-parallelizable optimization algorithms. We address these challenges based on the observation that an optimizer walks along only a 1-D search path to find the optimum, regardless of the dimension of the design space. Our goal is therefore to construct a low-dimensional ML model that covers just the 1-D search path. To this end, we developed a sequence of local deep neural networks (DNNs), each of which covers only a segment of the search path. To further reduce the dimensionality, we designed a new sampling strategy that concentrates the training samples along the gradient-descent direction. Our algorithm was implemented on SUMMIT, where hundreds of CPUs produce training data and hundreds of GPUs train the DNN models.
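A schematic of the core loop (with the local DNN surrogates replaced by a linear least-squares model purely for brevity; the function name, sampling radius, and step sizes are illustrative assumptions): at each step, training samples are drawn only in a neighborhood of the current design, a cheap local surrogate is fit there, and the design moves along the surrogate's descent direction, so the samples trace the 1-D search path instead of filling the full design space:

```python
import numpy as np

def search_path_descent(f, x0, n_local=64, radius=0.5, step=0.2,
                        iters=50, rng=None):
    """Local-surrogate optimization along the search path.

    Each iteration fits a surrogate to samples concentrated around the
    current design (these expensive model evaluations run in parallel
    in the actual method) and steps along its descent direction.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # local samples covering only a segment of the search path
        X = x + radius * rng.standard_normal((n_local, x.size))
        y = np.array([f(xi) for xi in X])
        # fit the stand-in surrogate:  y ~ c + g . (x_i - x)
        A = np.hstack([np.ones((n_local, 1)), X - x])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        g = coef[1:]                       # surrogate gradient
        x = x - step * g / (np.linalg.norm(g) + 1e-12)
    return x
```

In the full method, each local model is a small DNN trained on GPUs while the physics model evaluations are produced on CPUs, and the gradient-concentrated sampling further shrinks the region each local model must cover.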

Figure: (Left) Illustration of the key idea: the sequence of local DNNs is trained using local samples (red dots) that cover only the search path (black line). (Right) The designed two-material composite (256x256 resolution) with optimal thermal conductivity; the design process used 192 GPUs on SUMMIT to train the DNN models.

Presentation: Sirui Bi, Jiaxin Zhang, and Guannan Zhang, Scalable Machine-Learning-based Optimal Design for Additively Manufactured Materials, ORNL AI Expo, 2019.

Other Support: ORNL AI Initiative (https://www.ornl.gov/ai-initiative).