Scalable and Practical Natural Gradient for Large-Scale Deep Learning


Large-scale distributed training of deep neural networks produces models with worse generalization performance because the effective mini-batch size increases. Previous methods have attempted to solve this issue by varying the learning rate and batch size across epochs and layers, or by making ad hoc modifications to batch normalization. We propose scalable and practical natural gradient descent (SP-NGD), a principled approach that trains models to generalization performance similar to that of models trained with first-order optimization methods, but with accelerated convergence. In addition, SP-NGD scales to large mini-batch sizes with only negligible computational overhead compared with first-order methods. We evaluated SP-NGD on a benchmark task for which highly optimized first-order methods are available as references: training a ResNet-50 model for image classification on ImageNet. We show convergence to a top-1 validation accuracy of 75.4% in 5.5 minutes using a mini-batch size of 32,768 and 1,024 GPUs, and to an accuracy of 74.9% using an extremely large mini-batch size of 131,072 in 873 steps of SP-NGD.
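For readers unfamiliar with natural gradient descent, the sketch below shows the basic update that the abstract builds on: the gradient is preconditioned by the inverse of a damped Fisher information matrix before the step is taken. It is not SP-NGD itself, whose approximations the abstract does not describe. It uses the exact Fisher matrix of a toy two-parameter logistic regression, and the function names, data, learning rate and damping value are all illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_grad_fisher(w, xs, ys):
    """Mean negative log-likelihood, its gradient, and the Fisher matrix
    for a 2-parameter logistic regression (toy model, assumed here)."""
    n = len(xs)
    loss = 0.0
    g = [0.0, 0.0]
    F = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        p = sigmoid(w[0] * x[0] + w[1] * x[1])
        loss -= (y * math.log(p) + (1 - y) * math.log(1 - p)) / n
        for i in range(2):
            g[i] += (p - y) * x[i] / n
            for j in range(2):
                # Fisher information of a Bernoulli output: p(1-p) x x^T
                F[i][j] += p * (1 - p) * x[i] * x[j] / n
    return loss, g, F

def solve_2x2(A, b):
    """Solve A d = b for a 2x2 matrix A using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def ngd_step(w, xs, ys, lr=1.0, damping=1e-3):
    """One natural gradient step: w <- w - lr * (F + damping*I)^-1 g."""
    _, g, F = loss_grad_fisher(w, xs, ys)
    F_damped = [[F[0][0] + damping, F[0][1]],
                [F[1][0], F[1][1] + damping]]
    d = solve_2x2(F_damped, g)
    return [w[0] - lr * d[0], w[1] - lr * d[1]]

# Toy, non-separable data: first feature is a constant bias term.
xs = [(1.0, 0.5), (1.0, -1.0), (1.0, 2.0), (1.0, -0.5), (1.0, 1.5), (1.0, 0.0)]
ys = [1, 0, 1, 0, 0, 1]

w = [0.0, 0.0]
losses = []
for _ in range(5):
    losses.append(loss_grad_fisher(w, xs, ys)[0])
    w = ngd_step(w, xs, ys)
final_loss = loss_grad_fisher(w, xs, ys)[0]
```

Forming and inverting the Fisher matrix exactly, as done here, is only feasible for tiny models; making that step cheap enough for networks such as ResNet-50 is the scalability problem the abstract refers to.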


