Improving Speech Emotion Recognition With Adversarial Data Augmentation Network


When training data are scarce, it is difficult to train a deep neural network without overfitting. This article proposes a new data augmentation network, the adversarial data augmentation network (ADAN), which is based on generative adversarial networks (GANs). The ADAN consists of three components: a GAN, an autoencoder, and an auxiliary classifier. These networks are trained adversarially to synthesize class-dependent feature vectors in both the latent space and the original feature space, and the synthetic vectors can then be added to the real training data used to train classifiers. To produce high-quality synthetic samples, adversarial training uses the Wasserstein divergence instead of the conventional cross-entropy loss. The proposed networks were applied to speech emotion recognition, with EmoDB and IEMOCAP as the evaluation data sets. Forcing the synthetic latent vectors and the real latent vectors to share a common representation was found to alleviate the gradient vanishing problem significantly. The results also show that the augmented data produced by the proposed networks carry rich emotion information. As a result, the emotion classifiers trained on these data are competitive with state-of-the-art speech emotion recognition systems.
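To make the Wasserstein-divergence idea concrete, the following is a minimal sketch of a critic loss of that kind, not the ADAN implementation. Everything in it is an assumption made for illustration: the toy critic D(x) = tanh(w . x) stands in for a neural network so that its gradient can be written analytically, the function names are hypothetical, and the defaults k = 2 and p = 6 are values commonly used with this loss rather than settings reported in the abstract. Sign conventions also differ between write-ups; here the critic is rewarded for scoring real samples above synthetic ones.

```python
import math
import random

def critic(w, x):
    """Toy critic D(x) = tanh(w . x); a stand-in for a neural network."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

def critic_grad_norm(w, x):
    """Euclidean norm of dD/dx for the toy critic, derived analytically."""
    slope = 1.0 - critic(w, x) ** 2  # derivative of tanh at w . x
    return abs(slope) * math.sqrt(sum(wi * wi for wi in w))

def wdiv_critic_loss(w, real, fake, k=2.0, p=6.0, rng=random):
    """Assumed form: E[D(fake)] - E[D(real)] + k * E[||grad D(x_hat)||^p].

    x_hat is a random point on the segment between a real sample and a
    synthetic sample; the penalty keeps the critic's gradients small.
    """
    n = min(len(real), len(fake))
    mean_real = sum(critic(w, x) for x in real[:n]) / n
    mean_fake = sum(critic(w, x) for x in fake[:n]) / n
    penalty = 0.0
    for x_real, x_fake in zip(real[:n], fake[:n]):
        t = rng.random()
        x_hat = [t * a + (1.0 - t) * b for a, b in zip(x_real, x_fake)]
        penalty += critic_grad_norm(w, x_hat) ** p
    return mean_fake - mean_real + k * penalty / n

if __name__ == "__main__":
    rng = random.Random(0)
    real = [[rng.gauss(1.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(8)]
    fake = [[rng.gauss(-1.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(8)]
    print(wdiv_critic_loss([0.5, 0.0], real, fake, rng=rng))
```

Unlike a cross-entropy discriminator, this critic has no sigmoid output that can saturate, and the gradient term penalizes large critic gradients directly, which is one reason losses in this family are often preferred when discriminator gradients tend to vanish.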


