PROJECT TITLE:
Probabilistic Word Selection via Topic Modeling
We propose selective supervised Latent Dirichlet Allocation (ssLDA) to improve the prediction performance of widely studied supervised probabilistic topic models. We introduce a Bernoulli distribution for each word in a given document that selects the word as either strongly or weakly discriminative with respect to its assigned topic. The Bernoulli distribution is parameterized by the discrimination power of the word for that topic. The document is thus represented as a “bag of selective words”, rather than the probabilistic “bag of topics” used in topic modeling or the flat “bag of words” used in traditional natural language processing, offering a new perspective. Inheriting the general framework of supervised LDA (sLDA), ssLDA can also predict several types of response specified by a Generalized Linear Model (GLM). Focusing in this paper on the use of this word-selection mechanism for single-label document classification, we conduct variational inference to approximate the intractable posterior and derive maximum-likelihood estimates of the parameters of ssLDA. Experiments on textual documents show that ssLDA not only performs competitively against state-of-the-art classification approaches based on both the flat “bag of words” and the probabilistic “bag of topics” representations, but also is able to discover the discrimination power of the words within topics (consistent with our prior knowledge).
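The core idea above, drawing a per-word Bernoulli selection variable whose parameter reflects the word's discrimination power for its assigned topic, can be sketched as follows. This is a toy illustration only, not the paper's inference algorithm: the topic proportions `theta`, topic-word distributions `phi`, and the discrimination-power proxy are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): V vocabulary words, K topics.
V, K = 6, 2

# theta: the document's topic proportions; phi: per-topic word distributions.
theta = np.array([0.7, 0.3])
phi = rng.dirichlet(np.ones(V), size=K)          # shape (K, V)

def discrimination_power(word, topic):
    """A simple proxy for discrimination power: how much more likely the
    word is under its assigned topic than on average across all topics.
    This is an illustrative choice, not the parameterization in ssLDA."""
    return phi[topic, word] / phi[:, word].mean()

def select_words(doc):
    """Represent a document as a 'bag of selective words': each word gets a
    topic assignment, then a Bernoulli draw (parameterized by the word's
    discrimination power for that topic) decides whether it is kept."""
    selected = []
    for word in doc:
        topic = rng.choice(K, p=theta)           # assign a topic to the word
        p = min(1.0, discrimination_power(word, topic) / K)  # in [0, 1]
        if rng.random() < p:                     # Bernoulli selection draw
            selected.append(word)
    return selected

doc = [0, 1, 2, 3, 4, 5, 0, 1]
print(select_words(doc))
```

Words that are strongly discriminative for their assigned topic survive the selection with high probability, while weakly discriminative words tend to be dropped, which is what makes the resulting representation "selective".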