I am learning to use topic modeling for document clustering. I would like to check whether my understanding of the relationship between latent Dirichlet allocation (LDA) and the generic task of document clustering is correct.
LDA outputs the topic proportions for each document. This is not itself a document clustering. However, we can treat these topic proportions as a feature representation of each document. We can then apply an established clustering method, such as K-means, to cluster the documents on the features produced by LDA.
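To make the second step concrete, here is a minimal sketch of K-means run on document-topic proportions. It assumes the LDA step has already been done: the `doc_topic` matrix is made-up illustrative data, not output from a real model, and the `kmeans` and `farthest_first_init` helpers are hypothetical names written for this example. In practice one would more likely use a library implementation of both LDA and K-means.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from its nearest already-chosen centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical per-document topic proportions (each row sums to 1),
# standing in for what a fitted LDA model would return.
doc_topic = [
    [0.90, 0.05, 0.05],
    [0.80, 0.15, 0.05],
    [0.10, 0.85, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.10, 0.85],
    [0.10, 0.05, 0.85],
]

labels, centroids = kmeans(doc_topic, k=3)
print(labels)
```

The number of clusters `k` need not equal the number of topics; they coincide here only because the toy data was built that way.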
The best metric we found for computing the semantic similarity of topics was a pairwise topic coherence, using the coherence metric from "Automatic Evaluation of Topic Coherence," by Newman et al., NAACL 2010.
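As a rough illustration of what a coherence score over word pairs looks like, the sketch below averages pointwise mutual information (PMI) over all pairs in a word list. It is a simplified stand-in, not the procedure from the cited paper: co-occurrence is counted at the document level on a tiny made-up corpus, the `eps` smoothing constant is an arbitrary choice, and `pmi_coherence` is a hypothetical helper name.

```python
import math
from itertools import combinations


def pmi_coherence(words, documents, eps=1e-12):
    """Mean PMI over all word pairs, using document-level co-occurrence.

    Assumes every word in `words` occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*ws):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in ws) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(words, 2):
        joint = prob(w1, w2)
        # eps keeps the log finite when two words never co-occur.
        scores.append(math.log((joint + eps) / (prob(w1) * prob(w2) + eps)))
    return sum(scores) / len(scores)


# Toy corpus, for illustration only.
corpus = [
    ["cat", "dog", "pet"],
    ["cat", "dog"],
    ["stock", "market"],
    ["stock", "market", "trade"],
]

coherent = pmi_coherence(["cat", "dog"], corpus)
mixed = pmi_coherence(["cat", "stock"], corpus)
print(coherent > mixed)
```

One way to turn this into a similarity between two topics, again as an assumption rather than the method described above, is to score the combined top words of both topics: words that rarely co-occur pull the score down.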