Differences

projs:clans:docs:entity_disambiguation_xu:extractkeywords [2014/01/21 09:27]
xmill.zod
projs:clans:docs:entity_disambiguation_xu:extractkeywords [2014/01/21 09:28] (current)
xmill.zod
Line 7: Line 7:
  - A boolean value indicating whether the training process succeeded   - A boolean value indicating whether the training process succeeded
===== Detail Information ===== ===== Detail Information =====
-Select the posts in which both person A's name and the related company's stock name appear.  +  * Select the posts in which both person A's name and the related company's stock name appear.  
-Treat the selected posts as a reliable post set related to person A. Randomly pick 1000 posts in which person A's name appears; treat each of them as one document, and the whole reliable post set as one more document.  +  * Treat the selected posts as a reliable post set related to person A. Randomly pick 1000 posts in which person A's name appears; treat each of them as one document, and the whole reliable post set as one more document.  
-Then calculate the TF-IDF (term frequency - inverse document frequency) of each notional (content) word in the reliable post set document. Use the TF-IDF values as the word weights and sort the words in descending order of weight.  +  * Then calculate the TF-IDF (term frequency - inverse document frequency) of each notional (content) word in the reliable post set document. Use the TF-IDF values as the word weights and sort the words in descending order of weight.  
-Take the top 10 words as the filtering keywords for person A.+  * Take the top 10 words as the filtering keywords for person A.
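
The four steps above can be sketched as follows. This is only a minimal illustration, not the project's actual implementation: the function name `extract_keywords`, the regex tokenizer, the tiny `STOPWORDS` set (standing in for a real notional-word filter), the case-sensitive substring match on names, and the particular TF-IDF variant (relative term frequency times `log(N / df)`) are all assumptions.

```python
import math
import random
import re
from collections import Counter

# Hypothetical stand-in for a real notional-word (content-word) filter.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "in", "on", "to",
             "is", "are", "was", "for", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords (assumed tokenizer)."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(posts, person, stock, sample_size=1000, top_k=10, seed=0):
    """Return up to top_k (word, weight) pairs for `person`, highest weight first."""
    # Step 1: posts mentioning both the person and the stock name form the reliable set.
    reliable = [p for p in posts if person in p and stock in p]
    reliable_doc = tokenize(" ".join(reliable))
    if not reliable_doc:
        return []

    # Step 2: sample up to sample_size posts that mention the person;
    # each is one document, and the merged reliable set is one more.
    mentions = [p for p in posts if person in p]
    rng = random.Random(seed)
    sampled = rng.sample(mentions, min(sample_size, len(mentions)))
    docs = [set(tokenize(p)) for p in sampled] + [set(reliable_doc)]

    # Step 3: TF-IDF of each word in the reliable-set document.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc)
    term_freq = Counter(reliable_doc)
    total = len(reliable_doc)
    weights = {
        word: (count / total) * math.log(len(docs) / doc_freq[word])
        for word, count in term_freq.items()
    }

    # Step 4: sort by descending weight (ties broken alphabetically), keep the top_k.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]
```

Note that under this weighting a word that occurs in every document, such as person A's own name, gets a weight of zero, so it falls to the bottom of the ranking.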
 
projs/clans/docs/entity_disambiguation_xu/extractkeywords.1390267654.txt.gz · Last modified: 2014/01/21 09:27 by xmill.zod