
Journal: Journal of Computational Methods in Sciences and Engineering

Hot topic identification from micro-blog based on improved Single-pass algorithm



Abstract

Identifying hot topics in micro-blogs is important for detecting and controlling public opinion. When the Single-pass algorithm is used to cluster hot topics in Chinese micro-blogs, Chinese word segmentation is a necessary preprocessing step, but it inevitably introduces segmentation errors. These errors lower the clustering precision of topic identification. To solve this problem, this paper proposes an improved algorithm based on Single-pass that combines CS (Cosine Similarity) and LCS (Longest Common Subsequence) to calculate the similarity between Chinese words. Experiments on hot topic identification over three different micro-blog data sets show that the improved algorithm achieves both higher recall and higher precision than the original algorithm. The proposed algorithm is feasible and effective.
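
The abstract does not give the exact formula used to combine the two measures, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a weighted sum of cosine similarity over segmented tokens and a normalized LCS score over the raw characters. The weight `alpha`, the `threshold`, the choice to compare each post against the first post of a cluster, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter
import math


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lcs_length(s, t):
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(t) + 1)
    for ch in s:
        curr = [0]
        for j, other in enumerate(t, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(s, t):
    """LCS length normalized by the longer string (an assumed normalization)."""
    if not s or not t:
        return 0.0
    return lcs_length(s, t) / max(len(s), len(t))


def combined_similarity(tokens_a, tokens_b, alpha=0.5):
    """Assumed combination: alpha * cosine (tokens) + (1 - alpha) * LCS (characters)."""
    cs = cosine_similarity(Counter(tokens_a), Counter(tokens_b))
    lcs = lcs_similarity("".join(tokens_a), "".join(tokens_b))
    return alpha * cs + (1 - alpha) * lcs


def single_pass(posts, threshold=0.5, alpha=0.5):
    """Single-pass clustering: each post joins its most similar cluster or starts a new one.

    Simplification: a cluster is represented by its first post.
    """
    clusters = []
    for tokens in posts:
        best_index, best_score = None, 0.0
        for i, cluster in enumerate(clusters):
            score = combined_similarity(tokens, cluster[0], alpha)
            if score > best_score:
                best_index, best_score = i, score
        if best_index is not None and best_score >= threshold:
            clusters[best_index].append(tokens)
        else:
            clusters.append([tokens])
    return clusters


# Hypothetical posts: the second one mimics a segmentation error ("救援" split in two).
posts = [["地震", "救援"], ["地震", "救", "援"], ["股票", "上涨"]]
clusters = single_pass(posts)
```

In this toy input the segmentation error lowers the token-level cosine score between the first two posts, while the character-level LCS score is unaffected, which is the kind of robustness the abstract attributes to adding LCS. Real use would need an actual word segmenter and tuned values for `alpha` and `threshold`.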
