Conference: International ACM SIGIR Conference on Research and Development in Information Retrieval
Multidimensional Search Result Diversification: Diverse Search Results for Diverse Users

Abstract

Hundreds of millions of people today rely on Web-based search engines to satisfy their information needs. To meet the expectations of this vast and diverse user population, a search engine should present a list of results that maximizes the probability of satisfying the average user [1]. This leads to the problem of search result diversification: given a user-submitted query, the search engine should include results that are relevant to the query and, at the same time, diverse enough to meet the expectations of diverse user populations. However, it is not clear in what respect the results should be diversified.

Much of the current work on diversity [1, 3] focuses on ambiguous and underspecified queries and tries to include results corresponding to diverse interpretations of the ambiguous query. This is not always sufficient. My analysis of a commercial web search engine's logs reveals that even for well-specified informational queries, click entropy is very high, indicating that different users prefer different types of documents. Very recently, a diversification algorithm fine-tuned for such informational queries has been proposed [5]. Further, high click entropies were also observed for a large fraction of transactional queries. One major goal of my PhD thesis will then be to identify the various possible dimensions along which search results can be diversified. Having such information will enhance our understanding of what an average user expects from the search engine. By utilizing aggregate statistics about queries, users, and their interaction with the search engine for different queries, more concrete evidence about diverse user preferences, as well as the relative importance of different diversity dimensions, can be derived.

Once we know the different diversity dimensions, the next natural question is: given a query, how can we determine the diversification requirement best suited to it? For some queries sub-topic coverage may be more important, while for others diversification with respect to document source or stylistics might be important. This problem is related to the problem of selective diversification [4], where the goal is to identify queries for which diversification techniques should be used. In addition, however, we are also interested in identifying the different diversity classes a given query belongs to. Further, some queries may need to be diversified along multiple diversity dimensions. In such cases, it is also important to determine the relative importance of the different diversity dimensions for the given query. By utilizing past user interaction data, query-level features (such as query clarity, entropy, and lexical features) and document-level features (e.g., popularity, content quality, and previous click history), classifiers for diversification requirements can be developed.

Given a user query, once we know the type of diversity requirements for the user, an appropriate diversification technique is required. I would like to study the problem of simultaneously diversifying search results along multiple dimensions, as discussed above. One possible way to do this could be to build upon the nugget-based framework introduced by Clarke et al. [2], where we represent each document as a set of nuggets, each nugget corresponding to a diversity dimension.
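
The abstract uses click entropy as evidence of diverse user preferences but does not define it. A common definition is the Shannon entropy of the distribution of clicks over documents for a query: the entropy is zero when every click goes to the same document and grows as clicks spread across more documents. The following is a minimal sketch under that assumption, not the author's implementation; the log format (pairs of query and clicked URL), the function name `click_entropy`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def click_entropy(click_log):
    """Return {query: click entropy in bits}.

    click_log is an iterable of (query, clicked_url) pairs.
    This log format is an assumption made for illustration.
    """
    # Count how often each URL was clicked for each query.
    clicks_per_query = defaultdict(Counter)
    for query, url in click_log:
        clicks_per_query[query][url] += 1

    entropies = {}
    for query, url_counts in clicks_per_query.items():
        total = sum(url_counts.values())
        # Shannon entropy of the empirical click distribution for this query.
        entropies[query] = -sum(
            (n / total) * math.log2(n / total) for n in url_counts.values()
        )
    return entropies


# Hypothetical toy log: clicks for the first query are spread evenly over
# four URLs, clicks for the second query all land on one URL.
toy_log = [
    ("python tutorial", "a.example"),
    ("python tutorial", "b.example"),
    ("python tutorial", "c.example"),
    ("python tutorial", "d.example"),
    ("facebook login", "fb.example"),
    ("facebook login", "fb.example"),
]
entropies = click_entropy(toy_log)
```

In this toy log the first query has the higher entropy, which is the kind of signal the abstract reads as users preferring different types of documents.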
