Document Ranking Methods Supporting Implicit Temporal Queries in Information Retrieval

Published: 2018-04-11 00:09

  Thesis topic: temporal information retrieval + query temporal intent; Source: Jiangsu University, master's thesis, 2017


[Abstract]: The spread of the Internet has brought explosive growth in information resources. This gives users more choices but also makes useful information harder to find, so using search engines to select, from massive amounts of information, the documents that meet a user's needs has become an important challenge. In recent years, the number of web pages and queries containing temporal information has kept growing, and temporal information retrieval (TIR) has become a focus of research. TIR mainly studies how to extract temporal information from web pages, analyze the temporal intent of queries, and build time-aware ranking models in order to improve the retrieval quality of search engines. Queries with temporal intent fall into two kinds. An explicit temporal query contains a temporal expression that states the time constraint directly. An implicit temporal query provides no explicit time criterion, yet its temporal intent lies in a specific time interval. According to statistics, more than 7% of Internet queries carry implicit temporal intent, while about 1.5% contain explicit time constraints. Implicit temporal queries therefore account for the larger share of Internet queries, and more research on them remains to be done. This thesis studies how to analyze the temporal intent of implicit temporal queries and how to optimize retrieval performance for them.
The main work is summarized as follows. (1) For implicit temporal queries, a method is proposed that analyzes query temporal intent by combining the DBpedia semantic web resource with the top-k ranked documents. If the query concerns a famous person or a major historical event, the specific time interval obtained by querying DBpedia (a semantic web resource based on Wikipedia) is taken as the temporal intent of the query. For other types of queries, the temporal intent is analyzed from the temporal expressions that occur most frequently in the content of the top-k ranked documents. (2) Building on the language model, a document ranking model supporting implicit temporal queries is proposed. Taking temporal uncertainty into account, it computes the probability that each document generates the query and uses this as the document's temporal relevance score. The temporal relevance score and the content relevance score are then combined linearly to re-rank the documents. (3) The document collection from the Temporal Information Access (Temporalia) task of the NTCIR-11 conference is used as experimental data to evaluate the proposed method for analyzing implicit temporal query intent and the proposed document ranking model. The intent analysis method is first compared with several previously proposed methods. The experimental results indicate that analyzing the temporal intent of a query before computing document relevance scores is worthwhile, and that the proposed combination of DBpedia and the top-k documents analyzes query temporal intent comparatively well. Given the query temporal intent, the proposed ranking method is then compared with existing time-aware ranking methods. For the time-aware ranking models, most metric values are higher than those of the initial ranking, which considers content relevance only, indicating that including temporal relevance in the retrieval model helps improve retrieval quality. Compared with the other ranking methods, the proposed language-model-based ranking method performs comparatively well.
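The top-k branch of step (1) and the linear combination in step (2) can be illustrated with a small sketch. This is not the thesis's implementation: the abstract gives no formulas, so the year-matching regular expression, the `min_share` threshold, the smoothed in-interval fraction used as a temporal score, and the weight `alpha` are all assumptions chosen for illustration. The DBpedia lookup is omitted because it needs network access, and the temporal score here is a simple stand-in for the language-model probability described above.

```python
import re
from collections import Counter

# Assumption: temporal expressions are approximated by four-digit years.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def estimate_time_intent(top_k_docs, min_share=0.5):
    """Guess a (start_year, end_year) interval from the top-k documents.

    Keeps every year mentioned at least `min_share` times as often as the
    most frequent year (hypothetical threshold). Returns None if no year
    appears in any document.
    """
    counts = Counter(
        int(y) for doc in top_k_docs for y in YEAR_RE.findall(doc)
    )
    if not counts:
        return None
    peak = counts.most_common(1)[0][1]
    frequent = [y for y, c in counts.items() if c >= min_share * peak]
    return min(frequent), max(frequent)


def temporal_score(doc, intent, smoothing=0.1):
    """Smoothed fraction of the document's year mentions inside the intent.

    Stand-in for a temporal relevance probability; a document with no
    year mentions gets 0.5, reflecting that nothing is known about it.
    """
    if intent is None:
        return 0.0
    start, end = intent
    years = [int(y) for y in YEAR_RE.findall(doc)]
    inside = sum(1 for y in years if start <= y <= end)
    return (inside + smoothing) / (len(years) + 2 * smoothing)


def rerank(docs, content_scores, intent, alpha=0.3):
    """Re-rank by (1 - alpha) * content score + alpha * temporal score."""
    scored = [
        ((1 - alpha) * c + alpha * temporal_score(d, intent), d)
        for d, c in zip(docs, content_scores)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]
```

With `alpha=0` the function reproduces the content-only ranking, which makes it easy to compare the initial ranking against a time-aware one on the same inputs.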
[Degree-granting institution]: Jiangsu University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3




Article ID: 1733544


Article link: http://www.wukwdryxk.cn/shoufeilunwen/xixikjs/1733544.html

