What's new? Analysing language-specific Wikipedia entity contexts to support entity-centric news retrieval

WRAP-whats-new-language-wikipedia-retrieval-Zhou-2017.pdf - Accepted Version (717kB)

Abstract

Representation of influential entities, such as celebrities and multinational corporations, on the web can vary across languages, reflecting language-specific entity aspects as well as divergent views on these entities in different communities. An important source of multilingual background knowledge about influential entities is Wikipedia, an online community-created encyclopaedia containing more than 280 language editions. Such language-specific information could be applied in entity-centric information retrieval applications, in which users issue very simple queries, mostly just the entity names, to find the relevant documents. In this article we focus on the problem of creating language-specific entity contexts to support entity-centric, language-specific information retrieval applications. First, we discuss alternative ways such contexts can be built, including Graph-based and Article-based approaches. Second, we analyse the similarities and the differences in these contexts in a case study including 220 entities and five Wikipedia language editions. Third, we propose a context-based entity-centric information retrieval model that maps documents to aspect space, and apply language-specific entity contexts to perform query expansion. Last, we perform a case study to demonstrate the impact of this model in a news retrieval application. Our study illustrates that the proposed model can effectively improve the recall of entity-centric information retrieval while keeping high precision, and provide language-specific results.

Item Type: Book Item
Subjects: H Social Sciences > HM Sociology
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Celebrities, Information retrieval, News Web sites, International business enterprises
Series Name: Lecture Notes in Computer Science
Journal or Publication Title: Transactions on Computational Collective Intelligence
Publisher: Springer
Place of Publication: Cham
ISBN: 9783319592671
ISSN: 2190-9288
Book Title: Transactions on Computational Collective Intelligence XXVI
Editor: Nguyen, N. and Kowalczyk, R. and Pinto, A. and Cardoso, J.
Official Date: 15 June 2017
Dates:
15 June 2017: Published
7 October 2016: Accepted
Volume: 10190
Page Range: pp. 210-231
Status: Peer Reviewed
Publication Status: Published
Date of first compliant deposit: 14 February 2017
Date of first compliant Open Access: 27 July 2017
Funder: European Cooperation in the Field of Scientific and Technical Research (Organization) (COST), European Research Council (ERC), Horizon 2020 (European Commission) (H2020)
Grant number: IC1302 KEYSTONE (COST), ALEXANDRIA ERC 339233 (ERC), H2020-MSCA-ITN-2014 WDAqua 64279 (H2020)
URI: https://wrap.warwick.ac.uk/85950/
