
Mining the UK web archive for semantic change detection
Tsakalidis, Adam, Bazzi, Marya, Cucuringu, Mihai, Basile, Pierpaolo and McGillivray, Barbara (2019) Mining the UK web archive for semantic change detection. In: Recent Advances in Natural Language Processing (RANLP) 2019, Varna, Bulgaria, 2–4 Sep 2019. Published in: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019) pp. 1212-1221. ISBN 9789544520557. doi:10.26615/978-954-452-056-4_139 ISSN 1313-8502.
Published Version (PDF): WRAP-Mining-UK-web-archive-semantic-change-detection-2019.pdf (3576 KB). Available under License Creative Commons Attribution 4.0.
Accepted Version (PDF): nlp_adam_tsakalidis.pdf (3018 KB). Embargoed item. Restricted access to Repository staff only.
Official URL: http://doi.org/10.26615/978-954-452-056-4_139
Abstract
Semantic change detection (i.e., identifying words whose meaning has changed over time) started emerging as a growing area of research over the past decade, with important downstream applications in natural language processing, historical linguistics and computational social science. However, several obstacles make progress in the domain slow and difficult. These pertain primarily to the lack of well-established gold standard datasets, resources to study the problem at a fine-grained temporal resolution, and quantitative evaluation approaches. In this work, we aim to mitigate these issues by (a) releasing a new labelled dataset of more than 47K word vectors trained on the UK Web Archive over a short time-frame (2000-2013); (b) proposing a variant of Procrustes alignment to detect words that have undergone semantic shift; and (c) introducing a rank-based approach for evaluation purposes. Through extensive numerical experiments and validation, we illustrate the effectiveness of our approach against competitive baselines. Finally, we also make our resources publicly available to further enable research in the domain.
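The abstract mentions a variant of Procrustes alignment but does not describe it on this page. As a rough illustration only, the sketch below uses standard orthogonal Procrustes alignment (not the authors' variant) to map word vectors from one period onto another, then ranks words by cosine distance after alignment. The function names, the toy data, and the choice of cosine distance are assumptions made for the example and are not taken from the paper or its released resources.

```python
import numpy as np


def procrustes_align(X, Y):
    """Rotate X onto Y with the orthogonal R minimising ||X @ R - Y||_F."""
    # Closed-form solution: R = U @ Vt, where U, S, Vt = svd(X.T @ Y).
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return X @ (U @ Vt)


def rank_by_shift(X, Y, words):
    """Rank words by cosine distance between aligned vectors, largest first."""
    Xa = procrustes_align(X, Y)
    cos = np.sum(Xa * Y, axis=1) / (
        np.linalg.norm(Xa, axis=1) * np.linalg.norm(Y, axis=1)
    )
    dist = 1.0 - cos
    order = np.argsort(-dist)
    return [(words[i], float(dist[i])) for i in order]


# Toy data (hypothetical): 200 "words" with 8-dimensional vectors.
rng = np.random.default_rng(0)
words = [f"w{i}" for i in range(200)]
X = rng.normal(size=(200, 8))                  # earlier period
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))   # random orthogonal map
Y = X @ Q                                      # later period, same space up to Q
Y[3] = -Y[3]                                   # simulate one word whose vector moved
ranked = rank_by_shift(X, Y, words)
print(ranked[:3])
```

In this toy setup the word whose vector was deliberately changed should appear at the top of the ranking, while the remaining words sit close to zero distance once the rotation has been undone.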
Item Type: | Conference Item (Paper)
---|---
Subjects: | Q Science > QA Mathematics; T Technology > TK Electrical engineering. Electronics. Nuclear engineering; Z Bibliography. Library Science. Information Resources > ZA Information resources
Divisions: | Faculty of Science, Engineering and Medicine > Science > Mathematics
Library of Congress Subject Headings (LCSH): | Semantic Web, Semantic computing, Information technology -- Sociological aspects, Data mining -- Great Britain, Web archives -- Great Britain
Journal or Publication Title: | Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Publisher: | INCOMA Ltd.
ISBN: | 9789544520557
ISSN: | 1313-8502
Official Date: | 22 October 2019
Page Range: | pp. 1212-1221
DOI: | 10.26615/978-954-452-056-4_139
Status: | Peer Reviewed
Publication Status: | Published
Access rights to Published version: | Open Access (Creative Commons)
Date of first compliant deposit: | 30 October 2019
Date of first compliant Open Access: | 1 March 2021
Conference Paper Type: | Paper
Title of Event: | Recent Advances in Natural Language Processing (RANLP) 2019
Type of Event: | Conference
Location of Event: | Varna, Bulgaria
Date(s) of Event: | 2–4 Sep 2019