Utilizing Lexicon-enhanced approach to sensitive information identification
Cai, Lihua, Zhou, Yujue, Ding, Yulong, Jiang, Jie and Yang, Shuang-Hua (2022) Utilizing Lexicon-enhanced approach to sensitive information identification. In: 2022 27th International Conference on Automation and Computing (ICAC), Bristol, 01-03 Sep 2022. Published in: 2022 27th International Conference on Automation and Computing (ICAC) ISBN 9781665498074. doi:10.1109/icac55051.2022.9911164
Research output not available from this repository.
Official URL: https://doi.org/10.1109/ICAC55051.2022.9911164
Abstract
Large-scale leaks of sensitive information occur frequently and cause serious harm and losses to individuals, enterprises, and society. Most sensitive information resides in unstructured data, where it is hard for people to recognize; this difficulty is itself an important cause of information leakage. Sensitive information identification in unstructured data has therefore received extensive attention. The task is especially difficult in Chinese, where the smallest unit is the character and word boundaries are flexible. Moreover, because the data are sensitive, no publicly available datasets exist in this field. To address these challenges, we first create SPIDC (Sensitive Personal Information Dataset in Chinese) and release it as a public resource for related research. Second, we apply sensitive information identification methods previously used on English datasets to Chinese datasets. Third, to handle the uncertainty and ambiguity of Chinese word boundaries, we apply three lexicon-enhanced techniques from NER (Named Entity Recognition) to Chinese sensitive information identification for the first time. Experimental results on SPIDC show that the lexicon-enhanced approach outperforms the other methods.
Item Type: | Conference Item (Paper)
---|---
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science; Faculty of Science, Engineering and Medicine > Engineering > Engineering
SWORD Depositor: | Library Publications Router
Journal or Publication Title: | 2022 27th International Conference on Automation and Computing (ICAC)
Publisher: | IEEE
ISBN: | 9781665498074
Official Date: | 10 October 2022
DOI: | 10.1109/icac55051.2022.9911164
Status: | Peer Reviewed
Publication Status: | Published
Reuse Statement (publisher, data, author rights): | ** From Crossref proceedings articles via Jisc Publications Router ** History: ppub 01-09-2022; issued 01-09-2022.
Access rights to Published version: | Restricted or Subscription Access
Conference Paper Type: | Paper
Title of Event: | 2022 27th International Conference on Automation and Computing (ICAC)
Type of Event: | Conference
Location of Event: | Bristol
Date(s) of Event: | 01-03 Sep 2022