University of Warwick
Publications service & WRAP

Addressing token uniformity in transformers via singular value transformation

Yan, Hanqi, Gui, Lin, Li, Wenjie and He, Yulan (2022) Addressing token uniformity in transformers via singular value transformation. In: Uncertainty in Artificial Intelligence, Eindhoven, The Netherlands, 01-05 Aug 2022. Published in: Proceedings of Machine Learning Research, 180 pp. 2181-2191. ISSN 2640-3498.

Full text (PDF): WRAP-Addressing-token-uniformity-transformers-via-singular-transformation-22.pdf - Accepted Version (4046Kb)
Official URL: https://proceedings.mlr.press/v180/yan22b.html

Abstract

Token uniformity is commonly observed in transformer-based models: after passing through multiple stacked self-attention layers, different tokens share a large proportion of similar information. In this paper, we propose to use the distribution of singular values of the outputs of each transformer layer to characterise the phenomenon of token uniformity, and we empirically illustrate that a less skewed singular value distribution can alleviate the token uniformity problem. Based on our observations, we define several desirable properties of singular value distributions and propose a novel transformation function for updating the singular values. We show that, apart from alleviating token uniformity, the transformation function should preserve the local neighbourhood structure in the original embedding space. Our proposed singular value transformation function is applied to a range of transformer-based language models such as BERT, ALBERT, RoBERTa and DistilBERT, and improved performance is observed in semantic textual similarity evaluation and a range of GLUE tasks.
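
The link the abstract draws between a skewed singular value distribution and token uniformity can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the toy `hidden` matrix stands in for one layer's token outputs, `top_share` and `mean_pairwise_cosine` are simple proxies for skewness and uniformity, and `rescale_singular_values` (a power-law rescaling with a hypothetical exponent `alpha`) is a stand-in and not the transformation function proposed in the paper, which this record does not specify.

```python
import numpy as np


def singular_values(hidden):
    """Singular values (descending) of a (tokens x hidden_dim) matrix."""
    return np.linalg.svd(hidden, compute_uv=False)


def top_share(s):
    """Fraction of the total singular-value mass held by the largest value."""
    return float(s[0] / s.sum())


def mean_pairwise_cosine(hidden):
    """Average cosine similarity over all distinct token pairs."""
    unit = hidden / np.linalg.norm(hidden, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = hidden.shape[0]
    # Exclude the diagonal (each token's similarity with itself).
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))


def rescale_singular_values(hidden, alpha=0.5):
    """Hypothetical stand-in (NOT the paper's function): map each s to s**alpha."""
    u, s, vt = np.linalg.svd(hidden, full_matrices=False)
    # Scale the columns of u by the rescaled singular values, then recompose.
    return (u * s**alpha) @ vt


# Toy layer output: 8 tokens that all share one dominant direction plus noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=16)
hidden = shared + 0.1 * rng.normal(size=(8, 16))

flattened = rescale_singular_values(hidden)

before_share = top_share(singular_values(hidden))
after_share = top_share(singular_values(flattened))
before_cos = mean_pairwise_cosine(hidden)
after_cos = mean_pairwise_cosine(flattened)

print("top singular value share:", before_share, "->", after_share)
print("mean pairwise cosine:    ", before_cos, "->", after_cos)
```

In this toy setting, flattening the singular value distribution lowers both the share of the largest singular value and the average token-to-token cosine similarity. The sketch does not address the abstract's second requirement, that the transformation preserve the local neighbourhood structure of the original embedding space.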

Item Type: Conference Item (Paper)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Natural language processing (Computer science) -- Congresses, Modeling languages (Computer science) -- Congresses, Machine learning -- Congresses
Journal or Publication Title: Proceedings of Machine Learning Research
Publisher: ML Research Press
ISSN: 2640-3498
Official Date: 1 August 2022
Dates:
Date | Event
1 August 2022 | Published
26 June 2022 | Accepted
Volume: 180
Page Range: pp. 2181-2191
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Copyright Holders: © The authors and PMLR 2022. MLResearchPress
Date of first compliant deposit: 5 September 2022
Date of first compliant Open Access: 5 September 2022
RIOXX Funder/Project Grant:
Project/Grant ID | RIOXX Funder Name | Funder ID
EP/T017112/1 | [EPSRC] Engineering and Physical Sciences Research Council | http://dx.doi.org/10.13039/501100000266
EP/V048597/1 | [EPSRC] Engineering and Physical Sciences Research Council | http://dx.doi.org/10.13039/501100000266
EP/V020579/1 | UK Research and Innovation | http://dx.doi.org/10.13039/100014013
Conference Paper Type: Paper
Title of Event: Uncertainty in Artificial Intelligence
Type of Event: Conference
Location of Event: Eindhoven, The Netherlands
Date(s) of Event: 01-05 Aug 2022
