University of Warwick
Publications service & WRAP

Algorithm 898 : efficient multiplication of dense matrices over GF(2)

Albrecht, Martin, Bard, Gregory and Hart, William B. (2010) Algorithm 898 : efficient multiplication of dense matrices over GF(2). ACM Transactions on Mathematical Software, Volume 37 (Number 1). Article: 9. doi:10.1145/1644001.1644010

WRAP_Hart_0584144-ma-270913-mat_mult_gf2.pdf - Accepted Version

Official URL: http://dx.doi.org/10.1145/1644001.1644010


Abstract

We describe an efficient implementation of a hierarchy of algorithms for multiplication of dense matrices over the field with two elements (F2). In particular, we present our implementation, in the M4RI library, of Strassen-Winograd matrix multiplication and the "Method of the Four Russians for Multiplication" (M4RM), and compare it against other available implementations. Good performance is demonstrated on AMD's Opteron processor and particularly good performance on Intel's Core 2 Duo processor. The open-source M4RI library is available as a stand-alone package as well as part of the Sage mathematics system.

In machine terms, addition in F2 is logical-XOR, and multiplication is logical-AND, thus a machine word of 64 bits allows one to operate on 64 elements of F2 in parallel: at most one CPU cycle for 64 parallel additions or multiplications. As such, element-wise operations over F2 are relatively cheap. In fact, in this paper, we conclude that the actual bottlenecks are memory reads and writes and issues of data locality. We present our empirical findings in relation to minimizing these and give an analysis thereof.
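
The word-parallel arithmetic described above can be illustrated with a minimal sketch. It is not taken from the M4RI library; the helper names (pack_bits, unpack_bits, f2_add, f2_mul, f2_matmul) are hypothetical, and Python integers masked to 64 bits stand in for machine words. The last function shows a naive row-by-row product of bit-packed matrices, not the M4RM or Strassen-Winograd algorithms.

```python
# Illustrative sketch only: hypothetical helpers, not the M4RI API.
MASK64 = (1 << 64) - 1  # emulate a 64-bit machine word


def pack_bits(bits):
    """Pack up to 64 elements of F2 (each 0 or 1) into one word; bit i holds bits[i]."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word & MASK64


def unpack_bits(word, n):
    """Return the first n elements of F2 stored in a word."""
    return [(word >> i) & 1 for i in range(n)]


def f2_add(x, y):
    """Element-wise addition of 64 elements of F2: one XOR."""
    return (x ^ y) & MASK64


def f2_mul(x, y):
    """Element-wise multiplication of 64 elements of F2: one AND."""
    return x & y & MASK64


def f2_matmul(a_rows, b_rows):
    """Naive product of bit-packed matrices (at most 64 columns each).

    Row i of the result is the XOR of those rows of B selected by the
    set bits of row i of A.
    """
    c_rows = []
    for a in a_rows:
        acc = 0
        k = 0
        while a:
            if a & 1:
                acc ^= b_rows[k]  # add row k of B, 64 columns at a time
            a >>= 1
            k += 1
        c_rows.append(acc)
    return c_rows
```

For example, f2_add(pack_bits([1, 0, 1, 1]), pack_bits([1, 1, 0, 1])) adds four elements at once with a single XOR. Even in this toy version the arithmetic is a single instruction per word, which fits the abstract's point that memory access, not arithmetic, is the likely bottleneck.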

Item Type: Journal Article
Alternative Title: Algorithm XXX: efficient multiplication of dense matrices over GF(2)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Q Science > QA Mathematics
Divisions: Faculty of Science > Mathematics
Library of Congress Subject Headings (LCSH): Algorithms, Computer science -- Mathematics
Journal or Publication Title: ACM Transactions on Mathematical Software
Publisher: Association for Computing Machinery, Inc.
ISSN: 0098-3500
Official Date: January 2010
Dates: January 2010 (Published)
Volume: 37
Number: 1
Number of Pages: 14
Page Range: Article: 9
DOI: 10.1145/1644001.1644010
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Funder: Royal Holloway Valerie Myerscough Scholarship, Engineering and Physical Sciences Research Council (EPSRC)
Grant number: EP/D079543/1 (EPSRC)

Data sourced from Thomson Reuters' Web of Knowledge
