Skip to content Skip to navigation
University of Warwick
  • Study
  • |
  • Research
  • |
  • Business
  • |
  • Alumni
  • |
  • News
  • |
  • About

University of Warwick
Publications service & WRAP

Highlight your research

  • WRAP
    • Home
    • Search WRAP
    • Browse by Warwick Author
    • Browse WRAP by Year
    • Browse WRAP by Subject
    • Browse WRAP by Department
    • Browse WRAP by Funder
    • Browse Theses by Department
  • Publications Service
    • Home
    • Search Publications Service
    • Browse by Warwick Author
    • Browse Publications service by Year
    • Browse Publications service by Subject
    • Browse Publications service by Department
    • Browse Publications service by Funder
  • Help & Advice
University of Warwick

The Library

  • Login
  • Admin

An approach to source-code plagiarism detection investigation using latent semantic analysis

Tools
- Tools
+ Tools

Cosma, Georgina (2008) An approach to source-code plagiarism detection investigation using latent semantic analysis. PhD thesis, University of Warwick.

[img] PDF
WRAP_THESIS_Cosma_2008.pdf - Requires a PDF viewer.

Download (1663Kb)
[img] Other (Permission e-mail)
FW__Digitisation_of_your_PhD_Thesis_.msg
Embargoed item. Restricted access to Repository staff only

Download (1679Kb)
Official URL: http://webcat.warwick.ac.uk/record=b2248352~S15

Request Changes to record.

Abstract

This thesis looks at three aspects of source-code plagiarism. The first aspect of the
thesis is concerned with creating a definition of source-code plagiarism; the second aspect
is concerned with describing the findings gathered from investigating the Latent Semantic
Analysis information retrieval algorithm for source-code similarity detection; and the final
aspect of the thesis is concerned with the proposal and evaluation of a new algorithm that
combines Latent Semantic Analysis with plagiarism detection tools.
A recent review of the literature revealed that there is no commonly agreed definition of
what constitutes source-code plagiarism in the context of student assignments. This thesis
first analyses the findings from a survey carried out to gather an insight into the perspectives
of UK Higher Education academics who teach programming on computing courses. Based
on the survey findings, a detailed definition of source-code plagiarism is proposed.
Secondly, the thesis investigates the application of an information retrieval technique,
Latent Semantic Analysis, to derive semantic information from source-code files. Various
parameters drive the effectiveness of Latent Semantic Analysis. The performance of Latent
Semantic Analysis using various parameter settings and its effectiveness in retrieving
similar source-code files when optimising those parameters are evaluated.
Finally, an algorithm for combining Latent Semantic Analysis with plagiarism detection
tools is proposed and a tool is created and evaluated. The proposed tool, PlaGate, is
a hybrid model that allows for the integration of Latent Semantic Analysis with plagiarism
detection tools in order to enhance plagiarism detection. In addition, PlaGate has a facility
for investigating the importance of source-code fragments with regards to their contribution
towards proving plagiarism. PlaGate provides graphical output that indicates the clusters of
suspicious files and source-code fragments.

Item Type: Thesis or Dissertation (PhD)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Library of Congress Subject Headings (LCSH): Plagiarism -- Computer programs, Source code (Computer science), Latent semantic indexing
Official Date: July 2008
Dates:
DateEvent
July 2008Submitted
Institution: University of Warwick
Theses Department: Department of Computer Science
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Joy, Mike ; Russ, Steve
Extent: xix, 289 leaves : ill., charts
Language: eng

Request changes or add full text files to a record

Repository staff actions (login required)

View Item View Item
twitter

Email us: wrap@warwick.ac.uk
Contact Details
About Us