University of Warwick
Publications service & WRAP


Is AI ground truth really true? The dangers of training and evaluating AI tools based on experts’ know-what

Lebovitz, S., Levina, Natalie and Lifshitz-Assaf, H. (2021) Is AI ground truth really true? The dangers of training and evaluating AI tools based on experts’ know-what. MIS Quarterly, 45 (3). pp. 1501-1525. doi:10.25300/MISQ/2021/16564

PDF: wbs-140721-wrap--is_ai_ground_truth_reall_y_true_2021_author_accepted_version_april_30_2021.pdf - Accepted Version (1550Kb)
Embargoed item. Access restricted to Repository staff only.

Official URL: https://doi.org/10.25300/MISQ/2021/16564


Abstract

Organizational decision-makers need to evaluate AI tools in light of increasing claims that such tools outperform human experts. Yet, measuring the quality of knowledge work is challenging, raising the question of how to evaluate AI performance in such contexts. We investigate this question through a field study of a major U.S. hospital, observing how managers evaluated five different machine-learning (ML) based AI tools. Each tool reported high performance according to standard AI accuracy measures, which were based on ground truth labels provided by qualified experts. Trying these tools out in practice, however, revealed that none of them met expectations. Searching for explanations, managers began confronting the high uncertainty of experts’ know-what knowledge captured in ground truth labels used to train and validate ML models. In practice, experts address this uncertainty by drawing on rich know-how practices, which were not incorporated into these ML-based tools. Discovering the disconnect between AI’s know-what and experts’ know-how enabled managers to better understand the risks and benefits of each tool. This study shows dangers of treating ground truth labels used in ML models objectively when the underlying knowledge is uncertain. We outline implications of our study for developing, training, and evaluating AI for knowledge work.

Item Type: Journal Article
Divisions: Faculty of Social Sciences > Warwick Business School
Journal or Publication Title: MIS Quarterly
Publisher: M I S Research Center
ISSN: 0276-7783
Official Date: September 2021
Dates:
Date             Event
September 2021   Published
2 June 2021      Available
30 April 2021    Accepted
Volume: 45
Number: 3
Page Range: pp. 1501-1525
DOI: 10.25300/MISQ/2021/16564
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Email us: wrap@warwick.ac.uk