Statistical language learning
Onnis, Luca (2003) Statistical language learning. PhD thesis, University of Warwick.
Text: WRAP_THESIS_Onnis_2003.pdf - Submitted Version (16Mb)
Official URL: http://webcat.warwick.ac.uk/record=b1709601~S1
Abstract
Theoretical arguments based on the "poverty of the stimulus" have denied a
priori the possibility that abstract linguistic representations can be learned
inductively from exposure to the environment, given that the linguistic input
available to the child is both underdetermined and degenerate. I reassess such
learnability arguments by exploring a) the type and amount of statistical
information implicitly available in the input in the form of distributional and
phonological cues; b) psychologically plausible inductive mechanisms for
constraining the search space; c) the nature of linguistic representations,
algebraic or statistical. To do so I use three methodologies: experimental
procedures, linguistic analyses based on large corpora of naturally occurring
speech and text, and computational models implemented in computer
simulations.
In Chapters 1, 2, and 5, I argue that long-distance structural dependencies
- traditionally hard to explain with simple distributional analyses based on n-gram
statistics - can indeed be learned associatively provided the amount of
intervening material is highly variable or invariant (the Variability effect). In
Chapter 3, I show that simple associative mechanisms instantiated in Simple
Recurrent Networks can replicate the experimental findings under the same
conditions of variability. Chapter 4 presents successes and limits of such results
across perceptual modalities (visual vs. auditory) and perceptual presentation
(temporal vs. sequential), as well as the impact of long and short training
procedures. In Chapter 5, I show that generalisation to abstract categories from
stimuli framed in non-adjacent dependencies is also modulated by the Variability
effect. In Chapter 6, I show that the putative separation of algebraic and
statistical styles of computation based on successful speech segmentation versus
unsuccessful generalisation experiments (as published in a recent Science paper)
is premature and is the effect of a preference for phonological properties of the
input. In Chapter 7, computer simulations of learning irregular constructions
suggest that it is possible to learn from positive evidence alone, despite Gold's
celebrated arguments on the unlearnability of natural languages. Evolutionary
simulations in Chapter 8 show that irregularities in natural languages can emerge
from full regularity and remain stable across generations of simulated agents. In
Chapter 9, I conclude that the brain may be endowed with a powerful statistical
device for detecting structure, generalising, segmenting speech, and recovering
from overgeneralisations. The experimental and computational evidence gathered
here suggests that statistical language learning is more powerful than heretofore
acknowledged by the current literature.
Item Type: | Thesis (PhD)
---|---
Subjects: | P Language and Literature > P Philology. Linguistics
Library of Congress Subject Headings (LCSH): | Linguistics -- Statistical methods, Computational linguistics, Mathematical linguistics
Official Date: | October 2003
Institution: | University of Warwick
Theses Department: | Department of Psychology
Thesis Type: | PhD
Publication Status: | Unpublished
Supervisor(s)/Advisor: | Chater, Nick
Sponsors: | European Union (EU) (HPRN-CT-1999-00065)
Extent: | xi, 229 leaves
Language: | eng