The probabilistic analysis of language acquisition : theoretical, computational, and experimental analysis
Hsu, Anne S., Chater, Nick and Vitányi, P. M. B. (2011) The probabilistic analysis of language acquisition : theoretical, computational, and experimental analysis. Cognition, Vol.120 (No.3). pp. 380-390. ISSN 0010-0277
Full text not available from this repository.
Official URL: http://dx.doi.org/10.1016/j.cognition.2011.02.013
There is much debate over the degree to which language learning is governed by innate language-specific biases, or acquired through cognition-general principles. Here we examine the probabilistic language acquisition hypothesis on three levels: We outline a novel theoretical result showing that it is possible to learn the exact generative model underlying a wide class of languages, purely from observing samples of the language. We then describe a recently proposed practical framework, which quantifies natural language learnability, allowing specific learnability predictions to be made for the first time. In previous work, this framework was used to make learnability predictions for a wide variety of linguistic constructions, for which learnability has been much debated. Here, we present a new experiment which tests these learnability predictions. We find that our experimental results support the possibility that these linguistic constructions are acquired probabilistically from cognition-general principles.
|Item Type:||Journal Article|
|Subjects:||P Language and Literature > P Philology. Linguistics; Q Science > QA Mathematics; Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software|
|Divisions:||Faculty of Social Sciences > Warwick Business School > Behavioural Science; Faculty of Social Sciences > Warwick Business School|
|Library of Congress Subject Headings (LCSH):||Language acquisition -- Mathematical models, Language acquisition -- Statistical methods, Bayesian statistical decision theory, Natural language processing (Computer science)|
|Journal or Publication Title:||Cognition|
|Page Range:||pp. 380-390|