Child Development and Language Model Training: Comparing Psychological Theories of Learning

Authors

  • Luca Capone

DOI:

https://doi.org/10.4396/2025SFL03

Keywords:

linguistics, philosophy of language, language models, language acquisition, cognitive development

Abstract

This article compares language model training techniques with child language learning. The concept of learning, while central to human cognition, has struggled to find a stable place in the theoretical debate on language models, largely because training procedures appear to bear little resemblance to human learning and are markedly less efficient. The paper gives an overview of the relationship between the two processes and examines the theoretical challenges raised by the dominant approaches in the language model and developmental psychology literatures. Its aim is to assess whether learning and training are fundamentally incommensurable, or whether comparing them can yield insights into both the behaviour of models and the cognitive development of children. The analysis reveals points of contact between the two, shifting the focus of the discussion away from innate biological and psychological constraints and toward the environmental and linguistic conditions of learning.

Published

2026-04-27
