Analyzing GPT-4 misinterpretations of Russian grammatical constructions

Authors

  • Timofei Plotnikov, UiT The Arctic University of Norway

Abstract

Generative Pre-trained Transformers (GPTs) are Large Language Models (LLMs) trained primarily on datasets dominated by English text, and this imbalance can introduce inaccuracies into their outputs. The training-data bias poses challenges for accurate performance in languages with non-Roman scripts, such as Russian, and difficulties are expected specifically with Russian grammatical constructions that are colloquial, idiomatic, and non-compositional. The empirical investigation reveals that GPT-4 performs well overall on the majority of Russian grammatical constructions, yet it still falters on low-frequency constructions due to insufficient training data, a lack of context, and the influence of English.
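
As an illustrative sketch only (the abstract does not describe the article's actual elicitation protocol), a probe of the kind summarized above could query GPT-4 with a low-frequency, non-compositional Russian construction and check whether the model paraphrases it idiomatically or reads it literally. The model name, prompt wording, and the example idiom «хоть глаз выколи» ("pitch dark", lit. "even if you poke an eye out") are assumptions chosen for illustration; the snippet uses the OpenAI Python client.

```python
# Hypothetical probe of GPT-4's handling of a low-frequency Russian
# construction; an assumption for illustration, not the article's method.
# Requires the openai package (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# "хоть глаз выколи" means "pitch dark"; a literal gloss
# ("even if you poke an eye out") would signal misinterpretation.
prompt = (
    "Объясните значение конструкции «хоть глаз выколи» в предложении: "
    "«В комнате было темно, хоть глаз выколи»."
    # English: "Explain the meaning of the construction 'хоть глаз выколи'
    # in the sentence: 'The room was dark, хоть глаз выколи.'"
)

response = client.chat.completions.create(
    model="gpt-4",  # model name is an assumption for illustration
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # deterministic output eases run-to-run comparison
)
print(response.choices[0].message.content)
```

Checking whether the reply glosses the idiom figuratively or decomposes it word by word gives a simple pass/fail signal of the kind of misinterpretation the article analyzes.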

Published

2024-12-23