Jack Unterweger - an authorship analysis of the notorious killer's autobiography "Fegefeuer oder die Reise ins Zuchthaus"
Keywords:
principal component analysis, n-grams, stylistics, function words, questioned authorship
Abstract
Jack Unterweger, a notorious killer and celebrated author, was a highly skilled manipulator who was caught plagiarizing poems and short stories more than once (Herwig, 2022; Leake, 2010). Praised for his work as an author, and particularly for his autobiography “Fegefeuer oder die Reise ins Zuchthaus”, Unterweger was released from prison in 1990. Almost 40 years after the publication of the autobiography, rumors have surfaced that Unterweger might not have written it by himself and that Sonja von Eisenstein, one of his benefactors and supporters, might have been involved in writing it (Herwig, 2022). Based on a corpus of nine books by Unterweger and von Eisenstein, this study investigates whether these rumors could be true. The analysis combines quantitative methods (hierarchical cluster analysis, HCA, and principal component analysis, PCA) with qualitative stylistic analysis. It shows that Unterweger’s writing style is similar to that of the questioned autobiography, but that someone else’s writing style is also present.
References
Ainsworth, J., & Juola, P. (2018). Who wrote this?: Modern forensic authorship analysis as a model for valid forensic science. Washington University Law Review, 96, 1161–1189.
Anthony, L. (2022). AntConc. Waseda University. https://www.laurenceanthony.net/software/antconc/
Argamon, S. (2008). Interpreting Burrows’s Delta: Geometric and probabilistic foundations. Literary and Linguistic Computing, 23(2), 131–147. https://doi.org/10.1093/llc/fqn003
Argamon, S. (2018). Computational Forensic Authorship Analysis: Promises and Pitfalls. Language and Law / Linguagem e Direito, 5(2), 7–37.
Argamon, S., & Levitan, S. (2005). Measuring the Usefulness of Function Words for Authorship Attribution. Proceedings of the Joint Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing, 1–3.
Baayen, H., van Halteren, H., & Tweedie, F. (1996). Outside the Cave of Shadows: Using Syntactic Annotation to Enhance Authorship Attribution. Literary and Linguistic Computing, 11(3), 121–131.
Bachmann, C. (2014). Leichtlesbar.ch. http://www.leichtlesbar.ch/html/
Baker, P. (2006). Using corpora in discourse analysis. London & New York: Bloomsbury.
Bandelow, B. (2013). Wer hat Angst vorm bösen Mann? Hamburg: rowohlt.
Belvisi, N. M. S., Muhammad, N., & Alonso-Fernandez, F. (2020). Forensic Authorship Analysis of Microblogging Texts Using N-Grams and Stylometric Features. IEEE, 1-6.
Biber, D. & Conrad, S. (2001). Variation in English: multi-dimensional studies. Harlow: Pearson Education/Longman.
Biber, D. & Conrad, S. (2009). Register, genre and style. Cambridge: CUP.
Biber, D., Conrad, S., & Reppen, R. (2006). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Biber, D. & Finegan, E. (1994). Sociolinguistic perspectives on register. New York: Oxford University Press.
Binongo, J. (2003). Who wrote the 15th Book of Oz? An application of multivariate analysis to authorship attribution. Chance, 16(2), 9-17.
Brezina, V. (2018). Statistics in corpus linguistics. A practical guide. Cambridge: Cambridge University Press.
Brezina, V., Weill-Tessier, P. & McEnery, A. (2022). #LancsBox v. 6. [software]. Available at: http://corpora.lancs.ac.uk/lancsbox.
Burrows, J. F. (1987a). Computation into criticism: a study of Jane Austen’s novels and an experiment in method. Oxford: Clarendon Press.
Burrows, J. F. (1987b). Word-Patterns and Story-Shapes: The Statistical Analysis of Narrative Style. Literary and Linguistic Computing, 2(2), 61–70. https://doi.org/10.1093/LLC/2.2.61
Burrows, J. F. (1992). Not Unless You Ask Nicely: The Interpretative Nexus Between Analysis and Information. Literary and Linguistic Computing, 7(2), 91–109. https://doi.org/10.1093/LLC/7.2.91
Busch, V. (2019). Jack Unterweger. In Psychologie des Guten und Bösen (pp. 359–369). Berlin/Heidelberg: Springer.
Coulmas, F. (1979). On the sociolinguistic relevance of routine formulae. Journal of Pragmatics, 3(3–4), 239–266. https://doi.org/10.1016/0378-2166(79)90033-X
Coulthard, M. (2004). Author identification, idiolect and linguistic uniqueness. Applied Linguistics, 25(4), 431–447.
Coulthard, M. (2007). By their words shall ye know them: on linguistic identity. In C. R. Caldas-Coulthard & R. Iedema (Eds.), Identity trouble (pp. 143–155). Palgrave Macmillan.
Dern, C. (2008). “Wenn zahle nix dann geht dir schlecht” – ein Experiment zu sprachlichen Verstellungsstrategien in Erpresserbriefen. Zeitschrift für germanistische Linguistik, 36(2), 240–265. https://doi.org/10.1515/ZGL.2008.018
DWDS Kernkorpus (2023). DWDS Kernkorpus (1900-1999). Available online: https://dwds.de/d/korpora/kern (accessed February 13th, 2023)
Eder, M. (2013). Does size matter? Authorship attribution, small samples, big problem. Digital Scholarship in the Humanities, 1–16. https://doi.org/10.1093/llc/fqt066
Eder, M. (2017). Short samples in authorship attribution: a new approach. DH, 1–5.
Eder, M., Rybicki, J., Kestemont, M. & Pielstrom, S. (2022). “Stylo”: a package for stylometric analyses. Available at: https://github.com/computationalstylistics/stylo_howto [accessed August 16th, 2022]
Ehrhardt, S. (2018). Authorship attribution analysis. In M. Rathert & J. Visconti (Eds.), Handbook of Communication in the Legal Sphere (pp. 169–200). Berlin: De Gruyter. https://doi.org/10.1515/9781614514664-008
Fine, J. (2006). Language in psychiatry. A handbook of clinical practice. London: Equinox.
Fobbe, E. (2011). Forensische Linguistik. Tübingen: Narr.
Fobbe, E. (2020). Text-Linguistic Analysis in Forensic Authorship Attribution. International Journal of Language & Law, 9, 93–114.
Fobbe, E. (2021). Forensische Linguistik. Eine kriminaltechnische Disziplin in Deutschland. SIAK – Zeitschrift für Polizeiwissenschaft und polizeiliche Praxis, 4, 18-27. http://dx.doi.org/10.7396/2021_4_B
Fobbe, E. (2022). Authorship identification. In V. Guillén-Nieto & D. Stein (Eds.), Language as evidence. Doing forensic linguistics. (pp. 185–218). Palgrave Macmillan.
Grant, T., & Baker, K. (2001). Identifying reliable, valid markers of authorship: a response to Chaski. International Journal of Speech, Language & the Law, 8(1), 1350–1771.
Grant, T., & MacLeod, N. (2018). Resources and constraints in linguistic identity performance: a theory of authorship. Language and Law / Linguagem e Direito, 5(1), 80–96.
Gries, S. (2022). How to use statistics in quantitative corpus linguistics. In A. O’Keeffe & M. J. McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 168–181). London/New York: Routledge.
Gries, S. & Stefanowitsch, A. (2010). Cluster analysis and the identification of collexeme classes. In S. Rice & J. Newman (Eds.), Empirical and experimental methods in cognitive/functional research (pp. 73–90). CSLI Publications.
Helt, M. (2001). A multi-dimensional comparison of British and American spoken English. In D. Biber & S. Conrad (Eds.), Variation in English: multi-dimensional studies (pp. 171–183). Harlow: Pearson Education/Longman.
Hennig, M. (2020). Nominalstil. Tübingen: narr.
Herring, S. C., & Paolillo, J. C. (2006). Gender and genre variation in weblogs. Journal of Sociolinguistics, 10(4), 439–459. https://doi.org/10.1111/J.1467-9841.2006.00287.X
Herwig, M. (2022). JACK. Gier frisst Schönheiten. ARD Podcast. https://www.ardaudiothek.de/sendung/jack-gier-frisst-schoenheiten/10385179/
Holmes, D. I. (1992). A Stylometric Analysis of Mormon Scripture and Related Texts. Journal of the Royal Statistical Society, 155(1), 91–120.
Holzer, E., & Reibenwein, M. (2022). Jack Unterweger: Der Popstar unter den Serienmördern. https://kurier.at/chronik/oesterreich/jack-unterweger-der-popstar-unter-den-serienmoerdern/401888249
Hoover, D. L. (2004). Testing Burrows’s Delta. Literary and Linguistic Computing, 19(4), 453–475.
Johnson, A., & Wright, D. (2014). Identifying idiolect in forensic authorship attribution: an n-gram textbite approach. Language and Law / Linguagem e Direito, 1(1), 37–69.
Juola, P. (2012). Detecting stylistic deception. Proceedings of the workshop on computational approaches to deception detection, 91-96.
Juola, P. (2021). Verifying authorship for forensic purposes: A computational protocol and its validation. Forensic Science International, 325, 1–11. https://doi.org/10.1016/J.FORSCIINT.2021.110824
Kestemont, M. (2014). Function Words in Authorship Attribution From Black Magic to Theory? Proceedings of the 3rd Workshop on Computational Linguistics for Literature, 59–66.
Koppel, M., Schler, J., & Argamon, S. (2011). Authorship attribution in the wild. Language Resources and Evaluation, 45(1), 83–94. https://doi.org/10.1007/s10579-009-9111-2
Koppel, M., Schler, J., & Argamon, S. (2009). Computational methods in authorship attribution. Journal of the American Society for Information Science and Technology, 60(1), 9–26. https://doi.org/10.1002/asi.20961
Larner, S. (2014). A preliminary investigation into the use of fixed formulaic sequences as a marker of authorship. International Journal of Speech, Language and the Law, 21(1), 1–22. https://doi.org/10.1558/ijsll.v21i1.1
Layton, R., Watters, P. A., & Dazeley, R. (2015). Authorship analysis of aliases: Does topic influence accuracy? Natural Language Engineering, 21(4), 497–518.
Leake, J. (2010). Der Mann aus dem Fegefeuer: das Doppelleben des Jack Unterweger. Heyne.
Locker, A. (2019). “Because the computer said so!” Can computational authorship analysis be trusted? Journal of Language Works, 4(1), 23–37.
Marko, K., Reitbauer, M., & Pickl, G. (2022). Same person, different platform: challenges and implications for forensic authorship analysis. An exploratory study of Instagram and Twitter users. Register Studies, 4(2), 202-231. https://doi.org/10.1075/rs.22006.mar
McMenamin, G. (2002). Forensic stylistics. CRC Press.
Milroy, J. & Milroy, L. (1998). Varieties and variation. In F. Coulmas (Ed.), The Handbook of sociolinguistics (pp. 47–64). Malden: Blackwell.
Miranda García, A., & Calle Martín, J. (2007). Function Words in Authorship Attribution Studies. Literary and Linguistic Computing, 22(1), 49–66. https://doi.org/10.1093/llc/fql048
Moon, R. (1998). Fixed expressions and idioms in English: a corpus-based approach. Clarendon Press.
Mose, J. (2015, October 26). Jack Unterweger Interview im Gefängnis Stein Part 1 [Video]. YouTube. https://www.youtube.com/watch?v=kjYNVQjggto
Mosteller, F., & Wallace, D. (1984). Applied Bayesian and Classical Inference. New York, NY: Springer. https://doi.org/10.1007/978-1-4612-5256-6_1
Newman, M. L., Groom, C. J., Handelman, L. D., & Pennebaker, J. W. (2008). Gender differences in language use: An analysis of 14,000 text samples. Discourse Processes, 45(3), 211–236. https://doi.org/10.1080/01638530802073712
Nini, A. (2018a). An authorship analysis of the Jack the Ripper letters. Digital Scholarship in the Humanities, 33(3), 621–636. https://doi.org/10.1093/llc/fqx065
Nini, A. (2018b). Developing forensic authorship profiling. Language and Law / Linguagem e Direito, 5(2), 38–58.
Nini, A., & Grant, T. (2013). Bridging the gap between stylistic and cognitive approaches to authorship analysis using Systemic Functional Linguistics and multidimensional analysis. International Journal of Speech, Language & the Law, 20(2), 173–202.
Orvell, A., Gelman, S. & Kross, E. (2022). What “you” and “we” say about me: how small shifts in language reveal and empower fundamental shifts in perspective. Social and Personality Psychology Compass, 16(5), 1–11. https://doi.org/10.1111/spc3.12665
Overdorf, R., & Greenstadt, R. (2016). Blogs, Twitter Feeds, and Reddit Comments: Cross-domain Authorship Attribution. Proceedings on Privacy Enhancing Technologies, 2016(3), 155–171. https://doi.org/10.1515/popets-2016-0021
Pennebaker, J. (2011). The Secret Life of Pronouns: What Our Words Say about Us. Bloomsbury.
Peters, A. (1984). The Units of Language Acquisition. Cambridge: Cambridge University Press.
R Core Team (2022). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: https://www.R-project.org/ [Accessed August 14, 2022].
Rude, S., Gortner, E., & Pennebaker, J. (2004). Language use of depressed and depression-vulnerable college students. Cognition and Emotion, 18, 1121-1133.
Schmitt, N., Grandage, S., & Adolphs, S. (2004). Are corpus-derived recurrent clusters psycholinguistically valid? In N. Schmitt (Ed.), Formulaic sequences: Acquisition, processing, use (pp. 127–151). John Benjamins.
Solan, L. M. (2013). Intuition versus Algorithm: The Case of Forensic Authorship Attribution. Journal of Law and Policy, 21(2), 551-576.
Stamatatos, E. (2009). A survey of modern authorship attribution methods. Journal of the American Society for Information Science and Technology, 60(3), 538–556. https://doi.org/10.1002/ASI.21001
Stamatatos, E. (2013). On the Robustness of Authorship Attribution Based on Character N-gram Features. Journal of Law and Policy, 21, 421-439.
Swain, S., Mishra, G., & Sindhu, C. (2017). Recent approaches on authorship attribution techniques - an overview. International Conference on Electronics, Communication and Aerospace Technology, 557–566.
Turell, M. T. (2010). The use of textual, grammatical, and sociolinguistic evidence in forensic text comparison. The International Journal of Speech, Language and the Law, 17(2), 211–250.
Turell, M. T., & Gavaldà, N. (2013). Towards an index of idiolectal similitude (or distance) in forensic authorship analysis. Journal of Law and Policy, 21(2), 495–514.
Viding, E. (2019). Psychopathy. A very short introduction. New York: Oxford University Press.
Widler, Y. (2019). Psychiater Haller: “Gefährlich ist nicht der finstere Wald, sondern das eigene Heim” | kurier.at. https://kurier.at/chronik/oesterreich/profiler-haller-gefaehrlich-ist-nicht-der-finstere-wald-sondern-das-eigene-heim/400637114
Wortbrücke. (n.d.). Retrieved August 8, 2022, from https://www.onb.ac.at/oe-literaturzeitschriften/Wortbruecke/Wortbruecke.htm
Wray, A. (2002). Formulaic Language and the Lexicon. Cambridge University Press.
Wright, D. (2013). Stylistic variation within genre conventions in the Enron email corpus: Developing a text-sensitive methodology for authorship research. International Journal of Speech, Language and the Law, 20(1), 45–75. https://doi.org/10.1558/ijsll.v20i1.45
Wright, D. (2017). Using word n-grams to identify authors and idiolects. International Journal of Corpus Linguistics, 22(2), 212–241. https://doi.org/10.1075/ijcl.22.2.03wri
License
Copyright (c) 2024 Karoline Marko, Alesia Locker
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.