The creation of Base Rate Knowledge of linguistic variables and the implementation of likelihood ratios to authorship attribution in forensic text comparison


  • Sheila Queralt


This article addresses one of the research challenges that Forensic Linguistics faces in the 21st century: comparing texts of unknown authorship with the same reliability as other disciplines that handle forensic evidence. It applies advanced statistical techniques to forensic text comparison in order to improve the reliability of linguistic evidence presented in court and to assess its significance. The first part of the analysis establishes Base Rate Knowledge for some of the most relevant linguistic variables in Peninsular Spanish texts. The second part applies statistical tests to the variables with discriminatory potential in order to attribute samples to their authors, and assesses the reliability of the results in a posteriori classification. The third part implements the likelihood-ratio framework, which improves the reliability of linguistic evidence provided in court and yields probabilistic results that assist not only the judge and jury but also the linguistic expert, who can use them to carry out more rigorous testing and more extensive performance analysis of the data.
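To illustrate the likelihood-ratio framework in general terms, the sketch below compares how probable an observation is if the suspect wrote the questioned text with how probable it is if someone else from the reference population did. It is a minimal illustration and not the article's method. The binomial model, the function names and the example rates are all assumptions introduced here.

```python
import math


def binomial_log_likelihood(k: int, n: int, rate: float) -> float:
    """Log-probability of seeing a variant k times in n opportunities.

    Assumes (for illustration only) that each opportunity is an
    independent trial with a fixed probability `rate`.
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must lie strictly between 0 and 1")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return (
        math.log(math.comb(n, k))
        + k * math.log(rate)
        + (n - k) * math.log(1.0 - rate)
    )


def likelihood_ratio(k: int, n: int, suspect_rate: float, base_rate: float) -> float:
    """LR = P(evidence | same author) / P(evidence | different author).

    suspect_rate: hypothetical rate estimated from the suspect's known texts.
    base_rate:    hypothetical rate of the variant in the reference population.
    """
    log_lr = binomial_log_likelihood(k, n, suspect_rate) - binomial_log_likelihood(
        k, n, base_rate
    )
    return math.exp(log_lr)


if __name__ == "__main__":
    # Hypothetical numbers: the variant appears 8 times in 10 opportunities.
    lr = likelihood_ratio(k=8, n=10, suspect_rate=0.8, base_rate=0.4)
    print(f"LR = {lr:.2f}")
```

A value above 1 supports the same-author hypothesis, a value below 1 supports the different-author hypothesis, and a value of 1 means the evidence does not discriminate between them. This shows why a population base rate is needed before a likelihood ratio can be computed.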