A Corpus-Based Analysis of Using Function Words in English Forensic Authorship Attribution - A Case of Political Journalism Disputes

by: Khalid Shakir Hussein, Eman Abdul Kareem

GRIN Verlag, 2018

ISBN: 9783668637474, 124 pages

Format: PDF

Copy protection: none

Windows PC, Mac OSX, for all DRM-capable e-readers, Apple iPad, Android tablet PCs

Price: 36.99 EUR

More about the content

Case study from the year 2017 in the subject English Language and Literature Studies - Linguistics, language: English. Abstract: Advances in computational linguistics and statistics have had a clear impact on the emergence of corpus linguistics and on the sophistication of its applications, which now address not only purely linguistic issues but also real-life problems. One such area is authorship attribution, a domain of study concerned with identifying the most likely author of an anonymous or disputed document from a set of suspected authors. To this end, several methodologies, techniques, and approaches have been devised and repeatedly assessed on various data sets to verify their effectiveness. Although the literature shows no consensus on which methodology is best, all authorship attribution studies rest on the assumption that each author has a particular 'linguistic fingerprint' which can be captured by detecting and measuring the linguistic clues hidden in their authorial style. Adopting an experimental framework, this study attempts to gauge the discriminating and clustering power of the selected methodology on a particular type of data: samples of political journal articles. The corpus compiled is a special-purpose one, strictly controlled for genre, register, and date of publication. It comprises eleven samples extracted from eleven articles, ranging in length from 1,101 to 1,113 words; three are taken as test (hypothetically questioned) samples and the rest as training samples. The corpus represents the journalistic writings of four authors.

Khalid Shakir is a Professor of Corpus Stylistics at the University of Thi-Qar, College of Arts. He is interested in a variety of topics ranging from corpus approaches to authorship attribution and plagiarism to statistical investigations of linguistic and conceptual metaphors.