In Issue 2/2025

Introduction

In his article “The Risks of Researching Syntactic Phenomena in Edited Parliamentary Transcripts” (Sigurðsson 2025), Sigurðsson highlights potential challenges in studying linguistic variation using written transcripts of political speeches. He illustrates these issues with reference to our research methodology in a single case study (Stefánsdóttir and Ingason 2022) and to the wider European Research Council (ERC) project from which it derives. Sigurðsson’s first concern is that some speeches may have been written by others and then read aloud by the Member of Parliament (MP). His second and main claim is that edited transcripts differ from what was actually said in parliament. We agree that both concerns are crucial, which is why we addressed them in developing our research methods. In what follows, we explain why our methodology is appropriate and respond to the points raised by Sigurðsson.

The EILisCh project

Our ongoing ERC-funded project Explaining Individual Lifespan Change (EILisCh) examines sociolinguistic style-shifting of Icelandic MPs over time. Style-shifting is operationalised here as the presence or absence of Stylistic Fronting (SF), a sociolinguistic variable indexing formality (Mechler et al. 2025: 314). Since analyses rely on a large-scale dataset (n=574,913), examples of SF were automatically extracted from written transcripts of speeches. Automatic coding facilitates fast processing times; however, it is crucial to ensure its accuracy. This entails comparing the written transcripts with audio files of the speeches. To complement the quantitative analysis, we also conduct qualitative interviews with several MPs. Here, we compare automatic coding based on transcripts with manually corrected data derived from the corresponding audio files from 20 MPs in the Icelandic Gigaword Corpus (Steingrímsson et al. 2018).

Error analysis of written transcripts versus audio or video recordings

As Sigurðsson (2025) acknowledges, MPs rarely read from prepared speeches. An example is MP Steingrímur Sigfússon, who confirmed in two interviews that he rarely relied on prepared speeches: “For the last 20 years or so, I hardly ever read from a written speech – except when a formal introductory speech for a bill had been prepared. I might have had some notes for policy statements and the general debate speeches, but I usually drifted off that and just spoke off the cuff.” Discussing potential issues and contextualising the data in this way is the reason why interviews with individual MPs are crucial in our mixed-methods approach.

We also know that ministers have more prepared speeches than regular MPs, and we therefore include an MP’s position in our statistical models. However, having pre-written speeches does not automatically mean that MPs read the speeches as written. As mentioned above, MPs still make their own linguistic choices in the moment. This trend is even more prevalent among ministers. In our interviews, many MPs commented that as ministers they were more confident and thus relied less on pre-written speeches, though they had more prepared speeches in this role. Therefore, we can reliably consider the transcripts as sources for an MP’s language, in this case that of MP Steingrímur Sigfússon.

As for the second point, we will demonstrate that automatic coding effectively traces trends and changes across years, exemplified first by the error analysis of Sigfússon. Figure 1 illustrates the SF distribution (n=3,115) for Sigfússon from 2005 to 2021 as reported by automatic and manual coding. Both types of coding trace his malleable SF use well, revealing intricate details of his SF trajectory across time. Crucially, this is supported by our statistical analysis, where the interaction between year and method type is not significant in a generalised linear mixed model. For further verification, we examined an additional sample of 20 MPs, analysed below.
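
To make the role of the interaction term concrete, one possible way of writing such a model is sketched here. This is an illustration only: the logit link, the coding of the predictors, and the random-effects grouping g are assumptions, since the text states only that a generalised linear mixed model with a year by method interaction was fitted.

```latex
% Illustrative form only; link function and random-effects structure are assumed.
\operatorname{logit} P(\mathrm{SF}_i = 1)
  = \beta_0
  + \beta_1\,\mathrm{year}_i
  + \beta_2\,\mathrm{method}_i
  + \beta_3\,(\mathrm{year}_i \times \mathrm{method}_i)
  + u_{g(i)},
\qquad u_g \sim \mathcal{N}(0, \sigma_u^2)
```

Under this reading, method is 0 for manual and 1 for automatic coding, so a non-significant \beta_3 means that no difference was detected between the two methods in how the SF rate changes from year to year.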

Figure 1: Individual error analysis for Steingrímur Sigfússon (n=3,115), comparing automatic and manual coding (2005–2021).

The EILisCh project aims to examine how politicians and entire political parties behave linguistically across time. Figure 2 visualises the aggregated error analysis for 20 MPs (n=50,192), comparing automatic and manual coding between 2005 and 2021. The automatic coding slightly overestimates the rate of SF, indicating that MPs’ actual speeches contained fewer instances of SF than in the transcripts.
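
The comparison behind such an error analysis can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's actual pipeline: the record layout, the name `sf_rates_by_year`, and the toy values are hypothetical and chosen only to show how per-year SF rates from the two coding methods can be set side by side.

```python
from collections import defaultdict

# Hypothetical toy records, not project data: (year, auto_sf, manual_sf).
# auto_sf   = 1 if the transcript-based automatic coding marked SF, else 0
# manual_sf = 1 if the audio-based manual check found SF, else 0
TOY_TOKENS = [
    (2005, 1, 1), (2005, 1, 0), (2005, 0, 0), (2005, 1, 1),
    (2006, 1, 1), (2006, 0, 0), (2006, 0, 0), (2006, 1, 0),
]


def sf_rates_by_year(tokens):
    """Return {year: (auto_rate, manual_rate, auto_minus_manual)}."""
    counts = defaultdict(lambda: [0, 0, 0])  # [n_tokens, auto_sf, manual_sf]
    for year, auto_sf, manual_sf in tokens:
        entry = counts[year]
        entry[0] += 1
        entry[1] += auto_sf
        entry[2] += manual_sf

    rates = {}
    for year in sorted(counts):
        n, auto_total, manual_total = counts[year]
        rates[year] = (
            auto_total / n,
            manual_total / n,
            (auto_total - manual_total) / n,
        )
    return rates


if __name__ == "__main__":
    for year, (auto_rate, manual_rate, gap) in sf_rates_by_year(TOY_TOKENS).items():
        print(f"{year}: automatic={auto_rate:.2f} manual={manual_rate:.2f} gap={gap:+.2f}")
```

In this toy input the gap is positive in both years, which mirrors the pattern described above: a positive gap means the transcript-based coding reports more SF than was found in the recording.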

Sigurðsson (2025) is correct in claiming that the MPs were made to sound more formal by editors in parliament. However, this is limited to very few instances, which do not make a statistical difference (the interaction between year and method type is not significant in a regression model). The essential point for us is that both coding methods trace linguistic change effectively across time (see Figures 1 and 2).

Figure 2: Error analysis for all 20 MPs combined (n=50,192), comparing automatic and manual coding (2005–2021).

Sigurðsson (2025) argues that our research is limited to drawing conclusions on the “editorial policy rather than MPs’ language”, but our research can indeed add new evidence on both: it informs our understanding of the process of editing in Alþingi by comparing automatic and manual coding, as we have done here. Because this comparison reveals that the automatic coding is reliable, we can also investigate linguistic trajectories across time.

Our analysis confirms that automatic coding provides reliable evidence of MPs’ linguistic behaviour over time. By comparing automatic and manual coding, we also contribute to understanding editorial practices in Alþingi. Together, these findings demonstrate that the EILisCh methodology is well suited to tracing individual lifespan changes through linguistic trajectories.

Conclusion

In response to Sigurðsson (2025), we have demonstrated the reliability of written parliamentary transcripts, which, alongside interviews with the MPs, form the primary data source for the EILisCh project. We find that SF rates based on the manually corrected coding generally correspond to, though are slightly lower than, those obtained through automatic coding. Thanks to the meticulous work of the Alþingi editors, both transcripts and recordings provide dependable evidence for tracing linguistic variation and malleability in political speech over time.

Johanna Mechler is a postdoctoral researcher at the University of Iceland within the EILisCh project. Lilja Björk Stefánsdóttir is a project manager and PhD student at the University of Iceland. Anton Karl Ingason is a professor at the University of Iceland and principal investigator of the EILisCh project. All authors work at the School of Humanities, where they focus on language variation and change in Icelandic.

Acknowledgments

This project is supported by a grant from the European Research Council (ERC), project ID 101117824. We would like to thank the transcribers, without whom this paper would not have been possible.

References

Mechler, Johanna, Lilja Björk Stefánsdóttir, and Anton Karl Ingason. 2025. Language use of political parties over time: Stylistic Fronting in the Icelandic Gigaword Corpus. In Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, 313–318.

Nikolaev, Dmitry, and Sean Papay. 2025. Strategies for political-statement segmentation and labelling in unstructured text. In Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, 437–451.

Sigurðsson, Kristján F. 2025. The risks of researching syntactic phenomena in edited parliamentary transcripts. Tiro 1. URL: https://tiro.intersteno.org/2025/06/the-risks-of-researching-syntactic-phenomena-in-edited-parliamentary-transcripts/.

Stefánsdóttir, Lilja Björk, and Anton Karl Ingason. 2022. Einstaklingsbundin lífsleiðarbreyting. Þróun stílfærslu í þingræðum Steingríms J. Sigfússonar. [Individual lifespan change. The evolution of Stylistic Fronting in the parliament speeches of Steingrímur J. Sigfússon.] Íslenskt mál 44:151–178.

Steingrímsson, Steinþór, Sigrún Helgadóttir, Eiríkur Rögnvaldsson, Starkaður Barkarson, and Jón Guðnason. 2018. Risamálheild: A very large Icelandic text corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), 4361–4366.
