Introduction
This article builds on the article published in the last issue of Tiro, which explored the impact of using an automatic transcription system on grammatical corrections in Portuguese parliamentary plenary session reports (Pereira & Granja 2025). In this second article, we present the linguistic structures in more detail, together with the results we obtained and examples illustrating how oral syntactic structures were edited (or not) when transcribing speech into written form. We will also attempt to draw some preliminary conclusions about the influence of STAAR (the Automatic Transcription System for the Assembly of the Republic of Portugal) on parliamentary transcription.
As data, we selected four major types of Portuguese syntactic structures that are seen as problematic from the point of view of written standard prose. Our data consists of 729 cases from 198 sittings. These linguistic categories are examined in four sections below.
Omitted prepositions in verb complements
We examined the verbs concordar com (“to agree with”) and discordar de (“to disagree with”), with a complement in the form of a finite or infinitive clause. Speakers often omit the preposition com/de (“with/of”) or replace it with the conjunction que (“that”).
From a normative perspective, sentences in which these verbs take a nominal complement without a preposition are considered ungrammatical because of the verbs’ argument structure, and native speakers are unlikely to utter them (1). However, sentences in which a clausal complement is not preceded by a preposition are accepted and often produced (2).
(1) a. *Todos nós concordamos a opinião de que o jornalismo deve ser livre.
We all agree the opinion that journalism should be free.
b. *Nós não concordamos a proposta de que a votação passe para o final.
We do not agree the proposal that the vote be moved to the end.
(2) a. Todos concordamos que o jornalismo deve ser livre e independente. (DAR 79, XVI, 1.ª, 21/01/2025)
We all agree that journalism must be free and independent.
b. Nós não discordamos que a votação passe para o final… (DAR 35, XVI, 1.ª, 18/07/2024)
We do not disagree that the vote be moved to the end…
In this study, we analysed the occurrence of examples such as those in (2) because, from a descriptive point of view, sentences of this kind cannot be considered ungrammatical, merely non-standard, since speakers accept and frequently produce them.
We found that structures of this kind were mostly left unchanged in our official report. Reporters tend to accept the omission of the preposition when the verb is followed by a clausal complement. Of the 56 cases we identified, only 8 (14%) were altered in the final text.

Prepositions added in verb complements
The second problematic linguistic structure we analysed concerns the verb tornar-se (“to become”), which typically has a non-prepositional complement. Probably by analogy with the verb “transformar-se em” (“to turn into”), speakers incorrectly add the preposition em (“into”), resulting in the verb tornar-se taking a prepositional complement.
Of the 17 cases we found, only 1 (6%) was changed, so editors evidently deemed this structure even more acceptable than the previous one.

For example, the sentence in (3) was barely edited in the final text: commas were introduced, but the word no (a contraction of the preposition em and the definite article o) was kept. The sentence in (4a), however, is the edited version of the original utterance shown in (4b); here the editor correctly replaced the word num (“in a”, a contraction of em and the indefinite article um) with the bare indefinite article um (“a”):
(3) […] em Portugal, a habitação tornou-se no novo risco social.
[…] in Portugal, housing has become in the new social risk.
(DAR 51, XVI, 1.ª, 17/10/2024)
(4) a. O País em que vivemos tornou-se um absoluto descontrolo… [final text edited]
The country we live in has become a state of absolute disorder…
b. *O país em que vivemos tornou-se num absoluto descontrolo nos últimos anos… [original utterance]
The country we live in has become into a state of absolute disorder…
(DAR 4, XVII, 1.ª, 25/06/2025)
Misplaced clitic pronouns in verbal complexes
Thirdly, we analysed structures in which clitic pronouns are placed incorrectly within verbal complexes, that is, sequences of at least one auxiliary or modal verb and a main verb.
In these cases, the norm in European Portuguese is for the dative pronoun to attach enclitically to (i.e. after) the main verb. However, speakers commonly place the pronoun elsewhere. In the first pattern, the dative pronoun is placed before (i.e. in proclisis to) the auxiliary/modal verb (gostar, “like to”; querer, “want to”; or poder, “may”) instead of being attached, either before or after, to the main verb, as the norm would require.
Of the 103 cases identified, 31 (30%) were edited by linking the pronoun to the main verb rather than to the auxiliary verb, as originally uttered.
The examples in (5) show a case where this edit was made in the final text: the dative pronoun lhe (DAT.PRON) was moved to a position associated with the main verb (gostaria de lhe fazer) rather than the auxiliary verb (lhe gostaria de fazer). (On the linguistic glosses used here, see the endnote in Pereira & Granja 2025.)
(5) a. E aqui surge a segunda pergunta sobre esta parte que lhe gostaria de fazer. [original utterance]
[…] the second question about this part that DAT.PRON would-like to do
b. Aqui surge a segunda pergunta, sobre esta parte, que gostaria de lhe fazer. [final text edited]
[…] the second question about this part that would-like to DAT.PRON do
(DAR 40, XV, 2.ª, 24/01/2024)

Secondly, the dative pronoun is often placed after the auxiliary/modal verb, as in (6a), instead of after the main verb, as in (6b).
(6) a. Então vou lhe dizer o que aconteceu. [original utterance]
So I-will DAT.PRON tell what happened.
b. Então, vou dizer-lhe o que aconteceu. [final text edited]
So I-will tell-DAT.PRON what happened.
(DAR 138, XV, 1.ª, 06/06/2023)
We searched for examples with the auxiliary/modal verbs ir (“will”), querer (“want to”) and poder (“may”). Of the 545 cases we found, 360 (66%) were edited to link the pronoun to the main verb.

This is a very common syntactic structure in Portuguese. As our findings show, its acceptability varies considerably, probably depending on the auxiliary/modal verb used. Like many authors before us (e.g. Martins, 2016), we may conclude that the placement of dative clitic pronouns is a matter of debate in Portuguese: their position varies widely, as do the grammatical judgments of speakers and editors about it.
Using an initial infinitive verbal expression
Finally, we analysed the use of the infinitive verbal expression dizer que (“to say that”) at the beginning of a sentence, in place of a finite form such as “I (would like to) say that” or “I (want to) say that”. This usage is relatively recent and rare in Portuguese, but because it mostly occurs at the beginning of a sentence it is a very noticeable non-standard structure. Perhaps because it is perceived as a feature of a particular oral style, it is not always corrected. The example in (7), for instance, appears in the final text exactly as originally uttered:
(7) Dizer que sim, Sr. Presidente. Confirmar… (DAR 31, XVI, 1.ª, 28/06/2024)
To say yes, Mr. Speaker. To confirm…
When editors choose to correct this initial expression (8a), in most cases, they simply remove it (8b):
(8) a. Dizer que, neste momento […], também nos Açores está-se a discutir o orçamento regional. [original utterance]
To say that, at the moment, the regional budget is also being discussed in Açores.
b. Neste momento […], também nos Açores se está a discutir o orçamento regional. [final text edited]
At the moment, the regional budget is also being discussed in Açores.
(DAR 4, XVII, 1.ª, 25/06/2025)
Our findings show that, although these cases are not common, reporters/editors amend most of them: of the 8 cases found, 6 (75%) were edited.
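The counts reported in the sections above can be cross-checked with a short calculation. The following is only a minimal sketch in Python: the category labels are our own shorthand for illustration, and the overall rate it prints is derived from the reported counts rather than being a figure stated in the study.

```python
# Counts as reported in the sections above: (cases found, cases edited).
# The category labels are illustrative shorthand, not terms from the study.
COUNTS = {
    "omitted preposition (concordar/discordar)": (56, 8),
    "added preposition (tornar-se em)": (17, 1),
    "clitic before auxiliary/modal": (103, 31),
    "clitic after auxiliary/modal": (545, 360),
    "initial infinitive (dizer que)": (8, 6),
}


def edit_rate(found: int, edited: int) -> int:
    """Return the share of edited cases as a whole-number percentage."""
    return round(100 * edited / found)


def summarise(counts: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Map each category to its edit rate and add an overall figure."""
    rates = {
        label: edit_rate(found, edited)
        for label, (found, edited) in counts.items()
    }
    total_found = sum(found for found, _ in counts.values())
    total_edited = sum(edited for _, edited in counts.values())
    rates["all categories"] = edit_rate(total_found, total_edited)
    return rates


if __name__ == "__main__":
    for label, rate in summarise(COUNTS).items():
        print(f"{label}: {rate}%")
```

The five counts add up to the 729 cases mentioned in the introduction. The derived overall rate (about 56%) is dominated by the largest category, the clitic placed after the auxiliary/modal verb, so it should be read alongside the per-structure rates rather than instead of them.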

Preliminary conclusions
The literature on the transcription of parliamentary sessions (e.g. Voutilainen, 2023; Kawahara, 2024) suggests that, in general, fewer changes are now made to transcribed text, meaning that transcriptions are closer to verbatim than in the past. This could be one reason why reporters and editors now correct the structures in our data less often.
Another reason, as previously mentioned, is the evolution of language and the fact that the structures analysed lie on the borderline between grammatical and ungrammatical constructions. This may lead editors to respect the speaker’s linguistic choice, even when it is non-standard.
A third possible reason, which we intend to investigate by collecting more data, is that using an automatic transcription system influences reporters’ or editors’ grammatical judgments. Our hypothesis is that, when reporters are faced with text close to oral language that was produced by someone else (in this case, an automatic system), they are more likely to accept syntactic structures that lie on the boundary of grammaticality and therefore correct them less.
As mentioned in our prior article (Pereira & Granja 2025), this is only the first phase of our study. The first findings suggest that how often a non-standard structure is corrected varies considerably depending on the structure itself. For some structures, however, many more edits would be expected, as they are clearly ungrammatical, at least from a prescriptive point of view. In fact, the Portuguese language, like most Romance languages, frequently exhibits strong diamesic variation between everyday speech and formal written registers.
To reach firmer conclusions, we still need to collect transcripts from before STAAR, to check whether the trend towards fewer edits is confirmed and whether it is due to STAAR or simply to the evolution of language and acceptability standards. In any case, this work should help us discuss reporters’ agreement on editing judgments and possibly review style guides and training.
We believe that this project has already shown the benefits of AI in the transcription of parliamentary sessions. Without it, we would not have the time or personnel to conduct the study. We hope it will not only improve editorial practice in parliamentary reporting but also shed light on the evolution of language and on the broader impact of AI on linguistic norms.
Ana Rita Pereira and Paulo Granja work as parliamentary reporters at the Official Journal Division of the Parliament of Portugal.
References
Granja, P. (2023). The Portuguese Parliamentary Reporters’ Experience with Automatic Speech Recognition Systems. Tiro 2/2023. URL: https://tiro.intersteno.org/2023/12/the-portuguese-parliamentary-reporters-experience-with-automatic-speech-recognition-systems/
Martins, A. M. (2016). A colocação dos pronomes clíticos em sincronia e diacronia. In A. M. Martins & E. Carrilho (eds.), Manual de Linguística Portuguesa, pp. 401–430. Berlin/Boston: De Gruyter.
Nascimento, P., J.C. Ferreira & F. Batista (2024). Automatic transcription system for parliamentary debates in the context of assembly of the republic of Portugal. International Journal of Speech Technology 27, 613–635. URL: https://doi.org/10.1007/s10772-024-10126-4
Pereira, A. R. & P. Granja (2025). The Influence of AI on Grammatical Correction in Portuguese Parliament Plenary Session Reports – First Observations. URL: https://tiro.intersteno.org/2025/12/the-influence-of-ai-on-grammatical-correction-in-portuguese-parliament-plenary-session-reports-first-observations/
Voutilainen, E. (2023). Written representation of spoken interaction in the official parliamentary transcripts of the Finnish Parliament. Frontiers in Communication 8: 1047799. URL: https://www.frontiersin.org/journals/communication/articles/10.3389/fcomm.2023.1047799/full

