Corpus guidance: a tool for understanding professional language usage, change and variety
Manuscript editors, translators, translation revisers – and many other wordface workers such as copywriters and journalists – have been encouraged to use corpora to clarify how words fit together in all kinds of text types they do not personally read, hear or produce on a daily basis. Monolingual target-language corpora can be valuable for expanding our personally or professionally constrained idiolects and for combating language shrinkage or interference if we live abroad. Or they can simply teach us humility and caution when we might over-edit an author’s or fellow translator’s prose.
This updated MET workshop will briefly review the types of corpora available on the web or that can be fairly easily compiled for a specific discipline or client setting. Criticisms leveled against a corpus-guidance approach (such as concerns about quality of language input, obsolescence, or the difficulty of making bespoke corpora) will be acknowledged and discussed.
We will then focus on corpus-guided decision-making, to clarify when and how a well-chosen corpus can be queried quickly or when another research tool might be better. We’ll use worksheets showing concordances (the main output of a corpus analysis tool) and practice inspecting and interpreting filtered and unfiltered search results. We will even touch on how to draw information from "negative findings". This is a "laptop workshop", so we’ll also work hands-on with bespoke corpora (medicine and engineering) on a desktop analysis tool (AntConc) and dip into a vast collection of tagged corpora that can be used free online and filtered by discipline and register. The large American English portion of that collection is continually updated. A final handout will include an annotated list of some other tools that can currently be recommended.
Finally, time will be set aside to assess participants’ fresh work-based language questions that might or might not be answerable through corpus guidance. Registrants are therefore invited to send their questions about English usage to the facilitator as they arise over the next three months. We’ll discuss how and where (on which corpus) to pose the questions and make some decisions together.
Developer and facilitator: Mary Ellen Kerans. (This update was built on work done with Ailish Maher and Stephen Waller. That collaboration is gratefully acknowledged.)
Purposes: 1) To understand what various corpora can and can’t offer. 2) To learn how to pose a query and obtain an output appropriate for interpretation (including use of wildcards, simple ways to alter a concordance to focus on different aspects of context, and how to limit context on large corpora). 3) To learn about the merits and drawbacks of some of the online corpora now available.
Outcome: A better sense of when to go first to a corpus (and which one), how to pose the question, and how to answer it.
Who should attend? Newly specializing or experienced wordface workers who’ve heard of corpus guidance but haven’t acquired the habit of using it. Anyone concerned about the effects of language attrition or how our personal reading preferences might affect our ideas about language. Finally, those who collaborate with authors on texts, such as authors’ editors or translators, teams sharing large projects, translation revisers and publishers’ copy editors – because teams can use corpus guidance to help resolve the clashes of dialect and idiolect that sometimes arise.
Pre-workshop information and open-access pre-reading:
1) Participants should come with a laptop able to access a WiFi network.
2) A pre-workshop pack will be sent to participants about two weeks before the workshop. It will be helpful to look over the tasks briefly.
3) Shortly before his death in 2007, one of the pioneers of corpus analysis, John Sinclair, wrote a succinct, useful book chapter summarizing the principles of corpus building and analysis in plain terms and answering questions that many newcomers ask.
4) In 2008, the three early developers of MET’s evolving corpus workshops wrote an article about the concepts we emphasized. Those concepts are still relevant today.
Maher A, Waller S, Kerans ME. Acquiring or enhancing a translation specialism: the monolingual corpus-guided approach. Journal of Specialised Translation 2008;10:56-75. Translator Kevin Lossner reflected on practical applications of that article in his blog on 27 November 2008.
5) An article written for EMWA by the facilitator discusses how corpus guidance can be used to quickly reach consensus when arguments arise in a team of manuscript editors or between them and their clients.
Kerans ME. Grammarians or linguists? On using language corpus data to guide usage. The Write Stuff 2006;15:89-92.
About the facilitator: Mary Ellen Kerans has taught English, English for specific purposes (where corpora are often used), and writing in a variety of university and non-university settings. She has been an authors' editor and translator since the late 1980s, mainly but not exclusively in clinical medicine. She also has experience with journal copyediting of manuscripts from authors who use English as an additional language. In the last few years, she's begun editing fiction and has found corpus guidance to be useful there too. Her MA from Teachers College, Columbia University, is in TESOL (teaching English to speakers of other languages). She has long lived and worked in Barcelona.