Language corpora in a technical writer’s daily work


You may often find yourself in a situation where you don’t know which word to use or which one would sound best. This can happen, for example, in the following situations:

  • You are trying to come up with a title for a chapter that describes new functionality.
  • You are writing an error message or any other piece of text that goes directly into the UI.
  • You are describing a non-standard procedure.

When you aren’t sure whether a given collocation is correct, a Google search is sometimes enough. Search for the expression as an exact match, that is, put the phrase in double quotes (“”). The number of results should give you a good idea of how frequently the phrase is used. If there are few results, try to find a better expression.
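If you want to compare several candidate phrases quickly, the exact-match search can be sketched in a few lines of Python. This is only a minimal illustration: the function name and the sample phrases are made up for this example, and the script merely builds the search URLs, so you still open them in a browser and read the result counts yourself.

```python
from urllib.parse import urlencode


def exact_match_url(phrase: str) -> str:
    """Build a Google search URL that looks for the phrase as an exact match."""
    quoted = f'"{phrase}"'  # double quotes request an exact match
    return "https://www.google.com/search?" + urlencode({"q": quoted})


# Hypothetical candidate phrases, used only for illustration.
for candidate in ("export wizard", "backup wizard"):
    print(exact_match_url(candidate))
```

Opening each printed URL shows the results for the quoted phrase, so you can compare how common the candidates are.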

If you are trying to figure out which expression would work best in your text and your mind goes totally blank, there is another useful solution: language corpora.

Which language corpus do I use?

Language corpora are mainly created, maintained, and used by universities. That’s why their UI can put you off at first, but don’t be discouraged: they are very powerful tools, especially for language professionals. 

There are many language corpora on the internet. You can check some of them out for yourself on a website that lists the most popular English corpora.

I use the iWeb Corpus because it shows collocates for all parts of speech at once. In some corpora, you need to enter a special shortcut, for example “[nn*]”, to search for collocates of a given part of speech. There’s always documentation, so it’s relatively easy to learn how a corpus works.

Language corpora for technical writing tasks

Here is an example of using the iWeb Corpus to find the right word. Suppose you have to figure out what to call a new wizard in your application. The main task of the wizard is to export data from your application to a file. Later on, you can use this data on another computer or import it to restore the state of the application.

Let’s assume that your first thought is to call the wizard “the export wizard”. However, you want to find some alternative ideas.

To use the iWeb Corpus effectively, you must register. There is a daily limit on the number of searches, but it’s relatively high, so don’t worry about it.

To search for collocates of the word “wizard”, follow the steps below: 

  1. Sign in to iWeb Corpus. 
  1. Go to the Search tab. 
  1. Above the search box, click Word.
  1. Enter the word you want to search for, in this case “wizard”. 
Search box where you can search for a word in a language corpus
  1. Click See detailed info for word.
    The iWeb Corpus displays detailed results for the word “wizard” including, for example: the meaning, synonyms, topic, collocates, related words, and clusters.
  1. Click Collocates.
Shows the results for the word "wizard" in a language corpus.

The first table shows how the word “wizard” collocates with nouns. Here you notice that the word “backup” appears higher in the table. It sounds even better than “export” and better describes the main objective of the new functionality:

Shows a table with different noun collocates for the word "wizard"

You decide to use “the backup wizard”. 
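To get a feel for what a collocates table represents, here is a toy sketch in Python that counts the words appearing directly before “wizard” in a few sample sentences. The sentences and the function name are invented for illustration, and this is not how the iWeb Corpus is implemented. Real corpus tools work on far larger data and may rank results differently.

```python
import re
from collections import Counter


def left_collocates(text: str, node: str, window: int = 1) -> Counter:
    """Count words that appear within `window` tokens to the left of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            counts.update(tokens[max(0, i - window):i])
    return counts


# Made-up sample sentences, for illustration only.
sample = (
    "Run the backup wizard before upgrading. "
    "The setup wizard opens on first launch. "
    "Use the backup wizard to save your data. "
    "The export wizard writes a file."
)

collocates = left_collocates(sample, "wizard")
for word, count in collocates.most_common():
    print(word, count)
```

In this tiny sample, “backup” comes out on top simply because it precedes “wizard” most often, which is the same kind of signal you read from the corpus table.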

By the way, let’s go back to the previous page with detailed info for the word “wizard”. You can look through this page to get other ideas for the documentation you are going to prepare.

Note how the word works in some language clusters. You will use this word frequently when describing the new functionality, so it can be useful to get more ideas. From this page, you find out that:

  • “In the wizard” is quite a popular cluster.
  • When referring to the steps of the wizard, the word “page” is used more frequently. Personally, I think “step” sounds better 🤷‍♀️, but the important thing is that both expressions are in use, so you can decide for yourself.
