This week I got my first opportunity to try out corpus linguistics in the classroom. It was a class of technically oriented students (computers and media), and it was a small enough group that I could get them all in front of a computer for a bit, so I thought I'd give it a go.
(See below for a brief summary of what corpus linguistics actually is if you don't already know.)
I didn't think to check beforehand whether they understood the concept of regular expressions (a computing term, not a linguistic one). Not a major mistake. It turned out that they hadn't been taught them in their courses, so I ended up teaching a little bit about regular expressions. There's nothing wrong with teaching a bit of computing in an English class, as long as you are teaching it through English, after all!
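For anyone who hasn't met regular expressions either, here is a rough idea of what one looks like. It's a minimal sketch in Python, not the search syntax of any particular concordancer (those tend to have their own conventions), and the sample sentence and the pattern are made up purely for illustration.

```python
import re

# Made-up sample text, purely for illustration.
text = "She looks tired. He looked up the word and is still looking it up."

# \b marks a word boundary; (?:s|ed|ing)? optionally matches one ending,
# so a single pattern catches several forms of the same verb.
pattern = re.compile(r"\blook(?:s|ed|ing)?\b", re.IGNORECASE)

matches = pattern.findall(text)
print(matches)
```

The point is simply that one compact pattern can stand in for a whole family of word forms, which is what makes regular expressions handy for searching a corpus.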
I hadn't prepared enough good examples of my own to guarantee anything interesting, as I wanted to work with whatever the students suggested. Why? Well, the whole point of corpus linguistics is that it's full of things you don't expect, and wouldn't be able to guess. With the first class, this resulted in findings so dull that I can't even remember what words we used. But in the second, someone said "amazing" (which may well have been a sarcastic reaction to my geeky enthusiasm for corpus linguistics!) and I searched for it in the British National Corpus. As I was looking at the computer screen and reading out a few of the words appearing around "amazing", I spotted a pattern: beach... bar... hotel... wait! The word "amazing" appears very frequently in adverts for package holidays. You learn something new every day.
So the spontaneous examples from the class are definitely a good thing, but next time I'll have a list of other examples that show interesting results.
Overall, though, I felt the lesson went really well for a first attempt. I focused on two tasks, the first of which was shamelessly ripped off from the first assessed task I carried out with a corpus back at uni: checking the frequency of occurrence of "must", "have to", and "'ve got to" in English, then drilling down to see differences in register. The second task was far more freeform and exploratory, asking them to look for common phrasal verbs. It was far more of an open-ended task than I would usually set, and I was quite unsure of myself setting it. It worked well with the first group and not so well with the second. Basically, there wasn't enough support to kick off the phrasal verb task. I should have given them a more gradual introduction by starting with a specific phrasal verb, then asking them to find phrasal verbs with a specific verb, then verbs that go with a particular particle, and only then leaving them to explore openly for the last 20 minutes or so.
But, yeah... if I ever find myself in front of a class who that sort of thing would appeal to, and where the facilities are available, I'll give it another go.
What is corpus linguistics?
"Corpus linguistics" is the analysis of a large body (corpus) of texts using computers. It allows us to search for patterns in language statistically, rather than relying on our intuition or simply trusting the grammar book. You use a piece of software called a concordancer to extract the information, and there's a great concordancer free on Brigham Young University's website, with access to several different English-language corpora, as well as Spanish and Portuguese ones.
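To give a flavour of what a concordancer does, here is a toy sketch in Python. It is not how the Brigham Young University tool works, and the sample text is invented rather than taken from any real corpus; the function names `concordance` and `count_forms` are my own. It shows the two basic moves from the lesson: listing a word with its surrounding context, and counting how often competing forms such as "must", "have to" and "'ve got to" occur.

```python
import re

# Invented sample text; a real corpus would be millions of words long.
text = ("You must try the amazing beach bar. I have to go now, "
        "and I've got to pack. We must not forget the hotel.")

def concordance(text, pattern, width=30):
    """Return one keyword-in-context line per regex match in text."""
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

def count_forms(text, patterns):
    """Count the matches for each labelled regex pattern."""
    return {label: len(re.findall(p, text, flags=re.IGNORECASE))
            for label, p in patterns.items()}

# Simplified patterns: only the literal forms, no "has to" or "had to".
modal_patterns = {
    "must": r"\bmust\b",
    "have to": r"\bhave\s+to\b",
    "'ve got to": r"'ve\s+got\s+to\b",
}

kwic = concordance(text, r"\bamazing\b")
counts = count_forms(text, modal_patterns)

for line in kwic:
    print(line)
print(counts)
```

A real concordancer also lets you filter by register (spoken, fiction, academic and so on), which this sketch doesn't attempt.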