Lingua Frankly: 2015

19 August 2015

The folly of trying to pronounce a place "like the natives do"

Well-intentioned people often insist on trying to pronounce placenames in the "authentic" native form, even when there's a well-known variation in their own language.

At times, that change is remarkably successful, such as the change of "Peking" to "Beijing". There is an argument that this is futile, however, as the Chinese phonemes are rarely that close to English ones, and the tones of the Chinese are completely absent.

About a fortnight ago, the TV was marking the 70th anniversary of the brutal slaughter of the people of Hiroshima.

Now, most of us say "hiROshima", but a few people say "HEEroSHEEma". I thought about it a bit, and I figured that the first one is probably right, as the second one sounds like two words. I then looked it up on the internet, and felt a bit sheepish when I read that the name means "wide island" in Japanese. Two words? Oh. But then I brought up Forvo and nope -- it's pronounced as one word.

So why do we end up with two forms in English?

It's all about perception. There are multiple things that you might detect. First up, there's word stress. In Japanese, Hiroshima is stressed on the second syllable, which is how I pronounce it in English. However, a knock-on effect of English stress is that adjacent syllables are weakened, so the Is are both I-schwa in English. However, in Japanese, vowels are generally clear, and vowel reduction is a matter of length, not vowel quality.

When the English speaker's ear hears Hiroshima, it either notes the correct stress, and fails to perceive the "EE" sounds, or it hears the EE sounds and fails to perceive the correct stress.

Which of these is further from the original? From an English speaker's perspective, it's impossible to say -- you need to make reference to the original language. I do not know for sure, but as Japanese has far fewer vowels than English, I would imagine hiROshima is readily recognised for the intended meaning, and that HEEroSHEEma would be pretty hard to process.

So it's a bit of a fool's errand trying to be "authentic", in my book.

30 July 2015

Language following

Last week, I was at a party in Edinburgh to mark Peruvian independence day. As I was leaving, I heard someone refusing a drink because "tengo que manejar" -- "I have to drive".

Funnily enough, I've had a couple of discussions recently about that very word "drive". It all started with a discussion on a Welsh-language Facebook group. The traditional word presented there was gyrru, whereas people often tend to use the term dreifio, presented there as an Anglicism. Strangely enough, the very next day, I ran into an old university classmate of mine, Carwyn, who was up from Wales to visit a conference in Edinburgh. When I asked him which word he would use to say "drive", his answer was "probably the wrong one", which I immediately took to mean dreifio.

I explained to him why I felt that dreifio was less of an Anglicism than gyrru.

How so?

This is a phenomenon that I call "dictionary following", for want of a better term. (If there's a widely-accepted alternative name, please do let me know in the comments.) It's a peculiar form of language change that minority languages seem particularly prone to undergoing, where a word-form in one language gets locked to the changing meaning of a single equivalent in another language.
Edit: An Cionnfhaolach over at IrishLanguageForum.com tells me that this transference of all meanings for a word in one language to a similar word in another is called a "semantic loan".

In this case, the dictionary word gyrru is a word that means to spur animals onwards -- it's "drive" as in "driving cattle": what drovers do. The modern sense of "drive" comes via the idea of forcing carthorses forward, and thus the English word has broadened.

Across Europe, the equivalent word often evolved analogously. The French and Italian equivalent term is actually to "conduct" a car, and in Spanish, you either "conduct" or "handle" your car -- which is where manejar comes into the equation (manejar = manage = handle; mano = hand).

It's too easy to focus on the grammatical and lexical items as being the characteristics of a language, but if that is not underpinned by idiomatic usage and unique linguistic metaphors, then it doesn't feel like a complete language; and for me at least, much of the joy of learning and speaking that language is lost.

So for me, I'm happier to adopt the English "drive" morpheme into languages like Gaelic and Welsh than to adopt the English metaphor with a native root and claim that this is somehow "purer".

20 July 2015

Undefined article error

No, Blogger isn't on the blink, that's the intended title of the article.

The error in question is the continued use of classical terminology for grammatical articles: specifically the terms definite article and indefinite article. For over a decade, I tried to reconcile the grammatical feature with the common sense of the words "definite" and "indefinite" -- i.e. certain and uncertain -- but it made no sense at all.

It wasn't until I started discussing grammar in foreign languages that I clicked what I'd been missing all along -- the terms we use are basically a mistranslation of classical terminology.

The English word definite has diverged drastically from its etymological roots, but this is not true in the Romance languages on mainland Europe. When the French say défini or the Spanish say definido, what they are actually saying is defined.

That's right, the definite article is really the defined article, which means the indefinite article must be the undefined article. From that perspective, everything seems to make much more sense.

Plenty of languages survive quite well without any articles -- they are essentially redundant as even in English, in a lot of circumstances you can drop them without losing any information in the sentence.

What I'd never got my head round was that the articles don't add any information to the sentence -- they simply act as a sort of "signpost" to information that already exists elsewhere. But most importantly, that signpost refers to the listener's frame of reference and not the speaker's.

What the definite article flags up is essentially "you know which one I mean", and the indefinite article says "you don't know which one I mean". If I say "You should go home -- the wife'll be waiting," context says I'm talking about your wife, but if I say "I should go home -- the wife'll be waiting," then you know that I'm talking about my wife. And if I say "a friend of mine is coming to visit," I'm telling you that I don't expect you to know which one I'm talking about. But in both cases, if you delete the articles, I would still make the assumption of yours/mine or that I'm not sure in the second.

Now I know that isn't very clear, but to be honest, I still haven't got this that clear in my own head.

This "signposting" idea is pretty abstract, so describing it is pretty difficult. But to be fair, it's no more abstract than the phenomenon it's describing, and the more I think about articles, the more weird and abstract they look to me. For something at first glance so basic, they are incredibly complex.

I suppose I'll be working for years trying to work out the best way to teach, discuss and describe them, but for now I'll satisfy myself with using the terms defined and undefined in place of definite and indefinite, because at the very least we'll be one step closer to a meaningful definition.

12 April 2015

I would of written this headline properly, but...

I wanted to revisit an old theme today. A lot of people still complain about people writing would of instead of would have. There's a saying in linguistics: there's no such thing as a common error (for native speakers) because standard language is (or should be) a statistical norm of what people actually say or write, and a legitimate standard is one that accepts all common variations (hence modern English spellcheckers accepting both "all right" and "alright" -- and as if just to cause maximum embarrassment, the Firefox spellchecker doesn't like "alright"... or "spellchecker").

If people write "would of", it's because in their internal model of the English language, they do not see the verb "to have" in there at all. I was looking back at an earlier blog post on this topic, and I saw that I used the phrase "the "error" only occurs when have is used as a second auxiliary". Spot the mistake.

Standard Modern English clauses can only ever have one auxiliary -- there is no "I will can..." or "I would can...", you either have to switch to a copular construction ("I will be able to...") or inflect, eg can to could: I could tell him (if you want).

The have of the perfect aspect in English has traditionally been slightly ambiguous as to whether it's an auxiliary or not. Placement of adverbs gives us an indication of what's going on: "I always have time for it" is fine where "*~~I have always time for it~~" feels quite odd and stilted, whereas perfect have is perfectly OK with having such adverbs after it, which makes it look like an auxiliary: "I have always been lucky".

Negatives (and questions) take us further: "I don't have a car" is far more natural to many English speakers than "I haven't a car", but "*~~I don't have been to Russia~~" is clearly wrong, and "I haven't been to Russia" is the only possible correction.

So, let's say that the history of the perfect-aspect-have has been one of becoming more and more like the auxiliary verbs. English has, over time, lost the ability to have more than one auxiliary verb in a clause. Those two changes, taken in parallel, mean the construction "would have" is in the process of becoming impossible in English.

What do we have instead? Well, like I said before, I see it as the formation of a new suffix, one that is applied to auxiliary verbs to indicate perfect aspect.

I would argue that we already have one established, recognised auxiliary suffix in English: -ould. This first appeared as "would" (or rather "wolde"), the past form (both indicative and subjunctive) of "willan" (will). Notice that there are two changes here -- firstly the grammatical vowel change i->o (->ou), and the suffixing of past D. The same changes from first principles could describe shall giving us should, even though the exact vowel change is different, but cannot account for can giving us could, as the N->L change isn't typical in English. Furthermore, it is not a commonly observed pattern for people to spell could, would and should differently. Therefore -ould must be a single morpheme common to all three words.

If this is the case, then adding another suffix to that seems perfectly sensible, and we've got coulda, woulda, shoulda; or could've, would've, should've; or coodov, woodov, shoodov or however you want to write it.

Of course, this same perfective suffix can be applied to certain auxiliaries without the -ould suffix:

must: that must've been him etc.
will: he'll've been told by now

And yet "must" is already practically dead (we all use have to/have got to) in normal usage, leaving "will" rather isolated as the only non-ould auxiliary to take [ha]ve, so even that might slip out of usage fairly quickly.

The case for writing "have" is purely etymological, it doesn't fit the evidence from "mistakes", and it presents a rather more complex model of the language than the alternative I present. It's a complexity that is possible, but I believe only insofar as it is a transitional form between two stable conditions. I think we should let the language take that final evolutionary step to find a stable state.

02 April 2015

The UK's privatisation agenda hits immigrants and language students

I normally try not to put too much politics into a language blog, but this time it definitely deserves it. I have never been a fan of privatising public infrastructure, as it typically shifts the burden of cost to those who can least afford it. This case is no exception.

I discovered through a news story shared on Facebook this morning that the UK's Home Office is changing the English language prerequisites for visas. Previously, the SQA (the public sector exam board for Scotland) had an ESOL qualification that was recognised by the Home Office, but this will be struck off the list, which will now consist only of two exams -- the big ones, the expensive ones: Cambridge and Trinity.

The site reporting it, having nationalist inclinations, chose to focus on the angle that Westminster was trying to undermine Scotland's education sector. As a left-leaning site, though, they failed to spot the bigger picture: this is about privatisation.

The current UK government is determined to dismantle whatever public infrastructure remains to us, and leave the populace at the mercy of marketplace economics. (Which does make this a Westminster vs Holyrood issue, to an extent, as the Scottish Government is far less keen on privatisation.)

But anyone involved in the language teaching sector will know roughly how expensive the private sector exams are, and anyone teaching English in the UK will have seen firsthand how little their students can afford these tests.

Forcing more immigrants into expensive exams (which many criticise for not being a good measure of language ability anyway) is just making life harder for some of the most vulnerable members of our society, because make no mistake -- an immigrant is a member of our society, regardless of what the majority of politicians and newspapers tell us.

23 February 2015

Well-rounded learning...?

I just stumbled across a post from last year on the Guardian's language learning blog. It's by a guy trying to learn Spanish using only Duolingo, and I thought his feedback was quite interesting.

The author, Alan Haburchak, found himself struggling to internalise the grammar of Spanish as there is no conscious presentation of verb tables. I found similar problems using the course for German -- not with the verbs (I'd done most of the verb system via the Michel Thomas course already), but with the articles. The declension of the articles in German is more complex than verb conjugation, because there are so many sets of overlaps. There is no marking for gender for plural, but gender is marked in singular. The feminine declension matches the plural declension in 3 out of 4 cases. Masculine and neuter declensions match in genitive and dative, but not nominative and accusative... and the neuter nominative and accusative are the same as each other, but not the masculine ones. I'll stop now because I've probably lost you, and that's only half of it. And when you've finished with the definite articles, there are the indefinite ones too, and the adjective endings which are more complicated again. Certain patterns are shared, certain are distinct.
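To make the overlaps concrete, here is a minimal sketch that stores the standard German definite articles in a table and checks the claims above. The names `DEFINITE`, `shared_cases` and `forms_to_slots` are my own, invented for illustration, and the sketch covers only the definite articles, not the indefinite ones or the adjective endings.

```python
# Standard German definite articles, one tuple per gender/number column,
# ordered by case as listed in CASES.
CASES = ("nominative", "accusative", "dative", "genitive")

DEFINITE = {
    "masculine": ("der", "den", "dem", "des"),
    "feminine":  ("die", "die", "der", "der"),
    "neuter":    ("das", "das", "dem", "des"),
    "plural":    ("die", "die", "den", "der"),
}


def shared_cases(a, b):
    """Return the cases in which columns a and b use the same article."""
    return [case
            for case, x, y in zip(CASES, DEFINITE[a], DEFINITE[b])
            if x == y]


def forms_to_slots():
    """Map each surface form to every (column, case) slot it can fill."""
    slots = {}
    for column, forms in DEFINITE.items():
        for case, form in zip(CASES, forms):
            slots.setdefault(form, []).append((column, case))
    return slots
```

Calling `shared_cases("feminine", "plural")` gives the three matching cases, and `forms_to_slots()` shows that only six distinct forms cover all sixteen slots, so a bare "der" in an example sentence could be filling any of four of them. That is roughly why I think the pattern is so hard to infer from examples alone.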

Trying to learn such a complex and arbitrary pattern from examples is, I think, pretty much impossible. So much information is obscured (not least of all the gender of the noun) that you cannot generalise. The end result is an ability to read fairly comfortably, but not to reproduce.

Since I started looking at tables myself, I've found the German course much easier, but still not trivial. Alan's doubt was about whether this was a general problem with "naturalistic" learning, or just his own habits formed through school language classes. I could (and maybe should) have the same doubt, but I just cannot fathom how anyone would intuit these patterns without some conscious knowledge to help them sort through the tangle.

19 February 2015

Possessives and terminology

A couple of years ago, I wrote a post describing my objections to the traditional definitions of possessive adjectives and possessive pronouns. At the time, I still favoured calling what is called the possessive pronoun a "possessive adjective" (eg. "it is mine"), and calling the possessive adjective a "possessive pronoun" (eg "my car") for English... in theory.

However, practicality is another thing entirely, because there is nothing more confusing than using the same words as someone else to mean entirely the opposite thing, so I have never used the terms that way for students. In fact, I actively avoid using either term if I can possibly avoid it, as it doesn't seem helpful.

So recently I've been working on trying to find ways to better categorise grammar, and I've settled on what actually seems like a reasonable compromise.

For the possessive "adjective" of traditional grammar, I'm going with the alternative from the previous post -- the possessive determiner. It aligns with this, that, a and the, so it naturally falls into the class of determiners. This doesn't mean it isn't a pronominal form -- it actually means that forms like "John's" have to be considered determiners themselves... which is entirely logical, as "John's car" is "the car that belongs to John", but "John's" has replaced the definite article; hence "John's" must be a determiner anyway.

For the possessive "pronoun", I'm going with the possessive predicate. I decided on this when I was thinking about it as a predicative adjective in sentences like "It is mine," or "that's yours." Of course, that's not the only situation it occurs in, which is something I overlooked a little when blogging from the unbearable heat of Corsica in the summer. There is no predicate in "I'll give you mine," or "She didn't have hers, so she took yours." But at least it gets away from the counter-intuitive implication that the other form is not a pronoun.

I will continue to think about this, but if I do ever come up with a better term, I can just do a search-and-replace on what I've already got without any problems....

30 January 2015

The slow death of the MOOC continues

This morning, I checked my email like I always do, and Coursera were plugging their latest "specialization" -- one for so-called cloud computing.

Coursera specialisations were originally launched as a single certificate for a series of "signature track" (ie "paid for") courses, but there's always the free option alongside it.

So I was very surprised when I clicked on the link for more information about the specialisation, then clicked through to the course, and it was only offering the $49 paid-for version. Now I did go back later and track down the free version of the course by searching the course catalogue, but the notable thing was that you can't get to the free version by navigating from the information about the specialisation.

It's there -- it is -- but by making click-through impossible, they're actively trying to push people into the paid versions. This suggests that the business model isn't working, and it's not really much of a surprise -- there's no such thing as a free lunch, and the only free cheese is in the mousetrap.

Some of the universities seemed to be using the free courses as an advert for their accredited courses, but it's a very large and expensive way to advertise -- teaching thousands in order to get half-a-dozen extra seats filled on your masters programme -- and so really the only way to get money is to get more of the students to pay.

Is it worth it for the student?

Cloud Computing costs £150, and going by their time estimates, that's between 120 and 190 hours of work. The academic credit system here in Scotland says that ten hours of work is one "credit point", and there are 120 credits in a year. Timewise, the Cloud Computing specialisation is then roughly equivalent to a 15-point or 20-point course -- ie. a single "module" in a degree course. A 15-point module costs £227.50, and a 20-point module costs just over £300, so £150 for this seems like a pretty good deal. Of course, this is only the cost to students resident in Scotland to begin with, and it is controlled by law to stay artificially low -- in England, the basic rate would be £750 for a 15-point course or £1000 for a 20-point one, but many universities "top-up" their fees by half again: £1125 and £1500 respectively. And English universities are still cheaper than many of their American counterparts.
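For anyone who wants to check the sums, here is a rough sketch of the arithmetic in the paragraph above. The per-credit rates are derived from the 15-point prices quoted there and are assumed to scale linearly, which is my simplification (real fees may not); the names are invented for the example.

```python
HOURS_PER_CREDIT = 10      # Scottish credit system: ten hours per credit point
COURSERA_PRICE = 150.0     # quoted price of the specialisation, in pounds

# Per-credit rates in pounds, derived from the 15-point module prices in
# the post. Linear scaling to other module sizes is an assumption.
RATE_PER_CREDIT = {
    "Scotland": 227.50 / 15,
    "England (basic)": 750 / 15,
    "England (topped up)": 1125 / 15,
}


def credits_from_hours(hours):
    """Convert estimated study hours into credit points."""
    return hours / HOURS_PER_CREDIT


def module_cost(region, credits):
    """Estimated university fee for a module of the given size."""
    return RATE_PER_CREDIT[region] * credits


def coursera_share(region, credits):
    """Coursera price as a fraction of the university equivalent."""
    return COURSERA_PRICE / module_cost(region, credits)
```

With these figures, 120 to 190 hours comes out at 12 to 19 credits, and `coursera_share("England (topped up)", 20)` is one tenth, while the Scottish comparison lands at around a half to two-thirds.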

So the Coursera specialization could be half the price of a university equivalent, or a tenth, or even less, depending on where you live. Sounds like a good deal, right?

Sadly, though, the certificates are worthless -- almost all the institutions offering courses through Coursera (and EdX, and FutureLearn) are allowed to accredit their own courses for university credit, but they choose not to. If they accredited a £30 course as university-level study, they'd be competing against themselves, and they'd kill the market for their established distance courses, and perhaps even their on-campus courses.
If they can run a course for £150, is there any justification for their usual high prices? Well... yes. Coursera is on a freemium model (free for basic use, pay for "premium" services), but in reality everything on Coursera is still the "free" part of the freemium. The online-only courses are not viable for universities for a number of reasons, so it's the fully-accredited courses run by the universities themselves that make it possible for the universities to offer the cheap courses "on the side", using repurposed versions of their existing course materials.

Technology and knowledge sharing can and should be used to reduce the cost of education. When I studied languages with the Open University, I looked at the cost of the course I was taking, vs equivalent unaccredited alternatives -- I could have bought equivalent books and spent more time with a one-on-one teacher than I did in group tutorials, and still only spent half of the money I did with the OU. If I hadn't wanted to get the degree, it would have made no sense at all to continue with them, but I want to teach in schools, so I need the degree.

So yes, there is undoubtedly unnecessary expense in education and there's a lot of "fat" that could be trimmed away, but the Coursera model won't do it, and for now it remains something of a distraction -- a shiny object that draws our attention away from the real problems and solutions.

23 January 2015

The militant wing of immersion and the examiner's dilemma.

So, last time I was talking about a discussion I had with a communicative approach hardliner. A couple of days later, I had a new student ask for exam prep classes, so I got out my exam prep materials and had a quick look over them to remind myself of the specifics of the Cambridge Advanced exam, and I very quickly remembered something else from Sunday's conversation.

One of his big bugbears about the Scottish education system was that the foreign language exams all have their instructions in English. This, of course, is a natural consequence of the belief in immersion above all else -- if language must be immersive, then native-language task instructions clearly break the immersion, and therefore burst the language "bubble".

But here's the thing: when I prepare students for an exam, I explain the language task to them and then practise it over and over. By the time my students reach the exam, they don't need to read the instructions. Now the exams I prepare people for are international exams, so economies of scale dictate that the exam questions stay in English. My students go into the exam, don't need to spend time reading and understanding the question and can instead focus on carrying out the actual task that is set for them.

But there are people who don't do a lot of preparation for exams, and will go in and need to read the task. Sometimes they misunderstand the task, which means they lose marks. A hardliner would say this is fair enough, because if they don't understand English, they shouldn't pass an English exam. That would be all well and good if everyone had to understand the question first time round, but students who prepare are not being tested on understanding the nature of the task, so this is inherently asymmetrical.

Indeed, most adherents to a target-language-only method are also likely to believe in the "four skills" model of language (which I don't agree with, but that's not the point here), which is fundamentally incompatible with target-language-only exam instructions.

How so? Well, if you believe that language is composed of reading, writing, speaking and listening, then it follows that you should test the four components individually. However, if you put task instructions in the target language, then every exercise includes a reading component, and you cannot objectively measure the students' levels in the other three skills.

It's a dilemma I have heard discussed even at university level, and it's very much a living debate, so nobody really should be putting forward their views as though they are objectively correct, because as with everything, we can all agree that a line has to be drawn somewhere, but we all have different views on where.

I personally feel that with a student cohort with a shared native language, native-language task instructions are the fairest way to ensure that students are being tested on the skills that we claim to be testing.

But what about listening tasks? Should we be asking the comprehension questions in the native language too, in order to ensure that we are genuinely assessing their listening comprehension? I kind of think we should, but at the same time, it doesn't feel right. But I have personally done exam past papers with students where they have clearly understood the meaning of the recording, but didn't understand the synonym used in the answer. How can you lose a mark in a listening comprehension test for failing to understand a piece of written language?

But of course, that argument does start to extend to the reading comprehension test too, because you can understand the set passage perfectly, but again have problems with the question. Here it is a reading comprehension problem leading to a lost reading mark, but there is still a fundamental question to answer about whether you should be setting an exam where you cannot determine the cause of the students' errors.

When you think about it, though, the problem in both previous paragraphs (although only one example of the various types of errors that students might make) is not really one of listening or reading anyway -- it's a vocabulary problem; vocabulary, which we do not consider worthy of the title "skill".

Some examiners have tacitly recognised this, and stopped trying to explicitly score the "four skills" individually, such as the Open University, whose degree-level exams have only spoken and written assessment, with the written part incorporating listening and reading as source material for an essay writing task. It's a holistic approach that accepts that trying to identify why a student isn't good enough isn't really an issue -- either they are or they aren't. I was perfectly happy with the approach as a student, and I would be happy with it as a teacher.

Language is certainly too complicated for us to ever hope to devise a truly fair and balanced way to assess student attainment, but the current orthodoxy has tied itself in something of a knot trying to reconcile two competing goals. So are we offering immersion, or are we assessing the skills?

18 January 2015

Shooting my mouth off...

A student of mine invited me round to his flat for Sunday lunch -- a big traditional paella, which was absolutely delicious. Now this isn't twitpic, so I'm not going to bore you with a photograph hashtagged #nomnomnom -- no, I'm more interested in a discussion I had.

Regular readers know I can be more than a little opinionated at times, and I'm not afraid to disagree with people, so when I met another teacher shortly after I arrived, the conversation quickly turned heated.

First, he asked where I taught. I said I was teaching privately because I don't like the way things are done in schools. He asked what I meant, and I explained that I don't like mixed native-language groups, because the problems a Spanish person has with English are completely different from those a Polish person has (a stereotypical TEFL class in Edinburgh is composed of one Polish person, one Italian, and then a whole pile of Spanish people). Targeting lessons at resolving student problems is then really difficult.

He had a bone to pick with that: "mixed-ability is real life, there are no homogeneous groups in the real world."

Mixed ability is real-life, true, and you will never have a truly homogeneous class group, true. However, this argument doesn't hold up to logical analysis -- a simple reductio ad absurdum (note: this is not a strawman) is enough to cut it down: we do not mix absolute beginners and advanced students, therefore we all draw a line somewhere on the scale between homogeneous and completely mixed; every teacher sees that line as being somewhere different, and "no such thing as homogeneous" is no more a justification for his chosen line than it is for mine.

The next thing he said was quite interesting, and certainly bears reflecting upon. He suggested that my desire to teach students in more uniform groups was not respecting their individual needs. It's an interesting viewpoint. He felt that teachers propose homogeneous groups in order to reduce individual differences, so that they can give one lesson and not worry about addressing individual needs. This may be true of some teachers, but it is not true of me. Personally, I find that in a homogeneous group, I can predict individual needs better, because actually most Spanish speakers I've met have exactly the same problems... which means they are not "individual problems" at all. If I eliminate all the group problems early, then I can really deal with the genuinely individual problems as they come up.

He wasn't convinced... far from it. Now he objected that I was talking about "accuracy" when... (wait for it!)... "communication is the important thing." Oh dear -- my least favourite meaningless statement. I had used an example of a particular error that a lot of Spanish people make: even if they normally remember to put the adjective before the noun (eg "a pretty girl"), when they qualify the adjective, it tends to migrate to after the noun (eg "*~~a girl very pretty~~"). He (quite correctly) responded by saying that this type of error does not interfere with communication. However, just as with mixed-ability groups, there must be a line somewhere, and inductive logic allows us to generate something incomprehensible:

the US colloquial term "purdy" is readily understood to mean "pretty"
foreigners who have difficulty pronouncing /ʌ/ will usually be understood if they instead use /æ/
=> It therefore follows that saying "pardy" instead of "pretty" will be understood.

Except, of course, it doesn't. Language has evolved to contain a lot of redundancy, and one or two steps of difference is acceptable, but the effect of errors on communication is cumulative, and there's a critical mass where the information finally gets lost (as well explained by Claude Shannon's information theory, in particular the Noisy Channel Theorem).

Let's continue the induction.

"a girl very pretty" is understood to mean "a very pretty girl"
=> "a girl very pardy" is understood to mean "a very pretty girl"
Oh, and swapping T and D isn't normally a problem either, eg "breat and budder" instead of "bread and butter"
=> "a very party girl"

Three errors, and the meaning is gone. But is this because it's just a phrase and not a sentence? Let's add in another "insignificant" error that's common in the English of Spanish speakers and get ourselves a sentence to look at:

dropping a subject pronoun that can be inferred from the context, though incorrect in English, does not hinder comprehension. eg "Last night, met a very pretty girl" instead of "Last night, I met a very pretty girl."
=> "Last night, met a girl very party"

Four errors, and not a lot of meaning left. You might just get it, but it's going to be an effort to understand.

Superfluous "the" added to "last night" doesn't interfere with comprehension. (In many sentences, this is true.)
=> "The last night met a girl very party"

And of course Polish and Chinese people have a tendency that we can add in here:

Dropping of "a" or "the" doesn't usually interfere with comprehension -- eg "give me apple" can easily be understood as either "give me an apple", "give me the apple" or even "give me apple" from the context
=> "The last night met girl very party"

Oh yes, and I've still got some Ts I could make into Ds

The lazd nighd med girl very party

Notice how the first two Ds do nothing to interfere with understanding, as you would still recognise "last night", but it doesn't make sense to accept the swap here, as that sets up a habit that will also affect words and phrases where it does create ambiguity.

The third D now makes things more difficult. Is that "met" or "made"? Or maybe we're talking about a "med girl", ie. a student doctor.

Can we agree that the lazd nighd med girl very party is not comprehensible? I hope so.

But let's rewind and look at all the individual sentences we can make with one single error:

Last night, I met a very purdy girl
Last night, I met a girl very pretty
Last night, met a very pretty girl
Last night, I met a very preddy girl
Lazd nighd, I med a very pretty girl
The last night, I met a very pretty girl

None of these are going to cause a native speaker too much trouble, although the last one may be ambiguous depending on the context, but when we add all these errors together, the result is incomprehensible.
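For anyone who likes to see the pile-up mechanically, here is a rough Python sketch of the same induction. The rules are hard-coded to this one sentence and are purely illustrative, and the word-overlap score from the standard library's difflib is only a crude stand-in that I am assuming for the sake of the demonstration, not a real measure of comprehension.

```python
from difflib import SequenceMatcher

TARGET = "Last night, I met a very pretty girl"

# Hypothetical rewrite rules, hard-coded to this sentence:
# (label, text to find, replacement), in the order used in the post.
ERRORS = [
    ("adjective after noun", "a very pretty girl", "a girl very pretty"),
    ("vowel substitution", "pretty", "pardy"),
    ("D/T swap", "pardy", "party"),
    ("dropped subject pronoun", "Last night, I met", "Last night, met"),
    ("superfluous 'the'", "Last night,", "The last night"),
    ("dropped article", "met a girl", "met girl"),
    ("more T/D swaps", "last night met", "lazd nighd med"),
]


def words(sentence):
    """Lower-case the sentence, drop commas and split it into words."""
    return sentence.lower().replace(",", "").split()


def similarity(a, b):
    """Word-level similarity between 0 and 1 (a crude proxy, nothing more)."""
    return SequenceMatcher(None, words(a), words(b)).ratio()


def accumulate(sentence, errors):
    """Apply each error on top of the previous ones, keeping every step."""
    steps = []
    for label, old, new in errors:
        sentence = sentence.replace(old, new)
        steps.append((label, sentence))
    return steps


if __name__ == "__main__":
    for label, sentence in accumulate(TARGET, ERRORS):
        print(f"{similarity(TARGET, sentence):.2f}  {sentence}  (+ {label})")
```

Each individual rule leaves most of the sentence intact; it is only the running total that wrecks it, which is the whole point.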

The boundaries of comprehensibility are difficult to judge, especially for a teacher, whose own ability to understand non-native language is much better developed than that of an average native speaker, even if only due to the difference in the amount of contact time they have with non-natives.

But if comprehensibility can be measured, it can only be done in terms of the number and severity of errors... ie. comprehensibility can only be gauged by measuring accuracy.

And if comprehensibility and accuracy are so tightly connected, you cannot declare that one is more or less important than the other.

But I digress.

The conversation moved on, and the other teacher lamented Scotland's lack of attention to and alignment with the CEFR. He was there when it started, he told me, back in 2001. Now it's been a while since I've mentioned the CEFR on this blog, but when I did, I gave a fairly strong opinion -- an opinion that I readily repeated to my new acquaintance... who then clarified that when he said "he was there" he didn't just mean he was teaching -- he was involved in setting it up. Ah, right.

As it turns out, the guy isn't actually an active teacher any more either... he's now employed in his country's diplomatic service and is in Scotland to liaise locally on the teaching of foreign languages in Scotland's schools and universities.

They say you should pick your battles carefully, and in this case I didn't -- this was a debate that could not be won. To argue against Communicative Language Teaching with someone whose entire career and professional identity was built on the championing of CLT is not going to get you anywhere.

It's not just that I couldn't change his mind, though, but also that he couldn't change mine. It is very rare that I come out of a debate on language teaching without at the very least questioning myself, but that typically occurs when a question of personal style or perspective comes in. When you reconcile the personal views with the formal views, you can start to see why people believe what they believe, even if you don't share that belief. Both parties get to reanalyse themselves against someone else's frame-of-reference, and try to analyse the other in terms of their own. It offers new perspectives.

But if someone is truly, deeply invested in the orthodoxy, all you will hear is the dogma. There will be nothing new -- you will have heard it before from teacher trainers, colleagues and bosses. You will have read it in several books and magazine/web articles.

Ah well, never mind.

16 January 2015

Duolingo: Web 2.0, free labour and the power of ignorance

Last time I wrote anything here, I had decided I was going to get some German under my belt. So I've tried out a couple of things on the net, and I've spent a lot of time on Duolingo, which in many ways is a very good resource, but is frustrating in the way it keeps generating nonsensical phrases and fragments.

Well, it turns out they recently added an interesting clause to their user agreement:

Temporary Restrictions on Users from the European Union
Users within the European Union are not presently allowed to submit materials for translation or translated materials to Duolingo. While these users can continue to use the educational services offered through the Website, they will not be involved in the translation of any documents. If you submit a request for translation or translated materials to Duolingo, you thereby warrant and represent that you are not currently within the European Union, did not translate the document within the European Union, and will not be within the European Union when your translation request has been finalized.

So what's going on here then?

I have always felt that most dot-com organisations run on a model that breaches workers' rights laws. In most countries, a for-profit organisation is not allowed to solicit or accept free labour, and yet a great many commercial internet sites rely on free labour for their profit.

When YouTube first launched, all advertising revenue was kept by the site -- uploaders made no money. YouTube argued that the uploaders weren't working for the site, so didn't need paid... and yet, without the uploaders, there would be no site. YouTube changed their business model later on to grant uploaders a share of the advertising take. The reason they did this was so that they could get on board professional media (including music videos) and then also to stop the higher quality amateurs from migrating to sites that were willing to split the profits. Market forces worked in the interests of the little guy... this time.

But what about Facebook's big translation push at the time of the public share offering? The public sale brought in enough money to translate the site into all the world's major languages several times over, and yet they did not pay a single translator, instead "crowdsourcing" the translation. It would be one thing if they had opened translation to any and all languages, but they chose the languages and were only interested in the "big" languages that would draw plenty of users and make Facebook more money.

If it was small languages, I could understand: you don't want to pay for a translation to eg Irish when all the users will happily use the English version -- it doesn't make you any money. But when you're translating into Spanish, one of the world's most widely spoken native languages, you'll make your money back many times over even if you pay for one of the world's best translators.

Facebook clearly thought that if so many other websites had got away with free labour, they would too, but they inadvertently brought the issue to far more public attention than they expected.

You see, translators have real power in Europe. With such a linguistically diverse base, the institutions of the European Union are full of translators, which makes them one of the most powerful lobby groups you can imagine. Seriously, there is no-one who "has the ear" of a Brussels bureaucrat more than the person who's talking in that ear throughout the meetings.

Now I don't recall ever hearing of any sanctions being made against Facebook for this, but the groundwork was set and Duolingo walked right into the problem, because more than any other site, their business model is built on unpaid labour... and crucially unpaid translation. Duolingo seeks to generate income by having learners translate documents for paying clients as part of their "immersion" in the language. Already, Duolingo is translating articles for Buzzfeed and CNN. Their justification is that the translators are getting something in return -- the teaching. I can see where they're coming from, to a point, but that's the same justification people try to make for internships as a source of unpaid labour.

So somewhere along the line, Duolingo has been warned off and put up these "temporary restrictions"... but didn't tell anyone about it. It's there, right at the end of the Ts&Cs, but they didn't actively notify users, and there is no notice on the translation page to warn you that you might be about to do something potentially illegal.

But it gets worse, because they don't only leave you access to the section that is illegal, but they actively encourage you to use it. I've been using it a lot recently, and after most exercises, it tells me to try the translation.

Now, if you're sitting at a computer in the UK and try to access BBC Worldwide clips on YouTube, you won't get anything. Why? BBC Worldwide content is licensed for use outside the UK, and YouTube knows where you are. The same thing happens on plenty of sites.

Duolingo makes no attempt to block based on location, but there is no technical reason that they shouldn't. I cannot imagine that a company their size would not be tracking user locations anyway, in order to optimise their marketing strategies and their technology. They must know. Furthermore, there is even a section in the profile (optional, admittedly) for you to tell them where you live.

It's a pretty stupid course of action, if you ask me. With geolocation being such a simple and standard admin task (although admittedly not 100% accurate), failure to attempt to identify and block EU-based users could be argued to be negligent. That negligence is surely made worse by the fact that they are leading their users not only to arguably (not tested in court) break the law, but also to indisputably break their own license agreement. And all the while their negligence allows them to continue selling translations to commercial clients.

It's a dangerous path, and it could lead to a very messy end....
