27 February 2012

The best language in the world

Voltaire once claimed to have heard a lady in the French court say
"What a great shame that the bother at the Tower of Babel should have got language all mixed up; but for that, everyone would always have spoken French."
(This is the version quoted in David Crystal's book Language Death, but from the slightly old-fashioned tone, I don't think it's Crystal's own translation.)

She's not the only person to say something like that though.  Scottish folklore claims Gaelic to be "the language of Eden", and I'm sure speakers of many other languages say more or less the same thing.

The thing is, anyone will find his or her own language to be the most complete, logical and natural thing on the planet -- the frustration of people who can't express themselves in their own language is not directed towards the language, but where a structure exists in the native language and there is no easy translation to a second language, that is perceived as a gap or a flaw.

In consequence, then, most people tend to be a bit bigoted when it comes to language.  When there's any degree of personal distance, many open-minded people would agree that it's a good thing to save a language from dying out.  So say to someone from the UK that Quechua is a valuable thing, and they'll nod.  The mere name "Quechua" sounds exotic, but seeing as most people here have never encountered the language, they don't have any natural inclination towards or away from it.  But if someone has encountered a language, it's pretty likely that they're going to consider the language "wrong", because it doesn't do what their language does.

Many sound wrong.  Welsh is despised for its hissing LL and its CH.  Dutch sounds to many people as though it's someone clearing their throat.  Russian and Polish seem all hissy and slippery.

Others build their sentences in a funny way.  To people who've tried (and failed) to learn German, it's a bundle of unnecessary endings on nouns, and verbs that always arrive late.

And yet French gets away with it.  Why don't all the differences between French and English get it disliked?  It's because we've been acculturated to believe that French is sexy, charming, sophisticated and intelligent.  Unless we've been acculturated to think badly of the French, as is often the case.

This is the first and most difficult hurdle that any would-be language learner has to get over.  We must learn to see the things that a new language does differently not as "mistakes", but as a valid means of expression.

For me, this means looking at the structure, playing around with it in my head, looking for an angle from which it makes more sense than English.  When I look at languages which use an idiom equivalent to it pleases me in place of I like it, I acknowledge that the speaker is giving the credit to "it", whereas in English, it's all about me.  Both make equal sense -- it pleases me discusses the properties of the thing in question, I like it is talking about my opinion.

And learning to accept that different does not mean wrong is a lesson that improves every aspect of our lives.  Racism, sectarianism and all other kinds of bigotry are essentially an extension of a tribal instinct -- we protect "us" by rejecting "them".  But it's an instinct that we have to manage in an increasingly pluralistic, urban society.

Learning languages makes us better people.

20 February 2012

From the mouths of bits - curiosities of machine translation

Google Translate is undeniably one of the most useful tools most of us will ever see, yet to the vast majority of people, it is a joke.

The principles behind Google Translate go completely against what we expect of language.  Our first instinct is to believe that Google used a big set of rules and tables, like in those dusty old Latin books on a shelf at the back of the university library.

But Google Translate is something very different.  It is based on statistical translation techniques.  What that means is that no-one has programmed it with any rules at all, instead feeding it with gigabyte after gigabyte of text in the target language, from which it identifies patterns of words that go together, and words that don't.  It also gets some directly translated texts to compare translations, but much less than you might expect.

Occasionally, this statistical approach throws up some very odd results.

For example, on How-To-Learn-Any-Language someone recently gave the example of a Finnish band who sing some of their songs in English and some of them in Finnish.  When he translated a piece of Finnish with the song title Kuolema Tekee Taiteilijan in it, it spat out the Siren, which is another of their songs, but one they sing in English.  The correspondent on HTLAL blames that on human correction, but that is highly unlikely.  Instead, Google's algorithm will have correctly identified the first song title as Finnish and the second as English, even when in a document in the other language, and therefore it won't add the Finnish song's title to the English database or the English song's title to the Finnish database.  And because both co-occur with the band name, the software ends up associating them.
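To make the co-occurrence idea concrete, here is a toy sketch in Python.  It is emphatically not how Google Translate works internally: the "documents" are invented, "BAND" is a placeholder for the band name, and the counting scheme is just the simplest thing I could think of that shows two titles becoming linked through a shared neighbour.

```python
from collections import Counter
from itertools import combinations

# Invented toy "documents": each is the set of notable terms found on one page.
# "BAND" is a placeholder for the band name (an assumption for this sketch).
documents = [
    {"BAND", "Kuolema Tekee Taiteilijan"},
    {"BAND", "The Siren"},
    {"BAND", "Kuolema Tekee Taiteilijan", "The Siren"},
    {"BAND", "The Siren"},
]

# Count how often each unordered pair of terms appears in the same document.
pair_counts = Counter()
for terms in documents:
    for pair in combinations(sorted(terms), 2):
        pair_counts[pair] += 1


def cooccurrence(a: str, b: str) -> int:
    """Number of documents that contain both terms."""
    return pair_counts[tuple(sorted((a, b)))]


# Both titles keep turning up next to the band name...
print(cooccurrence("BAND", "Kuolema Tekee Taiteilijan"))
print(cooccurrence("BAND", "The Siren"))
# ...so a purely statistical system has grounds to link them to each other.
print(cooccurrence("Kuolema Tekee Taiteilijan", "The Siren"))
```

With nothing but counts to go on, a system like this has no way of knowing that the two titles are different songs rather than translations of each other.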

In fact, if you look at the band's list of singles, you'll find that Kuolema Tekee Taiteilijan was released directly before the Siren, so it could be that the Google algorithm is actively looking for a translation directly after an embedded foreign word.  So if I talk about the clàrsach (Gaelic harp) or about an t-Eilean Sgitheanach (the Isle of Skye), you get the picture.  And it's quite right that Google Translate should do that.  It just so happens that while this means it makes fewer mistakes, the mistakes it does make look particularly weird to us.

19 February 2012

In order to teach, you must understand

I was talking to several of my fellow students before Christmas about the courses on offer at SMO, Scotland's only college offering degrees through the medium of Gaelic, and in particular the lost opportunity to teach languages efficiently to a completely bilingual student body -- an opportunity that no other higher education institution in the country has.

The only foreign language course currently offered is a single 15-point module in Irish, and this isn't offered until 3rd year.  Now one person said that it had been raised before, and it had been asked why Welsh never found its way onto the syllabus.  Apparently the college feels it isn't similar enough to Gaelic to be worth doing.  OK, there aren't many cognates between the two languages and those that there are aren't exactly transparent, but similarity can go much deeper than mere transparency.  In fact, from an academic point of view, it's the less visible similarities that inform most.

I've been trying to improve my Welsh recently, and while doing so I've been looking for academically interesting issues, and one I found was in the adjectives.

In Gaelic, the most common adjective ending is -ACH.  In Welsh, adjectives usually end in a vowel followed by G.  Different?  Not as much as you'd think.  CH in Gaelic is a C modified by a process called "lenition" in all the literature.  Lenition is the weakening of a consonant.  There's a process of lenition in Welsh too, although it's called the "soft mutation" in Welsh learning literature.  Guess what C becomes under soft mutation...?  Yup, it's G.  So even though the end result looks markedly different, on a process level, it's identical.

Teaching in this way (in either direction)  may make the new language seem a little less arbitrary.  But more than that, it also takes a single case that covers many examples and encourages an intuitive sense of equivalences across the languages, because it teaches you to accept unconsciously that Welsh G is often Gaelic CH and vice-versa. Thinking in terms of lenition as a process may also help students cope with examples that are lenited in one language and not the other.  It opens up a framework of possibilities, and if we are open to more possibilities, we're more likely to understand new and unknown language.

But in order to do this, the teacher must understand the material at a far deeper level than they're intending to teach.  You cannot simply spend your career one lesson ahead of your students in the textbook, because you have to know more about the subject than the textbook teaches.

The teacher needs to know a fair bit about the history and science that explains the development of the language, even if he isn't going to talk about history or linguistic processes in class.

To give another example, it helps to know that the O->UE and E->IE changes in Spanish (eg poder->puedo, tener->tiene) are to do with Latin long and short vowels.  Now I'm not sure, but I believe the words that undergo this change had short vowels in Latin, and the ones that never change had long vowels in Latin, but it might be the other way round.  I don't need to know for sure, because I'm not going to teach this in class.  But I understand the mechanism, which at the very least means I won't waste time looking for a non-existent pattern in the Spanish.  (Or more correctly, I've stopped looking for a non-existent pattern in the Spanish, because until I was told about Latin, everyone told me "it's irregular" and I didn't believe it could be.)

Also, while many books will present this as a feature of irregular verbs, I now know that it's no such thing -- in reality it's a fairly regular and productive phonetic feature.  If you look at morphemes that can occur in multiple positions this same change occurs, and it happens for nouns, adjectives and adverbs as well as verbs.

The nouns "puerta" and "portal" share a root, and the latter retains the monophthong O simply because it's not in the stressed syllable.
An accountant (contable) works with accounts (cuentas) and again we see the change in action.
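For anyone who likes to see a rule written down mechanically, here is a minimal sketch in Python of "diphthong when stressed, plain vowel otherwise".  The notation is my own invention for this example: a capital O or E marks a vowel that breaks, and the caller has to say whether the root carries the stress.  It covers only the examples in this post and makes no attempt to work out stress for itself.

```python
# Map each "breaking" vowel to its diphthong, as described in the post.
DIPHTHONG = {"O": "ue", "E": "ie"}


def realise(root: str, stressed: bool) -> str:
    """Spell out a root whose breaking vowel is written as capital O or E.

    The capital-letter marking is a hypothetical notation for this sketch.
    """
    out = []
    for ch in root:
        if ch in DIPHTHONG:
            # Stressed: the vowel breaks.  Unstressed: the plain vowel stays.
            out.append(DIPHTHONG[ch] if stressed else ch.lower())
        else:
            out.append(ch)
    return "".join(out)


# Stress falls on the root, so the vowel breaks.
print(realise("pOd", stressed=True) + "o")       # puedo
print(realise("pOrt", stressed=True) + "a")      # puerta
print(realise("cOnt", stressed=True) + "a")      # cuenta
# Stress falls on the ending, so the plain vowel stays.
print(realise("pOd", stressed=False) + "er")     # poder
print(realise("pOrt", stressed=False) + "al")    # portal
print(realise("cOnt", stressed=False) + "able")  # contable
```

The point of the sketch is that one root and one rule produce both forms, for verbs, nouns and adjectives alike.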

If we teach the vowel change as an irregular verb feature, the student won't necessarily be able to make the link between puerta and portal, or contable and cuenta -- the student won't have an understanding that imitates the intuition of the native speaker.  One morpheme will end up being considered as two, and the learner will find it more difficult to learn vocabulary or to divine the meaning of new vocabulary on first encounter.

Knowing the background allows us to identify the important distinctions that we need to present to our students, and that's a very good thing.

As a teacher, I am aware that always-O and unstressed-O-stressed-UE are different phonemes which share certain allophones.  This knowledge isn't explicitly required by my students (I'm a language teacher, not a linguistics professor) but it certainly must be known by them implicitly if they are to have a natural understanding of the language.

How can we do this? 

The obvious place to start would be with a couple of common verbs.  Poder and tener are very frequent, very useful, and a traditional place to start.

We have to make it clear what phoneme the student is using, and the O and E phonemes are never unambiguous in an unstressed position, so when we teach new vocabulary, we should pick a word that places the O or E in the stressed position -- so we teach "cuenta" before "contable".  We make comment on the fact that cuenta/contable share a morpheme (but we don't need the word "morpheme", because everyone's happy with the word "root") and that therefore it follows the same rule as verbs -- it's diphthongised in the tonic syllable and a monophthong elsewhere (although we can follow Michel Thomas's lead and replace the technical talk with the idea that the vowels "break under stress").

But plainly and simply, the student must have an implicit understanding of what's happening, or everything becomes arbitrary and meaningless, hence difficult to learn.

14 February 2012

Meaningful vs rote: traps

Despite everything I've said so far, the term "meaningful" is quite dangerous in language.

Because, after all, from a certain point of view, all language is arbitrary - That which we call a rose by any other name would smell as sweet - but from another point of view, all language is meaningful.

So we need to recognise that the term "meaningful" has to relate to the relationship between the material and the learner and is not some inherent property of the material.

One of the biggest "meaningful" traps is the idea of word-pairs.  The most common word-pair would have to be the antonym (opposites).

So we teach "beautiful - ugly; tall - short; big - small".

The idea is that by linking the words, we're utilising the meaningful relationships between the words.  Ignoring the potential for confusion (discussed previously), teaching by antonyms fails to exploit the learner's own meaningful framework.

If you have never encountered beautiful before, then it cannot help you learn the meaning of ugly.  So in the end, you're learning two things that are arbitrary to the learner -- you're teaching them by rote.  That the data is meaningful is irrelevant, because it is not meaningful to the learner.

Better then to teach one and then the other.  Previously learned vocabulary is part of the learner's framework that can be used to allow later meaningful learning.

09 February 2012

Meaningful vs Rote: a worked example

I've been discussing meaningful and rote learning recently, and Thrissel made this comment to one of my earlier posts:
A layman's question: one of the first things I learnt when beginning with Gaelic was that the n in an changes to m before b, f, m, p. Was it rote learning (because I wasn't told why these particular four) or meaningful learning (because it constituted a rule)?
I told him that this is rote learning, because he's simply learned a list of letters and a mechanical rule.  But there is a more meaningful way to learn this, which makes it an excellent example to work through to demonstrate my point.  I'll use a bit of linguistics terminology, but I'll try to make sure that I make the meanings clear as I go.  Remember, though, that just because I'm using it here, doesn't mean I'm advocating its use in the classroom.

To make a rule meaningful, there has to be structure round about it that makes automatic sense to the learner.  Logic's good, but logic neither guarantees nor is necessary for something to be meaningful.  (In fact, my Dad used to quote one of his lecturers all the time: people don't think logically, they think psychologically.)

The key to making it meaningful is tying it into a network of easily-understood relationships.  (And sometimes the thing that's easy to remember isn't altogether logical.)

In order to work out a meaningful teaching strategy, you need to analyse the material to be taught.

In this case we have a fairly simple rule:
"An" becomes "am" before words starting with B, F, M or P.
(The complication is that "an" is actually several different words, but we'll skip over that for now.)

What are the special properties of these four letters?

B, F, M and P are labial consonants, ie. they are pronounced using the lips.  In fact they are the only labial consonants in Gaelic.

M and N are very closely related, as both are nasal consonants.  It's not particularly easy to switch from N to a labial consonant, so the N steals the labial quality of the following consonant and becomes an M.

The technical rule:
The unstressed clitic forms "an" become "am" before a labial consonant.
So we have a full linguistic description of what's going on.  If you say that this isn't "meaningful", you're right -- only a trained linguist would be able to make use of this information.  It's the teacher's job to turn this technical knowledge into something meaningful -- but the teacher needs to understand this technical rule before he can teach it.
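Written out as a mechanical procedure, the rule is tiny.  Here's a rough sketch in Python, assuming we only look at the first letter of the following word and ignore everything else that can happen around "an" (lenition, the different words spelt "an", and so on).  The function name and the sample nouns are mine, chosen purely for illustration.

```python
# Labial consonant letters that trigger the change, as listed in the post.
LABIALS = {"b", "f", "m", "p"}


def an_or_am(word: str) -> str:
    """Return 'am' before a word starting with a labial letter, else 'an'.

    Only the spelling rule from the post is modelled here.
    """
    first = word.strip()[:1].lower()
    return "am" if first in LABIALS else "an"


# Illustrative nouns (my own choice of examples).
for noun in ["balach", "fear", "cat", "taigh"]:
    print(an_or_am(noun), noun)
```

Notice that the code is exactly as arbitrary as the rote version of the rule: a list of four letters and nothing to say why those four.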

The next thing is to look for something similar that the student already knows.

Let's have a look at how N behaves in English.

The prefix in- is used for negatives.  We know these are negative and we know they're N.
Admissible - Inadmissible
Tractable - Intractable
We know this.

But what happens when we want to add it to a word starting with B, M or P?
Balance - Imbalance
Material - Immaterial
Possible - Impossible

This doesn't just happen with negatives.  The opposite of ex- is also in-
External - Internal
...which also changes to M:
Explicit - Implicit.
So the students now know that N changes to M before B, M and P.  You can now explain that it's because these consonants are pronounced with the lips (or you can get the students to notice that for themselves).  Now the Gaelic rule is no longer strange and arbitrary, but quite familiar and comfortable.

(It might also pay to point out that while you might see words that appear to break this rule in the written form, they tend to follow it in pronunciation -- eg input.)

This only leaves F as a troublesome case, but seeing as your students are now aware of the notion of a labial consonant (although you've never used the word "labial") you can just point out that while B, M and P use both lips (ie they are bilabial), F uses one lip and your teeth (called labio-dental).  As it's only using one lip it's a borderline case.  Some languages bundle it with B, M and P, others with the rest of the consonants.  Gaelic's one of the former.  Shrug and move on.  It makes sense, and it doesn't pay to think about it. 

Yes, there is an element of "smoke and mirrors" here.  But some people have a tendency to overthink things and start to question things that don't need questioning.  Write it down as a formal rule and people will analyse and question it.  If instead you present it quickly as natural, if you don't encourage thinking, people will accept this, and that's good.
Now that may seem a much longer way than "an becomes am before B, F, P or M", but taking five minutes to make sure people understand it means you're not going to spend as much time revising it later.  Less haste, more speed.

However, you can do more than make this one rule meaningful -- you can also prepare the student to learn later parts of the language meaningfully.

The concept of N taking on a labial quality is analogous to several other changes in the language -- the student needs to be aware that sounds affect each other.  For example, in a phrase like an comhnaidh, the C starts to sound like an English G, because the voiced nature of the N affects the C.  The O becomes a nasal vowel, due to the influence of MH.  In many dialects, the C has an effect on the preceding N, too.  Just like how English ink is pronounced like ingk because of the effect of the NK combination, and how engage is often pronounced eng-gage rather than en-gage, the N picks up an "ng" quality -- an becomes ang.

As this sound "spreading" is an important and productive feature of the language, bringing it in early makes it all easier later on.

Also, you can help your students understand related sounds better.  In my example, B, M and P were presented separately from F, echoing the distinction between the bilabials and the labio-dental.  We can go one step further, and rearrange BMP to something else.  I would advise putting B and P together because they're both plosives, ie they have a "pop" involved (think explosion).  Now we have a choice: do we stick B and M next to each other?  Both are voiced, so it would make sense.  Now we have MBP or PBM, rather than the alphabetical BMP.  We're building associations that can be built on later, and we're subtly drawing attention to something the student really already knows, at a fundamental level.  Which brings us to the core point of meaningful learning: it must be built on what the student already knows.

05 February 2012

Rote vs Meaningful

Last time I wrote about the confusion of "Rote and meaningful" learning with "discovery and reception" learning.

This perhaps isn't as big a problem in language learning as it is in other forms of learning, as it would appear to be accepted in language-learning circles that all information learned is equal.  For example, if I learn the conjugations of a verb, then regardless of how I have done so, I have learnt it.  But it is the contention of David Ausubel that this is not the case -- if I learn something by rote, I learn it without structure or association, and if I learn it meaningfully, I know it by structure.
To quote Educational Psychology: a Cognitive Approach,
Rote learning occurs ... if the learner lacks the relevant prior knowledge necessary for making the learning task potentially meaningful, and also (regardless of how much potential meaning the task has) if the learner adopts a set merely to internalize it in an arbitrary, verbatim fashion (that is, as an arbitrary series of words). (2nd Ed, p 27)
The part I've put in bold here is the bit that most language teachers don't seem to appreciate.  There is a belief that somehow the inherent meaningfulness of language will shine through and all the rote-learned material will spontaneously become a single meaningful whole.  But core to Ausubel's argument is that meaningful and rote learning are not merely superficially different methods, but that the internal modelling of learned knowledge relies on how it is learned.

So if a learner memorises yo estoy, tú estás, él está, nosotros estamos, vosotros estáis, ustedes están without having any previous exposure to Spanish verbs, each item will be more or less independent and unitary -- the inherently meaningful information (the regular and partially regular inflectional suffixes) cannot be noticed by a learner who has no previous concept or understanding of them.  Even once the learner is taught the rules of Spanish conjugation, the original representation of the rote memorised conjugations will remain intact -- it will not spontaneously decompose into morphemes.

A strong learner will eventually generalise this away and learn the verb meaningfully, but this will not take the form of "adjusting" the learned language, but of relearning it in a meaningful way.
What rote learning gives the learner is therefore not true learning, but the possibility of memorising the learning material which he can then teach himself at a later date.  By this token, phrase-based learning could be justified as providing a "corpus" (Wikipedia) which the learner can subsequently learn from. 

Such a "memorise first, learn later" approach can only really be justified if the memorisation stage takes significantly less time than the learning step, as a means of getting more learning out of a limited amount of tutor time.  Unfortunately, as I pointed out in a recent post entitled Who am I?, it takes a very long time to learn very short phrases, and it seems far more efficient to learn meaningfully from the outset.

Besides which, "memorise first, learn later" assumes that all students are equally capable of teaching themselves, which is not true.  In my first foreign language, I made plenty of mistakes in trying to move from memorisation to learning, and from conscious to unconscious competence: mistakes that I now know how to avoid repeating in my subsequent languages.  How did I overcome these hurdles in the first place?  I was looking for them.  But nobody told me to look for them, so many people don't ever realise that they're there -- they instead justify their failure with phrases like "I'm no good at languages".

So to me, it makes no sense to have a student ever say anything if they don't understand it completely, ground-up.  The meaning of the sentence is irrelevant if they don't understand the vocabulary and construction of the sentence.