29 October 2013

Journalists: beware the headline writer

There are some people in life you have to avoid offending; for example, however important being nice to your boss is, it's far more important that you're nice to his PA. Treating the janitor and cleaners with respect always makes your working life much easier and more pleasant, too.

But the journalist has a much more important person to please: the headline writer. Whether this is your editor or there's someone dedicated to the role, this person has the power to undermine your entire article, or, in extreme cases, just outright insult you.

This appears to have happened to a Scotsman journalist by the name of Hugh Reilly. Hugh is a retired teacher who writes a column that is quite informal in style, and often more than a little... abrasive.

Yesterday's column, though, was downright insulting. He opened with the clear implication that Gaelic hasn't moved with the times. Lie. He described it as "terminally-ill". Lie. (And why the sub-editors let him away with that superfluous hyphen I'll never know.) He put the responsibility for its current resurgence at the hands of the SNP, who in real terms have done less for the language than the Tories (who oversaw the inauguration of Gaelic-medium education) or the Labour and Liberal Democrat parties (whose coalition government passed the Gaelic Language Act 2005). The SNP is more open to accusations of tokenism towards Gaelic than anything approaching Reilly's claimed "life support".

Reilly also treads that weary line of quoting figures that are incomprehensible to readers. Twenty-five million pounds seems like a lot to the average punter who earns a thousandth of that in a year, but in television terms it's utterly piffling. And of course, several prominent figures claim that this figure is an exaggeration of the true cost anyway.

I could continue to deconstruct the column, but that would only serve to labour the point: Hugh Reilly's article was ignorant and bigoted, and downright insulting to a great many people.

And one of those people, it would seem, was the man responsible for putting a headline on the piece: A tilt at the windmill of Gaelic.

Ah yes, The Ingenious Gentleman Don Quixote of La Mancha, the great Spanish novel that is often credited as being the first true modern novel, the watershed between the heroic romances of the Middle Ages and the realism and cynicism of Renaissance literature.

The phrase "tilting at windmills" has passed into common speech, and refers to a specific incident in the novel. "Tilting" is a word for charging with a lance, and Don Quixote "tilted" at windmills because he mistook them for giants, believing their whirling sails to be flailing arms.

Don Quixote, as you see, was quite seriously deluded. He was a man declining in years, a retired gentleman, and as a pastime read far too much heroic fiction -- fiction he mistook for fact.

The headline is frankly brilliant. In a mere seven words, it fillets the entire article and gives the author a tremendous slap in the face.

So if you're ever called upon to write a newspaper column, check you're not going to offend the headline writer before you submit your copy.

26 October 2013

Verb valency and its consequences in formal grammar

When I first went to university, I was studying computer science and artificial intelligence (although I switched to pure CS in third year) and I was introduced to formal grammars in elementary natural language processing (NLP, also known as computational linguistics). When I later studied languages with the Open University, I again went through an introduction to formal grammars.

Both of these basic introductions were based very heavily on the idea of context-free grammars (Wikipedia link for the curious). First time round, I found it difficult to get my head round these CFGs, but as I hadn't done any serious study of language (French and Italian in high school doesn't count) I couldn't put my finger on it... somehow they just felt wrong.

The only objection that I could form clearly was that language is so intrinsically tied to context that it's meaningless without it. For CFGs to be a valid model of language would imply that grammar has at most only a very minor role in the formation of meaning -- an idea that I personally am opposed to. This was indeed one of the claims of Noam Chomsky (WP), the man credited with first formalising CFGs. Chomsky's claim was that our awareness of grammaticality was not tied into our understanding of language. This idea he demonstrated with nonsense sentences that were superficially grammatically correct, the most famous of which is colorless green ideas sleep furiously.

When I came back to formal grammars it was in the context of a course on English grammar as part of my Open University studies, and comparing the results and consequences of CFGs to the reality of complex structures in English started to make my objections to CFGs much clearer.

My beef was that CFGs typically broke a sentence into a noun phrase and a verb phrase, which then contained everything but the subject. This rule was typically formalised as:
S -> NP VP

The first problem I had with this was to do with the passive and active voice, and the role of the grammatical subject of a sentence.

For example, this splits the man ate the snake into NP "the man" and VP "ate the snake", but the man was eaten by the snake also gets given an NP of "the man", even though the relationship between "the man" and the verb "eat" is pretty much diametrically opposite in the two examples.

With some verbs, this complication is even clearer. Consider the verb close.

The company closed the factory.
The factory was closed by the company.
The factory was closed.
The factory closed.

Notice that in all the different permutations, there is one constant: the thing being closed. We don't need an agent, but we must always have an affected party. Traditional grammatical terminology tells us that close is transitive, because it has to take a direct object in normal use, but there's more to it than that, because we can remove the idea of an agent altogether. There's a small set of verbs that behave this way in English, but that set is growing. These verbs are referred to as ergative, even though there are differences between them and the handling of verbs in an ergative-absolutive language such as Basque. (WP article on English ergative verbs.)

If we split our sentence S->NP VP, then we're not considering the role of the VP, and without knowing what we've already got, how can we identify what is still needed? This is how Chomsky's model allows us to generate such utter nonsense.
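To make that concrete, here is a minimal sketch of a context-free generator. The grammar and vocabulary are invented for illustration (they are not taken from any textbook or course), and the point is only that the VP is expanded with no knowledge of which verb was picked or what that verb needs.

```python
import random

# Toy context-free grammar: each symbol maps to a list of possible expansions.
# Rules and vocabulary are made up for this illustration.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Adj", "N"]],
    "VP": [["V"], ["V", "NP"], ["V", "Adv"]],
    "Det": [["the"]],
    "Adj": [["colorless"], ["green"]],
    "N": [["ideas"], ["factory"], ["man"], ["snake"]],
    "V": [["sleep"], ["closed"], ["ate"]],
    "Adv": [["furiously"]],
}


def generate(symbol, rng):
    """Expand a symbol by choosing rules at random; anything not in GRAMMAR is a word."""
    if symbol not in GRAMMAR:
        return [symbol]
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(part, rng))
    return words


rng = random.Random(0)  # fixed seed so repeated runs give the same sentences
sentences = [" ".join(generate("S", rng)) for _ in range(5)]
for sentence in sentences:
    print(sentence)
```

Every sentence it prints is "grammatical" according to the rules, yet nothing stops it giving sleep an object or leaving closed without anything being closed, because the choice of VP shape never consults the verb.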

The other thing that bugged me about Chomsky's model was how the trees we were dealing with were so Anglo-centric. Now, we were always given the caveat that the rules were different for different languages, and while that in principle seems fair enough, the differences in structure between CFGs for different languages are pretty huge.

Consider the effect of word order. Languages are often classified by the order of the subject (S), verb (V) and object (O). English is typically SVO: I (S) shot (V) him (O) and in Sentence->NP VP our subject is the NP and our object becomes part of the verb phrase VP. Most of the Romance languages follow the same SVO pattern, and even though German verbs can be fairly weird from an Anglo-centric viewpoint, most of the time, German sentences start with the subject, so Sentence->NP VP handles most of Western Europe.

In fact, the rule even survives into the extremities of the Indo-European family, as the Indic languages (Hindi/Urdu, Punjabi, Bengali etc) are SOV, so the only difference is that you now have a VP of OV instead of VO, which is a pretty trivial difference. Even Japanese has a tendency to be SOV.

Even a VOS language would be OK, because now you have Sentence -> VP NP, and your VP is verb, object and your NP is the subject. It's a trivial reordering.

But what about the Celtic languages in Europe?

They're mostly VSO (Breton's changed a bit under influence from Latin and then French) and with the subject trapped between the verb and the object, any rule you make now is fundamentally different from the old S->NP VP of English.

First up, if you define it as S->VP NP, your NP now represents the object, not the subject, which is a non-trivial difference. Secondly, it's just plain wrong, because adverbials go after the object (normally; exceptions apply) and you can't make the adverbials part of the object, because they are tied to the verb (hence the name).
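The word-order argument can be checked mechanically. The sketch below is only an illustration (the function name is invented): it treats each order as a three-letter string and asks whether the verb and object sit next to each other, which is what a single VP constituent containing both would require.

```python
# The four word orders discussed above.
ORDERS = ["SVO", "SOV", "VOS", "VSO"]


def verb_phrase_is_contiguous(order):
    """True if V and O are adjacent, so one VP node can hold them both."""
    return abs(order.index("V") - order.index("O")) == 1


for order in ORDERS:
    if verb_phrase_is_contiguous(order):
        print(order, "-> subject NP plus an unbroken VP")
    else:
        print(order, "-> subject splits the verb from its object")
```

Only VSO fails the check, which is why the Celtic languages need a rule of a different shape rather than a reordering of the English one.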

Now it should be noted that Noam Chomsky's other big idea is that of universal grammar. Chomsky proposes that all language is just combinations and permutations of an evolved grammatical system that occurs in the brain as a sort of "language acquisition device", so that all languages are subsets of a single genetically-determined "universal grammar". It is quite ironic, then, that it was Chomsky who ended up defining a system that exaggerates the differences between languages rather than highlighting the fundamental similarities between them.

And even if you don't believe in universal grammar, this hiding of similarity is still a problem: conceptually, the Indic languages are very similar to the Celtic languages when word order is ignored (because they are still surprisingly closely related despite thousands of years of independent evolution), but superficially, a CFG for Hindi looks more like one for Japanese than one for Welsh.

Which brings us to valency. This term was proposed by Lucien Tesnière in a 1959 book, as an analogy to valency in chemistry -- the idea that atoms bond to other atoms by a number of connections with certain restrictions.

Looking back at the earlier example of close, the verb has a mandatory bond to an affected entity (the factory, in the example), and an optional bond to an active agent (the company).

Basically, all noun phrases are subordinate to the verb, or as I like to say, the clause "pivots" on its verb.
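One way to picture this is as a small record attached to each verb. The sketch below is a hypothetical representation, not Tesnière's own notation: the `Frame` class, the role names "affected" and "agent", and the `is_complete` helper are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical valency frame: roles a verb must bond to, and roles it may bond to."""
    required: frozenset
    optional: frozenset = frozenset()


# Frame for "close", following the example above: the thing closed is
# mandatory, the agent doing the closing is optional.
FRAMES = {
    "close": Frame(required=frozenset({"affected"}), optional=frozenset({"agent"})),
}


def is_complete(verb, roles):
    """True if every required role is filled and nothing outside the frame appears."""
    frame = FRAMES[verb]
    filled = frozenset(roles)
    return frame.required <= filled <= (frame.required | frame.optional)


examples = [
    ("The company closed the factory.", {"agent": "the company", "affected": "the factory"}),
    ("The factory was closed by the company.", {"agent": "the company", "affected": "the factory"}),
    ("The factory was closed.", {"affected": "the factory"}),
    ("The factory closed.", {"affected": "the factory"}),
    ("(an agent, but nothing being closed)", {"agent": "the company"}),
]
for text, roles in examples:
    print(is_complete("close", roles), text)
```

All four sentences from the earlier list pass with the same frame, whatever their surface order or voice, and the only combination rejected is the one with no affected party.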

Tesnière's valency should have changed the way we defined formal grammars, because on paper it's a very small change to the model, and yet 40 years after Tesnière's book was published, my CS/AI study of formal grammars started with a near-identical model to Chomsky's. 47 years after Tesnière, the OU was still using the same models for teaching to language students. And some 50 years after the book, Daniel Jurafsky and James H. Martin published the second edition of Speech and Language Processing, and Chomskyan CFGs are still the sine qua non of formal grammars, rather than a historical footnote. Any course in NLP or computational linguistics is layered on top of CFGs, and results in piling hack upon hack on top to overcome the deep-seated flaws in the paradigm, rather than simply building a paradigm which works.

So why didn't CFGs die?


CFGs, while they were designed to describe human language, are closely coupled to the development of the computer. The parse trees behind CFGs model structures developed early on in the history of computer science, and it is therefore evident that the cognitive scientists involved in them were of the school that believed that the human brain works on the same lines as an electronic computer. (This, by the way, is patent nonsense. If anyone built a computer that worked like a brain, it would be impossible to program, and would be as trustworthy as a single human being in its conclusions -- i.e. completely untrustworthy.)

As it turns out, CFGs are pretty efficient as a model for computers to use to identify parts of speech in a sentence, and it was all the computers had the power to do in the early days. It gave us a good approximation, and something to work with.

But while an approximation based on fundamentally flawed models may get us close, if it gets us close in the wrong way, improving the approximation may be intrinsically too difficult.

The planet Earth, for example, is over forty thousand kilometres in circumference at the equator. To be able to pinpoint a spot to within 30km would therefore seem to be a pretty damn good approximation. However, the Grand Canyon is 29km wide at its widest, and the amount of work, time and effort it would take to get from the right bank to the left bank would be phenomenal.

So, just as this is a numerically "good" approximation but in practice a really bad one, so are CFGs apparently good, but store up problems for later.

This all goes back to what I was saying last time about meaning existing on multiple levels of abstraction, and as Chomsky had decided that there was some conceptual gap between meaning and grammar, his trees make no attempt to model meaning.

22 October 2013

The word is not the basic unit of language.

I'm sure I've said it before, but the word is not the basic, indivisible unit of language.

From one perspective, in all but completely analytical languages, there are word roots and affixes that words can be divided into. The English presuppose can only really be thought of in English as two elements (pre- and suppose), but the Spanish equivalent presuponer retains the four elements of its Latin origin (pre su(p) pon er). It is not, therefore, indivisible.

So, am I proposing that we spend a lot of time learning the affixes and roots independently? No.

You see, the affixes and roots only gain any real conceptual meaning once they are combined into words that have concrete meaning. An overly explicit focus on the meaning of prefixes becomes an academic exercise, rather than language learning. NB: I say an overly explicit focus as there must be some focus or you're going to introduce a lot of extra work -- all the pose verbs (suppose, oppose, propose, impose, etc) in Spanish follow the same conjugation -- the irregular verb poner that each incorporates. If you don't learn that poner is a recurring element, you either memorise all the verbs independently or start making mistakes.

Now, rereading the last paragraph, I notice that I have unconsciously switched from discussing the root pon and the infinitival suffix er to talking about the "element" poner. This demonstrates quite aptly the point I was trying to make in this piece: as language users, we maintain multiple levels of abstraction simultaneously, and at each level, we conceptualise a unit of meaning. Teaching and learning must focus on all units of meaning.

But what does that mean, "all units of meaning"?

The idiom principle and lexical approach suggest that meaning lies at the phrase level, suggesting that the fixed phrase is the indivisible unit of meaning.

The communicative approach and Total Physical Response suggest something similar, stating that meaning is only given to language through use to solve problems.

So here we've already catalogued four levels of abstraction that define units of meaning:
  1. Morphemes (word roots and affixes)
  2. Word
  3. Phrase
  4. Task
These form a clear hierarchy built on selection. A word is a specific selection of morphemes. A phrase is a specific selection of words. Performing a classroom task requires a specific selection of phrases (and words).

The flaw in most approaches is that they place excessive emphasis on individual levels of abstraction. The most infuriating part of it is that they often place excessive emphasis on multiple levels of abstraction. If this seems paradoxical, consider this stereotype of a classroom situation.

The phrase Would you like a banana? is introduced and practised (phrase focus). Now the banana is replaced with an apple, an orange, a lemon and a strawberry. There is the illusion that we're still phrase focused, but in fact we've switched into word focus and are effectively simply listing fruit vocabulary. There is no need to even think about the meaning of the Would you like part, and you descend into rote repetition; it becomes a mere string of sounds before the one bit that actually needs you to think. In the end, we're memorising a list of phrases and a list of words -- there is no greater model of meaning built.

But there's another practice in teaching that undermines meaning even more, and it manifests itself most clearly in verb drills. Consider:

I sleep.
He eats.
She drinks.

How much meaning do these individual utterances have? I argue very little. Everyone sleeps. Everyone eats. Everyone drinks... unless this last one is understatement and you're telling me she's an alcoholic.

But what do we need to make these meaningful? Part of the answer is in the valency or transitivity of the verbs. In English, almost all verbs can be used intransitively (ie with no noun phrase following as direct object or predicate), and in this sense it means "this person carries out this activity with some degree of frequency", but the more common the verb is, the less common this form is. I mean, he skydives is fair enough, because you don't expect it, but he talks is extremely unlikely because, well, doesn't everybody?

So if I'm asked to translate something that doesn't really mean anything to me, it's always going to be a rote exercise. Even if I'm not translating, and I'm just conjugating within the language, changing comer to come produces an utterance just as devoid of meaning as the English he eats, and language without meaning is nothing. The obvious solution is to use transitive patterns of the verb, adding in objects to make a more meaningful utterance -- he eats fish, they don't eat meat etc.

But that doesn't work for I sleep, because sleep is intransitive, so never takes an object.

So this is where the idea of verb valency comes in. The term is taken from chemistry, where it refers to the number of bonds atoms can make, and here refers to the number of arguments a verb can take. "Arguments" in this context refers to a wider range of items than simply the objects of the transitive verb -- the subject and any adverbials are also classed as arguments.

What's missing in the utterance I sleep? An adverbial. It would become natural as soon as it had an adverbial of quality, time or place. eg:
  • I sleep badly
  • I sleep during the day
  • I sleep in a bed
These are still very contrived examples, but they feel more meaningful because they follow the patterns of real use.

The concept of valency is far more appropriate here than the idea of "fixed phrases" with "slots" for individual words, as otherwise we end up with rather minimalist "phrases" like the following:
<person> <conjugation of to do>n't <verb>
This "phrase" would explain the prevalence of the "he doesn't X" over the rare "he Xs"... for certain verbs.

09 October 2013

A wrod on errors

Since I started learning languages, I've had lots of discussions, both online and face-to-face, about the nature of errors. What always surprised me was how blasé many people were about errors -- including teachers.

There are two particular philosophies that I find quite worrying.

First:

"Errors take care of themselves."

The belief expressed by many is that there's no need for any systematic correction of errors, as the learner will work them out given enough time and contact with the language.

In the extreme case, this means no correction whatsoever, leaving the student to pick it up from input.
In the more moderate case, the teacher is expected only to give minimal correction, relating to the very specific error.

But isn't it self-evident that this is not the case? Hasn't everyone met at least one immigrant who still speaks with errors that are systematic and predictable? It's easy to dismiss this as an immigrant "language bubble", with the immigrant living and socialising within his minority community, but that only works in a major urban area, where there is enough of a concentrated population for a clique community to form. But when you get to a rural community, and there's only one or two minority families in the area, why is it that a pensioner who has lived in the country for half his life still sounds decidedly foreign? And not just in accent, but in language patterns too?

Errors do not take care of themselves.

So on to the second:

"Errors don't matter - native speakers make mistakes too"

This one I've heard in many situations, but the most potentially damaging of these is in the learning of minority languages, because in those instances, it is argued that a learner who has failed to learn correctly is somehow equal to a native speaker.

This is a pretty insidious leap of logic, and confuses two issues.

First up, you have the distinction between two classes of errors: systemic errors where the speaker doesn't know the correct structure or word; and performance errors, or "slips", where the speaker stumbles during speech, despite being perfectly comfortable with the correct form.

Then we've got that hoary old chestnut of "bad grammar". Apparently, when we split our infinitives, or end a sentence with a preposition, or use "who" for a grammatical object, we're natives making errors. Well, no. People "break" these rules all the time, so they are in fact not errors. At worst, they are variant forms and therefore a correct form; at best they are the most common forms, hence the correct form.

In truth, the only type of error a native can make is a slip, a performance error, because a native has a full internal model of their language (in a particular dialect), assuming we are talking about someone without a mental or learning disability.

As many linguists say: there is no such thing as a common native error.

Non-native errors are real, and informative

But common non-native errors do exist, and we do a disservice to ourselves and/or our students by ignoring them, as errors provide a very useful insight into what's wrong with a learner's internal model of the language.

The main inspiration for this post was an error I noticed recently in my own French.

It was something along the lines of *ça ne me rien dit, which should've been ça ne me dit rien. I noticed the mistake immediately, and what I said was the corrected version, but my brain had initially formed the sentence incorrectly.

Why? What was the underlying cause of the error?

For those of you who aren't familiar with French, basic negatives traditionally consist of two parts: the particle ne before the verb and any clitic pronouns, and a second particle (pas=not, rien=nothing, jamais=never etc) after the verb. Or rather, I should say "after the first verb", which is more correct, even if some teachers don't bother to go that far.

You see, in French, there is very often only one verb -- where we add "do" in the negative (I do not know), the French don't (je ne sais pas -- compare with the archaic English I know not).

As high school had drummed this into me in simple (one word) tenses, I initially had great difficulty in correctly forming compound verb structures -- I would erroneously place the "pas" after the final verb:
*je ne peux voir pas instead of je ne peux pas voir
and
*je n'ai vu pas instead of je n'ai pas vu

That was a diagnosable error, and having diagnosed it, I consciously worked to eliminate it, and now I have no problems with ne... pas.

And yet I made this mistake with the placing of rien, even though it's the exact same structure... and when I made this mistake, I recognised that it was something I struggle with frequently. Furthermore, I make that mistake with every negative word except "pas".

So I have a diagnosis for this error: my internal model has incorrectly built two structures where it should have created one, because a native speaker has only one. It is clearly, therefore, a non-native error.

(Actually, there's a longer story about a series of errors and corrections, but let's keep it short, shall we?)

Taking action...

What can we do as teachers?

Well, it's not easy, but we have to monitor our students constantly to identify consistent errors. Moreover, we have to look out for apparently inconsistent errors -- I say "apparently" inconsistent, because there really is no such thing as an inconsistent error. If it appears inconsistent, it means that the learner has done what I did with French negatives: used two rules where one should be used. It is then the teacher's job not to correct the broken rule, but to guide the student to use the correct rule.

The more you spot these errors, the more you'll see them recurring in different students, and you'll find that they're actually pretty common errors. The fixes you implement for your students will feed into your initial teaching as a way to avoid the errors in the first place, and everyone wins in the long-term.

04 October 2013

Spaced Repetition Systems: timing and time off

Every now and then I head back to the How To Learn Any Language forums for a wee lurk. A few days ago I popped in and one of the users there was asking for advice on how to deal with the backlog of cards in an SRS flashcard system after time off.

About SRS

For anyone not familiar with SRS, it's basically flashcard software that incorporates an algorithm to schedule repeats and revision of individual items at increasing intervals. The notion of increasing intervals is nothing groundbreaking, often occurring even in primary school spelling lessons (introduce new words, test later in the day, test in end of week spelling test, test at end of term in final "big" spelling test), but there have been researchers who have tried to formalise this and identify optimal gap lengths. Paul Pimsleur, for instance, devised a series of strictly scheduled gaps for optimal memorisation, a schedule which was used for the creation of the audio language learning software GradInt.

Problems with "memory schedules"

The problem with this strict scheduling is evident in GradInt: the first few "lessons" you generate will consist mostly of silence, because you have no old material to revise. It's not hard to believe the author of GradInt when he states that the commercially-produced Pimsleur courses do not strictly follow the memory schedule that Paul Pimsleur laid out.

Furthermore, the Pimsleur courses are a great example of why memory schedules cannot work: Pimsleur courses are almost identical for most languages (at least at the initial stages) and yet some languages are still more difficult than others. I borrowed several of the 5 and 6 CD courses from my local library (hence why I can't really talk about anything after the initial stages) and while I found Irish easy and finished it quickly with no repeats, I found Vietnamese so difficult that I couldn't even get the first two lessons down pat, so I gave up and handed it back without finishing. Irish is of course similar to Scottish Gaelic, which I was already passably proficient in at the time, whereas Vietnamese is nothing like anything I had ever studied up to that point. Even if I hadn't learnt any Scottish Gaelic beforehand, Irish would still have been easier than Vietnamese, as there are fewer alien sounds to it, and the grammar is still somewhat related to English.

So yes, that's the common-sense and anecdotal rebuttal, but this sort of thing has been measured and the common-sense anecdote proven: it's easier to learn something that's kind of familiar to start off with. Memory schedules don't factor in the "difficulty" of the material, and that's a weakness.

Enter the SRS

SRS systems don't try to introduce any direct notion of difficulty or complexity at the level of the item being memorised, but instead try to adapt based on the user's perception of difficulty at each revision. It's a good compromise and it works well for the most part, and certainly better than blindly following a set pattern.

However, the more you adapt the algorithm away from the published research papers, the less you can support your software by quoting research. Isn't that an interesting quandary? It's probably better, and only a handful of extreme determinists would doubt it, but it's difficult to say it's better, because they've got research and you haven't.

That said, I'm happy to accept that an adaptive algorithm is better than a rigid schedule; it's only once we start attempting to choose between adaptive algorithms that the problems start, because whichever choice you make, there's no proof for it. Which brings us to the problem raised in the HTLAL thread...

Stelle's problem

The original poster, Stelle, is planning a holiday, and is going to be away from all PCs, smartphones etc for six weeks, and the algorithm-as-dictator is going to be pretty unhappy the first time Anki is booted up after the holiday, because there will be an awful lot of overdue cards.

Several posters commented that that's to be expected, because time away from Anki is time forgetting words, but Stelle then revealed that the holiday in question is walking the Camino de Santiago de Compostela, through Spain. Some words will be forgotten during that time, but a hell of a lot will be revised and remembered better than Anki alone could ever achieve.

SRS and its false assumption

You see, SRS seems to assume that it's the only source of learning. As a compromise, that's fine, because you can't go asking the user to log absolutely everything they've done in every other package, class and book -- it would be too big a job, and would put people off. But if you assume that no SRS for six weeks means no learning, there will inevitably be a lot of revision after that.

The algorithm Anki uses is derived from the SuperMemo 2 algorithm, which generates scores based entirely on the card's previous score and the user's perceived difficulty. The algorithm appears to take for granted that each revision has been carried out when scheduled.
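For reference, here is a rough sketch of an SM-2 style update as it is usually described. It is not Anki's code (Anki has changed the algorithm in various ways), the `Card` class and field names are invented for the example, and implementations differ on details such as whether the ease is updated before or after the new interval is worked out.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # successful reviews in a row
    interval_days: int = 0  # gap scheduled before the next review
    ease: float = 2.5       # SM-2 "E-Factor", which starts at 2.5


def sm2_review(card, quality):
    """Return the card's state after one review; quality is the user's 0-5 self-rating."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the sequence and leave the ease alone.
        return Card(repetitions=0, interval_days=1, ease=card.ease)
    repetitions = card.repetitions + 1
    if repetitions == 1:
        interval = 1
    elif repetitions == 2:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)
    # Ease moves up for easy answers and down for hard ones, with a floor of 1.3.
    miss = 5 - quality
    ease = max(1.3, card.ease + 0.1 - miss * (0.08 + miss * 0.02))
    return Card(repetitions=repetitions, interval_days=interval, ease=ease)


# A card that was due after one day gets the same next interval whether it is
# reviewed on time or six weeks late: the function has no date input at all.
print(sm2_review(Card(repetitions=1, interval_days=1), quality=5))
```

Note that nothing in the inputs says when the review actually happened, which is the assumption discussed above.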

A possible solution

So I say it's wrong to assume that an SRS user hasn't been learning and revising outside of the SRS software, but obviously it's equally wrong to assume that they have. And by the same token it would be wrong to ask "have you revised?" and apply a single answer to every card in the deck, because we can be sure that Stelle will ask where the toilets are, but there's no guarantee that rainbow trout will ever come up in conversation.

But the algorithm could try to work out whether you've revised or not.

I mean, say you had two cards that were due for revision after a day, and you came back to them six weeks later. You get the first one right, you get the second wrong. Isn't it fairly reasonable to assume that you revised the first and not the second? And even though that isn't a 100% safe assumption (the first one might be an easy word, like "taxi", the second may be abnormally difficult), it doesn't really matter, because one way or another, you know one, you don't know the other.

The scheduling of the next revision really has to take this extra information into account.

Under the SM2 algorithm, the first word, the easy one, will be treated as though it's been asked after one day, and even if I rate it really easy, I'm still going to have it scheduled for a few days later... even though I've proven that I've already learnt it. Isn't that crazy? Why not schedule it for weeks away? This may not immediately clear the backlog, but it will reduce it quicker. (Because cards won't be re-added to the backlog as quickly.) As you work through the backlog, the dispersion of cards will increase, as cards are increasingly late.

By the time this is finished, they'll be thoroughly mixed, and you'll be revising the genuinely least well known cards more. If the algorithm was too optimistic about the words you seemed to know, this will be corrected in a couple of iterations.
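As a very rough sketch of what such a rule might look like (this is one possible reading of the idea, not something Anki or SuperMemo does, and the function name and parameters are invented), the next interval could be built on the gap the card actually survived rather than the gap that was scheduled:

```python
def next_interval(scheduled_days, elapsed_days, ease, correct):
    """Hypothetical lateness-aware rule for choosing the next gap in days.

    A correct answer is credited with the longer of the scheduled gap and the
    gap that really passed; a wrong answer starts again from one day.
    """
    if not correct:
        return 1
    survived = max(scheduled_days, elapsed_days)
    return round(survived * ease)


# Two cards, both due after 1 day, both first seen again after 42 days.
print(next_interval(1, 42, 2.5, correct=True))   # 105: remembered, so pushed months away
print(next_interval(1, 42, 2.5, correct=False))  # 1: forgotten, so back to the start
```

A card reviewed on time behaves exactly as before, because the elapsed and scheduled gaps are then the same. If the long interval turns out to be too generous, the next wrong answer resets it.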

02 October 2013

Onward and upward

So yesterday I got out of my sling, and I'm now typing two-handed again, which is soooooooooo much easier: I just couldn't avoid typos one-handed, and spent ages correcting them (which made typing so little fun that I did nothing on this blog for the whole time).

I swore I was going to stop wasting my time, yet I achieved nothing today (apart from doing a few mobilisation exercises to try to get the arm moving again).

Tomorrow I'm going into town (heading to the Jobcentre), so while I'm in I'm going to pick up a few very useful things: a very large pad of paper (a flipchart pad, probably) and some pens for it.

I have two projects that have been hard to get into on a computer left-handed, and the idea of being able to "splurge" my thoughts onto paper in a messy, arbitrary way is quite appealing, cos it's hard to really map out language stuff on computer screen, given that the structures in language are themselves rather messy and arbitrary, and don't really fit tables that well.

So what exactly are these projects?

Project 1: Face-to-face Gaelic course

I've been thinking about teaching a Gaelic course in my quiet little Central Scotland village. There's a fair few people here who would be interested (in theory, at least) and I've spoken to one guy who is very keen on it. But it's scary doing the "home gig", where everyone in the crowd is someone you know, and I've always chickened out. However, I'm sitting here with no definite work (there's some possible online teaching in the pipeline) and no unemployment benefits, so it would be nice to get some money coming in, even if it is scary.

What I want to do now is get a map of some of the big patterns in the language and try to see how to teach them in a logical, integrated manner, rather than the normal scattergun approach a lot of courses (in any language) take.

Then I can phone up the caretaker of the local community centre and price up a course, and canvass interest before booking anything.

Project 2: Language learning software

About a year ago, I started prototyping a somewhat ambitious piece of language learning software, which I used to pick up a bit of Corsican while I was living in Corsica. The prototype was a bit of a hack, and every time I upgraded it the software got more and more unwieldy, with every new addition and change taking longer and longer to program in, and taking more and more memory and processor time.

So I've restarted, trying to reduce the memory requirements and increase the modularity of the code, so that changing one thing doesn't start a domino-rally of changes. This also means that it's easier to generalise across languages, so I've been trying to build the latest version in several languages simultaneously.

But last year I promised my (now former) students I'd have an alpha/beta version for them (English for French speakers) and I've lost a lot of time while I did very very little left-hand-only work on the software.

So I've got no choice: I made a promise to get this software going, and I really want to have something useful before Christmas.