Showing posts with label software.

08 December 2013

Memory schedules and the danger of the hard line

Spaced repetition systems (SRS) have become very popular in the online language-learning community as a means to learn and revise vocabulary and phrases.

Spaced repetition is essentially just flashcard software, but flashcard software backed by an intelligent algorithm that attempts to find the optimally efficient timing to aid your memory.

This timing all arises from the idea of "memory schedules" put forth by people such as Paul Pimsleur, who gave his name to a popular series of cassette-based language courses in the 1960s (a series which continues to be published to this day with virtually no changes to its basic structure).

The first principle behind these memory schedules is that new information is gradually forgotten, so we need to be reminded of it. (So far, so obvious.)

The second principle is that the better learned something is, the longer it takes before being fully forgotten -- the memory pattern is stronger. (Also obvious.)

Taken together, these principles give the basic form of memory schedules: revise the item to be remembered just late enough so that it's not forgotten, at progressively longer intervals.

Now this of course is obvious, and all teachers schedule their revision along similar principles. The real promise of memory schedules is the ability to put measurable, verified numbers on it.

So the hardliners within the memory-schedule camp did put figures in place, completely ignoring the fact that some things are harder to remember than others -- for example, memorising the full 10- or 11-digit phone number of someone in a foreign country is going to be harder than memorising your neighbour's number, which will probably differ from yours by only 3 or 4 digits.

The numbers, therefore, can't be right, which is why more recent memory researchers have been looking for formulas that take complexity into account. And since it's hard to objectively measure that complexity beforehand, SRS very sensibly tries to work it out for each item to be learned as you go, based on your performance. That's a good compromise.
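To make this concrete, here's a minimal sketch of the kind of per-item scheduling an SRS might use, loosely modelled on the well-known SM-2 algorithm (the constants here are illustrative, not the canonical values):

```python
# Minimal sketch of per-item interval scheduling, loosely based on
# the SM-2 family of SRS algorithms. Constants are illustrative.

def next_review(interval_days, ease, grade):
    """Return (new_interval_days, new_ease) after one review.

    grade: 0 (forgot) .. 5 (perfect recall).
    ease:  a per-item difficulty multiplier the algorithm learns
           from the learner's own performance.
    """
    if grade < 3:
        # Failed recall: start the item over, but remember it was hard.
        return 1, max(1.3, ease - 0.2)
    # Successful recall: stretch the interval by the ease factor,
    # and nudge the ease up or down depending on how easy it felt.
    ease = max(1.3, ease + (grade - 4) * 0.1)
    return round(interval_days * ease), ease

# A hard item (low grades) gets short intervals and frequent reviews;
# an easy item gets stretched out -- "working it out as you go".
interval, ease = 1, 2.5
for grade in [4, 4, 5]:
    interval, ease = next_review(interval, ease, grade)
```

The key point is that the ease factor is learned per item from the learner's grades, rather than being a single hard-coded schedule applied to everything.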

But there's another claim from the hardliners that's really very hard to stomach: the claim that nearly forgetting and then reminding makes for a stronger memory trace. Than what? Than revising it more frequently.

Less revision = better learning...?

That is a weird claim, and those who make it must presumably have data to back it up, but I just can't see how it can be a generally applicable truth.

What is definitely true is that excessively frequent repetition can keep something in short-term memory without ever forcing it to be stored in, or recalled from, long-term memory; and something that isn't in long-term memory isn't learned. Worse: if you are really frequent in your repetition, you can hold the information in working memory, and never have to recall it at all.

This is, of course, a problem that should be familiar to most language teachers and learners. The first hour of language instruction is often very light. In a one-hour introduction to Finnish, I was taught to say "good morning", "how are you?", "I'm fine", "what is your name?", "my name is...", "what is it?" (or maybe "what is this?" or "that?"), "it's a..." and 6 or 8 nouns (including "car", "key" and "aeroplane"). I forgot everything except "good morning" and "key" within about 24 hours. I now only remember "good morning". Everything was repeated too quickly, so nothing went into long-term memory.

But that is not to say that repeating at the very last minute is the correct answer; the claim flies in the face of a lot of the material on memory anyway, particularly where language is concerned.

There is a simple rule in memory: the more often you are called on to recall something, the quicker and easier it becomes to recall. To use a trivial example, "is" is far quicker to recall than "carbuncle", because we use "is" every single day of our lives. (Or its equivalent in our own language, for non-English speakers.)

If not reaching that threshold of "nearly forgetting" inhibited memorisation, then we would be in the paradoxical situation of knowing the words we almost never say better than the words we say every day -- but I have never said "what's the word... it's on the tip of my tongue... oh yes... is." It just doesn't work that way.

So no, SRS isn't the optimal way to remember individual items, but it's certainly a pretty efficient way to learn a bulk load of items.

It's good, but it doesn't deserve the hard line.

29 September 2012

Online education's elephant in the room.

It's funny how things come together to give you a better understanding of your own mind. A couple of weeks ago I got caught up in the internet debate on mass-participation online education started by an American stats professor critiquing Udacity's Introduction to Statistics by Sebastian Thrun. Then the other day I started debating online education again, this time triggered by the Technology Review article The Crisis in Higher Education, linked and debated on Slashdot. One thing I didn't mention in the first debate, but did in the second, was something that has been bugging me for a very long time, and it's really only thanks to the recent debates I've been having with Owen Richardson on DI that I was finally able to articulate it.

These massive courses claim the potential to be better than anything that's come before, thanks to the availability of masses of automatically-collected feedback that will be used to improve them. This, theoretically, means the fastest pace of change in the history of education.
But is that really the case in practical terms?
Right now, I'm at the steepest part of the learning curve with respect to the courses I'm delivering at the university. I can't write more than one full lesson plan at a time, as in each new lesson I receive crucial feedback on what my students are capable of. So I'm constantly revising my material.
My father, during his career as a Chemistry teacher, delivered the same course year after year to classes of no more than 20 pupils at a time. Every time he taught a lesson, though, he was looking for improvements and refinements based on the reaction of the class. If someone made a mistake, he'd try to change the teaching to remove the possibility of someone in the next class making the same mistake.

So in the case of a conscientious teacher, material is revised for every 20 students taking the course.
 
Sebastian Thrun's first sitting of the Artificial Intelligence course had 160,000 students. OK, only 14% completed the course, but 22,400 students is still an incredibly high number. That's 1,120 iterations of a class for me or my Dad. We're talking about numerous lifetimes of teaching. For a course taught once a year, it's equivalent to going back to the first millennium AD, not only before the computer, but before algebra, Cartesian geometry and even the adoption of the Hindu-Arabic number system in Europe.  So we're talking about "A.D. DCCCXCII", not "892 AD".
A millennium's worth of teaching, with no improvement – I think that qualifies as the slowest rate of change in education ever, rather than the fastest.
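The back-of-the-envelope arithmetic behind that claim is easy to check:

```python
# Back-of-the-envelope check of the numbers above.
enrolled = 160_000
completed = int(enrolled * 0.14)       # 14% finished the course
class_size = 20
iterations = completed // class_size   # equivalent classes of 20
year_equivalent = 2012 - iterations    # one class per year, counting back
```

One 20-pupil class per year, counted back from 2012, really does land in the 9th century.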

Worse than that, while Thrun complains that his contemporaries are simply throwing existing courses onto the net without making them truly match the new paradigm, those are at least courses with a fair amount of real-world testing behind them.  His own course, by contrast, is an attempt at something completely new, which means he gave it to over 20,000 students without having tested it even once (as far as I can see). That's... worrying.
So what's the source of the problem?
The problem as I see it has two root causes: the medium and (as always) money.
The medium.

The current trend to massive online courses is a development of MIT's OpenCourseWare initiative. Essentially, MIT videoed a bunch of lectures and stuck them online with various course notes, exercise sheets and textbook references. I know a few people who got a lot out of one or two courses, but often the quality was bitty, with incomplete materials (due to copyright or logistical reasons) and little motivation to complete.

The early pioneers of the current wave saw a major part of the problem as being in the one-hour lecture format, and revised it to a “micro-lecture” format, delivering short pieces to camera, interspersed with frequent concept-checking and small tasks.
But however small the lecture, it is still fundamentally the same thing, with a live human writing examples on some kind of board, and any revision means the human going back to the board and writing it out again, and giving the explanations again. The presented material cannot be manipulated automatically, so the potential for rapid revision and correction is reduced.
Money.

Revising a course manually takes time, and time is money. Squeezing several lifetimes' worth of improvements into a rapid development cycle isn't a part-time job – it's probably more than a full-time job, yet in the brave new world of online education, this is nobody's day job. Most of the course designers are still teaching and researching, and Thrun himself is still doing research while working at one of the world's biggest tech companies and trying to start up a new company.
No-one's yet really worked out the way to cash in on these developments, so no-one's investing properly.

Here in the UK, online education (on a smaller scale) is already on the increase, but mostly as a cost-cutting measure. That's fine as a long term goal, but in the short-term there is a need for massive investment in order to get things right.

What are we left with?
Not a lot, frankly. Data-mining requires a widely-varying dataset, in order to allow the computer to detect patterns that are too subtle or on too large a scale for a human to pick up independently. But the data collected on these online programmes is pretty much one-dimensional. There are no variables explored in the teaching – there is one course, so the feedback can say whether something is difficult or easy (based on the number of correct answers and the time taken to answer), but it can't tell us why, and it can't tell us what would be better. That means that the feedback from 22,400 students is less valuable to a good teacher than one question from an average student during an average class. That's... worrying.

So much for the revolution.

So what's the solution?

If there are two parts to the problem, there must be two parts to the solution.

Medium
The Open University has, over the years, moved away from lectures to producing TV quality documentaries that use the best practices of documentary TV to present material in a way that genuinely enlightens the viewer.
 
As a documentary isn't a single continuous lecture, it would theoretically be possible to have a computer modify and re-edit a documentary to make it easier to understand.
 
On the most basic level, a difficult concept might be made easier by inserting an extra second of thinking time at a certain point in the video -- an algorithm would be able to test this dynamically.  Conversely, the algorithm might find that reducing the pause is more effective, and do so dynamically (we assume then that the concept is easy and that extra time allows the student to become distracted).
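To be concrete about what "test this dynamically" could mean, here's a hypothetical sketch: an epsilon-greedy bandit over a few candidate pause lengths, scored by whether students answer the follow-up concept check correctly. Every name here is invented for illustration -- no such system exists, as far as I know.

```python
import random

# Hypothetical epsilon-greedy tuner for the pause inserted before a
# difficult concept. Each "arm" is a candidate pause length in seconds,
# scored by whether the student answers the follow-up check correctly.
class PauseTuner:
    def __init__(self, pauses=(0.0, 1.0, 2.0), epsilon=0.1):
        self.pauses = list(pauses)
        self.shown = {p: 0 for p in self.pauses}
        self.correct = {p: 0 for p in self.pauses}
        self.epsilon = epsilon

    def choose(self):
        # Mostly exploit the best-scoring pause; occasionally explore.
        if random.random() < self.epsilon:
            return random.choice(self.pauses)
        # Untried pauses get an optimistic score so they get tried at all.
        return max(self.pauses,
                   key=lambda p: self.correct[p] / self.shown[p]
                                 if self.shown[p] else 1.0)

    def record(self, pause, answered_correctly):
        self.shown[pause] += 1
        if answered_correctly:
            self.correct[pause] += 1

tuner = PauseTuner()
pause = tuner.choose()   # serve the video with this pause, then record()
```

If the shorter pause wins, the algorithm converges on it -- which is exactly the "concept is easy, extra time just distracts" case.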
 
Then there are the slides and virtual whiteboards used in the videos themselves -- produced in real-time as the presenter speaks.  This splits the presenter's attention, often resulting in rushed, unclear writing, or pauses and hesitations in speech.  Revising the visuals means redoing the whole video.
 
Why doesn't the computer build the visuals to the presenter's specification, but with the ability to modify them in response to student feedback?
 
Eventually, we would get to the point where a course definition is a series of voice-over fragments and descriptions of intended visuals, and the computer decides what to put where.
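Purely speculatively, such a course definition might look something like this as data -- all the field names are my own invention:

```python
from dataclasses import dataclass, field

# Speculative sketch: a course defined as data rather than video.
# Narration fragments plus declarative descriptions of the visuals,
# leaving the final assembly (ordering, pauses, layout) to software.
@dataclass
class VisualSpec:
    kind: str                 # e.g. "slide", "diagram", "equation"
    content: str              # what to render, not how to render it
    emphasis: list = field(default_factory=list)

@dataclass
class Fragment:
    voiceover: str            # the recorded narration for this step
    visual: VisualSpec
    min_pause: float = 0.0    # tunable from student feedback

course = [
    Fragment("A force is any push or pull...",
             VisualSpec("diagram", "arrow acting on a block")),
]
```

Because the visuals are descriptions rather than recordings, revising them is an edit to data, not a re-shoot.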
 
But the reason that'll never happen in the current model is reason 2:

Money

Online education will find its most fertile ground wherever there is a genuine incentive to drive down the cost of education. When you look at the tuition fees in places like Stanford, Harvard and MIT, you'll see that these aren't the schools with the biggest incentive to make online education work.
Instead, we need to look to Europe, and in particular the countries with significant public funding for higher education. Universities funded by the public purse are under intense pressure to cut costs – it's the only way to balance the books in a shrinking economy.
However, the universities alone can't make this happen, as the current pressure is for cost savings NOW, and so they're producing online programmes with insufficient research and the quality of education is suffering for it.

Governments are sacrificing students to the God of Market Forces, when they should instead be planning intelligently. Instead of cutting funding to force universities to be more economical, they should be investing to make universities more economical. Give universities money now in order to produce high-quality programmes that will reduce costs for years to come.
 
But It Will Not Be Cheap – quite the opposite.  The creation of a genuinely high-quality online course is phenomenally expensive in terms of up-front costs, while being ridiculously cheap in the long term.


The current clientele of Udacity, edX and Coursera will no doubt feel cheated that I'm talking about education for the classic “student” rather than the free “everyman” approach of Coursera et al, but there's no need to. Established, well-researched, properly tested and adequately trialled online courses may take a while to perfect, but once they exist, their running costs will be so low that they will surely be made widely available. And while they're being developed, they're going to need a constant source of beta testers, and that's going to mean people who're doing it for personal interest, not for grades – i.e. you. The end result will still be open education, but it will be better.

19 August 2012

The weirdness of learning something you already "know"...

I've looked at starting on Corsican a few times, but it's always seemed really difficult to get anywhere.  Why?  Because all the beginners' material looks too obvious to me to be worth studying.  If it's not like Italian, it's like Spanish or French.  Everything I'd found was phrase-based, and I could understand most of the phrases with no effort.  My brain saw no reason for effort.

That's something I find quite interesting, because I've always argued that extrinsic motivation (motivation from outside the material/course) is not enough -- all teaching has to be intrinsically motivating.

Well, today I came across a slightly better site (for French speakers).  The lessons there aren't brilliant, as they're just information, and the exercises on the site aren't related to the lessons.

But the information's there, so I can juggle it about to suit myself.

Maybe it's time to dive back into the Python NLTK and write myself a wee bit of self-teaching software....
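The sort of thing I have in mind is pretty simple -- for instance, turning a phrase list into cloze-deletion drills. A quick sketch in plain Python (the Corsican phrases are just placeholders, and a real version would use NLTK to pick which word to blank more intelligently):

```python
import random

# Turn a phrase into a cloze-deletion drill by blanking one word.
# Placeholder phrases; a real version would load a proper phrase list
# and use NLTK's taggers to choose content words over function words.
def make_cloze(phrase, rng=random):
    words = phrase.split()
    i = rng.randrange(len(words))      # naive choice: any word
    answer = words[i]
    words[i] = "____"
    return " ".join(words), answer

phrases = ["bonghjornu à tutti", "cumu và"]
drill, answer = make_cloze(phrases[0])
```

Crude, but it already does what the website doesn't: it makes the exercises come from the same material as the lessons.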