Dead on Time

MORSE recommended movies to users by comparing their tastes to those of other users, and recommending movies liked by others of similar tastes. Users' tastes were determined by asking them to rate the movies they had seen on a scale from 0 to 10. The accuracy of MORSE was evaluated by leave-one-out cross-validation, which involved removing each individual movie rating, predicting it using MORSE's algorithm, and comparing the rating with the prediction. MORSE's algorithm, which was quite complex, performed significantly better than all other published techniques, such as average of N nearest neighbour. For more about MORSE, see An application of social filtering to movie recommendation.
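To make the evaluation procedure concrete, here is a minimal sketch of leave-one-out cross-validation. It is not MORSE's own algorithm, which was considerably more complex. It uses the simple "average of N nearest neighbours" baseline mentioned above. The toy ratings, the function names, the root-mean-square distance measure and the use of mean absolute error as the score are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy data: user -> {movie: rating on the 0 to 10 scale}.
RATINGS = {
    "ann": {"Alien": 9, "Brazil": 8, "Casablanca": 3},
    "bob": {"Alien": 8, "Brazil": 9, "Casablanca": 2},
    "cat": {"Alien": 2, "Brazil": 3, "Casablanca": 9},
    "dan": {"Alien": 3, "Brazil": 2, "Casablanca": 8},
}


def distance(a, b):
    """Root-mean-square rating difference over movies both users rated.

    Returns None when the two users have no movies in common.
    (Assumed distance measure, chosen only for simplicity.)
    """
    shared = set(a) & set(b)
    if not shared:
        return None
    return sqrt(sum((a[m] - b[m]) ** 2 for m in shared) / len(shared))


def predict(ratings, user, movie, n=1):
    """Average the ratings given to `movie` by the n users closest to `user`.

    The held-out movie is excluded from the distance calculation, so the
    rating being predicted never influences its own prediction.
    """
    mine = {m: r for m, r in ratings[user].items() if m != movie}
    candidates = []
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        rest = {m: r for m, r in theirs.items() if m != movie}
        d = distance(mine, rest)
        if d is not None:
            candidates.append((d, theirs[movie]))
    if not candidates:
        return None
    nearest = sorted(candidates, key=lambda c: c[0])[:n]
    return sum(r for _, r in nearest) / len(nearest)


def leave_one_out_mae(ratings, n=1):
    """Hold out each rating in turn, predict it, and average the absolute errors."""
    errors = []
    for user, rated in ratings.items():
        for movie, actual in rated.items():
            predicted = predict(ratings, user, movie, n)
            if predicted is not None:
                errors.append(abs(predicted - actual))
    return sum(errors) / len(errors)
```

Calling `leave_one_out_mae(RATINGS, n=1)` scores the baseline on the toy data. Swapping a different `predict` function into the same loop is how two algorithms can be compared on equal terms.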

After my MSc thesis on MORSE, funding for continued research stopped, and I was unable to continue development, despite several ideas I had to improve MORSE. One of these was for Lewis, an agent which checked through the TV programme listings on the BBC, ITV and Channel 4 web sites, looking for films which MORSE predicted users would like, and then mailing those users who opted in to this service. Another was to charge movie companies for placing personalized banner ads on the MORSE pages. Finally, I had some ideas for improving MORSE's accuracy. For example, I had determined that if a user rated fewer than about 30 movies, the algorithm was inaccurate, and it was better just to present the user with a chart of movies in order of popularity.
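The last of these ideas amounts to a simple fallback rule, which might be sketched as follows. This was never implemented in MORSE as far as this account goes. The function names are hypothetical, the `personalised` argument stands in for whatever recommender is in use, and "popularity" is assumed here to mean the mean rating across all users.

```python
MIN_RATINGS = 30  # approximate threshold quoted in the text above


def popularity_chart(ratings):
    """Rank every movie by its mean rating across all users, best first.

    Treating popularity as mean rating is an assumption; ranking by the
    number of ratings would be another reasonable reading.
    """
    totals = {}
    for rated in ratings.values():
        for movie, rating in rated.items():
            total, count = totals.get(movie, (0, 0))
            totals[movie] = (total + rating, count + 1)
    return sorted(totals, key=lambda m: totals[m][0] / totals[m][1], reverse=True)


def recommend(ratings, user, personalised, min_ratings=MIN_RATINGS):
    """Use the popularity chart for users with too few ratings.

    `personalised` is a hypothetical callable taking a user name and
    returning a ranked list of movies.
    """
    seen = ratings.get(user, {})
    if len(seen) < min_ratings:
        ranked = popularity_chart(ratings)
    else:
        ranked = personalised(user)
    # Never recommend something the user has already rated.
    return [movie for movie in ranked if movie not in seen]
```

The threshold is passed as a parameter because "about 30" was an empirical observation on one data set and would need re-measuring on any other.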

Then I wrote a proposal for an alternative use of the MORSE algorithm. Social filtering was already being used for other purposes, such as music recommendation and newsgroup filtering, so I thought that it would be worth testing the algorithm's performance on the more general problem of missing data in databases. This is called imputation by statisticians, and makes use of known data to impute unknown data. (E.g. you can impute that someone called Mrs Wong is certainly female and over 16, and is probably Chinese.)
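For readers unfamiliar with imputation, the simplest possible version replaces each missing value with the mean of the known values in the same column. The sketch below shows only that baseline, under the assumption that missing cells are marked with `None`. It is not the MORSE algorithm, and the function name is made up for this example.

```python
def impute_column_means(rows):
    """Return a copy of `rows` with each None replaced by its column's mean.

    A column with no known values at all is left as None.
    """
    means = []
    for column in zip(*rows):
        known = [value for value in column if value is not None]
        means.append(sum(known) / len(known) if known else None)
    return [
        [means[i] if value is None else value for i, value in enumerate(row)]
        for row in rows
    ]
```

A social-filtering approach differs in that it would fill each gap from the rows most similar to the incomplete one, rather than from the column as a whole.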

I was given data mining research funding to hire a student for a summer placement, to work on this small project (called Endeavour after Inspector Morse's first name). After interviewing about ten students, I hired and supervised Ian Parkinson, who was studying at Oxford University. He modified my code to work on different data sets, which are available on the web for use by statisticians. He also ran alternative algorithms on the same data sets. The evaluation technique was, again, leave-one-out cross-validation. He then wrote a report and gave a presentation on his findings, finishing slightly ahead of schedule.

What he found was very interesting, and I now regret not arranging for him to publish his results at the time. Ian found that MORSE did not perform any better than much simpler algorithms on the data he analysed. This result was surprising to me. It also meant that this was a dead end as far as research funding was concerned, even though it generated more questions than it resolved: in particular, why the algorithm does well on ratings given to movies but not on (for example) census data, and what other types of data the algorithm is suited to. Perhaps it only applies to personal ratings of things, or to data which falls within a narrow range of integers; perhaps only to movies. I never got the chance to find out.