Wow. This is a far more thorough approach than the experience-focused app I made in 2007, but then again Roger Craig is also a lot better at Jeopardy! than I am.
The only thing I can get from this article is that he created a study tool that wasn't a set of papers or a document but actually a computer program. I don't see how this has any significance other than that he created a computer program. I wish there were more information in this article than fluff.
The number of Jeopardy! questions online in J!-Archive is by now quite large, so large that it has begun to approach the corpus of facts from which knowledge-based, non-current-events Jeopardy! questions are drawn.
Although Jeopardy does not repeat questions per se, they often find different ways to clue the same fact or connection.
I would guess that if you took the text of a clue, ran some trivial clustering on the J!-Archive contents, and instantly displayed the top three clues and responses for a human, they could be quite competitive on the show even if they knew nothing on their own.
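To make that guess concrete, here is a minimal sketch of the lookup step. It swaps the "trivial clustering" for a plain nearest-neighbour search using TF-IDF-weighted cosine similarity, and the `ARCHIVE` list is a made-up stand-in for the J!-Archive contents, not real scraped data. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for archive contents: (clue, response) pairs.
ARCHIVE = [
    ("This French author wrote Les Miserables", "Victor Hugo"),
    ("He wrote The Hunchback of Notre-Dame in 1831", "Victor Hugo"),
    ("This planet is known as the Red Planet", "Mars"),
    ("The largest planet in our solar system", "Jupiter"),
    ("Author of Madame Bovary", "Gustave Flaubert"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(archive):
    """Return (idf, vectors): smoothed IDF weights and one TF-IDF vector per clue."""
    docs = [Counter(tokenize(clue)) for clue, _ in archive]
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(clue, archive, idf, vectors, k=3):
    """Return the k archived (clue, response, score) entries most similar to clue."""
    counts = Counter(tokenize(clue))
    # Terms never seen in the archive carry no weight in the query.
    query = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scored = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(archive[i][0], archive[i][1], score) for score, i in scored[:k]]


idf, vectors = build_index(ARCHIVE)
for old_clue, response, score in top_matches(
    "The French author of Les Miserables", ARCHIVE, idf, vectors
):
    print(f"{score:.2f}  {response}: {old_clue}")
```

On a real archive you would want stemming and stop-word handling, but even this crude version shows the idea: the human only has to pick among three candidate responses.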
So if you put the J!-Archive into a Leitner-system flashcard app and put a few hundred hours into it, I imagine it would improve a contestant's chances considerably. It doesn't look like much, but that looks like the major tool here. If I had won my last game (the final regular game of Season 23, followed by a six-week hiatus) I was planning something like this. However, I never thought of using analytic feedback with categorization in this way. That clearly also gave him some valuable feedback on what to study and what categories to be confident about.
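For anyone who hasn't seen a Leitner system, a rough sketch follows. The five boxes and the doubling review interval (box k comes up every 2^(k-1) sessions) are my assumptions, since schedules vary between apps, and `Card` and `LeitnerDeck` are hypothetical names, not anything from Craig's tool.

```python
from dataclasses import dataclass


@dataclass
class Card:
    clue: str
    response: str
    box: int = 1  # every card starts in the most frequently reviewed box


class LeitnerDeck:
    """Leitner boxes: a correct answer promotes a card, a miss sends it back to box 1."""

    def __init__(self, cards, num_boxes=5):
        self.cards = list(cards)
        self.num_boxes = num_boxes
        self.session = 0

    def due_cards(self):
        """Start a new session and return the cards scheduled for it.

        Assumed schedule: box k is reviewed every 2**(k - 1) sessions.
        """
        self.session += 1
        return [c for c in self.cards if self.session % (2 ** (c.box - 1)) == 0]

    def record(self, card, correct):
        """Move the card up one box if answered correctly, else back to box 1."""
        if correct:
            card.box = min(card.box + 1, self.num_boxes)
        else:
            card.box = 1
```

Loading the archive is then just a matter of building one `Card` per clue; the cards you keep missing stay in box 1 and come up every session.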
However, this is to take nothing away from Roger Craig's exceedingly impressive untrained skill at the game, which he was able to train effectively to unreal levels. He plays Jeopardy! like a boss.
I think the big innovation highlighted in the video is that he used historical data on frequency of certain topics, the dollar value assigned to those topics, and his personal performance on certain categories to figure out which categories were most important to study. I know that I am bad at fashion and 19th-century French literature, but what relative fraction of my study time should I spend on one over the other? His code gave him a number, and that's the ratio in which he studied them. So you're right, at that point it's not much different from flipping through a giant study manual, other than the manual telling you that you should really look at pages 3, 11, 21, 47, 96, ... because that's what the solution to the nonlinear optimization problem said would give you the best chance at winning against simulated opponents.
The information is in the second video. I figured, why transcribe Craig's description when he did a succinct job of it himself?
To be honest, there's little information other than what he divulged out there on the net, as it was a private tool. If I find more I'll write a follow-up; it's pretty fascinating.
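To give a feel for the "what fraction on each category" question, here is a much cruder heuristic than the nonlinear optimization described in the video. It is not Craig's method: it simply weights each category by frequency times average dollar value times your miss rate, and splits the hours proportionally. The function name, the field names, and all the numbers are made up for illustration.

```python
def study_allocation(categories, total_hours):
    """Split total_hours across categories in proportion to expected dollars lost.

    categories maps a name to a dict with:
      frequency  - assumed share of clues that fall in the category
      avg_value  - assumed average dollar value of those clues
      accuracy   - your own fraction answered correctly (0.0 to 1.0)
    """
    weights = {
        name: s["frequency"] * s["avg_value"] * (1.0 - s["accuracy"])
        for name, s in categories.items()
    }
    total = sum(weights.values())
    if total == 0:
        return {name: 0.0 for name in categories}
    return {name: total_hours * w / total for name, w in weights.items()}


# Hypothetical numbers, not taken from J!-Archive.
stats = {
    "fashion": {"frequency": 0.01, "avg_value": 800, "accuracy": 0.4},
    "19th-century French literature": {"frequency": 0.02, "avg_value": 1000, "accuracy": 0.5},
}
for name, hours in study_allocation(stats, total_hours=100).items():
    print(f"{name}: {hours:.1f} hours")
```

With these invented inputs the split comes out to roughly 32 hours of fashion and 68 of French literature. The real thing would also account for how quickly you improve per hour studied, which this ignores.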
My bet is that he'd lose. Beyond a certain level of knowledge, successful Jeopardy is all about hitting the buzzer first. Watson has a huge advantage in that respect because it gets the signal to buzz in electrically.
Allow both contenders to press the button within the first 2 seconds and, if they both buzz in within the 2 second window, the human answers first, because Watson could be trusted to "lock in" its answer.
Points are issued to whoever gets the answer right, allowing for both to receive or lose points on the same question.
Watson had a physical buzzer pusher. While he may have an advantage of reflexes, he did not do it fully electronically (obviously he 'electronically' activated the motor that pushed the button).
It's hard to say if that really makes it "unfair", since a lot of Ken Jennings' (and every other Jeopardy champion's) success was due to his reaction time with the buzzer. That's simply a huge part of the game.
On the other hand, nobody would dispute that a machine has faster reactions than a human, so maybe that makes it less interesting. Tough to figure out a great way to resolve that.
Having actually played against Watson, it certainly feels unfair when you've got a response you're 100% sure of on the tip of your tongue but you have to wait several long seconds for the question to be read. All the while you're desperately aware that Watson's reaction time is always 100 ms and yours ranges from 50 to 200 ms.
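Taking those figures at face value, a quick simulation shows how often the human would get in first. The big assumption here is that human reaction times are uniformly spread over the 50 to 200 ms range, which real reaction times almost certainly are not; the function name and defaults are hypothetical.

```python
import random


def human_first_fraction(trials=100_000, watson_ms=100.0,
                         human_range_ms=(50.0, 200.0), seed=0):
    """Estimate how often a human beats a fixed machine reaction time.

    Assumes human reaction times are uniform over human_range_ms.
    """
    rng = random.Random(seed)
    low, high = human_range_ms
    wins = sum(1 for _ in range(trials) if rng.uniform(low, high) < watson_ms)
    return wins / trials


print(f"Human buzzes first in about {human_first_fraction():.0%} of races")
```

Under that uniform assumption the exact answer is (100 - 50) / (200 - 50), or one race in three, and that only counts clues where both players already know the response.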
It is interesting to watch machines encroach on things we once considered exclusively human mental abilities. In the early days processors just multiplied large numbers faster than we could. Then they beat a grandmaster at chess and now a machine is definitely better at Jeopardy.
What is fair? Defining fair in terms of equal outcome is vacuous here.
For a bit I thought it would be more interesting to let the humans ring in as soon as they had read the question, but the problem is that today they would slaughter Watson, and three years from now Watson would slaughter them, and possibly without even updating its software.
I suppose the most fair thing to do would be to do away with the buzzer entirely. Let all players choose whether or not to answer all questions (that is, all three could get it right, all could get it wrong, one right, one abstain, one wrong, etc), and check the scores at the end. Without the buzzer advantage for Watson I suspect Jennings and Rutter still win... for another year or two, anyhow.
The way it works in Jeopardy is that after the question is read, there's a guy who flicks a switch after which the buzzing window is open (and there's a visual cue so the players know when this happens). If you buzz before that you get briefly locked out.
One advantage Roger has is insane betting (which could backfire, obviously). I didn't watch Watson -- did it bet so aggressively on Daily Double questions?
Watson was able to analyze the current situation and make a rational bet on Daily Doubles. It analyzed its comfort in the category the Daily Double was in, and then used a regression model to pick the optimal bet size, which factored in how winning or losing the Daily Double would change Watson's chances of winning or losing the game.
I'm not really sure if Watson's model goes as far as to consider other players' Daily Double behavior, or simply examines other players' scores. It would make sense to try to look at the game theory aspect though, especially in an iterative-game scenario. If a player bets hyper-aggressively on Daily Doubles (and Final Jeopardy) and gets most of them right, you'd think Watson would want to take that into consideration.
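As a toy illustration of that kind of bet sizing (emphatically not Watson's actual model), you can pick the wager that maximizes the expected chance of winning the game. Everything here is an assumption: `win_probability` is an invented logistic curve on your lead over the best opponent, `scale` and `round_max` are made-up parameters, and the wager limits (minimum $5, maximum the larger of your score or the top clue value) are my understanding of the show's rules.

```python
import math


def win_probability(my_score, best_opponent_score, scale=4000.0):
    """Hypothetical model: logistic function of the lead over the best opponent."""
    return 1.0 / (1.0 + math.exp(-(my_score - best_opponent_score) / scale))


def best_daily_double_bet(my_score, best_opponent_score, p_correct,
                          round_max=1000, minimum=5, step=100):
    """Return the wager that maximizes expected win probability.

    p_correct is the player's estimated chance of answering correctly,
    standing in for "comfort in the category".
    """
    max_bet = max(my_score, round_max)
    candidates = sorted({minimum, max_bet, *range(step, max_bet + 1, step)})

    def expected_win(bet):
        right = win_probability(my_score + bet, best_opponent_score)
        wrong = win_probability(my_score - bet, best_opponent_score)
        return p_correct * right + (1.0 - p_correct) * wrong

    return max(candidates, key=expected_win)


# Trailing badly but confident in the category: the toy model bets everything.
print(best_daily_double_bet(5000, 15000, p_correct=0.9))
# Comfortably ahead and shaky on the category: it bets the minimum.
print(best_daily_double_bet(15000, 5000, p_correct=0.3))
```

Even a crude curve like this reproduces the intuition that a trailing player should bet big when confident, while a leader with a weak category should protect the lead.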
"Of course, if he’s smart enough to build a focused learning app that allowed him to dominate the show like this, it’s probably likely he would have done ok on his own."
I somehow missed the data-scraping introduction, that he sourced a fan-operated website that archives questions and answers. Here I'd thought he'd just off-handedly created a phenomenal data scraper for the general web.
He didn't talk much about the algorithm he used to present questions for optimal memorization. I imagine it was spaced repetition, as used in SuperMemo:
A while back I actually built a mobile site for playing Jeopardy (originally conceived as a drinking game based on Jeopardy, but that's another story...), part of which was a scraper that pulled data off the fan-sourced site and parsed it out.
http://zachbaker.com/how-to-win-on-jeopardy-with-ruby-on-rai...