LensKit Recommender Implementation

Release 0.10

The Mercurial changelog and the list of closed tickets provide more information on what has happened, including bugs that have been fixed.

Rewrote evaluator configuration to use Groovy rather than XML+JavaScript. The result is much more readable & easier to extend. See the manual entry for documentation of the evaluator. This also allows crossfolding to be done and written to disk independent of actual train-test evaluation.
Enhanced train-test predict metrics to support per-user results (written to a separate table), and to know when the evaluation starts and stops (to support additional accumulation of results that does not fit the per-user or per-run paradigm).
Allowed demo & smoke-test to use a local copy of the ML-100K data set.
Evaluator inputs & outputs transparently support GZip compression if the file names end in ".gz".
Reworked evaluation command line arguments for new Groovy-based support. Evaluation now takes a single file with the -f option (defaults to eval.groovy), and the remaining command line arguments are task names.
Metrics for the train-test code now take test users, in the form of TestUser, rather than ratings & predictions. This allows them to measure recommendations and access training history, and gives us flexibility to let them do even more in the future.
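
The ".gz" file-name convention noted above for evaluator inputs & outputs can be pictured with a minimal sketch using only the JDK's java.util.zip classes. The class and method names here (GzipAwareFiles, openReader, openWriter) are hypothetical and are not LensKit's actual API; the sketch only illustrates choosing compression from the file name.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not part of LensKit): picks GZip handling from the file name.
class GzipAwareFiles {
    // True when the name follows the ".gz" convention.
    static boolean isGzipName(String name) {
        return name.endsWith(".gz");
    }

    // Opens a reader, decompressing transparently for ".gz" files.
    static BufferedReader openReader(File file) throws IOException {
        InputStream in = new FileInputStream(file);
        try {
            if (isGzipName(file.getName())) {
                in = new GZIPInputStream(in);
            }
        } catch (IOException e) {
            in.close(); // the GZip header could not be read; release the file handle
            throw e;
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Opens a writer, compressing transparently for ".gz" files.
    static BufferedWriter openWriter(File file) throws IOException {
        OutputStream out = new FileOutputStream(file);
        try {
            if (isGzipName(file.getName())) {
                out = new GZIPOutputStream(out);
            }
        } catch (IOException e) {
            out.close();
            throw e;
        }
        return new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }
}
```

Callers use the same two methods for plain and compressed files, so switching an output from "results.csv" to "results.csv.gz" would need no other code change under this scheme.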

Made item-item CF use ItemVector rather than just ImmutableSparseVector for representing item vectors and querying for their similarities.
Defined the direction in which similarities are used in the item-item recommender, and made it consistent and correct in the face of asymmetric similarity functions such as conditional probability (#131). The similarity functions are now better documented.
Added support for unlimited neighborhood sizes (but not yet model sizes) in the item-item recommender.
Made ItemItemModel an interface, so alternative sources of neighborhoods with similarity scores can be used. The default implementation uses a similarity matrix as before.
Added global recommenders and scorers that score items with respect to other items, independent of any particular user. These are useful for creating “more like this” or “related items” views (#125).
- General API — see the GlobalItemScorer and GlobalItemRecommender classes.
- Implementation of global recommenders for item-item CF.
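
To illustrate why the direction of a similarity matters for an asymmetric function such as conditional probability, here is a small self-contained sketch. It does not use LensKit's classes and makes no claim about which direction LensKit adopted; the class name, method name, and toy data are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: conditional probability computed from sets of user IDs.
class ConditionalProbabilityDemo {
    // Estimates P(target | given): the fraction of users who rated `given`
    // that also rated `target`. Swapping the arguments changes the denominator.
    static double conditionalProbability(Set<Long> usersOfGiven, Set<Long> usersOfTarget) {
        if (usersOfGiven.isEmpty()) {
            return 0.0;
        }
        int both = 0;
        for (Long user : usersOfGiven) {
            if (usersOfTarget.contains(user)) {
                both++;
            }
        }
        return (double) both / usersOfGiven.size();
    }

    public static void main(String[] args) {
        // Toy data: item A was rated by four users, item B by two of them.
        Set<Long> usersOfA = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        Set<Long> usersOfB = new HashSet<>(Arrays.asList(1L, 2L));

        System.out.println(conditionalProbability(usersOfA, usersOfB)); // P(B | A) = 0.5
        System.out.println(conditionalProbability(usersOfB, usersOfA)); // P(A | B) = 1.0
    }
}
```

Because the two calls give different values, a recommender that mixes up the argument order would silently compute different neighborhoods, which is why a fixed, documented direction is needed.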

Moved predict evaluators to o.g.l.eval.metrics.predict and renamed the base interface to PredictEvalMetric.
Refactored cursor implementations:
- Renamed AbstractRatingCursor to AbstractEventCursor and made it handle any event.
- Made AbstractPollingCursor support fast polling, and made AbstractEventCursor extend it.
- Incompatible change: renamed ScannerRatingCursor to DelimitedTextRatingCursor, and made it use buffered readers (via DelimitedTextCursor) rather than scanners. Only affects code that uses the scanner rating cursor directly.
- Incompatible change: removed support for URL-backed streams from SimpleFileRatingDAO. If support for generic streams is needed, we can re-add this with InputSuppliers from Guava.
Replaced TaskTimer with commons-lang3 StopWatch. Any code using TaskTimer will need to be updated.
Removed now-unnecessary data tree code.
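
As a rough picture of the scanner-to-buffered-reader change noted above for DelimitedTextRatingCursor, the following sketch reads delimited rating lines with a BufferedReader. It is not LensKit's implementation: the class DelimitedRatingReader, its Rating holder, and the assumed field order (user, item, rating) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch (not LensKit code): line-based parsing of delimited ratings.
class DelimitedRatingReader {
    // Minimal value holder; the field order user, item, rating is an assumption.
    static final class Rating {
        final long userId;
        final long itemId;
        final double value;

        Rating(long userId, long itemId, double value) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    // Reads every non-blank line, splitting on the literal delimiter.
    static List<Rating> readAll(BufferedReader reader, String delimiter) throws IOException {
        // Quote the delimiter so characters such as "|" are not treated as regex syntax.
        Pattern splitter = Pattern.compile(Pattern.quote(delimiter));
        List<Rating> ratings = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = splitter.split(line);
            if (fields.length < 3) {
                throw new IOException("expected at least 3 fields: " + line);
            }
            ratings.add(new Rating(
                    Long.parseLong(fields[0].trim()),
                    Long.parseLong(fields[1].trim()),
                    Double.parseDouble(fields[2].trim())));
        }
        return ratings;
    }
}
```

Reading whole lines and splitting them keeps each record's parsing in one place, and extra trailing fields (such as a timestamp) are simply ignored in this sketch.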