lp:~berdario/+junk/scala-hadoop-indexer

Created by Dario Bertini and last modified
Get this branch:
bzr branch lp:~berdario/+junk/scala-hadoop-indexer

Branch information

Owner: Dario Bertini
Status: Development

Recent revisions

9. By Dario Bertini

Added readme

8. By Dario Bertini

Now the output key of the mapper is a TextPair
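
As a rough illustration of the composite key mentioned in revision 8, the sketch below shows a pair of strings that sorts by its first field and then by its second. It is plain Java with no Hadoop dependency: `StringPair` and `PairKeyDemo` are hypothetical names, not the branch's TextPair, and using (term, document id) as the two fields is an assumption, since the commit message does not say what the pair holds.

```java
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical stand-in for a two-field key; not the TextPair class from this branch.
final class StringPair implements Comparable<StringPair> {
    final String first;   // assumed to be the indexed term
    final String second;  // assumed to be the document id

    StringPair(String first, String second) {
        this.first = first;
        this.second = second;
    }

    // Order by the first field, falling back to the second on ties.
    @Override
    public int compareTo(StringPair other) {
        int byFirst = first.compareTo(other.first);
        return byFirst != 0 ? byFirst : second.compareTo(other.second);
    }

    // equals and hashCode agree with compareTo so that equal keys group together.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof StringPair)) return false;
        StringPair p = (StringPair) o;
        return first.equals(p.first) && second.equals(p.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }

    @Override
    public String toString() {
        return "(" + first + ", " + second + ")";
    }
}

class PairKeyDemo {
    public static void main(String[] args) {
        // A sorted map stands in for the sort-and-group step between map and reduce.
        TreeMap<StringPair, Integer> counts = new TreeMap<>();
        counts.merge(new StringPair("scala", "doc1"), 1, Integer::sum);
        counts.merge(new StringPair("hadoop", "doc2"), 1, Integer::sum);
        counts.merge(new StringPair("hadoop", "doc1"), 1, Integer::sum);
        counts.merge(new StringPair("hadoop", "doc1"), 1, Integer::sum);
        System.out.println(counts);
    }
}
```

A real Hadoop key would additionally have to implement the Writable serialization methods, which are left out here.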

7. By Dario Bertini

Scrapped the non-working generics code, and rewrote WritableTuple2 by correctly using GenericWritable
(but the compareTo of GenericType cannot work this way)

6. By Dario Bertini

Applied my fix for the WritableComparable Java generics problem from
http://stackoverflow.com/questions/5337336/create-a-generic-class-with-wildcard-types-and-multiple-constructors/5543096
Unfortunately, BinaryComparable is not a Writable and Text isn't strictly
Comparable with itself, so it's impossible to instantiate the WritableComparable tuple
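
The "Text isn't strictly Comparable with itself" remark in revision 6 refers to a Java generics rule that can be reproduced without Hadoop: in Hadoop, Text is declared as a WritableComparable<BinaryComparable>, i.e. comparable with its superclass rather than with its own type. The sketch below uses hypothetical stand-ins (`Bytes` for BinaryComparable, `Str` for Text, plus `StrictPair` and `RelaxedPair`) to show how a bound of the form `T extends Comparable<T>` rejects such a type while `Comparable<? super T>` accepts it. It only illustrates the language rule; it does not reproduce the branch's code or the approach from the linked answer, and it ignores the Writable side of the problem.

```java
// Stand-in for BinaryComparable: comparable with its own type, no serialization.
abstract class Bytes implements Comparable<Bytes> {
    abstract String content();

    @Override
    public int compareTo(Bytes other) {
        return content().compareTo(other.content());
    }
}

// Stand-in for Text: it inherits Comparable<Bytes>, so it is not a Comparable<Str>.
class Str extends Bytes {
    private final String value;

    Str(String value) {
        this.value = value;
    }

    @Override
    String content() {
        return value;
    }
}

// Strict bound: T must be comparable with exactly T.
class StrictPair<T extends Comparable<T>> {
    final T first;
    final T second;

    StrictPair(T first, T second) {
        this.first = first;
        this.second = second;
    }
}

// Relaxed bound: T may be comparable with any of its supertypes.
class RelaxedPair<T extends Comparable<? super T>> implements Comparable<RelaxedPair<T>> {
    final T first;
    final T second;

    RelaxedPair(T first, T second) {
        this.first = first;
        this.second = second;
    }

    // Compare by the first element, then by the second.
    @Override
    public int compareTo(RelaxedPair<T> other) {
        int byFirst = first.compareTo(other.first);
        return byFirst != 0 ? byFirst : second.compareTo(other.second);
    }
}

class BoundDemo {
    public static void main(String[] args) {
        // StrictPair<Str> does not compile: Str is not within the bound Comparable<Str>.
        // StrictPair<Str> rejected = new StrictPair<>(new Str("a"), new Str("b"));

        RelaxedPair<Str> left = new RelaxedPair<>(new Str("a"), new Str("b"));
        RelaxedPair<Str> right = new RelaxedPair<>(new Str("a"), new Str("c"));
        System.out.println(left.compareTo(right) < 0); // prints "true"
    }
}
```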

5. By Dario Bertini

Converted the simple MapReduce program so that it can create an index of documents
I was waiting to commit until I had a bare-bones working version of the indexer
Unfortunately, after days of struggling, this is still not the case...
Scala seems quite unfit to integrate with somewhat complex Java libraries like the Hadoop ones
(especially when generics are involved, and considering that I had never touched Scala until two weeks ago)

4. By Dario Bertini

Fixed problems with the Scala code: the signatures of the mapper and reducer
didn't match the superclass, and thus weren't overriding its methods
(and since the default mapper and reducer are the identity, different output types were required)
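
The pitfall from revision 4 can be sketched in plain Java (Scala's `override` modifier plays the same role as `@Override` here). `BaseMapper`, `BrokenMapper`, `FixedMapper` and `OverrideDemo` are hypothetical stand-ins, not Hadoop classes; the base class simply passes its input through, mirroring the identity default that the commit message describes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framework base class whose default map() is the identity.
class BaseMapper<K, V> {
    void map(K key, V value, List<String> out) {
        out.add(key + "=" + value);
    }
}

// The second parameter is Object instead of String, so this is an overload,
// not an override: callers going through BaseMapper still get the identity map().
class BrokenMapper extends BaseMapper<Integer, String> {
    void map(Integer key, Object value, List<String> out) {
        out.add("custom");
    }
}

// Same signature as the superclass method; @Override makes the compiler verify that.
class FixedMapper extends BaseMapper<Integer, String> {
    @Override
    void map(Integer key, String value, List<String> out) {
        out.add(value.toUpperCase());
    }
}

class OverrideDemo {
    // Calls map() the way a framework would: through the base type.
    static List<String> run(BaseMapper<Integer, String> mapper) {
        List<String> out = new ArrayList<>();
        mapper.map(1, "a", out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(new BrokenMapper())); // prints "[1=a]" (identity was used)
        System.out.println(run(new FixedMapper()));  // prints "[A]"
    }
}
```

Adding `@Override` to `BrokenMapper.map` would turn the silent mismatch into a compile error.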
Dropped old commented code

3. By Dario Bertini

Switched from Java to Scala
Converted the example Hadoop Test to Scala
This also means that the pom.xml was completely rewritten
(Because of the unhelpful errors, mangled downloads in the repository and
the old maven-eclipse plugin being unaware of Scala, transitioning the project
from Java to Scala with Maven was quite painful)

2. By Dario Bertini

Added pom.xml for building with Maven 3
(I'm evaluating it with this project, but I'm not sure I'll stick with it:
I find other tools more appealing... in the meantime I'll roll with Maven)

1. By Dario Bertini

First test (taken from the simple examples in the Cloud9 library)

Branch metadata

Branch format: Branch format 7
Repository format: Bazaar repository format 2a (needs bzr 1.16 or later)
This branch contains Public information 
Everyone can see this information.
