Class | Description |
---|---|
FileNullInputFormat | A virtual input format that checks one file and returns its name to the mapper. |
FileNullInputFormat.FileNullRecordReader | |
WikiFullRevisionJsonInputFormat | Provides a converter from JSON revisions to FullRevision objects. The code is inspired by the source code of the Manning book "Hadoop in Practice": https://github.com/alexholmes/hadoop-book |
WikiFullRevisionJsonInputFormat.JsonRevisionReader | |
WikiRevisionDiffInputFormat | |
WikiRevisionDiffInputFormat.DiffReader | Reads every pair of consecutive revisions and calculates their diff using Myers' algorithm. |
WikiRevisionFullInputFormat | |
WikiRevisionFullInputFormat.RevisionReader | Reads each revision of a Wikipedia page and transforms it into a WikipediaRevision object. |
WikiRevisionHeaderInputFormat | Probably the simplest input format: it reads the chunks of the dump files and extracts only the header of each revision. |
WikiRevisionInputFormat<KEYIN,VALUEIN> | An InputFormat implementation that splits a Wikipedia revision file into page fragments and outputs them as input records. |
WikiRevisionPageInputFormat | |
WikiRevisionPageInputFormat.RevisionReader | Reads each revision of a Wikipedia page and transforms it into a WikipediaRevision object. |
WikiRevisionPairInputFormat | |
WikiRevisionPairInputFormat.RevisionReader | Reads a meta-history XML file and outputs every pair of consecutive revisions as a record. |
WikiRevisionReader<VALUEIN> | |
WikiRevisionTextInputFormat | |
WikiRevisionTextInputFormat.RevisionReader | Reads a meta-history XML file and outputs every pair of consecutive revisions as a record. |
WikiRevisionTimeInputFormat | |
WikiRevisionTimeInputFormat.RevisionReader | |
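
The description of WikiRevisionDiffInputFormat.DiffReader mentions diffing each pair of consecutive revisions. The following is a minimal, Hadoop-free sketch of that idea, assuming a revision is represented as a list of text lines and that the algorithm meant is Myers' greedy O(ND) diff. The class `RevisionDiffSketch` and its methods are hypothetical names for illustration and are not part of this package; the sketch only computes the size of the shortest edit script, not the reader's actual output.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not the package's DiffReader implementation. */
final class RevisionDiffSketch {

    /**
     * Returns the length of the shortest edit script (insertions plus
     * deletions) that turns a into b, using Myers' greedy algorithm.
     */
    static int editDistance(List<String> a, List<String> b) {
        int n = a.size();
        int m = b.size();
        int max = n + m;
        int offset = max;                  // shifts diagonal k into array range
        int[] v = new int[2 * max + 2];    // v[k + offset] = furthest x on diagonal k
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];        // step down: insertion from b
                } else {
                    x = v[offset + k - 1] + 1;    // step right: deletion from a
                }
                int y = x - k;
                // Follow the "snake" of matching lines along the diagonal.
                while (x < n && y < m && a.get(x).equals(b.get(y))) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        return max; // not reached: d = n + m always suffices
    }

    /** Diffs every revision against its predecessor, in order. */
    static List<Integer> consecutiveDistances(List<List<String>> revisions) {
        List<Integer> distances = new ArrayList<>();
        for (int i = 1; i < revisions.size(); i++) {
            distances.add(editDistance(revisions.get(i - 1), revisions.get(i)));
        }
        return distances;
    }
}
```

Calling `RevisionDiffSketch.consecutiveDistances` on a page history with N revisions yields N - 1 values, one per consecutive pair, which mirrors the record granularity described for the pair and diff readers.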

Enum | Description |
---|---|
WikiRevisionReader.STATE | |
WikiRevisionTimeInputFormat.TimeScale | |
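
WikiRevisionInputFormat, listed in the class table, is described as splitting a Wikipedia revision file into page fragments. As a rough illustration of what a page fragment is, the sketch below cuts an in-memory string on the `<page>` and `</page>` tags used by Wikipedia XML dumps. The class `PageFragmentSketch` is a hypothetical name, and the real input format works on Hadoop input splits and streams rather than on a whole string, so this should be read as an assumption-laden toy rather than the package's logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not the package's WikiRevisionInputFormat. */
final class PageFragmentSketch {
    private static final String OPEN = "<page>";
    private static final String CLOSE = "</page>";

    /** Returns every complete <page>...</page> fragment found in the text. */
    static List<String> splitPages(String dump) {
        List<String> pages = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = dump.indexOf(OPEN, from);
            if (start < 0) {
                break;                      // no further page starts
            }
            int end = dump.indexOf(CLOSE, start);
            if (end < 0) {
                break;                      // trailing page is incomplete; skip it
            }
            end += CLOSE.length();
            pages.add(dump.substring(start, end));
            from = end;
        }
        return pages;
    }
}
```

A fragment that starts in one chunk and ends in the next is dropped here; handling that boundary case is exactly what a real split-aware record reader has to solve.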

Copyright © 2014. All rights reserved.