| Class | Description |
|---|---|
| FileNullInputFormat | A virtual input format that checks one file and returns its name to the mapper. |
| FileNullInputFormat.FileNullRecordReader | |
| WikiFullRevisionJsonInputFormat | Provides a converter from JSON revisions to FullRevision objects. The code is inspired by the source code of the Manning book "Hadoop in Practice"; source: https://github.com/alexholmes/hadoop-book |
| WikiFullRevisionJsonInputFormat.JsonRevisionReader | |
| WikiRevisionDiffInputFormat | |
| WikiRevisionDiffInputFormat.DiffReader | Reads every pair of consecutive revisions and calculates their diff using Myers' algorithm. |
| WikiRevisionFullInputFormat | |
| WikiRevisionFullInputFormat.RevisionReader | Reads each revision of a Wikipedia page and transforms it into a WikipediaRevision object. |
| WikiRevisionHeaderInputFormat | Probably the simplest input format: it reads chunks of the dump files and extracts only the header of each revision. |
| WikiRevisionInputFormat<KEYIN,VALUEIN> | An InputFormat implementation that splits a Wikipedia revision file into page fragments and outputs them as input records. |
| WikiRevisionPageInputFormat | |
| WikiRevisionPageInputFormat.RevisionReader | Reads each revision of a Wikipedia page and transforms it into a WikipediaRevision object. |
| WikiRevisionPairInputFormat | |
| WikiRevisionPairInputFormat.RevisionReader | Reads a meta-history XML file and outputs every pair of consecutive revisions as a record. |
| WikiRevisionReader<VALUEIN> | |
| WikiRevisionTextInputFormat | |
| WikiRevisionTextInputFormat.RevisionReader | Reads a meta-history XML file and outputs every pair of consecutive revisions as a record. |
| WikiRevisionTimeInputFormat | |
| WikiRevisionTimeInputFormat.RevisionReader | |

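
Several of the readers listed above emit every pair of consecutive revisions as one record. The following is a minimal sketch of that pairing step only, not the package's actual code: the class name `ConsecutivePairs` and the method `pairs` are hypothetical, and plain strings stand in for revision objects.

```java
import java.util.ArrayList;
import java.util.List;

public class ConsecutivePairs {

    /**
     * Returns one [older, newer] pair for each adjacent pair of elements.
     * A list with fewer than two revisions yields no pairs.
     * (Hypothetical helper; the real readers stream revisions from XML.)
     */
    static <T> List<List<T>> pairs(List<T> revisions) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i + 1 < revisions.size(); i++) {
            out.add(List.of(revisions.get(i), revisions.get(i + 1)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Three revisions produce two overlapping pairs.
        System.out.println(pairs(List.of("r1", "r2", "r3")));
    }
}
```

Note that the pairs overlap: each revision except the first and the last appears once as the newer element and once as the older one.
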
| Enum | Description |
|---|---|
| WikiRevisionReader.STATE | |
| WikiRevisionTimeInputFormat.TimeScale | |

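
The `WikiRevisionDiffInputFormat.DiffReader` entry in the class table mentions diffing consecutive revisions. Assuming this refers to Myers' O(ND) greedy diff algorithm, the sketch below computes only the length of the shortest edit script (insertions plus deletions) between two revisions given as arrays of lines. The class name `MyersDiff` and the method `editDistance` are hypothetical, and the package's own implementation may differ.

```java
public class MyersDiff {

    /**
     * Length of the shortest edit script (insertions + deletions) that turns
     * a into b, using the greedy forward search from Myers' O(ND) algorithm.
     */
    static int editDistance(String[] a, String[] b) {
        int n = a.length, m = b.length;
        int max = n + m;
        int offset = max + 1;              // shifts diagonal k into a valid array index
        int[] v = new int[2 * max + 3];    // v[offset + k] = furthest x reached on diagonal k
        v[offset + 1] = 0;
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];       // step down: insertion from b
                } else {
                    x = v[offset + k - 1] + 1;   // step right: deletion from a
                }
                int y = x - k;
                // Follow the diagonal while lines match (free moves).
                while (x < n && y < m && a[x].equals(b[y])) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        return max; // not reached: d = n + m always suffices
    }

    public static void main(String[] args) {
        String[] older = {"a", "b", "c"};
        String[] newer = {"a", "c", "d"};
        System.out.println(editDistance(older, newer)); // prints 2 (delete "b", insert "d")
    }
}
```

A full diff would additionally record the path taken for each `d` so the edit script can be reconstructed by backtracking; that part is omitted here to keep the sketch short.
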
Copyright © 2014. All rights reserved.