package org.apache.lucene.benchmark.utils

Classes

ExtractReuters: Split the Reuters SGML documents into simple text files containing the title, date, dateline, and body.
ExtractWikipedia: Extract the downloaded Wikipedia dump into separate files for indexing.
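
To make the ExtractReuters summary concrete, here is a minimal sketch of pulling the four named fields out of one Reuters SGML record. It is not the class's actual implementation: the class name ReutersFieldSketch, the extract method, the regular expression, and the output layout (fields in title, date, dateline, body order, each followed by a blank line) are all assumptions made for illustration. It also assumes each field appears inside plain <TITLE>, <DATE>, <DATELINE>, and <BODY> tags and does not decode SGML entities.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of org.apache.lucene.benchmark.utils.
final class ReutersFieldSketch {

    // Assumed tag layout: one alternative per field, in the order
    // title, date, dateline, body. DOTALL lets a body span several lines.
    private static final Pattern FIELDS = Pattern.compile(
        "<TITLE>(.*?)</TITLE>|<DATE>(.*?)</DATE>"
            + "|<DATELINE>(.*?)</DATELINE>|<BODY>(.*?)</BODY>",
        Pattern.DOTALL);

    /** Returns the four fields as plain text, each followed by a blank line. */
    static String extract(String sgml) {
        String[] fields = new String[4]; // title, date, dateline, body
        Matcher m = FIELDS.matcher(sgml);
        while (m.find()) {
            for (int g = 1; g <= 4; g++) {
                if (m.group(g) != null) {
                    fields[g - 1] = m.group(g).trim();
                }
            }
        }
        StringBuilder out = new StringBuilder();
        for (String field : fields) {
            // A missing field becomes an empty entry rather than an error.
            out.append(field == null ? "" : field).append("\n\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample record, shaped like a Reuters SGML document.
        String sample = "<REUTERS><DATE>26-FEB-1987</DATE><TEXT>"
            + "<TITLE>SAMPLE TITLE</TITLE>"
            + "<DATELINE>  NEW YORK, Feb 26 - </DATELINE>"
            + "<BODY>First line.\nSecond line.</BODY></TEXT></REUTERS>";
        System.out.print(extract(sample));
    }
}
```

In a full extractor, the string returned by extract would be written to one text file per record; the sketch stops short of file I/O to stay self-contained.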