Management and retrieval of large volumes of text can be expensive in both space and time. Moreover, the range of document sizes in a large collection such as TREC presents difficulties for both the retrieval mechanism and the user. We consider division of documents into parts as a solution to the problem of the range of document sizes, and show that, for databases with long documents, use of document parts can improve the quality of the information presented to the user. We also describe the compressed text database system we use to manage the TREC collection; the compressed inverted files with which it is indexed; and the techniques we use to evaluate the TREC queries, both on whole documents and on document parts.