Retrieve Table of Contents
Retrieve Table of Contents in Java
GroupDocs.Parser for Java (part of Conholdate.Total for Java) can extract the table of contents from a document. To do so, use the getToc method:
Iterable<TocItem> getToc();
The TocItem class has the following members:
Member | Description |
---|---|
getDepth | The depth level. |
getPageIndex | The page index. |
getText | The text. |
extractText | Extracts the text from the document that the TocItem object refers to. |
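To illustrate how the depth, page index, and text values fit together, the sketch below prints a nested outline. It does not call GroupDocs.Parser: TocEntry is a hypothetical stand-in that only mirrors the three getters from the table above, so the snippet runs without the library. Treating depth 0 as the top level is also an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for TocItem; it only mirrors three getters from the table above.
final class TocEntry {
    private final int depth;
    private final Integer pageIndex; // null when the item has no page index
    private final String text;

    TocEntry(int depth, Integer pageIndex, String text) {
        this.depth = depth;
        this.pageIndex = pageIndex;
        this.text = text;
    }

    int getDepth() { return depth; }
    Integer getPageIndex() { return pageIndex; }
    String getText() { return text; }
}

final class TocOutline {
    // Builds one line per item, indented by two spaces per depth level
    // (assumption: depth 0 is the top level).
    static String format(Iterable<TocEntry> items) {
        StringBuilder sb = new StringBuilder();
        for (TocEntry item : items) {
            for (int i = 0; i < item.getDepth(); i++) {
                sb.append("  ");
            }
            sb.append(item.getText());
            // Only mention the page when the item actually has one
            if (item.getPageIndex() != null) {
                sb.append(" (page index ").append(item.getPageIndex()).append(")");
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<TocEntry> toc = Arrays.asList(
                new TocEntry(0, 0, "Introduction"),
                new TocEntry(1, 1, "Installation"),
                new TocEntry(1, null, "Licensing"));
        System.out.print(format(toc));
    }
}
```

With real TocItem objects, the same loop would read the values from getDepth, getPageIndex, and getText instead of the stand-in class.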
Follow the steps below to extract the table of contents from a document:
- Instantiate a Parser object for the document;
- Call the getToc method to obtain a collection of TocItem objects;
- Check that the collection isn't null (null means table of contents extraction isn't supported for the document);
- Iterate through the collection and use each item's page index to extract the page text from the document.
The following example shows how to extract the table of contents from a CHM file:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SampleChm)) {
    // Check if text extraction is supported
    if (!parser.getFeatures().isText()) {
        System.out.println("Text extraction isn't supported.");
        return;
    }
    // Check if toc extraction is supported
    if (!parser.getFeatures().isToc()) {
        System.out.println("Toc extraction isn't supported.");
        return;
    }
    // Get table of contents
    Iterable<TocItem> toc = parser.getToc();
    // Iterate over items
    for (TocItem i : toc) {
        // Print the Toc text
        System.out.println(i.getText());
        // Check if page index has a value
        if (i.getPageIndex() == null) {
            continue;
        }
        // Extract a page text
        try (TextReader reader = parser.getText(i.getPageIndex())) {
            System.out.println(reader.readToEnd());
        }
    }
}
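The steps above call for a null check on the collection, and more than one item can point at the same page. The following is a minimal sketch of handling both cases before any page text is extracted. It is not part of the GroupDocs.Parser API: TocPages, Entry, and distinctPages are hypothetical names, and Entry stands in for TocItem so the snippet runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class TocPages {
    // Hypothetical stand-in for TocItem, reduced to the page index getter.
    static final class Entry {
        private final Integer pageIndex;

        Entry(Integer pageIndex) { this.pageIndex = pageIndex; }

        Integer getPageIndex() { return pageIndex; }
    }

    // Returns the distinct page indexes in first-seen order.
    // A null collection (extraction not supported) yields an empty list.
    static List<Integer> distinctPages(Iterable<Entry> toc) {
        Set<Integer> pages = new LinkedHashSet<Integer>();
        if (toc != null) {
            for (Entry entry : toc) {
                // Skip items that have no page index
                if (entry.getPageIndex() != null) {
                    pages.add(entry.getPageIndex());
                }
            }
        }
        return new ArrayList<Integer>(pages);
    }

    public static void main(String[] args) {
        List<Entry> toc = Arrays.asList(
                new Entry(2), new Entry(null), new Entry(2), new Entry(5));
        for (Integer page : distinctPages(toc)) {
            System.out.println("Would extract text for page index " + page);
        }
    }
}
```

Collecting the indexes first means each page would be read only once, at the cost of no longer printing the page text directly under its table of contents entry.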