Extract from ZIP or Attachments

Extract from ZIP in Java

GroupDocs.Parser for Java (part of Conholdate.Total for Java) lets you extract items from containers with the getContainer method:

Iterable<ContainerItem> getContainer();

This method returns a collection of ContainerItem objects. ContainerItem has the following members:

Member                                     Description
getName                                    The name of the item.
getDirectory                               The directory of the item.
getFilePath                                The full path of the item.
getSize                                    The size of the item in bytes.
getMetadata                                The collection of item metadata.
openStream                                 Opens the stream of the item content.
openParser                                 Creates the Parser object for the item content.
openParser(LoadOptions)                    Creates the Parser object for the item content with LoadOptions.
openParser(LoadOptions, ParserSettings)    Creates the Parser object for the item content with LoadOptions and ParserSettings.

To extract container items from a document:

  • Instantiate a Parser object for the initial document;
  • Call the getContainer method to obtain the collection of ContainerItem objects;
  • Check that the collection isn’t null (null means container extraction isn’t supported for the document);
  • Iterate through the collection to get each item’s name and size and to obtain its content.

The following example shows how to extract attachments from a container:

// Create an instance of the Parser class
try (Parser parser = new Parser(Constants.SampleZip)) {
    // Extract attachments from the container
    Iterable<ContainerItem> attachments = parser.getContainer();
    // Check if container extraction is supported
    if (attachments == null) {
        System.out.println("Container extraction isn't supported");
    } else {
        // Iterate over attachments (skipped when the collection is null)
        for (ContainerItem item : attachments) {
            // Print the item name and size
            System.out.println(String.format("%s: %s", item.getName(), item.getSize()));
        }
    }
}
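
The example above prints only names and sizes. To obtain the content of an item, the stream returned by openStream can be copied to disk. The following is a minimal sketch using only the Java standard library; it assumes that openStream returns a java.io.InputStream and that getFilePath returns the item path as a String. ItemSaver and saveItem are hypothetical names used for illustration and are not part of GroupDocs.Parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of GroupDocs.Parser
final class ItemSaver {

    // Copies item content into targetDir under the item's relative path.
    // The caller remains responsible for closing the stream.
    static Path saveItem(InputStream content, String itemPath, Path targetDir)
            throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        Path target = root.resolve(itemPath).normalize();
        // Reject paths such as "../evil.txt" that would land outside targetDir
        if (!target.startsWith(root) || target.equals(root)) {
            throw new IOException("Item path escapes target directory: " + itemPath);
        }
        // Create intermediate directories, e.g. "docs" for "docs/readme.txt"
        Files.createDirectories(target.getParent());
        Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for item.openStream(): an in-memory stream
        Path dir = Files.createTempDirectory("items");
        try (InputStream in = new ByteArrayInputStream(
                "hello".getBytes(StandardCharsets.UTF_8))) {
            Path saved = saveItem(in, "docs/readme.txt", dir);
            System.out.println("Saved to " + saved);
        }
    }
}
```

Inside the loop over attachments, the call would look like saveItem(stream, item.getFilePath(), targetDir), with the stream obtained from item.openStream() in a try-with-resources statement so that it is closed after copying. The path check is a precaution, since item paths inside an archive come from the archive itself and may not be trustworthy.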

A container represents both container-only files (such as ZIP archives and Outlook storage) and documents with attachments (such as emails and PDF Portfolios).

For Outlook storage (OST/PST files), the container consists of email documents (MSG files).
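
Because an item of a container can itself be a document with attachments (for example, an email inside Outlook storage), containers can be nested. The sketch below illustrates a recursive walk over such a structure. It deliberately uses a hypothetical stand-in interface, Node, rather than GroupDocs.Parser types, so that it runs with the standard library alone; ContainerWalk, Node, node and walk are all illustrative names. As with getContainer, a null collection is treated as "container extraction isn't supported".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration, not part of GroupDocs.Parser
final class ContainerWalk {

    // Stand-in for a container item. children() returns null when the
    // item's content is not itself a container.
    interface Node {
        String name();
        Iterable<Node> children();
    }

    // Builds a stand-in item; an item created without children is a leaf
    static Node node(final String name, Node... children) {
        final Iterable<Node> kids =
                children.length == 0 ? null : Arrays.asList(children);
        return new Node() {
            public String name() { return name; }
            public Iterable<Node> children() { return kids; }
        };
    }

    // Returns one line per item, indented by two spaces per nesting level
    static List<String> walk(Iterable<Node> items) {
        List<String> lines = new ArrayList<>();
        collect(items, 0, lines);
        return lines;
    }

    private static void collect(Iterable<Node> items, int depth, List<String> out) {
        if (items == null) {
            return; // not a container: nothing to descend into
        }
        for (Node item : items) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                line.append("  ");
            }
            out.add(line.append(item.name()).toString());
            collect(item.children(), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node storage = node("mailbox.pst",
                node("message1.msg", node("invoice.pdf")),
                node("message2.msg"));
        for (String line : walk(Collections.singletonList(storage))) {
            System.out.println(line);
        }
    }
}
```

With the real API, the nested collection for an item would presumably be obtained by calling openParser on the item and then getContainer on the resulting Parser, closing that Parser once its items have been processed.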