List Images

This guide shows how to list all images in a given PDF file.

Before you begin

You should get your API key from your UniCloud account.

If this is your first time using UniPDF SDK, follow this guide to set up a local development environment.

Project setup

Clone the project repository

In your terminal, clone the examples repository. It contains the Go code used in this guide.

git clone https://github.com/unidoc/unipdf-examples.git

Navigate to the image folder in the unipdf-examples directory.

cd unipdf-examples/image

Configure environment variables

Set the UNIDOC_LICENSE_API_KEY environment variable, replacing PUT_YOUR_API_KEY_HERE with the API key from your UniCloud account.

Linux/Mac

export UNIDOC_LICENSE_API_KEY=PUT_YOUR_API_KEY_HERE

Windows

set UNIDOC_LICENSE_API_KEY=PUT_YOUR_API_KEY_HERE

How it works

The import section imports the necessary unipdf packages and other libraries.

Lines 21 and 22 declare the colorspaces and filters maps, respectively, which store the color spaces and filters found in the PDF documents.
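As a rough illustration of how two such maps can act as tallies, here is a minimal sketch using only the standard library. The map names follow the guide, and the map[string]int type is an assumption, as is the recordImage helper, which is hypothetical and not part of the example file.

```go
package main

import "fmt"

// Package-level tallies keyed by name, in the spirit of the two maps
// described above. The map[string]int type is an assumption.
var (
	colorspaces = map[string]int{}
	filters     = map[string]int{}
)

// recordImage is a hypothetical helper (not part of the example file).
// Indexing a missing key yields 0, so ++ works without initialization.
func recordImage(colorspace, filter string) {
	colorspaces[colorspace]++
	filters[filter]++
}

func main() {
	recordImage("DeviceRGB", "DCTDecode")
	recordImage("DeviceRGB", "FlateDecode")
	recordImage("DeviceGray", "FlateDecode")

	fmt.Println("DeviceRGB images:", colorspaces["DeviceRGB"])
	fmt.Println("FlateDecode images:", filters["FlateDecode"])
}
```

Keeping the maps at package level lets every function that finds an image update the same tallies without passing them around.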

The init function in lines 24-31 authenticates your request with UNIDOC_LICENSE_API_KEY.
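The sketch below shows one way to read the key from the environment and fail early when it is missing. It uses only the standard library, and loadAPIKey is a hypothetical helper. The step where the real example hands the key to the UniPDF license package is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// envKeyName is the variable named in this guide.
const envKeyName = "UNIDOC_LICENSE_API_KEY"

// loadAPIKey is a hypothetical helper. lookup has the same shape as
// os.LookupEnv, so the function can be exercised without touching the
// real environment.
func loadAPIKey(lookup func(string) (string, bool)) (string, error) {
	key, ok := lookup(envKeyName)
	if !ok || key == "" {
		return "", errors.New(envKeyName + " is not set")
	}
	return key, nil
}

func main() {
	key, err := loadAPIKey(os.LookupEnv)
	if err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	// The real example would pass the key to the UniPDF license package here.
	fmt.Printf("found a key of %d characters\n", len(key))
}
```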

Lines 34-36 in the main function check the number of command line arguments.

The for loop in lines 37-49 reads the file paths from the command line arguments and calls the function listImages() with inputPath as an argument.
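The argument check and the loop over file paths can be sketched as follows. This is a standard-library approximation, not the example's code: processAll is a hypothetical helper, the callback stands in for listImages, and stopping at the first error is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// processAll is a hypothetical stand-in for the loop in main. It calls
// process once per path and returns the first error it meets.
func processAll(paths []string, process func(path string) error) error {
	if len(paths) == 0 {
		return fmt.Errorf("no input files given")
	}
	for _, inputPath := range paths {
		if err := process(inputPath); err != nil {
			return fmt.Errorf("%s: %w", inputPath, err)
		}
	}
	return nil
}

func main() {
	// os.Args[0] is the program name, so the file paths start at index 1.
	err := processAll(os.Args[1:], func(path string) error {
		fmt.Println("Input file:", path)
		return nil // a real run would list the images here
	})
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(1)
	}
}
```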

In lines 49-52, each filter is printed along with the number of times it was found. Each color space and its number of occurrences is printed in lines 53-56.
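A summary like this can be produced by iterating over a map of counts. The sketch below is a standard-library illustration with a hypothetical summaryLines helper. It sorts the names first because Go does not guarantee map iteration order; whether the example itself sorts is not stated here.

```go
package main

import (
	"fmt"
	"sort"
)

// summaryLines is a hypothetical helper that formats one "name: count"
// line per entry, sorted by name so the output order is stable.
func summaryLines(counts map[string]int) []string {
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		lines = append(lines, fmt.Sprintf("%s: %d", name, counts[name]))
	}
	return lines
}

func main() {
	filters := map[string]int{"FlateDecode": 3, "DCTDecode": 1}
	fmt.Println("Filter summary")
	for _, line := range summaryLines(filters) {
		fmt.Println(line)
	}
}
```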

The listImages function, defined in lines 60-89, lists the images found on each page of the document at inputPath by calling listImagesOnPage(page).

The listImagesOnPage function in lines 91-98 gets the content stream of the given page using page.GetAllContentStreams() and calls listImagesInContentStream to get the images from that content stream.

The listImagesInContentStream function is defined in lines 100-237. It does the heavy lifting of listing images on a PDF page: it walks the content stream and collects both XObject images and inline images, and it also handles images referenced from XObject Form content streams. In addition, it records a summary of the filters and color spaces used by those images in the filters and colorspaces variables declared earlier.
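To make the traversal idea concrete, here is a heavily simplified sketch that does not use the UniPDF API. The op and xobject types and the countImages function are hypothetical, and the sketch assumes a single shared resource table and no form that refers to itself. In PDF content streams, the BI operator begins an inline image and the Do operator paints a named XObject, which may be an image or a form.

```go
package main

import "fmt"

// op is a simplified, hypothetical content stream operation: an operator
// plus, for "Do", the name of the XObject it refers to.
type op struct {
	Operator string
	Name     string
}

// xobject is a hypothetical resource entry. Form XObjects carry their own
// nested operations; image XObjects do not.
type xobject struct {
	Subtype string // "Image" or "Form"
	Ops     []op
}

// countImages walks ops and counts inline images ("BI") and image XObjects
// painted with "Do", recursing into form XObjects. It assumes forms never
// refer back to themselves.
func countImages(ops []op, resources map[string]xobject) (inline, xobj int) {
	for _, o := range ops {
		switch o.Operator {
		case "BI":
			inline++
		case "Do":
			x, ok := resources[o.Name]
			if !ok {
				continue // unknown name: nothing to count
			}
			switch x.Subtype {
			case "Image":
				xobj++
			case "Form":
				i, n := countImages(x.Ops, resources)
				inline += i
				xobj += n
			}
		}
	}
	return inline, xobj
}

func main() {
	resources := map[string]xobject{
		"Im1": {Subtype: "Image"},
		"Fm1": {Subtype: "Form", Ops: []op{{Operator: "Do", Name: "Im1"}, {Operator: "BI"}}},
	}
	page := []op{
		{Operator: "Do", Name: "Im1"},
		{Operator: "Do", Name: "Fm1"},
		{Operator: "BI"},
	}
	inline, xobj := countImages(page, resources)
	fmt.Printf("inline images: %d, XObject images: %d\n", inline, xobj)
}
```

The recursion into form XObjects is the part that catches images a flat scan of the page's own operators would miss.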

Got any questions?

We're here to help you.