This guide shows how to list all images in a given PDF file.
Before you begin
You should get your API key from your UniCloud account.
If this is your first time using UniPDF SDK, follow this guide to set up a local development environment.
Clone the project repository
In your terminal, clone the examples repository. It contains the Go code used in this guide.
git clone https://github.com/unidoc/unipdf-examples.git
Navigate to the image folder in the unipdf-examples directory.
Configure environment variables
Set UNIDOC_LICENSE_API_KEY to the API key from your UniCloud account.
How it works
The import section imports the necessary unipdf packages and other libraries.
The colorspaces and filters maps store the color spaces and filters found in the PDF documents.
The init function in lines 24-31 authenticates your request with your API key.
Lines 34-36 in the main function check the number of command line arguments.
The for loop in lines 37-49 reads the file paths from the command line arguments and calls the listing function with each inputPath as an argument.
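The argument check and the loop over input paths can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from the examples repository: the helper name inputPaths and the usage message are hypothetical, and the loop body only prints the path where the real example would list images.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// inputPaths is a hypothetical helper. It returns every argument after
// the program name, or an error when no input file was given.
func inputPaths(args []string) ([]string, error) {
	if len(args) < 2 {
		return nil, errors.New("usage: <program> input1.pdf [input2.pdf ...]")
	}
	return args[1:], nil
}

func main() {
	paths, err := inputPaths(os.Args)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	for _, inputPath := range paths {
		// Placeholder: the real example lists the images of inputPath here.
		fmt.Println("would process:", inputPath)
	}
}
```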
In lines 49-52, each filter and the number of times it was found are printed.
Each color space and its number of occurrences are printed in the same way.
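The tallying and printing step amounts to counting names in a map and printing each entry. The sketch below assumes string keys and int counts; the key type in the actual example may differ. The helper formatCounts and the sample values are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// formatCounts is a hypothetical helper. It returns one "name: count"
// line per map entry, sorted by name so the output order is stable
// (Go map iteration order is otherwise unspecified).
func formatCounts(counts map[string]int) []string {
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)

	lines := make([]string, 0, len(names))
	for _, name := range names {
		lines = append(lines, fmt.Sprintf("%s: %d", name, counts[name]))
	}
	return lines
}

func main() {
	// Sample tallies; these numbers are made up for the illustration.
	filters := map[string]int{"DCTDecode": 3, "FlateDecode": 5}
	colorspaces := map[string]int{"DeviceRGB": 6, "DeviceGray": 2}

	fmt.Println("Filters:")
	for _, line := range formatCounts(filters) {
		fmt.Println("  " + line)
	}
	fmt.Println("Color spaces:")
	for _, line := range formatCounts(colorspaces) {
		fmt.Println("  " + line)
	}
}
```

Sorting the keys is a choice made for this sketch so that repeated runs print the entries in the same order.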
The listImages function, defined in lines 60-89, lists the images found on each page of the document at inputPath by using listImagesOnPage.
The listImagesOnPage function in lines 91-98 gets the content streams of the given page using page.GetAllContentStreams() and calls listImagesInContentStream to get the images from the content stream.
In lines 100-237, the function listImagesInContentStream is defined. This function does the heavy lifting of listing images from the PDF page: it goes through the content stream and collects both XObject images and inline images. It also handles images referenced within XObject Form content streams. In addition, it saves a summary of the filters and color spaces used by the images to the filters and colorspaces variables declared earlier.
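To make the two kinds of image references concrete, here is a deliberately simplified, standard-library-only sketch. It is not the unipdf parser and not the code from the example: scanContentStream and imageRefs are hypothetical names, and the scanner simply splits the stream on whitespace. It relies on two facts from the PDF format: an XObject is painted with a name followed by the Do operator, and an inline image begins with the BI operator.

```go
package main

import (
	"fmt"
	"strings"
)

// imageRefs holds what the toy scanner found in one content stream.
type imageRefs struct {
	XObjectNames []string // names painted with the Do operator, e.g. "Im1"
	InlineImages int      // number of BI (begin inline image) operators
}

// scanContentStream is a hypothetical, heavily simplified scanner.
// It splits on whitespace, so it can miscount when binary inline image
// data happens to contain the tokens it looks for. A real parser
// tokenizes operands and operators properly.
func scanContentStream(contents string) imageRefs {
	var refs imageRefs
	tokens := strings.Fields(contents)
	for i, tok := range tokens {
		switch tok {
		case "Do":
			// The operand before Do is the XObject name, e.g. /Im1.
			if i > 0 && strings.HasPrefix(tokens[i-1], "/") {
				name := strings.TrimPrefix(tokens[i-1], "/")
				refs.XObjectNames = append(refs.XObjectNames, name)
			}
		case "BI":
			refs.InlineImages++
		}
	}
	return refs
}

func main() {
	stream := "q 100 0 0 100 0 0 cm /Im1 Do Q q /Fm1 Do Q BI /W 1 /H 1 /BPC 8 /CS /G ID x EI"
	refs := scanContentStream(stream)
	fmt.Println("XObjects painted:", refs.XObjectNames)
	fmt.Println("Inline images:", refs.InlineImages)
}
```

A name painted with Do may refer to an image or to a Form XObject. Telling them apart requires looking the name up in the page resources, and a Form has its own content stream that must be scanned in turn. That is why the real function also handles XObject Form content streams.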