Advanced Search and Replace

This guide walks through a more advanced example of text search and replace in a PDF document.

The search process takes font encoding into account when finding occurrences of the search term. In addition, it attempts to identify occurrences of the search term that are split across multiple PDF text-showing operators (Tj, TJ).
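To illustrate why a term can be missed when each operator is searched in isolation, here is a minimal sketch using only the Go standard library. It is not the code from the example program: the chunk type and the findAcrossChunks function are hypothetical names, and the text is assumed to be already decoded from the font encoding. The idea is to join the text of all chunks, search the joined string, and map the match back to the chunks it touches.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is a hypothetical stand-in for one decoded string operand
// of a Tj or TJ operator.
type chunk struct {
	text string
}

// findAcrossChunks joins the text of all chunks, searches the joined string,
// and reports the first and last chunk index touched by the first match.
// It returns (-1, -1) when the term is empty or not found.
func findAcrossChunks(chunks []chunk, term string) (first, last int) {
	var sb strings.Builder
	for _, c := range chunks {
		sb.WriteString(c.text)
	}
	start := strings.Index(sb.String(), term)
	if term == "" || start < 0 {
		return -1, -1
	}
	end := start + len(term) // exclusive byte offset of the match

	first, last = -1, -1
	offset := 0
	for i, c := range chunks {
		chunkEnd := offset + len(c.text)
		// A chunk is part of the match if its byte range intersects [start, end).
		if offset < end && chunkEnd > start {
			if first < 0 {
				first = i
			}
			last = i
		}
		offset = chunkEnd
	}
	return first, last
}

func main() {
	// "sumdolor" is split over two operands, as a TJ array might do for kerning.
	chunks := []chunk{{"Lorem ip"}, {"sumdo"}, {"lor sit amet"}}
	first, last := findAcrossChunks(chunks, "sumdolor")
	fmt.Println(first, last) // prints "1 2"
}
```

Searching each chunk on its own would find nothing here, because neither "sumdo" nor "lor sit amet" contains the whole term.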

Sample Input

page that contains the search term

Before you begin

Get your API key from your UniCloud account.

If this is your first time using UniPDF SDK, follow this guide to set up a local development environment.

Clone the project repository

In your terminal, clone the examples repository. It contains the Go code we will be using for this guide.

git clone https://github.com/unidoc/unipdf-examples.git 

Navigate to the text folder in the unipdf-examples directory.

cd unipdf-examples/text

How it works

Lines 13-24 import the UniPDF packages and other required dependencies.

Lines 26-31 declare the textChunk struct, which stores a page’s text chunks for processing.

Lines 33-44 define the encode function, which encodes a textChunk object into PDF string objects.

Lines 46-49 define the textChunks struct, which holds the text and its text chunks.

Lines 51-128 define the replace function, which searches the list of text chunks for occurrences of the provided search term and replaces them with the provided replacement text.
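The general idea of replacing a term that spans several chunks can be sketched as follows. This is a simplified illustration, not the replace function from the example program: replaceAcrossChunks is a hypothetical name, chunks are plain Go strings, and font encoding is ignored. The replacement is written into the chunk where the match starts, and the matched bytes are dropped from any later chunks the match covers.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceAcrossChunks replaces every occurrence of term in the joined text of
// parts and returns the updated chunks. The chunk count is preserved.
func replaceAcrossChunks(parts []string, term, repl string) []string {
	if term == "" {
		return append([]string(nil), parts...) // nothing to search for
	}
	joined := strings.Join(parts, "")

	// owner[p] is the index of the chunk that byte p of joined came from.
	owner := make([]int, 0, len(joined))
	for i, part := range parts {
		for j := 0; j < len(part); j++ {
			owner = append(owner, i)
		}
	}

	builders := make([]strings.Builder, len(parts))
	for p := 0; p < len(joined); {
		if strings.HasPrefix(joined[p:], term) {
			// Put the whole replacement in the chunk where the match starts
			// and skip the matched bytes, whichever chunks they belong to.
			builders[owner[p]].WriteString(repl)
			p += len(term)
			continue
		}
		builders[owner[p]].WriteByte(joined[p])
		p++
	}

	out := make([]string, len(parts))
	for i := range builders {
		out[i] = builders[i].String()
	}
	return out
}

func main() {
	parts := []string{"Lorem ip", "sumdo", "lor sit amet"}
	fmt.Printf("%q\n", replaceAcrossChunks(parts, "sumdolor", "lorem ipsum"))
}
```

Keeping the number of chunks unchanged means the surrounding operators and any positioning values between the strings of a TJ array can stay where they are, although the visual spacing of the replaced text may differ from the original.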

The main function in lines 122-141 validates your inputs and passes them as arguments to the searchReplace function.
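A minimal sketch of this kind of input validation is shown below. The options struct and the parseArgs function are hypothetical and are not taken from the example program; the check assumes the four positional arguments used in the run command (input file, output file, search text, replacement text).

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// options is a hypothetical container for the four command-line arguments.
type options struct {
	inputPath   string
	outputPath  string
	searchText  string
	replaceText string
}

// parseArgs expects args in the layout of os.Args: the program name followed
// by the input path, output path, search text, and replacement text.
func parseArgs(args []string) (options, error) {
	if len(args) < 5 {
		return options{}, errors.New("usage: go run pdf_search_replace_advanced.go input.pdf output.pdf <search text> <replacement text>")
	}
	opts := options{
		inputPath:   args[1],
		outputPath:  args[2],
		searchText:  args[3],
		replaceText: args[4],
	}
	if opts.searchText == "" {
		return options{}, errors.New("search text must not be empty")
	}
	return opts, nil
}

func main() {
	// A real program would pass os.Args; a fixed slice keeps this sketch
	// runnable without command-line arguments.
	opts, err := parseArgs([]string{"prog", "input.pdf", "output.pdf", "sumdolor", "lorem ipsum"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", opts)
}
```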

Lines 143-207 define the searchReplace function, which reads the PDF document, passes each page to the searchReplacePageText function for processing, and writes the result to the output PDF document.

Lines 209-311 define the searchReplacePageText function, which reads and parses the page’s content stream to look for the search text and replace it with the supplied replacement text.
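To give a feel for what a page content stream looks like, the sketch below pulls the literal strings out of a small hand-written stream using only the Go standard library. It is a deliberately simplified scanner, not UniPDF's content stream parser and not the code in searchReplacePageText: literalStrings is a hypothetical helper, it handles only the \(, \) and \\ escapes, leaves other escape sequences untouched, and ignores hexadecimal strings and font encoding.

```go
package main

import (
	"fmt"
	"strings"
)

// literalStrings returns the contents of every literal string "( ... )"
// found in a content stream, in order of appearance.
func literalStrings(stream string) []string {
	var out []string
	var cur strings.Builder
	depth := 0 // nesting depth of parentheses; 0 means outside any string
	for i := 0; i < len(stream); i++ {
		c := stream[i]
		switch {
		case c == '\\' && depth > 0 && i+1 < len(stream):
			next := stream[i+1]
			if next == '(' || next == ')' || next == '\\' {
				cur.WriteByte(next) // unescape the three simple cases
			} else {
				cur.WriteByte(c) // keep other escape sequences as written
				cur.WriteByte(next)
			}
			i++
		case c == '(':
			if depth > 0 {
				cur.WriteByte(c) // balanced nested parenthesis inside a string
			}
			depth++
		case c == ')' && depth > 0:
			depth--
			if depth == 0 {
				out = append(out, cur.String())
				cur.Reset()
			} else {
				cur.WriteByte(c)
			}
		case depth > 0:
			cur.WriteByte(c)
		}
	}
	return out
}

func main() {
	// A TJ array interleaves strings with positioning adjustments,
	// so one visible word can be spread over several strings.
	stream := "BT /F1 12 Tf [(Lorem ip) -20 (sumdo) 15 (lor)] TJ ET"
	fmt.Printf("%q\n", literalStrings(stream))
}
```

In this sample stream the word "sumdolor" never appears as a single string, which is why the example program works on a list of text chunks rather than on individual operands.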

Run the code

Run this command to search for and replace specific text in a PDF document. It also fetches the dependencies required to run the program.

go run pdf_search_replace_advanced.go input.pdf output.pdf "search text" "replacement text"

# to reproduce sample output
go run pdf_search_replace_advanced.go input.pdf output.pdf "sumdolor" "lorem ipsum"

Sample output

You will see the replaced text in the output PDF.

page with replaced text

Got any questions?

We're here to help you.