V4 Migration Guide
A new major version of UniPDF (v4) was released on May 11, 2025, and is now available here. This release brings significant improvements and new features, but also introduces breaking changes that may affect existing codebases.
This guide helps you migrate smoothly to UniPDF v4. It provides step-by-step instructions for updating your code, explanations of deprecated APIs and their replacements, and an overview of the key enhancements included in this release.
New Features
1. Text Extraction Enhancements
In UniPDF v4, text extraction has been significantly improved to deliver more accurate results and to fix earlier bugs. The extractor package now provides the following extraction modes.
ExtractionModeLayout: This is the new name for the extraction mode that was the default in UniPDF v3.
ExtractionModePlain: This extraction mode is new. It follows the internal structure of the PDF content and extracts text in the order in which it appears. Because it bypasses line and paragraph reconstruction, it makes no assumptions about how the page is organized. As a result, the predefined heuristics and constants that caused extraction issues in many PDFs do not affect the extracted text.
ExtractionModeLayoutNoBreaks: This extraction mode is similar to ExtractionModeLayout, but it extracts text without breaking lines horizontally. In this mode the extractor assumes that the whole page contains a single column.
2. Grid Component
The Grid component serves a similar purpose to the existing Table component, but it requires rows to be created explicitly. Like an HTML table, it follows a table → rows → cells hierarchy. A usage example and a step-by-step explanation of this component can be found here.
Deprecated Components
Paragraph Component Deprecation
The Paragraph component is deprecated and will be removed in a future release. The more generic and feature-rich StyledParagraph component replaces it, and users should switch to that component.
Other Improvements
SVG Gradient Handling
The new version adds support for SVG definitions and implements linear gradients.
API Cleanup
The PdfObject’s WriteString method has been replaced by the Write method, and its return type has changed from string to []byte.
Migration Guide
This section describes in detail how to migrate code written for the old version to the new one.
1. Installation
You can install the v4.x.x version of UniPDF using the following command.
go get github.com/unidoc/unipdf/v4
2. Breaking Changes and Deprecated APIs
Since the WriteString method of PdfObject has been replaced by Write, any code using the older method should be updated accordingly.
Example:
- When digitally signing a PDF file using an external signing service, the older usage:
sigBytes := make([]byte, 8192)
copy(sigBytes, signatureData)
sig := core.MakeHexString(string(sigBytes)).WriteString()
copy(pdfData[byteRange[1]:byteRange[2]], []byte(sig))
Should be updated to:
sigBytes := make([]byte, sigLen)
copy(sigBytes, signatureData)
sig := core.MakeHexString(string(sigBytes)).Write()
copy(pdfData[byteRange[1]:byteRange[2]], sig)
Other similar patterns and usages throughout the codebase should be reviewed and updated in the same way.
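To make the effect of the new return type concrete, here is a minimal, library-free sketch of the placeholder-patching pattern shown above. It does not use UniPDF: hexString and patchSignature are hypothetical stand-ins written for this illustration, and the assumption is only that the serialized hex string has the form "<" + hex digits + ">". Because Write returns []byte, the result can be copied straight into the document buffer.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// hexString is a stand-in for a PDF hex string object. It is NOT the UniPDF
// type; it only mimics a Write method that returns []byte.
type hexString struct{ data []byte }

// Write serializes the data as "<" + hex digits + ">".
func (h hexString) Write() []byte {
	out := make([]byte, 0, 2*len(h.data)+2)
	out = append(out, '<')
	out = append(out, hex.EncodeToString(h.data)...)
	out = append(out, '>')
	return out
}

// patchSignature overwrites pdfData[start:end] with the serialized signature.
// The signature is zero-padded to the placeholder size so that the document
// length and all byte offsets stay unchanged.
func patchSignature(pdfData []byte, start, end int, signature []byte) error {
	if start < 0 || end > len(pdfData) || start > end {
		return fmt.Errorf("range [%d:%d] is outside the document", start, end)
	}
	hexLen := end - start - 2 // minus the surrounding "<" and ">"
	if hexLen < 0 || hexLen%2 != 0 {
		return fmt.Errorf("invalid placeholder size %d", end-start)
	}
	if len(signature) > hexLen/2 {
		return fmt.Errorf("signature needs %d bytes, placeholder holds %d",
			len(signature), hexLen/2)
	}
	padded := make([]byte, hexLen/2)
	copy(padded, signature)
	// Write already returns []byte, so no []byte(...) conversion is needed.
	copy(pdfData[start:end], hexString{padded}.Write())
	return nil
}

func main() {
	// A toy "document" with an 8-digit hex placeholder.
	pdfData := []byte("before <00000000> after")
	start := bytes.IndexByte(pdfData, '<')
	end := bytes.IndexByte(pdfData, '>') + 1
	if err := patchSignature(pdfData, start, end, []byte{0xde, 0xad}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(pdfData)) // prints: before <dead0000> after
}
```

The size check is a design choice of this sketch: a signature that is longer than the placeholder would otherwise be truncated silently by copy.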
Calls to NewParagraph should also be replaced by NewStyledParagraph, since NewParagraph is deprecated and will be removed in the future.
Example:
p := c.NewParagraph("Hello World")
Should be replaced by:
p := c.NewStyledParagraph()
p.SetText("Hello World")
3. New APIs and Recommended Patterns
Grid Component
A new component called Grid is introduced in v4. It is recommended to use it instead of the old Table component. It follows a table → rows → cells structure and is easy to use.
Because it simplifies row and column span calculations, it also improves performance. The detailed step-by-step usage guide is documented here, but for a quick glance here is a simple usage:
// Create a grid with n columns, in this case 2.
grid := c.NewGrid(2)
// Add a new row to the grid.
row := grid.NewRow()
// Add a new cell to the row.
cell, err := row.NewCell()
if err != nil {
	log.Fatalf("creating cell: %v", err)
}
// Create a paragraph and set its properties.
p := c.NewStyledParagraph()
p.SetText(text)
p.SetMargins(5, 5, 5, 5)
p.SetFontSize(14)
// Set the paragraph as the content of the cell.
cell.SetContent(p)
// Repeat for the remaining rows and cells: create them and set their contents.
Extraction Modes
The way simple extraction is enabled has changed. Code that creates an extractor with the option for the simpler extraction process should be updated as follows.
Old usage:
ex, err := extractor.NewWithOptions(page, &extractor.Options{
	UseSimplerExtractionProcess: true,
})
New usage:
ex, err := extractor.NewWithOptions(page, &extractor.Options{
	ExtractionMode: extractor.ExtractionModePlain,
})
This new mode is not the same as the old one; it differs in how it extracts content. It extracts text directly from the internal PDF content streams, without building tables or text marks.
If PDF content is still not extracted correctly, there is one more mode to try: ExtractionModeLayoutNoBreaks. It uses the same underlying structure as the default extraction mode (ExtractionModeLayout) but doesn’t break lines. It assumes that text lines span from the left edge of the page to the right edge, so the text is extracted as a single-column page.
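One way to act on the advice to try another mode is to wrap the extraction call in a small fallback loop. The sketch below is library-free so that it compiles on its own. The mode type and its constants are stand-ins that only mirror the names of the UniPDF modes, and extractFunc, firstAcceptable, and the acceptance check are hypothetical helpers. In real code the extractFunc callback would create the extractor with the given mode and return the extracted text.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// mode is a stand-in for the library's extraction mode type.
type mode string

const (
	modeLayout         mode = "ExtractionModeLayout"
	modeLayoutNoBreaks mode = "ExtractionModeLayoutNoBreaks"
	modePlain          mode = "ExtractionModePlain"
)

// extractFunc extracts the text of one page using the given mode.
type extractFunc func(m mode) (string, error)

// firstAcceptable tries each mode in order and returns the first result that
// the accept callback approves, together with the mode that produced it.
func firstAcceptable(extract extractFunc, accept func(string) bool, modes ...mode) (string, mode, error) {
	var lastErr error
	for _, m := range modes {
		text, err := extract(m)
		if err != nil {
			lastErr = err // remember the failure and try the next mode
			continue
		}
		if accept(text) {
			return text, m, nil
		}
	}
	if lastErr != nil {
		return "", "", fmt.Errorf("no mode produced acceptable text: %w", lastErr)
	}
	return "", "", errors.New("no mode produced acceptable text")
}

func main() {
	// Fake extractor: pretend that only the no-breaks mode yields text.
	fake := func(m mode) (string, error) {
		if m == modeLayoutNoBreaks {
			return "Hello World", nil
		}
		return "", nil
	}
	nonEmpty := func(s string) bool { return strings.TrimSpace(s) != "" }

	text, used, err := firstAcceptable(fake, nonEmpty,
		modeLayout, modeLayoutNoBreaks, modePlain)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s (via %s)\n", text, used)
}
```

What counts as acceptable text depends on the documents being processed; the non-empty check used here is only a placeholder.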