PdfInvalidFormatException occurs for some PDF files when using PDFtoImage 5.2.0 on Ubuntu 22.04 #175
Description
PDFtoImage version: 5.2.0
OS: Linux
OS version: Ubuntu 22.04
Architecture: x64
Framework: .NET (Core)
App framework: Console App
Detailed bug report
Problem
I am using the PDFtoImage NuGet package version 5.2.0 to extract images from PDF files on Ubuntu 22.04. Some PDFs are processed correctly, but on certain PDFs I get the following exception:
PDFtoImage.Exceptions.PdfInvalidFormatException: File not in PDF format or corrupted.
Interestingly, if I limit the application to process only one or two PDFs per run, all files are processed correctly. The issue consistently occurs when processing the third file in the same run.
Code to Reproduce
using System.IO;
using System.Threading.Tasks;
using PDFtoImage;

internal class Program
{
    private static async Task Main(string[] args)
    {
        // Render the first page of every PDF in the current directory.
        foreach (var file in Directory.EnumerateFiles(Directory.GetCurrentDirectory(), "*.pdf"))
        {
            await using var input = new FileStream(file, FileMode.Open, FileAccess.Read);
            using var _ = Conversion.ToImage(input, page: 0, leaveOpen: false);
        }
    }
}
Environment
OS: Ubuntu 22.04
PDFtoImage: 5.2.0
.NET version: 10
Additional
On Windows, this issue does not occur.
Tried manually forcing garbage collection, but it did not help.
Tried other methods from the Conversion class, but the issue persists.
test_1.pdf
test_2.pdf
test_3.pdf
test_4.pdf
test_5.pdf
test_6.pdf
It seems like there might be a resource leak or an issue when multiple PDFs are processed sequentially.
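To help separate genuinely damaged files from a library-side problem, a check like the one below could be run on each file before conversion. It is a minimal sketch and not part of PDFtoImage: `PdfHeaderCheck` and `StartsWithPdfHeader` are hypothetical names, and it only tests for the `%PDF-` signature at offset 0, so a file that fails it is not necessarily invalid. It assumes .NET 7 or later for `Stream.ReadAtLeast`.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of PDFtoImage.
internal static class PdfHeaderCheck
{
    // The ASCII signature that normally starts a PDF file.
    private static readonly byte[] Magic = Encoding.ASCII.GetBytes("%PDF-");

    // Returns true when the file begins with the five bytes "%PDF-".
    public static bool StartsWithPdfHeader(string path)
    {
        var buffer = new byte[Magic.Length];
        using var stream = new FileStream(path, FileMode.Open, FileAccess.Read);

        // ReadAtLeast keeps reading until the buffer is full or the file ends.
        int read = stream.ReadAtLeast(buffer, buffer.Length, throwOnEndOfStream: false);

        // Compare only the bytes actually read, so short files return false.
        return buffer.AsSpan(0, read).SequenceEqual(Magic);
    }
}
```

Calling `PdfHeaderCheck.StartsWithPdfHeader(file)` inside the `foreach` loop, just before `Conversion.ToImage`, would show whether the third file still reads as a PDF from disk at the moment the exception is thrown. If it does, that would point away from the files themselves and toward state carried over between conversions.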