Modules for extracting data from PDF?

SandbagTiara2816@lemmy.dbzer0.com · 6 months ago


charolastra@lemmy.world · edit-2 6 months ago

pypdf, which was recently updated to version 3. It sometimes takes a bit of wrangling for more specific use cases: I’ve used it in conjunction with reportlab when I needed to add text and other elements with a bit more flexibility.
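For plain text extraction, something like the following is a rough sketch of how pypdf can be used. It assumes pypdf 3 or later is installed (`pip install pypdf`); the helper names `page_texts` and `read_pdf_text` are made up for illustration and are not part of pypdf.

```python
from typing import List


def page_texts(reader) -> List[str]:
    """Return the extracted text of each page, in page order.

    `reader` is expected to be a pypdf.PdfReader; any object exposing
    `.pages` whose items have `.extract_text()` works as well.
    """
    # Scanned or image-only pages usually yield an empty string,
    # so fall back to "" and trim surrounding whitespace.
    return [(page.extract_text() or "").strip() for page in reader.pages]


def read_pdf_text(path: str) -> List[str]:
    """Open the PDF at `path` with pypdf and return per-page text."""
    # Imported here so the helper above stays usable without pypdf.
    from pypdf import PdfReader  # third-party dependency

    return page_texts(PdfReader(path))
```

Calling `read_pdf_text("some_file.pdf")` (the file name is a placeholder) should give one string per page. Results depend a lot on how the PDF was produced, so tables and multi-column layouts may still need the wrangling mentioned above.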

milkisklim · 6 months ago

From what I understand, PyPDF3 and PyPDF4 are separate from pypdf, which has been the modern version of PyPDF2 since last year.


charolastra@lemmy.world · 6 months ago

That’s correct afaik. The maintainers of PyPDF2 merged it back into the original pypdf for version 3, I believe.