ByteScout PDF Extractor SDK for .NET - Visual Studio Marketplace

PDF Extractor SDK

Convert PDF to text, Excel CSV, and XML; extract text, images, metadata from PDF files in your desktop or web applications.

Features & Benefits

converts PDF to plain text (and can follow columns if you are converting a newspaper in PDF format!) - including invisible text extraction;
converts tables in PDF to Excel (CSV) by reading cells from a given rectangle;
converts tables in PDF to XML files;
extracts PDF file metadata (title, author, description) and gets other information about the file (number of pages, whether it is encrypted);
extracts embedded images from a PDF document (in ASP.NET, VB.NET, C#, VB6 and VBScript);
doesn't require Adobe Reader or any other PDF reader software to be installed;
provides .NET and ActiveX interfaces;
made with 100% managed C# code.

What's new in version 10.6.0.3659 (October 1, 2019):

New vector removal feature! Check the new methods for removing vector objects and the new 'Remover' and 'Remover2' classes.
New experimental 'TableDetector2' class with a new table detection method that is more flexible and can detect many more non-bordered tables inside documents.
Improved replacement of non-embedded PDF fonts.
Improved internal separation of text objects when 'CustomExtractionColumns' is set.
Text search was fixed on some documents.
New method 'CreateProfile()' added to all extractors. You can create a JSON profile from the current object for use with PDF.co Web API or on-premise Cloud API Server.