GroupDocs.Parser for .NET is a document parsing API for extracting text, metadata and images from more than 50 popular document file formats in .NET applications.
GroupDocs.Parser for .NET is a flexible data extraction and parsing API for C#, ASP.NET and any other type of .NET application. It extracts metadata, images and text (raw, plain, formatted, Markdown and HTML) from Microsoft Word, Excel and PowerPoint files, PDFs, emails and various other document formats. The API can parse password-protected documents and does not require a document reader or any other third-party software to be installed on the system.
Extract simple, formatted or structured text from popular file formats
Extract data fields and tables from documents
Extract images and text blocks together with their coordinates
Extract metadata in read-only mode from supported document formats
Extract text from ZIP archives and other formats that support attachments (PDF, databases, emails)
Extract forms from PDF files
Search for and extract tables on document pages
Detect the encoding of text files
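As a rough illustration of the text and metadata features listed above, the following minimal sketch opens a document, optionally with a password, and prints its text and metadata. It assumes the GroupDocs.Parser NuGet package is referenced. The class name, the default file name `sample.docx` and the command-line handling are hypothetical, and the API calls (`Parser`, `LoadOptions`, `GetText`, `GetMetadata`) should be checked against the documentation for the installed version.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class ExtractSketch
{
    static void Main(string[] args)
    {
        // Hypothetical inputs: a document path and an optional password.
        string filePath = args.Length > 0 ? args[0] : "sample.docx";
        string password = args.Length > 1 ? args[1] : null;

        // Use the password-aware constructor only when a password was supplied.
        using (Parser parser = password == null
            ? new Parser(filePath)
            : new Parser(filePath, new LoadOptions(password)))
        {
            // GetText is assumed to return null when the format
            // does not support text extraction.
            using (TextReader reader = parser.GetText())
            {
                Console.WriteLine(reader == null
                    ? "Text extraction is not supported for this file."
                    : reader.ReadToEnd());
            }

            // GetMetadata is assumed to return null when the format
            // does not support metadata extraction.
            IEnumerable<MetadataItem> metadata = parser.GetMetadata();
            if (metadata == null)
            {
                Console.WriteLine("Metadata extraction is not supported for this file.");
                return;
            }

            foreach (MetadataItem item in metadata)
            {
                Console.WriteLine("{0}: {1}", item.Name, item.Value);
            }
        }
    }
}
```

The null checks matter because not every format supports every feature. Treating a null result as "not supported" keeps the same code usable across different document types.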
The GroupDocs document parsing API supports the .NET and Mono frameworks and enables developers to build feature-rich text extraction and parsing applications for Microsoft Windows desktop and server, Azure and Linux.
Support and Learning Resources
Documentation – Access the developer's guide, features overview, installation instructions, limitations and configuration settings
Source Code – Work with functional source code examples, showcases and plugins