Extract tables from PDF files and convert them to CSV, a format that opens directly in Excel, Google Sheets or any spreadsheet application. 100% browser-based.
Drop your PDF here or click to browse
Files are processed entirely in your browser — nothing is uploaded to any server
Click the upload area or drag and drop your PDF. The extractor detects tables on each page, converts them to rows and columns, and produces a clean CSV file ready for Excel or Sheets.
Adjust the tool options to match your requirements — all settings are explained with helpful labels and previews where applicable.
Click the action button to process your file instantly in your browser. Download the output — no waiting, no email, no account required.
The extractor can identify and extract tables that are visually structured with lines, borders or consistent column alignment in the PDF. It works best with programmatically generated PDFs (from software such as Excel, Word, accounting systems or reporting tools) where table structure is defined in the PDF data. Scanned PDFs (images of tables) require OCR processing before table extraction — the tool will flag these and apply OCR automatically where possible.
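To illustrate what "consistent column alignment" can mean in practice, here is a minimal Python sketch that groups the left-edge x-coordinates of text fragments into columns. It is not the tool's actual detection logic; the function name `infer_column_starts`, the `tolerance` value and the sample coordinates are all assumptions made for illustration.

```python
def infer_column_starts(rows, tolerance=5.0):
    """Cluster left-edge x-coordinates into columns.

    rows: one list of x-coordinates (in PDF points) per text line.
    Two neighbouring coordinates land in the same column when they
    differ by no more than `tolerance` (a hypothetical threshold).
    Returns the mean x-position of each detected column, left to right.
    """
    xs = sorted(x for row in rows for x in row)
    clusters = []
    for x in xs:
        if clusters and x - clusters[-1][-1] <= tolerance:
            clusters[-1].append(x)   # close to the previous value: same column
        else:
            clusters.append([x])     # large gap: start a new column
    return [sum(c) / len(c) for c in clusters]


# Three text lines whose fragments start at roughly the same x-positions.
sample_rows = [
    [72.0, 200.5, 330.0],
    [72.4, 201.0, 329.6],
    [71.8, 200.0, 330.2],
]
column_starts = infer_column_starts(sample_rows)
```

With the sample data the fragments fall into three columns. A scanned page has no such coordinates until OCR has produced them, which is why scanned PDFs need the extra OCR step first.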
The output CSV contains the extracted table data as comma-separated values, with each table row on a new line and each cell separated by a comma. Column headers from the first row of each table are preserved. Multiple tables from the same page are extracted in order of their position on the page (top to bottom). If the PDF contains multiple pages with tables, all tables are extracted and appended in page order.
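Because cell values can themselves contain commas (for example a thousands separator), CSV writers normally wrap such cells in double quotes. The sketch below uses Python's standard `csv` module to show that convention on two small tables appended in order. The sample data and the helper name `tables_to_csv` are assumptions, and the tool's exact quoting behaviour may differ.

```python
import csv
import io


def tables_to_csv(tables):
    """Serialise tables (lists of rows) into one CSV string, in the order given."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for table in tables:
        writer.writerows(table)  # one line per table row
    return buffer.getvalue()


tables = [
    [["Item", "Amount"], ["Rent", "£1,500.00"]],    # first table on the page
    [["Region", "Note"], ["North", 'Said "ok"']],   # second table, lower down
]
csv_text = tables_to_csv(tables)

# Reading the text back recovers the original cells, commas and quotes included.
round_trip = list(csv.reader(io.StringIO(csv_text)))
```

The cell `£1,500.00` is written as `"£1,500.00"`, so a spreadsheet reads it as one cell rather than two.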
Yes — the output CSV file opens natively in Microsoft Excel, Google Sheets, LibreOffice Calc and Numbers on macOS. In Excel, go to File → Open or double-click the CSV file. If columns are not automatically separated, use the Data → Text to Columns wizard and choose comma as the delimiter. Google Sheets handles CSV files automatically when imported via File → Import.
Merged cells (cells that span multiple columns or rows) are expanded in the CSV output — each cell's value is repeated across the number of columns or rows it spanned in the original. This is necessary because CSV format does not support cell merging. In complex tables with many merged cells, you may need to clean up the output in a spreadsheet application after extraction.
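The expansion described above can be sketched in a few lines of Python. This is an illustration only: the cell representation (row, column, row span, column span, value) and the function name `expand_merged_cells` are assumptions, not the tool's internal data model.

```python
def expand_merged_cells(cells, n_rows, n_cols):
    """Build a rectangular grid, repeating each merged cell's value
    in every position the cell covered in the original table.

    cells: list of (row, col, row_span, col_span, value) tuples.
    """
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for row, col, row_span, col_span, value in cells:
        for r in range(row, min(row + row_span, n_rows)):
            for c in range(col, min(col + col_span, n_cols)):
                grid[r][c] = value
    return grid


cells = [
    (0, 0, 1, 2, "Q1"),   # header cell merged across two columns
    (1, 0, 1, 1, "Jan"),
    (1, 1, 1, 1, "Feb"),
]
grid = expand_merged_cells(cells, n_rows=2, n_cols=2)
```

Here the merged "Q1" header appears twice in the first row, once above "Jan" and once above "Feb", which is the repetition you may want to clean up after extraction.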
Yes — when a scanned PDF is uploaded, the tool automatically applies OCR (Optical Character Recognition) to detect text before attempting table extraction. OCR accuracy depends on the scan quality and resolution. High-quality scans (300 DPI or higher, good contrast, no skew) produce excellent results. Low-quality or heavily compressed scans may result in some text recognition errors that need manual correction in the CSV output.
All tables on each page are detected and extracted. Each table is labelled in the CSV with a comment row (or a separate sheet if you choose multi-sheet output) indicating the source page and table number. This allows you to easily identify which data came from which location in the original document, especially useful for PDFs like annual reports that contain many separate data tables.
Number values are extracted as plain text in CSV format — currency symbols (£, $, €), percentage signs and comma thousands separators are preserved as they appear in the PDF. You can then apply number formatting within your spreadsheet application. Dates are extracted in whatever format they appear in the PDF; you may need to reformat them using your spreadsheet's date functions if you need a specific date format.
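If you post-process the CSV with a script instead of a spreadsheet, the text values can be turned into numbers along these lines. This is a minimal Python sketch that assumes English-style formatting (comma as thousands separator, dot as decimal point); the helper name `parse_amount` is hypothetical.

```python
import re
from decimal import Decimal


def parse_amount(text):
    """Convert strings such as '£1,500.00' or '12.5%' to a Decimal.

    Currency symbols, thousands commas and whitespace are stripped.
    Percentages are returned as fractions (12.5% becomes 0.125).
    Raises decimal.InvalidOperation if the remainder is not a number.
    """
    cleaned = text.strip()
    is_percent = cleaned.endswith("%")
    cleaned = re.sub(r"[£$€%,\s]", "", cleaned)
    value = Decimal(cleaned)
    return value / 100 if is_percent else value


rent = parse_amount("£1,500.00")
growth = parse_amount("12.5%")
```

`Decimal` is used rather than `float` so that monetary values keep their exact decimal digits.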
CSV (Comma-Separated Values) is a plain text format readable by any spreadsheet software, with no formatting — just rows and columns of data. Excel (XLSX) supports multiple sheets, cell formatting, formulas and charts. For raw data extraction and maximum compatibility, CSV is the better choice. For a formatted output with column widths, bold headers and number formatting preserved, use our PDF to Excel converter instead.
Negative numbers are extracted as they appear in the PDF. Accounting-format negatives (enclosed in brackets, e.g. (1,500.00)) are preserved as text in that format. Standard minus sign negatives (-1500.00) are extracted with the minus sign. To convert bracket-format negatives to numeric negatives in Excel, use a formula or Find & Replace after extraction.
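Outside Excel, the same bracket-to-minus conversion can be scripted. The Python sketch below is one possible approach, assuming comma thousands separators and no currency symbol inside the brackets; `parse_accounting` is a hypothetical helper name.

```python
import re
from decimal import Decimal

# Matches a value wrapped in round brackets, e.g. "(1,500.00)".
BRACKETED = re.compile(r"^\((.+)\)$")


def parse_accounting(text):
    """Return a Decimal, treating '(1,500.00)' as -1500.00."""
    cleaned = text.strip()
    match = BRACKETED.match(cleaned)
    if match:
        cleaned = "-" + match.group(1)   # brackets mean the value is negative
    return Decimal(cleaned.replace(",", ""))


loss = parse_accounting("(1,500.00)")
refund = parse_accounting("-1500.00")
profit = parse_accounting("2,250.75")
```

Bracketed and minus-sign inputs both come out as ordinary negative numbers, so columns that mix the two styles can be summed directly.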
No — PDF table extraction runs entirely within your browser using JavaScript-based PDF parsing. For scanned PDFs requiring OCR, the processing also happens locally using a browser-based OCR engine (Tesseract.js). No file is transmitted to any external server at any stage. This ensures complete privacy for financial records, invoices, payroll data and other sensitive business documents.