Overview
MarkItDown provides two converters for Excel files:
- XlsxConverter - For modern Excel files (.xlsx, Excel 2007+)
- XlsConverter - For legacy Excel files (.xls, Excel 97-2003)
Dependencies
XLSX requires: pandas, openpyxl
XLS requires: pandas, xlrd
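A quick way to see which of these packages are importable in the current environment is sketched below. The `REQUIRED` mapping and the `missing_packages` helper are illustrative names, not part of MarkItDown; the sketch relies only on the standard library's `importlib.util.find_spec`.

```python
from importlib.util import find_spec

# Packages each format needs, as listed above (the mapping name is illustrative).
REQUIRED = {
    "xlsx": ("pandas", "openpyxl"),
    "xls": ("pandas", "xlrd"),
}


def missing_packages(fmt: str) -> list[str]:
    """Return the required packages for `fmt` that cannot be imported."""
    return [name for name in REQUIRED[fmt] if find_spec(name) is None]


if __name__ == "__main__":
    for fmt in REQUIRED:
        missing = missing_packages(fmt)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{fmt}: {status}")
```

An empty list for a format means both of its packages were found.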
XlsxConverter
Accepted Formats
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
.xlsx
Class Definition
Constructor
Methods
accepts()
Returns: True for .xlsx files.
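The check can be pictured as a match on either the file extension or the MIME type. The following is a minimal sketch under that assumption; `looks_like_xlsx` and the two constants are hypothetical names used for illustration, not the library's API.

```python
from typing import Optional

# Values taken from the "Accepted Formats" list above.
XLSX_MIME_PREFIXES = (
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
)
XLSX_EXTENSIONS = (".xlsx",)


def looks_like_xlsx(extension: Optional[str], mimetype: Optional[str]) -> bool:
    """Hypothetical check: True when the extension or the MIME type matches."""
    ext = (extension or "").lower()
    mime = (mimetype or "").lower()
    if ext in XLSX_EXTENSIONS:
        return True
    return any(mime.startswith(prefix) for prefix in XLSX_MIME_PREFIXES)
```

Lower-casing both inputs keeps the comparison insensitive to names such as `REPORT.XLSX`.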
convert()
Returns: DocumentConverterResult with all sheets as Markdown tables
Raises: MissingDependencyException if the required dependencies are not installed
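One common way to produce this kind of error is to attempt the optional imports once, remember any failure, and raise only when a conversion is actually requested. The sketch below illustrates that pattern under that assumption; `MissingDependencyError` and `convert_or_explain` are stand-ins defined here, not MarkItDown's own classes or functions.

```python
class MissingDependencyError(Exception):
    """Stand-in for the library's MissingDependencyException (illustration only)."""


# Try the optional imports once, at module load, and remember any failure.
_dependency_error = None
try:
    import pandas  # noqa: F401
    import openpyxl  # noqa: F401
except ImportError as exc:
    _dependency_error = exc


def convert_or_explain(path: str) -> str:
    """Hypothetical entry point that fails late, with a readable message."""
    if _dependency_error is not None:
        raise MissingDependencyError(
            "Reading .xlsx files requires pandas and openpyxl."
        ) from _dependency_error
    # A real converter would read the workbook here; this sketch only echoes the path.
    return f"would convert {path}"
```

Deferring the failure means that importing the module never breaks for users who do not work with Excel files at all.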
Example Usage
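A minimal sketch of converting a workbook through MarkItDown's top-level `MarkItDown` class, which selects a converter for the file. The helper name `xlsx_to_markdown` and the path passed to it are placeholders chosen for illustration.

```python
def xlsx_to_markdown(path: str) -> str:
    """Convert a workbook to Markdown text using MarkItDown's top-level API."""
    # Imported here so the helper can be defined before markitdown is installed.
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(path)
    return result.text_content
```

Calling `xlsx_to_markdown("report.xlsx")` with the path of a real workbook returns the Markdown as a single string.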
Output Example
XlsConverter
Accepted Formats
application/vnd.ms-excel
application/excel
.xls
Class Definition
Constructor
Methods
accepts()
Returns: True for .xls files.
convert()
Returns: DocumentConverterResult with all sheets as Markdown tables
Raises: MissingDependencyException if the required dependencies are not installed
Example Usage
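The converter can also be called directly on an open binary stream. The sketch below assumes a recent MarkItDown release in which `XlsConverter` is importable from `markitdown.converters`, `StreamInfo` from `markitdown`, and the result exposes a `markdown` attribute; treat those details as assumptions and check them against your installed version. The helper name `xls_to_markdown` is illustrative.

```python
def xls_to_markdown(path: str) -> str:
    """Run XlsConverter directly on a legacy .xls file (sketch, see assumptions)."""
    # Assumed import locations; adjust them if your installed version differs.
    from markitdown import StreamInfo
    from markitdown.converters import XlsConverter

    with open(path, "rb") as stream:
        # The stream info tells the converter what kind of file it is reading.
        result = XlsConverter().convert(stream, StreamInfo(extension=".xls"))
    return result.markdown
```

Going through the `MarkItDown` class is usually simpler; calling the converter directly is mainly useful when the file type is already known.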
Implementation Details
Source Location
~/workspace/source/packages/markitdown/src/markitdown/converters/_xlsx_converter.py
- XlsxConverter: Line 36
- XlsConverter: Line 98
Conversion Pipeline
Both converters use the same process:
- Read All Sheets - Load all worksheets using pandas
- Convert to HTML - Each sheet converted to HTML table
- HTML to Markdown - HTML table converted to Markdown
- Combine Sheets - All sheets joined with H2 headers
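The steps above can be sketched with the standard library alone. This is an illustration of the flow, not the converters' code: the real converters load sheets with pandas and have their own HTML-to-Markdown step, whereas here a plain dictionary stands in for the loaded workbook and every function name is hypothetical.

```python
from html import escape
from html.parser import HTMLParser


def rows_to_html(header, rows):
    """Render one sheet (header plus data rows) as a plain HTML table."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


class _TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def html_table_to_markdown(html):
    """Turn the HTML table into a pipe-delimited Markdown table."""
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    lines = ["| " + " | ".join(header) + " |"]
    lines.append("| " + " | ".join("---" for _ in header) + " |")
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def workbook_to_markdown(sheets):
    """Join every sheet under an H2 heading named after the sheet."""
    parts = []
    for name, (header, rows) in sheets.items():
        table = html_table_to_markdown(rows_to_html(header, rows))
        parts.append(f"## {name}\n{table}")
    return "\n\n".join(parts)


# Stand-in for the "Read All Sheets" step: sheet name -> (header, data rows).
sheets = {
    "Sales": (["Region", "Total"], [["North", 120], ["South", 95]]),
    "Notes": (["Item"], [["Q3 figures"]]),
}
print(workbook_to_markdown(sheets))
```

Going through HTML as an intermediate form lets the spreadsheet path reuse whatever HTML table handling already exists, at the cost of one extra serialization per sheet.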
Sheet Headers
Each sheet is prefixed with an H2 heading using the sheet name, for example `## Sheet1`.
Features
Supported Elements
- Data Types - Numbers, text, dates, booleans
- Multiple Sheets - All sheets included with headers
- Formulas - Evaluated values shown (not formula text)
- Merged Cells - Handled by pandas
Data Handling
- Index Column - Not included (index=False)
- Column Names - First row used as header
- Missing Values - Empty cells rendered as empty table cells
- Formatting - Cell formatting (colors, fonts) not preserved
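The effect of `index=False` can be seen directly in pandas. The sketch below assumes the HTML step uses `DataFrame.to_html`, which is an assumption made for illustration; the helper names and the sample data are made up.

```python
def sheet_to_html(frame) -> str:
    """Render a DataFrame as an HTML table without the row-index column."""
    return frame.to_html(index=False)


def demo() -> str:
    # pandas is imported lazily so that defining these helpers does not require it.
    import pandas as pd

    # Column names play the role of the sheet's first row and become <th> cells.
    frame = pd.DataFrame({"name": ["widget", "gadget"], "qty": [3, 7]})
    return sheet_to_html(frame)
```

With `index=True`, pandas would add a leading column holding the row numbers 0 and 1, which would then appear as an extra unnamed column in the Markdown table.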
Limitations
- Charts and images not extracted
- Cell styling and colors not preserved
- Formulas shown as values, not expressions
- Macros and VBA code not included
- Multiple tables per sheet may merge
- Conditional formatting not preserved
- Comments and notes not extracted
Performance Considerations
- Entire workbook loaded into memory
- Large spreadsheets may require significant RAM
- Processing time scales with number of sheets and cells