Convert Word DOCX files to clean HTML using mammoth.js. Outputs semantic HTML with heading tags, bold, italic, hyperlinks, and lists. Live rendered preview alongside raw HTML output. Copy HTML or download as .html file. Full document wrap option.
Click to choose a .docx file, or drag and drop one. mammoth.js converts it to clean HTML entirely in your browser. The conversion maps Word's built-in styles and basic formatting (Heading 1 → <h1>, bold → <strong>, etc.) to semantic HTML rather than style-attribute-heavy markup.
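For developers who want the same in-browser conversion in their own page, a minimal sketch might look like the following. It assumes the mammoth.js library object is passed in as `mammoth` (for example, the global exposed by the browser build). The function name `convertDocxFile` is illustrative, and this is not necessarily how this tool is implemented.

```javascript
// Sketch: convert a user-selected .docx File to an HTML fragment with mammoth.js.
// `mammoth` is the library object, passed in so the function does not depend on
// how the library was loaded. `file` is a File from an <input type="file"> or a drop event.
async function convertDocxFile(mammoth, file) {
  // Read the file contents into memory inside the browser.
  const arrayBuffer = await file.arrayBuffer();

  // convertToHtml resolves to an object with `value` (HTML) and `messages` (warnings).
  const result = await mammoth.convertToHtml({ arrayBuffer: arrayBuffer });

  return {
    html: result.value,
    warnings: result.messages
  };
}
```

The `warnings` array is worth surfacing to the user, since mammoth.js reports things such as unrecognised styles there.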
Two panels show side by side: the rendered preview (how the HTML looks in a browser) and the raw HTML code. Toggle between them on mobile. Check that headings, links, lists, and formatting converted correctly.
Click Copy HTML to copy the raw HTML. Click Download .html to save as a file. Toggle Full Document to wrap the HTML in a complete <!DOCTYPE html> document with <head> and <body> tags — useful for standalone HTML pages.
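The Full Document option can be approximated with a small wrapper like the one below. This is a sketch only: the function names, the `lang="en"` attribute, the charset declaration, and the title handling are assumptions, not the tool's actual template.

```javascript
// Escape text so it is safe to place inside the <title> element.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap a converted HTML fragment in a complete standalone document.
// The lang attribute and charset are assumed defaults for illustration.
function wrapFullDocument(fragment, title) {
  return [
    "<!DOCTYPE html>",
    '<html lang="en">',
    "<head>",
    '<meta charset="utf-8">',
    "<title>" + escapeHtml(title) + "</title>",
    "</head>",
    "<body>",
    fragment,
    "</body>",
    "</html>"
  ].join("\n");
}
```

For example, `wrapFullDocument("<h1>Report</h1>", "Q&A")` yields a page whose title is written as `Q&amp;A` and whose body contains the fragment unchanged.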
Word Heading 1 → <h1>, Heading 2 → <h2>, and so on. Normal paragraph → <p>. Bold → <strong>. Italic → <em>. Underline → <u> where an underline style mapping is configured (by default mammoth.js ignores underlining, because underlined text can be mistaken for a link). Hyperlinks → <a href="">. Unordered lists → <ul><li>. Ordered lists → <ol><li>. Line breaks → <br>. Word tables → <table><tr><td>. Images are embedded as base64 data URIs in the HTML.
mammoth.js converts semantic structure, not visual layout. Word-specific formatting (column layouts, text boxes, WordArt, SmartArt, exact font sizes and colours) is not reproduced. The HTML output is clean and semantic — suitable for web publishing — but will not be a pixel-perfect replica of the Word document. For pixel-perfect conversion, server-side tools like LibreOffice headless or Aspose.Words are more appropriate.
Images are preserved. mammoth.js extracts embedded images and converts them to base64-encoded data URIs placed directly in the HTML: <img src="data:image/jpeg;base64,...">. The HTML file is therefore self-contained and does not require separate image files, but documents with many images will produce a large HTML file.
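To get a feel for how much embedded images add, the size of a data URI can be estimated from the image's byte count. Base64 writes every 3 bytes as 4 characters, so each image grows by roughly a third. The helper names below are illustrative.

```javascript
// Base64 encodes every 3 bytes as 4 characters, padding the final group.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Length in characters of a complete data URI for an image
// of the given content type and size in bytes.
function dataUriLength(contentType, byteCount) {
  const prefix = "data:" + contentType + ";base64,";
  return prefix.length + base64Length(byteCount);
}
```

A 300,000-byte JPEG, for instance, becomes a 400,023-character data URI: 400,000 characters of base64 plus the 23-character `data:image/jpeg;base64,` prefix.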
Clean HTML means the output uses semantic HTML5 tags rather than the markup that Word's own HTML export produces, which typically contains hundreds of span tags with inline styles and Microsoft-specific class names. mammoth.js maps Word styles to their HTML equivalents, producing readable, maintainable HTML that is suitable for pasting into a CMS, blog post editor, or web page.
mammoth.js supports custom style mappings for developers. For example, you can map a Word style named "My Custom Heading" to <h3> instead of leaving it to the default handling. This is done through JavaScript configuration and is not available in this browser UI; it requires using mammoth.js directly in your own codebase.
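In your own codebase, such a mapping might be sketched as follows, using the `styleMap` option of mammoth.js. The function name is illustrative, and `mammoth` is assumed to be the library object supplied by the caller. In Node.js the input would typically be `{ path: ... }` or `{ buffer: ... }` rather than `{ arrayBuffer: ... }`.

```javascript
// Each style map line pairs a Word style matcher with an HTML path.
// "My Custom Heading" is the example style name used in the text above.
const customStyleMap = [
  "p[style-name='My Custom Heading'] => h3:fresh"
];

// Convert a .docx ArrayBuffer with the custom mapping applied.
// `mammoth` is the library object, passed in by the caller.
function convertWithCustomStyles(mammoth, arrayBuffer) {
  return mammoth.convertToHtml(
    { arrayBuffer: arrayBuffer },
    { styleMap: customStyleMap }
  );
}
```

Custom mappings are applied alongside the library's default style map, so headings, bold, and italic should continue to convert as before.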
The HTML output uses semantic tags with no inline CSS, but many email clients rely on inline styles for proper rendering. For email use, after converting: add inline styles to each element (some email clients, including Gmail in certain situations, strip <style> blocks), replace block elements with table-based layout if needed, and test in Litmus or Email on Acid. Tools like MJML or Foundation for Emails are designed specifically for generating email-compatible HTML.
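The inline-style step can be sketched with a simple string transform. This is a rough illustration that assumes the converted HTML contains plain, attribute-free opening tags such as <p> and <h1>. The function name and the CSS values are hypothetical, and a regular expression is not a substitute for a proper HTML parser or a dedicated inlining tool.

```javascript
// Add a style attribute to opening tags that have no attributes.
// `styles` maps lowercase tag names to CSS declaration strings.
// Tags with existing attributes (e.g. <a href="...">) and closing tags are left untouched.
function inlineStyles(html, styles) {
  return html.replace(/<([a-z][a-z0-9]*)>/g, function (match, tag) {
    if (!Object.prototype.hasOwnProperty.call(styles, tag)) {
      return match; // no style configured for this tag
    }
    return "<" + tag + ' style="' + styles[tag] + '">';
  });
}
```

Calling `inlineStyles("<h1>Hi</h1><p>Text</p>", { h1: "font-size:24px", p: "margin:0" })` returns `<h1 style="font-size:24px">Hi</h1><p style="margin:0">Text</p>`.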