Section 1
Your PDFs Are Leaking Tokens
When you drag a PDF into Claude, it has to "look" at the whole thing. Every image, every table, every bit of formatting that doesn't matter. That burns up to 5,000 tokens per page.
A 20-page PDF can eat 10% or more of your entire context window before you've even asked a question. That's why your long chats get slow, forgetful, or cut off early.
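To see how a per-page cost turns into a share of the context window, here's a back-of-the-envelope sketch. The 200,000-token window and the 1,000 tokens-per-page low end are assumptions for illustration, not measured figures; the 5,000 figure is the upper bound quoted above.

```python
def context_share(pages: int, tokens_per_page: int, context_window: int) -> float:
    """Return the fraction of the context window one document uses."""
    return pages * tokens_per_page / context_window


# Assumed values: a 200,000-token window and two illustrative per-page costs.
ASSUMED_WINDOW = 200_000

for tokens_per_page in (1_000, 5_000):
    share = context_share(20, tokens_per_page, ASSUMED_WINDOW)
    print(f"{tokens_per_page} tokens/page -> {share:.0%} of the window")
```

Under those assumptions, a 20-page PDF lands somewhere between a tenth and half of the window, depending on how heavy each page is.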
MarkItDown is a free tool from Microsoft that converts that PDF (or Word doc, PowerPoint, Excel sheet, image, even a YouTube link) into clean Markdown first.
Markdown is plain text. It's the language LLMs natively speak. So instead of Claude squinting at a fat PDF, it reads a lean text file with the same information for a fraction of the tokens. You fit more into one chat, Claude answers faster, and you stop hitting the wall.
Section 2
Why This Is Legit
This isn't some random script from a stranger. It's infrastructure that serious engineers depend on every day.
Made by Microsoft
An official Microsoft open-source project, built by the same team behind AutoGen.
140,000+ GitHub stars
One of the most popular developer tools on the planet. Most "viral" repos celebrate hitting 10k. This one is past 140k.
9,500+ forks, actively maintained
Real teams build on top of it. It's not going anywhere.
Section 3
The Prompt
Open Claude Code (or Claude on desktop with a terminal or agent enabled) and paste this in exactly. Claude checks your setup, installs it the right way, tests it, and wires it in as a permanent tool. A few minutes later, it's done.
Copy-paste prompt
Install Microsoft's MarkItDown on my machine and set it up so you (Claude) can use it going forward.
Repo: https://github.com/microsoft/markitdown
What it is: a Python tool that converts files (PDF, Word, PowerPoint, Excel, images, HTML, CSV, audio, YouTube URLs, etc.) into clean Markdown for use with LLMs.
Please do all of the following:
1. Check my environment first. Confirm I have Python 3.10 or higher (python3 --version) and pip. If Python is too old or missing, stop and tell me how to fix it before continuing.
2. Install the full version with all optional file-format dependencies: pip install 'markitdown[all]' If my system uses an externally-managed Python and pip complains, use a sensible approach (pipx, a virtual environment, or uv) and tell me which one you chose and why.
3. Verify the CLI works. Run markitdown --version and confirm the markitdown command is on my PATH. If it installed but isn't on PATH, fix the PATH or tell me exactly what to add.
4. Do a real test conversion on a small PDF or .docx and show me the output so I know it actually works.
5. Set it up as a persistent MCP server so you can call it as a tool in future sessions. The package is markitdown-mcp. Install it and register it with "claude mcp add" at user scope. Confirm the server shows up when you list MCP servers.
6. When you're done, give me a short summary: what got installed, how I run it manually from the terminal, and how I ask you to convert a file in future chats.
If anything fails along the way, don't silently skip it. Tell me what broke and what you'd recommend.
No coding background? Doesn't matter.
That's the whole point. You don't install anything by hand. You paste the prompt, Claude does the technical part, and it explains each step as it goes.
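If you're curious what the environment check in steps 1 and 3 of the prompt amounts to, here's a rough sketch. It is not what Claude literally runs; the function names are made up for illustration, and it only looks at what's on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # the minimum version named in the prompt


def python_ok(version=None) -> bool:
    """True if a (major, minor, ...) version tuple meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def check_environment() -> dict:
    """Collect the facts the prompt asks Claude to confirm."""
    return {
        "python_ok": python_ok(),
        "pip_found": any(shutil.which(name) for name in ("pip", "pip3")),
        "markitdown_on_path": shutil.which("markitdown") is not None,
    }


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

Before the install, markitdown_on_path should read False; after it, True. If python_ok is False, that's the case where the prompt tells Claude to stop and explain the fix.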
Section 4
What You Can Convert
Once it's installed, MarkItDown handles all of these, and turns each into clean Markdown:
PDFs (the big one)
Word docs (.docx)
PowerPoint (.pptx)
Excel (.xlsx / .xls)
Images (pulls out text and metadata)
Audio (transcribes speech to text)
HTML web pages
CSV / JSON / XML data files
EPUB ebooks
ZIP files (digs through the contents)
YouTube URLs (grabs the transcript)
Section 5
When You'd Actually Use It
Feeding Claude a long report, contract, or whitepaper. Convert first, then upload the Markdown file. Massive token savings.
Summarizing a slide deck. Turn a 40-slide PowerPoint into text Claude can chew through instantly.
Pulling data out of a spreadsheet for analysis without uploading the whole file.
Turning a YouTube video into a transcript to repurpose or summarize.
Dumping a folder of mixed files into one clean text format before any AI workflow.
Rule of thumb: if it's more than a couple pages and you're about to upload it to an LLM, run it through MarkItDown first.
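The "folder of mixed files" case can be sketched in a few lines. This assumes MarkItDown's Python API as shown in its README (MarkItDown().convert(path).text_content); the convert_folder helper, its parameters, and the output naming are illustrative choices, not part of the tool.

```python
from pathlib import Path
from typing import Callable, Optional


def default_converter() -> Callable[[Path], str]:
    """Build a converter from MarkItDown's Python API (needs markitdown installed)."""
    from markitdown import MarkItDown  # imported lazily so the helper below works without it

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content


def convert_folder(
    src: Path, dst: Path, convert: Optional[Callable[[Path], str]] = None
) -> list[Path]:
    """Convert every file directly inside src and write one .md file per input into dst."""
    convert = convert or default_converter()
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        try:
            text = convert(path)
        except Exception as exc:  # unsupported or broken file: report it and move on
            print(f"skipped {path.name}: {exc}")
            continue
        # Keep the original extension in the name so report.pdf and report.docx don't collide.
        out = dst / (path.name + ".md")
        out.write_text(text, encoding="utf-8")
        written.append(out)
    return written
```

Calling convert_folder(Path("inbox"), Path("inbox_md")) would leave one Markdown file per input in inbox_md; both folder names are placeholders.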
Section 6
The Intricacies Nobody Tells You
This is where most people trip up. Read these so you don't.
1. Regular PDFs convert beautifully. Scanned ones don't.
If your PDF is really a photo of a page (scanned paperwork, a screenshot saved as PDF), there's no text to grab. For those you need OCR, which Claude can add with the markitdown-ocr plugin or Microsoft's Azure Document Intelligence. For most normal PDFs, the basic install is all you need.
2. Images convert their text, not a description.
Out of the box it reads words inside an image and the metadata. If you want Claude to actually describe what's in a photo, that's a separate feature that needs an AI vision key plugged in. Most people don't need it.
3. The [all] part matters.
Always install with markitdown[all] (the prompt does this). The bare install only handles a couple formats. The [all] version unlocks everything in the list above.
4. It needs Python 3.10 or newer.
Older Macs sometimes ship with Python 3.9. The prompt checks for this first and tells Claude to stop and explain the fix before going further, so you won't hit a confusing error.
5. It's a one-way street, built for machines.
MarkItDown is built to make files readable by AI, not to perfectly recreate a pretty document for a human. The formatting is clean but simple. That's exactly what you want for feeding an LLM.
6. It runs on your computer, with your permissions.
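With the basic install, the conversion happens locally and nothing gets uploaded to Microsoft. The exceptions are the optional add-ons that call a cloud service, such as Azure Document Intelligence or an AI vision key, plus anything fetched from the web, like a YouTube transcript. Just don't point it at sketchy files from strangers, same as any program.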
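For point 1, one quick way to guess whether a PDF was scanned is to check how little text came out of it. This is a hypothetical heuristic, not a MarkItDown feature: the looks_scanned name and the 100-characters-per-page threshold are assumptions, and you supply the page count yourself.

```python
def looks_scanned(markdown_text: str, page_count: int, min_chars_per_page: int = 100) -> bool:
    """Guess that a PDF was scanned if the conversion produced almost no text per page."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    visible = "".join(markdown_text.split())  # drop all whitespace before counting
    return len(visible) / page_count < min_chars_per_page
```

If it returns True for a document you know is full of words, that's the signal to reach for OCR instead of the basic conversion.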
Section 7
How to Use It After It's Installed
Two ways. Pick whichever fits the moment.
1. Ask Claude to do it (easiest)
Because the prompt sets it up as an MCP server, in any future chat you can just say "convert this PDF to Markdown" and point at the file. Claude handles it as a built-in tool. No copy-pasting, no leaving the chat.
2. Run it yourself in the terminal
One line spits out a clean Markdown file you can open, read, or upload anywhere.
Terminal command
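A sketch of that one line, wrapped in Python so it's easy to reuse. It assumes the CLI form shown in the MarkItDown README (markitdown <input> -o <output>); report.pdf is a placeholder file name, and build_command and convert are illustrative helper names.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(source: str) -> list[str]:
    """Build the markitdown call that writes <source name>.md next to the input."""
    output = str(Path(source).with_suffix(".md"))
    return ["markitdown", source, "-o", output]


def convert(source: str) -> str:
    """Run the conversion and return the path of the Markdown file it wrote."""
    if shutil.which("markitdown") is None:
        raise RuntimeError("markitdown is not on PATH; run the install prompt first")
    cmd = build_command(source)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the CLI fails
    return cmd[-1]


# The same thing typed straight into a terminal:
print(" ".join(build_command("report.pdf")))
```

The printed line, markitdown report.pdf -o report.md, is what you'd type by hand; swap in your own file name.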
That's it. You now have a Microsoft-built, 140k-star tool that quietly saves you up to 70% of your tokens every time you work with a document. Install it once, forget it's there, and stop watching your context window vanish on a single PDF.
Go convert something this week.