PrivateGPT: ImportError: docx2txt is required to read Microsoft Word files

When you try to work with Microsoft Word files, PrivateGPT throws an error and refuses to parse them:
ImportError: docx2txt is required to read Microsoft Word files: pip install docx2txt

Installing docx2txt with pip does not make the error go away. Oddly, the error message still recommends pip, even though PrivateGPT moved away from pip quite a while ago. A likely reason the pip install has no effect is that pip puts the package into a different Python environment than the Poetry-managed one PrivateGPT runs in. The fix is simple: do not use pip to install docx2txt (or whisper, or any of the other converters you may need to upload your files).

Instead, use Poetry to install everything you need. Here is a short list of the available Poetry installs:

poetry add docx2txt
poetry add whisper
poetry add doc2text
poetry add epub2txt
poetry add pdf2txt
poetry add ppt2txt
poetry add html2txt

To install all at once: poetry add docx2txt whisper doc2text epub2txt pdf2txt ppt2txt html2txt
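
To confirm that a converter actually landed in the environment PrivateGPT runs in, a small check like the one below may help. It is only a sketch: the helper name missing_modules and the file name check_converters.py are made up for illustration, and it assumes the importable module name matches the package name (which holds for docx2txt, but is not guaranteed for every package in the list above).

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: module name equals package name. Extend this list with
    # the converters you installed, adjusting names where they differ.
    wanted = ["docx2txt"]

    # Shows which Python is running, so you can tell whether it is the
    # Poetry-managed environment or the system interpreter.
    print("Interpreter:", sys.executable)

    missing = missing_modules(wanted)
    if missing:
        print("Not importable here:", ", ".join(missing))
    else:
        print("All requested modules are importable.")
```

Run it from the PrivateGPT project directory with: poetry run python check_converters.py
If the same script reports the module as missing under poetry run but finds it under plain python, the package was most likely installed with pip into a different environment.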
