This example shows how to convert a docx file to PDF in Python. Your Word document may contain images, paragraphs, headings, text, tables, a title and so on; the program carries them over into the PDF file. Note that this program converts only Word documents of the docx type.
This is a very simple program: a single line of code does the job. I am using the docx2pdf module here to convert a Word document into a PDF file.
Prerequisites
Python 3.8.0 – 3.9.1, docx2pdf 0.1.7
You can install the docx2pdf module using the command pip install docx2pdf. Make sure you run this command in Administrator mode.
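docx2pdf does the conversion by automating Microsoft Word, so it works only on Windows or macOS with Word installed. The sketch below is one way to check the setup before converting anything. The function name docx2pdf_problems is my own choice for illustration and is not part of docx2pdf. It checks the platform and whether the package is installed; it cannot tell whether Word itself is present.

```python
import platform
from importlib import metadata


def docx2pdf_problems():
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    # docx2pdf automates Microsoft Word, which runs only on Windows and macOS.
    if platform.system() not in ("Windows", "Darwin"):
        problems.append("unsupported platform: " + platform.system())
    try:
        version = metadata.version("docx2pdf")
    except metadata.PackageNotFoundError:
        problems.append("docx2pdf is not installed")
    else:
        print("docx2pdf version:", version)
    return problems


for problem in docx2pdf_problems():
    print("Problem:", problem)
```

If nothing is printed with the "Problem:" prefix, the package is installed on a supported platform.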
Convert docx to pdf
Now I will create a Python script to convert a docx file to a PDF file. You can give the script any name you want and put the following code in it.
The convert function takes two parameters: the source file location and the output file location.
In the source below, the input file is in the current directory, where my Python script is kept.
from docx2pdf import convert
convert("word_images.docx", "C:/python/python-docx-to-pdf/word_images.pdf")
Running the above script will write the PDF file to the destination location.
In the above example I read only a single Word file, converted it to PDF and saved the result in the destination folder.
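If you do not want to type the output file name by hand, you can derive it from the input file name. The following is a minimal sketch of that idea. The helper names pdf_path_for and convert_one are hypothetical and not part of docx2pdf; only convert comes from the module. The sketch also creates the destination folder first, so the conversion does not depend on the folder already being there.

```python
from pathlib import Path


def pdf_path_for(docx_path, out_dir):
    """Build the output path: same file name as the input, .pdf extension, inside out_dir."""
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")


def convert_one(docx_path, out_dir):
    """Convert one docx file into out_dir and return the path of the PDF."""
    # Imported here so that pdf_path_for stays usable on machines without docx2pdf.
    from docx2pdf import convert

    source = Path(docx_path)
    if not source.is_file():
        raise FileNotFoundError(source)
    target = pdf_path_for(source, out_dir)
    # Create the destination folder up front if it is missing.
    target.parent.mkdir(parents=True, exist_ok=True)
    convert(str(source), str(target))
    return target


# Example call, using the paths from the script above:
# convert_one("word_images.docx", "C:/python/python-docx-to-pdf")
```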
If you have multiple Word documents and want to convert them one by one without knowing their names in advance, you can use the following approach.
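import os
from docx2pdf import convert

# Collect every regular file in the current directory.
files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:
    # Convert only Word documents, identified by the .docx extension.
    if f.endswith('.docx'):
        fbasename = os.path.splitext(os.path.basename(f))[0]
        convert(f, os.path.join(os.path.realpath('.'), fbasename + '.pdf'))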
This way you don’t need to know the file names, but you do need to check the extension of each input file to confirm that it is a Word document. I am saving the converted PDF files in the same directory as the Python script.
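The extension check has two gaps: it misses files whose extension is in upper case, and it also picks up the temporary "~$" lock files that Word keeps next to a document while it is open. The sketch below is one possible way to handle both with pathlib. The names is_word_document and convert_folder are my own for illustration and are not part of docx2pdf.

```python
from pathlib import Path


def is_word_document(path):
    """True for .docx names, ignoring case, except Word's temporary lock files."""
    path = Path(path)
    # While a document is open, Word keeps a hidden "~$..." file next to it.
    return path.suffix.lower() == ".docx" and not path.name.startswith("~$")


def convert_folder(folder):
    """Convert every docx file in folder, writing each PDF next to its source."""
    # Imported here so that is_word_document can be used without docx2pdf installed.
    from docx2pdf import convert

    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file() and is_word_document(entry):
            convert(str(entry), str(entry.with_suffix(".pdf")))


# Example call for the current directory:
# convert_folder(".")
```

As far as I know, convert can also be given a folder path instead of a file and will then convert every docx file in it. Check the docx2pdf README for the version you use before relying on that.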
The sample Word file and the converted PDF file can be found in the source code.