Convert XML to JSON

I recently needed to converse a large collection of fairly complex XML files into JSON format to make them ready for leading into an Elasticsearch cluster.

After a bit of searching around, I found a really nice Python library called xmltodict, which worked perfectly.

You can download xmltodict using pip:

$ pip3 install xmltodict

xmltodict worked well with all the XML files I threw at it, so I'm not going to talk about the form and structure of the XML I was working with. The code below merely assumes that you have directory consisting of XML files and that you'd like to convert them to JSON format and write them to a new JSON file.

import xmltodict, json
import os

#path to the folder holding the XML
directory = '/path/to/folder'

#iterate over the XML files in the folder
for filename in os.listdir(directory):
    if filename.endswith(".xml"):
        f = open(filename)

        XML_content = f.read()

        #parse the content of each file using xmltodict
        x = xmltodict.parse(XML_content)
        j = json.dumps(x)

        print (filename)

        filename = filename.replace('.xml', '')
        output_file = open(filename + '.json', 'w')
        output_file.write(j)