Converting XML to CSV

XMLutils is a neat little Python package for converting XML to other file formats such as CSV and JSON. The particularly useful thing about it is that it can be run from the command line, which makes it quick and easy to start using.

Installation

I installed XMLutils at the command line using:

sudo easy_install XMLutils
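
To check that the install actually put the xml2csv command on the PATH, something along these lines should do. It is only a sketch using Python's standard library; the helper name command_available is my own and has nothing to do with XMLutils itself.

```python
import shutil


def command_available(name):
    """Return True if an executable called `name` can be found on the PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # Reports whether the xml2csv command is visible from this environment.
    print("xml2csv found:", command_available("xml2csv"))
```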

Using XMLutils at the command line

I had a sample XML file that I wanted to convert to CSV format. There was a lot in the XML file that I didn't really want going into the output CSV file, so I executed the following command:

$ xml2csv --input "/Users/danielhoadley/Library/Mobile Documents/com~apple~CloudDocs/Documents/Development/Stuff/Dockets/10519[DK]Davies_v_Davies_(msb)(sub-bpw).xml" --output "test.csv" --tag "CaseInfo" --ignore "CaseMain" "AllNCit" "AllECLI" "TempIxCardNo" "FullReportName" "AltName" "CaseJoint_IxCardNo_TempIxCardNo_FullReportName_CaseName" "LegalTopics" "Reportability"
  • xml2csv invokes the converter
  • Declare the input XML file with --input followed by the path to the file
  • Declare the output CSV file with --output followed by the path to the output file
  • Declare the XML node that represents a record in the input file with --tag followed by the node name (in my case, the node was CaseInfo)

Running a command like this will be sufficient to do a straight conversion to CSV:

$ xml2csv --input "/Users/danielhoadley/Library/Mobile Documents/com~apple~CloudDocs/Documents/Development/Stuff/Dockets/10519[DK]Davies_v_Davies_(msb)(sub-bpw).xml" --output "test.csv" --tag "CaseInfo"
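
To picture what a straight conversion amounts to, here is a minimal sketch using only Python's standard library. It is not how XMLutils is implemented, and its output may well differ in detail; it simply assumes each record node has flat child elements, which become the columns. The function name records_to_csv and the sample data are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample data: two records, each a <CaseInfo> node with flat children.
SAMPLE = """<Cases>
  <CaseInfo><CaseName>Davies v Davies</CaseName><Court>CA</Court></CaseInfo>
  <CaseInfo><CaseName>Smith v Jones</CaseName><Court>HC</Court></CaseInfo>
</Cases>"""


def records_to_csv(xml_text, record_tag):
    """Return CSV text with one row per <record_tag> node, one column per child tag."""
    root = ET.fromstring(xml_text)
    records = list(root.iter(record_tag))

    # Collect column names in first-seen order across all records.
    columns = []
    for record in records:
        for child in record:
            if child.tag not in columns:
                columns.append(child.tag)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Children missing from a record are left as empty cells.
        writer.writerow({child.tag: (child.text or "").strip() for child in record})
    return out.getvalue()


if __name__ == "__main__":
    print(records_to_csv(SAMPLE, "CaseInfo"))
```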

However, as I've said, there was quite a lot in the input file that I wanted to ignore. Ignoring tags is pretty straightforward: simply declare the tags you want to ignore after the --ignore flag, e.g.:

--ignore "CaseMain" "AllNCit" "AllECLI" "TempIxCardNo" "FullReportName" "AltName" "CaseJoint_IxCardNo_TempIxCardNo_FullReportName_CaseName" "LegalTopics" "Reportability"
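
Conceptually, ignoring tags just means leaving those columns out of each row. The sketch below shows that idea with the standard library only; it is an illustration rather than XMLutils' own code, and I have not checked how XMLutils treats ignored tags that contain nested nodes. The helper name drop_ignored and the sample record are made up.

```python
import xml.etree.ElementTree as ET


def drop_ignored(record, ignore):
    """Return {tag: text} for a record's direct children, skipping ignored tags."""
    return {
        child.tag: (child.text or "").strip()
        for child in record
        if child.tag not in ignore
    }


# Invented sample record with two tags we do not want in the output.
record = ET.fromstring(
    "<CaseInfo><CaseName>Davies v Davies</CaseName>"
    "<AltName>Davies</AltName><LegalTopics>Costs</LegalTopics></CaseInfo>"
)
row = drop_ignored(record, ignore={"AltName", "LegalTopics"})
print(row)  # {'CaseName': 'Davies v Davies'}
```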

Note

Remember to enclose the names of tags in quotes!
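
The quotes are interpreted by the shell, and they matter most for values containing spaces or special characters, such as the long input path above. If you drive xml2csv from a Python script instead, one way to sidestep quoting is to pass the arguments as a list. The sketch below only builds the command from the flags used in this post; the helper name build_xml2csv_command and the file name "my file.xml" are placeholders of my own.

```python
import shlex


def build_xml2csv_command(input_path, output_path, tag, ignore=()):
    """Build the xml2csv argument list; each list item is passed as one argument."""
    args = ["xml2csv", "--input", input_path, "--output", output_path, "--tag", tag]
    if ignore:
        args += ["--ignore", *ignore]
    return args


cmd = build_xml2csv_command(
    "my file.xml", "test.csv", "CaseInfo", ignore=["CaseMain", "AllNCit"]
)

# shlex.join shows the equivalent shell command with quoting added where needed.
print(shlex.join(cmd))

# To actually run it (requires xml2csv to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Because the arguments never pass through a shell, the space in the file name needs no escaping when the list is handed to subprocess.run.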