The following steps show how to convert an XML file to CSV.
Sample XML file: PaymentMode.xml
<?xml version="1.0" encoding="ISO-8859-15"?>
<paymentModes>
  <paymentMode>
    <id>1</id>
    <mode>Cash</mode>
  </paymentMode>
  <paymentMode>
    <id>2</id>
    <mode>Debit Card</mode>
  </paymentMode>
  <paymentMode>
    <id>3</id>
    <mode>Credit Card</mode>
  </paymentMode>
  <paymentMode>
    <id>4</id>
    <mode>Paytm</mode>
  </paymentMode>
</paymentModes>
Step 1. Open Talend Open Studio, create a new job, and give it a name.
Step 2. Click Finish. A blank design workspace appears, where you add an input component (the XML file) and an output component (a delimited, or CSV, file in our case). Type tFileInputXML anywhere in the workspace and double-click it in the dropdown list to select it.
Step 3. Similarly, type tFileOutputDelimited anywhere in the workspace and double-click it in the dropdown list to select it. After adding both components, your screen should look like this:
Click the tFileInputXML component and rename it to Input XML by clicking its default name twice. Similarly, click the tFileOutputDelimited component and rename it to Output CSV.
Step 4. To configure the settings, click the Input XML component and then open the Component view in the lower section.
Step 5. Most of the fields are already populated with default values, but you need to change the following to match your input file:
- Click the […] button next to Edit schema. A window opens where you describe the structure of your XML file in terms of columns.
- Click the plus (+) icon to add the first column and populate its column name and other fields such as Type and Nullable. Repeat the process for each column in the XML file.
- Set the appropriate data type for each column. Optionally, populate other fields such as Length and Precision. Then click OK to close the Schema wizard.
- Next, in the File name/Stream field, click the […] button to select the input XML file from your system.
- The next field is Loop XPath query, which identifies the repeating node that the component iterates over. The path starts at the root node (paymentModes in our case) and ends at the element that represents one record, for example "/paymentModes/paymentMode".
- Next is the Mapping field, where each schema column is mapped to an XPath query relative to the loop node. For example, the id column is mapped to the <id> tag in our XML, and the mode column is mapped to the <mode> tag. Populate the XPath query field for each column.
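To make the relationship between the loop path and the column mapping concrete, here is a minimal Python sketch of the same idea outside Talend. It is an illustration only, not the code Talend generates: the function name extract_rows and the MAPPING dictionary are hypothetical, and the sample data is inlined without its XML declaration so the snippet is self-contained.

```python
import xml.etree.ElementTree as ET

# Same records as PaymentMode.xml, inlined for the sketch.
# The XML declaration is left out because the data is an in-memory string.
SAMPLE_XML = """<paymentModes>
  <paymentMode><id>1</id><mode>Cash</mode></paymentMode>
  <paymentMode><id>2</id><mode>Debit Card</mode></paymentMode>
  <paymentMode><id>3</id><mode>Credit Card</mode></paymentMode>
  <paymentMode><id>4</id><mode>Paytm</mode></paymentMode>
</paymentModes>"""

# Hypothetical stand-in for the Mapping table:
# schema column name -> path relative to the loop node.
MAPPING = {"id": "id", "mode": "mode"}


def extract_rows(xml_text, loop_path, mapping):
    """Return one dict per node matched by loop_path, keyed by column name."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(loop_path):
        # Each mapped path is evaluated relative to the current loop node.
        rows.append({column: node.findtext(path) for column, path in mapping.items()})
    return rows


# "./paymentMode" is evaluated from the root element <paymentModes>,
# so it selects the same nodes as the absolute path /paymentModes/paymentMode.
rows = extract_rows(SAMPLE_XML, "./paymentMode", MAPPING)
for row in rows:
    print(row)
```

Each matched paymentMode node becomes one row, and each mapped path is resolved relative to that node, which is why the mapping entries are simply id and mode rather than full paths.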
Step 6. After configuring the Input XML component, go to the main view, right-click the Input XML component, select Row -> Main, and drop the link on the Output CSV component. The rows read by the Input XML component are then passed to the Output CSV component.
Step 7. Click the Output CSV component and then open the Component view in the lower section to populate the required fields in the same way. Enter the desired output file name in the File Name field. The Row Separator and Field Separator fields are already populated, but you may change them as required.
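The effect of the two separator settings can be sketched with Python's csv module. This is only an analogy for what the Row Separator and Field Separator fields control; the helper name write_delimited is hypothetical, and the comma and newline used here are chosen to match the sample output in this article, not taken from Talend's defaults.

```python
import csv
import io


def write_delimited(rows, field_separator=",", row_separator="\n"):
    """Serialize rows to text using the given field and row separators."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=field_separator,      # placed between columns
        lineterminator=row_separator,   # placed after every row
    )
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["1", "Cash"], ["2", "Debit Card"], ["3", "Credit Card"], ["4", "Paytm"]]

comma_output = write_delimited(rows)
semicolon_output = write_delimited(rows, field_separator=";")

print(comma_output, end="")
print(semicolon_output, end="")
```

Changing only the field separator changes how columns are split on each line, while the row separator decides where one record ends and the next begins.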
Step 8. Last but not least, click the Sync columns button to automatically sync the columns between the input and output components. To change a column name or type, or simply to view the schema, click the Edit schema […] button.
Step 9. Click OK to close the Schema wizard. Now open the Run tab next to the Component tab and click the Run button. You will see output in the log similar to this:
Any errors are shown in the log screen (shown in the previous screenshot). If there are no errors, check the output file generated at the specified location. It will look something like this:
1,Cash
2,Debit Card
3,Credit Card
4,Paytm
Thus, using the steps above, you can easily convert an XML file to CSV with Talend Open Studio.