I have multiple .xml files, all of them having the same node names but different values. Example:

File1.xml has the following contents:

<?xml version="1.0"?><Data><Waf>No</Waf><Name>TEMP\1</Name><Number>0</Number><Iteration>1</Iteration><Lot> </Lot><DateAndTime>11:36:24:35 10/8/2019</DateAndTime><Id>5555</Id><SW>6.40.22.10900</SW><Image>Reference Point 750</Image><Angle >0</Angle ><Algo></Algo></Data>

Similarly, File2.xml has:

<?xml version="1.0"?><Data><Waf>Yes</Waf><Name>TEMP\2</Name><Number>10</Number><Iteration>6</Iteration><Lot>99</Lot><DateAndTime>11:36:49:35 10/8/2019</DateAndTime><Id></Id><SW>6.40.22.10900</SW><Image>Reference Point 90</Image><Angle >180</Angle ><Algo></Algo></Data>

I use C# (Visual Studio 2010). My goal is to obtain a .csv / .txt file whose first row contains the node names as column headers, followed by one row of values per XML file:

Waf, Name, Number, Iteration, Lot, DateAndTime, Id, SW, Image, Angle,   Algo
No, TEMP\1, 0, 1, - , 11:36:24:35 10/8/2019, 5555, 6.40.22.10900, Reference Point 750, 0, -    
Yes, TEMP\2 , 10, 6, 99, 11:36:49:35 10/8/2019, -, 6.40.22.10900, Reference Point 90, 180, -    

The input to my algorithm is the list of XML file names. This is what I have done so far:

for (idx = 0; idx < num_files; idx++)
{
    file_name = file_name + ".xml"; // this contains the name of xml file
    if (idx == 0)   // if I'm reading the first xml file, make a note of all the node names since they will be the column headers. 
    {
      fs = new FileStream(location_xml_file, FileMode.Open, FileAccess.Read);
      xmldoc.Load(fs);
      xml_num_nodes = xmldoc.n  ; //.Count;
      Console.Write("\n xml_num_nodes = {0}", xml_num_nodes);
    }
}

However,

  • The number of nodes, xml_num_nodes, is printed as 2.
  • I think it's unnecessary to write this code from scratch, and there must be an easier way. If so, what am I missing? I am using LINQ and have looked at a few resources, but I'm not able to get what I want.
  • Deserialize the XML into a concrete class, then serialize it with a dedicated CSV library (which will likely just take a concrete class or an interface)
    – TheGeneral
    Nov 25, 2019 at 23:15

2 Answers

The code is very simple using LINQ to XML:

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FOLDER = @"c:\temp\";
        const string CSV_FILENAME = @"c:\temp\test.csv";

        static void Main(string[] args)
        {
            string[] xmlFiles = Directory.GetFiles(FOLDER, "*.xml");
            bool firstLine = true;

            // "using" flushes and closes the writer even if an exception is thrown.
            using (StreamWriter writer = new StreamWriter(CSV_FILENAME))
            {
                foreach (string fileName in xmlFiles)
                {
                    XDocument doc = XDocument.Load(fileName);

                    foreach (XElement data in doc.Descendants("Data"))
                    {
                        // The child element names of the first <Data> node become the header row.
                        if (firstLine)
                        {
                            string[] headers = data.Elements().Select(x => x.Name.LocalName).ToArray();
                            writer.WriteLine(string.Join(",", headers));
                            firstLine = false;
                        }

                        // The text of each child element becomes one field of the row.
                        string[] row = data.Elements().Select(x => (string)x).ToArray();
                        writer.WriteLine(string.Join(",", row));
                    }
                }
            }
        }
    }
}
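
Note that the loop writes element values verbatim: empty elements produce empty fields rather than the "-" shown in the question's expected output, and a value containing a comma would spill into an extra column. A minimal sketch of a helper that covers both cases follows. It assumes the "-" placeholder is wanted and that quoting fields in the usual CSV manner (wrapping in double quotes, doubling embedded quotes) is acceptable; `CsvField.Format` is a hypothetical name, not part of any library.

```csharp
using System;

static class CsvField
{
    // Hypothetical helper: turns one XML element value into one CSV field.
    public static string Format(string value)
    {
        // Missing, empty, or whitespace-only values become "-",
        // as in the question's expected output.
        if (value == null || value.Trim().Length == 0)
            return "-";

        // Quote fields that contain a comma, a double quote, or a line break,
        // doubling any embedded double quotes.
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0)
            return "\"" + value.Replace("\"", "\"\"") + "\"";

        return value;
    }
}
```

To use it, the row projection would become `data.Elements().Select(x => CsvField.Format((string)x))`.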

Define a class to hold the deserialized XML data and deserialize each XML file into an instance of it. Then iterate over the class members, append each member's value to a CSV string, and finally write that string to your output CSV file.

Reference: http://www.janholinka.net/Blog/Article/11
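
The steps above could be sketched roughly as follows with `XmlSerializer` from `System.Xml.Serialization`. This is an illustration under assumptions rather than the linked article's code: the `Data` class simply mirrors the element names from the question, every member is kept as a `string` so that empty elements such as `<Id></Id>` do not fail numeric parsing, and `XmlToCsv`, `Load`, `ToCsvRow`, and `Header` are hypothetical names.

```csharp
using System.IO;
using System.Xml.Serialization;

// Mirrors the <Data> root element from the question (assumed layout).
public class Data
{
    public string Waf { get; set; }
    public string Name { get; set; }
    public string Number { get; set; }
    public string Iteration { get; set; }
    public string Lot { get; set; }
    public string DateAndTime { get; set; }
    public string Id { get; set; }
    public string SW { get; set; }
    public string Image { get; set; }
    public string Angle { get; set; }
    public string Algo { get; set; }
}

public static class XmlToCsv
{
    // Column headers, in the same order as the fields in ToCsvRow.
    public const string Header =
        "Waf,Name,Number,Iteration,Lot,DateAndTime,Id,SW,Image,Angle,Algo";

    // Deserializes one XML document into a Data instance.
    public static Data Load(TextReader reader)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Data));
        return (Data)serializer.Deserialize(reader);
    }

    // Joins the members in a fixed, explicit order; null members become empty fields.
    public static string ToCsvRow(Data d)
    {
        string[] fields =
        {
            d.Waf, d.Name, d.Number, d.Iteration, d.Lot, d.DateAndTime,
            d.Id, d.SW, d.Image, d.Angle, d.Algo
        };
        return string.Join(",", fields);
    }
}
```

A caller would write `XmlToCsv.Header` once and then one `ToCsvRow` line per file, for example by opening each file with `File.OpenText`. The members are listed explicitly rather than discovered through reflection so that the column order is fixed by the code; fields are not escaped here, so values containing commas would need quoting.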
