.NET Compact Framework Development (3): XML SAX Operations

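  • XmlTextReader is a read-only, forward-only XML parser. It throws an XmlException when a parsing error occurs and does not support DTDs. XmlTextReader can load an XML document in several ways, and its XmlResolver property is used to resolve remote resources.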

// Relative paths are not supported: use a full path, or place the file in the root directory

XmlTextReader reader = new XmlTextReader("filename.xml");

XmlTextReader xmlReader = new XmlTextReader("http://foo.com/bar.xml");

// If your web site requires password authentication, you can use the following approach

static void Main()

{

  NetworkCredential cred =

    new NetworkCredential("usrnm", "psswd", "domain");

  Stream s =

    GetDocumentStream("http://www/foo.com/bar.xml", cred);

  XmlTextReader reader = new XmlTextReader(s);

  // Do something interesting with the XmlTextReader

}

static Stream GetDocumentStream(string address, ICredentials cred)

{

  XmlUrlResolver xur = new XmlUrlResolver();

  Uri uri = new Uri(address);

  if(cred != null)

    xur.Credentials = cred;

  try

  {

    return (Stream)xur.GetEntity(uri, null,null);

  }

  catch(Exception e)

  {

    MessageBox.Show(e.ToString());

    throw; // rethrow without resetting the stack trace

  }

}

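  • When the Namespaces property of XmlTextReader is true, namespaces are supported. When it is false, the prefix and local name are combined into the local name. This property must be set before any read operation, that is, while the ReadState property is still ReadState.Initial.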

reader.Namespaces = true;

reader.MoveToContent();

MessageBox.Show("Local Name: " + reader.LocalName);

MessageBox.Show("Prefix: " + reader.Prefix);

MessageBox.Show("Namespace: " + reader.NamespaceURI);

reader.Close();
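
The snippet above leaves out how the reader is created. A minimal self-contained sketch follows, assuming a desktop-style console program instead of MessageBox; the class name NamespacesDemo, the helper Describe, and the sample document are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class NamespacesDemo
{
    // Returns "prefix|localName|namespaceURI" for the root element.
    public static string Describe(string xml, bool namespaces)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.Namespaces = namespaces;   // set while ReadState is still Initial
        reader.MoveToContent();           // advance to the root element
        string result = reader.Prefix + "|" + reader.LocalName + "|" + reader.NamespaceURI;
        reader.Close();
        return result;
    }

    static void Main()
    {
        // Hypothetical sample document, not taken from the original article.
        string xml = "<po:order xmlns:po='http://example.com/po'/>";
        Console.WriteLine(Describe(xml, true));   // po|order|http://example.com/po
        Console.WriteLine(Describe(xml, false));  // namespace support switched off
    }
}
```

Running it with both settings side by side makes it easy to compare what LocalName and Prefix report in each mode.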

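  • The Normalization property of XmlTextReader controls whether whitespace and attribute values are normalized: adjacent whitespace is collapsed into a single space, and entity references are replaced with their resolved values (DTDs are not supported). Because the latter is always performed (for example, resolving &gt;), in practice the Compact Framework only collapses whitespace. This property can be set at any point during parsing.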

public static void TestNormalization(bool normOn)

{

  string data =

     @"<Root>

     <Element attr='     &lt;Testing Normalization&gt;

     New Line'/>

      </Root>";

     StringReader str = new StringReader(data);

     XmlTextReader reader = new XmlTextReader(str);

     reader.WhitespaceHandling=WhitespaceHandling.None;

     reader.MoveToContent();

     reader.ReadStartElement("Root");

     reader.Normalization=normOn;

     MessageBox.Show("Normalization On: " + normOn.ToString());

     MessageBox.Show("attr's Value: " + reader.GetAttribute("attr"));

     reader.Close();

}

public static void Main(string[] args)

{

     // Test with normalization off

     TestNormalization(false);

     // Test with normalization on

     TestNormalization(true);

}

  • The WhitespaceHandling property of XmlTextReader determines how whitespace is handled:

All  ------    Both Whitespace and SignificantWhitespace nodes are returned.

None  ------   No Whitespace or SignificantWhitespace nodes are returned.

Significant  ------    Only SignificantWhitespace is returned.
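
To see the difference between these settings, the sketch below counts the whitespace nodes a reader reports for a small indented document. It assumes a console program; the class name WhitespaceDemo, the helper CountWhitespaceNodes, and the sample XML are illustrative only.

```csharp
using System;
using System.IO;
using System.Xml;

static class WhitespaceDemo
{
    // Counts Whitespace and SignificantWhitespace nodes seen under a given setting.
    public static int CountWhitespaceNodes(string xml, WhitespaceHandling handling)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = handling;
        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Whitespace ||
                reader.NodeType == XmlNodeType.SignificantWhitespace)
            {
                count++;
            }
        }
        reader.Close();
        return count;
    }

    static void Main()
    {
        // Three runs of indentation sit between the tags of this sample.
        string xml = "<root>\n  <a>1</a>\n  <b>2</b>\n</root>";
        Console.WriteLine(CountWhitespaceNodes(xml, WhitespaceHandling.All));   // 3
        Console.WriteLine(CountWhitespaceNodes(xml, WhitespaceHandling.None));  // 0
    }
}
```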

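  • While parsing an XML stream, XmlTextReader points at the node it has just encountered. Each node carries four kinds of information (not all four are valid for every node type): node name, node namespace, node value, and node attributes.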

NODE TYPE                              VALUE

Attribute ------   The string value of the attribute

CDATA  ------  The content of the CDATA section

Comment  ------  The content of the comment node

ProcessingInstruction  ------  The entire content, not including the target

SignificantWhitespace  ------  The white space within an xml:space = 'preserve' scope

Text  ------  The content of the text node

Whitespace  ------  The white space between markup

XmlDeclaration  ------  The content of the declaration

NODE TYPE            Available Attribute

Element  ------  Any custom attribute

XmlDeclaration  ------  Version, encoding, & standalone

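  • XmlTextReader exposes the NodeType property to identify the type of the current node, the HasValue property to check whether the current node has a value, and the HasAttributes property to check whether it has attributes. Use the Read method to walk through the XML content.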

while(reader.Read())

{

  switch(reader.NodeType)

  {

  case XmlNodeType.Element:

    if(reader.IsEmptyElement)

     MessageBox.Show("<" + reader.Name + "/>");

    else

     MessageBox.Show("<" + reader.Name + ">");

     break;

  case XmlNodeType.EndElement:

    MessageBox.Show("</" + reader.Name + ">");

    break;

  case XmlNodeType.CDATA:

    MessageBox.Show("<![CDATA[" + reader.Value + "]]>");

    break;

  case XmlNodeType.Comment:

    MessageBox.Show("<!-- " + reader.Value + " -->");

    break;

  case XmlNodeType.Document:

    MessageBox.Show("Reading an XML document");

    break;

  case XmlNodeType.DocumentFragment:

    MessageBox.Show("Reading an XML document fragment");

    break;

  case XmlNodeType.ProcessingInstruction:

    MessageBox.Show("<? " +

              reader.Name + " " +

              reader.Value + "?>");

    break;

  case XmlNodeType.Text:

    MessageBox.Show("Text: " + reader.Value);

    break;

  case XmlNodeType.XmlDeclaration:

    MessageBox.Show("<?xml " + reader.Value + "?>");

    break;

  }

}

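  • The ReadStartElement method of XmlTextReader is equivalent to calling IsStartElement followed by Read: it checks that the current node is a start element and advances to the next node. It can also take Name/Namespace arguments to check that the current node matches. Its main use is to skip the start tag and move straight to the element's content. If you call ReadStartElement and consume the element's content, call ReadEndElement symmetrically (unless the element's IsEmptyElement property is true).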

using (XmlReader reader = XmlReader.Create("book3.xml")) {

  // Parse the XML document.  ReadString is used to

  // read the text content of the elements.

  reader.Read();

  reader.ReadStartElement("book");  

  reader.ReadStartElement("title");   

  Console.Write("The content of the title element:  ");

  Console.WriteLine(reader.ReadString());

  reader.ReadEndElement();

  reader.ReadStartElement("price");

  Console.Write("The content of the price element:  ");

  Console.WriteLine(reader.ReadString());

  reader.ReadEndElement();

  reader.ReadEndElement();

}

//book3.xml

<book>

  <title>Pride And Prejudice</title>

  <price>19.95</price>

</book>

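    The ReadElementString method of XmlTextReader reads the content of a text-only element. Internally it first calls MoveToContent, then reads the content and returns it as a string, which also has the effect of calling ReadEndElement.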

reader.Read();

reader.ReadStartElement("Exercise");

string name = reader.ReadElementString();

string bodypart = reader.ReadElementString();

reader.Close();
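
The fragment above does not show the document it reads. A minimal sketch of the same sequence against an in-memory document follows; the child element names Name and BodyPart, the sample values, and the class name ReadElementStringDemo are assumptions, since the original only names the Exercise element.

```csharp
using System;
using System.IO;
using System.Xml;

static class ReadElementStringDemo
{
    // Returns { name, bodypart } read from a small text-only document.
    public static string[] ReadExercise(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = WhitespaceHandling.None;
        reader.MoveToContent();
        reader.ReadStartElement("Exercise");          // now on the first child
        string name = reader.ReadElementString();     // consumes <Name>...</Name>
        string bodypart = reader.ReadElementString(); // consumes <BodyPart>...</BodyPart>
        reader.ReadEndElement();                      // matches </Exercise>
        reader.Close();
        return new string[] { name, bodypart };
    }

    static void Main()
    {
        string xml = "<Exercise><Name>Squat</Name><BodyPart>Legs</BodyPart></Exercise>";
        string[] result = ReadExercise(xml);
        Console.WriteLine(result[0] + " / " + result[1]);  // Squat / Legs
    }
}
```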

  • The MoveToContent method of XmlTextReader quickly advances to a content node (a content node is an element, an end element, an entity reference, an end entity, or non-whitespace text). While searching forward for a content node, it skips DocumentType, ProcessingInstruction, Whitespace, and SignificantWhitespace nodes. If the current node is an attribute of a content node, the reader moves back to the attribute's owner element. MoveToContent returns a System.Xml.XmlNodeType; if no suitable node is found before the end of the file, it returns XmlNodeType.None.

while( XmlNodeType.None != reader.MoveToContent())

{

  if(XmlNodeType.Element == reader.NodeType

     && reader.Name == "book")

  {

    MessageBox.Show(reader.ReadElementString());

  }

  else

  {

    // Step past other content nodes so the loop cannot stall on one

    reader.Read();

  }

}

    When reading attributes, if the XmlTextReader cursor is not moved onto the attribute, only the attribute's value is available, not its name.

public int

SearchAttributes(string value, XmlReader reader)

{

  if(!reader.HasAttributes)

    return -1;

  for(int ndx = 0;ndx<reader.AttributeCount;++ndx)

  {

    // Attributes can be indexed by position, or by attribute name / local name + namespace URI, e.g. reader["standalone"]

    if(reader[ndx] == value)

      return ndx;

  }

  return -1;

}

  • XmlTextReader can move the cursor from an element to its attributes with the MoveToAttribute, MoveToFirstAttribute, and MoveToNextAttribute methods.

public string

SearchAttributes(string value, XmlReader reader)

{

  if(!reader.HasAttributes)

    return string.Empty;

  for(int ndx = 0;ndx<reader.AttributeCount;++ndx)

  {

    reader.MoveToAttribute(ndx);

    if(reader.Value == value){

      // Capture the attribute name before moving back to the element

      string name = reader.Name;

      reader.MoveToElement();

      return name;

    }

  }

  reader.MoveToElement();

  return string.Empty;

}

    which is equivalent to

public string

SearchAttributes(string value, XmlReader reader)

{

  if(!reader.MoveToFirstAttribute())

    return string.Empty;

  do{

    if(reader.Value == value){

      // Capture the attribute name before moving back to the element

      string name = reader.Name;

      reader.MoveToElement();

      return name;

    }

  }while(reader.MoveToNextAttribute());

  reader.MoveToElement();

  return string.Empty;

}

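  • After moving the cursor to search through attributes, it is best to call MoveToElement so that the cursor returns to the start of the element.
  • XmlTextReader provides several methods for reading element content: ReadString, ReadChars, ReadBase64, ReadBinHex, ReadInnerXml, and ReadOuterXml.
  • When the current node is an element, ReadString concatenates all the text, whitespace, significant whitespace, and CDATA nodes inside it into one string, stopping as soon as it meets any markup. When the current node is a text node, it concatenates the same node types until it reaches an end tag or other markup. A return value of String.Empty means that the current node has no readable text or is not an element or text node.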

string data =

  @"<Root>Text data followed by mark up.<Child/></Root>";

StringReader str = new StringReader(data);

XmlTextReader reader = new XmlTextReader(str);

reader.WhitespaceHandling=WhitespaceHandling.None;

reader.MoveToContent();

MessageBox.Show("Content of Root: " + reader.ReadString());

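  • The ReadChars method takes three parameters: the destination buffer, the offset into the buffer, and the number of characters to copy. It returns the number of characters actually read. It works only on element nodes, not on text nodes, and is mainly used to read a large amount of data from an element in chunks.
  • ReadInnerXml and ReadOuterXml read everything inside an element and return it as a string. ReadOuterXml returns the current node's start tag, content, and end tag, while ReadInnerXml returns only the content. Both methods work on element nodes and on attributes.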

ELEMENT XML  ------  POSITIONED ON  ------  ReadInnerXml  ------  ReadOuterXml

<author><fn>Ronnie</fn><ln>Yates</ln></author>  ------  <author>  ------  <fn>Ronnie</fn><ln>Yates</ln>  ------  <author><fn>Ronnie</fn><ln>Yates</ln></author>

<auth fn="Ronnie"/>  ------  fn  ------  Ronnie  ------  fn="Ronnie"
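
Neither ReadChars nor ReadInnerXml/ReadOuterXml has a code example above, so here is a rough sketch of both. It assumes a console program and in-memory documents; the class ElementContentDemo, its helper methods, and the data element are hypothetical names, and the author document is borrowed from the table.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class ElementContentDemo
{
    // Reads the content of the first element in small chunks with ReadChars.
    public static string ReadInChunks(string xml, int chunkSize)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.MoveToContent();                    // ReadChars must start on an element
        char[] buffer = new char[chunkSize];
        StringBuilder text = new StringBuilder();
        int read;
        while ((read = reader.ReadChars(buffer, 0, chunkSize)) > 0)
        {
            text.Append(buffer, 0, read);          // only the first 'read' chars are valid
        }
        reader.Close();
        return text.ToString();
    }

    // Returns the inner or outer XML of the first element.
    public static string ReadXml(string xml, bool outer)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.MoveToContent();
        string result = outer ? reader.ReadOuterXml() : reader.ReadInnerXml();
        reader.Close();
        return result;
    }

    static void Main()
    {
        Console.WriteLine(ReadInChunks("<data>0123456789ABCDEFGHIJ</data>", 8));
        string author = "<author><fn>Ronnie</fn><ln>Yates</ln></author>";
        Console.WriteLine(ReadXml(author, false)); // <fn>Ronnie</fn><ln>Yates</ln>
        Console.WriteLine(ReadXml(author, true));  // the whole <author> element
    }
}
```

A small chunk size is used here only to force several ReadChars calls; a real buffer would normally be much larger.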

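  • XmlTextReader also supports the Skip method and the Depth property.
  • XmlTextWriter is a forward-only way of generating an XML stream. Like the reader, it does not support DTDs.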

XmlTextWriter writer = new XmlTextWriter("output.xml", Encoding.UTF8);

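  • The Namespaces property of XmlTextWriter indicates whether namespaces are supported; they are supported by default. This property must be changed before any write operation (WriteState must be WriteState.Start).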

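XmlTextWriter writer = new XmlTextWriter("namespaces.xml", Encoding.UTF8);

writer.Namespaces = true;

// WriteStartElement(prefix, localName, ns) produces the prefixed element shown below

writer.WriteStartElement("po",

                         "test",

                         "http://www.fake.com");

writer.WriteEndElement();

writer.Close();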

Output

<po:test xmlns:po="http://www.fake.com" />

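  • Before writing XML data, set the output format of XmlTextWriter through the Formatting property, which can be Formatting.Indented or Formatting.None. The IndentChar and Indentation properties set the indent character and the indent size. These settings can be changed at any time and apply to subsequent write operations.
  • XmlTextWriter uses WriteStartDocument to write the XML declaration. This must be the first write operation after the constructor.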

XmlTextWriter writer =

  new XmlTextWriter("startdoc.xml", Encoding.UTF8);

writer.Formatting = Formatting.Indented;

writer.WriteStartDocument(true);

writer.WriteElementString("root", null);

writer.WriteEndDocument();

writer.Close();

Output

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

<root />

  • XmlTextWriter provides the WriteStartElement and WriteEndElement methods for writing an element step by step. WriteElementString writes an element that contains only text in a single call.

XmlTextWriter writer =

  new XmlTextWriter("startelem.xml", Encoding.UTF8);

writer.Formatting = Formatting.Indented;

writer.WriteStartDocument();

// Make a true empty element

writer.WriteStartElement("root");

writer.WriteEndElement();

// Make an empty element with start and end tags, <Empty></Empty>

// writer.WriteStartElement("Empty");

// writer.WriteFullEndElement();

writer.Close();

Output

<?xml version="1.0" encoding="utf-8"?>

<root />

    Another example

XmlTextWriter writer =

  new XmlTextWriter("elementstring.xml", Encoding.UTF8);

writer.Formatting = Formatting.Indented;

writer.WriteStartElement("StockQuote", "http://fakequote.com");

writer.WriteElementString

  ("Symbol", "http://fakequote.com", "MSFT");

writer.WriteElementString("Value",

                          "http://fakequote.com",

                          XmlConvert.ToString(123.32));

writer.WriteEndElement();

writer.Close();

Output

<StockQuote xmlns="http://fakequote.com">

  <Symbol>MSFT</Symbol>

  <Value>123.32</Value>

</StockQuote>

  • XmlTextWriter uses the WriteStartAttribute and WriteEndAttribute methods to write an attribute step by step. It also supports WriteAttributeString for writing an attribute in a single call.

public static void Main()

{

  XmlTextWriter writer =

    new XmlTextWriter("startatt.xml", Encoding.UTF8);

  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");

  writer.WriteStartAttribute("po", "att1", "http://bogus");

  writer.WriteString("value");

  writer.WriteEndAttribute();

  writer.WriteEndElement();

  writer.Close();

}

Output

<root po:att1="value" xmlns:po="http://bogus" />

    Another example

public static void Main()

{

  XmlTextWriter writer =

    new XmlTextWriter("attstring.xml", Encoding.UTF8);

  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");

  writer.WriteAttributeString("att1", "http://bogus", "value1");

  writer.WriteEndElement();

  writer.Close();

}

Output

<root d1p1:att1="value1" xmlns:d1p1="http://bogus" />

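  • WriteAttributeString can also write the special XML attributes xml:space and xml:lang. The former determines how whitespace inside the element is handled (preserve or default); the latter describes the language of the element's content.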

public static void Main()

{

   XmlTextWriter writer =

     new XmlTextWriter("nsdecl.xml", Encoding.UTF8);

   writer.Formatting = Formatting.Indented;

   writer.WriteStartElement("root");

   // set the xml:space attribute to preserve

   writer.WriteAttributeString("xml",

                               "space",

                               null,

                               "preserve");

  // set the xml:lang attribute to en

  writer.WriteAttributeString("xml",

                              "lang",

                              null,

                              "en");

  writer.WriteEndElement();

  writer.Close();

}

Output

<root xml:space="preserve" xml:lang="en" />

  • WriteAttributeString can also be used to declare namespaces.

public static void Main()

{

  XmlTextWriter writer =

    new XmlTextWriter("nsdecl.xml", Encoding.UTF8);

  writer.Formatting = Formatting.Indented;

  writer.WriteStartElement("root");

  // redefine the default namespace

  writer.WriteAttributeString("xmlns",

                              null,

                              "http://default");

  // define a namespace prefix "po"

  writer.WriteAttributeString("xmlns",

                              "po",

                              null,

                              "http://post_office");

  writer.WriteEndElement();

  writer.Close();

}

Output

<root xmlns="http://default" xmlns:po="http://post_office" />

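  • XmlTextWriter uses the WriteString method to write a content string in a single call. To write content in several pieces, use the WriteChars method. WriteBase64 and WriteBinHex write binary data.
  • XmlConvert converts .NET Compact Framework types to their XML Schema string representations, and converts XML strings back to .NET Compact Framework types. This is safer than the built-in ToString methods and System.Convert, because XmlConvert output does not depend on the current locale.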

  XmlTextWriter writer =

    new XmlTextWriter("xmlconvert.xml", Encoding.UTF8);

  writer.Formatting = Formatting.Indented;

  writer.Indentation = 2;

  writer.WriteStartElement("root");

  writer.WriteElementString("boolean",

     XmlConvert.ToString(false));

  writer.WriteElementString("Single",

    XmlConvert.ToString(Single.PositiveInfinity));

  writer.WriteElementString("Double",

    XmlConvert.ToString(Double.NegativeInfinity));

  writer.WriteElementString("DateTime",

  XmlConvert.ToString(DateTime.Now));

  writer.WriteEndElement();

  writer.Close();

  XmlTextReader reader = new XmlTextReader("xmlconvert.xml");

  reader.WhitespaceHandling = WhitespaceHandling.None;

  reader.MoveToContent();

  reader.ReadStartElement("root");

  bool b = XmlConvert.ToBoolean(reader.ReadElementString());

  float s = XmlConvert.ToSingle(reader.ReadElementString());

  double d = XmlConvert.ToDouble(reader.ReadElementString());

  DateTime dt = XmlConvert.ToDateTime(reader.ReadElementString());

  reader.Close();

  MessageBox.Show("Boolean: " + b.ToString());

  MessageBox.Show("Single: " + s.ToString());

  MessageBox.Show("Double: " + d.ToString());

  MessageBox.Show("DateTime: " + dt.ToString());
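
The WriteChars, WriteBase64, and WriteBinHex methods mentioned before the XmlConvert example have no sample of their own. The sketch below is one way they might be used, writing to an in-memory StringWriter rather than a file; the class name ContentWriterDemo and the element names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class ContentWriterDemo
{
    public static string Build()
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.WriteStartElement("root");

        // WriteChars: write one text value in two pieces from a char buffer.
        char[] chars = "Hello, XML".ToCharArray();
        writer.WriteStartElement("text");
        writer.WriteChars(chars, 0, 5);
        writer.WriteChars(chars, 5, chars.Length - 5);
        writer.WriteEndElement();

        // WriteBase64 / WriteBinHex: encode the same three bytes two ways.
        byte[] bytes = new byte[] { 1, 2, 3 };
        writer.WriteStartElement("bin");
        writer.WriteBase64(bytes, 0, bytes.Length);
        writer.WriteEndElement();
        writer.WriteStartElement("hex");
        writer.WriteBinHex(bytes, 0, bytes.Length);
        writer.WriteEndElement();

        writer.WriteEndElement();
        writer.Close();
        return output.ToString();
    }

    static void Main()
    {
        // <root><text>Hello, XML</text><bin>AQID</bin><hex>010203</hex></root>
        Console.WriteLine(Build());
    }
}
```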

  • Miscellaneous: to write raw data to an XmlTextWriter, use the WriteRaw method. To write a comment, use WriteComment. WriteCData writes CDATA sections, WriteProcessingInstruction writes processing instructions, and WriteWhitespace writes whitespace into the XML.

public static void Main()

{

  string rawText = "<bad_xml att=\"bad chars ='\"<>\">\n\t" +

                   "This is bad content — & < > = ' \"\n" +

                   "</bad_xml>";

  XmlTextWriter writer =

    new XmlTextWriter("xmlconvert.xml", Encoding.UTF8);

  writer.Formatting = Formatting.Indented;

  writer.Indentation = 2;

  writer.WriteStartElement("root");

  writer.WriteRaw(rawText);

  writer.WriteEndElement();

  writer.Close();

}

Output

<root><bad_xml att="bad chars ='"<>">

      This is bad content — & < > = ' "

</bad_xml></root>
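
The other miscellaneous methods can be sketched in the same way. This is only an illustration, assuming an in-memory StringWriter and default (non-indented) formatting; the class name MiscWriterDemo, the processing instruction target app, and the sample strings are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

static class MiscWriterDemo
{
    public static string Build()
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        // A processing instruction may appear before the root element.
        writer.WriteProcessingInstruction("app", "mode=\"test\"");
        writer.WriteStartElement("root");
        writer.WriteComment("a comment");
        // CDATA content is written as-is, without escaping < or &.
        writer.WriteCData("1 < 2 & 3 > 2");
        writer.WriteEndElement();
        writer.Close();
        return output.ToString();
    }

    static void Main()
    {
        // <?app mode="test"?><root><!--a comment--><![CDATA[1 < 2 & 3 > 2]]></root>
        Console.WriteLine(Build());
    }
}
```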