
One-Click XML Parsing (Made Configurable with Digester)

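Most programmers, besides living with bugs, deal with many kinds of files in their daily work. XML is certainly one of them: exchanging data, defining rules and so on all depend on it. XML is a very powerful description language; by comparison, formats like plain txt are much weaker.

As important as XML is, parsing it is tedious work. The reasons: there are many XML parsing tools and everyone knows a different one; learning a new tool costs time; and writing parsing rules is purely manual labor.

This is where a programmer's toil shows. Programmers tend to complain about being busy and tired. As programmers, we should instead think hard about how to distill what we build and cut down on repetitive work. I bring up this digression for a reason. Most of my current colleagues work overtime constantly and appear to be working very hard, but a look at their work is disheartening. The code gets written one big block after another; cranking out a hundred-odd lines in an evening certainly feels like "an achievement".

For example, XML files are generated by string concatenation. Even a small XML file often takes more than 50 lines of code. It leaves one speechless.

Parsing XML usually requires a set of parsing rules, and those rules can be built dynamically. The XML data is then loaded into a map according to the rules. The idea could hardly be simpler. Other extensions are out of scope for now.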
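To make the end goal concrete, here is a minimal sketch of the "XML in, path-keyed map out" idea. It is not the Digester-based implementation this article builds: it uses only the JDK's SAX parser and applies no rules file, and the class name `XmlPathMap` and its `parse` method are hypothetical names chosen for illustration. Attributes are stored under the element path plus the attribute name, which is an assumption modeled on the keys used in the test at the end of the article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: flattens an XML document into a "/path/to/node" -> text map. */
class XmlPathMap extends DefaultHandler {
    private final Map<String, String> values = new LinkedHashMap<>();
    private final Deque<String> path = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path.addLast(qName);
        text.setLength(0);
        String current = currentPath();
        // Store each attribute under <element path>/<attribute name>.
        for (int i = 0; i < attrs.getLength(); i++) {
            values.put(current + "/" + attrs.getQName(i), attrs.getValue(i));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only elements that carry non-blank text are recorded.
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.put(currentPath(), value);
        }
        text.setLength(0);
        path.removeLast();
    }

    private String currentPath() {
        return "/" + String.join("/", path);
    }

    static Map<String, String> parse(String xml) {
        try {
            XmlPathMap handler = new XmlPathMap();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("<a><b>x</b><c k=\"v\">y</c></a>"));
    }
}
```

Unlike the rule-driven approach described in the rest of the article, this sketch accepts every element it sees; the rules file is what lets the caller restrict parsing to the nodes that matter.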

The work falls into two parts.

Part 1: use Digester to build a configurable rule tree

(1) Rule node object

import java.util.ArrayList;
import java.util.List;

public class Node {

  private String nodeName;

  private List children = new ArrayList();

  /**
   * Attribute list, comma-separated
   */
  private String attrs = "";

  public String getAttrs() {
    return attrs;
  }

  public void setAttrs(String values) {
    this.attrs = values;
  }

  public String getNodeName() {
    return nodeName;
  }

  public void setNodeName(String nodeName) {
    this.nodeName = nodeName;
  }

  public List getChildren() {
    return children;
  }

  public void addChildren(Node node) {
    this.children.add(node);
  }

  @Override
  public String toString() {
    return this.nodeName + ": has " + this.getChildren().size() + " child nodes";
  }
}

(2) Define the XML parsing rules file. The same rules can of course also be written in code.

<?xml version="1.0"?>
<!DOCTYPE digester-rules PUBLIC "-//Jakarta Apache //DTD digester-rules XML V1.0//EN" "http://jakarta.apache.org/commons/digester/dtds/digester-rules.dtd">
<digester-rules>
  <pattern value="root">
    <object-create-rule classname="com.aisino.common.parse.Node"/>
    <set-properties-rule />
    <set-next-rule methodname="add"/>
    <pattern value="level1">
      <object-create-rule classname="com.aisino.common.parse.Node"/>
      <set-properties-rule />
      <set-next-rule methodname="addChildren"/>
      <pattern value="level2">
        <object-create-rule classname="com.aisino.common.parse.Node"/>
        <set-properties-rule />
        <!--
        <call-method-rule methodname="setNodeName" paramcount="1" paramtypes="java.lang.String"/>
        <call-param-rule paramnumber="0"/>
        -->
        <pattern value="level3">
          <object-create-rule classname="com.aisino.common.parse.Node"/>
          <set-properties-rule />
          <set-next-rule methodname="addChildren"/>
          <pattern value="level4">
            <object-create-rule classname="com.aisino.common.parse.Node"/>
            <set-properties-rule />
            <set-next-rule methodname="addChildren"/>
            <pattern value="level5">
              <object-create-rule classname="com.aisino.common.parse.Node"/>
              <set-properties-rule />
              <set-next-rule methodname="addChildren"/>
            </pattern>
          </pattern>
        </pattern>
        <set-next-rule methodname="addChildren"/>
      </pattern>
    </pattern>
  </pattern>
</digester-rules>

(3) Rule loading class RuleLoader

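// Imports assume Commons Digester 1.x/2.x, Commons Lang 2.x and Commons Logging.
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.DigesterLoader;
import org.apache.commons.digester.xmlrules.DigesterLoadingException;
import org.apache.commons.lang.StringUtils;
import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.xml.sax.SAXException;

// MapSetNextRule and ResourceHelper are the author's own classes; their source is not shown here.
public class RuleLoader {

  private static final Log logger = LogFactory.getLog(RuleLoader.class);

  private URL digesterRulesUrl;

  private URL fileUrl;

  public RuleLoader(String rulePath, String filePath) {
    digesterRulesUrl = getUrl(rulePath);
    fileUrl = getUrl(filePath);
    // digesterRulesUrl = getClass().getClassLoader().getResource(rulePath);
  }

  public RuleLoader(URL digesterRulesUrl, URL fileUrl) {
    this.digesterRulesUrl = digesterRulesUrl;
    this.fileUrl = fileUrl;
  }

  public static RuleLoader getXmlRuleLoader(String filePath) {
    URL url = getUrl("classpath:com/aisino/common/parse/xml-rules.xml");
    return new RuleLoader(url, getUrl(filePath));
  }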

  /**
   * Loads custom, user-specified rules.<br/>
   * Requires familiarity with Digester parsing rules.
   * @return the root nodes read from the rule tree file
   */
  public List parseRules() {
    ClassLoader classLoader = getClass().getClassLoader();
    Object root = new ArrayList();
    try {
      DigesterLoader.load(digesterRulesUrl, classLoader, fileUrl, root);
    } catch (IOException e) {
      logger.error("IOException", e);
    } catch (SAXException e) {
      logger.error("sax error", e);
    } catch (DigesterLoadingException e) {
      logger.error("an error occurs while parsing xml into digester rules", e);
    }
    return (List) root;
  }

  /**
   * Parses XML data and stores it in a map.<br/>
   * At most 5 levels are supported.<br/>
   * XML nodes of the rule tree: root, level1, level2, level3, level4, level5<br/>
   * For example:<br/>
   *  <root nodeName="dataexchangepackage"><br/>
   *     <level1 nodeName="envelopeinfo"><br/>
   *        <level2 nodeName="sourceid" /><br/>
   *        <level2 nodeName="destinationid" /><br/>
   *        <level2 nodeName="destinationappid" /><br/>
   *        <level2 nodeName="businesstype" /><br/>
   *        <level2 nodeName="globalbusinessid" /><br/>
   *     </level1>
   *     <level1>
   *     ...
   *  </root>
   * <br/>
   * Convention: ${} denotes the nodeName value of the corresponding node
   * ${root} = dataexchangepackage
   * ${level1} = envelopeinfo
   * ${level2} = sourceid ...
   * @param filePath location of the rule tree file
   * @param input the XML data to parse
   * @return map
   * Reading values from the map:
   * if the key is a leaf node, a String is returned; otherwise a Map.
   * To get the value of sourceid, use key=/dataexchangepackage/envelopeinfo/sourceid
   */
  public static Map reParseRules(String filePath, InputStream input) {
    List rules = RuleLoader.getXmlRuleLoader(filePath).parseRules();
    if (rules != null && rules.size() > 0) {
      Digester digester = new Digester();
      Node root = (Node) rules.get(0); // root node
      MapSetNextRule rule = new MapSetNextRule("put");
      addRule2Dister(digester, root, "", rule, true);
      try {
        Map valueMap = new HashMap();
        Map parseMap = (Map) digester.parse(input);
        valueMap.putAll(parseMap);
        afterRootMap(parseMap, valueMap, "");
        return valueMap;
      } catch (IOException e) {
        e.printStackTrace();
      } catch (SAXException e) {
        e.printStackTrace();
      } finally {
        digester = null;
      }
    }
    return null;
  }

  // Recursively walks the nested maps and stores every leaf value under its full path.
  private static void afterRootMap(Map valueMap, Map destMap, String pattern) {
    String fullPattern = "";
    Iterator keys = valueMap.keySet().iterator();
    while (keys.hasNext()) {
      Object key = keys.next();
      Object v = valueMap.get(key);
      fullPattern = pattern + "/" + key;
      if (v instanceof Map) {
        afterRootMap((Map) v, destMap, fullPattern);
      } else {
        logger.debug(fullPattern + ">>>>element>" + v + "    added to the result map");
        destMap.put(fullPattern, v);
      }
    }
  }

  private static URL getUrl(String resourceLocation) {
    try {
      if (resourceLocation.startsWith(ResourceHelper.CLASSPATH_URL_PREFIX)) {
        return ResourceHelper.getURL(resourceLocation);
      } else if (resourceLocation.startsWith(ResourceHelper.FILE_URL_PREFIX)) {
        resourceLocation = StringUtils.replace(resourceLocation, ResourceHelper.FILE_URL_PREFIX, "");
        return new File(resourceLocation).toURI().toURL();
      }
    } catch (Exception e) {
      logger.error("Error while resolving the XML path", e);
    }
    return null;
  }

  /**
   * Recursively adds parsing rules.
   * @param digester
   * @param node the current node
   * @param pattern the pattern of the parent node
   * @param rule rule that adds a child map to its parent map
   * @param isRoot whether this is the root node
   */
  private static void addRule2Dister(Digester digester, Node node, String pattern, MapSetNextRule rule, boolean isRoot) {
    String fullPattern = "";
    if (StringUtils.isNotBlank(pattern)) {
      fullPattern = pattern + "/" + node.getNodeName();
    } else {
      fullPattern = node.getNodeName();
    }
    if (node.getChildren().size() > 0) {
      // Inner node: create a HashMap for it.
      logger.debug(" add rules >>> digester.addObjectCreate(" + fullPattern + ", HashMap.class);");
      digester.addObjectCreate(fullPattern, HashMap.class);
      if (StringUtils.isNotBlank(node.getAttrs())) {
        String[] attrs = StringUtils.split(node.getAttrs(), ",");
        logger.debug(fullPattern + " has attributes: " + ToStringBuilder.reflectionToString(attrs));
        for (int i = 0; i < attrs.length; i++) {
          String attr = attrs[i];
          logger.debug(" add rules >>> digester.addCallMethod(" + fullPattern + ",\"put\", 2)");
          logger.debug(" add rules >>> digester.addObjectParam(" + fullPattern + ",0, " + attr + ")");
          logger.debug(" add rules >>> digester.addCallParam(" + fullPattern + ",1, " + attr + ")");
          // put(attribute name, attribute value) on the map of this node
          digester.addCallMethod(fullPattern, "put", 2);
          digester.addObjectParam(fullPattern, 0, attr);
          digester.addCallParam(fullPattern, 1, attr);
        }
      }
      if (!isRoot) { // not the root node
        logger.debug(" add rules >>> digester.addRule(" + fullPattern + ", rule);");
        digester.addRule(fullPattern, rule);
      }
      for (int i = 0; i < node.getChildren().size(); i++) {
        Node child = (Node) node.getChildren().get(i);
        addRule2Dister(digester, child, fullPattern, rule, false);
      }
    } else {
      // Leaf node: put(node name, element text) on the map of the parent
      logger.debug("add rules >>> digester.addCallMethod(" + fullPattern + ", \"put\", 2)");
      digester.addCallMethod(fullPattern, "put", 2);
      logger.debug("add rules >>>    digester.addObjectParam(" + fullPattern + ",0, " + node.getNodeName() + ")");
      digester.addObjectParam(fullPattern, 0, node.getNodeName());
      logger.debug("add rules >>> digester.addCallParam(" + fullPattern + ",1)");
      digester.addCallParam(fullPattern, 1);
    }
  }

  // ...
}
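The recursion in addRule2Dister can be hard to follow because every step registers rules on a Digester instance. The sketch below isolates only the traversal: it walks a small tree and records, for each node, the full pattern and whether the node would get a map (inner node) or a put call (leaf). `PatternNode`, `PatternWalker` and the `MAP`/`PUT` labels are hypothetical names used for illustration only; nothing here calls Digester.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the article's Node class: a name plus children. */
class PatternNode {
    final String name;
    final List<PatternNode> children = new ArrayList<>();

    PatternNode(String name, PatternNode... kids) {
        this.name = name;
        for (PatternNode kid : kids) {
            children.add(kid);
        }
    }
}

class PatternWalker {
    /**
     * Appends "MAP <pattern>" for nodes with children and "PUT <pattern>"
     * for leaves, visiting the tree depth-first in document order.
     */
    static void collect(PatternNode node, String parentPattern, List<String> out) {
        String full = parentPattern.isEmpty() ? node.name : parentPattern + "/" + node.name;
        if (node.children.isEmpty()) {
            out.add("PUT " + full);
        } else {
            out.add("MAP " + full);
            for (PatternNode child : node.children) {
                collect(child, full, out);
            }
        }
    }

    static List<String> plan(PatternNode root) {
        List<String> out = new ArrayList<>();
        collect(root, "", out);
        return out;
    }

    public static void main(String[] args) {
        PatternNode tree = new PatternNode("a",
                new PatternNode("b", new PatternNode("c")),
                new PatternNode("d"));
        for (String step : plan(tree)) {
            System.out.println(step);
        }
    }
}
```

For the tree `a(b(c), d)` the plan is `MAP a`, `MAP a/b`, `PUT a/b/c`, `PUT a/d`, which matches the order in which the recursive method above would register its rules.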

XML reader

public class XmlReader {

  private static final Log logger = LogFactory.getLog(XmlReader.class);

  private String rulePath;

  private String filePath;

  private RuleLoader rule;

  public XmlReader(String rulePath) {
    this.rulePath = rulePath;
  }

  public List read(String filePath) {
    rule = new RuleLoader(this.rulePath, filePath);
    return rule.parseRules();
  }
}

Test class

  public static void main(String args[]) throws Exception {
    InputStream resourceAsStream = Thread.currentThread().getContextClassLoader().getResourceAsStream("com/aisino/common/parse/101r.xml");
    Map m = RuleLoader.reParseRules("classpath:com/aisino/common/parse/xml-value-example.xml", resourceAsStream);
    System.out.println(m.get("/dataexchangepackage/envelopeinfo/destinationid"));
    System.out.println(m.get("/dataexchangepackage/transferinfo/age"));
    System.out.println(m.get("/dataexchangepackage/transferinfo/sex"));
  }
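The lookups in the test rely on nested maps being flattened into path keys, which is the job of afterRootMap. Here is that step in isolation as a small, typed sketch using only the JDK. `MapFlattener` is a hypothetical name, and unlike the original this version returns only the leaf entries (the original also keeps the top-level nested maps in the result).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapFlattener {
    /** Returns a new map whose keys are "/"-separated paths to every leaf value. */
    static Map<String, Object> flatten(Map<String, Object> nested) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto(nested, "", flat);
        return flat;
    }

    private static void flattenInto(Map<?, ?> source, String prefix, Map<String, Object> dest) {
        for (Map.Entry<?, ?> entry : source.entrySet()) {
            String path = prefix + "/" + entry.getKey();
            if (entry.getValue() instanceof Map) {
                // Inner node: descend, carrying the path built so far.
                flattenInto((Map<?, ?>) entry.getValue(), path, dest);
            } else {
                // Leaf: store the value under its full path.
                dest.put(path, entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> envelope = new LinkedHashMap<>();
        envelope.put("destinationid", "yd");
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("envelopeinfo", envelope);
        System.out.println(flatten(root));
    }
}
```

Keeping the flattening separate from parsing means the same lookup convention works no matter how the nested maps were produced.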

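Test rule tree file (defines which XML nodes to parse)

For now only 5 levels are supported. More levels (6, 8, ...) can of course be added by extending the Digester rules file.

As far as I can tell, Digester does not support wildcards. On this point the design is admittedly not fully extensible, but 5 levels are quite enough for practical use.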

<root nodeName="dataexchangepackage">
  <level1 nodeName="envelopeinfo">
    <level2 nodeName="sourceid" />
    <level2 nodeName="destinationid" />
    <level2 nodeName="destinationappid" />
    <level2 nodeName="businesstype" />
    <level2 nodeName="globalbusinessid" />
  </level1>
  <level1 nodeName="transferinfo" attrs="age,sex">
    <level2 nodeName="senderid" />
    <level2 nodeName="receiverid" />
    <level2 nodeName="isretry" />
    <level2 nodeName="sendtime" />
    <level2 nodeName="messageid" />
    <level2 nodeName="sourcemessageid" />
  </level1>
  <level1 nodeName="contentcontrol">
    <level2 nodeName="zip">
      <level3 nodeName="iszip"/>
    </level2>
    <level2 nodeName="encrypt">
      <level3 nodeName="isencrypt"/>
    </level2>
    <level2 nodeName="code">
      <level3 nodeName="iscode"/>
    </level2>
  </level1>
  <level1 nodeName="packageinfo">
    <level2 nodeName="subpackage">
      <level3 nodeName="sequence"/>
      <level3 nodeName="content">
        <level4 nodeName="dj_nsrxx">
          <level5 nodeName="nsrsbh"/>
          <level5 nodeName="swjg_dm"/>
          <level5 nodeName="nsrdzdah"/>
          <level5 nodeName="nsrmc"/>
          <level5 nodeName="zjhm"/>
          <level5 nodeName="scjydz"/>
          <level5 nodeName="bsrmc"/>
          <level5 nodeName="dhhm"/>
          <level5 nodeName="djzclx_dm"/>
          <level5 nodeName="nsrzt_dm"/>
          <level5 nodeName="nsr_swjg_dm"/>
        </level4>
      </level3>
    </level2>
  </level1>
  <level1 nodeName="returnstate">
    <level2 nodeName="returncode"></level2>
    <level2 nodeName="returnmessageid"></level2>
    <level2 nodeName="returnmessage"></level2>
  </level1>
</root>
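The five-level ceiling comes from the Digester rules file spelling out one pattern per level. As a point of comparison, a recursive loader has no such ceiling. The following sketch reads a rule tree file of the shape shown above with the JDK's DOM parser instead of Digester; `RuleNode` is a hypothetical stand-in for the Node class, and `RecursiveRuleLoader` is likewise an illustrative name rather than part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical stand-in for the article's Node class. */
class RuleNode {
    final String nodeName;
    final String attrs;
    final List<RuleNode> children = new ArrayList<>();

    RuleNode(String nodeName, String attrs) {
        this.nodeName = nodeName;
        this.attrs = attrs;
    }
}

class RecursiveRuleLoader {
    /** Converts one element and, recursively, all of its child elements. */
    static RuleNode toRuleNode(Element element) {
        // getAttribute returns "" when the attribute is absent.
        RuleNode node = new RuleNode(element.getAttribute("nodeName"), element.getAttribute("attrs"));
        NodeList kids = element.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i) instanceof Element) {
                node.children.add(toRuleNode((Element) kids.item(i)));
            }
        }
        return node;
    }

    static RuleNode load(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return toRuleNode(root);
        } catch (Exception e) {
            throw new IllegalStateException("cannot load rule tree", e);
        }
    }

    /** Number of levels in the tree; a single node has depth 1. */
    static int depth(RuleNode node) {
        int deepest = 0;
        for (RuleNode child : node.children) {
            deepest = Math.max(deepest, depth(child));
        }
        return deepest + 1;
    }

    public static void main(String[] args) {
        RuleNode root = load("<root nodeName=\"a\"><level1 nodeName=\"b\"/></root>");
        System.out.println(root.nodeName + " has depth " + depth(root));
    }
}
```

The trade-off is that this bypasses Digester for the rule tree, so whether it fits depends on how much of the existing Digester setup one wants to keep.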

Test data

<?xml version="1.0" encoding="gbk"?>
<dataexchangepackage xmlns="http://www.chinatax.gov.cn/tirip/dataspec" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.chinatax.gov.cn/tirip/dataspec/dataexchangepackage.xsd" version="sw5001">
  <envelopeinfo>
    <sourceid>sj</sourceid>
    <destinationid>yd</destinationid>
    <destinationappid>sjqz</destinationappid>
    <businesstype>wlfp101</businesstype>
    <globalbusinessid>sjwlfp1012012011100184241</globalbusinessid>
  </envelopeinfo>
  <transferinfo age="88777" sex="男">
    <senderid>sj</senderid>
    <receiverid>yd</receiverid>
    <isretry/>
    <sendtime>2012-01-11 13:14:57:759</sendtime>
    <messageid>sjwlfp1012012011100184241</messageid>
    <sourcemessageid/>
  </transferinfo>
  <contentcontrol>
    <zip>
      <iszip>false</iszip>
    </zip>
    <encrypt>
      <isencrypt>false</isencrypt>
    </encrypt>
    <code>
      <iscode>false</iscode>
    </code>
  </contentcontrol>
  <packageinfo>
    <subpackage>
      <sequence>1</sequence>
      <content>
        <dj_nsrxx>
          <nsrsbh>350583729702365</nsrsbh>
          <swjg_dm>13505833100</swjg_dm><xgrq/>
          <nsrdzdah>350502001008324</nsrdzdah>
          <nsrmc>南安市水頭康利石材有限公司</nsrmc>
          <zjhm>130302610511351</zjhm>
          <scjydz>南安市水頭鎮西錦村</scjydz>
          <bsrmc>陳姿穎</bsrmc>
          <dhhm>6981988</dhhm>
          <djzclx_dm>230</djzclx_dm>
          <nsrzt_dm>21</nsrzt_dm>
          <nsr_swjg_dm>13505833100</nsr_swjg_dm>
        </dj_nsrxx>
      </content>
    </subpackage>
  </packageinfo>
  <returnstate>
    <returncode>00</returncode>
    <returnmessageid>ydwlfp1012012011100000001</returnmessageid>
    <returnmessage>成功</returnmessage>
  </returnstate>
</dataexchangepackage>

Test results

/dataexchangepackage/envelopeinfo/destinationid>>>>yd

/dataexchangepackage/transferinfo/age>>>88777

/dataexchangepackage/transferinfo/sex>>>男

Reposted from: http://dba10g.blog.51cto.com/764602/1179244