Given a string representing a code snippet, you need to implement a tag validator to parse the code and return whether it is valid. A code snippet is valid if all the following rules hold:
The code must be wrapped in a valid closed tag. Otherwise, the code is invalid.
A closed tag (not necessarily valid) has exactly the following format: <TAG_NAME>TAG_CONTENT</TAG_NAME>. Here, <TAG_NAME> is the start tag and </TAG_NAME> is the end tag. The TAG_NAME in the start and end tags must be the same. A closed tag is valid if and only if both the TAG_NAME and the TAG_CONTENT are valid.
A valid TAG_NAME contains only upper-case letters and has a length in the range [1,9]. Otherwise, the TAG_NAME is invalid.
A valid TAG_CONTENT may contain other valid closed tags, cdata, and any characters (see Note) EXCEPT an unmatched <, unmatched start and end tags, and unmatched or closed tags with an invalid TAG_NAME. Otherwise, the TAG_CONTENT is invalid.
A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. However, you also need to consider unbalanced tags when tags are nested.
A < is unmatched if you cannot find a subsequent >. And when you find a < or </, all the subsequent characters until the next > should be parsed as TAG_NAME (not necessarily valid).
The cdata has the following format: <![CDATA[CDATA_CONTENT]]>. The range of CDATA_CONTENT is defined as the characters between <![CDATA[ and the first subsequent ]]>.
CDATA_CONTENT may contain any characters. The purpose of cdata is to prevent the validator from parsing CDATA_CONTENT, so even if it contains characters that could be parsed as a tag (whether valid or invalid), you should treat them as regular characters.
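As an illustration of the TAG_NAME rule and the cdata rule, the two checks could be sketched in C++ roughly as follows. The helper names `isValidTagName` and `skipCdata` are hypothetical and are not part of the problem statement; this is only one possible reading of the rules.

```cpp
#include <cstddef>
#include <string>

// TAG_NAME rule: 1 to 9 characters, upper-case letters only.
// (Hypothetical helper name, chosen for illustration.)
bool isValidTagName(const std::string& name) {
    if (name.empty() || name.size() > 9) return false;
    for (char c : name) {
        if (c < 'A' || c > 'Z') return false;
    }
    return true;
}

// cdata rule: if a cdata section starts at `start`, return the index just
// past the first subsequent "]]>". Return std::string::npos when `start`
// is not at "<![CDATA[" or when no terminator follows.
// (Hypothetical helper name, chosen for illustration.)
std::size_t skipCdata(const std::string& code, std::size_t start) {
    if (start > code.size()) return std::string::npos;
    if (code.compare(start, 9, "<![CDATA[") != 0) return std::string::npos;
    std::size_t end = code.find("]]>", start + 9);
    if (end == std::string::npos) return std::string::npos;
    return end + 3;  // everything in between is skipped without parsing
}
```

Because `skipCdata` stops at the first `]]>`, any tag-like text inside the section, such as `<div>`, is never inspected.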
Valid Code Examples:
Invalid Code Examples:
Note:
For simplicity, you may assume the input code (including the "any characters" mentioned above) contains only letters, digits, '<', '>', '/', '!', '[', ']' and ' '.
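This problem gives us a string that is really HTML-like code and asks us to verify whether it is written correctly. It lays down eight rules: for example, the code must be closed, tag names must be entirely upper-case and no longer than 9 characters, and CDATA has to follow a certain format. A few examples are provided, but honestly they come nowhere near covering the cases in the OJ. I was thoroughly schooled by the OJ this time: every submission was rejected, I analyzed the failing test case, changed the code, submitted again, and was rejected again. Only after a dozen or so rounds did the solution finally pass. The moment the green Accepted appeared was a real thrill, and that is one of the things that keeps me going. The main motivation, of course, is still everyone's support and encouragement; I really enjoy interacting with you in the comments. Below are the cases I failed, with an analysis of each: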
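Why it is invalid: the code does not start with a tag
Extra '>' characters in the content do no harm
Why it is invalid: the last end tag may only close the first (outermost) tag
Having no content data is fine
The content inside CDATA can be any characters
Why it is invalid: there is no tag at all
Why it is invalid: there is an extra '>' at the end
Watch out for distracting characters in the content data
Why it is invalid: a tag name cannot be longer than 9 characters
Why it is invalid: tag characters must all be upper-case
Why it is invalid: "<![CDATA[" is not matched correctly, and the text cannot be treated as a tag either
Why it is invalid: the code cannot start with a non-tag
Why it is invalid: the code cannot end with a non-tag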
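If we only match "</", it is an end tag. We use find to locate the closing angle bracket '>'; if it is not found, we return false directly. Otherwise we extract the tag name and look at the stack: if the stack is empty, or its top element is not equal to the tag, we return false; otherwise we pop the top element.
If we only match "<", it is a start tag. We again look for the closing angle bracket; if it cannot be found, or the tag length is 0 or greater than 9, we return false directly. We then walk over every character of the tag: if they are not all upper-case letters, we return false; otherwise we push the tag onto the stack. You may wonder why the end-tag handling has none of these extra checks. The reason is that an end tag is compared with the top of the stack, and every tag on the stack is already valid, so an invalid end tag can never be equal to it and false is returned anyway. Finally, we check whether the stack is empty; if it is not, some tag was left unclosed, so we return false. See the code below: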
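A minimal C++ sketch of the stack-based approach described above might look like the following. It is not the original listing: the free function `isValid`, the up-front empty-string check, and the rejection of cdata or plain text outside any open tag are assumptions made to keep the sketch self-contained.

```cpp
#include <cstddef>
#include <stack>
#include <string>

// Illustrative sketch of the stack-based validator (assumed structure).
bool isValid(const std::string& code) {
    if (code.empty()) return false;          // nothing that could be a closed tag
    std::stack<std::string> open;            // start tags that are still unclosed
    std::size_t i = 0;
    while (i < code.size()) {
        // Past the first character, everything must sit inside the outermost tag.
        if (i > 0 && open.empty()) return false;

        if (code.compare(i, 9, "<![CDATA[") == 0) {
            if (open.empty()) return false;  // cdata may not open the code
            std::size_t end = code.find("]]>", i + 9);
            if (end == std::string::npos) return false;
            i = end + 3;                     // skip the content without parsing it
        } else if (code.compare(i, 2, "</") == 0) {
            std::size_t end = code.find('>', i + 2);
            if (end == std::string::npos) return false;
            std::string tag = code.substr(i + 2, end - (i + 2));
            // Stack entries are already valid, so comparing is enough.
            if (open.empty() || open.top() != tag) return false;
            open.pop();
            i = end + 1;
        } else if (code[i] == '<') {
            std::size_t end = code.find('>', i + 1);
            if (end == std::string::npos) return false;
            std::string tag = code.substr(i + 1, end - (i + 1));
            if (tag.empty() || tag.size() > 9) return false;
            for (char c : tag) {
                if (c < 'A' || c > 'Z') return false;
            }
            open.push(tag);
            i = end + 1;
        } else {
            if (open.empty()) return false;  // plain text before any start tag
            ++i;
        }
    }
    return open.empty();                     // leftover entries are unclosed tags
}
```

With this sketch, `isValid("<A></A>")` is true, while `isValid("<A></A>>")` and `isValid("<A></A><B></B>")` are both false, because the stack is already empty when more input follows the outermost end tag.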
References:
https://discuss.leetcode.com/topic/91473/clean-c-solution
https://discuss.leetcode.com/topic/91446/c-clean-code-recursive-parser
https://discuss.leetcode.com/topic/91505/6-lines-c-solution-using-regex
https://discuss.leetcode.com/topic/91300/java-solution-use-startswith-and-indexof
https://discuss.leetcode.com/topic/91406/java-solution-7-lines-regular-expression/2