The Oracle (tm) Users' Co-Operative FAQ

What is a Parse Tree?

Author's name: Carel-Jan Engel

Author's Email: [email protected]

Date written: Mar 24, 2005

Oracle version(s): N/A

In documentation about tuning SQL, I see references to parse trees. What is a parse tree?

A parse tree is an internal structure created by a compiler or interpreter while it parses a language construct. Parsing is also known as 'syntax analysis'.

An example will illustrate a parse tree. It is a slightly adapted version of the example on page 6 of the famous 'Dragon Book': Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman, published by Addison-Wesley (my copy is from 1986). Rather than dealing with the complexities of a SQL statement, let's take a rather simple language construct, the assignment of the result of an expression to a variable:

result := salary + bonus * 1.10

When the compiler analyzes this statement, the resulting parse tree will look like this:

                 assignment statement
                  /        |         \
         identifier       :=           expression
              |                       /     |     \
           result           expression      +      expression
                                 |                /     |     \
                            identifier  expression      *      expression
                                 |           |                     |
                              salary    identifier              number
                                             |                     |
                                           bonus                 1.10

The picture shows the tree upside down, with its root at the top. The language elements in this small, simple assignment are identifiers (result, salary, bonus), operators (:=, +, *), and a number (1.10). An 'identifier' is the language element that names a variable, function or procedure. An 'operator' is the language element that represents some action to be taken upon the operands on either side of it. A 'number' is a constant, 1.10 in this statement. The syntax rules (the 'grammar') specify which 'sentences' are valid.
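Before any tree can be built, the statement has to be split into these language elements. The following is a minimal Python sketch of that step for the toy assignment language only; the token names and the tokenize function are made up for this illustration and are not taken from the Dragon Book or from any Oracle component.

```python
import re

# Hypothetical token categories for the toy assignment language in this answer.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),     # e.g. 1.10
    ("IDENTIFIER", r"[A-Za-z_]\w*"),  # e.g. result, salary, bonus
    ("OPERATOR", r":=|[+*]"),         # :=, + and *
    ("SKIP", r"\s+"),                 # whitespace separates tokens and is dropped
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Split a statement into (kind, text) pairs, rejecting unknown characters."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("result := salary + bonus * 1.10")
```

For the example statement this yields three identifiers, three operators and one number, in the order in which they appear in the statement.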

After successfully decomposing the statement into its internal representation, the compiler or interpreter can 'walk the tree'. A compiler does so to create the executable code for the construct. An interpreter does not generate code, but invokes its own built-in execution functions as it walks. Let's take the interpreter for the rest of the explanation, because executing the steps is easier to explain than the code generation of a compiler. For the example I assume the bonus to be 100 and the salary to be 1000. The tree walk starts at the root of the tree, the assignment statement. The rule for the assignment tells the interpreter that the right-hand side has to be evaluated first. This evaluation is also known as 'reduction'. The right-hand side of the assignment needs to be reduced to a value, the result of the expression, before it can be assigned to the variable at the left-hand side of the statement.

The first node at the right-hand side of the statement contains an expression with a '+' operator. The right-hand side of the '+' operator needs to be added to its left-hand side, so the walk goes on to the next node at the right-hand side. There the interpreter detects the expression with the '*' operator. The left-hand side of this operator needs to be multiplied by the right-hand side. The interpreter goes on to the right-hand side and detects an expression that consists of a single number: 1.10. This side is fully reduced, so the result can be stored, and the interpreter walks back up the tree to the '*' operator and starts evaluating its left-hand side. This is an expression that consists of a single identifier representing a variable, 'bonus'. The memory location represented by this variable is read and its contents (100) are multiplied by the right-hand side result, 1.10. This expression has now been fully reduced to the result 110. The interpreter walks up to the '+' operator and starts evaluating its left-hand side. There it again detects an identifier, 'salary'. Its location is read and the expression is reduced to a number, 1000. The right- and left-hand sides are added, resulting in 1,110. Now the expression at the right-hand side of the assignment is fully reduced, and the interpreter walks up the tree and finds the assignment operator ':='. This instructs the interpreter to copy the result of the expression to the left-hand side. The left-hand side contains an identifier, 'result'. The memory location represented by 'result' is filled with the result of the expression, 1,110.
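The walk described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node classes (Number, Identifier, BinaryOp, Assignment) and the functions reduce_expression and execute are hypothetical names invented here, the tree is built by hand rather than by a parser, and a plain dictionary stands in for the memory locations of the variables.

```python
from dataclasses import dataclass


@dataclass
class Number:
    value: float


@dataclass
class Identifier:
    name: str


@dataclass
class BinaryOp:
    operator: str
    left: object
    right: object


@dataclass
class Assignment:
    target: Identifier
    expression: object


def reduce_expression(node, variables):
    """Reduce an expression node to a single value by walking its subtree."""
    if isinstance(node, Number):
        return node.value                # a constant is already fully reduced
    if isinstance(node, Identifier):
        return variables[node.name]      # read the variable's "memory location"
    if isinstance(node, BinaryOp):
        # Follow the order used in the text: right-hand side first, then left.
        right = reduce_expression(node.right, variables)
        left = reduce_expression(node.left, variables)
        if node.operator == "+":
            return left + right
        if node.operator == "*":
            return left * right
        raise ValueError(f"unknown operator {node.operator!r}")
    raise TypeError(f"unexpected node {node!r}")


def execute(statement, variables):
    """Reduce the right-hand side, then store the value in the target variable."""
    value = reduce_expression(statement.expression, variables)
    variables[statement.target.name] = value
    return value


# Hand-built tree for: result := salary + bonus * 1.10
tree = Assignment(
    Identifier("result"),
    BinaryOp("+", Identifier("salary"),
             BinaryOp("*", Identifier("bonus"), Number(1.10))),
)
variables = {"salary": 1000, "bonus": 100}
execute(tree, variables)
# variables["result"] is now about 1110 (binary floating point may round slightly)
```

Each recursive call corresponds to stepping down one node, and each return corresponds to the interpreter walking back up with a reduced value.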

This is just a simplified explanation of how an interpreter or compiler uses a parse tree. A complete introduction to compiler construction is beyond the scope of this answer. It should be clear, however, that creating a parse tree consumes some resources. Before the language elements can be recognized, the statement must be read character by character; type checking and any necessary conversions must be done; identifiers (tables, columns etc.) must be resolved and checked against the data dictionary; and so on. Once this 'hard parse' has composed the parse tree, executing the statement from the tree is far cheaper than repeating all this analysis over and over again. Therefore, storing the parse tree in the SQL area for future use can save quite some time when processing SQL statements that have been encountered before.
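The saving can be illustrated with a toy cache. The sketch below is not how Oracle implements its SQL area; StatementCache and expensive_parse are hypothetical names, the cache is keyed on the exact statement text as a simplifying assumption, and the "parse" is a placeholder that merely splits the text.

```python
class StatementCache:
    """Toy stand-in for a shared SQL area: maps statement text to its parsed form."""

    def __init__(self, parse):
        self._parse = parse      # the expensive "hard parse" function
        self._trees = {}         # statement text -> stored parse result
        self.hard_parses = 0
        self.cache_hits = 0

    def get_tree(self, statement_text):
        tree = self._trees.get(statement_text)
        if tree is None:
            # First time this exact text is seen: do the full analysis once.
            tree = self._parse(statement_text)
            self._trees[statement_text] = tree
            self.hard_parses += 1
        else:
            # Seen before: reuse the stored result instead of parsing again.
            self.cache_hits += 1
        return tree


def expensive_parse(statement_text):
    # Placeholder for real syntax analysis; it only splits on whitespace.
    return tuple(statement_text.split())


cache = StatementCache(expensive_parse)
for _ in range(3):
    cache.get_tree("result := salary + bonus * 1.10")
cache.get_tree("result := salary")
```

In this run the first statement is parsed once and reused twice, and the second statement is parsed once, so the counters end at two hard parses and two cache hits.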

Further reading: If you are interested in compiler building techniques, consider reading The Dragon Book. It can be found at: http://www.amazon.co.uk/exec/obidos/ASIN/0201101947/026-7499645-2696457
