天天看點

Managing Statistics part1

Managing Statistics

Understanding Statistics

Optimizer statistics are a collection of data that describe more details about the database and the objects in the database.These statistics are used by the query optimizer to choose the best excution plan for each SQL statement.Optimizer statistics include following:

*Table statistics

-Number of rows

-Number of blocks

-Average row length

*Column statistics

-Number of distinct values(NDV)in column

-Number of nulls in column

-Data distribution(histogram)資料分布直方圖

-Extended statistics

*Index statistics

-Number of leaf blocks

-Levels

-Clustering factor

*System statistics(相對其他比較容易變化)

-I/O performance and utilization

-CPU performance and utilization

When statistics are updated for a database object,Oracle invalidates(使無效) any currently parsed(已解析) SQL statements that access the object.The next time such a statement executes,the statement is re-parsed and the optimizer automatically chooses a new execution plan based on the new statistics.Distributed statements accessing objects with new statistics on remote database are not invalidated.The new statistics take effect the next time the SQL statement is parsed.

Managing Statistics

Use the dbms_stats package:

gather_table_stats收集某個表的統計資訊

gather_index_stats

gather_schema_stats

gather_database_stats

gather_stale_stats收集不新鮮的統計資訊

實驗一:收集表的統計資訊

1.建立一個表t,插入幾條記錄

2.desc dba_tables

3.select num_rows,blocks from dba_tables owner='HR' and tablename='T';發現沒有顯示統計資訊,因為剛建立完這個表,還沒收集統計資訊。

4.exec dbms_stats.gather_table_stats('HR','T');收集統計資訊

5.再執行第三步,就有統計資訊放在dba_tables這個資料字典裡了。

When to Gather Statistics

When gathering statistics manually,you not only need to determine how to gather statistics,but also when and how often to gather new statistics.

手動收集資訊,需要自己決定何時收集,怎麼收集等資訊。

For an appliation in which tables are being incrementally modified,you may only need to gather new statistics every week or every month.The simplest way to gather statistics in these enviroment is to use a script or job scheduling tool to regularly run the GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS procedures.The frequency of collection intervals should balance the task of providing accurate statistics for the optimizer against the processing overhead incurred by the statistics collection process.

對于增量的修改,例如對一個幾萬行的資料,有幾十行的增加,你隻需要定期用一個腳本或者過程收集新的統計資訊,太頻繁執行雖然會帶來最新的統計資訊,但是也會有一定的額外負載,需要權衡這個收集的頻率。

For tables which are being substantially modified in batch operations,such as with bulk loads,statistics should be gathered on those tables as part of the batch operation.The DBMS_STATS procedure should be called as soon as the load operation completes.

對于大規模的資料修改,例如批量導入,那在每次這種大批量操作後都應該收集一次統計資訊。

For partitioned tables,there are often cases in which only a single partition is modified.In those cases,statistics can be gathered only on those partitions rather than gathering statistics for the entire table.Howerver,gathering global statistics for the partitioned table may still be necessary.

對于分區表,有時候隻需要收集改動了的那個分區的統計資訊,而不用所有分區都去收集。有時候表的全局的統計資訊也是很重要的。