Oracle GoldenGate Series, Part 14: Monitoring GoldenGate Status

1. Monitoring with commands

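The main monitoring commands are covered in the subsections below.

Note that STATS reports processing statistics, whereas STATUS reports the current runtime state of a process.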

1.1 Monitoring an Extract recovery

If Extract abends when a long-running transaction is open, it can seem to take a long time to recover when it is started again. To recover its processing state, Extract must search back through the online and archived logs (if necessary) to find the first log record for that long-running transaction. The farther back in time that the transaction started, the longer the recovery takes, in general, and Extract can appear to be stalled.

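--If Extract abends while a long-running transaction is open, the next startup can spend a long time in recovery.

During recovery, Extract has to search the online and archived logs for the first log record of that long-running transaction, which marks where the transaction began, and only then can it recover. Extract is comparatively slow while it does this.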

To confirm that Extract is recovering properly, use the SEND EXTRACT command with the STATUS option. One of the following status notations appears, and you can follow the progress as Extract changes its log read position over the course of the recovery.

To check the recovery state of Extract, use the following command:

GGSCI> send extract_name status

Or:

GGSCI> send extract extract_name status

The command reports one of the following three statuses:

(1)    In recovery[1]: Extract is recovering to its checkpoint in the transaction log.

(2)    In recovery[2]: Extract is recovering from its checkpoint to the end of the trail.

(3)    Recovery complete: The recovery is finished, and normal processing will resume.

Example:

GGSCI (gg1) 12> send extract ext1 status

Sending STATUS request to EXTRACT EXT1 ...

EXTRACT EXT1 (PID 5269)

  Current status: In recovery[1]: At EOF

 Current read position:

 Sequence #: 24

 RBA: 6921352

 Timestamp: 2011-11-17 20:17:20.000000

 Current write position:

 Sequence #: 0

 RBA: 0

 Timestamp: 2011-11-17 16:56:31.777616

 Extract Trail: /u01/ggate/dirdat/lt

GGSCI (gg1) 13> send ext1 status

  Current status: In recovery[1]: At EOF

  Extract Trail: /u01/ggate/dirdat/lt
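When recovery is being watched from a script rather than by eye, the status line can be picked out of the command output. The following is a minimal sketch, assuming the output has already been captured as text and looks like the sample above; the function name `recovery_phase` and its return labels are made up for illustration and are not part of GoldenGate.

```python
import re

# Sample text modelled on the SEND EXTRACT ... STATUS output shown above.
SAMPLE = """\
EXTRACT EXT1 (PID 5269)
  Current status: In recovery[1]: At EOF
  Current read position:
  Sequence #: 24
  RBA: 6921352
"""


def recovery_phase(status_output):
    """Classify the 'Current status' line of captured status output.

    Returns 'recovery[1]', 'recovery[2]', 'complete', 'other',
    or None when no 'Current status' line is present.
    """
    match = re.search(r"Current status:\s*(.+)", status_output)
    if match is None:
        return None
    status = match.group(1)
    phase = re.search(r"In recovery\[(\d)\]", status)
    if phase:
        return "recovery[" + phase.group(1) + "]"
    if "Recovery complete" in status:
        return "complete"
    return "other"  # any status text that is not one of the three above


print(recovery_phase(SAMPLE))  # recovery[1]
```

Polling this value and logging the read position alongside it gives a simple record of how far the recovery has progressed.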

1.2 Monitoring lag

Lag statistics show you how well the Oracle GoldenGate processes are keeping pace with the amount of data that is being generated by the business applications. With this information, you can diagnose suspected problems and tune the performance of the Oracle GoldenGate processes to minimize the latency between the source and target databases.

       Lag statistics show how well the GoldenGate processes keep up with the volume of data being generated.

For Extract, lag is the difference, in seconds, between the time that a record was processed by Extract (based on the system clock) and the timestamp of that record in the data source.

--For Extract, lag is the difference between the time Extract processed a record and that record's timestamp in the data source. It reflects how quickly Extract responds. The unit is seconds.

For Replicat, lag is the difference, in seconds, between the time that the last record was processed by Replicat (based on the system clock) and the timestamp of the record in the trail.

--Likewise for Replicat, lag is the difference between the time Replicat processed the last record and that record's timestamp in the trail file. The unit is seconds.

Lag statistics can be viewed with either of the following two syntaxes:

(1) LAG {EXTRACT | REPLICAT | ER} {<group | wildcard>}

(2) SEND {EXTRACT | REPLICAT} {<group | wildcard>}, GETLAG

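       Note that the INFO command also returns a lag statistic, but it is taken from the last record that was checkpointed rather than the record the process is currently handling, so it is less accurate than the lag reported by LAG or SEND ... GETLAG.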

GGSCI (gg1) 20> lag er *

Sending GETLAG request to EXTRACT DPUMP ...

No records yet processed.

At EOF, no more records to process.

Sending GETLAG request to EXTRACT EXT1 ...

Last record lag: 21 seconds.

GGSCI (gg1) 21> send ext1 getlag
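For trending lag over time, the per-group figures can be extracted from captured LAG output. This is a rough sketch that only handles the message format shown in the transcript above; the function name `parse_lag` is hypothetical, and other GoldenGate versions may word these messages differently.

```python
import re

# Sample text modelled on the "lag er *" transcript above.
SAMPLE = """\
Sending GETLAG request to EXTRACT DPUMP ...
No records yet processed.
At EOF, no more records to process.

Sending GETLAG request to EXTRACT EXT1 ...
Last record lag: 21 seconds.
"""


def parse_lag(output):
    """Map each group name to its last record lag in seconds.

    A group that reports no lag figure (for example
    'No records yet processed.') maps to None.
    """
    lags = {}
    group = None
    for line in output.splitlines():
        header = re.match(
            r"Sending GETLAG request to (?:EXTRACT|REPLICAT) (\S+)", line
        )
        if header:
            group = header.group(1)
            lags[group] = None
            continue
        lag = re.match(r"Last record lag:\s*(\d+) seconds", line)
        if lag and group is not None:
            lags[group] = int(lag.group(1))
    return lags


print(parse_lag(SAMPLE))  # {'DPUMP': None, 'EXT1': 21}
```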

Lag alerting can be configured in three ways:

(1) Use the LAGREPORTMINUTES or LAGREPORTHOURS parameter to specify the interval at which Manager checks for Extract and Replicat lag.

       --These two parameters set how often Manager checks Extract and Replicat lag.

(2) Use the LAGCRITICALSECONDS, LAGCRITICALMINUTES, or LAGCRITICALHOURS parameter to specify a lag threshold that is considered critical, and to force a warning message to the error log when the threshold is reached. This parameter affects Extract and Replicat processes on the local system.

       --These three parameters set the lag threshold. Once lag exceeds it, the lag is treated as critical and a warning message is forced into the error log. The parameter only affects Extract and Replicat processes on the local system.

(3) Use the LAGINFOSECONDS, LAGINFOMINUTES, or LAGINFOHOURS parameter to specify how often to report lag information to the error log. If the lag is greater than the value specified with the LAGCRITICAL parameter, Manager reports the lag as critical; otherwise, it reports the lag as an informational message. A value of zero (0) forces a message at the frequency specified with the LAGREPORTMINUTES or LAGREPORTHOURS parameter.

       --These three parameters set how often lag information is written to the error log.
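The rule in (3) can be restated as a small decision function. This is only an illustration of the rule as described above, not Manager's actual implementation; `to_seconds` and `classify_lag` are hypothetical helper names, and the unit names simply mirror the SECONDS / MINUTES / HOURS suffixes of the parameters.

```python
# Multipliers for the three unit suffixes used by the LAG* parameters.
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60, "HOURS": 3600}


def to_seconds(value, unit):
    """Convert a threshold given in SECONDS, MINUTES, or HOURS to seconds."""
    return value * UNIT_SECONDS[unit]


def classify_lag(lag_seconds, critical_seconds):
    """Return 'critical' when lag is greater than the critical threshold,
    otherwise 'informational', following the rule described in (3)."""
    if lag_seconds > critical_seconds:
        return "critical"
    return "informational"


# A 21-second lag against a one-minute critical threshold.
print(classify_lag(21, to_seconds(1, "MINUTES")))  # informational
```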

1.3 Monitoring processing volume

The volume statistics show you the amount of data that is being processed by an Oracle GoldenGate process, and how fast it is being moved through the Oracle GoldenGate system. With this information, you can diagnose suspected problems and tune the performance of the Oracle GoldenGate processes.

1.3.1 Viewing volume statistics

Syntax:

STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>} [TABLE {<name | wildcard>}]

GGSCI (gg1) 22> stats er ext1

Sending STATS request to EXTRACT EXT1 ...

Start of Statistics at 2011-11-18 16:30:35.

DDL replication statistics (for all trails):

*** Total statistics since extract started     ***

       Operations                                  0.00

       Mapped operations                            0.00

       Unmapped operations                          0.00

       Other operations                             0.00

       Excluded operations                          0.00

Output to /u01/ggate/dirdat/lt:

Extracting from DAVE.PDBA to DAVE.PDBA:

*** Total statistics since 2011-11-18 15:13:17 ***

       Total inserts                                0.00

       Total updates                                0.00

       Total deletes                                1.00

       Total discards                               0.00

       Total operations                             1.00

*** Daily statistics since 2011-11-18 15:13:17 ***

       Total inserts                               0.00

*** Hourly statistics since 2011-11-18 16:00:00 ***

       No database operations have been performed.

*** Latest statistics since 2011-11-18 15:13:17 ***

       Total updates                               0.00

End of Statistics.

GGSCI (gg1) 23> stats extract ext1 table pdba

Start of Statistics at 2011-11-18 16:31:17.

       Operations                                   0.00

1.3.2 Viewing the processing rate

STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, REPORTRATE {HR | MIN | SEC}

--HR / MIN / SEC = per hour / per minute / per second

GGSCI (gg1) 24> stats er ext1,reportrate min

Start of Statistics at 2011-11-18 16:34:36.

       Total inserts/minute:                        0.00

       Total updates/minute:                        0.00

       Total deletes/minute:                        0.01

       Total discards/minute:                       0.00

       Total operations/minute:                     0.01

       Total discards/minute:                       0.00

       Total deletes/minute:                        0.01
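The per-minute figures look like the operation totals divided by the time elapsed since the statistics interval began. That is an assumption about how the rate is derived, not something the documentation quoted here states, but it agrees with the sample: one delete counted since 15:13:17 and reported at 16:34:36 works out to about 0.01 per minute. A quick check, with `rate_per_minute` as a made-up helper name:

```python
from datetime import datetime


def rate_per_minute(total_ops, since, now):
    """Average operations per minute between two points in time."""
    elapsed_minutes = (now - since).total_seconds() / 60
    return total_ops / elapsed_minutes


# Interval start and report time taken from the transcripts in this section.
since = datetime(2011, 11, 18, 15, 13, 17)
now = datetime(2011, 11, 18, 16, 34, 36)

# One delete over roughly 81 minutes.
print(round(rate_per_minute(1, since, now), 2))  # 0.01
```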

1.3.3 Viewing cumulative operation totals for a single table since startup

STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, TOTALSONLY <table>

GGSCI (gg1) 25> stats er ext1,totalsonly pdba

Start of Statistics at 2011-11-18 16:37:51.

Cumulative totals for specified table(s):

1.3.4 To limit the types of statistics that are displayed

STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, {TOTAL | DAILY | HOURLY | LATEST}

GGSCI (gg1) 28> stats ext1 total

Start of Statistics at 2011-11-18 16:44:52.

       Total updates                                0.00

Tip:

       The EXTRACT or REPLICAT keyword does not have to be given in these commands, and the comma before the option can also be omitted; GGSCI works both out automatically.

1.3.5 To clear all filters that were set with previous options

STATS {EXTRACT | REPLICAT | ER} {<group | wildcard>}, RESET

1.3.6 To send interim statistics to the report file

SEND {EXTRACT | REPLICAT | ER} {<group | wildcard>}, REPORT

2. Using the error log

The error log is stored in the GoldenGate installation directory:

gg1:/u01/ggate> ll ggserr.log

-rw-rw-rw- 1 oracle oinstall 149756 Nov 18 16:44 ggserr.log

The GoldenGate error log lets you review the following information:

(1)    a history of GGSCI commands

(2)    Oracle GoldenGate processes that started and stopped

(3)    processing that was performed

(4)    errors that occurred

(5)    informational and warning messages

Because the error log shows events as they occurred in sequence, it is a good tool for detecting the cause (or causes) of an error. For example, you might discover that:

(1)    someone stopped a process

(2)    a process failed to make a TCP/IP or database connection

(3)    a process could not open a file

2.1 To view the error log

Use any of the following:

(1)    Standard shell command to view the ggserr.log file within the root Oracle GoldenGate directory

(2)    Oracle GoldenGate Director

(3)    VIEW GGSEVT command in GGSCI

Syntax: VIEW GGSEVT

GGSCI (gg1) 29> view ggsevt

2011-11-08 20:08:12  INFO   OGG-00987  Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): edit params mgr.

2011-11-08 20:11:09  INFO   OGG-00987  Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): start manager.

2011-11-08 20:11:11  INFO   OGG-00983  Oracle GoldenGate Manager for Oracle, mgr.prm: Manager started (port 7809).

2011-11-08 20:36:22  INFO   OGG-00987  Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): add extract ext1 tranlog, begin now.

2011-11-08 20:36:47  INFO   OGG-01749  Oracle GoldenGate Command Interpreter for Oracle: Successfully registered EXTRACT EXT1 to start managing log retention at SCN 1121060.

2011-11-08 20:37:16  INFO   OGG-00987  Oracle GoldenGate Command Interpreter for Oracle: GGSCI command (oracle): add exttrail /u01/ggate/dirdat/lt  extract ext1.

2.2 To filter the error log

The error log can become very large, but you can filter it based on a keyword. For example, this filter shows only errors:

$ more ggserr.log | grep ERROR

gg1:/u01/ggate> more ggserr.log | grep ERROR

2011-11-09 21:00:32  ERROR  OGG-01224  Oracle GoldenGate Capture for Oracle, ext1.prm:  TCP/IP error 113 (No route to host).

2011-11-09 21:00:33  ERROR  OGG-01668  Oracle GoldenGate Capture for Oracle, ext1.prm:  PROCESS ABENDING.

2011-11-15 20:51:50  ERROR  OGG-01203  Oracle GoldenGate Capture for Oracle, ext2.prm:  EXTRACT abending.
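A plain keyword filter also matches any line whose message text happens to contain the word ERROR. Where that matters, the severity column can be parsed instead. The sketch below assumes each event sits on one line laid out as timestamp, severity, OGG code, message, as in the entries above; `filter_events` is a hypothetical helper name, not a GoldenGate tool.

```python
import re

# Timestamp, severity, OGG message code, then the message text.
EVENT = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\w+)\s+(OGG-\d+)\s+(.*)$"
)

# Sample lines modelled on the ggserr.log entries shown in this article.
SAMPLE_LOG = [
    "2011-11-08 20:11:09  INFO   OGG-00987  Oracle GoldenGate Command "
    "Interpreter for Oracle: GGSCI command (oracle): start manager.",
    "2011-11-09 21:00:32  ERROR  OGG-01224  Oracle GoldenGate Capture for "
    "Oracle, ext1.prm:  TCP/IP error 113 (No route to host).",
    "2011-11-09 21:00:33  ERROR  OGG-01668  Oracle GoldenGate Capture for "
    "Oracle, ext1.prm:  PROCESS ABENDING.",
]


def filter_events(lines, severity="ERROR"):
    """Return (timestamp, code, message) for events of the given severity."""
    events = []
    for line in lines:
        match = EVENT.match(line)
        if match and match.group(2) == severity:
            events.append((match.group(1), match.group(3), match.group(4)))
    return events


for timestamp, code, message in filter_events(SAMPLE_LOG):
    print(timestamp, code, message)
```

Lines that do not start with a timestamp, such as wrapped continuation lines, are skipped by this sketch.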

Because the error log will continue to grow as you use Oracle GoldenGate, consider archiving and deleting the oldest entries in the file.

NOTE:

The Collector process might stop reporting to the log on UNIX systems after the log has been cleaned up. To get reporting started again, restart the Collector process after the cleanup.

3. Using process reports

A process report lets you view the following:

(1)    parameters in use

(2)    table and column mapping

(3)    database information

(4)    runtime messages and errors

(5)    runtime statistics for the number of operations processed

Every Extract, Replicat, and Manager process generates a report file at the end of each run. The report can help you diagnose problems that occurred during the run, such as invalid mapping syntax, SQL errors, and connection errors.

--Every Extract, Replicat, and Manager process generates a report file when its run ends. The file shows what happened to the process while it was running.

3.1 To view a process report

(1)    standard shell command for viewing a text file

(2)    Oracle GoldenGate Director

(3)    VIEW REPORT command in GGSCI

VIEW REPORT {<group> | <file name> | MGR}

Where:

(1)    <group> shows an Extract or Replicat report that has the default name, which is the name of the associated group.

(2)    <file name> shows any Extract or Replicat report file that matches a given path name. Must be used if a non-default report name was assigned with the REPORT option of the ADD EXTRACT or ADD REPLICAT command when the group was created.

(3)    MGR shows the Manager process report.

Report names are in upper case if the operating system is case-sensitive. By default, reports have a file extension of .rpt, for example EXTORA.rpt. The default location is the dirrpt sub-directory of the Oracle GoldenGate directory.

--If the operating system is case-sensitive, report names are in upper case. By default, report files have the extension .rpt and are stored in the dirrpt directory under the GoldenGate installation directory.

GGSCI (gg1) 30> view report ext1

***********************************************************************

                 Oracle GoldenGate Capture for Oracle

       Version 11.1.1.1 OGGCORE_11.1.1_PLATFORMS_110421.2040

  Linux, x64, 64bit (optimized), Oracle 11g on Apr 30 2011 18:52:51

Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.

                    Starting at 2011-11-18 13:30:22

Operating System Version:

Linux

Version #1 SMP Tue Aug 18 15:59:52 EDT 2009, Release 2.6.18-164.el5xen

Node: gg1

Machine: x86_64

                         soft limit   hard limit

Address Space Size   :   unlimited    unlimited

Heap Size            :   unlimited    unlimited

File Size            :   unlimited    unlimited

CPU Time             :    unlimited   unlimited

…..

3.2 To determine the name and location of a process report

Use the INFO command in GGSCI.

INFO <group>, DETAIL

3.3 To view information if a process abends without a report

Run the process from the command shell of the operating system (not GGSCI) to send the information to the terminal.

--If a process abends without producing a report, the following syntax can be used to see what the process reports.

Run it from the operating system shell:

<process> paramfile <path name>.prm

(1)    <process> is either Extract or Replicat.

(2)    paramfile <path name>.prm is the fully qualified name of the parameter file.

gg1:/u01/ggate> extract paramfile /u01/ggate/dirdat/ext1.prm

Source Context :

 SourceModule            : [ggstd.util.file]

 SourceID                :[/scratch/sganti/view_storage/sganti_core_lin64/oggcore/OpenSys/src/gglib/ggstd/fileutl.c]

 SourceFunction          :[ggOpenFile]

 SourceLine              : [681]

 ThreadBacktrace         : [8] elements

                         :[extract(CMessageContext::AddThreadContext()+0x26) [0x66a416]]

                          :[extract(CMessageFactory::CreateMessage(CSourceContext*, unsigned int,...)+0x7b2) [0x660ee2]]

                          :[extract(_MSG_ERR_FILE_OPEN_ERROR(CSourceContext*, char const*,CMessageFactory::MessageDisposition)+0x92) [0x633952]]

                          :[extract(ggOpenFile(char const*, char const*)+0x7e) [0x58851e]]

                          : [extract[0x512f63]]

                          : [extract(main+0x1a8) [0x5254a8]]

                          :[/lib64/libc.so.6(__libc_start_main+0xf4) [0x34fa41d994]]

                          :[extract(__gxx_personality_v0+0x1f2) [0x4f2bda]]

2011-11-18 17:23:31  ERROR  OGG-01091  Unable to open file "/u01/ggate/dirdat/ext1.prm" (error 2, No such file or directory).

2011-11-18 17:23:31  ERROR  OGG-01668  PROCESS ABENDING.

3.4 Scheduling runtime statistics in the process report

By default, runtime statistics are written to the report once, at the end of each run. For long or continuous runs, you can use optional parameters to view these statistics on a regular basis, without waiting for the end of the run.

--By default, runtime statistics are written to the report only when the process ends. For a long-running process, optional parameters make the statistics available on a schedule instead of waiting for the process to stop.

3.4.1 To set a schedule for reporting runtime statistics

Use the REPORT parameter in the Extract or Replicat parameter file to specify a day and time to generate runtime statistics in the report.

       --With the REPORT parameter set in the Extract or Replicat parameter file, runtime statistics are written to the report at the scheduled day and time.

3.4.2 To send runtime statistics to the report on demand

Use the SEND EXTRACT or SEND REPLICAT command with the REPORT option to view current runtime statistics when needed.

--Use send extract or send replicat with the report option to get the current runtime statistics of a process.

GGSCI (gg1) 35> send ext1 report

Sending REPORT request to EXTRACT EXT1 ...

Request processed.

3.5 Viewing record counts in the process report

Use the REPORTCOUNT parameter to report a count of transaction records that Extract or Replicat processed since startup. Each transaction record represents a logical database operation that was performed within a transaction that was captured by Oracle GoldenGate. The record count is printed to the report file and to the screen.

--The REPORTCOUNT parameter reports how many transaction records a process has handled since it started. Each transaction record is a logical database operation captured by GoldenGate.

3.6 Managing process reports

Once created, a report file must remain in its original location for Oracle GoldenGate to operate properly after processing has started.

Whenever a process starts, Oracle GoldenGate creates a new report file and ages the previous one by appending a sequence number to the name. The numbers increment from 0 (the previous one) to 9 (the oldest).

No process ever has more than ten aged reports and one active report. After the tenth aged report, the oldest is deleted when a new report is created. Set up an archiving schedule for aged report files in case they are needed to resolve a service request.
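The aging scheme can be pictured as a bounded list in which position 0 is the most recently aged report and position 9 the oldest. The simulation below only illustrates the rule as described; it does not touch any files, and the labels run1, run2 and so on are placeholders rather than real report names.

```python
def start_process(active, aged, new_report, max_aged=10):
    """Model one process start.

    The active report (if any) becomes aged report 0, the other aged
    reports shift up by one, and anything past max_aged is dropped.
    Returns the new (active, aged) pair.
    """
    if active is not None:
        aged = [active] + aged
    return new_report, aged[:max_aged]


active, aged = None, []
for run in range(1, 13):  # twelve consecutive starts
    active, aged = start_process(active, aged, "run%d" % run)

print(active)     # run12
print(len(aged))  # 10
print(aged[0])    # run11
print(aged[-1])   # run2
```

After twelve starts the report from the first run has already been deleted, which is why aged reports should be archived on a schedule if they may be needed later.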

3.6.1 To prevent an Extract or Replicat report file from becoming too large

Use the REPORTROLLOVER parameter to force report files to age on a regular schedule, instead of when a process starts. For long or continuous runs, setting an aging schedule controls the size of the active report file and provides a more predictable set of archives that can be included in your archiving routine.

3.6.2 To prevent SQL errors from filling up the Replicat report

Use the WARNRATE parameter to set a threshold for the number of SQL errors that can be tolerated on any target table before being reported to the process report and to the error log. The errors are reported as a warning. If your environment can tolerate a large number of these errors, increasing WARNRATE helps to minimize the size of those files.

4. Using the discard file

Use a discard file to capture information about Oracle GoldenGate operations that failed. This information can help you to resolve data errors, such as those that involve invalid column mapping.

--A discard file records the GoldenGate operations that failed.

A discard file contains the following information:

(1)    The database error message

(2)    The sequence number of the data source or trail file

(3)    The relative byte address of the record in the data source or trail file

(4)    The details of the discarded operation, such as column values of a DML statement or the text of a DDL statement.

A discard file can be used for Extract or Replicat, but it is most useful for Replicat to log operations that could not be reconstructed or applied.

--A discard file can be used with both Extract and Replicat, but in most cases it is used with Replicat.

4.1 To use a discard file

Include the DISCARDFILE parameter in the Extract or Replicat parameter file. You must supply a name for the file. The parameter has options that control the maximum file size, after which the process abends, and whether new content overwrites or appends to existing content.

       --Both Extract and Replicat can include the DISCARDFILE parameter. When it is used, a file name must be given. Its options set the maximum file size (the process abends once that size is exceeded) and whether new content overwrites or is appended to an existing discard file.

DISCARDFILE <file name> [, APPEND | PURGE] [, MAXBYTES <n> | MEGABYTES <n>]

NOTE:

To prevent the need to perform manual maintenance of discard files, use either the PURGE or APPEND option. Otherwise, you must specify a different discard file name before starting each process run, because Oracle GoldenGate will not write to an existing discard file.

       --To avoid maintaining discard files by hand, use the PURGE or APPEND option so that the process starts normally. Otherwise a new discard file name has to be specified before each start, because GoldenGate will not write to an existing discard file.
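The rule in the note can be summarized as a lookup on two inputs: whether the discard file already exists, and which option was given. This is a sketch of the rule as described above, not GoldenGate's own logic; `discard_file_action` and its return labels are hypothetical.

```python
def discard_file_action(file_exists, option=None):
    """Describe what happens to the discard file at process start.

    option is 'APPEND', 'PURGE', or None when neither was specified.
    """
    if not file_exists:
        return "create"
    if option == "APPEND":
        return "append"     # new content is added after the existing content
    if option == "PURGE":
        return "overwrite"  # existing content is replaced
    return "refuse"         # an existing file is not written to


print(discard_file_action(True, "APPEND"))  # append
print(discard_file_action(True))            # refuse
```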

4.2 To view a discard file

Use either of the following:

(1)    Standard shell command to view the file by name

(2)    VIEW REPORT command in GGSCI, with the discard file name as input

VIEW REPORT <file name>

GGSCI (gg2) 4> view params rep1

replicat rep1

ASSUMETARGETDEFS

userid ggate@gg2,password ggate

discardfile /u01/ggate/dirdat/rep1_discard.txt, append, megabytes 10

--HANDLECOLLISIONS

ddl include all

ddlerror default ignore retryop

map dave.pdba, target dave.pdba;

GGSCI (gg2) 5> view report  /u01/ggate/dirdat/rep1_discard.txt

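Oracle GoldenGate Delivery for Oracle process started, group REP1 discard file opened: 2011-11-08 20:51:55

Oracle GoldenGate Delivery for Oracle process started, group REP1 discard file opened: 2011-11-09 10:39:47

Oracle GoldenGate Delivery for Oracle process started, group REP1 discard file opened: 2011-11-16 11:23:44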

4.3 To manage discard files

Use the DISCARDROLLOVER parameter to set a schedule for aging discard files. For long or continuous runs, setting an aging schedule prevents the discard file from filling up and causing the process to abend, and it provides a predictable set of archives that can be included in your archiving routine.

DISCARDROLLOVER {AT <hh:mi> | ON <day of week> | AT <hh:mi> ON <day of week>}
