
Oracle 10g + ASM + RAC Cluster (Linux)

Building an ASM-based RAC cluster on Linux

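Key points of a RAC cluster:

    1. Shared storage.

    2. The nodes need internal communication to coordinate the cluster, so every node must provide both a public network and a private (interconnect) network.

    3. CRS cluster software: Clusterware coordinates the nodes.

    4. Oracle Cluster Registry (OCR): holds the cluster registration and is stored on the shared disk.

    5. Voting disk: arbitrates between the nodes to decide which of them keeps control; stored on the shared disk.

    6. Virtual IP: the address clients connect to. It is managed by the cluster software and becomes reachable once the cluster is ready.

    Ways to access shared storage (storage systems):

    1. Cluster File System (CFS)

    2. Automatic Storage Management (ASM)

    3. Network File System (NFS)

    4. Raw devices (RAW)

    Single-node file systems such as FAT32, NTFS and ext3 cannot be used as shared storage.

    This guide uses the following storage layout: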

    Item                  Storage system      Location
    Clusterware software  Local file system   Local disk
    Voting disk           RAW                 Shared disk
    OCR                   RAW                 Shared disk
    Database software     Local file system   Local disk
    Database              ASM                 Shared disk

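Environment: two virtual machines (rac1 and rac2), each with two network interfaces and 2 GB of memory, plus one 30 GB shared disk.

Pre-installation preparation:

I. Network and host names (every node)

 1. Set the IP address of each network interface: vi /etc/sysconfig/network-scripts/ifcfg-eth0 (and ifcfg-eth1)

 2. Edit the hosts file: vi /etc/hosts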

#rac1

192.168.56.10  rac1

10.10.10.10   rac1priv

192.168.56.211  rac1vip

#rac2

192.168.56.11  rac2

10.10.10.11   rac2priv

192.168.56.212  rac2vip

 3. Set the host name: vi /etc/sysconfig/network

 4. After the changes, restart the network service: service network restart
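
Since every later step depends on the six names resolving on both nodes, a quick check of the hosts file can save some debugging. The following is only a minimal sketch; the function name missing_hosts is my own and not part of any Oracle tooling.

```shell
#!/bin/sh
# Sketch: report which expected host names are missing from a hosts file.
# Usage: missing_hosts FILE NAME...   (prints each missing name, one per line)
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the IP address.
        if ! sed 's/#.*//' "$file" | awk -v n="$name" '
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }'; then
            echo "$name"
        fi
    done
}

# Example: missing_hosts /etc/hosts rac1 rac1priv rac1vip rac2 rac2priv rac2vip
```

No output means every name was found. The comparison is exact, so rac1 does not match rac1priv.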

II. Disable unneeded services (every node)

chkconfig  autofs off

chkconfig  acpid off

chkconfig  sendmail off

chkconfig  cups-config-daemon off

chkconfig  cups off

chkconfig  xfs off

chkconfig  lm_sensors off

chkconfig  gpm off

chkconfig  openibd off

chkconfig  pcmcia off

chkconfig   cpuspeed off

chkconfig   nfslock off

chkconfig   ip6tables off

chkconfig   rpcidmapd off

chkconfig   apmd off


chkconfig   arptables_jf off

chkconfig   microcode_ctl off

chkconfig   rpcgssd off

chkconfig ntpd off
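
Not every service in this list exists on every install, so the same commands can be run as a loop that skips anything that is not registered. This is a rough sketch only; the CHKCONFIG variable is a hypothetical override I added so the loop can be tried without root.

```shell
#!/bin/sh
# Sketch: switch off a list of services, skipping any that are not registered.
# CHKCONFIG is a hypothetical override so the loop can be tried without root.
CHKCONFIG=${CHKCONFIG:-chkconfig}

disable_services() {
    for svc in "$@"; do
        # "chkconfig --list NAME" fails when the service is unknown.
        if "$CHKCONFIG" --list "$svc" >/dev/null 2>&1; then
            "$CHKCONFIG" "$svc" off && echo "disabled: $svc"
        else
            echo "skipped (not installed): $svc"
        fi
    done
}

# Example: disable_services autofs acpid sendmail cups xfs gpm
```

Run it as root on each node and pass whichever service names apply to your system.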

III. Install the system packages required by Oracle (every node; I am not certain this list is complete)

Install the packages Oracle depends on.

Mount the installation DVD:

# mount /dev/cdrom /mnt

mount: block device /dev/cdrom is write-protected, mounting read-only

Point the yum repository at the mounted media:

# vi /etc/yum.repos.d/rhel-debuginfo.repo

name=Red Hat Enterprise Linux $releasever - $basearch - Debug

baseurl=file:///mnt/Server

enabled=1

gpgcheck=0

Then refresh the yum cache:

# yum clean all

Install the packages:

# yum install -y lib*

yum install -y  binutils-* libXp*  compat-libstdc++-33-* elfutils-libelf-* elfutils-libelf-devel-* gcc-* gcc-c++-* glibc-* glibc-common-* glibc-devel-* glibc-headers-* ksh-* libaio-* libgcc-* libstdc++-*  make-* sysstat-* unixODBC-*  unixODBC-devel-*

mount /dev/cdrom /mnt

cd /mnt/Server

rpm -Uvh compat-db-4*

rpm -Uvh libaio-0*

rpm -Uvh compat-libstdc++-33-3*

rpm -Uvh compat-gcc-34-3*

rpm -Uvh compat-gcc-34-c++-3*

rpm -Uvh libXp-1*

rpm -Uvh openmotif-2*

rpm -Uvh gcc-4*

rpm -Uvh glibc-2.5-12.i686.rpm

IV. Create the oracle user and the dba group (on every RAC node)

groupadd -g 1100 dba

useradd -u 1000 -g dba oracle

passwd oracle

V. Configure SSH user equivalence (run on every machine)

su - oracle

/usr/bin/ssh-keygen -t rsa

/usr/bin/ssh-keygen -t dsa

On the second node:

cd .ssh

scp id_rsa.pub rac1:/home/oracle/.ssh/id_rsa.pub2

scp id_dsa.pub rac1:/home/oracle/.ssh/id_dsa.pub2

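On the first node:

cd .ssh

cat id_dsa.pub id_dsa.pub2 id_rsa.pub id_rsa.pub2 > authorized_keys

chmod 644 authorized_keys

scp authorized_keys rac2:/home/oracle/.ssh

Note that the first time you use ssh to reach a remote host, its RSA key is unknown and you are asked to confirm the connection. SSH then records the host's RSA key and does not prompt again on later connections.

On every machine, log in as the oracle user and run: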

ssh rac1 date

ssh rac1priv date

ssh rac2 date

ssh rac2priv date
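
Once the host keys have been accepted by hand, the same four checks can be repeated non-interactively. A small sketch, assuming OpenSSH; SSH_CMD is a hypothetical override I added for testing and is not something ssh itself reads.

```shell
#!/bin/sh
# Sketch: confirm passwordless SSH to every node name from the current host.
# SSH_CMD is a hypothetical override for testing; by default the real ssh is used.
SSH_CMD=${SSH_CMD:-ssh}

check_ssh_trust() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (as oracle, on each node): check_ssh_trust rac1 rac1priv rac2 rac2priv
```

Any FAILED line means the installer would be prompted for a password or a host-key confirmation on that name, which it cannot answer.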

VI. Adjust system parameters

1. Kernel parameters: vi /etc/sysctl.conf (as root)

kernel.core_uses_pid = 1

fs.file-max = 65536

fs.aio-max-nr = 1048576

net.ipv4.ip_local_port_range = 1024 65000

net.core.rmem_default = 1048576

net.core.rmem_max = 1048576

net.core.wmem_default = 262144

net.core.wmem_max = 262144

kernel.shmmni = 4096

kernel.sem = 500 64000 100 128

net.ipv4.tcp_tw_reuse = 1

net.ipv4.tcp_tw_recycle = 1

Run sysctl -p to apply the settings.
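
To confirm that the running kernel really picked up the values from the file, the two can be compared key by key. This is a minimal sketch; conf_value and check_sysctl are names I made up, and only sysctl -n is a real command here.

```shell
#!/bin/sh
# Sketch: print the value assigned to KEY in a sysctl.conf-style file.
# Usage: conf_value FILE KEY
conf_value() {
    awk -F= -v key="$2" '
        /^[[:space:]]*#/ { next }                  # skip comment lines
        {
            k = $1
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", k)
            if (k == key) {
                v = $0
                sub(/^[^=]*=/, "", v)              # keep everything after the first "="
                gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)
                val = v
            }
        }
        END { if (val != "") print val }' "$1"
}

# Compare the file with the running kernel (sysctl -n prints the live value).
check_sysctl() {
    file=$1
    shift
    for key in "$@"; do
        want=$(conf_value "$file" "$key" | tr '\t' ' ' | tr -s ' ')
        have=$(sysctl -n "$key" 2>/dev/null | tr '\t' ' ' | tr -s ' ')
        [ "$want" = "$have" ] || echo "mismatch: $key (file: $want, kernel: $have)"
    done
}

# Example: check_sysctl /etc/sysctl.conf kernel.sem kernel.shmmni fs.file-max
```

check_sysctl prints nothing when every listed key matches.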

2. Edit /etc/profile (vi /etc/profile) and add the following:

if [ $USER = "oracle" ]; then

if [ $SHELL = "/bin/ksh" ]; then

ulimit -p 16384

ulimit -n 65536

else

ulimit -u 16384 -n 65536

fi

fi

Afterwards, log in as the oracle user and run ulimit -a to verify.

3. Append the following to /etc/csh.login (vi /etc/csh.login):

if ( $USER == "oracle" ) then

limit maxproc 16384

limit descriptors 65536

umask 022

endif

4. Set the per-user limits: vi /etc/security/limits.conf

oracle  soft     nofile 65536

oracle  hard     nofile 65536

oracle  soft    nproc   10240

oracle  hard    nproc   16384
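
A soft limit must never be larger than the matching hard limit, and a mistyped digit in this file is easy to miss. The sketch below scans a limits.conf-style file for that mistake; check_limits is a hypothetical helper of mine, not a PAM tool.

```shell
#!/bin/sh
# Sketch: flag entries in a limits.conf-style file whose soft limit exceeds the hard limit.
# Handles numeric limits only; keywords such as "unlimited" are ignored.
# Usage: check_limits FILE USER
check_limits() {
    awk -v user="$2" '
        $1 == user && $4 ~ /^[0-9]+$/ {
            if ($2 == "soft") soft[$3] = $4 + 0
            if ($2 == "hard") hard[$3] = $4 + 0
        }
        END {
            bad = 0
            for (item in soft)
                if (item in hard && soft[item] > hard[item]) {
                    printf "%s: soft %d exceeds hard %d\n", item, soft[item], hard[item]
                    bad = 1
                }
            exit bad
        }' "$1"
}

# Example: check_limits /etc/security/limits.conf oracle
```

The function exits non-zero when it finds a problem, so it can also be used in a pre-install check script.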

VII. Set the oracle user's environment variables: vi /home/oracle/.bash_profile

export ORACLE_BASE=/oracle/app/oracle

export ORACLE_HOME=$ORACLE_BASE/product/10.2/db_1

export ORA_CRS_HOME=$ORACLE_BASE/product/10.2/crs

export ORACLE_SID=test1    # use test2 on the second node

export PATH=$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:$PATH
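
ORACLE_SID is the only value that differs between the nodes (test1 on rac1, test2 on rac2). If you would rather keep one identical profile on both machines, the SID can be derived from the host name. A small sketch under that assumption; sid_for_host is my own helper name.

```shell
#!/bin/sh
# Sketch: choose the instance SID from the node name so that both nodes can
# share one .bash_profile. The rac1/test1 and rac2/test2 pairing follows this article.
sid_for_host() {
    case $1 in
        rac1) echo test1 ;;
        rac2) echo test2 ;;
        *)    echo "unknown node: $1" >&2; return 1 ;;
    esac
}

# In .bash_profile (hostname -s prints the short host name):
# ORACLE_SID=$(sid_for_host "$(hostname -s)") && export ORACLE_SID
```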

VIII. Create the /oracle directory and give ownership to the oracle user

mkdir /oracle

chown -R oracle:dba /oracle

chmod -R 755 /oracle

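IX. Configure the hangcheck timer (optional; the alternative is to set the clock on the node you install from slightly behind the other nodes)

vi /etc/rc.local

Add:

modprobe hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

The line in rc.local only takes effect at the next boot, so either reboot or load the module by hand.

To load the module immediately, run:

modprobe -v hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

Check that it loaded; output like the following means success: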

lsmod | grep hangcheck_timer

hangcheck_timer         8153  0
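
The same check can be scripted by reading /proc/modules directly, which is handy in a pre-install script. A minimal sketch; the optional second argument is a hypothetical path override that I added purely for testing.

```shell
#!/bin/sh
# Sketch: test whether a kernel module is loaded by reading a /proc/modules-style file.
# The second argument is a hypothetical override of the file path, used for testing.
module_loaded() {
    # Module names are listed with underscores, e.g. hangcheck_timer.
    awk -v m="$1" '$1 == m { found = 1 } END { exit found ? 0 : 1 }' "${2:-/proc/modules}"
}

# Example:
# module_loaded hangcheck_timer && echo "hangcheck-timer is loaded"
```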

------------------- Changing the time (set node 1 behind node 2)

date -s 13:00:00    # example: set the clock to 13:00:00

X. Partition the shared disk

1. Inspect the disks: fdisk -l

Disk /dev/sda: 32.2 GB, 32212254720 bytes

255 heads, 63 sectors/track, 3916 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux

/dev/sda2              14         535     4192965   82  Linux swap / Solaris

/dev/sda3             536        3916    27157882+  83  Linux

Disk /dev/sdb: 32.2 GB, 32212254720 bytes

255 heads, 63 sectors/track, 3916 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

2. Partition /dev/sdb

fdisk /dev/sdb

 Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel

Building a new DOS disklabel. Changes will remain in memory only,

until you decide to write them. After that, of course, the previous

content won't be recoverable.

The number of cylinders for this disk is set to 3916.

There is nothing wrong with that, but this is larger than 1024,

and could in certain setups cause problems with:

1) software that runs at boot time (e.g., old versions of LILO)

2) booting and partitioning software from other OSs

   (e.g., DOS FDISK, OS/2 FDISK)

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Partition number (1-4): 1

First cylinder (1-3916, default 1): 

Using default value 1

Last cylinder or +size or +sizeM or +sizeK (1-3916, default 3916): +400M   

Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Partition number (1-4): 2

First cylinder (51-3916, default 51): 

Using default value 51

Last cylinder or +size or +sizeM or +sizeK (51-3916, default 3916): +400M

Command (m for help): n

Command action

   e   extended

   p   primary partition (1-4)

p

Partition number (1-4): 3

First cylinder (101-3916, default 101): 

Using default value 101

Last cylinder or +size or +sizeM or +sizeK (101-3916, default 3916): 

Using default value 3916

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table.

Syncing disks.

Map the partitions to raw devices on both nodes.

Node 1:

cd  /etc/udev/rules.d

vi 60-raw.rules

Add:

ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"

ACTION=="add", KERNEL=="sdb2", RUN+="/bin/raw /dev/raw/raw2 %N"

ACTION=="add", KERNEL=="sdb3", RUN+="/bin/raw /dev/raw/raw3 %N"

ACTION=="add", KERNEL=="raw[1-3]",OWNER="oracle",GROUP="dba",MODE="660"

Save the file, then apply the rules with start_udev.

Verify with raw -qa:

# raw -qa

/dev/raw/raw1: bound to major 8, minor 17

/dev/raw/raw2: bound to major 8, minor 18

/dev/raw/raw3: bound to major 8, minor 19

Node 2:

cd  /etc/udev/rules.d

vi 60-raw.rules

Add:

ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"

ACTION=="add", KERNEL=="sdb2", RUN+="/bin/raw /dev/raw/raw2 %N"

ACTION=="add", KERNEL=="sdb3", RUN+="/bin/raw /dev/raw/raw3 %N"

ACTION=="add", KERNEL=="raw[1-3]",OWNER="oracle",GROUP="dba",MODE="660"

partprobe     # re-read the partition table

start_udev    # apply the udev rules

raw -qa       # list the bindings

# raw -qa

/dev/raw/raw1: bound to major 8, minor 17

/dev/raw/raw2: bound to major 8, minor 18

/dev/raw/raw3: bound to major 8, minor 19
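
Both nodes must show the same three bindings, so it is worth checking the raw -qa output mechanically rather than by eye. The sketch below reads that output on stdin; check_raw_bindings and its NAME:MAJOR:MINOR argument format are my own invention.

```shell
#!/bin/sh
# Sketch: read `raw -qa` output on stdin and confirm each expected binding.
# Each argument has the form NAME:MAJOR:MINOR, e.g. raw1:8:17.
check_raw_bindings() {
    output=$(cat)
    status=0
    for spec in "$@"; do
        name=${spec%%:*}
        rest=${spec#*:}
        major=${rest%%:*}
        minor=${rest#*:}
        # Fields of a line: /dev/raw/rawN: bound to major M, minor N
        if ! printf '%s\n' "$output" | awk -v dev="/dev/raw/$name:" -v maj="$major," -v min="$minor" '
                $1 == dev && $5 == maj && $7 == min { found = 1 }
                END { exit found ? 0 : 1 }'; then
            echo "not bound as expected: $name (major $major, minor $minor)"
            status=1
        fi
    done
    return "$status"
}

# Example: raw -qa | check_raw_bindings raw1:8:17 raw2:8:18 raw3:8:19
```

Matching on fields rather than on the whole line keeps the check tolerant of extra spaces or tabs in the output.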

XI. Upload the software to /oracle

      Unpack: gunzip *.gz

            cpio -idcmv < *.cpio

            unzip *.zip

chown -R oracle:dba /oracle

chmod -R 755 /oracle

Installation:

I. Install the clusterware

Run the installer as the oracle user.

First run the rootpre.sh script as root on both nodes:

# cd /oracle/clusterware/

# cd rootpre/

# ls

rootpre.sh

# ./rootpre.sh

No OraCM running 

# scp rootpre.sh rac2:/oracle

The authenticity of host 'rac2 (192.168.56.30)' can't be established.

RSA key fingerprint is c6:99:59:37:f5:e5:0d:9e:c6:72:18:ab:1c:2a:46:19.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'rac2,192.168.56.30' (RSA) to the list of known hosts.

root@rac2's password: 

rootpre.sh                                                            100% 2981     2.9KB/s   00:00    

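When it has finished, press Y at the installer prompt.

The installer window then opens and the installation can begin.

Ignore the warnings.

Run the following scripts as root on each node. Finish a script on one node before running it on the other; never run them on both nodes at the same time.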

/oracle/app/oracle/oraInventory/orainstRoot.sh

/oracle/app/oracle/product/10.2/crs/root.sh

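The second script takes a long time, so be patient.

On the second node, the second script reports an error.

This is expected: it is a known bug when Oracle 10g is installed on Red Hat 5.

Solutions: 1. Upgrade the clusterware directly

           2. Edit vipca

We use the second method: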

cd /oracle/app/oracle/product/10.2/crs/bin

vi vipca

Add the line unset LD_ASSUME_KERNEL to it.

(I got one line wrong when editing this file, which cost me a reinstall.)

The correct procedure:

As the oracle user, run oifcfg iflist:

$ oifcfg iflist

eth0  192.168.56.0

eth1  10.10.10.0

Then, as root, run:

./oifcfg setif -global eth0/192.168.56.0:public 

./oifcfg setif -global eth1/10.10.10.0:cluster_interconnect

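Then run ./vipca

It reported an error.

Clicking Next still produced an error.

My guess is that the VIP service had not started.

At this point you can either start over or upgrade the clusterware; I chose to start over.

Removing the cluster

1. As root, cd $ORA_CRS_HOME/install

     and run ./rootdeinstall.sh

          ./rootdelete.sh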

2. Stop the Nodeapps on all nodes:

srvctl stop nodeapps -n <node name>

3. rm -f /etc/init.d/init.cssd 

rm -f /etc/init.d/init.crs 

rm -f /etc/init.d/init.crsd 

rm -f /etc/init.d/init.evmd 

rm -f /etc/rc2.d/K96init.crs

rm -f /etc/rc2.d/S96init.crs

rm -f /etc/rc3.d/K96init.crs

rm -f /etc/rc3.d/S96init.crs

rm -f /etc/rc5.d/K96init.crs

rm -f /etc/rc5.d/S96init.crs

        rm -Rf /etc/oracle/scls_scr

rm -f /etc/inittab.crs 

cp /etc/inittab.orig /etc/inittab

4.rm -rf <CRS Install Location>/*

5. dd if=/dev/zero of=/dev/raw/raw1 bs=8192 count=2560

   dd if=/dev/zero of=/dev/raw/raw2 bs=8192 count=12800

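6. Reboot once everything has been removed.

------ Second installation attempt. This time only one script was requested at the end.

The same error appeared again.

Edit vipca.

 This time it was edited correctly.

Earlier I had put the line in the wrong place, which is why it kept failing. Be careful with this edit.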
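
Because the position of the added line is what matters, here is a sketch of the placement as I understand the commonly reported workaround: the unset goes right after the fi that closes the block exporting LD_ASSUME_KERNEL. I am assuming that vipca sets the variable inside an if ... fi block; check your own copy before relying on this, and note that add_unset_ld_assume_kernel is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: write a copy of a script with "unset LD_ASSUME_KERNEL" added right
# after the "fi" that closes the block exporting LD_ASSUME_KERNEL.
# Assumption: the script sets the variable inside an if ... fi block.
# Usage: add_unset_ld_assume_kernel INPUT OUTPUT
add_unset_ld_assume_kernel() {
    awk '
        { print }
        /export[[:space:]]+LD_ASSUME_KERNEL/ { pending = 1; next }
        pending && /^[[:space:]]*fi[[:space:]]*$/ {
            print "unset LD_ASSUME_KERNEL"
            pending = 0
        }
    ' "$1" > "$2"
}

# Example (keep the original as a backup):
# cp vipca vipca.orig && add_unset_ld_assume_kernel vipca.orig vipca
```

Writing to a copy and keeping vipca.orig means a wrong edit can be undone without reinstalling.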

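Once vipca completes, click OK.

Install the Oracle database software

When prompted, run /oracle/app/oracle/product/10.2/db_1/root.sh as root.

Create the listener as the oracle user:

netca

Done.

Create the database as the oracle user:

dbca

Installation complete.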

------------------------- Upgrading the cluster database ---------------

1. Upgrade the clusterware first

$ cd Disk1

$ ls

install  patch_note.htm  response  runInstaller  stage

$ ./runInstaller

Run the following two commands on every node:

# /oracle/app/oracle/product/10.2/crs/bin/crsctl stop crs    # stop the cluster services

Stopping resources.

Successfully stopped CRS resources 

Stopping CSSD.

Shutting down CSS daemon.

Shutdown request successfully issued.

# /oracle/app/oracle/product/10.2/crs/install/root102.sh    # patch the clusterware and restart it

Creating pre-patch directory for saving pre-patch clusterware files

Completed patching clusterware files to /oracle/app/oracle/product/10.2/crs

Relinking some shared libraries.

Relinking of patched files is complete.

WARNING: directory '/oracle/app/oracle/product/10.2' is not owned by root

WARNING: directory '/oracle/app/oracle/product' is not owned by root

WARNING: directory '/oracle/app/oracle' is not owned by root

WARNING: directory '/oracle/app' is not owned by root

WARNING: directory '/oracle' is not owned by root

Preparing to recopy patched init and RC scripts.

Recopying init and RC scripts.

Startup will be queued to init within 30 seconds.

Starting up the CRS daemons.

Waiting for the patched CRS daemons to start.

  This may take a while on some systems.

.

10205 patch successfully applied.

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

Successfully deleted 1 values from OCR.

Successfully deleted 1 keys from OCR.

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node <nodenumber>: <nodename> <private interconnect name> <hostname>

node 1: rac1 rac1priv rac1

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.

clscfg -upgrade completed successfully

Creating '/oracle/app/oracle/product/10.2/crs/install/paramfile.crs' with data used for CRS configuration

Setting CRS configuration values in /oracle/app/oracle/product/10.2/crs/install/paramfile.crs

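-- Upgrade the Oracle software

The database patch set and the clusterware patch set ship in the same package, so simply run the installer again.

It reported an error because I had not yet shut down the database and the listener.

The simplest fix is to stop the cluster services as root (on both nodes):

/oracle/app/oracle/product/10.2/crs/bin/crsctl stop crs

When the installer asks, run the script /oracle/app/oracle/product/10.2/db_1/root.sh

--- Upgrade the database

1. Start the cluster services again (on every node)

# cd /etc/init.d

# ./init.crs start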

Startup will be queued to init within 30 seconds.
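
Because startup is only queued, the daemons are not up the moment the command returns. One way to wait is to poll crsctl check crs. This is a rough sketch that assumes 10.2-style output with one "appears healthy" line each for CSS, CRS and EVM; CRSCTL is a hypothetical override of the command path.

```shell
#!/bin/sh
# Sketch: poll `crsctl check crs` until the CSS, CRS and EVM daemons all report healthy.
# Assumptions: 10.2-style output with one "... appears healthy" line per daemon;
# CRSCTL is a hypothetical override of the command path.
CRSCTL=${CRSCTL:-crsctl}

# Usage: wait_for_crs [ATTEMPTS] [SECONDS_BETWEEN_ATTEMPTS]
wait_for_crs() {
    tries=${1:-30}
    delay=${2:-10}
    while [ "$tries" -gt 0 ]; do
        healthy=$("$CRSCTL" check crs 2>/dev/null | grep -c 'appears healthy')
        [ "$healthy" -ge 3 ] && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Example: wait_for_crs 30 10 && echo "CRS is up"
```

If crsctl is not on the PATH, set CRSCTL to its full path under the CRS home first.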

2. Check the alert log

cd $ORACLE_BASE/admin/test/bdump

tail -f alert_test2.log

The database fails to start; it has to be started in UPGRADE mode.

3. Create a pfile

① Look at the existing pfile first

cd $ORACLE_HOME/dbs

  Its content is:

       SPFILE='+DATADG/test/spfiletest.ora'

 ② Start sqlplus

Create a pfile and store it under /home/oracle:

SQL> create pfile='/home/oracle/a.txt' from SPFILE='+DATADG/test/spfiletest.ora';

③ Edit the newly created pfile (a.txt)

Comment out the cluster_database parameter.
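
If you prefer not to edit the pfile by hand, the parameter can be commented out with sed. A minimal sketch; disable_cluster_database is a hypothetical helper name, and it writes to a second file instead of editing in place.

```shell
#!/bin/sh
# Sketch: copy a pfile, commenting out the cluster_database setting.
# cluster_database_instances is left untouched because the pattern requires
# "=" (optionally preceded by spaces) directly after the parameter name.
# Usage: disable_cluster_database INPUT OUTPUT
disable_cluster_database() {
    sed 's/^[^#]*cluster_database[[:space:]]*=/#&/' "$1" > "$2"
}

# Example (keep the generated pfile as a backup):
# cp /home/oracle/a.txt /home/oracle/a.txt.orig
# disable_cluster_database /home/oracle/a.txt.orig /home/oracle/a.txt
```

Lines that are already commented out are not changed a second time.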

④ Start the database in UPGRADE mode

SQL> startup upgrade pfile='/home/oracle/a.txt';

ORACLE instance started.

Total System Global Area  599785472 bytes

Fixed Size    2098112 bytes

Variable Size  163580992 bytes

Database Buffers  427819008 bytes

Redo Buffers    6287360 bytes

Database mounted.

Database opened.

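SQL> spool /home/oracle/upgrd.log      (start the upgrade log)

SQL> @?/rdbms/admin/catupgrd.sql      (run the upgrade)

SQL> @?/rdbms/admin/utlrp.sql      (recompile invalid objects)

SQL> spool off

After the upgrade completes, be sure to shut the database down cleanly:

shutdown immediate

Then start it again with startup.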