Postgres2015全國使用者大會将于11月20至21日在北京麗亭華苑酒店召開。本次大會嘉賓陣容強大,國内頂級PostgreSQL資料庫專家将悉數到場,并特邀歐洲、俄羅斯、日本、美國等國家和地區的資料庫方面專家助陣:
- Postgres-XC項目的發起人鈴木市一(SUZUKI Koichi)
- Postgres-XL的項目發起人Mason Sharp
- pgpool的作者石井達夫(Tatsuo Ishii)
- PG-Strom的作者海外浩平(Kaigai Kohei)
- Greenplum研發總監姚延棟
- 周正中(德哥), PostgreSQL中國使用者會創始人之一
- 汪洋,平安科技資料庫技術部經理
- ……
|
CitusDB本身支援表的分布式存儲, 同時利用PostgreSQL 9.1的FDW功能, 還可以與Hadoop結合使用. 目前CitusDB隻能與Hadoop 1.x配合使用. Hadoop 2.x支援多NameNode的結構暫時還不能配合使用.
【CitusDB + Hadoop 架構圖如下 : 】
目前的版本可以在 http://www.citusdata.com/main-download下載下傳. 下載下傳到的實際上是修改過的PostgreSQL資料庫版本. CitusDB的運作還需要依賴兩個插件, 如下 : 1. 其中FDW用做PostgreSQL和HDFS的接口, 需要另外安裝. ( https://github.com/citusdata/file_fdw) 2. 同時CitusDB還需要從Hadoop的NameNode中同步HDFS的中繼資料到CitusDB的Master節點(PostgreSQL DB). 還需要将DataNode的中繼資料同步到CitusDB的worker節點(PostgreSQL DB). 也需要另外安裝. ( https://github.com/citusdata/hadoop-sync)
【部署建議 : 】 1. hadoop-sync最好與Hadoop NameNode部署在同一主機. 2. CitusDB Master node最好與 Hadoop NameNode部署在同一主機. 3. CitusDB Worker node與Hadoop DataNode一對一部署在同一主機.
【本文主要講一下CitusDB第二個插件hadoop-sync的安裝方法 : 】 1. maven架構
(參考: http://maven.apache.org/download.cgi) 下載下傳maven, 用于安裝hadoop-sync.
# wget http://mirror.bjtu.edu.cn/apache/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz
# tar -zxvf apache-maven-3.0.5-bin.tar.gz
# mv apache-maven-3.0.5 /opt/
2. maven依賴java, javac, 是以還需要安裝java, javac.
# yum install java
# java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.9) (rhel-1.36.1.11.9.el5_9-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
# which java
/usr/bin/java
# 安裝javac.
# yum install java-devel
# javac -version
javac 1.6.0_24
# which javac
/usr/bin/javac
3. 安裝maven 配置環境變量 : # vi ~/.bash_profile
export M2_HOME=/opt/apache-maven-3.0.5
export M2=$M2_HOME/bin
export MAVEN_OPTS="-Xms256m -Xmx512m"
export JAVA_HOME=/usr
export PATH=$M2:$JAVA_HOME/bin:$PATH:.
# . ~/.bash_profile # mvn --version
Apache Maven 3.0.5 (r01de14724cdef164cd33c7c8c2fe155faf9602da; 2013-02-19 21:51:28+0800)
Maven home: /opt/apache-maven-3.0.5
Java version: 1.6.0_24, vendor: Sun Microsystems Inc.
Java home: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.18-274.el5", arch: "amd64", family: "unix"
4. 下載下傳hadoop-sync
# wget --no-check-certificate https://github.com/citusdata/hadoop-sync/archive/master.zip -O master.zip
# unzip master.zip
5. 安裝hadoop-sync 使用maven安裝hadoop-sync時需要連接配接外網, 下載下傳依賴包. 是以首先要確定主機能通外網. 如果不能通外網, 那麼請自行下載下傳依賴包. # cd hadoop-sync-master # ll
total 12
-rw-r--r-- 1 root root 2533 Feb 16 04:56 pom.xml
-rw-r--r-- 1 root root 3047 Feb 16 04:56 README.md
drwxr-xr-x 3 root root 4096 Feb 16 04:56 src
# cat pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.citusdata.sync</groupId>
<artifactId>hadoop-sync</artifactId>
<version>0.1</version>
<packaging>jar</packaging>
<name>hadoop-sync</name>
<url>http://maven.apache.org</url>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<classpathPrefix>lib/</classpathPrefix>
<mainClass>com.citusdata.sync.hdfs.HdfsSynchronizer</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<execution>
<id>copy</id>
<phase>package</phase>
<goals>
<goal>copy-dependencies</goal>
</goals>
<configuration>
<outputDirectory>${project.build.directory}/lib</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
<resources>
<resource><directory>src/main/config</directory></resource>
</resources>
</build>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>commons-configuration</groupId>
<artifactId>commons-configuration</artifactId>
<version>1.6</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.0.4</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>9.1-901.jdbc4</version>
</dependency>
</dependencies>
</project>
安裝hadoop-sync # mvn install
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building hadoop-sync 0.1
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ hadoop-sync ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ hadoop-sync ---
[INFO] Compiling 7 source files to /root/hadoop-sync-master/target/classes
[INFO]
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ hadoop-sync ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /root/hadoop-sync-master/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ hadoop-sync ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.10:test (default-test) @ hadoop-sync ---
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-api/2.0.9/maven-plugin-api-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-api/2.0.9/maven-plugin-api-2.0.9.pom (2 KB at 1.4 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven/2.0.9/maven-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven/2.0.9/maven-2.0.9.pom (19 KB at 16.5 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/8/maven-parent-8.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/8/maven-parent-8.pom (24 KB at 9.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.pom (3 KB at 5.3 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.pom (3 KB at 4.0 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.pom (4 KB at 1.3 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.pom (4 KB at 4.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/16/spice-parent-16.pom
Downloaded: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/16/spice-parent-16.pom (9 KB at 7.3 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/5/forge-parent-5.pom
Downloaded: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/5/forge-parent-5.pom (9 KB at 9.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/2.0.9/maven-artifact-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/2.0.9/maven-artifact-2.0.9.pom (2 KB at 2.8 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-project/2.0.9/maven-project-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-project/2.0.9/maven-project-2.0.9.pom (3 KB at 4.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-settings/2.0.9/maven-settings-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-settings/2.0.9/maven-settings-2.0.9.pom (3 KB at 3.6 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-model/2.0.9/maven-model-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-model/2.0.9/maven-model-2.0.9.pom (4 KB at 5.2 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-profile/2.0.9/maven-profile-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-profile/2.0.9/maven-profile-2.0.9.pom (3 KB at 3.6 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact-manager/2.0.9/maven-artifact-manager-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact-manager/2.0.9/maven-artifact-manager-2.0.9.pom (3 KB at 1.6 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-repository-metadata/2.0.9/maven-repository-metadata-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-repository-metadata/2.0.9/maven-repository-metadata-2.0.9.pom (2 KB at 3.3 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-registry/2.0.9/maven-plugin-registry-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-registry/2.0.9/maven-plugin-registry-2.0.9.pom (2 KB at 3.4 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-core/2.0.9/maven-core-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-core/2.0.9/maven-core-2.0.9.pom (8 KB at 6.8 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-parameter-documenter/2.0.9/maven-plugin-parameter-documenter-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-parameter-documenter/2.0.9/maven-plugin-parameter-documenter-2.0.9.pom (2 KB at 3.4 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.pom (2 KB at 3.1 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting/2.0.9/maven-reporting-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting/2.0.9/maven-reporting-2.0.9.pom (2 KB at 2.6 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-error-diagnostics/2.0.9/maven-error-diagnostics-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-error-diagnostics/2.0.9/maven-error-diagnostics-2.0.9.pom (2 KB at 3.0 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-descriptor/2.0.9/maven-plugin-descriptor-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-descriptor/2.0.9/maven-plugin-descriptor-2.0.9.pom (3 KB at 3.6 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-monitor/2.0.9/maven-monitor-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-monitor/2.0.9/maven-monitor-2.0.9.pom (2 KB at 2.2 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-toolchain/2.0.9/maven-toolchain-2.0.9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-toolchain/2.0.9/maven-toolchain-2.0.9.pom (4 KB at 6.0 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.pom (4 KB at 6.4 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-components/12/maven-shared-components-12.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-components/12/maven-shared-components-12.pom (10 KB at 6.6 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/13/maven-parent-13.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/13/maven-parent-13.pom (23 KB at 13.2 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/apache/6/apache-6.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/apache/6/apache-6.pom (13 KB at 14.9 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-9/plexus-container-default-1.0-alpha-9.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-9/plexus-container-default-1.0-alpha-9.pom (2 KB at 2.1 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.jar
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.jar
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.jar
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.jar
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.jar
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-booter/2.10/surefire-booter-2.10.jar (34 KB at 19.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.jar
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/1.3/maven-common-artifact-filters-1.3.jar (31 KB at 18.0 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.jar (10 KB at 5.2 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/maven-surefire-common/2.10/maven-surefire-common-2.10.jar (60 KB at 13.0 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.10/surefire-api-2.10.jar (158 KB at 14.3 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/2.1/plexus-utils-2.1.jar (220 KB at 13.6 KB/sec)
[INFO] No tests to run.
[INFO] Surefire report directory: /root/hadoop-sync-master/target/surefire-reports
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.pom (2 KB at 2.9 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-providers/2.10/surefire-providers-2.10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-providers/2.10/surefire-providers-2.10.pom (3 KB at 2.1 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.jar
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.jar (26 KB at 11.3 KB/sec)
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Results :
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] --- maven-jar-plugin:2.3.2:jar (default-jar) @ hadoop-sync ---
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.pom (4 KB at 6.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.pom (3 KB at 4.9 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/17/spice-parent-17.pom
Downloaded: http://repo.maven.apache.org/maven2/org/sonatype/spice/spice-parent/17/spice-parent-17.pom (7 KB at 11.8 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/10/forge-parent-10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/sonatype/forge/forge-parent/10/forge-parent-10.pom (14 KB at 8.0 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.pom (4 KB at 3.2 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.pom (2 KB at 3.0 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.19/plexus-components-1.1.19.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.19/plexus-components-1.1.19.pom (3 KB at 4.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus/3.0.1/plexus-3.0.1.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus/3.0.1/plexus-3.0.1.pom (19 KB at 7.3 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.pom
Downloaded: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.pom (10 KB at 8.7 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.jar
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.jar
Downloading: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.jar
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.jar
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.jar
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.4.2/maven-archiver-2.4.2.jar (20 KB at 9.1 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-io/2.0.1/plexus-io-2.0.1.jar (57 KB at 20.5 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/2.0.1/plexus-archiver-2.0.1.jar (176 KB at 19.4 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0/plexus-utils-3.0.jar (221 KB at 14.8 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.1/commons-lang-2.1.jar (203 KB at 12.7 KB/sec)
[INFO] Building jar: /root/hadoop-sync-master/target/hadoop-sync-0.1.jar
[INFO]
[INFO] --- maven-dependency-plugin:2.1:copy-dependencies (copy) @ hadoop-sync ---
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.5.1/plexus-utils-1.5.1.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.5.1/plexus-utils-1.5.1.pom (3 KB at 4.0 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-sink-api/1.0-alpha-10/doxia-sink-api-1.0-alpha-10.pom
Downloaded: http://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-sink-api/1.0-alpha-10/doxia-sink-api-1.0-alpha-10.pom (2 KB at 2.3 KB/sec)
...............................略...........................................
Downloaded: http://repo.maven.apache.org/maven2/plexus/plexus-utils/1.0.2/plexus-utils-1.0.2.jar (157 KB at 22.3 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/velocity/velocity/1.4/velocity-1.4.jar (353 KB at 13.5 KB/sec)
Downloaded: http://repo.maven.apache.org/maven2/velocity/velocity-dep/1.4/velocity-dep-1.4.jar (506 KB at 15.0 KB/sec)
[INFO] Copying ant-1.6.5.jar to /root/hadoop-sync-master/target/lib/ant-1.6.5.jar
[INFO] Copying commons-beanutils-1.7.0.jar to /root/hadoop-sync-master/target/lib/commons-beanutils-1.7.0.jar
[INFO] Copying commons-beanutils-core-1.8.0.jar to /root/hadoop-sync-master/target/lib/commons-beanutils-core-1.8.0.jar
[INFO] Copying commons-cli-1.2.jar to /root/hadoop-sync-master/target/lib/commons-cli-1.2.jar
[INFO] Copying commons-codec-1.4.jar to /root/hadoop-sync-master/target/lib/commons-codec-1.4.jar
[INFO] Copying commons-collections-3.2.1.jar to /root/hadoop-sync-master/target/lib/commons-collections-3.2.1.jar
[INFO] Copying commons-configuration-1.6.jar to /root/hadoop-sync-master/target/lib/commons-configuration-1.6.jar
[INFO] Copying commons-digester-1.8.jar to /root/hadoop-sync-master/target/lib/commons-digester-1.8.jar
[INFO] Copying commons-el-1.0.jar to /root/hadoop-sync-master/target/lib/commons-el-1.0.jar
[INFO] Copying commons-httpclient-3.0.1.jar to /root/hadoop-sync-master/target/lib/commons-httpclient-3.0.1.jar
[INFO] Copying commons-lang-2.4.jar to /root/hadoop-sync-master/target/lib/commons-lang-2.4.jar
[INFO] Copying commons-logging-1.1.1.jar to /root/hadoop-sync-master/target/lib/commons-logging-1.1.1.jar
[INFO] Copying commons-net-1.4.1.jar to /root/hadoop-sync-master/target/lib/commons-net-1.4.1.jar
[INFO] Copying hsqldb-1.8.0.10.jar to /root/hadoop-sync-master/target/lib/hsqldb-1.8.0.10.jar
[INFO] Copying junit-3.8.1.jar to /root/hadoop-sync-master/target/lib/junit-3.8.1.jar
[INFO] Copying log4j-1.2.17.jar to /root/hadoop-sync-master/target/lib/log4j-1.2.17.jar
[INFO] Copying jets3t-0.7.1.jar to /root/hadoop-sync-master/target/lib/jets3t-0.7.1.jar
[INFO] Copying kfs-0.3.jar to /root/hadoop-sync-master/target/lib/kfs-0.3.jar
[INFO] Copying commons-math-2.1.jar to /root/hadoop-sync-master/target/lib/commons-math-2.1.jar
[INFO] Copying hadoop-core-1.0.4.jar to /root/hadoop-sync-master/target/lib/hadoop-core-1.0.4.jar
[INFO] Copying jackson-core-asl-1.0.1.jar to /root/hadoop-sync-master/target/lib/jackson-core-asl-1.0.1.jar
[INFO] Copying jackson-mapper-asl-1.0.1.jar to /root/hadoop-sync-master/target/lib/jackson-mapper-asl-1.0.1.jar
[INFO] Copying core-3.1.1.jar to /root/hadoop-sync-master/target/lib/core-3.1.1.jar
[INFO] Copying jetty-6.1.26.jar to /root/hadoop-sync-master/target/lib/jetty-6.1.26.jar
[INFO] Copying jetty-util-6.1.26.jar to /root/hadoop-sync-master/target/lib/jetty-util-6.1.26.jar
[INFO] Copying jsp-2.1-6.1.14.jar to /root/hadoop-sync-master/target/lib/jsp-2.1-6.1.14.jar
[INFO] Copying jsp-api-2.1-6.1.14.jar to /root/hadoop-sync-master/target/lib/jsp-api-2.1-6.1.14.jar
[INFO] Copying servlet-api-2.5-20081211.jar to /root/hadoop-sync-master/target/lib/servlet-api-2.5-20081211.jar
[INFO] Copying servlet-api-2.5-6.1.14.jar to /root/hadoop-sync-master/target/lib/servlet-api-2.5-6.1.14.jar
[INFO] Copying oro-2.0.8.jar to /root/hadoop-sync-master/target/lib/oro-2.0.8.jar
[INFO] Copying postgresql-9.1-901.jdbc4.jar to /root/hadoop-sync-master/target/lib/postgresql-9.1-901.jdbc4.jar
[INFO] Copying jasper-compiler-5.5.12.jar to /root/hadoop-sync-master/target/lib/jasper-compiler-5.5.12.jar
[INFO] Copying jasper-runtime-5.5.12.jar to /root/hadoop-sync-master/target/lib/jasper-runtime-5.5.12.jar
[INFO] Copying xmlenc-0.52.jar to /root/hadoop-sync-master/target/lib/xmlenc-0.52.jar
[INFO]
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ hadoop-sync ---
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.pom (2 KB at 1.9 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.7/plexus-components-1.1.7.pom
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.7/plexus-components-1.1.7.pom (5 KB at 5.8 KB/sec)
Downloading: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.jar
Downloaded: http://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-digest/1.0/plexus-digest-1.0.jar (12 KB at 10.4 KB/sec)
[INFO] Installing /root/hadoop-sync-master/target/hadoop-sync-0.1.jar to /root/.m2/repository/com/citusdata/sync/hadoop-sync/0.1/hadoop-sync-0.1.jar
[INFO] Installing /root/hadoop-sync-master/pom.xml to /root/.m2/repository/com/citusdata/sync/hadoop-sync/0.1/hadoop-sync-0.1.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3:41.824s
[INFO] Finished at: Mon Mar 18 22:22:00 CST 2013
[INFO] Final Memory: 14M/247M
[INFO] ------------------------------------------------------------------------
# 安裝完後的目錄結構 :
# ll
total 16
-rw-r--r-- 1 root root 2533 Feb 16 04:56 pom.xml
-rw-r--r-- 1 root root 3047 Feb 16 04:56 README.md
drwxr-xr-x 3 root root 4096 Feb 16 04:56 src
drwxr-xr-x 7 root root 4096 Mar 18 22:21 target
# cd target/
# ll -R
.:
total 44
drwxr-xr-x 3 root root 4096 Mar 18 22:18 classes
drwxr-xr-x 3 root root 4096 Mar 18 22:14 generated-sources
-rw-r--r-- 1 root root 24084 Mar 18 22:19 hadoop-sync-0.1.jar
drwxr-xr-x 2 root root 4096 Mar 18 22:21 lib
drwxr-xr-x 2 root root 4096 Mar 18 22:19 maven-archiver
drwxr-xr-x 2 root root 4096 Mar 18 22:22 surefire
./classes:
total 12
drwxr-xr-x 3 root root 4096 Mar 18 22:18 com
-rw-r--r-- 1 root root 1228 Mar 18 22:14 log4j.properties
-rw-r--r-- 1 root root 266 Mar 18 22:14 sync.properties
./classes/com:
total 4
drwxr-xr-x 3 root root 4096 Mar 18 22:18 citusdata
./classes/com/citusdata:
total 4
drwxr-xr-x 3 root root 4096 Mar 18 22:18 sync
./classes/com/citusdata/sync:
total 4
drwxr-xr-x 2 root root 4096 Mar 18 22:18 hdfs
./classes/com/citusdata/sync/hdfs:
total 60
-rw-r--r-- 1 root root 10403 Mar 18 22:18 CitusMasterNode.class
-rw-r--r-- 1 root root 5728 Mar 18 22:18 CitusWorkerNode.class
-rw-r--r-- 1 root root 6867 Mar 18 22:18 HdfsMasterNode.class
-rw-r--r-- 1 root root 12373 Mar 18 22:18 HdfsSynchronizer.class
-rw-r--r-- 1 root root 1332 Mar 18 22:18 HdfsSynchronizer$MetadataDifference.class
-rw-r--r-- 1 root root 2265 Mar 18 22:18 HdfsWorkerNode.class
-rw-r--r-- 1 root root 642 Mar 18 22:18 MinMaxValue.class
-rw-r--r-- 1 root root 1687 Mar 18 22:18 ShardPlacement.class
./generated-sources:
total 4
drwxr-xr-x 2 root root 4096 Mar 18 22:14 annotations
./generated-sources/annotations:
total 0
./lib:
total 16920
-rw-r--r-- 1 root root 1034049 Mar 18 22:21 ant-1.6.5.jar
-rw-r--r-- 1 root root 188671 Mar 18 22:21 commons-beanutils-1.7.0.jar
-rw-r--r-- 1 root root 206035 Mar 18 22:21 commons-beanutils-core-1.8.0.jar
-rw-r--r-- 1 root root 41123 Mar 18 22:21 commons-cli-1.2.jar
-rw-r--r-- 1 root root 58160 Mar 18 22:21 commons-codec-1.4.jar
-rw-r--r-- 1 root root 575389 Mar 18 22:21 commons-collections-3.2.1.jar
-rw-r--r-- 1 root root 298829 Mar 18 22:21 commons-configuration-1.6.jar
-rw-r--r-- 1 root root 143602 Mar 18 22:21 commons-digester-1.8.jar
-rw-r--r-- 1 root root 112341 Mar 18 22:21 commons-el-1.0.jar
-rw-r--r-- 1 root root 279781 Mar 18 22:21 commons-httpclient-3.0.1.jar
-rw-r--r-- 1 root root 261809 Mar 18 22:21 commons-lang-2.4.jar
-rw-r--r-- 1 root root 60686 Mar 18 22:21 commons-logging-1.1.1.jar
-rw-r--r-- 1 root root 832410 Mar 18 22:21 commons-math-2.1.jar
-rw-r--r-- 1 root root 180792 Mar 18 22:21 commons-net-1.4.1.jar
-rw-r--r-- 1 root root 3566844 Mar 18 22:21 core-3.1.1.jar
-rw-r--r-- 1 root root 3929148 Mar 18 22:21 hadoop-core-1.0.4.jar
-rw-r--r-- 1 root root 706710 Mar 18 22:21 hsqldb-1.8.0.10.jar
-rw-r--r-- 1 root root 136059 Mar 18 22:21 jackson-core-asl-1.0.1.jar
-rw-r--r-- 1 root root 270781 Mar 18 22:21 jackson-mapper-asl-1.0.1.jar
-rw-r--r-- 1 root root 405086 Mar 18 22:21 jasper-compiler-5.5.12.jar
-rw-r--r-- 1 root root 76698 Mar 18 22:21 jasper-runtime-5.5.12.jar
-rw-r--r-- 1 root root 377780 Mar 18 22:21 jets3t-0.7.1.jar
-rw-r--r-- 1 root root 539912 Mar 18 22:21 jetty-6.1.26.jar
-rw-r--r-- 1 root root 177131 Mar 18 22:21 jetty-util-6.1.26.jar
-rw-r--r-- 1 root root 1024680 Mar 18 22:21 jsp-2.1-6.1.14.jar
-rw-r--r-- 1 root root 134910 Mar 18 22:21 jsp-api-2.1-6.1.14.jar
-rw-r--r-- 1 root root 121070 Mar 18 22:21 junit-3.8.1.jar
-rw-r--r-- 1 root root 11981 Mar 18 22:21 kfs-0.3.jar
-rw-r--r-- 1 root root 489884 Mar 18 22:21 log4j-1.2.17.jar
-rw-r--r-- 1 root root 65261 Mar 18 22:21 oro-2.0.8.jar
-rw-r--r-- 1 root root 539705 Mar 18 22:21 postgresql-9.1-901.jdbc4.jar
-rw-r--r-- 1 root root 134133 Mar 18 22:21 servlet-api-2.5-20081211.jar
-rw-r--r-- 1 root root 132368 Mar 18 22:21 servlet-api-2.5-6.1.14.jar
-rw-r--r-- 1 root root 15010 Mar 18 22:21 xmlenc-0.52.jar
./maven-archiver:
total 4
-rw-r--r-- 1 root root 112 Mar 18 22:19 pom.properties
./surefire:
total 0
6. 配置hadoop-sync # 配置檔案範例 # cat sync.properties
# HDFS related cluster configuration settings
HdfsMasterNodeName = localhost
HdfsMasterNodePort = 9000
HdfsWorkerNodePort = 50020
# CitusDB related cluster configuration settings
CitusMasterNodeName = localhost
CitusMasterNodePort = 5432
CitusWorkerNodePort = 9700
7. 用法
# java -jar hadoop-sync-0.1.jar table_name [--fetch-min-max]
【參考】 1. http://maven.apache.org/ 2. https://github.com/citusdata/hadoop-sync 3. http://hadoop.apache.org/docs/stable 4. http://www.citusdata.com/docs/sql-on-hadoop 5. hadoop-sync readme
hadoop-sync
hadoop-sync is a Java application that synchronizes block metadata from the HDFS NameNode to the CitusDB master node. The application also creates a colocated foreign table on CitusDB worker nodes for each block in the Hadoop File System (HDFS). Optionally, the application also collects statistics about the underlying HDFS blocks.
hadoop-sync is idempotent. You can run it multiple times; it calculates metadata differences between the HDFS NameNode and CitusDB master node, and only syncs metadata for newly added or removed HDFS blocks. If no such blocks exist, the application does nothing.
hadoop-sync also leaves metadata on the CitusDB master node in a consistent state. When you run the application, it either completes by propagating enough metadata that makes recent HDFS blocks queriable; or it errors out by reverting recent metadata updates and leaves CitusDB in the queriable state it was in before hadoop-sync was run.
hadoop-sync assumes that you are already running a Hadoop cluster and you have your CitusDB databases colocated on this cluster. For step by step instructions on how to get started, please see our documentation page at http://citusdata.com/docs/sql-on-hadoop
Usage
Our documentation page explains in detail running the hadoop-sync application. The following section talks about a few details that relate to hadoop-sync's internals. Please note that these details may change over time as hadoop-sync is still in Beta form.
Now, assuming that you already have a Hadoop cluster with data loaded into it, CitusDB deployed on the Hadoop cluster, and a distributed foreign table created on the CitusDB master node, you synchronize metadata like the following:
java -jar target/hadoop-sync-0.1.jar table_name [--fetch-min-max]
The distributed foreign table name is a required argument, and hadoop-sync will only synchronize block metadata for that particular table. The fetch-min-max is an optional argument that triggers the collection of min/max statistics for each new HDFS block. Since this requires going over all records in the HDFS block, passing this argument notably increases synchronization times. On the other hand, with these statistics, CitusDB can perform optimizations such as partition or join pruning that dramatically reduce query execution times.
It is worth mentioning that these min/max statistics are only relevant in the context of distributed queries. For local queries, the worker nodes themselves already use PostgreSQL's statistics collector to optimize queries, if the foreign data wrapper supports data sampling.
It is also worth noting that hadoop-sync relies on a properties file to determine node names and port numbers that Hadoop and CitusDB clusters use. This sync.properties file is bundled with the JAR file; and you can change the default settings by using emacs or vi on the JAR bundle, or by copying the file from target/classes/sync.properties to the current directory and editing it.