Step By Step
Main steps
1. Install the Java environment
2. Install Apache Maven
3. Install Flume-NG
4. Configure the data import
I. Install the Java environment
1. Update the package list
sudo apt-get update
2. Install openjdk-8-jdk
sudo apt-get install openjdk-8-jdk
3. Check the Java version to confirm that the installation succeeded
java -version
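The version check above can also be scripted. The following is a minimal sketch, not part of the original steps: `java_major` is a hypothetical helper that extracts the major version from the first line of `java -version` output, on the assumption that the line contains a quoted version string such as "1.8.0_292" or "11.0.2". The Flume 1.9.0 user guide lists Java 1.8 or later as a requirement, so a result of 8 or higher is what you would hope to see.

```shell
#!/usr/bin/env bash
# java_major LINE
# LINE is assumed to be the first line of `java -version`,
# e.g.: openjdk version "1.8.0_292"
# Prints the major version (8 for "1.8.x", 11 for "11.0.2").
java_major() {
  local ver
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8 -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # newer scheme: 11.0.2 -> 11
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it before reading
  first_line=$(java -version 2>&1 | head -n 1)
  echo "Detected Java major version: $(java_major "$first_line")"
else
  echo "java was not found on PATH"
fi
```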

II. Install Apache Maven
1. Install
sudo apt install maven
2. Check the installed version
mvn -v
III. Install Flume-NG
1. Download Flume:
wget https://downloads.apache.org/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
2. Extract the archive
tar zxvf apache-flume-1.9.0-bin.tar.gz
3. Download the flume-datahub plugin from:
https://aliyun-datahub.oss-cn-hangzhou.aliyuncs.com/tools/aliyun-flume-datahub-sink-2.0.4.tar.gz
4. Extract the plugin and place it in the ${FLUME_HOME}/plugins.d directory (in this example, ${FLUME_HOME} is apache-flume-1.9.0-bin)
tar -zxvf aliyun-flume-datahub-sink-2.0.4.tar.gz
mkdir apache-flume-1.9.0-bin/plugins.d
mv aliyun-flume-datahub-sink apache-flume-1.9.0-bin/plugins.d
5. Verify the installation
apache-flume-1.9.0-bin/bin/flume-ng version
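`flume-ng version` confirms that Flume itself runs, but it does not show whether the plugin landed in the right place. Flume's plugin convention is a plugins.d/<plugin-name>/ directory with lib/, libext/ and native/ subdirectories. The sketch below uses a hypothetical helper, `check_plugin`, and assumes the DataHub sink tarball follows that convention with its jar under lib/. It demonstrates the check on a throwaway directory; for the real install you would call `check_plugin apache-flume-1.9.0-bin aliyun-flume-datahub-sink`.

```shell
#!/usr/bin/env bash
# check_plugin FLUME_HOME PLUGIN_NAME
# Succeeds if FLUME_HOME/plugins.d/PLUGIN_NAME/lib holds at least one .jar file.
check_plugin() {
  local lib_dir="$1/plugins.d/$2/lib"
  if [ ! -d "$lib_dir" ]; then
    echo "missing directory: $lib_dir"
    return 1
  fi
  if ! ls "$lib_dir"/*.jar >/dev/null 2>&1; then
    echo "no .jar files in: $lib_dir"
    return 1
  fi
  echo "plugin looks installed: $lib_dir"
}

# Demo on a temporary tree so the snippet runs anywhere.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/plugins.d/aliyun-flume-datahub-sink/lib"
# example.jar is a placeholder name, not the real jar shipped in the tarball
touch "$demo_home/plugins.d/aliyun-flume-datahub-sink/lib/example.jar"
check_plugin "$demo_home" aliyun-flume-datahub-sink
```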
IV. Configure the data import
1. Data file (demo.txt)
0,YxCOHXcst1NlL5ebJM9YmvQ1f8oy8neb3obdeoS0,true,1254275.1144629316,1573206062763,1254275.1144637289
0,YxCOHXcst1NlL5ebJM9YmvQ1f8oy8neb3obdeoS0,true,1254275.1144629316,1573206062763,1254275.1144637289
1,hHVNjKW5DsRmVXjguwyVDjzjn60wUcOKos9Qym0V,false,1254275.1144637289,1573206062763,1254275.1144637289
2,vnXOEuKF4Xdn5WnDCPbzPwTwDj3k1m3rlqc1vN2l,true,1254275.1144637289,1573206062763,1254275.1144637289
3,t0AGT8HShzroBVM3vkP37fIahg2yDqZ5xWfwDFJs,false,1254275.1144637289,1573206062763,1254275.1144637289
4,MKwZ1nczmCBp6whg1lQeFLZ6E628lXvFncUVcYWI,true,1254275.1144637289,1573206062763,1254275.1144637289
5,bDPQJ656xvPGw1PPjhhTUZyLJGILkNnpqNLaELWV,false,1254275.1144637289,1573206062763,1254275.1144637289
6,wWF7i4X8SXNhm4EfClQjQF4CUcYQgy3XnOSz0StX,true,1254275.1144637289,1573206062763,1254275.1144637289
7,whUxTNREujMP6ZrAJlSVhCEKH1KH9XYJmOFXKbh8,false,1254275.1144637289,1573206062763,1254275.1144637289
8,OYcS1WkGcbZFbPLKaqU5odlBf7rHDObkQJdBDrYZ,true,1254275.1144637289,1573206062763,1254275.1144637289
2、DataHub Topic Schema
Field name | Field type |
---|---|
id | BIGINT |
name | STRING |
gender | BOOLEAN |
salary | DOUBLE |
my_time | TIMESTAMP |
decimal | DECIMAL |
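Before sending data, it can help to confirm that every row of the data file fits the six-column schema above. The following is a rough sketch: `check_rows` is a hypothetical helper, and its per-column patterns (integer id, true/false gender, plain decimal numbers, integer my_time) are assumptions drawn from the sample rows, not DataHub's actual parsing rules.

```shell
#!/usr/bin/env bash
# check_rows FILE
# Prints one message per row that does not fit the six-column layout
# (id, name, gender, salary, my_time, decimal); returns non-zero if any row fails.
check_rows() {
  awk -F',' '
    BEGIN { bad = 0 }
    NF != 6                       { printf "line %d: expected 6 fields, got %d\n", NR, NF; bad++; next }
    $1 !~ /^-?[0-9]+$/            { printf "line %d: id is not an integer\n", NR; bad++; next }
    $3 != "true" && $3 != "false" { printf "line %d: gender is not true/false\n", NR; bad++; next }
    $4 !~ /^-?[0-9]+(\.[0-9]+)?$/ { printf "line %d: salary is not a number\n", NR; bad++; next }
    $5 !~ /^[0-9]+$/              { printf "line %d: my_time is not an integer\n", NR; bad++; next }
    $6 !~ /^-?[0-9]+(\.[0-9]+)?$/ { printf "line %d: decimal is not a number\n", NR; bad++; next }
    END { exit (bad > 0) }
  ' "$1"
}

# Demo: two rows copied from demo.txt, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0,YxCOHXcst1NlL5ebJM9YmvQ1f8oy8neb3obdeoS0,true,1254275.1144629316,1573206062763,1254275.1144637289
1,hHVNjKW5DsRmVXjguwyVDjzjn60wUcOKos9Qym0V,false,1254275.1144637289,1573206062763,1254275.1144637289
EOF
if check_rows "$sample"; then
  echo "all sample rows match the schema"
fi
```

To check the real file, run `check_rows demo.txt` from the directory that holds it.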
3. Configuration file (datahub.conf)
# A single-node Flume configuration for Datahub
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = cat /root/flume/demo.txt
# Describe the sink
a1.sinks.k1.type = com.aliyun.datahub.flume.sink.DatahubSink
a1.sinks.k1.datahub.accessId = LTAIOZZ******
a1.sinks.k1.datahub.accessKey = v7CjUJCMk7j9aKdu************
a1.sinks.k1.datahub.endPoint = https://dh-cn-shanghai.aliyuncs.com
a1.sinks.k1.datahub.project = flume_project
a1.sinks.k1.datahub.topic = flume
a1.sinks.k1.serializer = DELIMITED
a1.sinks.k1.serializer.delimiter = ,
a1.sinks.k1.serializer.fieldnames = id,name,gender,salary,my_time,decimal
a1.sinks.k1.serializer.charset = UTF-8
a1.sinks.k1.datahub.retryTimes = 5
a1.sinks.k1.datahub.retryInterval = 5
a1.sinks.k1.datahub.batchSize = 100
a1.sinks.k1.datahub.batchTimeout = 5
a1.sinks.k1.datahub.enablePb = true
a1.sinks.k1.datahub.compressType = DEFLATE
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
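Two things in this configuration are easy to get out of sync: the number of names in serializer.fieldnames versus the number of columns in each data row, and the channel name used by the source versus the one used by the sink. The sketch below shows one way to cross-check them. `conf_value` is a hypothetical helper that reads simple `key = value` lines, and the snippet works on a temporary excerpt of the configuration so that it runs on its own; point it at your real file instead if you prefer.

```shell
#!/usr/bin/env bash
# conf_value FILE KEY
# Prints the value of the first "KEY = value" line, ignoring comment lines.
conf_value() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }
    {
      pos = index($0, "=")
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      v = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }
  ' "$1"
}

# Temporary excerpt of the configuration shown above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Describe the sink
a1.sinks.k1.serializer.delimiter = ,
a1.sinks.k1.serializer.fieldnames = id,name,gender,salary,my_time,decimal
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

names=$(conf_value "$conf" a1.sinks.k1.serializer.fieldnames)
delim=$(conf_value "$conf" a1.sinks.k1.serializer.delimiter)
src_ch=$(conf_value "$conf" a1.sources.r1.channels)
sink_ch=$(conf_value "$conf" a1.sinks.k1.channel)

# One row copied from demo.txt.
row='0,YxCOHXcst1NlL5ebJM9YmvQ1f8oy8neb3obdeoS0,true,1254275.1144629316,1573206062763,1254275.1144637289'
n_names=$(printf '%s\n' "$names" | awk -F',' '{ print NF }')
n_cols=$(printf '%s\n' "$row" | awk -F"$delim" '{ print NF }')

echo "fieldnames: $n_names, data columns: $n_cols"
[ "$n_names" -eq "$n_cols" ] || echo "WARNING: field count mismatch"
[ "$src_ch" = "$sink_ch" ] || echo "WARNING: source and sink use different channels"
```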
4. Test the result (for local testing, adjust the file paths to match your own setup)
apache-flume-1.9.0-bin/bin/flume-ng agent -n a1 -c conf -f datahub.conf -Dflume.root.logger=INFO,console