
Storm multi-language support

Using non-JVM languages with Storm

<a href="https://github.com/nathanmarz/storm/wiki/using-non-jvm-languages-with-storm">https://github.com/nathanmarz/storm/wiki/using-non-jvm-languages-with-storm</a>

Multilang protocol

<a href="https://github.com/nathanmarz/storm/wiki/multilang-protocol">https://github.com/nathanmarz/storm/wiki/multilang-protocol</a>

For JVM languages this is fairly simple: provide a DSL that wraps the Java API.

For non-JVM languages it is a little more involved. Storm splits into two parts here: topologies and components (bolts and spouts).

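Defining a topology in another language is easy. Nimbus is a Thrift server, so whatever the language, the topology ends up as a Thrift structure. The topology logic itself is also quite simple, so writing it in Java is fine too, and there is little need to insist on another language.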

For components, Storm takes the same approach as Hadoop Streaming: each component runs as a shell process, and Storm communicates with it over stdin and stdout (JSON messages over stdin/stdout).
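A minimal sketch of what "JSON messages over stdin/stdout" can look like in practice. It assumes each message is a JSON value followed by a line containing only end, which is how the multilang protocol page describes framing as far as I recall; the function names send_message and read_message are made up for illustration, and an in-memory buffer stands in for the pipe.

```python
import io
import json


def send_message(message, stream):
    """Write one JSON message followed by a line containing only 'end'."""
    stream.write(json.dumps(message))
    stream.write("\nend\n")
    stream.flush()


def read_message(stream):
    """Read lines up to the next 'end' line and decode them as one JSON value."""
    lines = []
    while True:
        line = stream.readline()
        if line == "":
            raise EOFError("stream closed before the 'end' delimiter")
        if line.strip() == "end":
            break
        lines.append(line)
    return json.loads("".join(lines))


# Round trip through an in-memory buffer standing in for the pipe.
pipe = io.StringIO()
send_message({"command": "log", "msg": "hello"}, pipe)
pipe.seek(0)
received = read_message(pipe)
```

In a real component the same two functions would be pointed at sys.stdin and sys.stdout instead of a buffer.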

Storm currently ships Python, Ruby, and Fancy implementations. Supporting another language should also be easy: implement the protocol yourself.
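One piece a new adapter would likely need is the start-up handshake. The sketch below assumes the form in which Storm sends a setup message with conf, context, and pidDir keys, and the script answers by creating an empty file named after its PID in pidDir and replying with that PID. This is how I understand the protocol page; older releases may differ, and the handshake function name and the sample values are hypothetical.

```python
import os
import tempfile


def handshake(setup, pid=None):
    """Create an empty PID-named file in setup['pidDir'] and build the reply."""
    pid = os.getpid() if pid is None else pid
    pid_file = os.path.join(setup["pidDir"], str(pid))
    open(pid_file, "w").close()  # an empty marker file is enough
    return {"pid": pid}


# Try it against a temporary directory with a fixed, made-up PID.
with tempfile.TemporaryDirectory() as pid_dir:
    setup = {"conf": {"topology.debug": False}, "context": {}, "pidDir": pid_dir}
    reply = handshake(setup, pid=4242)
    pid_file_created = os.path.exists(os.path.join(pid_dir, "4242"))
```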

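Multi-language support matters more for components than for topologies. Many analysis or statistics modules are not written in Java, and porting them is a lot more work than rewriting something as simple as a topology.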

Two pieces: creating topologies, and implementing spouts and bolts in other languages.

Creating topologies in another language is easy since topologies are just Thrift structures (link to storm.thrift).

Implementing spouts and bolts in another language is called "multilang components" or "shelling".

The Thrift structure lets you define a multilang component explicitly as a program and a script (e.g., python and the file implementing your bolt).

In Java, you subclass ShellBolt or ShellSpout to create multilang components.

Note that output field declarations happen in the Thrift structure, so in Java you create a multilang component by declaring the fields in Java and naming the processing code, written in the other language, in the ShellBolt constructor.

Multilang uses JSON messages over stdin/stdout to communicate with the subprocess.

Storm comes with Ruby, Python, and Fancy adapters that implement the protocol; Python is the example used here.

The Python adapter supports emitting, anchoring, acking, and logging.
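As a rough illustration of those four operations, here is how the outgoing messages might be built as plain dictionaries. The key names (command, tuple, anchors, id, msg) follow the multilang protocol page as I remember it; the helper names emit, ack, and log are chosen for this sketch and are not the adapter's actual functions.

```python
import json


def emit(values, anchors=(), stream=None):
    """Build an emit command; anchors tie the new tuple to its input tuples."""
    command = {"command": "emit", "tuple": list(values)}
    if anchors:
        command["anchors"] = list(anchors)
    if stream is not None:
        command["stream"] = stream
    return command


def ack(tuple_id):
    """Build an ack command for a fully processed input tuple."""
    return {"command": "ack", "id": tuple_id}


def log(text):
    """Build a log command; the text ends up in the worker's log."""
    return {"command": "log", "msg": text}


# Commands a bolt might send after handling an input tuple with id "t1".
commands = [emit(["hello"], anchors=["t1"]), ack("t1"), log("processed t1")]
encoded = [json.dumps(c, sort_keys=True) for c in commands]
```

Anchoring is just the anchors list on an emit: it names the input tuples the new tuple descends from, so a downstream failure can trigger a replay.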

The "storm shell" command makes it easy to construct the jar and upload it to Nimbus: it builds the jar, uploads it, and then calls your program with the host and port of Nimbus and the jarfile id.

Bolts can be defined in any language. A bolt written in another language is executed as a subprocess, and Storm communicates with that subprocess through JSON messages over stdin/stdout.
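The parent side of this arrangement can be sketched with the standard subprocess module. This is only an illustration of the idea, not Storm's implementation: the child here is a tiny inline Python program, and it exchanges a single newline-terminated JSON message rather than using Storm's actual framing or staying alive between tuples.

```python
import json
import subprocess
import sys

# Child program: read one JSON line from stdin, upper-case its "text"
# field, and write the result as JSON to stdout.
CHILD_SOURCE = (
    "import json, sys\n"
    "msg = json.loads(sys.stdin.readline())\n"
    "print(json.dumps({'text': msg['text'].upper()}))\n"
)


def call_child(message):
    """Start the child, send one JSON message, and decode its JSON reply."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    out, _ = proc.communicate(json.dumps(message) + "\n")
    return json.loads(out)


reply = call_child({"text": "hello storm"})
```

Because the child only sees stdin and stdout, it could equally be a Ruby or shell program, which is the point of the design.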

The protocol is implemented by a library of only about 100 lines, and the Storm team has written Ruby, Python, and Fancy versions of it.

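A bolt written in Python is declared differently from a pure Java bolt: on the Java side, the class extends ShellBolt.

The processing logic itself is defined in splitsentence.py.

In this Python component example, extending ShellBolt indicates that input and output go through the subprocess's stdin/stdout.

The Java class then launches python splitsentence.py directly as a subprocess.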

On the Python side, the script first imports storm, the module that wraps the communication protocol. It is only about 100 lines and worth reading.
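To give a feel for the shape of such a bolt without depending on the storm module, the sketch below uses a hypothetical stand-in base class, BasicBolt, that collects outgoing commands in a list instead of writing them to stdout. The class and method names, and the sample input tuple, are assumptions for illustration; only the split-a-sentence-into-words behaviour comes from the example being discussed.

```python
class BasicBolt:
    """Hypothetical stand-in for the adapter's base class.

    It collects outgoing commands in a list instead of writing to stdout.
    """

    def __init__(self):
        self.sent = []

    def emit(self, values, anchor_id):
        # Anchor every emitted tuple to the input tuple it came from.
        self.sent.append(
            {"command": "emit", "anchors": [anchor_id], "tuple": values}
        )

    def handle(self, tuple_msg):
        # Run the user's logic, then ack the input tuple.
        self.process(tuple_msg)
        self.sent.append({"command": "ack", "id": tuple_msg["id"]})

    def process(self, tuple_msg):
        raise NotImplementedError


class SplitSentenceBolt(BasicBolt):
    def process(self, tuple_msg):
        # The sentence is the first (and only) field of the input tuple.
        sentence = tuple_msg["tuple"][0]
        for word in sentence.split():
            self.emit([word], anchor_id=tuple_msg["id"])


bolt = SplitSentenceBolt()
bolt.handle({"id": "t1", "comp": "1", "stream": "default", "task": 9,
             "tuple": ["the cow jumped"]})
words = [c["tuple"][0] for c in bolt.sent if c["command"] == "emit"]
```

With the real adapter, the subclass would only need the process method; reading tuples, emitting, and acking are handled by the imported module.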

<a href="https://github.com/nathanmarz/storm/wiki/dsls-and-multilang-adapters">https://github.com/nathanmarz/storm/wiki/dsls-and-multilang-adapters</a>

<a href="https://github.com/velvia/scalastorm">Scala DSL</a>

<a href="https://github.com/colinsurprenant/redstorm">JRuby DSL</a>

<a href="https://github.com/nathanmarz/storm/wiki/clojure-dsl">Clojure DSL</a>

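As noted earlier, JVM languages are simple: wrap the Java API and provide a DSL. The list above covers the DSLs the official wiki points to.

Clojure makes a simple example for getting a feel for this.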

Storm comes with a Clojure DSL for defining spouts, bolts, and topologies. The Clojure DSL has access to everything the Java API exposes, so if you're a Clojure user you can code Storm topologies without touching Java at all.

<a href="https://github.com/nathanmarz/storm/wiki/clojure-dsl">https://github.com/nathanmarz/storm/wiki/clojure-dsl</a>

<a href="https://github.com/nathanmarz/storm/wiki/defining-a-non-jvm-language-dsl-for-storm">https://github.com/nathanmarz/storm/wiki/defining-a-non-jvm-language-dsl-for-storm</a>

For non-JVM languages, something similar to a DSL can also be built with the storm shell command.

There's a "storm shell" command that will help with submitting a topology. Its usage is like this:

storm shell will then package resources/ into a jar, upload the jar to Nimbus, and call your topology.py script like this:

Then you can connect to Nimbus using the Thrift API and submit the topology, passing {uploaded-jar-location} into the submitTopology method. For reference, here's the submitTopology definition:

void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
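A sketch of what the receiving end of storm shell might look like inside topology.py. It assumes the Nimbus host, port, and uploaded jar location arrive as the last three command-line arguments, after the script's own arguments; that ordering, the helper names, and the sample host and path are assumptions for illustration. The Thrift call itself is left out because it needs the generated client code.

```python
import json


def parse_storm_shell_args(argv):
    """Split argv into the script's own args and the three appended values."""
    if len(argv) < 4:
        raise ValueError("expected nimbus host, port and jar location at the end")
    *own_args, host, port, jar_location = argv[1:]
    return own_args, host, int(port), jar_location


def submit_arguments(name, jar_location, conf):
    """Build the first three submitTopology arguments.

    jsonConf is declared as a string, so the conf is serialized to JSON.
    """
    return name, jar_location, json.dumps(conf)


# Made-up command line of the kind storm shell is assumed to produce.
argv = ["topology.py", "arg1", "arg2",
        "nimbus.example.com", "6627", "/tmp/uploaded.jar"]
own_args, host, port, jar = parse_storm_shell_args(argv)
name, uploaded, json_conf = submit_arguments(
    "word-count", jar, {"topology.workers": 2}
)
```

The fourth submitTopology argument, the StormTopology structure, would be built with the Thrift bindings generated from storm.thrift for your language.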

This article is excerpted from cnblogs (博客园); originally published 2013-05-10.