
Flux vs TensorFlow Misconceptions


TensorFlow is the 800-pound gorilla of machine learning that almost everybody in the field has heard about and has some familiarity with. But there is a tiny, uppity little upstart called Flux which is kicking ass and taking names, and grabbing some attention as a result. Yet it is still a very misunderstood machine learning library.

Emmett Boudreau, a fellow Julia fan here on Medium, has written a number of articles on the topic. One of his latest, Is Flux Better Than TensorFlow, made me realize from reading the comments that people really don't grasp what the big deal about Flux is.

If you are living and breathing Julia, there are a lot of things you take for granted. So here are some great questions which deserve answers.

How is Julia Performance Relevant for Flux?

William Heymann asks:

Since tensorflow compiles to a graph and runs optimized code on a cpu, tpu or gpu. Does the language speed matter? From what I have seen tensorflow is the same performance in c++ and python

This is true. Once you have a graph set up, TensorFlow will run fast. There isn't much Julia and Flux can do to one-up TensorFlow there. But there are a number of assumptions baked into the framing of this question.

TensorFlow is based on having graph nodes (ops) pre-made in high-performance C++. The Python interface is really just used to shuffle these pre-made nodes around and organize them into a graph, which is then executed.

If you need a particular kind of functionality which does not exist as a node already, then you have a couple of choices:

  • Try to mimic the needed functionality by composing existing nodes in Python.

  • Write your own custom node (op) in C++, and do all the busywork of registering it with the TensorFlow system to make it available from Python.

Contrast this with Flux, where you don't assemble a graph made up of pre-made nodes. You basically just write regular Julia code, with the whole Julia language at your disposal. You can use if-statements and for-loops, call any number of other functions, and use custom data types. Almost anything is possible.

Then you specify which parts of this regular Julia function represent parameters that you want to tweak. You hand this function to Flux and tell Flux to train on it, feeding it inputs and checking the output. Flux then uses your chosen training algorithm to update the parameters until your function produces the desired output (super simplified).

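To make that concrete, here is a minimal sketch in the implicit-parameter style Flux has offered (the names predict, W and b and the dummy data are just illustrative, not anything Flux mandates):

using Flux

# A "model" is just a plain Julia function closing over values we want to learn.
W = rand(2, 3)
b = rand(2)
predict(x) = W * x .+ b

# The loss is also plain Julia code.
loss(x, y) = sum((predict(x) .- y).^2)

# Tell Flux which values are the trainable parameters.
ps = Flux.params(W, b)

# Dummy data, then one training step: compute gradients and update the parameters.
x, y = rand(3), rand(2)
gs = gradient(() -> loss(x, y), ps)
Flux.Optimise.update!(Descent(0.1), ps, gs)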

Of course this Julia function could be made up of a bunch of classic sigmoid functions arranged in a classic neural network. You can do some convolutions etc. But that is just one option. The function you hand to Flux can be anything. It could be a whole ray-tracer if you wanted, where training tweaks some parameters of the ray-tracer algorithm to give the desired output.

The function could be some advanced scientific model you already built. No problem, as long as you can specify the parameters to Flux.

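As a tiny, made-up illustration of that flexibility: any plain Julia function, loops and if-logic included, can be differentiated and thus trained on, not just pre-made layers.

using Flux   # gradient comes from Zygote, which Flux builds on

# A toy "scientific model": ordinary Julia with a loop and a branch.
function simulate(k)
    total = 0.0
    for t in 1:10
        total += k > 0 ? k * t : -k * t
    end
    return total
end

# Differentiate it with respect to its parameter k.
gradient(simulate, 2.0)   # returns (55.0,) since simulate(k) = 55k for k > 0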

Basically, the graph Flux deals with is just the abstract syntax tree of regular Julia code. You don't have to build up a graph manually, with all sorts of restrictions, like in TensorFlow.

If you have, say, some scientific model in Python which you want to tweak through machine learning, you cannot just feed it as-is to TensorFlow. No, you would have to go over this code and figure out a way to replace everything it does by building a complex graph in TensorFlow, using only nodes supplied by TensorFlow. You don't have the full power of Python available.

Let us expand on this question by answering Gershom Agim:

You mention Julia’s a faster language but show no examples of where/how that comes into play considering TF, numpy and the whole ML ecosystem are just monolithic wrappers around c++, fortran and cuda libraries

Emmett Boudreau does indeed state that Julia is faster than Python, and that may not seem to matter since you are just running C++ code in TensorFlow.

Emmett Boudreau确實聲明了Julia比Python快,這似乎無關緊要,因為您隻是在TensorFlow中運作C ++代碼。

But as I elaborated above, this two-language split costs you a lot of flexibility. Every problem must be expressed in terms of nodes assembled into a graph. If some function does not exist, you have to find a way to create it from existing nodes or go through all the hassle of writing it in C++.

Manually describing the code for solving a machine learning problem by assembling a graph is not natural. Imagine writing regular code like that. Instead of writing, say:

a = 3
b = 4
c = a + b - 10
           

You would have to write something like this:

您将必須編寫如下内容:

a = VariableNode("a")
b = VariableNode("b")
c = AddNode(a, SubNode(b, Constant(10)))
execute_graph(c, ["a" => 3, "b" => 4])
           

The latter is a TensorFlow-inspired way of approaching the problem. The former is a more Julia-inspired way of writing code: you just write regular code. Julia is very much like LISP under the hood; Julia code can easily be manipulated as data. There is no need to invent a whole separate abstract syntax tree for machine learning purposes. You just use the Julia language itself as it is.

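For a quick, toy taste of what code-as-data means here (nothing Flux-specific, just base Julia):

a, b = 3, 4
ex = :(a + b - 10)   # quote the code instead of running it
typeof(ex)           # Expr — an ordinary Julia data structure you can inspect
dump(ex)             # prints the expression tree
eval(ex)             # evaluates it and returns -3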

Because Julia is a high-performance language, you can run your machine learning algorithms straight on pure Julia code. You don't need a bunch of C++ code snippets glued together into a graph arranged by Python code.

Flux Looks Small and Incomplete

On Twitter, Timothy Lau asked me this question.

I like flux but last I checked it seemed like it was stuck trying to catch up to the features and newer methods that keep getting added to pytorch and tensorflow.

I had to ask some follow-up questions to clarify what he meant. One of his examples was the list of activation functions. You can see the list of activation functions for TensorFlow here. These are functions such as sigmoid, relu and softmax. In Flux, at the time, the list was very short, as was the list of many other kinds of functions in Julia.

This made Timothy Lau conclude that Flux was incomplete and not ready for prime time. It lacked so many of the functions TensorFlow has.

The problem is that this is actually not an apples-to-apples comparison. If you look at the current list of activation functions in Flux, you will notice that they do not actually come from Flux at all but from another library called NNlib.

And this is where things get interesting: NNlib is a generic library containing activation functions. It was not made specifically for Flux. Here is an example of how the relu function has been defined in NNlib:

relu(x) = max(zero(x), x)
           

There is nothing Flux-specific about this code. It is just plain Julia code. In fact, it is so trivial you could have written it yourself. This is in significant contrast to TensorFlow activation functions, which must be part of the TensorFlow library. That is because they are nodes (or ops, as TF calls them) written in C++ which must adhere to a specific interface that TensorFlow expects. Otherwise these activation functions cannot be used in a TensorFlow graph.

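To show just how trivial it is, here is a sketch of defining your own activation function and dropping it straight into a Flux layer (the name swishish and the layer sizes are made up for illustration):

using Flux

# A custom activation is just a plain Julia function.
swishish(x) = x * sigmoid(2x)

# It plugs into a Flux layer the same way any built-in activation does.
model = Chain(Dense(10, 5, swishish), Dense(5, 2), softmax)
model(rand(Float32, 10))   # returns a 2-element output vector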

That means that, for example, PyTorch activation functions have to be reimplemented to fit the interfaces PyTorch expects. The net effect is that in the Python world one often ends up with massive libraries, because you need to reimplement the same things over and over again.

In the Julia world, in contrast, the same activation functions can be implemented once in one tiny library such as NNlib and reused in any number of Julia machine learning libraries, including Flux. The net effect of this is that Julia libraries tend to be very small. They don't have to be big, because a lot of the functionality you need for any given workflow comes from other Julia libraries. Julia libraries are extremely composable. That means you can replicate what Python does with one massive library by simply combining a bunch of small, well-defined libraries.

For instance, running on a GPU, using preallocated arrays, or running on a tensor processing unit are all handled by entirely separate libraries. None of it is built into Flux. These other libraries aren't even made specifically to give that functionality to Flux. They are generic libraries.

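For example, GPU support comes from the separate CUDA.jl package. A rough sketch of how the pieces compose, assuming a CUDA-capable GPU is available:

using Flux, CUDA

# The model is built with Flux; nothing here knows about the GPU yet.
model = Chain(Dense(10, 5, relu), Dense(5, 2))

# Flux's gpu helper moves the parameters over, backed by CUDA.jl arrays.
gpu_model = model |> gpu
gpu_model(cu(rand(Float32, 10)))   # forward pass runs on the GPU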

Thus you cannot compare Flux alone to TensorFlow. You have to compare TensorFlow to basically the whole Julia ecosystem of packages. This may give you some sense of why Julia is rapidly gaining such enthusiastic adherents. Highly composable micro-libraries are extremely powerful.

是以,您無法将Flux單獨與TensorFlow進行比較。 您必須将TensorFlow與基本上整個包裝的Julia生态系統進行比較。 這可能使您對Julia為何Swift獲得如此熱情的支援者有所了解。 高度可組合的微庫非常強大。

If you want a more concrete, hands-on example of how to use Flux, you can read one of my Flux introductions, where we do handwriting recognition with Flux.

And if how Flux works is still a mystery to you, then I advise you to read this article, where I go into more detail about how Flux is implemented.

Translated from: https://medium.com/@Jernfrost/flux-vs-tensorflow-misconceptions-2737a8b464fb
