Distributed Phoenix Chat with PubSub PG2 adapter

In this article we’ll see how to cluster the Phoenix Chat nodes using a powerful feature built into the BEAM (the Elixir/Erlang VM) that lets Elixir nodes communicate easily with each other. We’ll then see how <code>pg2</code> works and inspect how Phoenix efficiently broadcasts messages in a distributed chat app.

We previously saw, in Distributed Phoenix Chat using Redis PubSub, how to distribute multiple Phoenix Chat nodes and broadcast the messages using Redis. It worked well and it’s really easy to set up, especially in a Kubernetes cluster. Each Chat node just needs to know the Redis server’s internal DNS name and port to connect to.

This approach is easy but has some drawbacks:

The Redis server acts as a single point of failure: if Redis goes down, the whole service goes down, because the nodes have no other way to broadcast messages to clients connected to other nodes.

Single point of failure

We then also need to maintain a Redis server, or a new cluster of Redis servers. With Docker and Kubernetes it’s really easy to spawn new services in the cluster, but we need to keep in mind that maintaining a new server in production doesn’t come for free, especially under heavy load.

Distributed Phoenix

First, we need to fully connect each node to the other nodes, using the communication protocol embedded in the Erlang VM. I’ve briefly shown in Running Elixir in Docker Containers how to connect multiple Elixir nodes using Docker.

Let’s quickly see how to manually connect two Elixir nodes using <code>iex</code> in two separate terminals. We need to start the two <code>iex</code> sessions, setting the node name and IP with the <code>--name</code> option.

Connecting two Elixir nodes

Using the function <code>Node.connect/1</code> we’ve created a cluster made of two nodes. Once the nodes are connected we can start sending messages to remote processes, just as we would on a single node.
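
The connection step can be sketched roughly as follows. The node names <code>a@127.0.0.1</code> and <code>b@127.0.0.1</code> are assumptions made for illustration, and both sessions are assumed to share the same Erlang cookie, which is the default when they run under the same user on one machine.

```elixir
# Assumed setup, one command per terminal:
#   iex --name a@127.0.0.1
#   iex --name b@127.0.0.1

# Evaluated in the session of node a.
# Node.connect/1 returns true on success, false when the target cannot be
# reached, and :ignored when the local node was started without a name.
result = Node.connect(:"b@127.0.0.1")
IO.inspect(result, label: "connected?")

# Node.list/0 returns the nodes this node is currently connected to.
IO.inspect(Node.list(), label: "visible nodes")
```

Connecting from one side is enough: the connection is bidirectional, so the other node also lists the first one in <code>Node.list/0</code>.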

On one node we start an <code>Agent</code> process, registering it under the <code>GlobalAgent</code> name in the global registry. The other node then sends a message to <code>GlobalAgent</code>, which runs on the first node, and gets its state.
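
A minimal sketch of that interaction is shown here. It runs on a single node too, since the global registry works the same way with or without a cluster. The initial state and the update function are arbitrary choices for this example, not taken from the original session.

```elixir
# Start an Agent and register it under a cluster-wide name.
# The initial state (0) is an arbitrary value chosen for this sketch.
{:ok, pid} = Agent.start_link(fn -> 0 end, name: {:global, GlobalAgent})

# From any connected node the same name resolves to that process, so the
# calls below look identical whether the Agent is local or remote.
:ok = Agent.update({:global, GlobalAgent}, fn count -> count + 1 end)
state = Agent.get({:global, GlobalAgent}, fn count -> count end)

IO.inspect(state, label: "GlobalAgent state")
IO.inspect(:global.whereis_name(GlobalAgent) == pid, label: "name resolves to the Agent")
```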

Sending messages to a remote Agent process

We can easily configure Phoenix to leverage this powerful functionality to broadcast messages to remote nodes.

Before configuring our Phoenix Chat with the PG2 PubSub adapter, let’s dig a bit into understanding what PG2 is and how it works.

<code>pg2</code> is an Erlang module which implements process grouping. Process groups can be useful when we need to group processes distributed over multiple nodes, so we can easily monitor and message them.

This module implements process groups. Each message can be sent to one, some, or all group members. http://erlang.org/doc/man/pg2.html

Let’s see in practice how PG2 works, starting three different Elixir nodes: <code>a</code>, <code>b</code> and <code>c</code>.

In the <code>c</code> node, we form the cluster by connecting <code>c</code> to the other two nodes. We then create the <code>:agents_group</code> process group with the <code>:pg2.create/1</code> function.

Creation of a distributed process group

Each node runs a local <code>pg2</code> process, which monitors the processes in the group and holds their PIDs in the local <code>:pg2_table</code> ETS table.

Without going deeper into the pg2 implementation itself, let’s start an agent on each node and add it to the <code>:agents_group</code> we have just created.

Agent processes join a pg2 group

pg2 monitors the processes in the group

We start an <code>Agent</code> process in each node, each one holding its own state. We add them to the <code>:agents_group</code> with the function <code>:pg2.join(:agents_group, agent_pid)</code>.
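
Creating the group and joining the agents could look like the sketch below. It assumes OTP 23 or earlier, because <code>pg2</code> was removed in OTP 24 in favour of <code>pg</code>. The three agents share one VM here to keep the example self-contained, and their states are made-up values.

```elixir
# :pg2 ships with OTP 23 and earlier only, so check before using it.
states =
  if Code.ensure_loaded?(:pg2) do
    # Creating a group that already exists is harmless and still returns :ok.
    :ok = :pg2.create(:agents_group)

    # Three agents with different state. In the article each one lives on
    # its own node; here they run in a single VM.
    for state <- [:state_a, :state_b, :state_c] do
      {:ok, pid} = Agent.start_link(fn -> state end)
      :ok = :pg2.join(:agents_group, pid)
    end

    # get_members/1 returns the PIDs of every member across the cluster.
    :agents_group
    |> :pg2.get_members()
    |> Enum.map(&Agent.get(&1, fn state -> state end))
  else
    :pg2_not_available
  end

IO.inspect(states, label: "member states")
```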

Once the agents are added to the group, pg2 starts to monitor them. If a process exits, it is immediately removed from the group. We’ve seen that it’s quite easy to make multiple processes part of a group, but how can we send a message to the group’s members?

Broadcasting a message

The <code>pg2</code> module doesn’t offer a <code>broadcast</code> or a <code>send</code> function to send a message to all the members. We need to enumerate the PIDs returned by the <code>:pg2.get_members(:agents_group)</code> function and send them the message one by one. This actually gives us the freedom to send a message to just a subset of the group’s members. We’ll see later how this freedom comes in handy.
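
A broadcast helper built on top of <code>:pg2.get_members/1</code> might look like this. The module and function names are hypothetical, and the code needs OTP 23 or earlier, where <code>pg2</code> is still available.

```elixir
defmodule GroupBroadcast do
  # Hypothetical helper, not part of pg2 itself.
  # Sends `message` to every member of `group` accepted by `filter` and
  # returns {:ok, number_of_recipients}. An unknown group yields the
  # {:error, {:no_such_group, group}} tuple produced by pg2.
  def broadcast(group, message, filter \\ fn _pid -> true end) do
    case :pg2.get_members(group) do
      pids when is_list(pids) ->
        targets = Enum.filter(pids, filter)
        Enum.each(targets, &send(&1, message))
        {:ok, length(targets)}

      {:error, _reason} = error ->
        error
    end
  end

  # Filter that keeps only the members running on other nodes.
  def remote?(pid), do: node(pid) != node()
end
```

<code>GroupBroadcast.broadcast(group, message)</code> reaches every member, while <code>GroupBroadcast.broadcast(group, message, &GroupBroadcast.remote?/1)</code> skips the members running on the calling node.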

<code>pg2</code> monitors the processes that have joined the group. When we stop one agent, we see that the process is immediately removed from the group.
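
The removal can be observed with a sketch like the one below, again assuming OTP 23 or earlier. The short sleep is an assumption of this example: pg2 learns about the exit through a monitor message, so we give it a moment to process it before checking the group again.

```elixir
result =
  if Code.ensure_loaded?(:pg2) do
    :ok = :pg2.create(:agents_group)
    {:ok, pid} = Agent.start(fn -> :some_state end)
    :ok = :pg2.join(:agents_group, pid)
    member_before = pid in :pg2.get_members(:agents_group)

    # Stop the agent, then let pg2 handle the resulting :DOWN message.
    :ok = Agent.stop(pid)
    Process.sleep(100)
    member_after = pid in :pg2.get_members(:agents_group)

    {member_before, member_after}
  else
    :pg2_not_available
  end

IO.inspect(result, label: "{member before stop, member after stop}")
```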

Now I’m going to use the code in the poeticoding/phoenix_chat_example GitHub repository, on the <code>pubsub_pg2</code> branch.

When we create a new Phoenix app, it comes with a PubSub PG2 adapter configured by default. 

So, coming from the previous version in the <code>pubsub_redis</code> branch, we just need to change the pubsub configuration in the <code>config/config.exs</code> file. Let’s start two <code>iex</code> nodes, each running a chat server, one on port <code>4000</code> and the other on port <code>4001</code>.
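
For a Phoenix 1.4 application, the change amounts to something like the fragment below. <code>Chat.PubSub</code> is the PubSub server name used in this article. The OTP app name <code>:chat</code> and the endpoint module <code>ChatWeb.Endpoint</code> are assumptions, so check them against the repository.

```elixir
# config/config.exs (fragment; assumes Phoenix 1.4 with phoenix_pubsub 1.x)
use Mix.Config

# :chat and ChatWeb.Endpoint are assumed names for this sketch.
config :chat, ChatWeb.Endpoint,
  pubsub: [name: Chat.PubSub, adapter: Phoenix.PubSub.PG2]
```

Unlike the Redis adapter, the PG2 adapter needs no host or port settings here, because it relies on the connections between the Erlang nodes themselves.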

If we try to connect a browser to <code>4000</code> and another browser to <code>4001</code>, we see that the messages are not propagated. The two nodes are not connected yet, so we need to cluster them.

Once the nodes are connected, we see that the messages are correctly broadcast from one browser to the other. It works and we don’t need any other configuration. I find it interesting to hack around a bit though, inspecting the <code>Phoenix.PubSub.PG2</code> adapter to understand how it works under the hood.

Each Phoenix node starts its own local <code>PubSub.PG2Server</code> and registers it in a <code>pg2</code> group named <code>{:phx, Chat.PubSub}</code>.

The important thing to see here is that the members of the pg2 group are the PIDs of the <code>PubSub.PG2Server</code> running on each node. If we spawned another Phoenix node and connected it to the cluster, we would see its <code>PubSub.PG2Server</code> PID as a third member.

The members are not the users’ connection processes. That would be highly inefficient, both because of how pg2 is built and because a single node would have to send every message to each user process across multiple nodes.

How Phoenix uses PubSub with pg2

Let’s see instead how Phoenix handles a broadcast over multiple nodes.

We connect a browser to the HTTP server on the <code>b</code> node, port <code>4001</code>.

We send a message to the chat room. This message is sent to the node <code>b</code>, via the WebSocket connection. The <code>PubSub.PG2Server</code>, running locally in the node, broadcasts the message to all the browsers connected to the same node.

The <code>PubSub.PG2Server</code> in <code>b</code> then forwards the message to the remote <code>PubSub.PG2Server</code> running in <code>a</code>.

The <code>PubSub.PG2Server</code> in the <code>a</code> node then broadcasts the message to all the browsers connected to that node.

In this way the message is replicated over the cluster network just one time!
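
To make the two-level fan-out concrete, here is a toy model of it. It is not Phoenix’s implementation: <code>FanoutServer</code>, its functions, and the <code>{:forward, msg}</code> and <code>{:chat, msg}</code> message shapes are all invented for this sketch. Each server stands in for the PubSub server of one node, and the calling process plays the role of the browsers.

```elixir
defmodule FanoutServer do
  use GenServer

  # `subscribers` are the PIDs of the clients connected to this "node".
  def start_link(subscribers), do: GenServer.start_link(__MODULE__, subscribers)

  # `peers` are the servers standing in for the other nodes.
  def set_peers(server, peers), do: GenServer.call(server, {:set_peers, peers})

  # Entry point for a message published by a client of this node.
  def broadcast(server, msg), do: GenServer.cast(server, {:broadcast, msg})

  @impl true
  def init(subscribers), do: {:ok, %{subscribers: subscribers, peers: []}}

  @impl true
  def handle_call({:set_peers, peers}, _from, state) do
    {:reply, :ok, %{state | peers: peers}}
  end

  @impl true
  def handle_cast({:broadcast, msg}, state) do
    deliver_locally(state, msg)
    # One message per remote server, however many subscribers it has.
    Enum.each(state.peers, &send(&1, {:forward, msg}))
    {:noreply, state}
  end

  @impl true
  def handle_info({:forward, msg}, state) do
    # Forwarded messages are delivered locally and never forwarded again.
    deliver_locally(state, msg)
    {:noreply, state}
  end

  defp deliver_locally(state, msg) do
    Enum.each(state.subscribers, &send(&1, {:chat, msg}))
  end
end

# Node a serves two browsers and node b serves one. This process plays all three.
me = self()
{:ok, node_a} = FanoutServer.start_link([me, me])
{:ok, node_b} = FanoutServer.start_link([me])
:ok = FanoutServer.set_peers(node_a, [node_b])
:ok = FanoutServer.set_peers(node_b, [node_a])

# A browser connected to b publishes a message.
FanoutServer.broadcast(node_b, "hello")
```

The message crosses from <code>node_b</code> to <code>node_a</code> once, and <code>node_a</code> then delivers it to both of its subscribers.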

Let’s try to manually send the broadcast message from the node <code>b</code> to the <code>PubSub.PG2Server</code> running in node <code>a</code>. The message looks like this.

Sending a message to PG2Server

First, we need to get the PID of the remote <code>PubSub.PG2Server</code>, which is part of the <code>{:phx, Chat.PubSub}</code> pg2 group.

With <code>:pg2.get_members</code> we get all the members of the group, which are the <code>PubSub.PG2Server</code> running locally in <code>b</code> and the remote one running in <code>a</code>. <code>:pg2.get_local_members</code> returns only the processes running locally, in this case in node <code>b</code>.
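
Since the remote members are simply all the members minus the local ones, a small helper can compute them. The module name is hypothetical, and the code assumes OTP 23 or earlier, where <code>pg2</code> exists.

```elixir
defmodule PubSubMembers do
  # Hypothetical helper. Returns {:ok, pids} with the group members running
  # on other nodes, or the {:error, {:no_such_group, group}} tuple that pg2
  # returns for an unknown group.
  def remote(group) do
    with all when is_list(all) <- :pg2.get_members(group),
         local when is_list(local) <- :pg2.get_local_members(group) do
      {:ok, all -- local}
    end
  end
end
```

On node <code>b</code> of a running cluster, <code>PubSubMembers.remote({:phx, Chat.PubSub})</code> would return the PID of the <code>PubSub.PG2Server</code> running in <code>a</code>.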

Let’s connect a browser to the node <code>a</code> http server (port <code>4000</code>) and see what happens when forwarding a message to the <code>PG2Server</code> running in <code>a</code>.

We see that the message is correctly broadcast, by the <code>PubSub.PG2Server</code> process, to the open connections.

We’ve seen how <code>pg2</code> works and how Phoenix efficiently handles messages in a distributed PubSub. So far, we’ve always connected the nodes manually, which is an issue when we want to deploy our app to production on a Kubernetes cluster. We’ll see in further articles how to use tools like <code>libcluster</code> to cluster nodes automatically and scale out easily, using Kubernetes DNS for node auto-discovery.
