Using standard AMQP, the only way to guarantee that a message isn't lost is by using transactions: make the channel transactional, publish the message, commit. In this case, transactions are unnecessarily heavyweight and decrease throughput by a factor of 250. To remedy this, a confirmation mechanism (publisher confirms) was introduced.
To enable confirms, a client sends the confirm.select method. Depending on whether no-wait was set, the broker may respond with a confirm.select-ok. Once the confirm.select method has been used on a channel, the channel is said to be in confirm mode. A transactional channel cannot be put into confirm mode, and once a channel is in confirm mode, it cannot be made transactional.
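The mode rules above can be sketched as a tiny state machine. This is a toy model only, not a real client API: the class name Channel, the exception ChannelModeError, and the string return values are assumptions made for illustration, and no network traffic is involved.

```python
class ChannelModeError(Exception):
    """Raised when a channel is asked to switch to an incompatible mode."""


class Channel:
    """Toy model of the confirm/transactional mode rules (hypothetical API)."""

    def __init__(self):
        self.mode = "normal"  # one of "normal", "confirm", "tx"

    def confirm_select(self, no_wait=False):
        # A transactional channel cannot be put into confirm mode.
        if self.mode == "tx":
            raise ChannelModeError("transactional channel cannot enter confirm mode")
        self.mode = "confirm"
        # With no-wait set the broker sends no reply; otherwise confirm.select-ok.
        return None if no_wait else "confirm.select-ok"

    def tx_select(self):
        # A channel in confirm mode cannot be made transactional.
        if self.mode == "confirm":
            raise ChannelModeError("confirm-mode channel cannot be made transactional")
        self.mode = "tx"
        return "tx.select-ok"
```

In this sketch, calling tx_select() on a channel that has already run confirm_select() raises ChannelModeError, and the reverse order fails in the same way.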
Once a channel is in confirm mode, both the broker and the client count messages (counting starts at 1 on the first confirm.select). (Translator's note: the client-side count has to be implemented by the client itself.) The broker then confirms messages as it handles them by sending a basic.ack on the same channel. (Translator's note: "handles" is easy to misread here; what counts as handling a message is covered further on.) The delivery-tag field contains the sequence number of the confirmed message. The broker may also set the multiple field in basic.ack to indicate that all messages up to and including the one with that sequence number have been handled.
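Since the client has to keep its own count, a minimal sketch of publisher-side bookkeeping may help. The class ConfirmTracker and its method names are hypothetical and not part of any client library; it only assumes what the text states, namely that sequence numbers start at 1 and that the multiple flag covers every message up to and including the given delivery-tag.

```python
class ConfirmTracker:
    """Publisher-side bookkeeping for a channel in confirm mode (illustrative)."""

    def __init__(self):
        self.next_seq = 1       # counting starts at 1 after confirm.select
        self.unconfirmed = {}   # sequence number -> message body

    def on_publish(self, body):
        """Record a published message and return its sequence number."""
        seq = self.next_seq
        self.unconfirmed[seq] = body
        self.next_seq += 1
        return seq

    def on_ack(self, delivery_tag, multiple=False):
        """Handle basic.ack; return the bodies that are now confirmed."""
        if multiple:
            # Everything up to and including delivery_tag is confirmed.
            tags = sorted(s for s in self.unconfirmed if s <= delivery_tag)
        else:
            tags = [delivery_tag] if delivery_tag in self.unconfirmed else []
        return [self.unconfirmed.pop(s) for s in tags]
```

For example, after publishing "a", "b" and "c", a call to on_ack(2, multiple=True) returns ["a", "b"] and leaves only sequence number 3 outstanding.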
In exceptional cases when the broker is unable to handle messages successfully, the broker will send a basic.nack instead of a basic.ack. In this context, the fields of the basic.nack have the same meaning as the corresponding ones in basic.ack, and the requeue field should be ignored. By nack'ing one or more messages, the broker indicates that it was unable to process the messages and refuses responsibility for them; at that point, the client may choose to re-publish the messages.
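As a sketch of how a publisher might act on this, the function below settles outstanding messages on either basic.ack or basic.nack and returns the ones the client may want to re-publish. The function name handle_confirmation and the pending dictionary are assumptions made for illustration; the requeue field is deliberately not a parameter because the text says it should be ignored.

```python
def handle_confirmation(method, delivery_tag, multiple, pending):
    """Settle outstanding messages and return those eligible for re-publishing.

    method  -- "basic.ack" or "basic.nack"
    pending -- dict of sequence number -> message body, mutated in place
    """
    if multiple:
        # Same meaning as in basic.ack: everything up to and including the tag.
        tags = sorted(s for s in pending if s <= delivery_tag)
    else:
        tags = [delivery_tag] if delivery_tag in pending else []
    settled = [pending.pop(s) for s in tags]
    # The broker refused responsibility for nack'd messages, so hand them back.
    return settled if method == "basic.nack" else []
```

Whether to actually re-publish the returned messages remains the client's decision.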
After a channel is put into confirm mode, all subsequently published messages will each be confirmed (that is, ack'd) or nack'd once. No guarantees are made as to how soon a message is confirmed. No message will be both confirmed and nack'd.
basic.nack will only be delivered if an internal error occurs in the Erlang process responsible for a queue.
The broker will confirm messages once:
it decides a message will not be routed to queues (if the mandatory flag is set, the basic.return is sent first), or
a transient message has reached all its queues (and mirrors), or
a persistent message has reached all its queues (and mirrors) and been persisted to disk (and fsynced), or
a persistent message has been consumed (and if necessary acknowledged) from all its queues.
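The four conditions above can be written as a single predicate. This is a literal reading of the list, not the broker's actual implementation; QueueState, its fields, and broker_may_confirm are hypothetical names, and an empty list of queues stands in for a message that cannot be routed.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Per-queue view of one message (illustrative model only)."""
    enqueued: bool = False  # reached the queue and its mirrors
    synced: bool = False    # persisted to disk and fsynced
    consumed: bool = False  # consumed, and acknowledged if necessary


def broker_may_confirm(persistent, queues):
    """Return True when the listed conditions allow a basic.ack to be sent."""
    if not queues:
        return True  # the message will not be routed to any queue
    if not persistent:
        return all(q.enqueued for q in queues)
    on_disk = all(q.enqueued and q.synced for q in queues)
    consumed = all(q.consumed for q in queues)
    return on_disk or consumed
```

Under this reading, a persistent message that has reached its queue but has been neither fsynced nor consumed is not yet confirmed.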
The broker loses persistent messages if it crashes before said messages are written to disk. Under certain conditions, this causes the broker to behave in surprising ways.
For instance, consider this scenario:
a client publishes a persistent message to a durable queue,
a client consumes the message from the queue (noting that the message is persistent and the queue durable), but doesn't yet ack it,
the broker dies and is restarted, and
the client reconnects and starts consuming messages.
At this point, the client could reasonably assume that the message will be delivered again. This is not the case: the restart has caused the broker to lose the message. In order to guarantee persistence, a client should use confirms. If the publisher's channel had been in confirm mode, the publisher would not have received an ack for the lost message (since the consumer hadn't ack'd it and it hadn't been written to disk yet).
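The scenario can be replayed with a toy in-memory model. ToyBroker and its methods are invented for illustration and greatly simplify real broker behaviour; the only assumptions are the ones stated in the text, namely that unsynced messages are lost on restart and that a persistent message is confirmed only once it has been written to disk.

```python
class ToyBroker:
    """In-memory stand-in for a broker (illustrative, not a real API)."""

    def __init__(self):
        self.memory = []     # accepted but not yet written to disk
        self.disk = []       # survives a restart
        self.confirmed = []  # messages for which basic.ack went to the publisher

    def publish(self, msg):
        self.memory.append(msg)

    def fsync(self):
        # Writing to disk is what allows the confirm to be sent.
        self.disk.extend(self.memory)
        self.confirmed.extend(self.memory)
        self.memory.clear()

    def crash_and_restart(self):
        self.memory.clear()  # anything not on disk is gone

    def deliverable(self):
        return self.disk + self.memory


broker = ToyBroker()
broker.publish("m1")        # persistent message, durable queue
# A consumer receives "m1" here but does not ack it.
broker.crash_and_restart()  # the broker dies before the fsync
lost = "m1" not in broker.deliverable()
acked = "m1" in broker.confirmed
```

Here lost is True and acked is False: the message is gone, but a publisher in confirm mode never received an ack for it and so knows it may need to re-publish.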