Asynchronous Event Processing

2025-08-10 21:51:42 +02:00 · 2023-05-14 21:03:41 +03:00
parent 33abe495e6
commit 81ad7caad1
6 changed files with 264 additions and 5 deletions
--- a/src/en/clean-copy/01-Introduction/05.md
+++ b/src/en/clean-copy/01-Introduction/05.md
@@ -1,4 +1,4 @@
-### [The API-first approach][intro-api-first-approach]
+### [The API-First Approach][intro-api-first-approach]

 Today, more and more IT companies are recognizing the importance of the “API-first” approach, which is the paradigm of developing software with a heavy focus on APIs.

--- a/src/en/clean-copy/03-[Work
+++ b/src/en/clean-copy/03-[Work
@@ -1 +1,134 @@
-### Organization of Notification Systems
+#### [Multiplexing Notifications. Asynchronous Event Processing][api-patterns-async-event-processing]
+
+One of the vexing restrictions of almost every technology mentioned in the previous chapter is the limited size of messages. Client push notifications are the most problematic case: at the time of writing, Google Firebase Messaging allowed no more than 4000 bytes of payload. In backend development, the restrictions are also notable; for example, Amazon SQS limits the size of messages to 256 KiB. While developing *webhook*-based integrations, you risk hitting the maximum body size allowed by the partner's web server (for example, in nginx the default value is 1 MB). This leads us to the necessity of making two technical decisions regarding the notification formats:
+  * Whether a message contains all the data needed to process it or merely notifies that a state change has happened
+  * If we choose the latter, whether a single notification contains data on a single change or might bear several such events.
+
+Let's consider the example of our coffee API:
+
+```
+// Option #1: the message
+// contains all the order data
+POST /partner/webhook
+Host: partners.host
+{
+  "event_id",
+  "occurred_at",
+  "order": {
+    "id",
+    "status",
+    "recipe_id",
+    "volume",
+    // Other data fields
+    …
+  }
+}
+```
+```
+// Option #2: the message body
+// contains only the notification
+// of the status change
+POST /partner/webhook
+Host: partners.host
+{
+  "event_id",
+  // Message type: a notification
+  // about a new order
+  "event_type": "new_order",
+  "occurred_at",
+  // Data sufficient to 
+  // retrieve the full state,
+  // in our case, the order identifier
+  "order_id"
+}
+// To process the event, the partner
+// must request some endpoint
+// on the API vendor's side,
+// possibly asynchronously
+GET /v1/orders/{id}
+→
+{ /* full data regarding
+     the order */ }
+```
+```
+// Option #3: the API vendor
+// notifies partners that
+// several orders await their
+// reaction
+POST /partner/webhook
+Host: partners.host
+{
+  // The system state revision
+  // and/or a cursor to retrieve
+  // the orders might be provided
+  "occurred_at",
+  "pending_order_count":
+    <the number of pending orders>
+}
+// In response to such a call,
+// partners should retrieve the list
+// of ongoing orders
+GET /v1/orders/pending
+→
+{
+  "orders",
+  "cursor"
+}
+```
+
+Which option to select depends on the subject area (and on the allowed message sizes in particular) and on how partners handle messages. In our case, every order must be processed independently and the number of messages during the order life cycle is low, so our natural choice would be either option \#1 (if order data cannot contain unpredictably large fields) or \#2. Option \#3 is viable if:
+  * The API generates a lot of notifications for a single logical entity
+  * Partners are interested in fresh state changes only
+  * Or events must be processed sequentially, and no parallelism is allowed.
+
+**NB**: approach \#3 (and partly \#2) naturally leads us to the scheme that is typical for client-server integration: the push message itself contains almost no data and is only a trigger for out-of-turn polling.
+
+The technique of sending only essential data in the notification has one important disadvantage, apart from more complicated data flows and an increased request rate. With option \#1 implemented (i.e., the message contains all the data), we might assume that a success response from the subscriber means the partner has successfully processed the state change (although this is not guaranteed if the partner uses asynchronous techniques). With options \#2 and \#3, this is certainly not the case: the partner must carry out additional actions (starting with retrieving the actual order state) to fully process the message. This implies that two separate statuses might be needed: “message received” and “message processed.” Ideally, the latter should follow the logic of the API work cycle, i.e., the partner should carry out some action upon processing the event, and this action might be treated as the “message processed” signal. In our coffee example, we can expect that the partner will either accept or reject an order after receiving the “new order” message. Then the full message processing flow will look like this:
+
+```
+// The API vendor
+// notifies the partner that
+// several orders await their
+// reaction
+POST /partner/webhook
+Host: partners.host
+{
+  "occurred_at",
+  "pending_order_count":
+    <the number of pending orders>
+}
+```
+```
+// In response, the partner
+// retrieves the list of
+// pending orders
+GET /v1/orders/pending
+→
+{
+  "orders",
+  "cursor"
+}
+```
+```
+// After the orders are processed,
+// the partner notifies us about this
+// by calling a specific API
+// endpoint
+POST /v1/orders/bulk-status-change
+{
+  "status_changes": [{
+    "order_id",
+    "new_status": "accepted",
+    // Other relevant information
+    // e.g. the preparation time
+    // estimates
+    …
+  }, {
+    "order_id",
+    "new_status": "rejected",
+    "reason"
+  }, …]
+}
+```
+
+If there is no genuine follow-up call expected during our API work cycle, we can introduce an endpoint to explicitly mark notifications as processed. This step is not mandatory as we can always stipulate that it is the partner's responsibility to process notifications and we do not expect any confirmations. However, we will lose an important monitoring tool if we do so, as we can no longer track what's happening on the partner's side, i.e., whether the partner is able to process notifications on time. This, in turn, will make it harder to develop the degradation and emergency shutdown mechanisms we talked about in the previous chapter.
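The split between "message received" and "message processed" described in the English chapter above can be sketched from the API vendor's side as follows. This is a minimal illustration in Python, not code from the book: `NotificationLog`, its method names, and the in-memory storage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class Notification:
    event_id: str
    partner_id: str
    sent_at: datetime
    # Set when the partner's webhook returns a success response
    received_at: Optional[datetime] = None
    # Set when the partner performs the follow-up action
    # (e.g. accepts or rejects the order) or explicitly confirms processing
    processed_at: Optional[datetime] = None


@dataclass
class NotificationLog:
    """Hypothetical in-memory log; a real system would persist this."""
    items: Dict[str, Notification] = field(default_factory=dict)

    def sent(self, event_id: str, partner_id: str, at: datetime) -> None:
        self.items[event_id] = Notification(event_id, partner_id, at)

    def mark_received(self, event_id: str, at: datetime) -> None:
        self.items[event_id].received_at = at

    def mark_processed(self, event_id: str, at: datetime) -> None:
        self.items[event_id].processed_at = at

    def overdue(self, partner_id: str, now: datetime,
                max_delay: timedelta) -> List[str]:
        """Ids of events delivered to the partner but still
        unprocessed more than max_delay after delivery."""
        return sorted(
            n.event_id for n in self.items.values()
            if n.partner_id == partner_id
            and n.received_at is not None
            and n.processed_at is None
            and now - n.received_at > max_delay
        )


# Example: two notifications delivered, only one processed
t0 = datetime(2023, 5, 14, 12, 0, tzinfo=timezone.utc)
log = NotificationLog()
for event_id in ("evt-1", "evt-2"):
    log.sent(event_id, "partner-a", t0)
    log.mark_received(event_id, t0 + timedelta(seconds=1))
log.mark_processed("evt-1", t0 + timedelta(seconds=30))
lagging = log.overdue("partner-a", t0 + timedelta(minutes=10),
                      timedelta(minutes=5))
```

A growing `overdue` list for a partner is the kind of monitoring signal the paragraph refers to. Without the "processed" status, the vendor would only see that deliveries succeeded.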
--- a/src/ru/clean-copy/03-[В
+++ b/src/ru/clean-copy/03-[В
@@ -1 +1,127 @@
-### Варианты организации системы нотификаций
+### [Мультиплексирование сообщений. Асинхронная обработка событий][api-patterns-async-event-processing]
+
+Одно из неприятных ограничений почти всех перечисленных в предыдущей главе технологий — это относительно невысокий размер сообщения. Наиболее проблематичная ситуация с push-уведомлениями: Google Firebase Messaging на момент написания настоящей главы разрешал сообщения не более 4000 байт. Но и в серверной разработке ограничения заметны: например, Amazon SQS лимитирует размер сообщения 256 килобайтами. При разработке webhook-ов вы рискуете быстро упереться в размеры тел сообщений, выставленных на веб-серверах партнёров (например, в nginx по умолчанию разрешены тела запросов не более одного мегабайта). Это приводит нас к необходимости сделать два технических выбора:
+  * содержит ли тело сообщения все данные, необходимые для его обработки, или только уведомляет о факте изменения состояния;
+  * если второе, то содержит ли один вызов извещение об одном изменении, или может уведомлять сразу о нескольких таких событиях.
+
+Рассмотрим на примере нашего кофейного API:
+
+```
+// Вариант 1: тело сообщения
+// содержит все данные о заказе
+POST /partner/webhook
+Host: partners.host
+{
+  "event_id",
+  "occurred_at",
+  "order": {
+    "id",
+    "status",
+    "recipe_id",
+    "volume",
+    // Все прочие детали заказа
+    …
+  }
+}
+```
+```
+// Вариант 2: тело сообщения
+// содержит только информацию
+// о самом событии
+POST /partner/webhook
+Host: partners.host
+{
+  "event_id",
+  // Тип сообщения: нотификация
+  // о появлении нового заказа
+  "event_type": "new_order",
+  "occurred_at",
+  // Все поля данных, необходимые
+  // для обращения за полным
+  // состоянием. В нашем случае —
+  // идентификатор заказа
+  "order_id"
+}
+// При обработке сообщения,
+// возможно, отложенной,
+// партнёр должен обратиться
+// к нашему API
+GET /v1/orders/{id}
+→
+{ /* все детали заказа */ }
+```
+```
+// Вариант 3: мы уведомляем
+// партнёра, что его реакции
+// ожидают три новых заказа
+POST /partner/webhook
+Host: partners.host
+{
+  // Здесь может быть версия
+  // состояния системы или курсор
+  "occurred_at",
+  "pending_order_count":
+    <число новых заказов>
+}
+// В ответ партнёр должен вызвать
+// эндпойнт получения списка заказов
+GET /v1/orders/pending
+→
+{
+  "orders",
+  "cursor"
+}
+```
+
+Выбор подходящей модели зависит от предметной области (в частности, допустимых размеров тел сообщений) и того, каким образом партнёр будет обрабатывать сообщение. В нашем конкретном случае, когда партнёр должен каждый новый заказ обработать отдельно, при этом на один заказ не может приходить больше одного-двух уведомлений, естественным выбором является вариант 1 (если тело запроса не содержит никаких непредсказуемо больших данных) или 2. Третий подход будет естественным выбором, если:
+  * API генерирует большое число сообщений об изменениях состояния на одну логическую сущность;
+  * партнёров интересуют только наиболее свежие изменения;
+  * или обработка событий требует последовательного исполнения и не подразумевает параллельности.
+
+**NB**: третий (и отчасти второй) варианты естественным образом приводят нас к схеме, характерной для клиентских устройств: push-уведомление само по себе почти не содержит полезной информации и только является сигналом для внеочередного поллинга.
+
+Применение техник с отправкой только ограниченного набора данных помимо усложнения схемы взаимодействия и увеличения количества запросов имеет ещё один важный недостаток. Если в варианте 1 (сообщение содержит в себе все релевантные данные) мы можем рассчитывать на то, что возврат кода успеха подписчиком эквивалентен успешной обработке сообщения партнёром (что, вообще говоря, тоже не гарантировано, т.к. партнёр может использовать асинхронные схемы), то для вариантов 2 и 3 это заведомо не так: для обработки сообщений партнёр должен выполнить дополнительные действия, начиная с получения нужных данных о заказе. В этом случае нам необходимо иметь раздельные статусы — сообщение доставлено и сообщение обработано; в идеале, второе должно вытекать из логики работы API, т.е. сигналом о том, что сообщение обработано, является какое-то действие, совершаемое партнёром. В нашем кофейном примере это может быть перевод заказа партнёром из статуса `"new"` (заказ создан пользователем) в статус `"accepted"` или `"rejected"` (кофейня партнёра приняла или отклонила заказ). Тогда полный цикл обработки уведомления будет выглядеть так:
+
+```
+// Уведомляем партнёра о том,
+// что его реакции
+// ожидают три новых заказа
+POST /partner/webhook
+Host: partners.host
+{
+  "occurred_at",
+  "pending_order_count":
+    <число новых заказов>
+}
+```
+```
+// В ответ партнёр вызывает
+// эндпойнт получения списка заказов
+GET /v1/orders/pending
+→
+{
+  "orders",
+  "cursor"
+}
+```
+```
+// После того, как заказы обработаны,
+// партнёр уведомляет нас об
+// изменениях статуса
+POST /v1/orders/bulk-status-change
+{
+  "status_changes": [{
+    "order_id",
+    "new_status": "accepted",
+    // Иная релевантная информация,
+    // например, время готовности
+    …
+  }, {
+    "order_id",
+    "new_status": "rejected",
+    "reason"
+  }, …]
+}
+```
+
+Если такого нативного способа оповестить об успешной обработке события схема работы нашего API не предполагает, мы можем ввести эндпойнт, который явно помечает сообщения прочитанными. Этот шаг, вообще говоря, необязательный (мы можем просто договориться о том, что это ответственность партнёра обрабатывать события и мы не ждём от него никаких подтверждений), но это лишает нас полезного инструмента мониторинга — что происходит на стороне партнёра, успевает ли он обрабатывать события — что в свою очередь затрудняет разработку упомянутых в предыдущей главе механизмов деградации и аварийного отключения интеграции.
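The choice between options 1 and 2 discussed in this chapter can be illustrated with a size-based fallback: send the full order when the serialized body fits the transport limit, and send only the order identifier otherwise. The sketch below is hypothetical Python, not part of the book: `build_notification` and its `limit` parameter are invented for the example, the 4000-byte figure is simply the one quoted in the text, and the function assumes it only handles "new order" events.

```python
import json
from typing import Any, Dict

# Assumption: the limit quoted in the chapter for push payloads
PAYLOAD_LIMIT_BYTES = 4000


def build_notification(event_id: str, occurred_at: str,
                       order: Dict[str, Any],
                       limit: int = PAYLOAD_LIMIT_BYTES) -> Dict[str, Any]:
    """Return an option-1 body if it fits the limit,
    otherwise an option-2 body carrying only the order id."""
    full = {
        "event_id": event_id,
        "occurred_at": occurred_at,
        "order": order,
    }
    # Measure the size of the body as it would travel over the wire
    size = len(json.dumps(full, ensure_ascii=False).encode("utf-8"))
    if size <= limit:
        return full
    # Too large: the partner has to fetch the order state separately
    return {
        "event_id": event_id,
        "event_type": "new_order",
        "occurred_at": occurred_at,
        "order_id": order["id"],
    }
```

Mixing both body shapes on one channel forces partners to handle both. A real API would more likely pick one option per event type, as the chapter suggests.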
--- a/src/ru/drafts/03-Раздел
+++ b/src/ru/drafts/03-Раздел