additional considerations

2025-05-31 22:09:37 +02:00 · 2023-06-26 22:40:18 +03:00 · 2023-06-26 22:40:18 +03:00 · 7fe889c18e
commit 7fe889c18e
parent 4c26248557
2 changed files with 48 additions and 20 deletions
--- a/src/en/clean-copy/05-[Work
+++ b/src/en/clean-copy/05-[Work
@ -2,7 +2,7 @@

 As we noted on several occasions in the previous chapters, neither the HTTP and URL standards nor REST architectural principles prescribe concrete semantics for the meaningful parts of a URL (notably, path fragments and key-value pairs in the query). **The rules for organizing URLs in an HTTP API exist *only* to improve the API's readability and consistency from the developers' perspective**. However, this doesn't mean they are unimportant. Quite the opposite: URLs in HTTP APIs are a means of describing abstraction levels and entities' responsibility areas. A well-designed API hierarchy should be reflected in a well-designed URL nomenclature.

-**NB**: the lack of specific guidance from the specification editors naturally led to developers inventing it themselves. Many of these spontaneous practices can be found on the Internet, such as the requirement to use only nouns in URLs. They are often claimed to be a part of the standards or REST architectural principles (which they are not). Nevertheless, deliberately ignoring such self-proclaimed “best practices” is not the best decision for an API vendor as it increases the chances of being misunderstood.
+**NB**: the lack of specific guidance from the specification editors naturally led to developers inventing it themselves. Many of these spontaneous practices can be found on the Internet, such as the requirement to use only nouns in URLs. They are often claimed to be a part of the standards or REST architectural principles (which they are not). Nevertheless, deliberately ignoring such self-proclaimed “best practices” is a rather risky decision for an API vendor as it increases the chances of being misunderstood.

 Traditionally, the following semantics are considered to be the default:
  * Path components (i.e., fragments between `/` symbols) are used to organize nested resources, such as `/partner/{id}/coffee-machines/{id}`. A path can be further extended by adding new suffixes to indicate subordinate sub-resources.
@ -10,7 +10,20 @@ Traditionally, the following semantics are considered to be the default:

 This convention allows for reflecting almost any API's entity nomenclature decently and it is more than reasonable to follow it (and it's unreasonable to defiantly neglect it). However, this indistinctly defined logic inevitably leads to numerous variants of interpreting it:
  
-  1. How exactly should the endpoints connecting two entities lacking a clear relation between them be organized? For example, how should a URL for preparing a lungo on a specific coffee machine look?
+  1. Where do metadata of operations end and “regular” data begin, and how acceptable is it to duplicate fields in both places? For example, one common practice in HTTP APIs is to indicate the type of returned data by adding an “extension” to the URL, similar to file names in file systems (i.e., when accessing the resource `/v1/orders/{id}.xml`, the response will be in XML format, and when accessing `/v1/orders/{id}.json`, it will be in JSON format). On the one hand, `Accept*` headers are intended to determine the data format. On the other hand, code readability is clearly improved by introducing a “format” parameter directly in the URL.
+
+  2. As a consequence of the previous point, how should the version of the API itself be correctly indicated in an HTTP API? At least three options can be suggested, each of which fully complies with the letter of the standard:
+      * A path parameter: `/v1/orders/{id}`
+      * A query parameter: `/orders/{id}?version=1`
+      * A header:
+          ```
+          GET /orders/{id} HTTP/1.1
+          X-OurCoffeeApi-Version: 1
+          ```
+          
+      Even more exotic options can be added here, such as specifying the schema in a customized media type or request protocol.
+
+  3. How exactly should the endpoints connecting two entities lacking a clear relation between them be organized? For example, how should a URL for preparing a lungo on a specific coffee machine look?
      * `/coffee-machines/{id}/recipes/lungo/prepare`
      * `/recipes/lungo/coffee-machines/{id}/prepare`
      * `/coffee-machines/{id}/prepare?recipe=lungo`
@ -20,9 +33,9 @@ This convention allows for reflecting almost any API's entity nomenclature decen

      All these options are semantically viable and generally speaking equitable.

-  2. How strictly should the literal interpretation of the `VERB /resource` construct be enforced? If we agree to follow the “only nouns in the URLs” rule (logically, a verb cannot be applied to a verb, right?) then we should use `preparer` or `preparator` in the examples above (and the `/action=prepare&coffee_machine_id=<id>&recipe=lungo` is unacceptable at all as there is no object to act upon). However, this adds visual noise in the form of “ator” suffixes but definitely doesn't make the code more concise or readable.
+  4. How strictly should the literal interpretation of the `VERB /resource` construct be enforced? If we agree to follow the “only nouns in the URLs” rule (logically, a verb cannot be applied to a verb, right?) then we should use `preparer` or `preparator` in the examples above (and the `/action=prepare&coffee_machine_id=<id>&recipe=lungo` is unacceptable at all as there is no object to act upon). However, this adds visual noise in the form of “ator” suffixes but definitely doesn't make the code more concise or readable.

-  3. If the call signature implies that the operation is by default unsafe or non-idempotent, does it mean that the operation *must* be unsafe or non-idempotent? As HTTP verbs bear double semantics (the meaning of the operation vs. possible side effects), it implies ambiguity in organizing APIs. Let's consider the `/v1/search` resource from our study API. Which verb should be used to request it?
+  5. If the call signature implies that the operation is by default unsafe or non-idempotent, does it mean that the operation *must* be unsafe or non-idempotent? As HTTP verbs bear double semantics (the meaning of the operation vs. possible side effects), it implies ambiguity in organizing APIs. Let's consider the `/v1/search` resource from our study API. Which verb should be used to request it?
      * On one hand, `GET /v1/search?query=<search query>` explicitly declares that there are no side effects (no state is overwritten) and the results can be cached (given that all significant parameters are passed as parts of the URL).
      * On the other hand, a response to a `GET /v1/search` request must contain *a representation of the `/search` resource*. Are search results a representation of a search engine? The meaning of a “search” operation is much better described as “processing the representation enclosed in the request according to the resource's own specific semantics,” which is exactly the definition of the `POST` method. Additionally, how could we cache search requests? The results page is dynamically formed from a plethora of various sources, and a subsequent request with the same query might yield a different result.

@ -30,12 +43,14 @@ This convention allows for reflecting almost any API's entity nomenclature decen

      **NB**: the authors of the standard are also concerned about this dichotomy and have finally [proposed the `QUERY` HTTP method](https://www.ietf.org/archive/id/draft-ietf-httpbis-safe-method-w-body-02.html), which is basically a safe (i.e., non-modifying) version of `POST`. However, we do not expect it to gain widespread adoption just as [the existing `SEARCH` verb](https://www.rfc-editor.org/rfc/rfc5323) did not.

-Unfortunately, we don't have simple answers to these questions. In this book, we stick to the following approach:
-  * Signatures must be readable and concise first and foremost. Making the code more complicated to align with some abstract concept is undesirable.
-  * Hierarchies are indicated if they are unequivocal. If a low-level entity is a full subordinate of a higher-level entity, the relation will be expressed with nested path fragments.
+Unfortunately, we don't have simple answers to these questions. Within this book, we adhere to the following approach: the call signature should, first and foremost, be concise and readable. Complicating signatures for the sake of abstract concepts is undesirable. In relation to the mentioned issues, this means that:
+  1. Operation metadata should not change the meaning of the operation. If a request reaches the final microservice without any headers at all, it should still be executable, although some auxiliary functionality may degrade or be absent.
+
+  2. We use versioning in the path for one simple reason: all other methods make sense only if, when changing the major version of the protocol, the URL nomenclature remains the same. However, if the resource nomenclature can be preserved, there is no need to break backward compatibility.
+  3. Hierarchies are indicated if they are unequivocal. If a low-level entity is a full subordinate of a higher-level entity, the relation will be expressed with nested path fragments.
      * If there are doubts about the hierarchy persisting during further development of the API, it is more convenient to create a new root path prefix rather than employ nested paths.
-  * For “cross-domain” operations (i.e., when it is necessary to refer to entities of different abstraction levels within one request) it is better to have a dedicated resource specifically for this operation (e.g., in the example above, we would prefer the `/prepare?coffee_machine_id=<id>&recipe=lungo` signature).
-  * The semantics of the HTTP verbs take priority over false non-safety / non-idempotency warnings. Furthermore, the author of this book prefers using `POST` to indicate any unexpected side effects of an operation, such as high computational complexity, even if it is fully safe.
+  4. For “cross-domain” operations (i.e., when it is necessary to refer to entities of different abstraction levels within one request) it is better to have a dedicated resource specifically for this operation (e.g., in the example above, we would prefer the `/prepare?coffee_machine_id=<id>&recipe=lungo` signature).
+  5. The semantics of the HTTP verbs take priority over false non-safety / non-idempotency warnings. Furthermore, the author of this book prefers using `POST` to indicate any unexpected side effects of an operation, such as high computational complexity, even if it is fully safe.

 **NB**: passing variables as either query parameters or path fragments affects not only readability. Let's consider the example from the previous chapter and imagine that gateway D is implemented as a stateless proxy with a declarative configuration. Then receiving a request like this:
  * `GET /v1/state?user_id=<user_id>`
--- a/src/ru/clean-copy/05-[В
+++ b/src/ru/clean-copy/05-[В
@ -10,7 +10,20 @@

 Подобная конвенция достаточно хорошо подходит для того, чтобы отразить номенклатуру сущностей почти любого API, поэтому следовать ей вполне разумно (и, наоборот, демонстративное нарушение этого устоявшегося соглашения чревато тем, что разработчики вас просто неправильно поймут). Однако подобная некодифицированная и размытая концепция неизбежно вызывает множество разночтений в конкретных моментах:

-  1. Каким образом организовывать эндпойнты, связывающие две сущности, между которыми нет явных отношений подчинения? Скажем, каким должен быть URL запуска приготовления лунго на конкретной кофе-машине?
+  1. Где заканчиваются метаданные операции и начинаются «просто» данные и насколько допустимо дублировать поля и там, и там? Например, одна из распространённых практик в HTTP API — индицировать тип возвращаемых данных путём добавки «расширения» к URL, подобно именам файлов в файловых системах (т.е. при обращении к ресурсу `/v1/orders/{id}.xml` ответ будет получен в формате XML, а при обращении к `/v1/orders/{id}.json` — в формате JSON). С одной стороны, для определения формата данных предназначены `Accept*`-заголовки. С другой стороны, читабельность кода явно повышается с введением параметра «формат» прямо в URL.
+
+  2. Как следствие предыдущего пункта — каким образом в HTTP API правильно указывать версию самого API? Навскидку можно предложить как минимум три варианта, каждый из которых вполне соответствует букве стандарта:
+      * как path-параметр: `/v1/orders/{id}`;
+      * как query-параметр: `/orders/{id}?version=1`;
+      * как заголовок:
+          ```
+          GET /orders/{id} HTTP/1.1
+          X-OurCoffeeApi-Version: 1
+          ```
+      
+      Сюда можно приплюсовать и более экзотические варианты, такие как указание схемы в кастомизированном медиатипе или протоколе запроса.
+
+  3. Каким образом организовывать эндпойнты, связывающие две сущности, между которыми нет явных отношений подчинения? Скажем, каким должен быть URL запуска приготовления лунго на конкретной кофе-машине?
      * `/coffee-machines/{id}/recipes/lungo/prepare`
      * `/recipes/lungo/coffee-machines/{id}/prepare`
      * `/coffee-machines/{id}/prepare?recipe=lungo`
@ -20,9 +33,9 @@
    
      Все эти варианты семантически вполне допустимы и в общем-то равноправны.

-  2. Насколько строго должна выдерживаться буквальная интерпретация конструкции `ГЛАГОЛ /ресурс`? Если мы принимаем правило «части URL обязаны быть существительными» (и ведь странно применять глагол к глаголу!), то в примерах выше должно быть не `prepare`, а `preparator` или `preparer` (а вариант `/action=prepare&coffee_machine_id=<id>&recipe=lungo` вовсе недопустим, так как нет объекта действия), что, честно говоря, лишь добавляет визуального шума в виде суффиксов «ator», но никак не способствует большей лаконичности и однозначности понимания.
+  4. Насколько строго должна выдерживаться буквальная интерпретация конструкции `ГЛАГОЛ /ресурс`? Если мы принимаем правило «части URL обязаны быть существительными» (и ведь странно применять глагол к глаголу!), то в примерах выше должно быть не `prepare`, а `preparator` или `preparer` (а вариант `/action=prepare&coffee_machine_id=<id>&recipe=lungo` вовсе недопустим, так как нет объекта действия), что, честно говоря, лишь добавляет визуального шума в виде суффиксов «ator», но никак не способствует большей лаконичности и однозначности понимания.

-  3. Если сигнатура вызова по умолчанию модифицирующая или неидемпотентная, означает ли это, что операция *обязана* быть модифицирующей / идемпотентной? Двойственность смысловой нагрузки глаголов (семантика vs побочные действия) порождает неопределённость в вопросах организации API. Рассмотрим, например, ресурс `/v1/search`, осуществляющий поиск предложений кофе в нашем учебном API. С каким глаголом мы должны к нему обращаться?
+  5. Если сигнатура вызова по умолчанию модифицирующая или неидемпотентная, означает ли это, что операция *обязана* быть модифицирующей / идемпотентной? Двойственность смысловой нагрузки глаголов (семантика vs побочные действия) порождает неопределённость в вопросах организации API. Рассмотрим, например, ресурс `/v1/search`, осуществляющий поиск предложений кофе в нашем учебном API. С каким глаголом мы должны к нему обращаться?
      * С одной стороны, `GET /v1/search?query=<поисковый запрос>` позволяет явно продекларировать, что никаких посторонних эффектов у этого запроса нет (никакие данные не перезаписываются) и результаты его можно кэшировать (при условии, что все значимые параметры передаются в URL).
      * С другой стороны, согласно семантике операции, `GET /v1/search` должен возвращать *представление ресурса `search`*. Но разве результаты поиска являются представлением ресурса-поисковика? Смысл операции «поиск» гораздо точнее описывается фразой «обработка запроса в соответствии с внутренней семантикой ресурса», т.е. соответствует методу `POST`. Кроме того, можем ли мы вообще говорить о кэшировании поисковых запросов? Страница результатов поиска формируется динамически из множества источников, и повторный запрос с той же поисковой фразой почти наверняка выдаст другой список результатов.

@ -30,13 +43,13 @@

      **NB**: эта дихотомия волнует не только нас, но и авторов стандарта, которые в конечном итоге [предложили новый глагол `QUERY`](https://www.ietf.org/archive/id/draft-ietf-httpbis-safe-method-w-body-02.html), который по сути является немодифицирующим `POST`. Мы, однако, сомневаемся, что он получит широкое распространение — поскольку [уже существующий `SEARCH`](https://www.rfc-editor.org/rfc/rfc5323) оказался в этом качестве никому не нужен.

-Простых ответов на вопросы выше у нас, к сожалению, нет. В рамках настоящей книги мы придерживаемся следующего подхода:
-
-  * сигнатура вызова в первую очередь должна быть лаконична и читабельна; усложнение сигнатур в угоду абстрактным концепциям нежелательно;
-  * иерархия ресурсов выдерживается там, где она однозначна (т.е., если сущность низшего уровня абстракции однозначно подчинена сущности высшего уровня абстракции, то отношения между ними будут выражены в виде вложенных путей);
-      * если есть сомнения в том, что иерархия в ходе дальнейшего развития API останется неизменной, лучше завести новый верхнеуровневый префикс, а не вкладывать новые сущности в уже существующие;
-  * для выполнения «кросс-доменных» операций (т.е. при необходимости сослаться на объекты разных уровней абстракции в одном вызове) предпочтительнее завести специальный ресурс, выполняющий операцию (т.е. в примере с кофе-машинами и рецептами автор этой книги выбрал бы вариант `/prepare?coffee_machine_id=<id>&recipe=lungo`);
-  * семантика HTTP-глагола приоритетнее ложного предупреждения о небезопасности/неидемпотентности (в частности, если операция является безопасной, но ресурсозатратной, с нашей точки зрения вполне разумно использовать метод `POST` для индикации этого факта).
+Простых ответов на вопросы выше у нас, к сожалению, нет. В рамках настоящей книги мы придерживаемся следующего подхода: сигнатура вызова в первую очередь должна быть лаконична и читабельна. Усложнение сигнатур в угоду абстрактным концепциям нежелательно. Применительно к указанным проблемам это означает, что:
+  1. Метаданные операции не должны менять смысл операции; если запрос доходит до конечного микросервиса вообще без заголовков, он всё ещё должен быть выполним, хотя какая-то вспомогательная функциональность может деградировать или отсутствовать.
+  2. Мы используем указание версии в path по одной простой причине: все остальные способы сделать это имеют смысл если и только если при изменении мажорной версии протокола номенклатура URL останется прежней. Но, если номенклатура ресурсов может быть сохранена, то нет никакой нужды нарушать обратную совместимость нет.
+  3. Иерархия ресурсов выдерживается там, где она однозначна (т.е., если сущность низшего уровня абстракции однозначно подчинена сущности высшего уровня абстракции, то отношения между ними будут выражены в виде вложенных путей).
+      * Если есть сомнения в том, что иерархия в ходе дальнейшего развития API останется неизменной, лучше завести новый верхнеуровневый префикс, а не вкладывать новые сущности в уже существующие.
+  4. Для выполнения «кросс-доменных» операций (т.е. при необходимости сослаться на объекты разных уровней абстракции в одном вызове) предпочтительнее завести специальный ресурс, выполняющий операцию (т.е. в примере с кофе-машинами и рецептами автор этой книги выбрал бы вариант `/prepare?coffee_machine_id=<id>&recipe=lungo`).
+  5. Семантика HTTP-вызова приоритетнее ложного предупреждения о небезопасности/неидемпотентности (в частности, если операция является безопасной, но ресурсозатратной, с нашей точки зрения вполне разумно использовать метод `POST` для индикации этого факта).

 **NB**: отметим, что передача параметров в виде пути или query-параметра в URL влияет не только на читабельность. Вернёмся к примеру из предыдущей главы и представим, что гейтвей D реализован в виде stateless прокси с декларативной конфигурацией. Тогда получать от клиента запрос в виде:
  * `GET /v1/state?user_id=<user_id>`
@ -67,7 +80,7 @@

 ##### Создание

-Начнём с операции создания ресурса. Как мы помним из главы «[Стратегии синхронизации](#api-patterns-sync-strategies)”, операция создания в любой сколько-нибудь ответственной предметной области обязана быть идемпотентной и, очень желательно, ещё и позволять управлять параллелизмом. В рамках парадигмы HTTP API идемпотентное создание можно организовать одним из трёх способов:
+Начнём с операции создания ресурса. Как мы помним из главы «[Стратегии синхронизации](#api-patterns-sync-strategies)», операция создания в любой сколько-нибудь ответственной предметной области обязана быть идемпотентной и, очень желательно, ещё и позволять управлять параллелизмом. В рамках парадигмы HTTP API идемпотентное создание можно организовать одним из трёх способов:

  1. Через метод `POST` с передачей токена идемпотентности (им может выступать, в частности, `ETag` ресурса):
      ```