mirror of https://github.com/twirl/The-API-Book.git synced 2025-05-25 22:08:06 +02:00

refactoring

This commit is contained in:
Sergey Konstantinov 2023-07-02 20:39:56 +03:00
parent 851d6e38b9
commit dc20620c8a
6 changed files with 253 additions and 243 deletions


@@ -1,6 +1,6 @@
### [On the HTTP API Concept and Terminology][http-api-concepts]
The problem of designing HTTP APIs is, unfortunately, one of the most “holywar”-inspiring issues. On one hand, it is one of the most popular technologies but, on the other hand, it is quite complex and difficult to comprehend due to the large and fragmented standard split into many RFCs. As a result, the HTTP specification is doomed to be poorly understood and imperfectly interpreted by millions of software engineers and thousands of textbook writers. Therefore, before proceeding to the useful part of this Section, we must clarify exactly what we are going to discuss.
It has somehow happened that the entire modern network stack used for developing client-server APIs has been unified in two important points. One of them is the Internet Protocol Suite, which comprises the IP protocol as a base and an additional layer on top of it in the form of either the TCP or UDP protocol. Today, alternatives to the TCP/IP stack are used for a very limited subset of engineering tasks.


@@ -1,43 +1,95 @@
### [Advantages and Disadvantages of HTTP APIs Compared to Alternative Technologies][http-api-pros-and-cons]
After reviewing the previous chapter, the reader may wonder why this dichotomy exists in the first place, i.e., why do some HTTP APIs rely on HTTP semantics, while others reject it in favor of custom arrangements, and still others are stuck somewhere in between? For example, if we consider the [JSON-RPC response format](https://www.jsonrpc.org/specification#response_object), we quickly notice that it could be replaced with standard HTTP protocol functionality. Instead of this:
```
HTTP/1.1 200 OK

{
  "jsonrpc": "2.0",
  "id": null,
  "error": {
    "code": -32600,
    "message": "Invalid request"
  }
}
```
the server could have simply responded with a `400 Bad Request`, passing the request identifier as a custom header like `X-OurCoffeeAPI-RequestId`. Nevertheless, protocol designers decided to introduce their own custom format.
This situation (not only with JSON-RPC but with essentially every high-level protocol built on top of HTTP) has developed due to various reasons. Some of them are historical (such as the inability to use many HTTP protocol features in early implementations of the `XMLHttpRequest` functionality in web browsers). However, new RPC protocols relying on the bare minimum of HTTP capabilities continue to emerge today. We can enumerate at least four groups of reasons leading to this situation.
##### Metadata Readability
Let us emphasize a very important distinction between application-level protocols (such as JSON-RPC in our case) and pure HTTP. In the example above, a `400 Bad Request` error is a transparent status for every intermediary network agent, but a JSON-RPC custom error is not. Firstly, only a JSON-RPC-enabled client can read it. Secondly, and more importantly, in JSON-RPC, the request status *is not metadata*. In pure HTTP, the details of the operation, such as the method, requested URL, execution status, and request / response headers, are readable *without the necessity to parse the entire body*. In most higher-level protocols, including JSON-RPC, this is not the case: even a protocol-enabled client must read the body to retrieve that information.
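To illustrate the distinction, here is a minimal Python sketch. The payloads and helper names (`http_is_error`, `jsonrpc_is_error`) are illustrative, not part of any real framework: an agent can classify a pure HTTP response by reading the status line alone, while a JSON-RPC response forces it to parse the whole body.

```python
import json

raw_http_response = (
    "HTTP/1.1 400 Bad Request\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"reason": "invalid_request"}'
)

def http_is_error(raw_response):
    # Pure HTTP: the status is metadata; reading the first line suffices.
    status_line = raw_response.split("\r\n", 1)[0]
    status_code = int(status_line.split(" ")[1])
    return status_code >= 400

jsonrpc_body = (
    '{"jsonrpc": "2.0", "id": null,'
    ' "error": {"code": -32600, "message": "Invalid request"}}'
)

def jsonrpc_is_error(raw_body):
    # JSON-RPC: the transport says "200 OK"; the agent must parse
    # the entire body to learn that the call actually failed.
    return "error" in json.loads(raw_body)

print(http_is_error(raw_http_response))  # True
print(jsonrpc_is_error(jsonrpc_body))    # True
```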
How does an API developer benefit from the capability of reading request and response metadata? The modern client-server communication stack is multi-layered. We can enumerate a number of intermediary agents that process network requests and responses:
* Frameworks that developers use to write code
* Programming language APIs that frameworks are built on, and operating system APIs that compilers / interpreters of these languages rely on
* Intermediary proxy servers between a client and a server
* Various abstractions used in server programming, including server frameworks, programming languages, and operating systems
* Web server software that is typically placed in front of backend handlers
* Additional modern microservice-oriented tools such as API gateways and proxies
The main advantage that following the letter of the HTTP standard offers is the possibility to rely on intermediary agents, from client frameworks to API gateways, to read the request metadata and perform actions based on it. This includes regulating timeouts and retry policies, logging, proxying, and sharding requests, among other things, without the necessity to write additional code to achieve these functionalities. If we try to formulate the main principle of designing HTTP APIs, it will be: **you would rather design an API in a way that intermediary agents can read and interpret request and response metadata**.
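As a sketch of this principle, consider a retry policy that a generic intermediary (a client framework or an API gateway) could apply knowing nothing about the API except its metadata. The method and status sets below are a simplification for illustration, not a complete implementation of any real gateway.

```python
# Idempotent methods may be safely repeated; these server-side statuses
# commonly indicate transient failures worth retrying.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}
RETRIABLE_STATUSES = {502, 503, 504}

def should_retry(method, status_code):
    # The decision uses request metadata only: no payload parsing,
    # no knowledge of the specific API.
    return (
        method in IDEMPOTENT_METHODS
        and status_code in RETRIABLE_STATUSES
    )

print(should_retry("GET", 503))   # True: safe to repeat
print(should_retry("POST", 503))  # False: repeating might duplicate an order
```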
The main disadvantage of HTTP APIs is that you have to rely on intermediary agents, from client frameworks to API gateways, to read the request metadata and perform actions based on it *without your consent*. This includes regulating timeouts and retry policies, logging, proxying, and sharding requests, among other things. Since HTTP-related specifications are complex, the concepts of REST are challenging to comprehend, and software engineers do not always write perfect code, these intermediary agents (including partners' developers!) will sometimes interpret HTTP metadata *incorrectly*, especially when dealing with exotic and hard-to-implement standards. Usually, one of the stated reasons for developing new RPC frameworks is the desire to make working with the protocol simple and consistent, thereby reducing the likelihood of errors when writing integration code.
##### Quality of Solutions
The ability to read and interpret the metadata of requests and responses leads to the fragmentation of available software for working with HTTP APIs. There are plenty of tools on the market, being developed by many different companies and collaborations, and many of them are free to use:
* Proxies and gateways (nginx, Envoy, etc.)
* Different IDLs (first of all, OpenAPI) and related tools for working with specifications (Redoc, Swagger UI, etc.) and auto-generating code
* Programmer-oriented software that allows for convenient development and debugging of API clients (Postman, Insomnia), etc.
Of course, most of these instruments will work with APIs that utilize other paradigms. However, the ability to read HTTP metadata and interpret it *uniformly* makes it possible to easily design complex pipelines such as exporting nginx access logs to Prometheus and generating response status code monitoring dashboards in Grafana that work out of the box.
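As a minimal sketch of such a pipeline, the following snippet aggregates response status codes from simplified access log lines — the kind of computation a Prometheus-and-Grafana setup performs out of the box. The log format is illustrative.

```python
from collections import Counter

# Each entry carries uniform HTTP metadata: method, URL, status code.
access_log = [
    "POST /v1/orders 201",
    "GET /v1/orders/123 200",
    "GET /v1/orders/999 404",
    "POST /v1/orders 503",
]

# The status code is the last space-separated field of every line,
# so no knowledge of the API's payloads is needed to build a dashboard.
status_counts = Counter(line.rsplit(" ", 1)[1] for line in access_log)

print(status_counts["404"])  # 1
print(status_counts["201"])  # 1
```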
The downside of this versatility is the uneven quality of these solutions and the amount of time one needs to integrate them, especially if one's technological stack is not common. On the other hand, the development of alternative technologies is usually driven by a single large IT company (such as Facebook, Google, or the Apache Software Foundation). Such a framework might be less functional, but it will certainly be more homogeneous and of higher quality in terms of developer convenience, user support, and the number of known issues.
This observation applies not only to software but also to its creators. Developers' knowledge of HTTP APIs is fragmented as well. Almost every programmer is capable of working with HTTP APIs to some extent, but a significant number of them lack a thorough understanding of the standards and do not consult them while writing code. As a result, implementing business logic that effectively and consistently works with HTTP APIs can be more challenging than integrating alternative technologies. This statement holds true for both partner integrators and API providers themselves.
Additionally, let's emphasize that the HTTP API paradigm is currently the default choice for *public* APIs. Because of the aforementioned reasons, partners can integrate an HTTP API without significant obstacles, regardless of their technological stack. Moreover, the prevalence of the technology lowers the entry barrier and the requirements for the qualification of partners' engineers.
##### The Design Paradigm
Modern HTTP APIs inherited the design paradigm from the times when the HTTP protocol was mainly used to transfer hypertext. It implies that an HTTP request constitutes an operation performed on some object (*a resource*) identified by a URL. Many alternative solutions stick to other concepts; notably, in these technologies, a URL identifies *a function* to call with the given parameters. This semantics doesn't exactly contradict the HTTP architectural principles, as making remote procedure calls is covered by the protocol pretty well, but it makes using some HTTP capabilities (such as, let's say, the `Range-*` headers) meaningless, and some even dangerous, as ambiguities arise in interpreting some fields (such as, let's say, `ETag`).
From the client developers' perspective, following the HTTP paradigms implies implementing an additional layer of logic that transforms calling methods on objects to HTTP operations on corresponding resources. RPC technologies are more convenient to integrate in this sense. (Although, any complex RPC API will require such an adapter level, and GraphQL requires it from the very beginning.)
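The adapter layer mentioned above can be sketched as follows. The `OrdersApi` class, the resource paths, and the fake transport are hypothetical; a real client would also handle serialization, errors, and authentication.

```python
class OrdersApi:
    """Translates method calls on objects into HTTP operations on resources."""

    def __init__(self, transport):
        self.transport = transport

    def create(self, recipe, volume):
        # "Create an order" becomes "POST to the orders collection"
        return self.transport(
            "POST", "/v1/orders", {"recipe": recipe, "volume": volume}
        )

    def get(self, order_id):
        # "Fetch an order" becomes "GET the order resource"
        return self.transport("GET", f"/v1/orders/{order_id}", None)

def fake_transport(method, url, body):
    # A stub so the sketch is self-contained; a real one would send HTTP.
    return {"method": method, "url": url, "body": body}

api = OrdersApi(fake_transport)
print(api.create("lungo", "800ml")["url"])  # /v1/orders
print(api.get(123)["method"])               # GET
```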
##### The Question of Performance
When discussing the advantages of alternative technologies such as GraphQL, gRPC, Apache Thrift, etc., the argument of lower performance of JSON-over-HTTP APIs is often presented. Specifically, the following issues with the technology are commonly mentioned:
1. The verbosity of the JSON format:
* Mandatory field names in every object, even for an array of similar entities
* The large proportion of technical symbols (quotes, braces, commas, etc.) and the necessity to escape them in string values
2. The common approach of returning a full resource representation on resource retrieval requests, even if the client only needs a subset of the fields
3. The lower performance of data serializing and deserializing operations
4. The need to introduce additional encoding, such as Base64, to handle binary data
5. Performance quirks of the HTTP protocol itself, particularly the inability to serve multiple simultaneous requests through one connection.
Let's be honest: HTTP APIs do suffer from the listed problems. However, we can confidently say that the impact of these factors is often overestimated. The reason API vendors care little about HTTP API performance is that the actual overhead is not as significant as perceived. Specifically:
1. Regarding the verbosity of the format, it is important to note that these issues are mainly relevant when compression algorithms are not utilized. [Comparisons](https://nilsmagnus.github.io/post/proto-json-sizes/) have shown that enabling compression algorithms such as gzip largely reduces the difference in sizes between JSON documents and alternative binary formats (and there are compression algorithms specifically designed for processing text data, such as [brotli](https://datatracker.ietf.org/doc/html/rfc7932)).
2. If necessary, API designers can customize the list of returned fields in HTTP APIs. It aligns well with both the letter and the spirit of the standard. However, as we already explained to the reader in the “[Partial Updates](#api-patterns-partial-updates)” chapter, trying to minimize traffic by returning only subsets of data is rarely justified in well-designed APIs.
3. If standard JSON deserializers are used, the overhead compared to binary standards might indeed be significant. However, if this overhead is a real problem, it makes sense to consider alternative JSON serializers such as [simdjson](https://github.com/simdjson/simdjson). Due to its low-level and highly optimized code, simdjson demonstrates impressive throughput that makes it suitable for virtually any API except some corner cases.
4. Generally speaking, the HTTP API paradigm implies that binary data (such as images or video files) is served through separate endpoints. Returning binary data in JSON is only necessary when a separate request for the data is a problem from the performance perspective. These situations are virtually non-existent in server-to-server interactions and/or if HTTP/2 or a higher protocol version is used.
5. The HTTP/1.1 protocol version is indeed a suboptimal solution if request multiplexing is needed. However, alternate approaches to tackling the problem usually rely on… HTTP/2. Of course, an HTTP API can also be served over HTTP/2.
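Point 1 above is easy to verify empirically. The sketch below gzip-compresses a deliberately repetitive JSON array; identical entries make this an extreme case and the exact ratio depends on the data, but the effect of compression on field-name overhead is clear.

```python
import gzip
import json

# 1000 similar order objects: the field names are repeated in every
# entry, which is exactly the verbosity point 1 describes.
orders = [
    {"recipe": "lungo", "volume": "800ml",
     "price": "10.23", "currency_code": "MNT"}
    for _ in range(1000)
]
raw = json.dumps(orders).encode("utf-8")
compressed = gzip.compress(raw)

# Compression absorbs most of the overhead of repeated names and punctuation.
print(len(compressed) < len(raw))  # True
```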
Let us reiterate once more: JSON-over-HTTP APIs are *indeed* less performant than binary protocols. Nevertheless, we take the liberty to say that for a well-designed API in common subject areas, switching to alternative protocols will yield quite a modest gain.
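Similarly, the Base64 overhead mentioned in point 4 is easy to quantify: the encoding maps every 3 bytes of input onto 4 output characters, inflating payloads by roughly one third. The sample data below is illustrative.

```python
import base64

# 4096 bytes of sample "binary" data (the values 0-255, repeated).
binary_blob = bytes(range(256)) * 16
encoded = base64.b64encode(binary_blob)

# Base64 emits 4 characters per 3 input bytes: ~33% overhead.
ratio = len(encoded) / len(binary_blob)
print(round(ratio, 2))  # 1.33
```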
#### Advantages and Disadvantages of the JSON Format
It's not hard to notice that most of the claims regarding HTTP API performance are actually not about the HTTP protocol but the JSON format. There is no problem in developing an HTTP API that will utilize any binary format (including, for instance, [Protocol Buffers](https://protobuf.dev/)). Then the difference between a Protobuf-over-HTTP API and a gRPC API would be just about using granular URLs, status codes, request / response headers, and the ensuing (in)ability to use integrated software tools out of the box.
However, on many occasions (including this book) developers prefer the textual JSON over binary Protobuf (Flatbuffers, Thrift, Avro, etc.) for a very simple reason: JSON is easy to read. First, it's a text format and doesn't require additional decoding. Second, it's self-descriptive, meaning that property names are included. Unlike Protobuf-encoded messages which are basically impossible to read without a `.proto` file, one can make a very good guess as to what a JSON document is about at a glance. Provided that request metadata in HTTP APIs is readable as well, we ultimately get a communication format that is easy to parse and understand with just our eyes.
Apart from being human-readable, JSON features another important advantage: it is strictly formal, meaning it does not contain any constructs that can be interpreted differently in different architectures (with a possible exception of the sizes of numbers and strings), and the deserialization result aligns very well with native data structures (i.e., indexed and associative arrays) of almost every programming language. From this point of view, we actually had no other choice when selecting a format for code samples in this book.
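A short illustration of this point: a JSON document deserializes directly into native dictionaries and lists, and survives a round trip losslessly. The document itself is, of course, illustrative.

```python
import json

document = '{"id": 123, "recipe": "lungo", "tags": ["hot", "large"]}'
data = json.loads(document)

# JSON objects map onto dicts, arrays onto lists: no schema or
# generated code is needed to inspect the payload.
print(type(data) is dict)                    # True
print(data["tags"][1])                       # large
print(json.loads(json.dumps(data)) == data)  # True: lossless round trip
```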
If you happen to design a less general API for a specific subject area, we still recommend the same approach for choosing a format:
* Estimate the overhead of preparing and introducing tools to decipher binary protocols versus the overhead of using a less-than-optimal data transfer format.
* Assess what is more important to you: a high-quality but capability-restricted set of bundled software, or the possibility of using a broad range of tools that work with HTTP APIs, even though their quality is not as high.
* Evaluate the cost of finding developers proficient with the format.


@@ -1,172 +1,43 @@
### [Components of an HTTP Request and Their Semantics][http-api-requests-semantics]
### [The REST Myth][http-api-rest-myth]
The third important exercise we must conduct is to describe the format of an HTTP request and response and explain the basic concepts. Many of these may seem obvious to the reader. However, the situation is that even the basic knowledge we require to move further is scattered across vast and fragmented documentation, causing even experienced developers to struggle with some nuances. Below, we will try to compile a structured overview that is sufficient to design HTTP APIs.
Before we proceed to discuss HTTP API design patterns, we feel obliged to clarify one more important terminological issue. Often, an API matching the description we gave in the “[On the HTTP API Concept and Terminology](#http-api-concepts)” chapter is called a “REST API” or a “RESTful API.” In this Section, we don't use any of these terms as it makes no practical sense.
To describe the semantics and formats, we will refer to the brand-new [RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html), which replaces no fewer than nine previous specifications dealing with different aspects of the technology. However, a significant volume of additional functionality is still covered by separate standards. In particular, the HTTP caching principles are described in the standalone [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111.html), while the popular `PATCH` method is omitted in the main RFC and is regulated by [RFC 5789](https://www.rfc-editor.org/rfc/rfc5789.html).
What is “REST”? In 2000, Roy Fielding, one of the authors of the HTTP and URI specifications, published his doctoral dissertation titled “Architectural Styles and the Design of Network-based Software Architectures,” the fifth chapter of which was named “[Representational State Transfer (REST)](https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm).”
An HTTP request consists of (1) applying a specific verb to a URL, stating (2) the protocol version, (3) additional meta-information in headers, and (4) optionally, some content (request body):
As anyone can attest by reading this chapter, it features a very much abstract overview of a distributed client-server architecture that is not bound to either HTTP or URL. Furthermore, it does not discuss any API design recommendations. In this chapter, Fielding methodically *enumerates restrictions* that any software engineer encounters when developing distributed client-server software. Here they are:
* The client and the server do not know how each of them is implemented
* Sessions are stored on the client (the “stateless” constraint)
* Data must be marked as cacheable or non-cacheable
* Interaction interfaces between system components must be uniform
* Network-based systems are layered, meaning every server may just be a proxy to another server
* The functionality of the client might be enhanced by the server providing code on demand.
```
POST /v1/orders HTTP/1.1
Host: our-api-host.tld
Content-Type: application/json
That's it. With this, the REST definition is over. Fielding further concretizes some implementation aspects of systems under the stated restrictions. However, all these clarifications are no less abstract. Literally, the key abstraction for the REST architectural style is “resource”; any data that can have a name may be a resource.
{
"coffee_machine_id": 123,
"currency_code": "MNT",
"price": "10.23",
"recipe": "lungo",
"offer_id": 321,
"volume": "800ml"
}
```
The key conclusion that we might draw from the Fielding-2000 definition of REST is, generally speaking, that *any networking software in the world complies with the REST constraints*. The exceptions are very rare.
An HTTP response to such a request includes (1) the protocol version, (2) a status code with a corresponding message, (3) response headers, and (4) optionally, response content (body):
Consider the following:
* It is very hard to imagine any system that does not feature *any* level of uniformity of inter-component communication as it would be impossible to develop such a system. Ultimately, as we mentioned in the previous chapter, almost all network interactions are based on the IP protocol, which *is* a uniform interface.
* If there is a uniform communication interface, it can be mimicked if needed, so the requirement of client and server implementation independence can always be met.
* If we can create an alternative server, it means we can always have a layered architecture by placing an additional proxy between the client and the server.
* As clients are computational machines, they *always* store some state and cache some data.
* Finally, the code-on-demand requirement is a sly one as in a [von Neumann architecture](https://en.wikipedia.org/wiki/Von_Neumann_architecture), we can always say that the data the client receives actually comprises instructions in some formal language.
```
HTTP/1.1 201 Created
Location: /v1/orders/123
Content-Type: application/json
Yes, of course, the reasoning above is a sophism, a reduction to absurdity. Ironically, we might take the opposite path to absurdity by proclaiming that REST constraints are never met. For instance, the code-on-demand requirement obviously contradicts the requirement of having an independently-implemented client and server as the client must be able to interpret the instructions the server sends written in a specific language. As for the “S” rule (i.e., the “stateless” constraint), it is very hard to find a system that does not store *any* client context as it's close to impossible to make anything *useful* for the client in this case. (And, by the way, Fielding explicitly requires that: “communication … cannot take advantage of any stored context on the server.”)
{
"id": 123
}
```
Finally, in 2008, Fielding himself increased the entropy in the understanding of the concept by issuing a [clarification](https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven) explaining what he actually meant. In this article, among other things, he stated that:
* REST API development must focus on describing media types representing resources
* The client must be agnostic of these media types
* There must not be fixed resource names and operations with resources. Clients must extract this information from the server's responses.
**NB**: in HTTP/2 (and future HTTP/3), separate binary frames are used for headers and data instead of the holistic text format. However, this doesn't affect the architectural concepts we will describe below. To avoid ambiguity, we will provide examples in the HTTP/1.1 format. You can find detailed information about the HTTP/2 format [here](https://hpbn.co/http2/).
The concept of “Fielding-2008 REST” implies that clients, after somehow obtaining an entry point to the API, must be able to communicate with the server having no prior knowledge of the API and definitely must not contain any specific code to work with the API. This requirement is much stricter than the ones described in the dissertation of 2000. Particularly, REST-2008 implies that there are no fixed URL templates; actual URLs to perform operations with the resource are included in the resource representation (this concept is known as [HATEOAS](https://en.wikipedia.org/wiki/HATEOAS)). The dissertation of 2000 does not contain any definitions of “hypermedia” that contradict the idea of constructing such links based on the prior knowledge of the API (such as a specification).
##### A URL
**NB**: leaving out the fact that Fielding rather loosely interpreted his own dissertation, let us point out that no system in the world complies with the Fielding-2008 definition of REST.
A Uniform Resource Locator (URL) is an addressing unit in an HTTP API. Some evangelists of the technology even use the term “URL space” as a synonym for “The World Wide Web.” It is expected that a proper HTTP API should employ an addressing system that is as granular as the subject area itself; in other words, each entity that the API can manipulate should have its own URL.
We have no idea why, out of all the overviews of abstract network-based software architecture, Fielding's concept gained such popularity. It is obvious that Fielding's theory, reflected in the minds of millions of software developers, became a genuine engineering subculture. By reducing the REST idea to the HTTP protocol and the URL standard, the chimera of a “RESTful API” was born, of which [nobody knows the definition](https://restfulapi.net/).
The URL format is governed by a [separate standard](https://url.spec.whatwg.org/) developed by an independent body known as the Web Hypertext Application Technology Working Group (WHATWG). The concepts of URLs and Uniform Resource Names (URNs) together constitute a more general entity called Uniform Resource Identifiers (URIs). (The difference between the two is that a URL allows for *locating* a resource within the framework of some protocol whereas a URN is an “internal” entity name that does not provide information on how to find the resource.)
Do we want to say that REST is a meaningful concept? Definitely not. We only aimed to explain that it allows for quite a broad range of interpretations, which is simultaneously its main power and its main weakness.
URLs can be decomposed into sub-components, each of which is optional. While the standard enumerates a number of legacy practices, such as passing logins and passwords in URLs or using non-UTF encodings, we will skip discussing those. Instead, we will focus on the following components that are relevant to the topic of HTTP API design:
* A scheme: a protocol to access the resource (in our case it is always `https:`)
* A host: a top-level address unit in the form of either a domain name or an IP address. A host might contain subdomains.
* A port.
* A path: a URL part between the host (including port) and the `?` or `#` symbols or the end of the line.
    * The path itself is usually decomposed into parts using the `/` symbol as a delimiter. However, the standard does not define any semantics for it.
    * Two paths, one ending with `/` and one without it (for example, `/root/leaf` and `/root/leaf/`), are considered different paths according to the standard. Consequently, two URLs that differ only in trailing slashes in their paths are considered different as well. However, we are unaware of a single argument for differentiating such URLs in practice.
    * Paths may contain `.` and `..` parts, which are supposed to be interpreted similarly to analogous symbols in file paths (meaning that `/root/leaf`, `/root/./leaf`, and `/root/branch/../leaf` are equivalent).
* A query: a URL part between the `?` symbol and either `#` or the end of the line.
    * A query is usually decomposed into `key=value` pairs split by the `&` character. Again, the standard does not require this or define the semantics.
    * Nor does the standard imply any normalization of the ordering. URLs that differ only in the order of keys in the queries are considered different.
* A fragment (also known as an anchor): a part of a URL that follows the `#` sign.
    * Fragments are usually treated as addresses within the requested document and because of that are often omitted by user agents while executing the request.
    * Two URLs that only differ in fragment parts may be considered equal or not, depending on the context.
On one hand, thanks to the multitude of interpretations, the API developers have built a perhaps vague but useful view of “proper” HTTP API architecture. On the other hand, the lack of concrete definitions has made REST API one of the most “holywar”-inspiring topics, and these holywars are usually quite meaningless as the popular REST concept has nothing to do with the REST described in Fielding's dissertation (and even more so, with the REST described in Fielding's manifesto of 2008).
In HTTP requests, the scheme, host, and port are usually (but not always) omitted and presumed to be equal to the connection parameters. (Fielding actually names this arrangement one of the biggest flaws in the protocol design.)
Traditionally, it is implied that paths describe a strict hierarchy of resource subordination (for example, the URL of a specific coffee machine in our API could look like `places/{id}/coffee-machines/{id}`, since a coffee machine strictly belongs to one coffee shop), while query parameters express non-strict hierarchies and operation parameters (for example, the URL for searching listings could look like `search?location=<map point>`).
Additionally, the standard contains rules for serializing, normalizing, and comparing URLs, knowing which can be useful for an HTTP API developer.
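Since these rules are numerous and easy to misremember, it might help to see URL decomposition and one possible normalization in code. Below is a sketch using Python's standard `urllib.parse` module; the URL itself, and the decision to normalize by sorting query keys, are our own illustration:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# A made-up URL featuring all the components discussed above
url = "https://api.example.com:443/v1/places/37/coffee-machines?embed=recipes&page=2#reviews"
parts = urlsplit(url)
print(parts.scheme)    # https
print(parts.hostname)  # api.example.com
print(parts.port)      # 443
print(parts.path)      # /v1/places/37/coffee-machines
print(parts.fragment)  # reviews

# Splitting the query into key=value pairs is a convention,
# not a requirement of the standard
print(dict(parse_qsl(parts.query)))  # {'embed': 'recipes', 'page': '2'}

# The standard considers URLs that differ only in query key ordering
# different; if an application needs a canonical form, it has to
# normalize explicitly, e.g., by sorting the keys:
normalized = urlunsplit((
    parts.scheme,
    parts.netloc,
    parts.path,
    urlencode(sorted(parse_qsl(parts.query))),
    "",  # drop the fragment: user agents usually omit it anyway
))
print(normalized)
```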
##### Headers
Headers contain *metadata* associated with a request or a response. They might describe properties of entities being passed (e.g., `Content-Length`), provide additional information regarding a client or a server (e.g., `User-Agent`, `Date`, etc.) or simply contain additional fields that are not directly related to the request/response semantics (such as `Authorization`).
The important feature of headers is the possibility to read them before the message body is fully transmitted. This allows for altering request or response handling depending on the headers, and it is perfectly fine to manipulate headers while proxying requests. Many network agents actually do this, i.e., add, remove, or modify headers while proxying requests. In particular, modern web browsers automatically add a number of technical headers, such as `User-Agent`, `Origin`, `Accept-Language`, `Connection`, `Referer`, `Sec-Fetch-*`, etc., and modern server software automatically adds or modifies such headers as `X-Powered-By`, `Date`, `Content-Length`, `Content-Encoding`, `X-Forwarded-For`, etc.
This freedom in manipulating headers can result in unexpected problems if an API uses them to transmit data as the field names developed by an API vendor can accidentally overlap with existing conventional headers, or worse, such a collision might occur in the future at any moment. To avoid this issue, the practice of adding the prefix `X-` to custom header names was frequently used in the past. More than ten years ago this practice was officially discouraged (see the detailed overview in [RFC 6648](https://www.rfc-editor.org/rfc/rfc6648)). Nevertheless, the prefix has not been fully abolished, and many semi-standard headers still contain it (notably, `X-Forwarded-For`). Therefore, using the `X-` prefix reduces the probability of collision but does not eliminate it. The same RFC reasonably suggests using the API vendor name as a prefix instead of `X-`. (We would rather recommend using both, i.e., sticking to the `X-ApiName-FieldName` format. Here `X-` is included for readability [to distinguish standard fields from custom ones], and the company or API name part helps avoid collisions with other non-standard header names).
Additionally, headers are used as control flow instructions for so-called “content negotiation,” which allows the client and server to agree on a response format (through `Accept*` headers) and to perform conditional requests that aim to reduce traffic by skipping response bodies, either fully or partially (through `If-*` headers, such as `If-Range`, `If-Modified-Since`, etc.)
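As an illustration of conditional requests, here is a server-side sketch (the function and its signature are our own invention, not part of any framework) that chooses between a full response and a bare `304 Not Modified` based on the `If-Modified-Since` header:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def handle_get(resource_mtime: datetime, request_headers: dict) -> int:
    # If the client claims to have a representation cached at some
    # date, and the resource has not changed since, skip the body
    condition = request_headers.get("If-Modified-Since")
    if condition is not None and resource_mtime <= parsedate_to_datetime(condition):
        return 304  # "Not Modified": the response body is omitted
    return 200  # send the full representation

mtime = datetime(2023, 7, 1, tzinfo=timezone.utc)
print(handle_get(mtime, {}))  # 200: no condition, full response
print(handle_get(
    mtime, {"If-Modified-Since": format_datetime(mtime)}
))  # 304: the cached representation is still fresh
```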
##### HTTP Verbs
One important component of an HTTP request is a method (verb) that describes the operation being applied to a resource. RFC 9110 standardizes eight verbs — namely, `GET`, `POST`, `PUT`, `DELETE`, `HEAD`, `CONNECT`, `OPTIONS`, and `TRACE` — of which we as API developers are interested in the first four. The `CONNECT`, `OPTIONS`, and `TRACE` methods are technical and rarely used in HTTP APIs (except for `OPTIONS`, which needs to be implemented to ensure access to the API from a web browser). Theoretically, the `HEAD` verb, which allows for requesting *resource metadata only*, might be quite useful in API design. However, for reasons unknown to us, it did not take root in this capacity.
Apart from RFC 9110, many other specifications propose additional HTTP verbs, such as `COPY`, `LOCK`, `SEARCH`, etc. — the full list can be found in [the registry](http://www.iana.org/assignments/http-methods/http-methods.xhtml). However, only one of them gained widespread popularity — the `PATCH` method. The reasons for this state of affairs are quite trivial: the five methods (`GET`, `POST`, `PUT`, `DELETE`, and `PATCH`) are enough for almost any API.
HTTP verbs define two important characteristics of an HTTP call:
* Semantics: what the operation *means*
* Side effects:
    * Whether the request modifies any resource state or is safe (and therefore, whether its result can be cached)
    * Whether the request is idempotent or not.
| Verb | Semantics | Is safe (non-modifying) | Is idempotent | Can have a body |
|------|-----------|-------------------------|---------------|------------|
| GET | Returns a representation of a resource | Yes | Yes | Should not |
| PUT | Replaces (fully overwrites) a resource with a provided entity | No | Yes | Yes |
| DELETE | Deletes a resource | No | Yes | Should not |
| POST | Processes a provided entity according to its internal semantics | No | No | Yes |
| PATCH | Modifies (partially overwrites) a resource with a provided entity | No | No | Yes |
**NB**: contrary to a popular misconception, the `POST` method is not limited to creating new resources.
The most important property of modifying idempotent verbs is that **the URL serves as an idempotency key for the request**. The `PUT /url` operation fully overwrites a resource, so repeating the request won't change the resource. Conversely, retrying a `DELETE /url` request must leave the system in the same state where the `/url` resource is deleted. Regarding the `GET /url` method, it must semantically return the representation of the same target resource `/url`. If it exists, its implementation must be consistent with prior `PUT` / `DELETE` operations. If the resource was overwritten via `PUT /url`, a subsequent `GET /url` call must return a representation that matches the entity enclosed in the `PUT /url` request. In the case of JSON-over-HTTP APIs, this simply means that `GET /url` returns the same data as what was passed in the preceding `PUT /url`, possibly normalized and equipped with default values. On the other hand, a `DELETE /url` call must remove the resource, resulting in subsequent `GET /url` requests returning a `404` or `410` error.
The idempotency and symmetry of the `GET` / `PUT` / `DELETE` methods imply that neither `GET` nor `DELETE` can have a body as no reasonable meaning could be associated with it. However, most web server software allows these methods to have bodies and transmits them further to the endpoint handler, likely because many software engineers are unaware of the semantics of the verbs (although we strongly discourage relying on this behavior).
For obvious reasons, responses to modifying endpoints are not cached (though there are some conditions to use a response to a `POST` request as cached data for subsequent `GET` requests). This ensures that repeating `POST` / `PUT` / `DELETE` / `PATCH` requests will hit the server as no intermediary agent can respond with a cached result. In the case of a `GET` request, it is generally not true. Only the presence of `no-store` or `no-cache` directives in the response guarantees that the subsequent `GET` request will reach the server.
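The symmetry requirements above can be modeled with a trivial in-memory resource store (a sketch of the semantics, of course, not of a real server):

```python
# The URL works as the idempotency key: repeating any of these
# operations leaves the system in the same state.
resources: dict = {}

def put(url, entity):
    resources[url] = entity  # full overwrite, so retries change nothing
    return 200

def get(url):
    if url not in resources:
        return 404, None  # including after a successful DELETE
    return 200, resources[url]

def delete(url):
    resources.pop(url, None)  # retries still leave the resource deleted
    return 200

put("/v1/orders/123", {"recipe": "lungo"})
put("/v1/orders/123", {"recipe": "lungo"})  # idempotent: same state
print(get("/v1/orders/123"))  # (200, {'recipe': 'lungo'})
delete("/v1/orders/123")
delete("/v1/orders/123")  # idempotent as well
print(get("/v1/orders/123"))  # (404, None): GET / DELETE symmetry
```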
One of the most widespread HTTP API design antipatterns is violating the semantics of HTTP verbs:
* Placing modifying operations in a `GET` handler. This can lead to the following problems:
    * Interim agents might respond to such a request using a cached value if a required caching directive is missing, or, vice versa, automatically repeat the request upon receiving a network timeout.
    * Some agents consider themselves eligible to traverse hyper-references (i.e., make HTTP `GET` requests) without the user's explicit consent. For example, social networks and messengers perform such calls to generate a link preview when a user tries to share a link.
* Placing non-idempotent operations in `PUT` / `DELETE` handlers. Although interim agents do not typically repeat modifying requests regardless of their alleged idempotency, a client or server framework can easily do so. This mistake is often coupled with requiring a body alongside a `DELETE` request to discern which specific object needs to be deleted, which is per se a problem as any interim agent might discard such a body.
* Ignoring the `GET` / `PUT` / `DELETE` symmetry requirement. This can manifest in different ways, such as:
    * Making a `GET /url` operation return data even after a successful `DELETE /url` call
    * Making a `PUT /url` operation take the identifiers of the entities to modify from the request body instead of the URL, resulting in the `GET /url` operation's inability to return a representation of the entity passed to the `PUT /url` handler.
##### Status Codes
A status code is a machine-readable three-digit number that describes the outcome of an HTTP request. There are five groups of status codes:
* `1xx` codes are informational. Among these, the `100 Continue` code is probably the only one that is commonly used.
* `2xx` codes indicate that the operation was successful.
* `3xx` codes are redirection codes, implying that additional actions must be taken to consider the operation fully successful.
* `4xx` codes represent client errors.
* `5xx` codes represent server errors.
**NB**: the separation of codes into groups by the first digit is of practical importance. If the client is unaware of the meaning of an `xyz` code returned by the server, it must conduct actions as if an `x00` code was received.
The idea behind status codes is obviously to make errors machine-readable so that all interim agents can detect what has happened with a request. The HTTP status code nomenclature effectively describes nearly every problem applicable to an HTTP request, such as invalid `Accept-*` header values, missing `Content-Length`, unsupported HTTP verbs, excessively long URIs, etc.
Unfortunately, the HTTP status code nomenclature is not well-suited for describing errors in *business logic*. To return machine-readable errors related to the semantics of the operation, it is necessary either to use status codes unconventionally (i.e., in violation of the standard) or to enrich responses with additional fields. Designing custom errors in HTTP APIs will be discussed in the corresponding chapter.
**NB**: note the problem with the specification design. By default, all `4xx` codes are non-cacheable, but there are several exceptions, namely the `404`, `405`, `410`, and `414` codes. While we believe this was done with good intentions, the number of developers aware of this nuance is likely to be similar to the number of HTTP specification editors.
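The first-digit fallback rule mentioned above might be sketched as follows (the set of codes this particular client “knows” is, naturally, hypothetical):

```python
# Codes this client handles specifically
KNOWN_CODES = {200, 201, 301, 302, 400, 404, 500, 503}

def effective_code(status: int) -> int:
    # An unknown "xyz" code must be treated as "x00"
    return status if status in KNOWN_CODES else status // 100 * 100

print(effective_code(404))  # 404: known, handled specifically
print(effective_code(418))  # 400: unknown 4xx, a generic client error
print(effective_code(507))  # 500: unknown 5xx, a generic server error
```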
#### One Important Remark Regarding Caching
Caching is a crucial aspect of modern microservice architecture design. It can be tempting to control caching at the protocol level, and the HTTP standard provides various tools to facilitate this. However, the author of this book must warn you: if you decide to utilize these tools, it is essential to thoroughly understand the standard. Flaws in the implementation of certain techniques can result in disruptive behavior. The author personally experienced a major outage caused by the aforementioned lack of knowledge regarding the default cacheability of `404` responses. In this incident, some settings for an important geographical area were mistakenly deleted. Although the problem was quickly localized and the settings were restored, the service remained inoperable in the area for several hours because clients had cached the `404` response and did not request it anew until the cache had expired.
#### One Important Remark Regarding Consistency
One parameter might be placed in different components of an HTTP request. For example, an identifier of a partner making a request might be passed as part of:
* A domain name, e.g., `{partner_id}.domain.tld`
* A path, e.g., `/v1/{partner_id}/orders`
* A query parameter, e.g. `/v1/orders?partner_id=<partner_id>`
* A header value, e.g.,

    ```
    GET /v1/orders HTTP/1.1
    X-ApiName-Partner-Id: <partner_id>
    ```

* A field within the request body, e.g.,

    ```
    POST /v1/orders/retrieve HTTP/1.1

    {
      "partner_id": <partner_id>
    }
    ```
There are also more exotic options, such as placing a parameter in the scheme of a request or in the `Content-Type` header.
However, when we move a parameter around different components, we face three annoying issues:
* Some tokens are case-sensitive (path, query parameters, JSON field names), while others are not (domain and header names)
    * With header *values*, there is even more chaos: some of them are required to be case-insensitive (e.g., `Content-Type`), while others are prescribed to be case-sensitive (e.g., `ETag`)
* Allowed symbols and escaping rules differ as well:
    * Notably, there is no widespread practice for escaping the `/`, `?`, and `#` symbols in a path
    * Unicode symbols in domain names are allowed (though not universally supported) through a peculiar encoding technique called “[Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)”
* Traditionally, different casings are used in different parts of an HTTP request:
    * `kebab-case` in domains, headers, and paths
    * `snake_case` in query parameters
    * `snake_case` or `camelCase` in request bodies.
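The differences in escaping rules noted above are easy to observe with Python's standard helpers (the sample values are made up):

```python
from urllib.parse import quote, quote_plus

value = "flat white?"
# Escaping rules differ between a path segment and a query value:
print(quote(value, safe=""))  # flat%20white%3F
print(quote_plus(value))      # flat+white%3F

# Unicode in domain names goes through Punycode:
print("café.example".encode("idna"))  # b'xn--caf-dma.example'
```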
Furthermore, using both `snake_case` and `camelCase` in domain names is impossible as the underscore sign is not allowed and capital letters will be lowercased during URL normalization.
Theoretically, it is possible to use `kebab-case` everywhere. However, most programming languages do not allow variable names and object fields in `kebab-case`, so working with such an API would be quite inconvenient.
To wrap this up, the situation with casing is so spoiled and convoluted that there is no consistent solution to employ. In this book, we follow this rule: tokens are cased according to the common practice for the corresponding request component. If a token's position changes, the casing is changed as well. (However, we're far from recommending following this approach unconditionally. Our recommendation is rather to try to avoid increasing the entropy by choosing a solution that minimizes the probability of misunderstanding.)
**NB**: strictly speaking, JSON stands for “JavaScript Object Notation,” and in JavaScript, the default casing is `camelCase`. However, we dare to say that JSON ceased to be a format bound to JavaScript long ago and is now a universal format for organizing communication between agents written in different programming languages. Employing `snake_case` allows for easily moving a parameter from a query to a body, which is the most frequent case. Although the inverse solution (i.e., using `camelCase` in query parameter names) is also possible.
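Following the rule formulated above, a parameter changes its casing when it moves between components. A couple of helper sketches (our own illustration, not any library's API) show what such conversions involve:

```python
def snake_to_kebab(name: str) -> str:
    # partner_id -> partner-id (then typically capitalized in
    # header names, e.g., X-ApiName-Partner-Id)
    return name.replace("_", "-")

def snake_to_camel(name: str) -> str:
    # coffee_machine_id -> coffeeMachineId
    head, *rest = name.split("_")
    return head + "".join(word.capitalize() for word in rest)

print(snake_to_kebab("partner_id"))         # partner-id
print(snake_to_camel("coffee_machine_id"))  # coffeeMachineId
```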
The terms “REST architectural style” and its derivative “REST API” will not be used in the following chapters since it makes no practical sense as we explained above. We referred to the constraints described by Fielding many times in the previous chapters because, let us emphasize it once more, it is impossible to develop distributed client-server APIs without taking them into account. However, HTTP APIs (meaning JSON-over-HTTP endpoints utilizing the semantics described in the HTTP and URL standards) as we will describe them in the following chapter align well with the “average” understanding of “REST / RESTful API” as per numerous tutorials on the Web.

### [Advantages and Disadvantages of HTTP APIs][http-api-pros-and-cons]
### [Components of an HTTP Request and Their Semantics][http-api-requests-semantics]
After the three introductory chapters dedicated to clarifying terms and concepts (that's the price of the popularity of the technology) the reader might wonder why this dichotomy exists in the first place, i.e., why do some HTTP APIs rely on HTTP semantics while others reject it in favor of custom arrangements? For example, if we consider the [JSON-RPC response format](https://www.jsonrpc.org/specification#response_object), we will quickly notice that it might be replaced with standard HTTP protocol functionality. Instead of this:
The important exercise we must conduct is to describe the format of an HTTP request and response and explain the basic concepts. Many of these may seem obvious to the reader. However, even the basic knowledge required to move further is scattered across vast and fragmented documentation, and even experienced developers struggle with some of the nuances. Below, we will try to compile a structured overview that is sufficient to design HTTP APIs.
To describe the semantics and formats, we will refer to the brand-new [RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html), which replaces no fewer than nine previous specifications dealing with different aspects of the technology. However, a significant volume of additional functionality is still covered by separate standards. In particular, the HTTP caching principles are described in the standalone [RFC 9111](https://www.rfc-editor.org/rfc/rfc9111.html), while the popular `PATCH` method is omitted in the main RFC and is regulated by [RFC 5789](https://www.rfc-editor.org/rfc/rfc5789.html).
An HTTP request consists of (1) applying a specific verb to a URL, stating (2) the protocol version, (3) additional meta-information in headers, and (4) optionally, some content (request body):
```
POST /v1/orders HTTP/1.1
Host: our-api-host.tld
Content-Type: application/json

{
  "coffee_machine_id": 123,
  "currency_code": "MNT",
  "price": "10.23",
  "recipe": "lungo",
  "offer_id": 321,
  "volume": "800ml"
}
```

```
HTTP/1.1 200 OK

{
  "jsonrpc": "2.0",
  "id": null,
  "error": {
    "code": -32600,
    "message": "Invalid request"
  }
}
```
the server could have simply responded with `400 Bad Request`, passing the request identifier as a custom header like `X-OurCoffeeAPI-RequestId`. Nevertheless, protocol designers decided to introduce their own custom format.
An HTTP response to such a request includes (1) the protocol version, (2) a status code with a corresponding message, (3) response headers, and (4) optionally, response content (body):
This situation (not only with JSON-RPC but with essentially every high-level protocol built on top of HTTP) has developed due to various reasons. Some of them are historical (such as the inability to use many HTTP protocol features in early implementations of the `XMLHttpRequest` functionality in web browsers). However, new RPC protocols relying on the bare minimum of HTTP capabilities continue to emerge today.
```
HTTP/1.1 201 Created
Location: /v1/orders/123
Content-Type: application/json

{
  "id": 123
}
```

To answer this question, we must emphasize a very important distinction between application-level protocols (such as JSON-RPC in our case) and pure HTTP. A `400 Bad Request` is a transparent status for every intermediary network agent but a JSON-RPC custom error is not. Firstly, only a JSON-RPC-enabled client can read it. Secondly, and more importantly, in JSON-RPC, the request status *is not metadata*. In pure HTTP, the details of the operation, such as the method, requested URL, execution status, and request / response headers are readable *without the necessity to parse the entire body*. In most higher-level protocols, including JSON-RPC, this is not the case: even a protocol-enabled client must read a body to retrieve that information.
How does an API developer benefit from the capability of reading request and response metadata? The modern client-server communication stack, as envisioned by Fielding, is multi-layered. We can enumerate a number of intermediary agents that process network requests and responses:
* Frameworks that developers use to write code
* Programming language APIs that frameworks are built on, and operating system APIs that compilers / interpreters of these languages rely on
* Intermediary proxy servers between a client and a server
* Various abstractions used in server programming, including server frameworks, programming languages, and operating systems
* Web server software that is typically placed in front of backend handlers
* Additional modern microservice-oriented tools such as API gateways and proxies
**NB**: in HTTP/2 (and future HTTP/3), separate binary frames are used for headers and data instead of the holistic text format. However, this doesn't affect the architectural concepts we will describe below. To avoid ambiguity, we will provide examples in the HTTP/1.1 format. You can find detailed information about the HTTP/2 format [here](https://hpbn.co/http2/).
The main advantage that following the letter of the HTTP standard offers is the possibility to rely on intermediary agents, from client frameworks to API gateways, to read the request metadata and perform actions based on it. This includes regulating timeouts and retry policies, logging, proxying, and sharding requests, among other things, without the necessity to write additional code to achieve these functionalities.
If we try to formulate the main principle of designing HTTP APIs, it will be: **you would rather design an API in a way that intermediary agents can read and interpret request and response metadata**.
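To make this principle tangible, here is a sketch of an intermediary that decides on retrying a failed request using only the metadata (the verb and the status code), without parsing the body. The transport function is a stand-in for a real HTTP client:

```python
IDEMPOTENT_VERBS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}
RETRIABLE_CODES = {502, 503, 504}

def perform(verb, url, transport, max_attempts=3):
    for attempt in range(max_attempts):
        status, body = transport(verb, url)
        if (status in RETRIABLE_CODES
                and verb in IDEMPOTENT_VERBS
                and attempt + 1 < max_attempts):
            continue  # the verb promises idempotency, retrying is safe
        return status, body

# A flaky stand-in transport: fails once, then succeeds
calls = []
def transport(verb, url):
    calls.append(verb)
    return (503, None) if len(calls) == 1 else (200, {"id": 123})

print(perform("GET", "/v1/orders/123", transport))  # (200, {'id': 123})
print(calls)  # ['GET', 'GET']: the request was transparently retried
```

Note that a `POST` request failing with the same `503` would be returned to the caller immediately: the metadata alone tells the intermediary that repeating it is unsafe.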
Unlike many alternative technologies, which are usually driven by a single large IT company (such as Facebook, Google, or Apache Software Foundation), tools for working with HTTP APIs are developed by many different companies and collaborations. As a result, instead of a single homogeneous but limited toolkit provided by a single vendor, we have a plethora of instruments that work with HTTP APIs. The most notable examples include:
* Proxies and gateways (nginx, Envoy, etc.)
* Different IDLs (first of all, OpenAPI) and related tools for working with specifications (Redoc, Swagger UI, etc.) and auto-generating code
* Programmer-oriented software that allows for convenient development and debugging of API clients (Postman, Insomnia), etc.
Of course, most of these instruments will work with APIs that utilize other paradigms. However, the ability to read HTTP metadata makes it possible to easily design complex pipelines such as exporting nginx access logs to Prometheus and generating response status code monitoring dashboards in Grafana that work out of the box.
Additionally, let's emphasize that the HTTP API paradigm is currently the default choice for *public* APIs. Because of the aforementioned reasons, partners can integrate an HTTP API without significant obstacles, regardless of their technological stack. Moreover, the prevalence of the technology lowers the entry barrier and the requirements for the qualification of partners' engineers.
The main disadvantage of HTTP APIs is that you have to rely on intermediary agents, from client frameworks to API gateways, to read the request metadata and perform actions based on it *without your consent*. This includes regulating timeouts and retry policies, logging, proxying, and sharding requests, among other things. Since HTTP-related specifications are complex and the concepts of REST can be challenging to comprehend, and software engineers do not always write perfect code, these intermediary agents (including partners' developers!) will sometimes interpret HTTP metadata *incorrectly*, especially when dealing with exotic and hard-to-implement standards. Usually, one of the stated reasons for developing new RPC frameworks is the desire to make working with the protocol simple and consistent, thereby reducing the likelihood of errors when writing integration code.
This conclusion applies not only to software but also to its creators. Developers' knowledge of HTTP APIs is fragmented as well. Almost every programmer is capable of working with HTTP APIs to some extent, but a significant number of them lack a thorough understanding of the standards and do not consult them while writing code. As a result, implementing business logic that effectively and consistently works with HTTP APIs can be more challenging than integrating alternative technologies. This observation holds true for both partner integrators and API providers themselves.
#### The Question of Performance
When discussing the advantages of alternative technologies such as GraphQL, gRPC, Apache Thrift, etc., the argument of lower performance of JSON-over-HTTP APIs is often presented. Specifically, the following issues with the technology are commonly mentioned:
1. The verbosity of the JSON format:
    * Mandatory field names in every object, even for an array of similar entities
    * The large proportion of technical symbols (quotes, braces, commas, etc.) and the necessity to escape them in string values
2. The common approach of returning a full resource representation on resource retrieval requests, even if the client only needs a subset of the fields
3. The lower performance of data serializing and deserializing operations
4. The need to introduce additional encoding, such as Base64, to handle binary data
5. Performance quirks of the HTTP protocol itself, particularly the inability to serve multiple simultaneous requests through one connection.
Headers contain *metadata* associated with a request or a response. They might describe properties of entities being passed (e.g., `Content-Length`), provide additional information regarding a client or a server (e.g., `User-Agent`, `Date`, etc.) or simply contain additional fields that are not directly related to the request/response semantics (such as `Authorization`).
Let's be honest: HTTP APIs do suffer from the listed problems. However, we can confidently say that the impact of these factors is often overestimated. The reason API vendors care little about HTTP API performance is that the actual overhead is not as significant as perceived. Specifically:
An important feature of headers is the possibility to read them before the message body is fully transmitted. This allows for altering request or response handling depending on the headers, and it is perfectly fine to manipulate headers while proxying requests; many network agents actually do add, remove, or modify headers in transit. In particular, modern web browsers automatically add a number of technical headers, such as `User-Agent`, `Origin`, `Accept-Language`, `Connection`, `Referer`, `Sec-Fetch-*`, etc., and modern server software automatically adds or modifies such headers as `X-Powered-By`, `Date`, `Content-Length`, `Content-Encoding`, `X-Forwarded-For`, etc.
1. Regarding the verbosity of the format, it is important to note that these issues are mainly relevant when compressing algorithms are not utilized. [Comparisons](https://nilsmagnus.github.io/post/proto-json-sizes/) have shown that enabling compression algorithms such as gzip largely reduces the difference in sizes between JSON documents and alternative binary formats (and there are compression algorithms specifically designed for processing text data, such as [brotli](https://datatracker.ietf.org/doc/html/rfc7932)).
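This effect is easy to reproduce with standard library tools alone. A minimal sketch (the order structure below is purely illustrative):

```python
import gzip
import json

# A hypothetical list of similar entities: the field names repeat in every
# array element, which is exactly the redundancy gzip compresses away
orders = [
    {"order_id": i, "status": "delivered",
     "price": {"amount": 500, "currency": "USD"}}
    for i in range(100)
]

raw = json.dumps(orders).encode("utf-8")
compressed = gzip.compress(raw)

# The compressed payload is many times smaller than the raw JSON
print(len(compressed) < len(raw) / 3)  # → True
```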
This freedom in manipulating headers can result in unexpected problems if an API uses them to transmit data as the field names developed by an API vendor can accidentally overlap with existing conventional headers, or worse, such a collision might occur in the future at any moment. To avoid this issue, the practice of adding the prefix `X-` to custom header names was frequently used in the past. More than ten years ago this practice was officially discouraged (see the detailed overview in [RFC 6648](https://www.rfc-editor.org/rfc/rfc6648)). Nevertheless, the prefix has not been fully abolished, and many semi-standard headers still contain it (notably, `X-Forwarded-For`). Therefore, using the `X-` prefix reduces the probability of collision but does not eliminate it. The same RFC reasonably suggests using the API vendor name as a prefix instead of `X-`. (We would rather recommend using both, i.e., sticking to the `X-ApiName-FieldName` format. Here `X-` is included for readability [to distinguish standard fields from custom ones], and the company or API name part helps avoid collisions with other non-standard header names).
2. If necessary, API designers can customize the list of returned fields in HTTP APIs. It aligns well with both the letter and the spirit of the standard. However, as we already explained to the reader in the “[Partial Updates](#api-patterns-partial-updates)” chapter, trying to minimize traffic by returning only subsets of data is rarely justified in well-designed APIs.
Additionally, headers are used as control flow instructions for so-called “content negotiation,” which allows the client and server to agree on a response format (through `Accept*` headers) and to perform conditional requests that aim to reduce traffic by skipping response bodies, either fully or partially (through `If-*` headers, such as `If-Range`, `If-Modified-Since`, etc.)
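The server-side part of such a conditional exchange can be sketched as follows (a framework-agnostic illustration; the function name and the tuple-based response shape are our own conventions, not a real API):

```python
def handle_conditional_get(stored_etag, body, if_none_match=None):
    # If the client's cached ETag is still current, the body is skipped
    # entirely and 304 Not Modified is returned, saving traffic
    if if_none_match is not None and if_none_match == stored_etag:
        return 304, {"ETag": stored_etag}, b""
    return 200, {"ETag": stored_etag}, body

status, headers, body = handle_conditional_get(
    '"v42"', b'{"status": "ok"}', if_none_match='"v42"'
)
print(status)  # → 304
```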
3. If standard JSON deserializers are used, the overhead compared to binary standards might indeed be significant. However, if this overhead is a real problem, it makes sense to consider alternative JSON parsers such as [simdjson](https://github.com/simdjson/simdjson). Thanks to its low-level, highly optimized code, simdjson demonstrates impressive throughput that is sufficient for all APIs except some corner cases.
##### HTTP Verbs
4. Generally speaking, the HTTP API paradigm implies that binary data (such as images or video files) is served through separate endpoints. Returning binary data in JSON is only necessary when a separate request for the data is a problem from the performance perspective. These situations are virtually non-existent in server-to-server interactions and/or if HTTP/2 or a higher protocol version is used.
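To illustrate the Base64 overhead mentioned above: the encoding maps every 3 bytes to 4 ASCII characters, so the payload grows by roughly a third:

```python
import base64

binary = bytes(range(256)) * 4  # 1024 bytes of sample binary data

encoded = base64.b64encode(binary)

# ceil(1024 / 3) = 342 groups of 4 characters each, i.e., ~33% overhead
print(len(binary), len(encoded))  # → 1024 1368
```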
One important component of an HTTP request is a method (verb) that describes the operation being applied to a resource. RFC 9110 standardizes eight verbs — namely, `GET`, `POST`, `PUT`, `DELETE`, `HEAD`, `CONNECT`, `OPTIONS`, and `TRACE` — of which we as API developers are interested in the first four. The `CONNECT`, `OPTIONS`, and `TRACE` methods are technical and rarely used in HTTP APIs (except for `OPTIONS`, which needs to be implemented to ensure access to the API from a web browser). Theoretically, the `HEAD` verb, which allows for requesting *resource metadata only*, might be quite useful in API design. However, for reasons unknown to us, it did not take root in this capacity.
5. The HTTP/1.1 protocol version is indeed a suboptimal solution if request multiplexing is needed. However, alternative approaches to tackling the problem usually rely on… HTTP/2. Of course, an HTTP API can also be served over HTTP/2.
Apart from RFC 9110, many other specifications propose additional HTTP verbs, such as `COPY`, `LOCK`, `SEARCH`, etc. — the full list can be found in [the registry](http://www.iana.org/assignments/http-methods/http-methods.xhtml). However, only one of them gained widespread popularity — the `PATCH` method. The reasons for this state of affairs are quite trivial: the five methods (`GET`, `POST`, `PUT`, `DELETE`, and `PATCH`) are enough for almost any API.
Let us reiterate once more: JSON-over-HTTP APIs are *indeed* less performant than binary protocols. Nevertheless, we take the liberty to say that for a well-designed API in common subject areas, switching to alternative protocols will yield quite a modest gain.
HTTP verbs define two important characteristics of an HTTP call:
* Semantics: what the operation *means*
* Side effects:
* Whether the request modifies any resource state or if it is safe (and therefore, could it be cached)
* Whether the request is idempotent or not.
#### Advantages and Disadvantages of the JSON Format
| Verb | Semantics | Is safe (non-modifying) | Is idempotent | Can have a body |
|------|-----------|-------------------------|---------------|------------|
| GET | Returns a representation of a resource | Yes | Yes | Should not |
| PUT | Replaces (fully overwrites) a resource with a provided entity | No | Yes | Yes |
| DELETE | Deletes a resource | No | Yes | Should not |
| POST | Processes a provided entity according to its internal semantics | No | No | Yes |
| PATCH | Modifies (partially overwrites) a resource with a provided entity | No | No | Yes |
It's not hard to notice that most of the claims regarding HTTP API performance are actually not about the HTTP protocol but the JSON format. There is no problem in developing an HTTP API that will utilize any binary format (including, for instance, [Protocol Buffers](https://protobuf.dev/)). Then the difference between a Protobuf-over-HTTP API and a gRPC API would be just about using granular URLs, status codes, request/response headers, and the ensuing (in)ability to use integrated software tools out of the box.
**NB**: contrary to a popular misconception, the `POST` method is not limited to creating new resources.
However, on many occasions (including this book) developers prefer the textual JSON over binary Protobuf (Flatbuffers, Thrift, Avro, etc.) for a very simple reason: JSON is easy to read. First, it's a text format and doesn't require additional decoding. Second, it's self-descriptive, meaning that property names are included. Unlike Protobuf-encoded messages which are basically impossible to read without a `.proto` file, one can make a very good guess what a JSON document is about at a glance. Provided that request metadata in HTTP APIs is readable as well, we ultimately get a communication format that is easy to parse and understand with just our eyes.
The most important property of modifying idempotent verbs is that **the URL serves as an idempotency key for the request**. The `PUT /url` operation fully overwrites a resource, so repeating the request won't change the resource. Conversely, retrying a `DELETE /url` request must leave the system in the same state where the `/url` resource is deleted. Regarding the `GET /url` method, it must semantically return the representation of the same target resource `/url`. If it exists, its implementation must be consistent with prior `PUT` / `DELETE` operations. If the resource was overwritten via `PUT /url`, a subsequent `GET /url` call must return a representation that matches the entity enclosed in the `PUT /url` request. In the case of JSON-over-HTTP APIs, this simply means that `GET /url` returns the same data as what was passed in the preceding `PUT /url`, possibly normalized and equipped with default values. On the other hand, a `DELETE /url` call must remove the resource, resulting in subsequent `GET /url` requests returning a `404` or `410` error.
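The contract described above can be sketched with a toy in-memory “server” (an illustration only; `put` / `get` / `delete` are hypothetical helpers, not real framework code):

```python
# The URL acts as the idempotency key for PUT / DELETE, and the
# GET / PUT / DELETE operations stay symmetric with regard to it
storage = {}

def put(url, entity):
    # Fully overwrite the resource; normalize it by adding a default value
    storage[url] = {**entity, "status": entity.get("status", "created")}
    return 200

def get(url):
    return (200, storage[url]) if url in storage else (404, None)

def delete(url):
    storage.pop(url, None)  # repeating DELETE leaves the system unchanged
    return 204

put("/orders/1", {"recipe": "lungo"})
put("/orders/1", {"recipe": "lungo"})  # a retry changes nothing
print(get("/orders/1"))  # → (200, {'recipe': 'lungo', 'status': 'created'})
delete("/orders/1")
delete("/orders/1")  # a retry still leaves the resource deleted
print(get("/orders/1"))  # → (404, None)
```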
Apart from being human-readable, JSON features another important advantage: it is strictly formal, meaning it does not contain any constructs that can be interpreted differently in different architectures (with a possible exception of the sizes of numbers and strings), and the deserialization result aligns very well with native data structures (i.e., indexed and associative arrays) of almost every programming language. From this point of view, we actually had no other choice when selecting a format for code samples in this book.
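As a quick illustration of this mapping onto native data structures:

```python
import json

document = '{"order_id": 123, "items": ["espresso", "lungo"]}'
data = json.loads(document)

# JSON objects and arrays map directly onto the language's native
# associative and indexed arrays: a dict and a list in Python's case
print(type(data).__name__, type(data["items"]).__name__)  # → dict list
```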
The idempotency and symmetry of the `GET` / `PUT` / `DELETE` methods imply that neither `GET` nor `DELETE` can have a body as no reasonable meaning could be associated with it. However, most web server software allows these methods to have bodies and transmits them further to the endpoint handler, likely because many software engineers are unaware of the semantics of the verbs (although we strongly discourage relying on this behavior).
If you happen to design a less general API for a specific subject area, we still recommend the same approach for choosing a format:
* Estimate the overhead of preparing and introducing tools to decipher binary protocols versus the overhead of using not the most optimal data transfer protocols.
* Make an assessment of what is more important to you: having a quality but restricted in its capabilities set of bundled software or having the possibility of using a broad range of tools that work with HTTP APIs, even though their quality is not that high.
* Evaluate the cost of finding developers proficient with the format.
For obvious reasons, responses to modifying endpoints are not cached (though there are some conditions to use a response to a `POST` request as cached data for subsequent `GET` requests). This ensures that repeating `POST` / `PUT` / `DELETE` / `PATCH` requests will hit the server as no intermediary agent can respond with a cached result. In the case of a `GET` request, it is generally not true. Only the presence of `no-store` or `no-cache` directives in the response guarantees that the subsequent `GET` request will reach the server.
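The decision an intermediary makes can be sketched as follows (a heavily simplified illustration; the real rules in RFC 9111 are considerably more involved):

```python
def may_serve_from_cache(method, cache_control):
    if method != "GET":
        return False  # modifying requests must reach the server
    # Collect bare directive names from the Cache-Control header value
    directives = {
        directive.strip().split("=")[0].lower()
        for directive in cache_control.split(",")
        if directive.strip()
    }
    # no-store / no-cache guarantee the next GET will hit the server
    return not ({"no-store", "no-cache"} & directives)

print(may_serve_from_cache("GET", "max-age=300"))        # → True
print(may_serve_from_cache("GET", "no-cache, private"))  # → False
print(may_serve_from_cache("POST", "max-age=300"))       # → False
```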
One of the most widespread HTTP API design antipatterns is violating the semantics of HTTP verbs:
* Placing modifying operations in a `GET` handler. This can lead to the following problems:
* Interim agents might respond to such a request using a cached value if a required caching directive is missing, or vice versa, automatically repeat a request upon receiving a network timeout.
* Some agents consider themselves eligible to traverse hyper-references (i.e., making HTTP `GET` requests) without the explicit user's consent. For example, social networks and messengers perform such calls to generate a preview for a link when a user tries to share it.
* Placing non-idempotent operations in `PUT` / `DELETE` handlers. Although interim agents do not typically repeat modifying requests regardless of their alleged idempotency, a client or server framework can easily do so. This mistake is often coupled with requiring passing a body alongside a `DELETE` request to discern the specific object that needs to be deleted, which per se is a problem as any interim agent might discard such a body.
* Ignoring the `GET` / `PUT` / `DELETE` symmetry requirement. This can manifest in different ways, such as:
* Making a `GET /url` operation return data even after a successful `DELETE /url` call
* Making a `PUT /url` operation take the identifiers of the entities to modify from the request body instead of the URL, resulting in the `GET /url` operation's inability to return a representation of the entity passed to the `PUT /url` handler.
##### Status Codes
A status code is a machine-readable three-digit number that describes the outcome of an HTTP request. There are five groups of status codes:
* `1xx` codes are informational. Among these, the `100 Continue` code is probably the only one that is commonly used.
* `2xx` codes indicate that the operation was successful.
* `3xx` codes are redirection codes, implying that additional actions must be taken to consider the operation fully successful.
* `4xx` codes represent client errors.
* `5xx` codes represent server errors.
**NB**: the separation of codes into groups by the first digit is of practical importance. If the client is unaware of the meaning of an `xyz` code returned by the server, it must conduct actions as if an `x00` code was received.
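This fallback rule can be expressed in a few lines (a sketch; the set of “known” codes is obviously client-specific):

```python
def effective_code(status, known_codes):
    # An unknown code degrades to the generic x00 code of its class
    return status if status in known_codes else status // 100 * 100

KNOWN = {200, 201, 204, 301, 302, 304, 400, 401, 403, 404, 500, 503}
print(effective_code(404, KNOWN))  # → 404
print(effective_code(418, KNOWN))  # → 400
print(effective_code(507, KNOWN))  # → 500
```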
The idea behind status codes is obviously to make errors machine-readable so that all interim agents can detect what has happened with a request. The HTTP status code nomenclature effectively describes nearly every problem applicable to an HTTP request, such as invalid `Accept-*` header values, missing `Content-Length`, unsupported HTTP verbs, excessively long URIs, etc.
Unfortunately, the HTTP status code nomenclature is not well-suited for describing errors in *business logic*. To return machine-readable errors related to the semantics of the operation, it is necessary either to use status codes unconventionally (i.e., in violation of the standard) or to enrich responses with additional fields. Designing custom errors in HTTP APIs will be discussed in the corresponding chapter.
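A common way to enrich responses is a dedicated JSON error body. A sketch (all field names here, such as `reason`, are illustrative conventions rather than a standard):

```python
import json

def business_error(http_code, reason, message, **details):
    # A machine-readable reason plus a human-readable message,
    # optionally extended with operation-specific details
    body = {"reason": reason, "localized_message": message, **details}
    return http_code, {"Content-Type": "application/json"}, json.dumps(body)

status, headers, body = business_error(
    409,
    "already_pending",
    "An order for this recipe is already being prepared",
    existing_order_id="order-1234",
)
print(status, json.loads(body)["reason"])  # → 409 already_pending
```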
**NB**: note the problem with the specification design. By default, all `4xx` codes are non-cacheable, but there are several exceptions, namely the `404`, `405`, `410`, and `414` codes. While we believe this was done with good intentions, the number of developers aware of this nuance is likely to be similar to the number of HTTP specification editors.
#### One Important Remark Regarding Caching
Caching is a crucial aspect of modern microservice architecture design. It can be tempting to control caching at the protocol level, and the HTTP standard provides various tools to facilitate this. However, the author of this book must warn you: if you decide to utilize these tools, it is essential to thoroughly understand the standard. Flaws in the implementation of certain techniques can result in disruptive behavior. The author personally experienced a major outage caused by the aforementioned lack of knowledge regarding the default cacheability of `404` responses. In this incident, some settings for an important geographical area were mistakenly deleted. Although the problem was quickly localized and the settings were restored, the service remained inoperable in the area for several hours because clients had cached the `404` response and did not request it anew until the cache had expired.
#### One Important Remark Regarding Consistency
One parameter might be placed in different components of an HTTP request. For example, an identifier of a partner making a request might be passed as part of:
* A domain name, e.g., `{partner_id}.domain.tld`
* A path, e.g., `/v1/{partner_id}/orders`
* A query parameter, e.g., `/v1/orders?partner_id=<partner_id>`
* A header value, e.g.
```
GET /v1/orders HTTP/1.1
X-ApiName-Partner-Id: <partner_id>
```
* A field within the request body, e.g.
```
POST /v1/orders/retrieve HTTP/1.1
{
"partner_id": <partner_id>
}
```
There are also more exotic options, such as placing a parameter in the scheme of a request or in the `Content-Type` header.
However, when we move a parameter around different components, we face three annoying issues:
* Some tokens are case-sensitive (path, query parameters, JSON field names), while others are not (domain and header names)
* With header *values*, there is even more chaos: some of them are required to be case-insensitive (e.g., `Content-Type`), while others are prescribed to be case-sensitive (e.g., `ETag`)
* Allowed symbols and escaping rules differ as well:
* Notably, there is no widespread practice for escaping the `/`, `?`, and `#` symbols in a path
* Unicode symbols in domain names are allowed (though not universally supported) through a peculiar encoding technique called “[Punycode](https://www.rfc-editor.org/rfc/rfc3492.txt)”
* Traditionally, different casings are used in different parts of an HTTP request:
* `kebab-case` in domains, headers, and paths
* `snake_case` in query parameters
* `snake_case` or `camelCase` in request bodies.
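As a side note, the Punycode encoding mentioned above is easy to observe with Python's built-in `idna` codec (which implements the older IDNA 2003 rules; newer rules live in the third-party `idna` package):

```python
# The non-ASCII label is converted into an ASCII-compatible "xn--" form
domain = "bücher.example"
print(domain.encode("idna"))  # → b'xn--bcher-kva.example'

# The transformation is reversible
print(b"xn--bcher-kva.example".decode("idna"))  # → bücher.example
```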
Furthermore, using both `snake_case` and `camelCase` in domain names is impossible as the underscore sign is not allowed and capital letters will be lowercased during URL normalization.
Theoretically, it is possible to use `kebab-case` everywhere. However, most programming languages do not allow variable names and object fields in `kebab-case`, so working with such an API would be quite inconvenient.
To wrap this up, the situation with casing is so spoiled and convoluted that there is no consistent solution to employ. In this book, we follow this rule: tokens are cased according to the common practice for the corresponding request component. If a token's position changes, the casing is changed as well. (However, we're far from recommending following this approach unconditionally. Our recommendation is rather to try to avoid increasing the entropy by choosing a solution that minimizes the probability of misunderstanding.)
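For instance, moving a hypothetical `partner-id` token between request components could involve helper conversions like these (a sketch, not a general-purpose library):

```python
def kebab_to_snake(token):
    # header / path casing → query parameter casing
    return token.replace("-", "_")

def snake_to_camel(token):
    # query parameter casing → request body casing
    head, *tail = token.split("_")
    return head + "".join(word.capitalize() for word in tail)

header_name = "partner-id"                 # as it appears in a header
query_name = kebab_to_snake(header_name)   # as a query parameter
body_field = snake_to_camel(query_name)    # as a JSON body field
print(query_name, body_field)  # → partner_id partnerId
```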
**NB**: strictly speaking, JSON stands for “JavaScript Object Notation,” and in JavaScript, the default casing is `camelCase`. However, we dare to say that JSON ceased to be a format bound to JavaScript long ago and is now a universal format for organizing communication between agents written in different programming languages. Employing `snake_case` allows for easily moving a parameter from a query to a body, which is the most frequent case, although the inverse solution (i.e., using `camelCase` in query parameter names) is also possible.


Additionally, we'd like to provide some code style advice:
* Do not place unsafe operations behind the `GET` verb, and do not place non-idempotent operations behind the `PUT` / `DELETE` methods.
* Maintain the `GET` / `PUT` / `DELETE` operations symmetry.
* Do not allow `GET` / `HEAD` / `DELETE` requests to have a body and do not provide bodies in response to `HEAD` requests or alongside the `204` status code.
* Do not invent your own standards for passing arrays and nested objects as query parameters. It is better to use an HTTP verb that allows having a body, or as a last resort pass the parameter as a Base64-encoded JSON-stringified value.
* Do not put parameters that require escaping (i.e., non-alphanumeric ones) in a path or a domain of a URL. Use query or body parameters for this purpose.
8. Familiarize yourself with at least the basics of typical vulnerabilities in HTTP APIs used by attackers, such as:


### [Advantages and Disadvantages of HTTP APIs Compared to Alternative Technologies][http-api-pros-and-cons]
After reading the previous chapter, the reader might pose a reasonable question: why does this dichotomy exist in the first place? Some APIs rely on standard HTTP semantics, others reject it entirely in favor of newly invented standards, and still others sit somewhere in between. For example, if we look at the [JSON-RPC response format](https://www.jsonrpc.org/specification#response_object), we will find that it could easily be replaced with standard HTTP protocol capabilities. Instead of