mirror of https://github.com/twirl/The-API-Book.git synced 2025-01-23 17:53:04 +02:00
Sergey Konstantinov 2022-05-10 00:54:23 +03:00
parent 0ad3749f52
commit 751703da4f
11 changed files with 130 additions and 130 deletions


@ -1,6 +1,6 @@
### On the Structure of This Book
The book you're holding in your hands comprises this Introduction and two sections: ‘The API Design’ and ‘Backwards Compatibility’.
The book you're holding in your hands comprises this Introduction and two sections: ‘The API Design’ and ‘The Backwards Compatibility’.
In Section I, we will discuss designing APIs as a concept: how to build the architecture properly, from high-level planning down to final interfaces.


@ -6,9 +6,9 @@ What API *means* apart from the formal definition?
You're possibly reading this book using a Web browser. To make the browser display this page correctly, a bunch of stuff must work correctly: parsing the URL according to the specification; DNS service; TLS handshake protocol; transmitting the data over HTTP protocol; HTML document parsing; CSS document parsing; correct HTML+CSS rendering.
But those are just the tip of the iceberg. To make HTTP protocol work you need the entire network stack (comprising 4-5 or even more different level protocols) work correctly. HTML document parsing is being performed according to hundreds of different specifications. Document rendering calls the underlying operating system API, or even directly graphical processor API. And so on: down to modern CISC processor commands being implemented on top of the microcommands API.
But those are just the tip of the iceberg. To make the HTTP protocol work you need the entire network stack (comprising 4-5 or even more different level protocols) to work correctly. HTML document parsing is performed according to hundreds of different specifications. Document rendering calls the underlying operating system API, or even the graphics processor API directly. And so on: down to modern CISC processor commands being implemented on top of the microcommands API.
In other words, hundreds or even thousands of different APIs must work correctly to make basic actions possible, like viewing a webpage. Modern internet technologies simply couldn't exist without these tons of API working fine.
In other words, hundreds or even thousands of different APIs must work correctly to make basic actions possible, like viewing a webpage. Modern internet technologies simply couldn't exist without these tons of APIs working fine.
**An API is an obligation**. A formal obligation to connect different programmable contexts.
@ -19,8 +19,8 @@ When I'm asked for an example of a well-designed API, I usually show a picture o
* it interconnects two areas;
* backwards compatibility being broken not a single time in two thousand years.
What differs between a Roman aqueduct and a good API is that APIs presume a contract being *programmable*. To connect two areas some *coding* is needed. The goal of this book is to help you in designing APIs which serve their purposes as solidly as a Roman aqueduct does.
What differs between a Roman aqueduct and a good API is that APIs presume a contract to be *programmable*. To connect two areas some *coding* is needed. The goal of this book is to help you in designing APIs which serve their purposes as solidly as a Roman aqueduct does.
An aqueduct also illustrates another problem of the API design: your customers are engineers themselves. You are not supplying water to end-users: suppliers are plugging their pipes to your engineering structure, building their own structures upon it. From one side, you may provide access to the water to many more people through them, not spending your time on plugging each individual house to your network. From the other side, you can't control the quality of suppliers' solutions, and you are to be blamed every time there is a water problem caused by their incompetence.
An aqueduct also illustrates another problem of the API design: your customers are engineers themselves. You are not supplying water to end-users: suppliers are plugging their pipes into your engineering structure, building their own structures upon it. On the one hand, you may provide access to the water to many more people through them, not spending your time plugging each individual house into your network. On the other hand, you can't control the quality of suppliers' solutions, and you are to be blamed every time there is a water problem caused by their incompetence.
That's why designing the API implies a larger area of responsibility. **API is a multiplier to both your opportunities and mistakes**.


@ -9,8 +9,8 @@ So, how the API design might help the developers? Quite simple: a well-designed
* the API must be readable; ideally, developers write correct code after just looking at the method nomenclature, never bothering about details (especially API implementation details!); it is also very important to mention, that not only problem solution (the ‘happy path’) should be obvious, but also possible errors and exceptions (the ‘unhappy path’);
* the API must be consistent; while developing new functionality (i.e. while using unknown API entities) developers may write new code similar to the code they already wrote using known API concepts, and this new code will work.
However static convenience and clarity of APIs is a simple part. After all, nobody seeks for making an API deliberately irrational and unreadable. When we are developing an API, we always start with clear basic concepts. Providing you've got some experience in APIs, it's quite hard to make an API core that fails to meet obviousness, readability, and consistency criteria.
However, static convenience and clarity of APIs are the simple part. After all, nobody seeks to make an API deliberately irrational and unreadable. When we are developing an API, we always start with clear basic concepts. Provided you've got some experience in APIs, it's quite hard to make an API core that fails to meet obviousness, readability, and consistency criteria.
Problems begin when we start to expand our API. Adding new functionality sooner or later result in transforming once plain and simple API into a mess of conflicting concepts, and our efforts to maintain backwards compatibility lead to illogical, unobvious, and simply bad design solutions. It is partly related to an inability to predict the future in detail: your understanding of ‘fine’ APIs will change over time, both in objective terms (what problems the API is to solve, and what are the best practices) and in subjective ones too (what obviousness, readability, and consistency *really means* regarding your API).
Problems begin when we start to expand our API. Adding new functionality sooner or later results in transforming a once plain and simple API into a mess of conflicting concepts, and our efforts to maintain backwards compatibility lead to illogical, unobvious, and simply bad design solutions. It is partly related to an inability to predict the future in detail: your understanding of ‘fine’ APIs will change over time, both in objective terms (what problems the API is to solve, and what are the best practices) and in subjective ones too (what obviousness, readability, and consistency *really mean* to your API design).
The principles we are explaining below are specifically oriented to making APIs evolve smoothly over time, not being turned into a pile of mixed inconsistent interfaces. It is crucial to understand that this approach isn't free: a necessity to bear in mind all possible extension variants and to preserve essential growth points means interface redundancy and possibly excessing abstractions being embedded in the API design. Besides both make developers' work harder. **Providing excess design complexities being reserved for future use makes sense only when this future actually exists for your API. Otherwise, it's simply an overengineering.**
The principles we are explaining below are specifically oriented to making APIs evolve smoothly over time, not being turned into a pile of mixed inconsistent interfaces. It is crucial to understand that this approach isn't free: a necessity to bear in mind all possible extension variants and to preserve essential growth points means interface redundancy and possibly excessive abstractions being embedded in the API design. Besides, both make the developers' jobs harder. **Providing excess design complexities being reserved for future use makes sense only if this future actually exists for your API. Otherwise, it's simply overengineering.**


@ -2,7 +2,7 @@
Backwards compatibility is a *temporal* characteristic of your API. An obligation to maintain backwards compatibility is the crucial point where API development differs from software development in general.
Of course, backwards compatibility isn't an absolute. In some subject areas shipping new backwards-incompatible API versions is a routine. Nevertheless, every time you deploy new backwards-incompatible API version, the developers need to make some non-zero effort to adapt their code to the new API version. In this sense, releasing new API versions puts a sort of a ‘tax’ on customers. They must spend quite real money just to make sure their product continues working.
Of course, backwards compatibility isn't an absolute. In some subject areas shipping new backwards-incompatible API versions is a routine. Nevertheless, every time you deploy a new backwards-incompatible API version, the developers need to make some non-zero effort to adapt their code to the new API version. In this sense, releasing new API versions puts a sort of a ‘tax’ on customers. They must spend quite real money just to make sure their product continues working.
Large companies, which occupy firm market positions, can afford to impose such taxation. Furthermore, they may introduce penalties for those who refuse to adapt their code to new API versions, up to disabling their applications.


@ -2,10 +2,10 @@
Here and throughout we firmly stick to [semver](https://semver.org/) principles of versioning:
1. API versions are denoted with three numbers, e.g. `1.2.3`.
1. First number (major version) increases when backwards incompatible changes in the API are shipped.
2. Second Number (minor version) increases when new functionality is added to the API, keeping backwards compatibility intact.
1. First number (major version) increases when backwards-incompatible changes in the API are shipped.
2. Second number (minor version) increases when new functionality is added to the API, keeping backwards compatibility intact.
3. Third number (patch) increases when a new API version contains bug fixes only.
Sentences ‘major API version’ and ‘new API version, containing backwards-incompatible changes’ are therefore to be considered as equivalent ones.
Sentences ‘major API version’ and ‘new API version, containing backwards-incompatible changes’ are therefore to be considered equivalent ones.
In Section II we will discuss versioning policies in more detail. In Section I, we will just use semver versions designation, specifically `v1`, `v2`, etc.
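The versioning rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`parse_version`, `is_backwards_compatible`, `major_tag`) are hypothetical, and only plain `MAJOR.MINOR.PATCH` strings are handled (no pre-release or build tags).

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (hypothetical helper)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def is_backwards_compatible(old: str, new: str) -> bool:
    """A newer version is compatible only if the major number is unchanged."""
    old_v, new_v = parse_version(old), parse_version(new)
    return new_v >= old_v and new_v.major == old_v.major


def major_tag(text: str) -> str:
    """Short designation such as 'v1', the form used in endpoint paths."""
    return f"v{parse_version(text).major}"
```

For instance, under these assumptions moving from `1.2.3` to `1.3.0` counts as compatible, while moving to `2.0.0` does not, and both `1.2.3` and `1.3.0` map to the `v1` designation.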


@ -1,10 +1,10 @@
### Terms and Notation Keys
Software development is being characterized, among other things, by the existence of many different engineering paradigms, whose adepts sometimes are quite aggressive towards other paradigms' adepts. While writing this book we are deliberately avoiding using terms like ‘method’, ‘object’, ‘function’, and so on, using a neutral term ‘entity’ instead. ‘Entity’ means some atomic functionality unit, like class, method, object, monad, prototype (underline what you think is right).
Software development is characterized, among other things, by the existence of many different engineering paradigms, whose adepts sometimes are quite aggressive towards other paradigms' adepts. While writing this book we are deliberately avoiding using terms like ‘method’, ‘object’, ‘function’, and so on, using the neutral term ‘entity’ instead. ‘Entity’ means some atomic functionality unit, like class, method, object, monad, prototype (underline what you think is right).
As for an entity's components, we regretfully failed to find a proper term, so we will use the words ‘fields’ and ‘methods’.
Most of the examples of APIs will be provided in a form of JSON-over-HTTP endpoints. This is some sort of notation which, as we see it, helps to describe concepts in the most comprehensible manner. `GET /v1/orders` endpoint call could easily be replaced with `orders.get()` method call, local or remote; JSON could easily be replaced with any other data format. The meaning of assertions shouldn't change.
Most of the examples of APIs will be provided in the form of JSON-over-HTTP endpoints. This is some sort of notation that, as we see it, helps to describe concepts in the most comprehensible manner. A `GET /v1/orders` endpoint call could easily be replaced with an `orders.get()` method call, local or remote; JSON could easily be replaced with any other data format. The semantics of statements shouldn't change.
Let's take a look at the following example:
@ -32,11 +32,11 @@ It should be read like this:
* a specific `X-Idempotency-Token` header is added to the request alongside standard headers (which we omit);
* terms in angle brackets (`<idempotency token>`) describe the semantics of an entity value (field, header, parameter);
* a specific JSON, containing a `some_parameter` field and some other unspecified fields (indicated by ellipsis) is being sent as a request body payload;
* in response (marked with arrow symbol `→`) server returns a `404 Not Founds` status code; the status might be omitted (treat it like `200 OK` if no status is provided);
* in response (marked with an arrow symbol `→`) the server returns a `404 Not Found` status code; the status might be omitted (treat it like a `200 OK` if no status is provided);
* the response could possibly contain additional notable headers;
* the response body is a JSON comprising a single `error_message` field; field value absence means that field contains exactly what you expect it should contain — some error message in this case.
Term ‘client’ here stands for an application being executed on a user's device, either native or web one. Terms ‘agent’ and ‘user agent’ are synonymous to ‘client’.
The term ‘client’ here stands for an application being executed on a user's device, either a native or a web one. The terms ‘agent’ and ‘user agent’ are synonymous with ‘client’.
Some request and response parts might be omitted if they are irrelevant to the topic being discussed.


@ -1,24 +1,24 @@
### Defining an Application Field
The key question you should ask yourself looks like that: what problem we solve? It should be asked four times, each time putting an emphasis on another word.
The key question you should ask yourself looks like that: what problem do we solve? It should be asked four times, each time putting an emphasis on another word.
1. *What* problem we solve? Could we clearly outline the situation in which our hypothetical API is needed by developers?
1. *What* problem do we solve? Could we clearly outline the situation in which our hypothetical API is needed by developers?
2. What *problem* we solve? Are we sure that the abovementioned situation poses a problem? Does someone really want to pay (literally or figuratively) to automate a solution for this problem?
2. What *problem* do we solve? Are we sure that the abovementioned situation poses a problem? Does someone really want to pay (literally or figuratively) to automate a solution for this problem?
3. What problem *we* solve? Do we actually possess the expertise to solve the problem?
3. What problem do *we* solve? Do we actually possess the expertise to solve the problem?
4. What problem we *solve*? Is it true that the solution we propose solves the problem indeed? Aren't we creating another problem instead?
4. What problem do we *solve*? Is it true that the solution we propose solves the problem indeed? Aren't we creating another problem instead?
So, let's imagine that we are going to develop an API for automated coffee ordering in city cafes, and let's apply the key question to it.
1. Why would someone need an API to make a coffee? Why ordering a coffee via ‘human-to-human’ or ‘human-to-machine’ interface is inconvenient, why have ‘machine-to-machine’ interface?
1. Why would someone need an API to make a coffee? Why is ordering a coffee via ‘human-to-human’ or ‘human-to-machine’ interfaces inconvenient, and why have a ‘machine-to-machine’ interface?
* Possibly, we're solving knowledge and selection problems? To provide humans with full knowledge of what options they have right now and right here.
* Possibly, we're solving awareness and selection problems? To provide humans with full knowledge of what options they have right now and right here.
* Possibly, we're optimizing waiting times? To save the time people waste while waiting for their beverages.
* Possibly, we're reducing the number of errors? To help people get exactly what they wanted to order, stop losing information in imprecise conversational communication, or in dealing with unfamiliar coffee machine interfaces?
‘Why’ question is the most important of all questions you must ask yourself. And not only about global project goals, but also locally about every single piece of functionality. **If you can't briefly and clearly answer the question ‘what this entity is needed for’, then it's not needed**.
The ‘why’ question is the most important of all questions you must ask yourself. And not only about global project goals, but also locally about every single piece of functionality. **If you can't briefly and clearly answer the question ‘what this entity is needed for’ then it's not needed**.
Here and throughout we assume, to make our example more complex and bizarre, that we are optimizing all three factors.
@ -34,7 +34,7 @@ In general, there are no simple answers to those questions. Ideally, you should
Since our book is dedicated not to software development per se, but to developing APIs, we should look at all those questions from a different angle: why solving those problems specifically requires an API, not simply a specialized software application? In terms of our fictional example, we should ask ourselves: why provide a service to developers, allowing for brewing coffee to end users, instead of just making an app?
In other words, there must be a solid reason to split two software development domains: there are the operators which provide APIs; and there are the operators which develop services for end users. Their interests are somehow different to such an extent, that coupling these two roles in one entity is undesirable. We will talk about the motivation to specifically provide APIs in more detail in Section III.
In other words, there must be a solid reason to split two software development domains: there are the operators which provide APIs, and there are the operators which develop services for end users. Their interests are somehow different to such an extent, that coupling these two roles in one entity is undesirable. We will talk about the motivation to specifically provide APIs in more detail in Section III.
We should also note, that you should try making an API when and only when you wrote ‘because that's our area of expertise’ in question 2. Developing APIs is a sort of meta-engineering: you're writing some software to allow other companies to develop software to solve users' problems. You must possess expertise in both domains (APIs and user products) to design your API well.
@ -42,7 +42,7 @@ As for our speculative example, let us imagine that in the near future some tect
#### What and How
After finishing all these theoretical exercises, we should proceed right to designing and developing the API, having a decent understanding regarding two things:
After finishing all these theoretical exercises, we should proceed right to designing and developing the API, having a decent understanding of two things:
* *what* we're doing, exactly;
* *how* we're doing it, exactly.


@ -12,7 +12,7 @@ Back to our coffee example. What entity abstraction levels do we see?
2. Each cup of coffee is being prepared according to some `recipe`, which implies the presence of different ingredients and sequences of preparation steps.
3. Each beverage is being prepared on some physical `coffee machine`, occupying some position in space.
Every level presents a developer-facing ‘facet’ in our API. While elaborating abstractions hierarchy, we are first of all trying to reduce the interconnectivity of different entities. That would help us to reach several goals.
Every level presents a developer-facing ‘facet’ in our API. While elaborating on the hierarchy of abstractions, we are first of all trying to reduce the interconnectivity of different entities. That would help us to reach several goals.
1. Simplifying developers' work and the learning curve. At each moment of time, a developer is operating only those entities which are necessary for the task they're solving right now. And conversely, badly designed isolation leads to the situation when developers have to keep in mind lots of concepts mostly unrelated to the task being solved.
@ -55,7 +55,7 @@ This solution intuitively looks bad, and it really is: it violates all the above
Variant I: we have a list of possible volumes fixed and introduce bogus recipes like `/recipes/small-lungo` or `recipes/large-lungo`. Why ‘bogus’? Because it's still the same lungo recipe, same ingredients, same preparation steps, only volumes differ. We will have to start the mass production of recipes, only different in volume, or introduce some recipe ‘inheritance’ to be able to specify the ‘base’ recipe and just redefine the volume.
Variant II: we modify an interface, pronouncing volumes stated in recipes being just the default values. We allow to request different cup volume when placing an order:
Variant II: we modify an interface, declaring the volumes stated in recipes to be just the default values. We allow requesting a different cup volume while placing an order:
```
POST /v1/orders
@ -66,15 +66,15 @@ POST /v1/orders
}
```
For those orders with an arbitrary volume requested, a developer will need to obtain the requested volume not from `GET /v1/recipes`, but `GET /v1/orders`. Doing so we're getting a whole bunch of related problems:
* there is a significant chance that developers will make mistakes in this functionality implementation, if they add arbitrary volume support in the code working with the `POST /v1/orders` handler, but forget to make corresponding changes in the order readiness check code;
* the same field (coffee volume) now means different things in different interfaces. In `GET /v1/recipes` context `volume` field means ‘a volume to be prepared if no arbitrary volume is specified in `POST /v1/orders` request’; and it cannot be renamed to ‘default volume’ easily, we now have to live with that.
For those orders with an arbitrary volume requested, a developer will need to obtain the requested volume not from the `GET /v1/recipes` endpoint, but the `GET /v1/orders` one. Doing so we're getting a whole bunch of related problems:
* there is a significant chance that developers will make mistakes in this functionality implementation if they add arbitrary volume support in the code working with the `POST /v1/orders` handler, but forget to make corresponding changes in the order readiness check code;
* the same field (coffee volume) now means different things in different interfaces. In the `GET /v1/recipes` context the `volume` field means ‘a volume to be prepared if no arbitrary volume is specified in the `POST /v1/orders` request’; and it cannot be renamed to ‘default volume’ easily, we now have to live with that.
**In third**, the entire scheme becomes totally inoperable if different types of coffee machines produce different volumes of lungo. To introduce ‘lungo volume depends on machine type’ constraint we have to do quite a nasty thing: make recipes depend on coffee machine id. By doing so we start actively ‘stir’ abstraction levels: one part of our API (recipe endpoints) becomes unusable without explicit knowledge of another part (coffee machines listing). And what is even worse, developers will have to change the logic of their apps: previously it was possible to choose volume first, then a coffee machine; but now this step must be rebuilt from scratch.
**Third**, the entire scheme becomes totally inoperable if different types of coffee machines produce different volumes of lungo. To introduce the ‘lungo volume depends on machine type’ constraint we have to do quite a nasty thing: make recipes depend on coffee machine ids. By doing so we start actively ‘stirring’ abstraction levels: one part of our API (recipe endpoints) becomes unusable without explicit knowledge of another part (coffee machines listing). And what is even worse, developers will have to change the logic of their apps: previously it was possible to choose volume first, then a coffee machine; but now this step must be rebuilt from scratch.
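To make the Variant II ambiguity more tangible, here is a rough Python sketch of what client code ends up doing once an order may override the recipe volume. Everything in it is an assumption made for illustration: the in-memory dictionaries stand in for the `GET /v1/recipes` and `GET /v1/orders` responses, and the field layout, identifiers, and millilitre values are invented.

```python
# Hypothetical stand-ins for the data returned by `GET /v1/recipes`
# and `GET /v1/orders`; volumes are assumed to be in millilitres.
RECIPES = {"lungo": {"volume": 110}}
ORDERS = {
    "order-1": {"recipe": "lungo"},                 # default volume
    "order-2": {"recipe": "lungo", "volume": 200},  # arbitrary volume
}


def effective_volume(order_id: str) -> int:
    """Return the volume that will actually be prepared for an order.

    The recipe `volume` is only a default; an order-level `volume`,
    when present, silently takes precedence over it.
    """
    order = ORDERS[order_id]
    recipe = RECIPES[order["recipe"]]
    return order.get("volume", recipe["volume"])
```

Any piece of client code that keeps reading `RECIPES[...]["volume"]` directly (for example, a readiness check) would still work for `order-1` and quietly produce a wrong result for `order-2`, which is exactly the kind of mistake described above.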
Okay, we understood how to make things bad. But how to make them *nice*?
Okay, we understood how to make things naughty. But how to make them *nice*?
Abstraction levels separation should go alongside three directions:
Abstraction levels separation should go in three directions:
1. From user scenarios to their internal representation: high-level entities and their method nomenclature must directly reflect API usage scenarios; low-level entities reflect the decomposition of scenarios into smaller parts.
@ -86,7 +86,7 @@ The more is the distance between programmable contexts our API connects, the dee
In our example with coffee readiness detection we clearly face the situation when we need an interim abstraction level:
* from one side, an ‘order’ should not store the data regarding coffee machine sensors;
* from the other side, a coffee machine should not store the data regarding order properties (and its API probably doesn't provide such functionality).
* on the other side, a coffee machine should not store the data regarding order properties (and its API probably doesn't provide such functionality).
A naïve approach to this situation is to design an interim abstraction level as a ‘connecting link’, which reformulates tasks from one abstraction level to another. For example, introduce a `task` entity like that:
@ -112,7 +112,7 @@ We call this approach ‘naïve’ not because it's wrong; on the contrary, that
An experienced developer in this case must ask: what options do exist? How we really should determine beverage readiness? If it turns out that comparing volumes *is* the only working method to tell whether the beverage is ready, then all the speculations above are wrong. You may safely include readiness-by-volume detection into your interfaces since no other methods exist. Before abstracting something we need to learn what exactly we're abstracting.
In our example let's assume that we have studied coffee machines API specs, and learned that two device types exist:
In our example let's assume that we have studied coffee machines' API specs, and learned that two device types exist:
* coffee machines capable of executing programs coded in the firmware; the only customizable options are some beverage parameters, like desired volume, a syrup flavor, and a kind of milk;
* coffee machines with built-in functions, like ‘grind specified coffee volume’, ‘shed the specified amount of water’, etc.; such coffee machines lack ‘preparation programs’, but provide access to commands and sensors.
@ -158,7 +158,7 @@ To be more specific, let's assume those two kinds of coffee machines provide the
GET /execution/status
```
**NB**. Just in case: this API violates a number of design principles, starting with a lack of versioning; it's described in such a manner because of two reasons: (1) to demonstrate how to design a more convenient API, (b) in the real life, you really got something like that from vendors, and this API is quite sane, actually.
**NB**. Just in case: this API violates a number of design principles, starting with a lack of versioning; it's described in such a manner because of two reasons: (1) to demonstrate how to design a more convenient API, (2) in the real life, you would really get something like that from vendors, and this API is quite a sane one, actually.
* Coffee machines with built-in functions:
```
@ -217,16 +217,16 @@ Now the picture becomes more apparent: we need to abstract coffee machine API ca
The next step in abstraction level separating is determining what functionality we're abstracting. To do so we need to understand the tasks developers solve at the ‘order’ level, and to learn what problems they get if our interim level is missing.
1. Obviously the developers desire to create an order uniformly: list high-level order properties (beverage kind, volume, and special options like syrup or milk type), and don't think about how the specific coffee machine executes it.
1. Obviously, the developers desire to create an order uniformly: list high-level order properties (beverage kind, volume, and special options like syrup or milk type), and don't think about how the specific coffee machine executes it.
2. Developers must be able to learn the execution state: is the order ready? if not — when to expect it's ready (and is there any sense to wait in case of execution errors).
3. Developers need to address the order's location in space and time — to explain to users where and when they should pick the order up.
4. Finally, developers need to run atomic operations, like canceling orders.
Note, that the first-kind API is much closer to developers' needs than the second-kind API. Indivisible ‘program’ is a way more convenient concept than working with raw commands and sensor data. There are only two problems we see in the first-kind API:
Note that the first-kind API is much closer to developers' needs than the second-kind API. An indivisible ‘program’ is a far more convenient concept than working with raw commands and sensor data. There are only two problems we see in the first-kind API:
* absence of explicit ‘programs’ to ‘recipes’ relation; program identifier is of no use to developers since there is a ‘recipe’ concept;
* absence of explicit ‘ready’ status.
But with the second-kind API it's much worse. The main problem we foresee is an absence of ‘memory’ for actions being executed. Functions and sensors API is totally stateless, which means we don't even understand who called a function being currently executed, when, and which order it is related to.
But with the second-kind API, it's much worse. The main problem we foresee is an absence of ‘memory’ for actions being executed. The functions and sensors API is totally stateless, which means we don't even understand who called a function being currently executed, when, and which order it relates to.
So we need to introduce two abstraction levels.
@ -235,7 +235,7 @@ So we need to introduce two abstraction levels.
* statuses and other high-level execution parameters nomenclature (for example, estimated preparation time or possible execution error) being the same;
* methods nomenclature (for example, order cancellation method) and their behavior being the same.
2. Program runtime level. For the first-kind API it will provide just a wrapper for existing programs API; for the second-kind API the entire ‘runtime’ concept is to be developed from scratch by us.
2. Program runtime level. For the first-kind API, it will provide just a wrapper for existing programs API; for the second-kind API, the entire ‘runtime’ concept is to be developed from scratch by us.
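As a rough illustration of the ‘memory’ that a runtime level has to add on top of the stateless second-kind API, consider the following Python sketch. It is not the book's design: the `Runtime` and `RuntimeRegistry` names, the status values, and the command names are all hypothetical, and the actual calls to the coffee machine are left out.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Runtime:
    """Remembers which order a command sequence belongs to and how far it got."""
    id: str
    order_id: str
    commands: list
    executed: list = field(default_factory=list)
    status: str = "pending"


class RuntimeRegistry:
    def __init__(self):
        self._ids = itertools.count(1)
        self._runtimes = {}

    def create(self, order_id, commands):
        runtime = Runtime(
            id=f"runtime-{next(self._ids)}",
            order_id=order_id,
            commands=list(commands),
        )
        self._runtimes[runtime.id] = runtime
        return runtime

    def step(self, runtime_id):
        """Record the next command as executed (the machine call is omitted)."""
        runtime = self._runtimes[runtime_id]
        if len(runtime.executed) < len(runtime.commands):
            runtime.executed.append(runtime.commands[len(runtime.executed)])
        done = len(runtime.executed) == len(runtime.commands)
        runtime.status = "ready" if done else "executing"
        return runtime
```

With such a record in place, the questions the raw functions-and-sensors API cannot answer (which order a running function belongs to, and what has already been done) become simple lookups by runtime identifier.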
What does this mean in a practical sense? Developers will still be creating orders, dealing with high-level entities only:
@ -278,7 +278,7 @@ Please note that knowing the coffee machine API kind isn't required at all; that
* `POST /v1/program-matcher/{api_type}`
* `POST /v1/programs/{api_type}/{program_id}/run`
This approach has some benefits, like the possibility to provide different sets of parameters, specific to the API kind. But we see no need in such fragmentation. `run` method handler is capable of extracting all the program metadata and performing one of two actions:
This approach has some benefits, like the possibility to provide different sets of parameters, specific to the API kind. But we see no need for such fragmentation. The `run` method handler is capable of extracting all the program metadata and performing one of two actions:
* call `POST /execute` physical API method, passing internal program identifier — for the first API kind;
* initiate runtime creation to proceed with the second API kind.
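The two branches above can be sketched as a single dispatcher. This is only an illustration under stated assumptions: `runProgram`, the `apiType` values, `internalProgramId`, and the `physicalApi` / `runtimes` collaborators are hypothetical names, not part of the API described here.

```javascript
// A sketch only: every name below is an illustrative assumption.
function runProgram(program, { physicalApi, runtimes }) {
  switch (program.apiType) {
    case 'first-kind':
      // Built-in programs exist: pass the internal program identifier
      // to the physical `POST /execute` method.
      return physicalApi.execute(program.internalProgramId);
    case 'second-kind':
      // No built-in programs: create a runtime to drive the commands.
      return runtimes.create(program.commands);
    default:
      throw new Error(`Unknown API kind: ${program.apiType}`);
  }
}
```

The caller never passes the API kind in the URL; the handler derives it from the program metadata, which is why the fragmented endpoints are unnecessary.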
@ -330,9 +330,9 @@ And the `state` like that:
}
```
**NB**: while implementing `orders` → `match` → `run` → `runtimes` call sequence we have two options:
**NB**: while implementing the `orders` → `match` → `run` → `runtimes` call sequence we have two options:
* either `POST /orders` handler requests the data regarding the recipe, the coffee machine model, and the program on its own behalf, and forms a stateless request which contains all the necessary data (the API kind, command sequence, etc.);
* or the request contains only data identifiers, and next in chain handler will request pieces of data it needs via some internal APIs.
* or the request contains only data identifiers, and the next handler in the chain will request pieces of data it needs via some internal APIs.
Both variants are plausible; the choice between them depends on implementation details.
@ -341,21 +341,21 @@ Both variants are plausible, selecting one of them depends on implementation det
A crucial quality of properly separated abstraction levels (and therefore a requirement to their design) is a level isolation restriction: **only adjacent levels may interact**. If ‘jumping over’ is needed in the API design, then clearly mistakes were made.
Let's get back to our example. How would retrieving the order status work? To obtain the status, the following call chain is performed:
* user initiates a call to the `GET /v1/orders` method;
* a user initiates a call to the `GET /v1/orders` method;
* the `orders` handler completes operations on its level of responsibility (for example, checks user authorization), finds `program_run_id` identifier and performs a call to the `runs/{program_run_id}` endpoint;
* the `runs` endpoint in its turn completes operations corresponding to its level (for example, checks the coffee machine API kind) and, depending on the API kind, proceeds with one of two possible execution branches:
* either calls the `GET /execution/status` method of a physical coffee machine API, gets the coffee volume, and compares it to the reference value;
* or invokes the `GET /v1/runtimes/{runtime_id}` method to obtain the `state.status` and converts it to the order status;
* in a case of the second-kind API, the call chain continues: the `GET /runtimes` handler invokes the `GET /sensors` method of a physical coffee machine API and performs some manipulations with the data, like comparing the cup / ground coffee / shed water volumes with the reference ones, and changing the state and the status if needed.
* in the case of the second-kind API, the call chain continues: the `GET /runtimes` handler invokes the `GET /sensors` method of a physical coffee machine API and performs some manipulations with the data, like comparing the cup / ground coffee / shed water volumes with the reference ones, and changing the state and the status if needed.
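The chain above could be sketched as three functions, one per level, each calling only its neighbor. This is a hedged illustration: the function names, the status strings (`ready`, `executing`, `ready_for_pickup`, `in_progress`), and the `machine` stub interface are assumptions made for the example.

```javascript
// A sketch only: names and status values are illustrative assumptions.

// Runtime level: derives its status from raw sensor data.
function getRuntimeStatus(sensors, referenceVolume) {
  return sensors.volume >= referenceVolume ? 'ready' : 'executing';
}

// Execution level: hides the coffee machine API kind from the level above.
function getRunStatus(run, machine) {
  if (run.apiType === 'first-kind') {
    // Stands in for `GET /execution/status` of the physical API.
    const { volume } = machine.getExecutionStatus();
    return volume >= run.referenceVolume ? 'ready' : 'executing';
  }
  // Stands in for `GET /v1/runtimes/{runtime_id}`, which reads the sensors.
  return getRuntimeStatus(machine.getSensors(), run.referenceVolume);
}

// Order level: converts the run status into order terms.
function getOrderStatus(order, machine) {
  return getRunStatus(order.run, machine) === 'ready'
    ? 'ready_for_pickup'
    : 'in_progress';
}
```

Note that `getOrderStatus` never touches sensor data directly; it only knows the status vocabulary of the level right below it.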
**NB**: The ‘call chain’ wording shouldn't be treated literally. Each abstraction level might be organized differently in a technical sense:
* there might be explicit proxying of calls down the hierarchy;
* there might be a cache at each level, being updated upon receiving a callback call or an event. In particular, a low-level runtime execution cycle obviously must be independent of upper levels, renew its state in the background, and not wait for an explicit call.
Note what happens here: each abstraction level wields its own status (e.g. order, runtime, sensors status), being formulated in corresponding to this level subject area terms. Forbidding the ‘jumping over’ results in the necessity to spawn statuses at each level independently.
Note what happens here: each abstraction level wields its own status (e.g. order, runtime, sensors status), being formulated in subject area terms corresponding to this level. Forbidding the ‘jumping over’ results in the necessity to spawn statuses at each level independently.
Let's now look at how the order cancel operation flows through our abstraction levels. In this case, the call chain will look like that:
* user initiates a call to the `POST /v1/orders/{id}/cancel` method;
* a user initiates a call to the `POST /v1/orders/{id}/cancel` method;
* the method handler completes operations on its level of responsibility:
* checks the authorization;
* solves money issues, i.e. whether a refund is needed;
@ -363,20 +363,20 @@ Let's now look at how the order cancel operation flows through our abstraction l
* the `runs/cancel` handler completes operations on its level of responsibility and, depending on the coffee machine API kind, proceeds with one of two possible execution branches:
* either calls the `POST /execution/cancel` method of a physical coffee machine API;
* or invokes the `POST /v1/runtimes/{id}/terminate` method;
* in a second case the call chain continues, the `terminate` handler operates its internal state:
* in the second case, the call chain continues as the `terminate` handler updates its internal state:
* changes the `resolution` to `"terminated"`;
* runs the `"discard_cup"` command.
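The cancellation flow above might be sketched as follows. This is an assumption-laden illustration: the function names, the `paid` flag, and the `log` array (which stands in for real side effects such as HTTP calls and refunds) are all hypothetical.

```javascript
// A sketch only: names are illustrative, and `log` records what would be
// real side effects in an actual implementation.

// Runtime level: there is no 'cancel' here, just another command to run.
function terminateRuntime(runtime, log) {
  runtime.resolution = 'terminated';
  log.push('command:discard_cup');
}

// Execution level: picks the branch depending on the API kind.
function cancelRun(run, log) {
  if (run.apiType === 'first-kind') {
    log.push('POST /execution/cancel');
  } else {
    terminateRuntime(run.runtime, log);
  }
}

// Order level: 'cancel' splits into a money part and an execution part.
function cancelOrder(order, log) {
  if (order.paid) {
    log.push('refund');
  }
  cancelRun(order.run, log);
  order.status = 'canceled';
}
```

Each level restates 'cancel' in its own terms, and none of them reaches past its immediate neighbor.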
Handling state-modifying operations like `cancel` requires more advanced abstraction levels juggling skills compared to non-modifying calls like `GET /status`. There are two important moments:
Handling state-modifying operations like `cancel` requires more skill in juggling abstraction levels than handling non-modifying calls like `GET /status`. Two points are important:
1. At each abstraction level the idea of ‘order canceling’ is reformulated:
* at the `orders` level this action in fact splits into several ‘cancels’ of other levels: you need to cancel money holding and to cancel an order execution;
* at the second API kind physical level a ‘cancel’ operation itself doesn't exist: ‘cancel’ means ‘executing the `discard_cup` command’, which is quite the same as any other command.
* at the second API kind physical level the ‘cancel’ operation itself doesn't exist: ‘cancel’ means ‘executing the `discard_cup` command’, which is quite the same as any other command.
The interim API level is needed to make this transition between different level ‘cancels’ smooth and rational without jumping over canyons.
2. From a high-level point of view, canceling an order is a terminal action, since no further operations are possible. From a low-level point of view, the processing continues until the cup is discarded, and then the machine is to be unlocked (i.e. new runtimes creation allowed). It is the task of the execution control level to couple those two states, outer (the order is canceled) and inner (the execution continues).
It might look that forcing the abstraction levels isolation is redundant and makes interfaces more complicated. In fact, it is: it's very important to understand that flexibility, consistency, readability, and extensibility come with a price. One may construct an API with zero overhead, essentially just provide access to the coffee machine's microcontrollers. However using such an API would be a disaster to a developer, not to mention the inability to extend it.
It might seem that enforcing the isolation of abstraction levels is redundant and makes interfaces more complicated. In fact, it is: it's very important to understand that flexibility, consistency, readability, and extensibility come with a price. One may construct an API with zero overhead, essentially just providing access to the coffee machine's microcontrollers. However, using such an API would be a disaster for a developer, not to mention the inability to extend it.
Separating abstraction levels is first of all a logical procedure: how we explain to ourselves and developers what our API consists of. **The abstraction gap between entities exists objectively**, no matter what interfaces we design. Our task is just to separate this gap into levels *explicitly*. The more implicitly abstraction levels are separated (or, worse, blended into each other), the steeper your API's learning curve is, and the worse the code that uses it is.
@ -399,11 +399,11 @@ Each API abstraction level, therefore corresponds to some data flow generalizati
We may also traverse the tree backward.
1. At the order level we set its logical parameters: recipe, volume, execution place and possible statuses set.
1. At the order level, we set its logical parameters: the recipe, the volume, the execution place, and the set of possible statuses.
2. At the execution level we read the order level data and create a lower level execution context: the program as a sequence of steps, their parameters, transition rules, and initial state.
2. At the execution level, we read the order level data and create a lower level execution context: the program as a sequence of steps, their parameters, transition rules, and initial state.
3. At the runtime level we read the target parameters (which operation to execute, what the target volume is) and translate them into coffee machine API microcommands and statuses for each command.
3. At the runtime level, we read the target parameters (which operation to execute, what the target volume is) and translate them into coffee machine API microcommands and statuses for each command.
Also, if we take a deeper look at the ‘bad’ decision (forcing developers to determine the actual order status on their own) discussed at the beginning of this chapter, we can notice a data flow collision there:
* on the one hand, ‘leaked’ physical data (the prepared beverage volume) is injected into the order context, thereby irreversibly mixing abstraction levels;


@ -1,27 +1,27 @@
### Isolating Responsibility Areas
Based on the previous chapter, we understand that the abstraction hierarchy in our hypothetical project would look like that:
* the user level (those entities users directly interact with and which are formulated in terms, understandable by user: orders, coffee recipes);
* the user level (those entities users directly interact with and which are formulated in terms understandable to users: orders, coffee recipes);
* the program execution control level (the entities responsible for transforming orders into machine commands);
* the runtime level for the second API kind (the entities describing the command execution state machine).
We are now to define each entity's responsibility area: what's the reasoning in keeping this entity within our API boundaries; what operations are applicable to the entity directly (and which are delegated to other objects). In fact, we are to apply the ‘why’-principle to every single API entity.
To do so we must iterate all over the API and formulate in subject area terms what every object is. Let us remind that the abstraction levels concept implies that each level is a some interim subject area per se; a step we take in the journey from describing a task in the first connected context terms (‘a lungo ordered by a user’) to the second connect context terms (‘a command performed by a coffee machine’).
To do so we must iterate over the whole API and formulate in subject area terms what every object is. Let us recall that the abstraction levels concept implies that each level is some interim subject area per se: a step we take in the journey from describing a task in terms of the first connected context (‘a lungo ordered by a user’) to the terms of the second connected context (‘a command performed by a coffee machine’).
As for our fictional example, it would look like that:
As for our fictional example, it would look as follows.
1. User-level entities.
* An `order` describes some logical unit in app-user interaction. An `order` might be:
* created;
* checked for its status;
* retrieved;
* canceled;
* A `recipe` describes an ‘ideal model’ of some coffee beverage type, its customer properties. A `recipe` is an immutable entity for us, which means we could only read it.
* A `coffee-machine` is a model of a real-world device. We must be able to retrieve the coffee machine's geographical location and the options it supports from this model (will be discussed below).
* A `recipe` describes an ‘ideal model’ of some coffee beverage type, e.g. its customer properties. A `recipe` is an immutable entity for us, which means we could only read it.
* A `coffee-machine` is a model of a real-world device. We must be able to retrieve the coffee machine's geographical location and the options it supports from this model (which will be discussed below).
2. Program execution control level entities.
* A `program` describes some general execution plan for a coffee machine. Programs could only be read.
* The program matcher `programs/matcher` is capable of coupling a `recipe` and a `program`, which in fact means ‘to retrieve a dataset needed to prepare a specific recipe on a specific coffee machine’.
* A program execution `programs/run` describes a single fact of running a program on a coffee machine. `run` might be:
* The `programs/matcher` entity is capable of coupling a `recipe` and a `program`, which in fact means ‘to retrieve a dataset needed to prepare a specific recipe on a specific coffee machine’.
* A `programs/run` entity describes a single fact of running a program on a coffee machine. A `run` might be:
* initialized (created);
* checked for its status;
* canceled.
@ -37,7 +37,7 @@ If we look closely at the entities, we may notice that each entity turns out to
At this point, when our API is in general clearly outlined and drafted, we must put ourselves into the developer's shoes and try writing code. Our task is to look at the entity nomenclature and make some estimates regarding their future usage.
So, let us imagine we've got a task to write an app for ordering a coffee, based upon our API. What code would we write?
So, let us imagine we've got a task to write an app for ordering a coffee, based on our API. What code would we write?
Obviously, the first step is offering a choice to a user, to make them point out what they want. And this very first step reveals that our API is quite inconvenient. There are no methods allowing for choosing something. A developer has to implement these steps:
* retrieve all possible recipes from the `GET /v1/recipes` endpoint;
@ -64,7 +64,7 @@ app.display(coffeeMachines);
As you see, developers are to write a lot of redundant code (to say nothing about the difficulties of implementing spatial indexes). Besides, if we take into consideration our Napoleonic plans to cover all coffee machines in the world with our API, then we need to admit that this algorithm is just a waste of resources on retrieving lists and indexing them.
The necessity of adding a new endpoint for searching becomes obvious. To design such an interface we must imagine ourselves being UX designers, and think about how an app could try to arouse users' interest. Two scenarios are evident:
* display all cafes in the vicinity and types of coffee they offer (a ‘service discovery’ scenario) — for new users or just users with no specific tastes;
* display all cafes in the vicinity and the types of coffee they offer (a ‘service discovery’ scenario) — for new users or just users with no specific tastes;
* display nearby cafes where a user could order a particular type of coffee — for users seeking a certain beverage type.
Then our new interface would look like this:
@ -92,7 +92,7 @@ Here:
* an `offer` — is a marketing bid: on what conditions a user could have the requested coffee beverage (if specified in the request), or some kind of a marketing offer — prices for the most popular or interesting products (if no specific preference was set);
* a `place` — is a spot (café, restaurant, street vending machine) where the coffee machine is located; we never introduced this entity before, but it's quite obvious that users need more convenient guidance to find a proper coffee machine than just geographical coordinates.
**NB**. We could have been enriched the existing `/coffee-machines` endpoint instead of adding a new one. This decision, however, looks less semantically viable: coupling in one interface different modes of listing entities, by relevance and by order, is usually a bad idea because these two types of rankings imply different usage features and scenarios. Furthermore, enriching the search with ‘offers’ pulls this functionality out of `coffee-machines` namespace: the fact of getting offers to prepare specific beverages in specific conditions is a key feature to users, with specifying the coffee-machine being just a part of an offer.
**NB**. We could have enriched the existing `/coffee-machines` endpoint instead of adding a new one. This decision, however, looks less semantically viable: coupling in one interface different modes of listing entities, by relevance and by order, is usually a bad idea because these two types of rankings imply different usage features and scenarios. Furthermore, enriching the search with ‘offers’ pulls this functionality out of the `coffee-machines` namespace: the fact of getting offers to prepare specific beverages in specific conditions is a key feature to users, with specifying the coffee machine being just a part of an offer.
Coming back to the code developers are writing, it would now look like that:
```
@ -105,14 +105,14 @@ app.display(offers);
#### Helpers
Methods similar to newly invented `offers/search` are called *helpers*. The purpose they exist is to generalize known API usage scenarios and facilitate implementing them. By ‘facilitating’ we mean not only reducing wordiness (getting rid of ‘boilerplates’) but also helping developers to avoid common problems and mistakes.
Methods similar to the newly invented `offers/search` one are called *helpers*. They exist to generalize known API usage scenarios and facilitate their implementation. By ‘facilitating’ we mean not only reducing wordiness (getting rid of ‘boilerplate’) but also helping developers to avoid common problems and mistakes.
For instance, let's consider the order price question. Our search function returns some ‘offers’ with prices. But ‘price’ is volatile; coffee could cost less during ‘happy hours’, for example. Developers could make mistakes in three ways while implementing this functionality:
* cache search results on a client device for too long (as a result, the price will always be out of date);
* conversely, call the search method excessively just to refresh prices, thus overloading the network and the API servers;
* create an order with an invalid price (therefore deceiving a user, displaying one sum, and debiting another).
To solve the third problem we could demand including the displayed price in the order creation request, and return an error if it differs from the actual one. (In fact, any API working with money *shall* do so.) But it isn't helping with the first two problems and makes the user experience degrade. Displaying actual price is always a much more convenient behavior than displaying errors upon pressing the ‘place an order’ button.
To solve the third problem we could require the displayed price to be included in the order creation request, and return an error if it differs from the actual one. (In fact, any API working with money *shall* do so.) But this doesn't help with the first two problems and degrades the user experience. Displaying the actual price is always a much more convenient behavior than displaying errors upon pressing the ‘place an order’ button.
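The displayed price check could be sketched as below. This is a minimal illustration, not a definitive implementation: `createOrder`, the `displayedPrice` request field, the `price_changed` reason, and the price shape are assumptions, and prices are kept as decimal strings here to avoid floating-point comparison issues.

```javascript
// A sketch only: the request shape and the `price_changed` reason
// are assumptions made for this example.
function createOrder(request, actualPrice) {
  const shown = request.displayedPrice;
  const matches =
    shown.currencyCode === actualPrice.currencyCode &&
    shown.value === actualPrice.value;
  if (!matches) {
    // Resolvable problem: the client may obtain the new price and retry.
    return { status: 409, reason: 'price_changed', actualPrice };
  }
  return { status: 201 };
}
```

With this check in place a user is never debited a sum other than the one displayed, although the client still has to handle the resulting error.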
One solution is to provide a special identifier to an offer. This identifier must be specified in an order creation request.
```
@ -132,7 +132,7 @@ One solution is to provide a special identifier to an offer. This identifier mus
"cursor"
}
```
By doing so we're not only helping developers to grasp a concept of getting the relevant price, but also solving a UX task of telling users about ‘happy hours’.
By doing so we're not only helping developers to grasp the concept of getting the relevant price, but also solving a UX task of telling users about ‘happy hours’.
As an alternative, we could split endpoints: one for searching, another one for obtaining offers. This second endpoint would only be needed to actualize prices in the specified places.
@ -156,11 +156,11 @@ The main rule of error interfaces in the APIs is: an error response must help a
An error response content must address the following questions:
1. Which party is the problem's source: client or server?
HTTP APIs traditionally employ `4xx` status codes to indicate client problems, `5xx` to indicate server problems (with the exception of a `404`, which is an uncertainty status).
HTTP APIs traditionally employ the `4xx` status codes to indicate client problems, `5xx` to indicate server problems (with the exception of the `404` code, which is an uncertainty status).
2. If the error is caused by a server, does it make sense to repeat the request? If yes, then when?
3. If the error is caused by a client, is it resolvable, or not?
The invalid price error is resolvable: a client could obtain a new price offer and create a new order with it. But if the error occurred because of a mistake in the client code, then eliminating the cause is impossible, and there is no need to make the user push the ‘place an order’ button again: this request will never succeed.
**NB**: here and throughout we indicate resolvable problems with `409 Conflict` code, and unresolvable ones with `400 Bad Request`.
**NB**: here and throughout we indicate resolvable problems with the `409 Conflict` code, and unresolvable ones with the `400 Bad Request` code.
4. If the error is resolvable, then what's the kind of problem? Obviously, a client couldn't resolve a problem it's unaware of. For every resolvable problem, some *code* must be written (reobtaining the offer in our case), so a list of error descriptions must exist.
5. If the same kind of error arises because of different parameters being invalid, then which parameter value is wrong exactly?
6. Finally, if some parameter value is unacceptable, then what values are acceptable?
@ -183,15 +183,15 @@ In our case, the price mismatch error should look like this:
}
```
After getting this error, a client is to check the error's kind (‘some problem with offer’), check the specific error reason (‘order lifetime expired’), and send an offer retrieving request again. If `checks_failed` field indicated another error reason (for example, the offer isn't bound to the specified user), client actions would be different (re-authorize the user, then get a new offer). If there were no error handler for this specific reason, a client would show `localized_message` to the user, and invoke standard error recovery procedure.
After getting this error, a client is to check the error's kind (‘some problem with offer’), check the specific error reason (‘offer lifetime expired’), and request a new offer. If the `checks_failed` field indicated another error reason (for example, the offer isn't bound to the specified user), client actions would be different (re-authorize the user, then get a new offer). If there were no error handlers for this specific reason, a client would show the `localized_message` to the user, and invoke the standard error recovery procedure.
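The client-side logic described above could look roughly like this. It is a sketch under stated assumptions: the `reason` and `details` field names, the `offer_invalid`, `offer_lifetime`, and `offer_user` values, and the returned action names are hypothetical; only `checks_failed` and `localized_message` come from the text.

```javascript
// A sketch only: the error shape and the action names are assumptions.
function chooseRecovery(error) {
  const checks = (error.details && error.details.checks_failed) || [];
  if (error.reason === 'offer_invalid') {
    if (checks.includes('offer_lifetime')) {
      // The offer expired: get a fresh one and let the user confirm it.
      return { action: 'refresh_offer' };
    }
    if (checks.includes('offer_user')) {
      // The offer isn't bound to this user: re-authorize first.
      return { action: 'reauthorize_then_refresh_offer' };
    }
  }
  // No specific handler: show the message and fall back to the
  // standard error recovery procedure.
  return { action: 'show_message', message: error.localized_message };
}
```

The fallback branch is what makes the `localized_message` field worth returning even for errors the client does not recognize.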
It is also worth mentioning that unresolvable errors are useless to a user at the time (since the client couldn't react usefully to unknown errors), but it doesn't mean that providing extended error data is excessive. A developer will read it when fixing the error in the code. Also, check paragraphs 12&13 in the next chapter.
It is also worth mentioning that unresolvable errors are useless to a user at the time (since the client couldn't react usefully to unknown errors), but it doesn't mean that providing extended error data is excessive. A developer will read it when fixing the error in the code. Also, check paragraphs 12 and 13 in the next chapter.
#### Decomposing Interfaces. The ‘7±2’ Rule
Out of our own API development experience, we can tell without any doubt that the greatest final interfaces design mistake (and the greatest developers' pain accordingly) is excessive overloading of entities' interfaces with fields, methods, events, parameters, and other attributes.
Out of our own API development experience, we can tell without any doubt that the greatest final interface design mistake (and the greatest developers' pain accordingly) is excessive overloading of entities' interfaces with fields, methods, events, parameters, and other attributes.
Meanwhile, there is the ‘Golden Rule’ of interface design (applicable not only to APIs but almost to anything): humans could comfortably keep 7±2 entities in short-term memory. Manipulating a larger number of chunks complicates things for most humans. The rule is also known as [‘Miller's law’](https://en.wikipedia.org/wiki/Working_memory#Capacity).
Meanwhile, there is the ‘Golden Rule’ of interface design (applicable not only to APIs but almost to anything): humans can comfortably keep 7±2 entities in short-term memory. Manipulating a larger number of chunks complicates things for most humans. The rule is also known as [‘Miller's law’](https://en.wikipedia.org/wiki/Working_memory#Capacity).
The only possible method of overcoming this law is decomposition. Entities should be grouped under a single designation at every concept level of the API, so that developers never have to operate with more than 10 entities at a time.
@ -272,7 +272,7 @@ Let's try to group it together:
}
```
Such decomposed API is much easier to read than a long sheet of different attributes. Furthermore, it's probably better to group even more entities in advance. For example, `place` and `route` could be joined in a single `location` structure, or `offer` and `pricing` might be combined into some generalized object.
Such a decomposed API is much easier to read than a long sheet of different attributes. Furthermore, it's probably better to group even more entities in advance. For example, a `place` and a `route` could be joined in a single `location` structure, or an `offer` and a `pricing` might be combined into some generalized object.
It is important to say that readability is achieved not merely by grouping the entities. Decomposing must be performed in such a manner that a developer, while reading the interface, instantly understands: ‘here is the place description of no interest to me right now, no need to traverse deeper’. If the data fields needed to complete some action are scattered all over different composites, the readability doesn't improve but degrades.


@ -2,13 +2,13 @@
When all entities, their responsibilities, and relations to each other are defined, we proceed to the development of the API itself. We are to describe the objects, fields, methods, and functions nomenclature in detail. In this chapter, we're giving purely practical advice on making APIs usable and understandable.
Important assertion at number 0:
An important assertion at number 0:
##### 0. Rules are just generalizations
Rules are not to be applied unconditionally. They are not making thinking redundant. Every rule has a rational reason to exist. If your situation doesn't justify following the rule — then you shouldn't do it.
For example, demanding a specification being consistent exists to help developers spare time on reading docs. If you *need* developers to read some entity's doc, it is totally rational to make its signature deliberately inconsistent.
For example, the demand that a specification be consistent exists to help developers save time on reading docs. If you *need* developers to read some entity's doc, it is totally rational to make its signature deliberately inconsistent.
This idea applies to every concept listed below. If you get an unusable, bulky, unobvious API because you follow the rules, it's a motive to revise the rules (or the API).
@ -51,7 +51,7 @@ POST /v1/orders/statistics/aggregate
Two important implications:
**1.1.** If the operation is modifying, it must be obvious from the signature. In particular, there might be no modifying operations using `GET` verb.
**1.1.** If the operation is modifying, it must be obvious from the signature. In particular, there must be no modifying operations using the `GET` verb.
**1.2.** If your API's nomenclature contains both synchronous and asynchronous operations, then (a)synchronicity must be apparent from signatures, **or** a naming convention must exist.
@ -74,9 +74,9 @@ So *always* specify exactly which standard is applied. Exceptions are possible i
or
`"duration": {"unit": "ms", "value": 5000}`.
One particular implication from this rule is that money sums must *always* be accompanied by a currency code.
One particular implication of this rule is that money sums must *always* be accompanied by a currency code.
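A consumer of such explicit values might be sketched like this. The helper names, the set of supported units, and the money object shape (`value`, `currencyCode`) are assumptions made for illustration; only the `{"unit": "ms", "value": 5000}` duration form comes from the text.

```javascript
// A sketch only: helper names, supported units, and the money shape
// are illustrative assumptions.
const MS_PER_UNIT = new Map([
  ['ms', 1],
  ['s', 1000],
  ['min', 60000],
]);

// Converts an explicit {unit, value} duration to milliseconds.
function toMilliseconds(duration) {
  if (!MS_PER_UNIT.has(duration.unit)) {
    throw new Error(`Unknown duration unit: ${duration.unit}`);
  }
  return duration.value * MS_PER_UNIT.get(duration.unit);
}

// Refuses to format a money sum that lacks a currency code.
function formatMoney(amount) {
  if (!amount.currencyCode) {
    throw new Error('A money sum must carry a currency code');
  }
  return `${amount.value} ${amount.currencyCode}`;
}
```

Because the unit travels with the value, the consumer never has to guess whether `5000` means milliseconds or seconds.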
It is also worth saying that in some areas the situation with standards is so spoiled that, whatever you do, someone got upset. A ‘classical’ example is geographical coordinates order (latitude-longitude vs longitude-latitude). Alas, the only working method of fighting with frustration there is the ‘Serenity Notepad’ to be discussed in Section II.
It is also worth saying that in some areas the situation with standards is so spoiled that, whatever you do, someone gets upset. A ‘classical’ example is geographical coordinates order (latitude-longitude vs longitude-latitude). Alas, the only working method of fighting frustration there is the ‘Serenity Notepad’ to be discussed in Section II.
##### Keep fractional numbers precision intact
@ -108,14 +108,14 @@ In the 21st century, there's no need to shorten entities' names.
strpbrk (str1, str2)
```
Possibly, an author of this API thought that `pbrk` abbreviature would mean something to readers; clearly mistaken. Also, it's hard to tell from the signature which string (`str1` or `str2`) stands for a character set.
Possibly, the author of this API thought that the `pbrk` abbreviation would mean something to readers; if so, they were clearly mistaken. Also, it's hard to tell from the signature which string (`str1` or `str2`) stands for a character set.
**Better**: `str_search_for_characters (lookup_character_set, str)`
— though it's highly disputable whether this function should exist at all; a feature-rich search function would be much more convenient. Also, shortening `string` to `str` bears no practical sense, regretfully being a routine in many subject areas.
— though it's highly disputable whether this function should exist at all; a feature-rich search function would be much more convenient. Also, shortening a `string` to an `str` bears no practical sense, regretfully being a routine in many subject areas.
##### Naming implies typing
Field named `recipe` must be of `Recipe` type. Field named `recipe_id` must contain a recipe identifier which we could find within the `Recipe` entity.
A field named `recipe` must be of the `Recipe` type. A field named `recipe_id` must contain a recipe identifier that we could find within the `Recipe` entity.
Same for primitive types. Arrays must be named in a plural form or as collective nouns, i.e. `objects`, `children`. If that's impossible, better add a prefix or a postfix to avoid doubt.
@ -130,9 +130,9 @@ Similarly, if a Boolean value is expected, entity naming must describe some qual
**Better**: `"task.is_finished": true`.
Specific platforms imply specific additions to this rule with regard to the first-class citizen types they provide. For example, entities of `Date` type (if such type is present) would benefit from being indicated with `_at` or `_date` postfix, i.e. `created_at`, `occurred_at`.
Specific platforms imply specific additions to this rule with regard to the first-class citizen types they provide. For example, entities of the `Date` type (if such type is present) would benefit from being indicated with `_at` or `_date` postfixes, i.e. `created_at`, `occurred_at`.
If entity name is a polysemantic term itself, which could confuse developers, better add an extra prefix or postfix to avoid misunderstanding.
If an entity name is a polysemantic term itself, which could confuse developers, better add an extra prefix or postfix to avoid misunderstanding.
**Bad**:
```
@ -146,7 +146,7 @@ Word ‘function’ is many-valued. It could mean built-in functions, but also
##### Matching entities must have matching names and behave alike
**Bad**: `begin_transition` / `stop_transition`
`begin` and `stop` doesn't match; developers will have to dig into the docs.
`begin` and `stop` terms don't match; developers will have to dig into the docs.
**Better**: either `begin_transition` / `end_transition` or `start_transition` / `stop_transition`.
@ -163,22 +163,22 @@ str_replace(needle, replace, haystack)
```
Several rules are violated:
* inconsistent underscore using;
* functionally close methods have different `needle`/`haystack` argument order;
* first function finds the first occurrence while the second one finds them all, and there is no way to deduce that fact out of the function signatures.
* functionally close methods have different `needle`/`haystack` argument ordering;
* the first function finds the first occurrence while the second one finds them all, and there is no way to deduce that fact out of the function signatures.
We're leaving the exercise of making these signatures better to the reader.
##### Use globally unique identifiers
It's considered good form to use globally unique strings as entity identifiers, either semantic (i.e. "lungo" for beverage types) or random ones (i.e. [UUID-4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random))). It might turn out to be extremely useful if you need to merge data from several sources under a single identifier.
It's considered a good form to use globally unique strings as entity identifiers, either semantic (i.e. "lungo" for beverage types) or random ones (i.e. [UUID-4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random))). It might turn out to be extremely useful if you need to merge data from several sources under a single identifier.
In general, we tend to advice using urn-like identifiers, e.g. `urn:order:<uuid>` (or just `order:<uuid>`). That helps a lot in dealing with legacy systems with different identifiers attached to the same entity. Namespaces in urns help to understand quickly which identifier is used and is there a usage mistake.
In general, we tend to advise using urn-like identifiers, e.g. `urn:order:<uuid>` (or just `order:<uuid>`). That helps a lot in dealing with legacy systems with different identifiers attached to the same entity. Namespaces in urns help to understand quickly which identifier is used and whether there is a usage mistake.
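A minimal sketch of how such namespaced identifiers might be generated and checked on the receiving side. The helper names (`make_urn`, `parse_urn`, `expect_namespace`) are hypothetical and serve only as an illustration; the sketch assumes the `urn:<namespace>:<uuid>` form with a random UUID.

```python
import uuid


def make_urn(namespace: str) -> str:
    """Builds an identifier like 'urn:order:<uuid>' (hypothetical helper)."""
    return f"urn:{namespace}:{uuid.uuid4()}"


def parse_urn(identifier: str) -> tuple[str, uuid.UUID]:
    """Splits the identifier into its namespace and UUID.

    Raises ValueError if the string is not of the 'urn:<namespace>:<uuid>' form.
    """
    parts = identifier.split(":")
    if len(parts) != 3 or parts[0] != "urn" or not parts[1]:
        raise ValueError(f"not a urn-like identifier: {identifier!r}")
    # uuid.UUID raises ValueError on a malformed UUID string
    return parts[1], uuid.UUID(parts[2])


def expect_namespace(identifier: str, namespace: str) -> uuid.UUID:
    """Rejects identifiers of the wrong entity type early."""
    actual, value = parse_urn(identifier)
    if actual != namespace:
        raise ValueError(f"expected a {namespace} id, got a {actual} id")
    return value
```

With a check like this, passing a user identifier where an order identifier is expected fails immediately instead of producing a confusing ‘not found’ response later.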
One important implication: **never use increasing numbers as external identifiers**. Apart from the abovementioned reasons, it allows counting how many entities of each type there are in the system. Your competitors will be able to calculate a precise number of orders you have each day, for example.
**NB**: in this book, we often use short identifiers like "123" in code examples; that's for the convenience of reading the book on small screens. Do not replicate this practice in a real-world API.
##### System state must be observable by clients
##### The system state must be observable by clients
This rule could be reformulated as ‘don't make clients guess’.
@ -197,7 +197,7 @@ GET /v1/orders/{id}
// and awaits checking
→ 404 Not Found
```
— though the operation looks to be executed successfully, the client must store order id and recurrently check `GET /v1/orders/{id}` state. This pattern is bad per se, but gets even worse when we consider two cases:
— though the operation looks to be executed successfully, the client must store the order id and recurrently check the `GET /v1/orders/{id}` state. This pattern is bad per se, but gets even worse when we consider two cases:
* clients might lose the id, if system failure happened in between sending the request and getting the response, or if app data storage was damaged or cleansed;
* customers can't use another device; in fact, the knowledge of orders being created is bound to a specific user agent.
@ -236,9 +236,9 @@ GET /v1/users/{id}/orders
— humans are bad at perceiving double negation and make mistakes.
**Better**: `"prohibit_calling": true` or `"avoid_calling": true`
— it's easier to read, though you shouldn't deceive yourself. Avoid semantical double negations, even if you've found a ‘negative’ word without ‘negative’ prefix.
— it's easier to read, though you shouldn't deceive yourself. Avoid semantical double negations, even if you've found a ‘negative’ word without a ‘negative’ prefix.
Also worth mentioning, that making mistakes in [de Morgan's laws](https://en.wikipedia.org/wiki/De_Morgan's_laws) usage is even simpler. For example, if you have two flags:
Also worth mentioning that making mistakes in the [de Morgan's laws](https://en.wikipedia.org/wiki/De_Morgan's_laws) usage is even simpler. For example, if you have two flags:
```
GET /coffee-machines/{id}/stocks
@ -257,7 +257,7 @@ GET /coffee-machines/{id}/stocks
"cup_absence": false
}
```
— then developers will have to evaluate one of `!beans_absence && !cup_absence` ⇔ `!(beans_absence || cup_absence)` conditions, and in this transition people tend to make mistakes. Avoiding double negations helps little, and regretfully only general advice could be given: avoid the situations when developers have to evaluate such flags.
— then developers will have to evaluate one of the `!beans_absence && !cup_absence` ⇔ `!(beans_absence || cup_absence)` conditions, and in this transition, people tend to make mistakes. Avoiding double negations helps little, and regretfully only general advice could be given: avoid the situations when developers have to evaluate such flags.
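As an illustration only, the equivalence of the two forms can be checked exhaustively, and the combined condition can be hidden behind a single named helper so that client code never negates the flags by hand. The helper name `can_make_beverage` is an assumption, not part of the API discussed here.

```python
from itertools import product


def can_make_beverage(beans_absence: bool, cup_absence: bool) -> bool:
    """Hypothetical helper: true only when neither flag reports absence."""
    return not (beans_absence or cup_absence)


# Exhaustively check that both forms from the text agree on all inputs.
for beans_absence, cup_absence in product([False, True], repeat=2):
    assert (not beans_absence and not cup_absence) == can_make_beverage(
        beans_absence, cup_absence
    )
```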
##### Avoid implicit type conversion
@ -281,7 +281,7 @@ if (Type(order.contactless_delivery) == 'Boolean' &&
This practice makes the code more complicated, and it's quite easy to make mistakes, which will effectively treat the field in a quite opposite manner. The same could happen if some special values (i.e. `null` or `-1`) to denote value absence are used.
The universal rule to deal with such situations is to make all new Boolean flags being false by default.
The universal rule to deal with such situations is to make all new Boolean flags false by default.
**Better**
@ -356,9 +356,9 @@ PATCH /v1/orders/123
{ "delivery_address" }
```
— this approach is usually chosen to lessen request and response body sizes, plus it allows to implement collaborative editing cheaply. Both these advantages are imaginary.
— this approach is usually chosen to lessen request and response body sizes, plus it allows for the implementation of collaborative editing cheaply. Both these advantages are imaginary.
**First**, sparing bytes on semantic data is seldom needed in modern apps. Network packets sizes (MTU, Maximum Transmission Unit) are more than a kilobyte right now; shortening responses is useless while they're less than a kilobyte.
**First**, sparing bytes on semantic data is seldom needed in modern apps. Network packet sizes (MTU, Maximum Transmission Unit) are more than a kilobyte right now; shortening responses is useless while they're less than a kilobyte.
Excessive network traffic usually occurs if:
@ -374,9 +374,9 @@ Transferring only a subset of fields solves none of these problems, in the best
**Second**, shortening response sizes will backfire exactly with spoiling collaborative editing: one client won't see the changes the other client has made. Generally speaking, in 9 cases out of 10, it is better to return a full entity state from any modifying operation, sharing the format with the read-access endpoint. Actually, you should always do this unless response size affects performance.
**In third**, this approach might work if you need to rewrite a field's value. But how to unset the field, return its value to the default state? For example, how to *remove* `client_phone_number_ext`?
**Third**, this approach might work if you need to rewrite a field's value. But how to unset the field, e.g. to return its value to the default state? For example, how to *remove* the `client_phone_number_ext`?
In such cases, special values are often being used, like `null`. But as we discussed above, this is a defective practice. Another variant is prohibiting non-required fields, but that would pose considerable obstacles in a way of expanding the API.
In such cases, special values are often being used, for example, a `null` one. But as we discussed above, this is a defective practice. Another variant is prohibiting non-required fields, but that would pose considerable obstacles in a way of expanding the API.
**Better**: one of the following two strategies might be used.
@ -406,7 +406,7 @@ PUT /v1/orders/123/client-details
{ "phone_number" }
```
Omitting `client_phone_number_ext` in `PUT client-details` request would be sufficient to remove it. This approach also helps to separate constant and calculated fields (`order_id` and `updated_at`) from editable ones, thus getting rid of ambiguous situations (what happens if a client tries to rewrite the `updated_at` field?). You may also return the entire `order` entity from `PUT` endpoints (however, there should be some naming convention for that).
Omitting the `client_phone_number_ext` in the `PUT client-details` request would be sufficient to remove it. This approach also helps to separate constant and calculated fields (`order_id` and `updated_at`) from editable ones, thus getting rid of ambiguous situations (what happens if a client tries to rewrite the `updated_at` field?). You may also return the entire `order` entity from `PUT` endpoints (however, there should be some naming convention for that).
**Option 2**: design a format for atomic changes.
@ -425,7 +425,7 @@ X-Idempotency-Token: <see next paragraph>
}
```
This approach is much harder to implement, but it's the only viable method to implement collaborative editing, since it explicitly reflects what a user was actually doing with entity representation. With data exposed in such a format, you might actually implement offline editing, when user changes are accumulated and then sent at once, while the server automatically resolves conflicts by ‘rebasing’ the changes.
This approach is much harder to implement, but it's the only viable method to implement collaborative editing since it explicitly reflects what a user was actually doing with entity representation. With data exposed in such a format, you might actually implement offline editing, when user changes are accumulated and then sent at once, while the server automatically resolves conflicts by ‘rebasing’ the changes.
##### All API operations must be idempotent
@ -438,7 +438,7 @@ If the endpoint's idempotency can't be assured naturally, explicit idempotency p
// Creates an order
POST /orders
```
The second order will be produced if the request is repeated!
A second order will be produced if the request is repeated!
**Better**:
```
@ -446,7 +446,7 @@ The second order will be produced if the request is repeated!
POST /v1/orders
X-Idempotency-Token: <random string>
```
A client on its side must retain `X-Idempotency-Token` in case of automated endpoint retrying. A server on its side must check whether an order created with this token exists.
A client on its side must retain the `X-Idempotency-Token` in case of automated endpoint retrying. A server on its side must check whether an order created with this token exists.
**An alternative**:
```
@ -463,7 +463,7 @@ PUT /v1/orders/drafts/{draft_id}
Creating order drafts is a non-binding operation since it doesn't entail any consequences, so it's fine to create drafts without the idempotency token.
Confirming drafts is a naturally idempotent operation, with `draft_id` being its idempotency key.
Confirming drafts is a naturally idempotent operation, with the `draft_id` being its idempotency key.
Also worth mentioning that adding idempotency tokens to naturally idempotent handlers isn't meaningless either, since it allows to distinguish two situations:
* a client didn't get the response because of some network issues, and is now repeating the request;
@ -481,7 +481,7 @@ The server retrieves the actual resource revision and finds it to be 124. How to
The server may compare request bodies, assuming that identical `updates` values mean retrying, but this assumption might be dangerously wrong (for example if the resource is a counter of some kind, then repeating identical requests is routine).
Adding idempotency token (either directly as a random string, or indirectly in a form of drafts) solves this problem.
Adding the idempotency token (either directly as a random string, or indirectly in a form of drafts) solves this problem.
```
POST /resource/updates
X-Idempotency-Token: <token>
@ -505,10 +505,10 @@ X-Idempotency-Token: <token>
```
— the server found out that a different token was used in creating revision 124, which means an access conflict.
Furthermore, adding idempotency tokens not only resolves the issue but also makes advanced optimizations possible. If the server detects an access conflict, it could try to resolve it, ‘rebasing’ the update like modern version control systems do, and return `200 OK` instead of `409 Conflict`. This logic dramatically improves user experience, being fully backwards compatible, and helps to avoid conflict resolving code fragmentation.
Furthermore, adding idempotency tokens not only resolves the issue but also makes advanced optimizations possible. If the server detects an access conflict, it could try to resolve it, ‘rebasing’ the update like modern version control systems do, and return a `200 OK` instead of a `409 Conflict`. This logic dramatically improves user experience, being fully backwards compatible, and helps to avoid conflict resolving code fragmentation.
Also, be warned: clients are bad at implementing idempotency tokens. Two problems are common:
* you can't really expect that clients generate truly random tokens — they may share the same seed or simply use weak algorithms or entropy sources; therefore you must put constraints on token checking: token must be unique to specific user and resource, not globally;
* you can't really expect that clients generate truly random tokens — they may share the same seed or simply use weak algorithms or entropy sources; therefore you must put constraints on token checking: token must be unique to a specific user and resource, not globally;
* clients tend to misunderstand the concept and either generate new tokens each time they repeat the request (which deteriorates the UX, but is otherwise harmless) or conversely use one token in several requests (not harmless at all and could lead to catastrophic consequences; another reason to implement the suggestion in the previous clause); writing detailed docs and/or a client library is highly recommended.
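A minimal server-side sketch of the token check, assuming an in-memory store and a hypothetical `OrderService` class (a real service would use persistent storage). The token is looked up per user rather than globally, so two users accidentally sending the same weak token do not collide.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class OrderService:
    # Maps (user_id, idempotency_token) to the id of the order created with it.
    _orders_by_token: dict[tuple[str, str], str] = field(default_factory=dict)

    def create_order(self, user_id: str, token: str) -> tuple[str, bool]:
        """Returns (order_id, created).

        A repeated token for the same user returns the existing order
        instead of creating a second one.
        """
        key = (user_id, token)
        existing = self._orders_by_token.get(key)
        if existing is not None:
            return existing, False
        order_id = f"order:{uuid.uuid4()}"
        self._orders_by_token[key] = order_id
        return order_id, True
```

Here the resource is the order collection itself; for tokens attached to updates of a specific resource, the resource identifier would presumably become part of the key as well.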
##### Avoid non-atomic operations
@ -593,13 +593,13 @@ PATCH /v1/recipes
```
Here:
* `change_id` is a unique identifier of each atomic change;
* `occurred_at` is a moment of time when the change was actually applied;
* `error` field contains the error data related to the specific change.
* the `change_id` field is a unique identifier of each atomic change;
* the `occurred_at` field is a moment of time when the change was actually applied;
* the `error` field contains the error data related to the specific change.
Might be of use:
* introducing `sequence_id` parameters in the request to guarantee execution order and to align item order in response with the requested one;
* expose a separate `/changes-history` endpoint for clients to get the history of applied changes even if the app crashed while getting partial success response or there was a network timeout.
* expose a separate `/changes-history` endpoint for clients to get the history of applied changes even if the app crashed while getting a partial success response or there was a network timeout.
Non-atomic changes are undesirable because they erode the idempotency concept. Let's take a look at the example:
@ -681,7 +681,7 @@ Two questions arise:
* until when is the price valid?
* in what vicinity of the location is the price valid?
**Better**: you may use standard protocol capabilities to denote cache options, like `Cache-Control` header. If you need caching in both temporal and spatial dimensions, you should do something like that:
**Better**: you may use standard protocol capabilities to denote cache options, like the `Cache-Control` header. If you need caching in both temporal and spatial dimensions, you should do something like that:
```
// Returns an offer: for what money sum
// our service commits to make a lungo
@ -732,9 +732,9 @@ At the first glance, this is the most standard way of organizing the pagination
2. What happens if some record is deleted from the head of the list?
Easy: the client will miss one record and will never learn this.
3. What cache parameters to set for this endpoint?
None could be set: repeating the request with the same `limit` and `offset` each time produces new records set.
None could be set: repeating the request with the same `limit` and `offset` parameters each time produces a new record set.
**Better**: in such unidirectional lists the pagination must use that key which implies the order. Like this:
**Better**: in such unidirectional lists the pagination must use the key that implies the order. Like this:
```
// Returns a limited number of records
// sorted by creation date
@ -747,9 +747,9 @@ GET /v1/records?older_than={record_id}&limit=10
// preceding the specified one
GET /v1/records?newer_than={record_id}&limit=10
```
With the pagination organized like that, clients never bother about records being added or removed in the processed part of the list: they continue to iterate over the records, either getting new ones (using `newer_than`) or older ones (using `older_than`). If there is no record removal operation, clients may easily cache responses — the URL will always return the same recordset.
With the pagination organized like that, clients never bother about records being added or removed in the processed part of the list: they continue to iterate over the records, either getting new ones (using `newer_than`) or older ones (using `older_than`). If there is no record removal operation, clients may easily cache responses — the URL will always return the same record set.
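The following sketch shows the idea on an in-memory list. The `Record` type and both functions are hypothetical, and the sketch assumes that creation timestamps are unique and that the referenced record still exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    id: str
    created_at: int  # e.g. a Unix timestamp; assumed unique per record


def _find(records: list[Record], record_id: str) -> Record:
    for record in records:
        if record.id == record_id:
            return record
    raise KeyError(record_id)


def older_than(records: list[Record], record_id: str, limit: int) -> list[Record]:
    """Records created before the given one, newest first."""
    pivot = _find(records, record_id)
    older = [r for r in records if r.created_at < pivot.created_at]
    older.sort(key=lambda r: r.created_at, reverse=True)
    return older[:limit]


def newer_than(records: list[Record], record_id: str, limit: int) -> list[Record]:
    """Records created after the given one, oldest first."""
    pivot = _find(records, record_id)
    newer = [r for r in records if r.created_at > pivot.created_at]
    newer.sort(key=lambda r: r.created_at)
    return newer[:limit]
```

Because the page is defined relative to a record rather than to a numeric offset, appending new records to the head of the list does not shift the result of an `older_than` request.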
Another way to organize such lists is returning a `cursor` to be used instead of `record_id`, making interfaces more versatile.
Another way to organize such lists is returning a `cursor` to be used instead of the `record_id`, making interfaces more versatile.
```
// Initial data request
POST /v1/records/list
@ -772,14 +772,14 @@ POST /v1/records/list
GET /v1/records?cursor=<cursor value>
{ "records", "cursor" }
```
One advantage of this approach is the possibility to keep initial request parameters (i.e. `filter` in our example) embedded into the cursor itself, thus not copying them in follow-up requests. It might be especially actual if the initial request prepares the full dataset, for example, moving it from the ‘cold’ storage to a ‘hot’ one (then `cursor` might simply contain the encoded dataset id and the offset).
One advantage of this approach is the possibility to keep initial request parameters (e.g. the `filter` in our example) embedded into the cursor itself, thus not copying them in follow-up requests. It might be especially relevant if the initial request prepares the full dataset, for example, moving it from the ‘cold’ storage to a ‘hot’ one (then the `cursor` might simply contain the encoded dataset id and the offset).
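One possible way to embed the request parameters into a cursor, shown as a sketch: the cursor is a URL-safe base64 encoding of a JSON payload. The function names and the payload layout are assumptions made for this example; a production cursor would likely also be signed or encrypted so that clients cannot tamper with it.

```python
import base64
import json


def encode_cursor(request_filter: dict, offset: int) -> str:
    """Packs the initial filter and the current position into an opaque string."""
    payload = json.dumps(
        {"filter": request_filter, "offset": offset}, sort_keys=True
    )
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str) -> tuple[dict, int]:
    """Restores the filter and the position from a cursor value."""
    payload = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return payload["filter"], payload["offset"]
```

From the client's point of view the cursor stays an opaque value: it is passed back as is, and only the server knows how to interpret it.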
There are several approaches to implementing cursors (for example, making a single endpoint for initial and follow-up requests, returning the first data portion in the first response). As usual, the crucial part is maintaining consistency across all such endpoints.
**NB**: some sources discourage this approach because in this case user can't see a list of all pages and can't choose an arbitrary one. We should note here that:
* such a case (pages list and page selection) exists if we deal with user interfaces; we could hardly imagine a *program* interface which needs to provide access to random data pages;
* such a case (pages list and page selection) exists if we deal with user interfaces; we could hardly imagine a *program* interface that needs to provide access to random data pages;
* if we still talk about an API to some application, which has a ‘paging’ user control, then a proper approach would be to prepare ‘paging’ data on the server side, including generating links to pages;
* cursor-based solution doesn't prohibit using `offset`/`limit`; nothing could stop us from creating a dual interface, which might serve both `GET /items?cursor=…` and `GET /items?offset=…&limit=…` requests;
* cursor-based solutions don't prohibit using the `offset`/`limit` parameters; nothing could prevent us from creating a dual interface, which might serve both `GET /items?cursor=…` and `GET /items?offset=…&limit=…` requests;
* finally, if there is a necessity to provide access to arbitrary pages in the user interface, we should ask ourselves a question, which problem is being solved that way; probably, users use this functionality to find something: a specific element on the list, or the position they ended while working with the list last time; probably, we should provide more convenient controls to solve those tasks than accessing data pages by their indexes.
**Bad**:
@ -791,11 +791,11 @@ There are several approaches to implementing cursors (for example, making a sing
GET /records?sort_by=date_modified&sort_order=desc&limit=10&offset=100
```
Sorting by the date of modification usually means that data might be modified. In other words, some records might change after the first data chunk is returned, but before the next chunk is requested. Modified records will simply disappear from the listing because of moving to the first page. Clients will never get those records that were changed during the iteration process, even if the `cursor` scheme is implemented, and they never learn the sheer fact of such an omission. Also, this particular interface isn't extendable as there is no way to add sorting by two or more fields.
Sorting by the date of modification usually means that data might be modified. In other words, some records might change after the first data chunk is returned, but before the next chunk is requested. Modified records will simply disappear from the listing because of moving to the first page. Clients will never get those records that were changed during the iteration process, even if the cursor-based scheme is implemented, and they never learn the sheer fact of such an omission. Also, this particular interface isn't extendable as there is no way to add sorting by two or more fields.
**Better**: there is no general solution to this problem in this formulation. Listing records by modification time will always be unpredictably volatile, so we have to change the approach itself; we have two options.
**Option one**: fix the record order at the moment we've got the initial request, e.g. our server produces the entire list and stores it in the immutable form:
**Option one**: fix the records ordering at the moment we've got the initial request, e.g. our server produces the entire list and stores it in the immutable form:
```
// Creates a view based on the parameters passed
@ -836,7 +836,7 @@ This scheme's downsides are the necessity to create separate indexed event stora
##### Errors must be informative
While writing the code developers face problems, many of them quite trivial, like invalid parameter type or some boundary violation. The more convenient are error responses your API return, the fewer is the time developers waste in struggling with it, and the more comfortable is working with the API.
While writing the code developers face problems, many of them quite trivial, like invalid parameter types or some boundary violations. The more convenient the error responses your API returns, the less time developers waste struggling with it, and the more comfortable working with the API is.
**Bad**:
```
@ -851,7 +851,7 @@ POST /v1/coffee-machines/search
→ 400 Bad Request
{}
```
— of course, the mistakes (typo in `"lngo"` and wrong coordinates) are obvious. But the handler checks them anyway, why not return readable descriptions?
— of course, the mistakes (typo in the `"lngo"`, wrong coordinates) are obvious. But the handler checks them anyway, why not return readable descriptions?
**Better**:
```
@ -940,7 +940,7 @@ POST /v1/orders
```
— what was the point of showing the price changed dialog, if the user still can't make an order, even if the price is right? When one of the concurrent orders finishes, and the user is able to commit another one, prices, items availability, and other order parameters will likely need another correction.
**In third**, draw a chart: which error resolution might lead to the emergence of another one. Otherwise, you might eventually return the same error several times, or worse, make a cycle of errors.
**Third**, draw a chart: which error resolution might lead to the emergence of another one. Otherwise, you might eventually return the same error several times, or worse, make a cycle of errors.
```
// Create an order
@ -1023,12 +1023,12 @@ All endpoints must accept language parameters (for example, in a form of the `Ac
It is important to understand that the user's language and the user's jurisdiction are different things. Your API working cycle must always store the user's location. It might be stated either explicitly (requests contain geographical coordinates) or implicitly (initial location-bound request initiates session creation which stores the location), but no correct localization is possible in absence of location data. In most cases reducing the location to just a country code is enough.
The thing is that lots of parameters potentially affecting data formats depend not on language, but user location. To name a few: number formatting (integer and fractional part delimiter, digit groups delimiter), date formatting, the first day of the week, keyboard layout, measurement units system (which might be non-decimal!), etc. In some situations, you need to store two locations: user residence location and user ‘viewport’. For example, if the US citizen is planning a European trip, it's convenient to show prices in local currency, but measure distances in miles and feet.
The thing is that lots of parameters potentially affecting data formats depend not on language, but on a user's location. To name a few: number formatting (integer and fractional part delimiter, digit groups delimiter), date formatting, the first day of the week, keyboard layout, measurement units system (which might be non-decimal!), etc. In some situations, you need to store two locations: user residence location and user ‘viewport’. For example, if a US citizen is planning a European trip, it's convenient to show prices in local currency, but measure distances in miles and feet.
Sometimes explicit location passing is not enough since there are lots of territorial conflicts in a world. How the API should behave when user coordinates lie within disputed regions is a legal matter, regretfully. The author of this book once had to implement a ‘state A territory according to state B official position’ concept.
Sometimes explicit location passing is not enough since there are lots of territorial conflicts in the world. How the API should behave when user coordinates lie within disputed regions is a legal matter, regretfully. The author of this book once had to implement a ‘state A territory according to state B official position’ concept.
**Important**: mark a difference between localization for end users and localization for developers. Take a look at the example in rule \#19: `localized_message` is meant for the user; the app should show it if there is no specific handler for this error exists in code. This message must be written in the user's language and formatted according to the user's location. But `details.checks_failed[].message` is meant to be read by developers examining the problem. So it must be written and formatted in a manner that suits developers best. In a software development world, it usually means ‘in English’.
**Important**: mark a difference between localization for end users and localization for developers. Take a look at the example in rule \#19: `localized_message` is meant for the user; the app should show it if no specific handler for this error exists in the code. This message must be written in the user's language and formatted according to the user's location. But the `details.checks_failed[].message` is meant to be read by developers examining the problem. So it must be written and formatted in a manner that suits developers best. In the software development world, it usually means ‘in English’.
Worth mentioning is that `localized_` prefix in the example is used to differentiate messages to users from messages to developers. A concept like that must be, of course, explicitly stated in your API docs.
Worth mentioning is that the `localized_` prefix in the example is used to differentiate messages to users from messages to developers. A concept like that must be, of course, explicitly stated in your API docs.
And one more thing: all strings must be UTF-8, no exclusions.

View File

@ -96,4 +96,4 @@ Let's briefly describe these decisions and the key factors for making them.
  * if you provide server-side APIs and compiled SDKs only, you may basically not expose minor versions at all, just the current one: the server-side API is totally within your control, and you may fix any problem efficiently;
* if you provide code-on-demand SDKs, it is considered a good form to provide an access to previous minor versions of SDK for a period of time sufficient enough for developers to test their application and fix some issues if necessary. Since full rewriting isn't necessary, it's fine to align with apps release cycle duration in your industry, which is usually several months in worst cases.
We will address these questions in more detail in the next chapters. Additionally, in the Section III we will also discuss, how to communicate to customers about new releases and older versions support discontinuance, and how to stimulate them to adopt new API versions.
We will address these questions in more detail in the next chapters. Additionally, in Section III we will also discuss how to communicate to customers about new releases and the discontinuance of support for older versions, and how to stimulate them to adopt new API versions.