<metaname="description"content="Designing APIs is a very special skill: API is a multiplier to both your opportunities and mistakes. This book is written to share the expertise and describe the best practices in the API design. The book comprises three large sections. In Section I we'll discuss designing APIs as a concept: how to build the architecture properly, from a high-level planning down to final interfaces. Section II is dedicated to an API's lifecycle: how interfaces evolve over time, and how to elaborate the product to match users' needs. Finally, Section III is more about un-engineering sides of the API, like API marketing, organizing support, and working with a community.">
<metaproperty="og:title"content="Sergey Konstantinov. The API">
<metaproperty="og:description"content="Designing APIs is a very special skill: API is a multiplier to both your opportunities and mistakes. This book is written to share the expertise and describe the best practices in the API design. The book comprises three large sections. In Section I we'll discuss designing APIs as a concept: how to build the architecture properly, from a high-level planning down to final interfaces. Section II is dedicated to an API's lifecycle: how interfaces evolve over time, and how to elaborate the product to match users' needs. Finally, Section III is more about un-engineering sides of the API, like API marketing, organizing support, and working with a community.">
<divclass="page-break"></div><nav><h2class="toc">Table of Contents</h2><ulclass="table-of-contents"><li><ahref="#section-1">Introduction</a><ul><li><ahref="#chapter-1">Chapter 1. On the Structure of This Book</a></li><li><ahref="#chapter-2">Chapter 2. The API Definition</a></li><li><ahref="#chapter-3">Chapter 3. API Quality Criteria</a></li><li><ahref="#chapter-4">Chapter 4. Backwards Compatibility</a></li><li><ahref="#chapter-5">Chapter 5. On Versioning</a></li><li><ahref="#chapter-6">Chapter 6. Terms and Notation Keys</a></li></ul></li><li><ahref="#section-2">Section I. The API Design</a><ul><li><ahref="#chapter-7">Chapter 7. The API Contexts Pyramid</a></li><li><ahref="#chapter-8">Chapter 8. Defining an Application Field</a></li><li><ahref="#chapter-9">Chapter 9. Separating Abstraction Levels</a></li><li><ahref="#chapter-10">Chapter 10. Isolating Responsibility Areas</a></li><li><ahref="#chapter-11">Chapter 11. Describing Final Interfaces</a></li><li><ahref="#chapter-12">Chapter 12. Annex to Section I. Generic API Example</a></li></ul></li></ul></nav><divstyle="page-break-after: always;"></div><h2><ahref="#section-1"class="anchor"id="section-1">Introduction</a></h2><h3><ahref="#chapter-1"class="anchor"id="chapter-1">Chapter 1. On the Structure of This Book</a></h3>
<p>In Section I we'll discuss designing APIs as a concept: how to build the architecture properly, from high-level planning down to final interfaces.</p>
<p>Section II is dedicated to an API's lifecycle: how interfaces evolve over time, and how to elaborate the product to match users' needs.</p>
<p>The first two sections are of interest mostly to engineers, while the third one is relevant to both engineers and product managers. However, we insist that the third section is the most important one for API software developers. Since an API is a product for engineers, you cannot simply make a non-engineering team responsible for product planning and support. Nobody understands your API's product features better than you do.</p>
<p>Before we start talking about API design, we need to explicitly define what an API is. Encyclopedias tell us that ‘API’ is an acronym for ‘Application Programming Interface’. This definition is fine, but useless — much like Plato's definition of Man: ‘an upright biped without feathers’. That definition is fine again, but it gives us no understanding of what's so important about a Man. (Actually, it's not even ‘fine’: Diogenes of Sinope once brought a plucked chicken, saying ‘Behold, Plato's Man’, and Plato had to add ‘with broad nails’ to his definition.)</p>
<p>You're probably reading this book in a Web browser. To make the browser display this page correctly, a whole chain of technologies must work correctly: parsing the URL according to the specification; the DNS service; the TLS handshake protocol; transmitting the data over the HTTP protocol; HTML document parsing; CSS document parsing; correct HTML+CSS rendering.</p>
<p>But those are just the tip of the iceberg. To make the HTTP protocol work, you need the entire network stack (comprising four or five protocol levels, or even more) to work correctly. HTML document parsing is performed according to hundreds of different specifications. Document rendering calls the underlying operating system API, or even the graphics processor API directly. And so on: down to modern CISC processor instructions being implemented on top of a microcode API.</p>
<p>In other words, hundreds or even thousands of different APIs must work correctly to make basic actions possible, like viewing a webpage. Modern internet technologies simply couldn't exist without these tons of APIs working fine.</p>
<p>What differentiates a Roman aqueduct from a good API is that APIs presume the contract to be <em>programmable</em>. To connect the two areas, some <em>coding</em> is needed. The goal of this book is to help you design APIs which serve their purposes as solidly as a Roman aqueduct does.</p>
<p>An aqueduct also illustrates another problem of API design: your customers are engineers themselves. You are not supplying water to end users: suppliers plug their pipes into your engineering structure, building their own structures upon it. On the one hand, you may provide access to water to many more people through them, not spending your time plugging each individual house into your network. On the other hand, you can't control the quality of suppliers' solutions, and you will be blamed every time there is a water problem caused by their incompetence.</p>
<p>That's why designing an API implies a larger area of responsibility. <strong>An API is a multiplier to both your opportunities and your mistakes</strong>.</p><div class="page-break"></div><h3><a href="#chapter-3" class="anchor" id="chapter-3">Chapter 3. API Quality Criteria</a></h3>
<p>Let's discuss the second question first. Obviously, API ‘finesse’ is first of all defined by its capability to solve developers' problems. (One may reasonably object that solving developers' problems might not be the main purpose of offering our API to developers. However, manipulating public opinion is outside of this book's scope. Here we assume that APIs exist primarily to help developers solve their problems, not for some covertly declared purposes.)</p>
<p>So, how might API design help developers? Quite simply: a well-designed API must solve their problems in the most efficient and comprehensible manner. The distance from formulating the task to writing working code must be as short as possible. Among other things, it means that:</p>
<li>it must be totally obvious from your API's structure how to solve a task; ideally, developers should be able to understand at first glance which entities are meant to solve their problem;</li>
<li>the API must be readable; ideally, developers should write correct code just by looking at the method nomenclature, never bothering about details (especially API implementation details!); it is also very important to mention that not only the problem solution (the ‘happy path’) should be obvious, but also possible errors and exceptions (the ‘unhappy path’);</li>
<li>the API must be consistent; while developing new functionality (i.e. while using unknown API entities) developers may write new code similar to the code they have already written using known API concepts, and this new code will work.</li>
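<p>To illustrate the consistency requirement with a hypothetical example (the endpoints below are invented for illustration): if developers have learned how one collection of your API works, they should be able to guess how the others do.</p>
<pre><code>// A developer has learned how to read a recipe
GET /v1/recipes/{id}
// Consistency implies that other collections
// are organized the same way,
// so this guess should simply work
GET /v1/orders/{id}
</code></pre>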
<p>However, the static convenience and clarity of an API is the simple part. After all, nobody seeks to make an API deliberately irrational and unreadable. When we are developing an API, we always start with clear basic concepts. Provided you have some experience with APIs, it's quite hard to make an API core which fails to meet the obviousness, readability, and consistency criteria.</p>
<p>Problems begin when we start to expand our API. Adding new functionality sooner or later results in transforming a once plain and simple API into a mess of conflicting concepts, and our efforts to maintain backwards compatibility lead to illogical, unobvious, and simply bad design solutions. It is partly related to an inability to predict the future in detail: your understanding of ‘fine’ APIs will change over time, both in objective terms (what problems the API is to solve, and what the best practices are) and in subjective ones too (what obviousness, readability, and consistency <em>really mean</em> with regard to your API).</p>
<p>The principles we explain below are specifically oriented towards making APIs evolve smoothly over time, without being turned into a pile of mixed inconsistent interfaces. It is crucial to understand that this approach isn't free: the necessity to bear in mind all possible extension variants and to preserve essential growth points means interface redundancy and possibly excessive abstractions being embedded in the API design. Besides, both make developers' work harder. <strong>Providing excess design complexities reserved for future use makes sense only when this future actually exists for your API. Otherwise it's simply overengineering.</strong></p><div class="page-break"></div><h3><a href="#chapter-4" class="anchor" id="chapter-4">Chapter 4. Backwards Compatibility</a></h3>
<p>Backwards compatibility is a <em>temporal</em> characteristic of your API. The obligation to maintain backwards compatibility is the crucial point where API development differs from software development in general.</p>
<p>Of course, backwards compatibility isn't absolute. In some subject areas shipping new backwards-incompatible API versions is routine. Nevertheless, every time you deploy a new backwards-incompatible API version, developers need to make some non-zero effort to adapt their code to it. In this sense, releasing new API versions puts a sort of ‘tax’ on customers. They must spend quite real money just to make sure their products continue working.</p>
<p>Large companies, which occupy firm market positions, can afford to impose such a tax. Furthermore, they may introduce penalties for those who refuse to adapt their code to new API versions, up to disabling their applications.</p>
<p>From our point of view, such a practice cannot be justified. Don't impose hidden taxes on your customers. If you're able to avoid breaking backwards compatibility — never break it.</p>
<p>Of course, maintaining old API versions is a sort of tax as well. Technology changes, and you cannot foresee everything, regardless of how nicely your API was initially designed. At some point keeping old API versions results in an inability to provide new functionality and support new platforms, and you will be forced to release a new version. But at least you will be able to explain to your customers why they need to make the effort.</p>
<p>We will discuss API lifecycle and version policies in Section II.</p><div class="page-break"></div><h3><a href="#chapter-5" class="anchor" id="chapter-5">Chapter 5. On Versioning</a></h3>
<p>The phrases ‘major API version’ and ‘new API version, containing backwards-incompatible changes’ are therefore to be considered equivalent.</p>
<p>In Section II we will discuss versioning policies in more detail. In Section I we will just use semver version designations, specifically <code>v1</code>, <code>v2</code>, etc.</p><div class="page-break"></div><h3><a href="#chapter-6" class="anchor" id="chapter-6">Chapter 6. Terms and Notation Keys</a></h3>
<p>Software development is characterized, among other things, by the existence of many different engineering paradigms, whose adherents are sometimes quite aggressive towards other paradigms' adherents. While writing this book we are deliberately avoiding terms like ‘method’, ‘object’, ‘function’, and so on, using the neutral term ‘entity’ instead. An ‘entity’ means some atomic functionality unit, like a class, method, object, monad, or prototype (underline whatever you think is right).</p>
<p>As for an entity's components, we regretfully failed to find a proper term, so we will use the words ‘fields’ and ‘methods’.</p>
<p>Most of the API examples will be provided in the form of JSON-over-HTTP endpoints. This is a sort of notation which, as we see it, helps to describe concepts in the most comprehensible manner. A <code>GET /v1/orders</code> endpoint call could easily be replaced with an <code>orders.get()</code> method call, local or remote; JSON could easily be replaced with any other data format. The meaning of the assertions shouldn't change.</p>
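<p>Let's take a look at a sample of this notation (the specific endpoint and fields are purely illustrative):</p>
<pre><code>// Method description
POST /v1/bucket/{id}/some-resource
{
  …
  "some_parameter": "example value",
  …
}
→ 404 Not Found
// Error description
{
  "error_message"
}
</code></pre>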
<li>a client performs a <code>POST</code> request to a <code>/v1/bucket/{id}/some-resource</code> resource, where <code>{id}</code> is to be replaced with some <code>bucket</code>'s identifier (<code>{something}</code> notation refers to the nearest term from the left, unless explicitly specified otherwise);</li>
<li>a specific JSON, containing a <code>some_parameter</code> field and some other unspecified fields (indicated by ellipses), is sent as the request body payload;</li>
<li>in response (marked with the arrow symbol <code>→</code>) the server returns a <code>404 Not Found</code> status code; the status might be omitted (treat it like a <code>200 OK</code> if no status is provided);</li>
<li>the response could possibly contain additional notable headers;</li>
<li>the response body is a JSON comprising a single <code>error_message</code> field; the absence of a field value means that the field contains exactly what you expect it to contain — some error message in this case.</li>
<p>The term ‘client’ here stands for an application being executed on a user's device, either a native or a web one. The terms ‘agent’ and ‘user agent’ are synonymous with ‘client’.</p>
<p>Simplified notation might be used to avoid redundancies, like <code>POST /some-resource</code><code>{…, "some_parameter", …}</code> → <code>{ "operation_id" }</code>; request and response bodies might also be omitted.</p>
<p>We will be using sentences like ‘<code>POST /v1/bucket/{id}/some-resource</code> method’ (or simply ‘<code>bucket/some-resource</code> method’, ‘<code>some-resource</code>’ method — if there are no other <code>some-resource</code>s in the chapter, so there is no ambiguity) to refer to such endpoint definitions.</p>
<p>Apart from the HTTP API notation, we will employ C-style pseudocode — or, to be more precise, JavaScript-like or Python-like, since types are omitted. We assume such imperative structures are readable enough to skip detailed grammar explanations.</p><div class="page-break"></div><h2><a href="#section-2" class="anchor" id="section-2">Section I. The API Design</a></h2><h3><a href="#chapter-7" class="anchor" id="chapter-7">Chapter 7. The API Contexts Pyramid</a></h3>
<p>This four-step algorithm actually builds an API from top to bottom, from common requirements and use case scenarios down to a refined entity nomenclature. In fact, moving this way will eventually conclude with a ready-to-use API — that's why we value this approach highly.</p>
<p>It might seem that the most useful pieces of advice are given in the last chapter, but that's not true. The cost of a mistake differs from level to level: fixing the naming is simple; revising a wrong understanding of what the API stands for is practically impossible.</p>
<p><strong>NB</strong>. Here and throughout we will illustrate API design concepts using a hypothetical example of an API allowing for ordering a cup of coffee in city cafes. Just in case: this example is totally synthetic. If we were to design such an API in the real world, it would probably have very little in common with our fictional example.</p><div class="page-break"></div><h3><a href="#chapter-8" class="anchor" id="chapter-8">Chapter 8. Defining an Application Field</a></h3>
<p>The key question you should ask yourself is: what problem do we solve? It should be asked four times, each time putting the emphasis on a different word.</p>
<p><em>What</em> problem do we solve? Could we clearly outline the situation in which our hypothetical API is needed by developers?</p>
</li>
<li>
<p>What <em>problem</em> do we solve? Are we sure that the abovementioned situation poses a problem? Does someone really want to pay (literally or figuratively) to automate a solution for it?</p>
</li>
<li>
<p>What problem do <em>we</em> solve? Do we actually possess the expertise to solve the problem?</p>
</li>
<li>
<p>What problem do we <em>solve</em>? Is it true that the solution we propose actually solves the problem? Aren't we creating another problem instead?</p>
<p>Why would someone need an API to make coffee? Why is ordering a coffee via a ‘human-to-human’ or ‘human-to-machine’ interface inconvenient? Why have a ‘machine-to-machine’ interface at all?</p>
<ul>
<li>Possibly, we're solving knowledge and selection problems: to provide humans with full knowledge of what options they have right now and right here.</li>
<li>Possibly, we're optimizing waiting times: to save the time people waste waiting for their beverages.</li>
<li>Possibly, we're reducing the number of errors: to help people get exactly what they wanted to order, and to stop losing information in imprecise conversational communication or in dealing with unfamiliar coffee machine interfaces.</li>
<p>The ‘why’ question is the most important of all the questions you must ask yourself — and not only about global project goals, but also locally about every single piece of functionality. <strong>If you can't briefly and clearly answer the question ‘what is this entity needed for’, then it's not needed</strong>.</p>
<p>Here and throughout we assume, to make our example more complex and bizarre, that we are optimizing all three factors.</p>
<p>2. Do the problems we outlined really exist? Do we really observe unequal coffee machine utilization in the mornings? Do people really suffer from the inability to find the toffee nut latte they long for nearby? Do they really care about the minutes they spend standing in lines?</p>
<p>Do we actually have the resources to solve the problem? Do we have access to a sufficient number of coffee machines and users to ensure the system's efficiency?</p>
<p>In general, there are no simple answers to those questions. Ideally, you should give answers having all the relevant metrics measured: how much time is wasted exactly, and what numbers are we going to achieve provided we have such a coffee machine density? Let us also stress that in real life obtaining these numbers is only possible if you're entering a stable market. If you're trying to create something new, your only option is to rely on your intuition.</p>
<p>Since our book is dedicated not to software development per se, but to developing APIs, we should look at all those questions from a different angle: why does solving those problems specifically require an API, not simply a specialized software application? In terms of our fictional example, we should ask ourselves: why provide a service to developers, allowing for brewing coffee to end users, instead of just making an app?</p>
<p>In other words, there must be a solid reason to split two software development domains: there are the operators which provide APIs, and there are the operators which develop services for end users. Their interests differ to such an extent that coupling these two roles in one entity is undesirable. We will talk about the motivation to specifically provide APIs in more detail in Section III.</p>
<p>We should also note that you should try making an API when and only when you wrote ‘because that's our area of expertise’ in question 2. Developing APIs is a sort of meta-engineering: you're writing some software to allow other companies to develop software to solve users' problems. You must possess expertise in both domains (APIs and user products) to design your API well.</p>
<p>As for our speculative example, let us imagine that in the near future some tectonic shift has happened within the coffee brewing market. Two distinct player groups took shape: some companies provide the ‘hardware’, i.e. coffee machines; other companies have access to the customer audience. Something like the flights market: there are airlines, which actually transport passengers; and there are trip planning services where users choose between the trip variants the system generates for them. We're aggregating access to the hardware to allow app vendors to order freshly brewed coffee.</p>
<p>After finishing all these theoretical exercises, we should proceed right to designing and developing the API, having a decent understanding regarding two things:</p>
<ul>
<li><em>what</em> we're doing, exactly;</li>
<li><em>how</em> we're doing it, exactly.</li>
</ul>
<p>In our coffee case, we are:</p>
<ul>
<li>providing an API to services with a larger audience, so their users may order a cup of coffee in the most efficient and convenient manner;</li>
<li>abstracting access to coffee machine ‘hardware’ and delivering methods to select a beverage kind and some location to brew — and to make an order.</li>
</ul><div class="page-break"></div><h3><a href="#chapter-9" class="anchor" id="chapter-9">Chapter 9. Separating Abstraction Levels</a></h3><p>‘Separate abstraction levels in your code’ is possibly the most general advice for software developers. However, we don't think it would be a grave exaggeration to say that separating abstraction levels is also the most difficult task for API developers.</p>
<p>Before proceeding to the theory, we should formulate clearly <em>why</em> abstraction levels are so important, and what goals we're trying to achieve by separating them.</p>
<p>Let us remember that a software product is a medium connecting two distinct contexts, thus transforming terms and operations belonging to one subject area into another area's concepts. The more these areas differ, the more interim connecting links we have to introduce.</p>
<li>Each cup of coffee is being prepared according to some <code>recipe</code>, which implies the presence of different ingredients and sequences of preparation steps.</li>
<p>Every level presents a developer-facing ‘facet’ of our API. While elaborating the abstraction hierarchy, we are first of all trying to reduce the interconnectivity of different entities. That would help us to reach several goals.</p>
<p>Simplifying developers' work and the learning curve. At each moment of time a developer is operating only those entities which are necessary for the task they're solving right now. Conversely, badly designed isolation leads to situations when developers have to keep in mind lots of concepts mostly unrelated to the task being solved.</p>
</li>
<li>
<p>Preserving backwards compatibility. Properly separated abstraction levels allow for adding new functionality while keeping interfaces intact.</p>
</li>
<li>
<p>Maintaining interoperability. Properly isolated low-level abstractions help us adapt the API to different platforms and technologies without changing high-level entities.</p>
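<p>Returning to our coffee example, imagine a naive interface for the readiness-detection task: say, the recipe exposes a reference volume, and the order exposes the raw sensor readings (a hypothetical sketch):</p>
<pre><code>GET /v1/recipes/lungo
→
{ …, "volume": "200ml" }

GET /v1/orders/{id}
→
{ …, "volume_prepared": "120ml" }
</code></pre>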
<p>Then a developer just needs to compare two numbers to find out whether the order is ready.</p>
<p>This solution intuitively looks bad, and it really is: it violates all abovementioned principles.</p>
<p><strong>First</strong>, to solve the task ‘order a lungo’ a developer needs to refer to the ‘recipe’ entity and learn that every recipe has an associated volume. Then they need to embrace the concept that an order is ready at that particular moment when the prepared beverage volume becomes equal to the reference one. This concept is simply unguessable, and knowing it is mostly useless.</p>
<p><strong>Second</strong>, we will automatically have problems if we need to vary the beverage size. For example, if one day we decide to offer customers a choice of how many milliliters of lungo they desire exactly, then we will have to perform one of the following tricks.</p>
<p>Variant I: we fix the list of possible volumes and introduce bogus recipes like <code>/recipes/small-lungo</code> or <code>/recipes/large-lungo</code>. Why ‘bogus’? Because it's still the same lungo recipe: same ingredients, same preparation steps, only the volumes differ. We would have to start the mass production of recipes differing only in volume, or introduce some recipe ‘inheritance’ to be able to specify the ‘base’ recipe and just redefine the volume.</p>
<p>Variant II: we modify the interface, pronouncing the volumes stated in recipes to be just the default values. We allow requesting a different volume when placing an order:</p>
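<p>A sketch of such a modified request (the field names are invented for illustration):</p>
<pre><code>POST /v1/orders
{
  "coffee_machine_id",
  "recipe": "lungo",
  "volume": "800ml"
}
</code></pre>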
<p>For those orders with an arbitrary volume requested, a developer will need to obtain the requested volume not from <code>GET /v1/recipes</code>, but from <code>GET /v1/orders</code>. Doing so, we're getting a whole bunch of related problems:</p>
<li>there is a significant chance that developers will make mistakes in this functionality implementation, if they add arbitrary volume support in the code working with the <code>POST /v1/orders</code> handler, but forget to make corresponding changes in the order readiness check code;</li>
<li>the same field (coffee volume) now means different things in different interfaces. In the <code>GET /v1/recipes</code> context, the <code>volume</code> field means ‘a volume to be prepared if no arbitrary volume is specified in the <code>POST /v1/orders</code> request’; and it cannot be easily renamed to ‘default volume’ — we now have to live with that.</li>
<p><strong>Third</strong>, the entire scheme becomes totally inoperable if different types of coffee machines produce different volumes of lungo. To introduce the ‘lungo volume depends on machine type’ constraint we have to do quite a nasty thing: make recipes depend on coffee machine id. By doing so we start actively ‘stirring’ abstraction levels: one part of our API (the recipe endpoints) becomes unusable without explicit knowledge of another part (the coffee machines listing). What is even worse, developers will have to change the logic of their apps: previously it was possible to choose the volume first, then a coffee machine; now this flow must be rebuilt from scratch.</p>
<p>From user scenarios to their internal representation: high-level entities and their method nomenclature must directly reflect API usage scenarios; low-level entities reflect the decomposition of scenarios into smaller parts.</p>
<p>From user subject field terms to ‘raw’ data subject field terms — in our case from high-level terms like ‘order’, ‘recipe’, ‘café’ to low-level terms like ‘beverage temperature’, ‘coffee machine geographical coordinates’, etc.</p>
<p>Finally, from data structures suitable for end users to ‘raw’ data structures — in our case, from ‘lungo recipe’ and ‘"Chamomile" café chain’ to raw byte data stream from ‘Good Morning’ coffee machine sensors.</p>
<p>In our example with coffee readiness detection we clearly face the situation when we need an interim abstraction level:</p>
<ul>
<li>on the one hand, an ‘order’ should not store the data regarding coffee machine sensors;</li>
<li>on the other hand, a coffee machine should not store the data regarding order properties (and its API probably doesn't provide such functionality anyway).</li>
<p>A naïve approach to this situation is to design an interim abstraction level as a ‘connecting link’, which reformulates tasks from one abstraction level into another. For example, introduce a <code>task</code> entity like this:</p>
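<p>A possible sketch of such an entity (the fields are hypothetical):</p>
<pre><code>{
  …
  "task": {
    "id",
    "state": "executing",
    "progress": 50,
    "updated_at",
    "result": null
  }
}
</code></pre>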
<p>We call this approach ‘naïve’ not because it's wrong; on the contrary, that's quite a logical ‘default’ solution if you don't know yet (or don't understand yet) what your API will look like. The problem with this approach lies in its speculativeness: it doesn't reflect the subject area's organization.</p>
<p>An experienced developer in this case must ask: what options do exist? How should we really determine beverage readiness? If it turns out that comparing volumes <em>is</em> the only working method to tell whether the beverage is ready, then all the speculations above are wrong. You may safely include readiness-by-volume detection into your interfaces, since no other methods exist. Before abstracting something we need to learn what exactly we're abstracting.</p>
<p>In our example, let's assume that we have studied the coffee machines' API specs and learned that two device types exist:</p>
<li>coffee machines capable of executing programs coded in the firmware; the only customizable options are some beverage parameters, like desired volume, a syrup flavor and a kind of milk;</li>
<li>coffee machines with built-in functions, like ‘grind specified coffee volume’, ‘shed specified amount of water’, etc.; such coffee machines lack ‘preparation programs’, but provide access to commands and sensors.</li>
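<p>To be specific, let's sketch what these two API kinds might look like (the endpoints and fields below are invented for illustration):</p>
<pre><code>// The first kind:
// runs a specified program
// on a specified coffee machine
POST /execute
{
  "program": 1,
  "coffee_machine_id": "lx551bs"
}

// The second kind:
// issues a low-level command
POST /commands
{
  "command": "grind_coffee",
  "volume": "10g"
}
// and exposes raw sensor readings
GET /sensors
→
{ "cup_volume", "ground_coffee_volume", "shed_water_volume" }
</code></pre>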
<p><strong>NB</strong>. Just in case: this API violates a number of design principles, starting with a lack of versioning; it's described in such a manner for two reasons: (a) to demonstrate how to design a more convenient API, (b) because in real life you would really get something like that from vendors, and this API is quite sane, actually.</p>
<p><strong>NB</strong>. The example is intentionally factitious to model a situation described above: to determine beverage readiness you have to compare the requested volume with volume sensor readings.</p>
<p>Now the picture becomes more apparent: we need to abstract coffee machine API calls so that the ‘execution level’ in our API provides general functions (like beverage readiness detection) in a unified form. We should also note that these two coffee machine API kinds belong to different abstraction levels themselves: the first one provides a higher-level API than the second one. Therefore, a ‘branch’ of our API working with machines of the second kind will be more intricate.</p>
<p>The next step in separating abstraction levels is determining what functionality we're abstracting exactly. To do so, we need to understand the tasks developers solve at the ‘order’ level, and to learn what problems they would face if our interim level were missing.</p>
<li>Obviously, the developers desire to create an order uniformly: list high-level order properties (beverage kind, volume, and special options like syrup or milk type), and not think about how a specific coffee machine executes it.</li>
<li>Developers must be able to learn the execution state: is the order ready? If not — when to expect it to be ready (and whether there's any sense in waiting, in case of execution errors).</li>
<li>Developers need to address the order's location in space and time — to explain to users where and when they should pick the order up.</li>
<li>Finally, developers need to run atomic operations, like canceling orders.</li>
</ol>
<p>Note that the first API kind is much closer to developers' needs than the second one. An indivisible ‘program’ is a much more convenient concept than working with raw commands and sensor data. There are only two problems we see in the first API kind:</p>
<ul>
<li>the absence of an explicit ‘programs’ to ‘recipes’ relation; a program identifier is of no use to developers, actually, since there is a ‘recipe’ concept;</li>
<li>the absence of an explicit ‘ready’ status.</li>
</ul>
<p>But with the second API kind it's much worse. The main problem we foresee is the absence of ‘memory’ for actions being executed. The functions and sensors API is totally stateless, which means we don't even understand who called a function being currently executed, when, and which order it relates to.</p>
<p>So we need to introduce two abstraction levels.</p>
<p>Execution control level, which provides a uniform interface to indivisible programs. ‘Uniform interface’ means here that, regardless of a coffee machine's kind, developers may expect:</p>
<ul>
<li>statuses and other high-level execution parameters nomenclature (for example, estimated preparation time or possible execution error) being the same;</li>
<li>methods nomenclature (for example, order cancellation method) and their behavior being the same.</li>
</ul>
</li>
<li>
<p>Program runtime level. For the first API kind it will provide just a wrapper for the existing programs API; for the second API kind the entire ‘runtime’ concept is to be developed from scratch by us.</p>
<p>The <code>POST /orders</code> handler checks all order parameters, puts a hold of the corresponding sum on the user's credit card, forms a request to run, and calls the execution level. First, the correct execution program needs to be fetched:</p>
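<p>A sketch of such a call (a hypothetical signature):</p>
<pre><code>POST /v1/programs/match
{ "recipe", "coffee_machine" }
→
{ "program_id" }
</code></pre>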
<p>Please note that knowing the coffee machine API kind isn't required at all; that's why we're making abstractions! We could possibly make the interfaces more specific, implementing different <code>run</code> and <code>match</code> endpoints for different coffee machines:</p>
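<p>For example, like this (a hypothetical fragmentation we advise against):</p>
<pre><code>// matching and running programs
// for the first API kind
POST /v1/programs/first-kind/match
POST /v1/programs/first-kind/{id}/run
// ditto for the second API kind
POST /v1/programs/second-kind/match
POST /v1/programs/second-kind/{id}/run
</code></pre>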
<p>This approach has some benefits, like the possibility to provide different sets of parameters, specific to the API kind. But we see no need for such fragmentation. The <code>run</code> method handler is capable of extracting all the program metadata and performing one of two actions:</p>
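<p>Namely (a sketch, reusing the hypothetical physical APIs from above):</p>
<pre><code>// for the first API kind:
// call the physical API directly
POST /execute
{ "program", "coffee_machine_id" }

// for the second API kind:
// create a runtime to execute the program
POST /v1/runtimes
{ "coffee_machine", "program", "parameters" }
</code></pre>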
<p>Out of general considerations, the runtime level for the second API kind will be private, so we are more or less free in implementing it. The easiest solution would be to develop a virtual state machine which creates a ‘runtime’ (i.e. a stateful execution context) to run a program and control its state.</p>
<p><strong>NB</strong>: while implementing <code>orders</code> → <code>match</code> → <code>run</code> → <code>runtimes</code> call sequence we have two options:</p>
<li>either the <code>POST /orders</code> handler requests the data regarding the recipe, the coffee machine model, and the program on its own behalf, and forms a stateless request which contains all the necessary data (the API kind, the command sequence, etc.);</li>
<li>or the request contains only data identifiers, and the next handler in the chain will request the pieces of data it needs via some internal APIs.</li>
<p>A crucial quality of properly separated abstraction levels (and therefore a requirement to their design) is the level isolation restriction: <strong>only adjacent levels may interact</strong>. If ‘jumping over’ is needed in the API design, then clearly mistakes were made.</p>
<li>a user initiates a call to the <code>GET /v1/orders</code> method;</li>
<li>the <code>orders</code> handler completes operations on its level of responsibility (for example, checks user authorization), finds <code>program_run_id</code> identifier and performs a call to the <code>runs/{program_run_id}</code> endpoint;</li>
<li>the <code>runs</code> endpoint in its turn completes operations corresponding to its level (for example, checks the coffee machine API kind) and, depending on the API kind, proceeds with one of two possible execution branches:
<li>either calls the <code>GET /execution/status</code> method of a physical coffee machine API, gets the coffee volume and compares it to the reference value;</li>
<li>in case of the second API kind, the call chain continues: the <code>GET /runtimes</code> handler invokes the <code>GET /sensors</code> method of a physical coffee machine API and performs some manipulations with the data, like comparing the cup / ground coffee / shed water volumes with the reference ones, and changing the state and the status if needed.</li>
<p><strong>NB</strong>: The ‘call chain’ wording shouldn't be treated literally. Each abstraction level might be organized differently in a technical sense:</p>
<li>there might be a cache at each level, being updated upon receiving a callback or an event. In particular, a low-level runtime execution cycle obviously must be independent from the upper levels, renewing its state in the background without waiting for an explicit call.</li>
<p>Note what happens here: each abstraction level wields its own status (e.g. the order, runtime, and sensors statuses), formulated in the terms of the subject area corresponding to this level. Forbidding the ‘jumping over’ results in the necessity to spawn statuses at each level independently.</p>
<p>Let's now look at how the order cancel operation flows through our abstraction levels. In this case the call chain will look like this:</p>
<ul>
<li>a user initiates a call to the <code>POST /v1/orders/{id}/cancel</code> method;</li>
<li>the <code>orders/cancel</code> handler completes operations on its level of responsibility (for example, checks user authorization and settles the money refund question), finds the <code>program_run_id</code> identifier and calls the <code>runs/{program_run_id}/cancel</code> method;</li>
<li>the <code>runs/cancel</code> handler completes operations on its level of responsibility and, depending on the coffee machine API kind, proceeds with one of two possible execution branches:
<ul>
<li>either calls the cancel method of a physical coffee machine API (the first API kind),</li>
<li>or terminates the corresponding runtime (the second API kind).</li>
</ul>
</li>
</ul>
<p>Handling state-modifying operations like <code>cancel</code> requires more advanced abstraction level juggling skills compared to non-modifying calls like <code>GET /status</code>. There are two important points:</p>
<li>at the <code>orders</code> level this action in fact splits into several ‘cancels’ at other levels: you need to cancel the money hold and to cancel the order execution;</li>
<li>at the second API kind's physical level, a ‘cancel’ operation itself doesn't exist: ‘cancel’ means ‘executing the <code>discard_cup</code> command’, which is quite the same as any other command.
The interim API level is needed to make this transition between different levels' ‘cancels’ smooth and rational, without jumping over canyons.</li>
</ul>
</li>
<li>
<p>From a high-level point of view, canceling an order is a terminal action, since no further operations are possible. From a low-level point of view, the processing continues until the cup is discarded, and then the machine is to be unlocked (i.e. new runtime creation is allowed). It's the task of the execution control level to couple these two states: the outer one (the order is canceled) and the inner one (the execution continues).</p>
<p>It might look like forcing the abstraction levels isolation is redundant and makes interfaces more complicated. In fact, it does: it's very important to understand that flexibility, consistency, readability, and extensibility come with a price. One may construct an API with zero overhead, essentially just providing access to the coffee machine's microcontrollers. However, using such an API would be a disaster for a developer, not to mention the inability to extend it.</p>
<p>Separating abstraction levels is first of all a logical procedure: how we explain to ourselves and to developers what our API consists of. <strong>The abstraction gap between entities exists objectively</strong>, no matter what interfaces we design. Our task is just to separate this gap into levels <em>explicitly</em>. The more implicitly abstraction levels are separated (or worse — blended into each other), the more complicated your API's learning curve is, and the worse the code which uses it.</p>
<p>One useful exercise allowing us to examine the entire abstraction hierarchy is excluding all the particulars and constructing (on paper or just in your head) a data flow chart: what data flows through your API entities, and how it's altered at each step.</p>
<p>This exercise doesn't just help — it also allows us to design really large APIs with huge entity nomenclatures. Human memory isn't boundless; any project which grows extensively will eventually become too big to keep the entire entity hierarchy in mind. But it's usually possible to keep the data flow chart in mind, or at least a much larger portion of the hierarchy.</p>
<p>It starts with the sensors data, i.e. volumes of coffee / water / cups. This is the lowest data level we have, and here we can't change anything.</p>
</li>
<li>
<p>A continuous sensor data stream is transformed into discrete command execution statuses, injecting new concepts which don't exist within the subject area. A coffee machine API doesn't provide a ‘coffee is being shed’ or a ‘cup is being set’ notion. It's our software which treats the incoming sensor data and introduces new terms: if the volume of coffee or water is less than the target one, then the process isn't over yet. If the target value is reached, then this synthetic status is to be switched, and the next command executed.<br>
It is important to note that we don't simply calculate new variables out of the sensor data: we need to create a new dataset first — a context, an ‘execution program’ comprising a sequence of steps and conditions — and to fill it with initial values. If this context is missing, it's impossible to understand what's happening with the machine.</p>
</li>
<li>
<p>Having logical data about the program execution state, we can (again, via creating a new high-level data context) merge the two different data streams from the two different kinds of APIs into a single stream, which provides, in a unified form, the data regarding the execution of a beverage preparation program, with logical variables like the recipe, the volume, and the readiness status.</p>
<p>Each API abstraction level therefore corresponds to some data flow generalization and enrichment, converting low-level (and in fact useless to end users) context terms into the terms of a higher-level context.</p>
<p>At the order level we set its logical parameters: recipe, volume, execution place and possible statuses set.</p>
</li>
<li>
<p>At the execution level we read the order level data and create a lower level execution context: the program as a sequence of steps, their parameters, transition rules, and initial state.</p>
</li>
<li>
<p>At the runtime level we read the target parameters (which operation to execute, what the target volume is) and translate them into coffee machine API microcommands and statuses for each command.</p>
<p>Also, if we take a deeper look at the ‘bad’ decision discussed at the beginning of this chapter (forcing developers to determine the actual order status on their own), we can notice a data flow collision there:</p>
<li>on the one hand, ‘leaked’ physical data (the prepared beverage volume) is injected into the order context, therefore stirring the abstraction levels irreversibly;</li>
<li>on the other hand, the order context itself is deficient: it doesn't provide the new meta-variables non-existent at the lower levels (the order status, in particular), doesn't initialize them, and doesn't provide the game rules.</li>
<p>We will discuss data contexts in more detail in Section II. Here we will just state that data flows and their transformations might be, and must be, examined as a specific API facet which, on the one hand, helps us to separate abstraction levels properly and, on the other hand, to check whether our theoretical structures work as intended.</p><div class="page-break"></div><h3><a href="#chapter-10" class="anchor" id="chapter-10">Chapter 10. Isolating Responsibility Areas</a></h3>
<li>the user level (those entities users directly interact with, formulated in terms understandable to the user: orders, coffee recipes);</li>
<li>the program execution control level (the entities responsible for transforming orders into machine commands);</li>
<li>the runtime level for the second API kind (the entities describing the command execution state machine).</li>
<p>We are now to define each entity's responsibility area: what's the reasoning for keeping this entity within our API boundaries; what operations are applicable to the entity directly (and which are delegated to other objects). In fact, we are to apply the ‘why’-principle to every single API entity.</p>
<p>To do so, we must iterate over the entire API and formulate, in subject area terms, what every object is. Let us remind you that the abstraction levels concept implies that each level is some interim subject area per se — a step we take in the journey from describing the task in the terms of the first connected context (‘a lungo ordered by a user’) to the terms of the second connected context (‘a command performed by a coffee machine’).</p>
<li>A <code>recipe</code> describes an ‘ideal model’ of some coffee beverage type, its customer properties. A <code>recipe</code> is an immutable entity for us, which means we could only read it.</li>
<li>A <code>coffee-machine</code> is a model of a real-world device. From this model we must be able to retrieve the coffee machine's geographical location and the options it supports (to be discussed below).</li>
<li>A <code>program</code> describes some general execution plan for a coffee machine. Programs could only be read.</li>
<li>The program matcher <code>programs/matcher</code> is capable of coupling a <code>recipe</code> and a <code>program</code>, which in fact means ‘to retrieve a dataset needed to prepare a specific recipe on a specific coffee machine’.</li>
<p>If we look closely at the entities, we may notice that each entity turns out to be a composite. For example a <code>program</code> will operate high-level data (<code>recipe</code> and <code>coffee-machine</code>), enhancing them with its subject area terms (<code>program_run_id</code> for instance). This is totally fine: connecting contexts is what APIs do.</p>
<p>At this point, when our API is in general clearly outlined and drafted, we must put ourselves into the developer's shoes and try writing code. Our task is to look at the entity nomenclature and make some estimates regarding its future usage.</p>
<p>So, let us imagine we've got a task to write an app for ordering a coffee, based upon our API. What code would we write?</p>
<p>Obviously, the first step is offering a choice to the user, to make them point out what they want. And this very first step reveals that our API is quite inconvenient. There are no methods allowing for choosing something. A developer has to do something like this:</p>
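<p>For instance, like this (a sketch assuming the endpoints simply return full lists):</p>
<pre><code>// Retrieve all possible recipes
GET /v1/recipes
// Retrieve a list of all available coffee machines
GET /v1/coffee-machines
// The developer is then to join these datasets
// and build a spatial index over coffee machines
// on the client side — all by themselves
</code></pre>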
<p>As you see, developers are to write a lot of redundant code (to say nothing of the difficulties of implementing spatial indexes). Besides, if we take into consideration our Napoleonic plans to cover all the coffee machines in the world with our API, then we must admit that this algorithm is just a waste of resources on retrieving lists and indexing them.</p>
<p>The necessity of adding a new endpoint for searching becomes obvious. To design such an interface we must imagine ourselves being a UX designer and think about how an app could try to arouse users' interest. Two scenarios are evident:</p>
<li>display all cafes in the vicinity and the types of coffee they offer (a ‘service discovery’ scenario) — for new users or just users with no specific tastes;</li>
<li>display all the points in the vicinity where a specific beverage could be ordered — for users seeking a certain beverage type.</li>
</ul>
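<p>Then our new interface could look like this (a hypothetical sketch; the exact fields are invented for illustration):</p>
<pre><code>POST /v1/offers/search
{
  // optional, the desired beverage kinds
  "recipes": ["lungo", "americano"],
  "position": { "latitude", "longitude" },
  "sort_by": [ { "field": "distance" } ],
  "limit": 10
}
→
{
  "results": [
    { "coffee_machine", "place", "distance", "offer" }
  ],
  "cursor"
}
</code></pre>
<p>Here:</p>
<ul>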
<li>an <code>offer</code> — is a marketing bid: on what conditions a user could have the requested coffee beverage (if it was specified in the request), or some kind of marketing offering — prices for the most popular or interesting products (if no specific preference was set);</li>
<li>a <code>place</code> — is a spot (café, restaurant, street vending machine) where the coffee machine is located; we never introduced this entity before, but it's quite obvious that users need more convenient guidance to find a proper coffee machine than just geographical coordinates.</li>
<p><strong>NB</strong>. We could have enriched the existing <code>/coffee-machines</code> endpoint instead of adding a new one. This decision, however, looks less semantically viable: coupling different modes of listing entities — by relevance and by order — in one interface is usually a bad idea, because these two types of rankings imply different usage features and scenarios. Furthermore, enriching the search with ‘offers’ pulls this functionality out of the <code>coffee-machines</code> namespace: the fact of getting an offer to prepare a specific beverage under specific conditions is the key feature for users, the coffee machine being just a part of the offer.</p>
<p>Methods similar to the newly invented <code>offers/search</code> one are called <em>helpers</em>. The purpose of their existence is to generalize known API usage scenarios and facilitate implementing them. By ‘facilitating’ we mean not only reducing wordiness (getting rid of ‘boilerplate’ code), but also helping developers avoid common problems and mistakes.</p>
<p>For instance, let's consider the order price question. Our search function returns some ‘offers’ with prices. But the ‘price’ is volatile; coffee could cost less during ‘happy hours’, for example. Developers could make a mistake thrice while implementing this functionality:</p>
<ul>
<li>cache search results on a client device for too long (as a result, the displayed price will always be out of date);</li>
<li>contrary to the previous point, call the search method excessively just to actualize prices, thus overloading the network and the API servers;</li>
<li>create an order with an invalid price (therefore deceiving a user: displaying one sum and debiting another).</li>
</ul>
<p>To solve the third problem we could demand that the displayed price be included in the order creation request, and return an error if it differs from the actual one. (In fact, any API working with money <em>shall</em> do so.) But that doesn't help with the first two problems, and it degrades the user experience. Displaying the actual price is always much more convenient behavior than displaying errors upon pressing the ‘place an order’ button.</p>
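<p>A better approach is to give the price an explicit lifetime. A sketch, under the assumption that offers get identifiers and expiration dates (the field names are invented):</p>
<pre><code>{
  "offer": {
    "id",
    "price",
    "currency_code",
    // Date and time when the offer expires
    "valid_until"
  }
}
</code></pre>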
<p>Doing so, we're not only helping developers to grasp the concept of getting the relevant price, but also solving the UX task of telling users about ‘happy hours’.</p>
<p>As an alternative, we could split the endpoints: one for searching, another one for obtaining offers. This second endpoint would only be needed to actualize prices in the specified places.</p>
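<p>Now let's imagine that a developer failed to implement the price re-obtaining, and a user tries to create an order with an expired price. A sketch of the server's reply (the format is invented for illustration):</p>
<pre><code>POST /v1/orders
{ "offer_id", … }
→ 409 Conflict
{
  "message": "Invalid price"
}
</code></pre>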
<p>Formally speaking, this error response is enough: users get the ‘Invalid price’ message, and they have to repeat the order. But from a UX point of view that would be a horrible decision: the user hasn't made any mistakes, and this message isn't helpful at all.</p>
<p>The main rule of error interfaces in APIs is: an error response must help a client understand <em>what to do with the error</em>. Everything else is unimportant: if the error response were machine-readable, there would be no need for a user-readable message.</p>
<p>An error response content must address the following questions:</p>
<ul>
<li>Which party is the source of the problem: the client or the server?<br>
HTTP APIs traditionally employ <code>4xx</code> status codes to indicate client problems, <code>5xx</code> to indicate server problems (with the exception of a <code>404</code>, which is an uncertainty status).</li>
<li>If the error is caused by a client, is it resolvable, or not?<br>
The invalid price error is resolvable: the client could obtain a new price offer and create a new order with it. But if the error occurred because of a mistake in the client code, then eliminating the cause is impossible, and there is no need to make the user push the ‘place an order’ button again: this request will never succeed.<br>
<strong>NB</strong>: here and throughout we indicate resolvable problems with <code>409 Conflict</code> code, and unresolvable ones with <code>400 Bad Request</code>.</li>
<li>If the error is resolvable, then what kind of problem is it? Obviously, a client couldn't resolve a problem it's unaware of. For every resolvable problem some <em>code</em> must be written (re-obtaining the offer in our case), so a list of error descriptions must exist.</li>
<li>If the same kind of errors arise because of different parameters being invalid, then which parameter value is wrong exactly?</li>
<li>Finally, if some parameter value is unacceptable, then what values are acceptable?</li>
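<p>Putting it all together, our invalid-price error might look like this (a sketch; the reason codes are invented for illustration):</p>
<pre><code>POST /v1/orders
{ … }
→ 409 Conflict
{
  // Machine-readable error kind
  "reason": "wrong_offer",
  // Human-readable fallback message
  "localized_message": "Something went wrong. Try restarting the app.",
  "details": {
    // What exactly went wrong:
    // which checks have failed
    "checks_failed": [ "offer_lifetime_expired" ]
  }
}
</code></pre>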
<p>After getting this error, a client is to check the error's kind (‘some problem with the offer’), check the specific error reason (‘offer lifetime expired’), and send the offer retrieval request again. If the <code>checks_failed</code> field indicated another error reason (for example, the offer isn't bound to the specified user), client actions would be different (re-authorize the user, then get a new offer). If there were no error handler for this specific reason, a client would show the <code>localized_message</code> to the user and invoke the standard error recovery procedure.</p>
<p>It is also worth mentioning that unresolvable errors are useless to a user at the moment they occur (since the client couldn't react usefully to unknown errors), but it doesn't mean that providing extended error data is excessive. A developer will read it when fixing the error in the code. Also, check paragraphs 12 and 13 in the next chapter.</p>
<p>Out of our own API development experience, we can tell without any doubt that the greatest final interface design mistake (and the greatest developers' pain accordingly) is the excessive overloading of entities' interfaces with fields, methods, events, parameters, and other attributes.</p>
<p>Meanwhile, there is the ‘Golden Rule’ of interface design (applicable not only to APIs, but to almost anything): humans can comfortably keep 7±2 entities in short-term memory. Manipulating a larger number of chunks complicates things for most humans. The rule is also known as <a href="https://en.wikipedia.org/wiki/Working_memory#Capacity">‘Miller's law’</a>.</p>
<p>The only possible method of overcoming this law is decomposition. Entities should be grouped under a single designation at every concept level of the API, so that developers never have to operate more than 10 entities at a time.</p>
<p>Let's take a look at a simple example: what the coffee machine search function returns. To ensure an adequate UX of the app, quite bulky datasets are required.</p>
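<p>A flat search result might look like this (a hypothetical sketch with invented fields):</p>
<pre><code>{
  "results": [{
    "coffee_machine_id", "coffee_machine_type",
    "coffee_machine_brand",
    "place_name": "The Chamomile",
    "place_location_latitude", "place_location_longitude",
    "place_open_now", "working_hours",
    "walking_distance", "walking_time", "location_tip",
    "offers": [{
      "recipe", "recipe_name", "recipe_description",
      "volume", "offer_id", "offer_valid_until",
      "price": "19.00",
      "localized_price": "Just $19 for a large coffee cup",
      "currency_code", "estimated_waiting_time"
    }]
  }]
}
</code></pre>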
<p>This approach is quite normal, alas; it could be found in almost every API. As we see, the number of fields in the entities exceeds the recommended 7, and even 9. Fields are mixed into one single list, often with similar prefixes.</p>
<p>In this situation we are to split this structure into data domains: which fields are logically related to a single subject area? In our case we may identify at least 7 data clusters:</p>
<ul>
<li>data regarding the place where the coffee machine is located;</li>
<li>properties of the coffee machine itself;</li>
<li>route data;</li>
<li>recipe data;</li>
<li>recipe options specific to the order;</li>
<li>offer data;</li>
<li>pricing data.</li>
</ul>
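<p>The restructured result might then look like this (a sketch):</p>
<pre><code>{
  "results": [{
    "coffee_machine": { "id", "brand", "type" },
    "place": { "name", "location" },
    "route": { "distance", "duration", "location_tip" },
    "offer": { "id", "valid_until", "estimated_waiting_time" },
    "recipe": { "id", "name", "description" },
    "recipe_options": { "volume" },
    "pricing": { "currency_code", "price", "localized_price" }
  }]
}
</code></pre>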
<p>Such a decomposed API is much easier to read than a long sheet of different attributes. Furthermore, it's probably better to group even more entities in advance. For example, <code>place</code> and <code>route</code> could be joined in a single <code>location</code> structure, or <code>offer</code> and <code>pricing</code> might be combined into some generalized object.</p>
<p>It is important to say that readability is achieved not merely by grouping the entities. Decomposing must be performed in such a manner that a developer, while reading the interface, instantly understands: ‘here is the place description, of no interest to me right now, no need to traverse deeper’. If the data fields needed to complete some action are scattered all over different composites, the readability degrades rather than improves.</p>
<p>Proper decomposition also helps with extending and evolving the API. We'll discuss the subject in Section II.</p><div class="page-break"></div><h3><a href="#chapter-11" class="anchor" id="chapter-11">Chapter 11. Describing Final Interfaces</a></h3>
<p>When all entities, their responsibilities, and their relations to each other are defined, we proceed to the development of the API itself. We are to describe the nomenclature of objects, fields, methods, and functions in detail. In this chapter we're giving purely practical advice on making APIs usable and understandable.</p>
<p>Rules are not to be applied unconditionally. They do not make thinking redundant. Every rule has a rational reason to exist. If your situation doesn't justify following a rule — then you shouldn't do it.</p>
<p>For example, the demand for consistency in a specification exists to help developers spare time on reading docs. If you <em>need</em> developers to read some entity's docs, it is totally rational to make its signature deliberately inconsistent.</p>
<p>This idea applies to every concept listed below. If you get an unusable, bulky, unobvious API because you follow the rules, it's a motive to revise the rules (or the API).</p>
<p>It is important to understand that you can always introduce concepts of your own. For example, some frameworks willfully reject paired <code>set_entity</code> / <code>get_entity</code> methods in favor of a single <code>entity()</code> method with an optional argument. The crucial part is being systematic in applying the concept. If it's rendered into life, you must apply it to every single API method, or at the very least elaborate a naming rule to discern such polymorphic methods from regular ones.</p>
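<p><strong>Bad</strong> (a hypothetical illustration):</p>
<pre><code>// Cancels an order
GET /orders/cancellation
</code></pre>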
<p>It's quite a surprise that accessing the <code>cancellation</code> resource (what is it?) with non-modifying <code>GET</code> method actually cancels an order.</p>
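<p><strong>Better</strong>: <code>POST /orders/cancel</code> — the modifying nature of the operation is now explicit in the signature (again, a hypothetical endpoint).</p>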
<p>Even if the operation is non-modifying but computationally expensive, you should explicitly indicate that, especially if clients are charged for computational resource usage. Even more so, default values must not be set in a manner leading to maximum resource consumption.</p>
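<p><strong>Bad</strong> (a hypothetical counterpart to the example below):</p>
<pre><code>// Returns aggregated statistics
// for the entire order history
// unless a period is specified
GET /v1/orders/statistics
</code></pre>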
<p><strong>Better</strong>:</p>
<pre><code>// Returns aggregated statistics
// for a specified period of time
POST /v1/orders/statistics/aggregate
{ "begin_date", "end_date" }
</code></pre>
<p><strong>Try to design function signatures to be absolutely transparent about what the function does, what arguments it takes, and what the result is</strong>. While reading code working with your API, it must be easy to understand what it does without reading the docs.</p>
<p>Two important implications:</p>
<p><strong>1.1.</strong> If the operation is modifying, it must be obvious from the signature. In particular, there must be no modifying operations using the <code>GET</code> verb.</p>
<p><strong>1.2.</strong> If your API's nomenclature contains both synchronous and asynchronous operations, then (a)synchronicity must be apparent from signatures, <strong>or</strong> a naming convention must exist.</p>
<p>Regretfully, humanity is unable to agree on the most trivial things, like which day the week starts with, to say nothing of more sophisticated standards.</p>
<p>So <em>always</em> specify exactly which standard is applied. Exceptions are possible if you are 100% sure that only one standard for this entity exists in the world, and every person on Earth is totally aware of it.</p>
<p><strong>Bad</strong>: <code>"date": "11/12/2020"</code> — there are tons of date formatting standards; you can't even tell which number is the day and which is the month.</p>
<p><strong>Better</strong>: <code>"iso_date": "2020-11-12"</code>.</p>
<p>One particular implication of this rule is that money sums must <em>always</em> be accompanied by a currency code.</p>
<p>It is also worth saying that in some areas the situation with standards is so spoiled that whatever you do, someone will get upset. A ‘classical’ example is the geographical coordinates order (latitude-longitude vs longitude-latitude). Alas, the only working method of fighting frustration there is the ‘Serenity Notepad’ to be discussed in Section II.</p>
<p>If the protocol allows, fractional numbers with fixed precision (like money sums) must be represented as a specially designed type like Decimal or its equivalent.</p>
<p>If there is no Decimal type in the protocol (for instance, JSON doesn't have one), you should either use integers (e.g. apply a fixed multiplier) or strings.</p>
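<p>For example, a sketch of the integer approach (the field names are illustrative): prices are transmitted in minor currency units, i.e. with a fixed ×100 multiplier:</p>
<pre><code>{
// The price in cents:
// $5.50 multiplied by 100
"price": 550,
"currency_code": "USD"
}
</code></pre>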
<p>In the 21st century, there's no need to shorten entities' names.</p>
<p><strong>Bad</strong>: <code>order.time()</code> — unclear what time is actually returned: order creation time, order preparation time, order waiting time?…</p>
<p><strong>Bad</strong>:</p>
<pre><code>// Returns a pointer to the first occurrence
// in str1 of any of the characters
// that are part of str2
strpbrk (str1, str2)
</code></pre>
<p>Possibly, the author of this API thought that the <code>pbrk</code> abbreviation would mean something to readers; clearly mistaken. It's also hard to tell from the signature which string (<code>str1</code> or <code>str2</code>) stands for the character set.</p>
<p><strong>Better</strong>: <code>str_search_for_characters(lookup_character_set, str)</code><br>
— though it's highly disputable whether this function should exist at all; a feature-rich search function would be much more convenient. Also, shortening <code>string</code> to <code>str</code> bears no practical sense, regretfully being a routine in many subject areas.</p>
<p>A field named <code>recipe</code> must be of <code>Recipe</code> type. A field named <code>recipe_id</code> must contain a recipe identifier that could be found within the <code>Recipe</code> entity.</p>
<p>The same applies to primitive types. Arrays must be named in a plural form or as collective nouns, e.g. <code>objects</code>, <code>children</code>. If that's impossible, it's better to add a prefix or a postfix to avoid doubt.</p>
<p><strong>Bad</strong>: <code>GET /news</code> — unclear whether a specific news item is returned, or a list of them.</p>
<p>Similarly, if a Boolean value is expected, entity naming must describe some qualitative state, e.g. <code>is_ready</code>, <code>open_now</code>.</p>
<p>Specific platforms imply specific additions to this rule with regard to the first-class-citizen types they provide. For example, entities of <code>Date</code> type (if such a type is present) would benefit from being indicated with an <code>_at</code> or <code>_date</code> postfix, e.g. <code>created_at</code>, <code>occurred_at</code>.</p>
<p>If an entity's name is a polysemantic term itself, which could confuse developers, it's better to add an extra prefix or postfix to avoid misunderstanding.</p>
<p><strong>Bad</strong>:</p>
<pre><code>// Returns a list of coffee machine builtin functions
GET /coffee-machines/{id}/functions
</code></pre>
<p>The word ‘function’ is polysemantic. It could mean built-in functions, but also ‘a piece of code’, or a state (the machine is functioning).</p>
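<p><strong>Better</strong>: <code>GET /v1/coffee-machines/{id}/builtin-functions-list</code></p>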
<p><strong>Bad</strong>: <code>begin_transition</code> / <code>stop_transition</code><br>
— <code>begin</code> and <code>stop</code> don't match; developers will have to dig into the docs.</p>
<p><strong>Better</strong>: either <code>begin_transition</code> / <code>end_transition</code> or <code>start_transition</code> / <code>stop_transition</code>.</p>
<p><strong>Bad</strong>:</p>
<pre><code>// Find the position of the first occurrence
// of a substring in a string
strpos(haystack, needle)
</code></pre>
<pre><code>// Replace all occurrences
// of the search string with the replacement string
str_replace(needle, replace, haystack)
</code></pre>
<p>Several rules are violated:</p>
<ul>
<li>inconsistent underscore usage;</li>
<li>functionally close methods have a different <code>needle</code>/<code>haystack</code> argument order;</li>
<li>the first function finds only the first occurrence while the second one finds them all, and there is no way to deduce that fact from the function signatures.</li>
</ul>
<p>We're leaving the exercise of making these signatures better to the reader.</p>
<p>It's considered good form to use globally unique strings as entity identifiers, either semantic (e.g. "lungo" for beverage types) or random ones (e.g. <a href="https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)">UUID-4</a>). It might turn out to be extremely useful if you need to merge data from several sources under a single identifier.</p>
<p>In general, we tend to advise using urn-like identifiers, e.g. <code>urn:order:<uuid></code> (or just <code>order:<uuid></code>). That helps a lot in dealing with legacy systems with different identifiers attached to the same entity. Namespaces in urns help to quickly understand which identifier is used, and whether a usage mistake has been made.</p>
<p>One important implication: <strong>never use increasing numbers as external identifiers</strong>. Apart from the reasons mentioned above, it allows for counting how many entities of each type there are in the system. Your competitors will be able to calculate the precise number of orders you have each day, for example.</p>
<p><strong>NB</strong>: this book often uses short identifiers like "123" in code examples for the convenience of reading on small screens; do not replicate this practice in a real-world API.</p>
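<p><strong>Bad</strong>: a sketch of the pattern in question (the exact format is illustrative):</p>
<pre><code>// Creates an order and returns its id
POST /v1/orders
{ … }
→
{ "order_id" }
// Returns an order by its id;
// the order isn't confirmed yet,
// so the request returns an error
GET /v1/orders/{order_id}
→ 404 Not Found
</code></pre>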
<p>— though the operation looks to be executed successfully, the client must store the order id and recurrently check the <code>GET /v1/orders/{id}</code> state. This pattern is bad per se, but it gets even worse when we consider two cases:</p>
<ul>
<li>clients might lose the id if a system failure happened between sending the request and getting the response, or if the app data storage was damaged or cleansed;</li>
<li>customers can't use another device; in fact, the knowledge of the order being created is bound to a specific user agent.</li>
</ul>
<p>In both cases customers might consider the order creation failed and make a duplicate order, with all the consequences to be blamed on you.</p>
<p><strong>Bad</strong>: <code>"dont_call_me": false</code><br>
— humans are bad at perceiving double negation and make mistakes.</p>
<p><strong>Better</strong>: <code>"prohibit_calling": true</code> or <code>"avoid_calling": true</code><br>
— it's easier to read, though you shouldn't deceive yourself. Avoid semantic double negations, even if you've found a ‘negative’ word without a ‘negative’ prefix.</p>
<p>It is also worth mentioning that making mistakes in applying <a href="https://en.wikipedia.org/wiki/De_Morgan's_laws">de Morgan's laws</a> is even easier. For example, if you have two flags:</p>
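<pre><code>{
"has_beans": true,
"has_cup": true
}
</code></pre>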
<p>‘Coffee might be prepared’ condition would look like <code>has_beans && has_cup</code> — both flags must be true. However, if you provide the negations of both flags:</p>
<pre><code>{
"beans_absence": false,
"cup_absence": false
}
</code></pre>
<p>— then developers will have to evaluate one of <code>!beans_absence && !cup_absence</code> ⇔ <code>!(beans_absence || cup_absence)</code> conditions, and in this transition people tend to make mistakes. Avoiding double negations helps little, and regretfully only general advice could be given: avoid the situations, when developers have to evaluate such flags.</p>
<p>Ironically, this advice is the opposite of the previous one. When developing APIs you frequently need to add a new optional field with a non-empty default value. For example:</p>
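<pre><code>POST /v1/orders
{}
→
{
"contactless_delivery": true
}
</code></pre>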
<p>The new <code>contactless_delivery</code> option isn't required, but its default value is <code>true</code>. A question arises: how should developers discern the explicit intention to abolish the option (<code>false</code>) from not knowing it exists (the field isn't set)? They have to write something like:</p>
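<pre><code>if (Type(order.contactless_delivery) == 'Boolean' &&
order.contactless_delivery == false) { … }
</code></pre>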
<p>This practice makes the code more complicated, and it's quite easy to make mistakes which will effectively treat the field in the opposite manner. The same could happen if special values (e.g. <code>null</code> or <code>-1</code>) are used to denote the absence of a value.</p>
<p>The universal rule for dealing with such situations is to make all new Boolean flags false by default.</p>
<p><strong>Better</strong>:</p>
<pre><code>POST /v1/orders
{}
→
{
"force_contact_delivery": false
}
</code></pre>
<p>If a non-Boolean field with specially treated value absence is to be introduced, then introduce two fields.</p>
<p><strong>Bad</strong>:</p>
<pre><code>// Creates a user
POST /users
{ … }
→
// Users are created with a monthly
// spending limit set by default
{
…
"spending_monthly_limit_usd": "100"
}
// To cancel the limit null value is used
POST /users
{
…
"spending_monthly_limit_usd": null
}
</code></pre>
<p><strong>Better</strong>:</p>
<pre><code>POST /users
{
// true — user explicitly cancels
// monthly spending limit
// false — limit isn't canceled
// (default value)
"abolish_spending_limit": false,
// Non-required field
// Only present if the previous flag
// is set to false
"spending_monthly_limit_usd": "100",
…
}
</code></pre>
<p><strong>NB</strong>: the contradiction with the previous rule lies in the necessity of introducing ‘negative’ flags (the ‘no limit’ flag), which we had to rename to <code>abolish_spending_limit</code>. Though it's a decent name for a negative flag, its semantics are still unobvious, and developers will have to read the docs. That's the way.</p>
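<p><strong>Bad</strong>: a sketch of the partial update pattern (the exact format is illustrative):</p>
<pre><code>// Partially rewrites the order:
// updates the delivery address only
PATCH /v1/orders/123
{ "delivery_address" }
→
{ "delivery_address" }
</code></pre>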
<p>— this approach is usually chosen to lessen request and response body sizes, plus it allows to implement collaborative editing cheaply. Both these advantages are imaginary.</p>
<p><strong>First</strong>, sparing bytes on semantic data is seldom needed in modern apps. Network packet sizes (the MTU, Maximum Transmission Unit) are more than a kilobyte right now; shortening responses is useless while they're less than a kilobyte.</p>
<p>Excessive network traffic usually occurs if:</p>
<ul>
<li>no data pagination is provided;</li>
<li>no limits on field values are set;</li>
<li>binary data is transmitted (graphics, audio, video, etc.)</li>
</ul>
<p>Transferring only a subset of fields solves none of these problems; in the best case, it just masks them. A more viable approach comprises:</p>
<ul>
<li>making separate endpoints for ‘heavy’ data;</li>
<li>introducing pagination and field value length limits;</li>
<li>stopping saving bytes in all other cases.</li>
</ul>
<p><strong>Second</strong>, shortening response sizes will backfire exactly by spoiling collaborative editing: one client won't see the changes the other client has made. Generally speaking, in 9 cases out of 10 it is better to return the full entity state from any modifying operation, sharing the format with the read-access endpoint. Actually, you should always do this unless response size affects performance.</p>
<p><strong>Third</strong>, this approach might work if you need to rewrite a field's value. But how to unset the field, returning it to the default state? For example, how to <em>remove</em> <code>client_phone_number_ext</code>?</p>
<p>In such cases, special values like <code>null</code> are often used. But as we discussed above, this is a defective practice. Another option is to prohibit non-required fields, but that would pose considerable obstacles in the way of expanding the API.</p>
<p><strong>Better</strong>: one of the following two strategies might be used.</p>
<p><strong>Option #1</strong>: splitting the endpoints. Editable fields are grouped and taken out as separate endpoints. This approach also matches well with <a href="#chapter-10">the decomposition principle</a> we discussed in the previous chapter.</p>
<pre><code>// Return the order state
// by its id
GET /v1/orders/123
→
{
"order_id",
"delivery_details": {
"address"
},
"client_details": {
"phone_number",
"phone_number_ext"
},
"updated_at"
}
// Fully rewrite order delivery options
PUT /v1/orders/123/delivery-details
{ "address" }
// Fully rewrite order customer data
PUT /v1/orders/123/client-details
{ "phone_number" }
</code></pre>
<p>Omitting <code>client_phone_number_ext</code> in the <code>PUT client-details</code> request would be sufficient to remove it. This approach also helps to separate constant and calculated fields (<code>order_id</code> and <code>updated_at</code>) from editable ones, thus getting rid of ambiguous situations (what happens if a client tries to rewrite the <code>updated_at</code> field?). You may also return the entire <code>order</code> entity from <code>PUT</code> endpoints (however, there should be some naming convention for that).</p>
<p><strong>Option #2</strong>: design a format for atomic changes.</p>
<pre><code>POST /v1/orders/123/changes
X-Idempotency-Token: <see next paragraph>
{
"changes": [{
"type": "set",
"field": "delivery_address",
"value": <new value>
}, {
"type": "unset",
"field": "client_phone_number_ext"
}]
}
</code></pre>
<p>This approach is much harder to implement, but it's the only viable method of implementing collaborative editing, since it explicitly reflects what a user was actually doing with the entity representation. With data exposed in such a format you might actually implement offline editing, when user changes are accumulated and then sent at once, while the server automatically resolves conflicts by ‘rebasing’ the changes.</p>
<p>Let us recall that idempotency is the following property: repeated calls to the same function with the same parameters don't change the resource state. Since we're discussing client-server interaction in the first place, repeating a request in case of a network failure isn't an exception, but a normal situation.</p>
<p>If an endpoint's idempotency can't be assured naturally, explicit idempotency parameters must be added, in the form of either a token or a resource version.</p>
<p><strong>Bad</strong>:</p>
<pre><code>// Creates an order
POST /orders
</code></pre>
<p>A second order will be produced if the request is repeated!</p>
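<p><strong>Better</strong>:</p>
<pre><code>// Creates an order
POST /v1/orders
X-Idempotency-Token: <random string>
</code></pre>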
<p>A client, on its side, must retain the <code>X-Idempotency-Token</code> in case of automated endpoint retrying. A server, on its side, must check whether an order created with this token already exists.</p>
<p><strong>An alternative</strong>:</p>
<pre><code>// Creates order draft
POST /v1/orders/drafts
→
{ "draft_id" }
</code></pre>
<pre><code>// Confirms the draft
PUT /v1/orders/drafts/{draft_id}
{ "confirmed": true }
</code></pre>
<p>Creating order drafts is a non-binding operation since it doesn't entail any consequences, so it's fine to create drafts without an idempotency token.</p>
<p>It is also worth mentioning that adding idempotency tokens to naturally idempotent handlers isn't meaningless either, since it allows distinguishing between two situations:</p>
<ul>
<li>a client didn't get the response because of some network issues, and is now repeating the request;</li>
<li>a client is mistaken, trying to make conflicting changes.</li>
</ul>
<p>Consider the following example: imagine there is a shared resource, characterized by a revision number, and a client tries updating it.</p>
<pre><code>POST /resource/updates
{
"resource_revision": 123
"updates"
}
</code></pre>
<p>The server retrieves the actual resource revision and finds it to be 124. How to respond correctly? <code>409 Conflict</code> might be returned, but then the client will be forced to understand the nature of the conflict and somehow resolve it, potentially confusing the user. It's also unwise to fragment the conflict-resolving algorithms, allowing each client to implement one independently.</p>
<p>The server may compare request bodies, assuming that identical <code>updates</code> values mean retrying, but this assumption might be dangerously wrong (for example, if the resource is a counter of some kind, then repeating identical requests is routine).</p>
<p>Adding an idempotency token (either directly as a random string, or indirectly in the form of drafts) solves this problem.</p>
<p>Furthermore, adding idempotency tokens not only resolves the issue but also makes advanced optimizations possible. If the server detects an access conflict, it could try to resolve it, ‘rebasing’ the update like modern version control systems do, and return <code>200 OK</code> instead of <code>409 Conflict</code>. This logic dramatically improves the user experience, being fully backwards-compatible, and helps to avoid the fragmentation of conflict-resolving code.</p>
<p>Also, be warned: clients are bad at implementing idempotency tokens. Two problems are common:</p>
<ul>
<li>you can't really expect clients to generate truly random tokens — they may share the same seed or simply use weak algorithms or entropy sources; therefore you must put constraints on token checking: a token must be unique to a specific user and resource, not globally;</li>
<li>clients tend to misunderstand the concept and either generate new tokens each time they repeat the request (which deteriorates the UX, but is otherwise harmless) or, conversely, use one token in several requests (not harmless at all and could lead to catastrophic disasters; another reason to implement the suggestion in the previous clause); writing detailed docs and/or a client library is highly recommended.</li>
</ul>
<p>There is a common problem with implementing the changes list approach: what to do if some changes were successfully applied while others were not? The rule is simple: if you may ensure atomicity (i.e. either apply all changes or none of them) — do it.</p>
<p><strong>Bad</strong>:</p>
<pre><code>// Returns a list of recipes
GET /v1/recipes
→
{
"recipes": [{
"id": "lungo",
"volume": "200ml"
}, {
"id": "latte",
"volume": "300ml"
}]
}
// Changes recipes' parameters
PATCH /v1/recipes
{
"changes": [{
"id": "lungo",
"volume": "300ml"
}, {
"id": "latte",
"volume": "-1ml"
}]
}
→ 400 Bad Request
// Re-reading the list
GET /v1/recipes
→
{
"recipes": [{
"id": "lungo",
// This value changed
"volume": "300ml"
}, {
"id": "latte",
// and this did not
"volume": "300ml"
}]
}
</code></pre>
<p>— there is no way for the client to learn that the failed operation was actually partially applied. Even if there is an indication of this fact in the response, the client still cannot tell whether the lungo volume changed because of the request, or if some other client changed it.</p>
<p>If you can't guarantee the atomicity of an operation, you should elaborate in detail on how to deal with it. There must be a separate status for each individual change.</p>
<p><strong>Better</strong>:</p>
<pre><code>PATCH /v1/recipes
{
"changes": [{
"recipe_id": "lungo",
"volume": "300ml"
}, {
"recipe_id": "latte",
"volume": "-1ml"
}]
}
// You may actually return
// a ‘partial success’ status
// if the protocol allows it
→ 200 OK
{
"changes": [{
"change_id",
"occurred_at",
"recipe_id": "lungo",
"status": "success"
}, {
"change_id",
"occurred_at",
"recipe_id": "latte",
"status": "fail",
"error"
}]
}
</code></pre>
<p>Here:</p>
<ul>
<li><code>change_id</code> is a unique identifier of each atomic change;</li>
<li><code>occurred_at</code> is a moment of time when the change was actually applied;</li>
<li><code>error</code> field contains the error data related to the specific change.</li>
</ul>
<p>The following might be of use:</p>
<ul>
<li>introducing <code>sequence_id</code> parameters in the request to guarantee the execution order and to align the item order in the response with the requested one;</li>
<li>exposing a separate <code>/changes-history</code> endpoint for clients to get the history of applied changes even if the app crashed while getting the partial success response, or there was a network timeout.</li>
</ul>
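<p>A sketch of what such a history endpoint might return (the query parameter is illustrative; the format mirrors the per-change statuses above):</p>
<pre><code>// Returns the history of changes
// applied to the recipe list
GET /v1/recipes/changes-history?since=<change_id>
→
{
"changes": [{
"change_id",
"occurred_at",
"recipe_id",
"status",
"error"
}]
}
</code></pre>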
<p>Non-atomic changes are undesirable because they erode the idempotency concept. Let's take a look at the example:</p>
<pre><code>PATCH /v1/recipes
{
"idempotency_token",
"changes": [{
"recipe_id": "lungo",
"volume": "300ml"
}, {
"recipe_id": "latte",
"volume": "400ml"
}]
}
→ 200 OK
{
"changes": [{
…
"status": "success"
}, {
…
"status": "fail",
"error": {
"reason": "too_many_requests"
}
}]
}
</code></pre>
<p>Imagine the client failed to get a response because of a network error, and it repeats the request:</p>
<pre><code>PATCH /v1/recipes
{
"idempotency_token",
"changes": [{
"recipe_id": "lungo",
"volume": "300ml"
}, {
"recipe_id": "latte",
"volume": "400ml"
}]
}
→ 200 OK
{
"changes": [{
…
"status": "success"
}, {
…
"status": "success",
}]
}
</code></pre>
<p>To the client, everything looks normal: the changes were applied, and the last response received is always the actual one. But the resource state after the first request was inherently different from the resource state after the second one, which contradicts the very definition of ‘idempotency’.</p>
<p>It would be more correct if the server did nothing upon getting the second request with the same idempotency token, and returned the same status list breakdown. But that implies that storing these breakdowns must be implemented as well.</p>
<p>Just in case: nested operations must be idempotent themselves. If they are not, separate idempotency tokens must be generated for each nested operation.</p>
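<p>A minimal sketch of the latter approach, reusing the changes format from the examples above:</p>
<pre><code>PATCH /v1/recipes
{
"idempotency_token",
"changes": [{
// A separate token is generated
// for each non-idempotent
// nested operation
"idempotency_token",
"recipe_id": "lungo",
"volume": "300ml"
}]
}
</code></pre>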
<p>Client-server interaction usually implies that network and server resources are limited; therefore caching operation results on client devices is a standard practice.</p>
<p>So it's highly desirable to make caching options clear, if not from the function signatures then at least from the docs.</p>
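<p><strong>Bad</strong>:</p>
<pre><code>// Returns the lungo price in cafes
// closest to the specified location
GET /price?recipe=lungo&longitude={longitude}&latitude={latitude}
→
{ "currency_code", "price" }
</code></pre>
<p>Two questions arise:</p>
<ul>
<li>until when the price is valid?</li>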
<li>in what vicinity of the location the price is valid?</li>
</ul>
<p><strong>Better</strong>: you may use standard protocol capabilities to denote cache options, like the <code>Cache-Control</code> header. If you need caching in both the temporal and spatial dimensions, you should do something like this:</p>
<pre><code>// Returns an offer: for what money sum
// our service commits to make a lungo
GET /price?recipe=lungo&longitude={longitude}&latitude={latitude}
→
{
"offer": {
"id",
"currency_code",
"price",
"conditions": {
// Until when the price is valid
"valid_until",
// What vicinity the price is valid within:
// * city
// * geographical object
// * …
"valid_within"
}
}
}
</code></pre>
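<p>Pagination deserves a separate mention. Consider a classical scheme (a sketch the following discussion assumes):</p>
<pre><code>// Returns a limited number of records
// sorted by creation date,
// starting with the record number `offset`
GET /v1/records?limit=10&offset=100
</code></pre>
<p>At first glance, this is the most standard way of organizing pagination. But let's ask ourselves: if the records list is a queue the client must process, how would the client learn that new records have appeared at its top?</p>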
<p>Obviously, a client could only retry the initial request (<code>offset=0</code>) and compare identifiers to those it already knows. But what if the number of new records exceeds the <code>limit</code>? Imagine the situation:</p>
<ul>
<li>some problem occurred, and a batch of new records awaits processing;</li>
<li>the client requests new records (<code>offset=0</code>) but can't find any known records on the first page;</li>
<li>the client continues iterating over records, page by page, until it finds the last known identifier; all this time the order processing is idle;</li>
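<li>the client might never catch up with the queue, being preoccupied with chaotic page requests trying to restore the record order.</li>
</ul>
<p><strong>Better</strong>: organize the data so that it could be traversed in one direction, relative to an explicitly provided record (the parameter names below match the following paragraph):</p>
<pre><code>// Returns records created earlier
// than the record with the given id
GET /v1/records?older_than={record_id}&limit=10
// Returns records created later
// than the record with the given id
GET /v1/records?newer_than={record_id}&limit=10
</code></pre>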
<p>With the pagination organized like that, clients never bother about records being added or removed in the processed part of the list: they continue to iterate over the records, either getting new ones (using <code>newer_than</code>) or older ones (using <code>older_than</code>). If there is no record removal operation, clients may easily cache responses — the URL will always return the same record set.</p>
<p>Another way to organize such lists is returning a <code>cursor</code> to be used instead of <code>record_id</code>, making interfaces more versatile.</p>
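<p>For example:</p>
<pre><code>// Initial data request
POST /v1/records/list
{
// Some additional filtering options
"filter": {
"category": "some_category"
}
}
→
{ "cursor" }
// Follow-up requests
GET /v1/records?cursor=<cursor value>
→
{ "records", "cursor" }
</code></pre>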
<p>One advantage of this approach is the possibility to keep the initial request parameters (i.e. the <code>filter</code> in our example) embedded into the cursor itself, thus not copying them in follow-up requests. It might be especially useful if the initial request prepares the full dataset, for example, by moving it from ‘cold’ storage to a ‘hot’ one (then the <code>cursor</code> might simply contain the encoded dataset id and the offset).</p>
<p>There are several approaches to implementing cursors (for example, making a single endpoint for initial and follow-up requests, and returning the first data portion in the first response). As usual, the crucial part is maintaining consistency across all such endpoints.</p>
<p><strong>NB</strong>: some sources discourage this approach because in this case the user can't see the list of all pages and can't choose an arbitrary one. We should note here that:</p>
<ul>
<li>such a case (a pages list and page selection) exists if we deal with user interfaces; we could hardly imagine a <em>program</em> interface which needs to provide access to random data pages;</li>
<li>if we are still talking about an API to some application which has a ‘paging’ user control, then a proper approach would be to prepare the ‘paging’ data on the server side, including generating links to pages;</li>
<li>a cursor-based solution doesn't prohibit using <code>offset</code>/<code>limit</code>; nothing could stop us from creating a dual interface which might serve both <code>GET /items?cursor=…</code> and <code>GET /items?offset=…&limit=…</code> requests;</li>
<li>finally, if there is a necessity to provide access to arbitrary pages in the user interface, we should ask ourselves which problem is being solved that way; probably, users use this functionality to find something: a specific element on the list, or the position they stopped at the last time they worked with the list; probably, we should provide more convenient controls to solve those tasks than accessing data pages by their indexes.</li>
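</ul>
<p><strong>Bad</strong>:</p>
<pre><code>// Returns a limited number of records
// sorted by modification date,
// starting with the record number `offset`
GET /v1/records?sort_by=date_modified&sort_order=desc&limit=10&offset=100
</code></pre>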
<p>Sorting by the date of modification usually means that the data might be modified. In other words, some records might change after the first data chunk is returned, but before the next chunk is requested. A modified record will simply disappear from the listing because of moving to the first page, so clients will never get the records that were changed during the iteration process, even if the <code>cursor</code> scheme is implemented, and they will never learn the sheer fact of the omission. Also, this particular interface isn't extensible, as there is no way to add sorting by two or more fields.</p>
<p><strong>Better</strong>: there is no general solution to this problem in this formulation. Listing records by modification time will always be unpredictably volatile, so we have to change the approach itself; we have two options.</p>
<p><strong>Option one</strong>: fix the record order at the moment of the initial request, i.e. the server produces the entire list and stores it in an immutable form:</p>
<pre><code>// Creates a view based on the parameters passed
POST /v1/record-views
{
"sort_by": [
{ "field": "date_modified", "order": "desc" }
]
}
→
{ "id", "cursor" }
</code></pre>
<pre><code>// Returns a portion of the view
GET /v1/record-views/{id}?cursor={cursor}
</code></pre>
<p>Since the produced view is immutable, access to it might be organized in any form, including a limit-offset scheme, cursors, the <code>Range</code> header, etc. However, there is a downside: records modified after the view was generated will be either misplaced or shown in their outdated versions.</p>
<p><strong>Option two</strong>: guarantee a strict records order, for example, by introducing a concept of record change events:</p>
<pre><code>POST /v1/records/modified/list
{
// Optional
"cursor"
}
→
{
"modified": [
{ "date", "record_id" }
],
"cursor"
}
</code></pre>
<p>This scheme's downsides are the necessity to create a separate indexed event storage, and the multiplication of data items, since a single record might produce many events.</p>
<p>While writing code, developers face problems, many of them quite trivial, like an invalid parameter type or some boundary violation. The more convenient the error responses your API returns are, the less time developers waste struggling with them, and the more comfortable working with the API is.</p>
<p><strong>Bad</strong>:</p>
<pre><code>POST /v1/coffee-machines/search
{
"recipes": ["lngo"],
"position": {
"latitude": 110,
"longitude": 55
}
}
→ 400 Bad Request
{}
</code></pre>
<p>— of course, the mistakes (the typo in <code>"lngo"</code> and the wrong coordinates) are obvious. But the handler checks them anyway; why not return readable descriptions?</p>
<p><strong>Better</strong>:</p>
<pre><code>{
"reason": "wrong_parameter_value",
"localized_message":
"Something is wrong. Contact the developer of the app.",
"details": {
"checks_failed": [
{
"field": "recipe",
"error_type": "wrong_value",
"message":
"Unknown value: 'lngo'. Did you mean 'lungo'?"
},
{
"field": "position.latitude",
"error_type": "constraint_violation",
"constraints": {
"min": -90,
"max": 90
},
"message":
"'position.latitude' value must fall within the [-90, 90] interval"
}
]
}
}
</code></pre>
<p>It is also important to return errors in a proper sequence.</p>
<p><strong>First</strong>, always return unresolvable errors before the resolvable ones:</p>
<pre><code>POST /v1/orders
{
"recipe": "lngo",
"offer"
}
→ 409 Conflict
{
"reason": "offer_expired"
}
// Request repeats
// with the renewed offer
POST /v1/orders
{
"recipe": "lngo",
"offer"
}
→ 400 Bad Request
{
"reason": "recipe_unknown"
}
</code></pre>
<p>— what was the point of renewing the offer if the order cannot be created anyway?</p>
<p><strong>Second</strong>, maintain the sequence of unresolvable errors which leads to the minimal amount of customers' and developers' irritation.</p>
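<p><strong>Bad</strong>: for example (the exact fields are illustrative):</p>
<pre><code>POST /v1/orders
{ … }
→ 409 Conflict
{
// Error: the price has changed;
// the client shows a dialog and
// repeats the request with
// the updated price
"reason": "price_changed"
}
POST /v1/orders
{ … }
→ 409 Conflict
{
// Error: the number of concurrent
// orders per customer is exceeded
"reason": "order_limit_exceeded"
}
</code></pre>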
<p>— what was the point of showing the ‘price changed’ dialog if the user still can't make an order even with the right price? When one of the concurrent orders finishes and the user is able to commit another one, prices, item availability, and other order parameters will likely need to be corrected once again.</p>
<p><strong>Third</strong>, draw a graph: which error resolution might lead to the emergence of another one? Otherwise, you might eventually return the same error several times, or worse, make a cycle of errors.</p>
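<p>For example (a sketch of such a cycle, with illustrative fields), imagine the minimal order sum being checked after discounts are applied:</p>
<pre><code>POST /v1/orders
{ "items": [ … ] }
→ 409 Conflict
{
// Error: the order sum is
// less than the minimal one
"reason": "below_minimal_sum",
"currency_code": "USD",
"minimal_sum": "20.00"
}
// The user adds one more item
// and repeats the request
POST /v1/orders
{ "items": [ …, … ] }
→ 409 Conflict
{
// A volume discount was applied,
// and the total sum is again
// below the minimal one
"reason": "below_minimal_sum"
}
</code></pre>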
<p>You may note that in this setup the error can't be resolved in one step: this situation must be elaborated over, and either the order calculation parameters must be changed (discounts should not be counted against the minimal order sum), or a special type of error must be introduced.</p>
<p>If a server processed a request correctly and no exceptional situation occurred, there must be no error. Regretfully, the antipattern of throwing errors when zero results are found is widespread.</p>
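<p><strong>Bad</strong>:</p>
<pre><code>POST /search
{
"query": "lungo",
"location": <customer's location>
}
→ 404 Not Found
{
"localized_message":
"No one makes lungo nearby"
}
</code></pre>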
<p><code>4xx</code> statuses imply that a client made a mistake. But no mistakes were made by either the customer or the developer: a client cannot know beforehand whether lungo is served at this location.</p>
<p><strong>Better</strong>:</p>
<pre><code>POST /search
{
"query": "lungo",
"location": <customer's location>
}
→ 200 OK
{
"results": []
}
</code></pre>
<p>This rule might be reduced to the following: if an array is the result of the operation, then the emptiness of that array is not a mistake, but a correct response. (Of course, this applies if an empty array is acceptable semantically; an empty array of coordinates, for example, is certainly a mistake.)</p>
<p>All endpoints must accept language parameters (for example, in the form of the <code>Accept-Language</code> header), even if they are not being used currently.</p>
<p>It is important to understand that the user's language and the user's jurisdiction are different things. Your API working cycle must always store the user's location. It might be stated either explicitly (requests contain geographical coordinates) or implicitly (the initial location-bound request initiates the creation of a session which stores the location), but no correct localization is possible in the absence of location data. In most cases, reducing the location to just a country code is enough.</p>
<p>The thing is that lots of parameters potentially affecting data formats depend not on the language, but on the user's location. To name a few: number formatting (the integer and fractional part delimiter, the digit group delimiter), date formatting, the first day of the week, keyboard layouts, measurement unit systems (which might be non-decimal!), etc. In some situations, you need to store two locations: the user's residence location and the user's ‘viewport’. For example, if a US citizen is planning a European trip, it's convenient to show prices in the local currency, but measure distances in miles and feet.</p>
<p>Sometimes explicit location passing is not enough since there are lots of territorial conflicts in the world. How the API should behave when user coordinates lie within disputed regions is a legal matter, regretfully. The author of this book once had to implement a ‘state A territory according to the state B official position’ concept.</p>
<p><strong>Important</strong>: mark the difference between localization for end users and localization for developers. Take a look at the error format example above: the <code>localized_message</code> field is meant for the user; the app should show it if no specific handler for this error exists in the code. This message must be written in the user's language and formatted according to the user's location. But <code>details.checks_failed[].message</code> is meant to be read by developers examining the problem. So it must be written and formatted in a manner which suits developers best. In the software development world, it usually means ‘in English’.</p>
<p>It is worth mentioning that the <code>localized_</code> prefix in the example is used to differentiate messages to users from messages to developers. A concept like that must, of course, be explicitly stated in your API docs.</p>
<p>And one more thing: all strings must be UTF-8, no exceptions.</p><div class="page-break"></div><h3><a href="#chapter-12" class="anchor" id="chapter-12">Chapter 12. Annex to Section I. Generic API Example</a></h3>