<html><head>
<meta charset="utf-8"/>
<title>Sergey Konstantinov. The API</title>
<meta name="author" content="Sergey Konstantinov"/>
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=PT+Serif&amp;family=PT+Sans&amp;family=Inconsolata"/>
<style>html {
width: 100%;
margin: 0;
padding: 0;
}
body {
font-family: 'PT Serif';
font-size: 14pt;
text-align: justify;
}
.cc-by-nc {
background: transparent url(https://i.creativecommons.org/l/by-nc/4.0/88x31.png) 0 5px no-repeat;
padding-left: 92px;
}
code, pre {
font-family: Inconsolata, sans-serif;
}
code {
white-space: nowrap;
}
pre {
margin: 1em 0;
padding: 1em;
border-radius: .25em;
border-top: 1px solid rgba(0,0,0,.45);
border-left: 1px solid rgba(0,0,0,.45);
box-shadow: .1em .1em .1em rgba(0,0,0,.45);
page-break-inside: avoid;
overflow-x: auto;
font-size: 90%;
}
pre code {
white-space: pre;
}
.page-break {
page-break-after: always;
}
a {
text-decoration: none;
}
h1, h2, h3, h4, h5 {
text-align: left;
font-family: 'PT Sans';
font-weight: bold;
page-break-after: avoid;
}
h1 {
font-size: 200%;
}
h2 {
font-size: 160%;
text-transform: uppercase;
}
h3 {
font-size: 140%;
font-variant: small-caps;
}
h4, h5 {
font-size: 120%;
}
@page {
size: 8.5in 11in;
margin: 0.5in;
}
:root {
--main-font: 'PT Serif';
--alt-font: 'PT Serif';
--code-font: Inconsolata;
}
@media screen {
body {
margin: 2em auto;
max-width: 60%;
}
}
@media print {
h1 {
margin: 4in 0 4in 0;
}
body {
font-size: 14pt;
}
}
@media screen and (max-width: 1000px) {
body {
padding: 2em;
margin: 0;
max-width: none;
text-align: left;
}
pre {
margin: 0;
padding: 0.2em;
}
ul, ol {
padding-left: 1em;
}
}
</style>
</head><body>
<article><h1>Sergey Konstantinov<br/>The API</h1>
<p class="cc-by-nc">This work is licensed under a <a href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.</p>
<div class="page-break"></div>
<h2>Introduction</h2><h3>Chapter 1. On the Structure of This Book</h3><p>The book you're holding in your hands comprises this Introduction and three large sections.</p>
<p>In Section I we'll discuss designing the API as a concept: how to build the architecture properly, from high-level planning down to final interfaces.</p>
<p>Section II is dedicated to the API lifecycle: how interfaces evolve over time, and how to elaborate the product to match users' needs.</p>
<p>Finally, Section III is more about the non-engineering sides of the API, like API marketing, organizing support, and working with a community.</p>
<p>The first two sections will be of most interest to engineers, while the third is relevant to both engineers and product managers. However, we insist that the third section is the most important one for the API software developer. Since an API is a product for engineers, you cannot simply declare a non-engineering team responsible for its product planning and support. Nobody understands better than you what product features your API is capable of.</p>
<p>Let's start.</p><div class="page-break"></div><h3>Chapter 2. The API Definition</h3><p>Before we start talking about API design, we need to explicitly define what an API is. The encyclopedia tells us that API is an acronym for ‘Application Program Interface’. This definition is fine, but useless. Much like Plato's definition of ‘Man’: Man stands upright on two legs and has no feathers. This definition is fine too, but it gives us no understanding of what is so important about a Man. (Actually, it is not ‘fine’ either. Diogenes of Sinope once brought a plucked chicken, saying ‘That's Plato's Man’. And Plato had to add ‘with broad nails’ to his definition.)</p>
<p>What does API <em>mean</em> apart from the formal definition?</p>
<p>You're possibly reading this book using a Web browser. To make the browser display this page correctly, a bunch of stuff must work correctly: parsing the URL according to the specification; the DNS service; the TLS handshake protocol; transmitting the data over the HTTP protocol; HTML document parsing; CSS document parsing; correct HTML+CSS rendering.</p>
<p>But those are just the tip of the iceberg. To make the HTTP protocol work you need the entire network stack (comprising 4-5 or even more protocols of different levels) to work correctly. HTML document parsing is performed according to hundreds of different specifications. Document rendering calls the underlying operating system API, or even the graphics processor API directly. And so on: down to contemporary CISC processor commands implemented on top of the microcommands API.</p>
<p>In other words, hundreds or even thousands of different APIs must work correctly to make possible basic actions like viewing a webpage. Contemporary internet technologies simply couldn't exist without these tons of APIs working fine.</p>
<p><strong>An API is an obligation</strong>. A formal obligation to connect different programmable contexts.</p>
<p>When I'm asked for an example of a well-designed API, I usually show a picture of a Roman aqueduct:</p>
<ul>
<li>it interconnects two areas;</li>
<li>backwards compatibility has not been broken a single time in two thousand years.</li>
</ul>
<p>What differs between a Roman aqueduct and a good API is that APIs presume the contract to be <em>programmable</em>. To connect two areas some <em>coding</em> is needed. The goal of this book is to help you design APIs which serve their purposes as solidly as a Roman aqueduct does.</p>
<p>An aqueduct also illustrates another problem of API design: your customers are engineers themselves. You are not supplying water to end users: suppliers are plugging their pipes into your engineering structure, building their own structures upon it. On the one hand, you may provide water access to many more people through them, not spending your time on plugging each individual house into your network. On the other hand, you can't control the quality of suppliers' solutions, and you are to be blamed every time there is a water problem caused by their incompetence.</p>
<p>That's why designing an API implies a larger area of responsibility. <strong>An API is a multiplier to both your opportunities and mistakes</strong>.</p><div class="page-break"></div><h3>Chapter 3. API Quality Criteria</h3><p>Before we start laying out the recommendations, we ought to specify what kind of API we consider ‘fine’, and what the benefit of having a ‘fine’ API is.</p>
<p>Let's discuss the second question first. Obviously, how ‘fine’ an API is depends first of all on its capability to solve developers' problems. (One may reasonably say that solving developers' problems might not be the main purpose of offering our API to developers. However, manipulating public opinion is of no interest to this book's author. Here we assume that APIs exist primarily to help developers solve their problems, not for some other covert purposes.)</p>
<p>So, how might API design help developers? Quite simply: a well-designed API must solve their problems in the most efficient and comprehensible manner. The distance from formulating the task to writing working code must be as short as possible. Among other things, it means that:</p>
<ul>
<li>it must be totally obvious from your API's structure how to solve a task; ideally, developers should be able to understand at first glance which entities are meant to solve their problem;</li>
<li>the API must be readable; ideally, developers write correct code after just looking at the method nomenclature, never bothering about details (especially API implementation details!); it is also very important to mention that not only the problem solution should be obvious, but also the possible errors and exceptions;</li>
<li>the API must be consistent; while developing new functionality (i.e. while using unknown new API entities) developers may write new code similar to the code they already wrote using known API concepts, and this new code will work.</li>
</ul>
<p>However, the static convenience and clarity of an API is the simple part. After all, nobody seeks to make an API deliberately irrational and unreadable. When we are developing an API, we always start with clear basic concepts. With some experience in designing APIs, it's quite hard to make an API core which fails to meet the obviousness, readability, and consistency criteria.</p>
<p>Problems begin when we start to expand our API. Adding new functionality sooner or later results in transforming a once plain and simple API into a mess of conflicting concepts, and our efforts to maintain backwards compatibility lead to illogical, unobvious, and simply bad design solutions. This is partly related to the inability to predict the future completely: your understanding of ‘fine’ APIs will change over time, both in objective terms (what problems the API is to solve and what the best practices are) and in subjective ones too (what obviousness, readability, and consistency <em>really mean</em> regarding your API).</p>
<p>The principles we are explaining below are specifically oriented towards making APIs evolve smoothly over time, without turning into a pile of mixed inconsistent interfaces. It is crucial to understand that this approach isn't free: the necessity to bear in mind all possible extension variants and to keep essential growth points means interface redundancy and possibly excessive abstractions being embedded in the API design. Besides, both make developers' work harder. <strong>Providing excess design complexity reserved for future use makes sense only when this future actually exists for your API. Otherwise it's simply overengineering.</strong></p><div class="page-break"></div><h3>Chapter 4. Backwards Compatibility</h3><p>Backwards compatibility is a temporal characteristic of your API. The obligation to maintain backwards compatibility is the crucial point where API development differs from software development in general.</p>
<p>Of course, backwards compatibility isn't an absolute. In some subject areas shipping new backwards incompatible API versions is routine. Nevertheless, every time you deploy a new backwards incompatible API version, developers need to make some non-zero effort to adapt their code to it. In this sense, releasing new API versions puts a sort of ‘tax’ on customers. They must spend quite real money just to make sure their product continues working.</p>
<p>Large companies which occupy firm market positions can afford to impose such a tax. Furthermore, they may introduce penalties for those who refuse to adapt their code to new API versions, up to disabling their applications.</p>
<p>From our point of view, such a practice cannot be justified. Don't impose hidden taxes on your customers. If you're able to avoid breaking backwards compatibility, never break it.</p>
<p>Of course, maintaining old API versions is a sort of tax as well. Technology changes, and you cannot foresee everything, regardless of how nicely your API was initially designed. At some point keeping old API versions results in an inability to provide new functionality and support new platforms, and you will be forced to release a new version. But at least you will be able to explain to your customers why they need to make an effort.</p>
<p>We will discuss the API lifecycle and version policies in Section II.</p><div class="page-break"></div><h3>Chapter 5. On Versioning</h3><p>Here and throughout we firmly stick to <a href="https://semver.org/">semver</a> principles of versioning:</p>
<ol>
<li>API versions are denoted with three numbers, e.g. <code>1.2.3</code>.</li>
<li>The first number (major version) increases when backwards incompatible changes in the API are shipped.</li>
<li>The second number (minor version) increases when new functionality is added to the API, keeping backwards compatibility intact.</li>
<li>The third number (patch) increases when the new API version contains bug fixes only.</li>
</ol>
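<p>The rules above can be illustrated with a short Python sketch. The <code>bump</code> helper and the change kinds <code>breaking</code>, <code>feature</code>, and <code>fix</code> are hypothetical names introduced only for this illustration. One detail comes from the semver specification rather than from the list above: increasing a number resets all the numbers to the right of it to zero.</p>

```python
def bump(version: str, change: str) -> str:
    """Return the next version number for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        # Backwards incompatible change: new major version
        return f"{major + 1}.0.0"
    if change == "feature":
        # New functionality, compatibility intact: new minor version
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        # Bug fixes only: new patch version
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change}")


print(bump("1.2.3", "breaking"))
print(bump("1.2.3", "feature"))
print(bump("1.2.3", "fix"))
```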
<p>Terms ‘major API version’ and ‘new API version, containing backwards incompatible changes to functionality’ are therefore to be considered as equivalent.</p>
<p>In Section II we will discuss versioning policies in more detail. In Section I we will just use the semver major version designation, specifically <code>v1</code>, <code>v2</code>, etc.</p><div class="page-break"></div><h3>Chapter 6. Terms and Notation Keys</h3><p>Software development is characterized, among other things, by the existence of many different engineering paradigms, whose adepts are sometimes quite aggressive towards other paradigms' adepts. While writing this book we are deliberately avoiding terms like ‘method’, ‘object’, ‘function’, and so on, using the neutral term ‘entity’ instead. ‘Entity’ means some atomic functionality unit, like a class, method, object, monad, or prototype (underline whichever you think is right).</p>
<p>For an entity's components we regretfully failed to find a proper term, so we will use the words ‘fields’ and ‘methods’.</p>
<p>Most of the API examples will be provided in the form of JSON-over-HTTP endpoints. This is a sort of notation which, as we see it, helps to describe concepts in the most comprehensible manner. A <code>GET /v1/orders</code> endpoint call could easily be replaced with an <code>orders.get()</code> method call, local or remote. JSON could easily be replaced with any other data format. The meaning of the assertions shouldn't change.</p>
<p>Let's take a look at the following example:</p>
<pre><code>// Method description
POST /v1/bucket/{id}/some-resource
X-Idempotency-Token: &lt;idempotency token&gt;
{
  …
  // This is a single-line comment
  "some_parameter": "example value",
  …
}
→ 404 Not Found
Cache-Control: no-cache
{
  /* And this is
     a multiline comment */
  "error_message"
}
</code></pre>
<p>It should be read like:</p>
<ul>
<li>client performs a POST request to the <code>/v1/bucket/{id}/some-resource</code> resource, where <code>{id}</code> is to be replaced with some <code>bucket</code>'s identifier (<code>{something}</code> should refer to the nearest term on the left, unless explicitly specified otherwise);</li>
<li>a specific <code>X-Idempotency-Token</code> header is added to the request along with standard headers (which we omit);</li>
<li>terms in angle brackets (<code>&lt;idempotency token&gt;</code>) describe the semantics of an entity value (field, header, parameter);</li>
<li>a specific JSON, containing a <code>some_parameter</code> field with the value <code>example value</code> and some other unspecified fields (indicated by ellipsis), is sent as the request body payload;</li>
<li>in response (marked with the arrow symbol <code>→</code>) the server returns a <code>404 Not Found</code> status code; the status might be omitted (treat it as <code>200 OK</code> if no status is provided);</li>
<li>the response could possibly contain additional notable headers;</li>
<li>the response body is a JSON comprising a single <code>error_message</code> field; the absence of a field value means that the field contains exactly what you expect it to contain: some error message in this case.</li>
</ul>
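<p>To make the notation more tangible, here is a rough Python sketch that expands the example above into the parts an HTTP client would send. The <code>build_request</code> helper, the <code>https://api.example.com</code> host, and the sample identifier and token values are all hypothetical; nothing is actually sent over the network.</p>

```python
import json


def build_request(bucket_id: str, idempotency_token: str, payload: dict) -> dict:
    """Expand the notation into a method, URL, headers, and body."""
    # {id} refers to the nearest term on its left, i.e. the bucket
    path = "/v1/bucket/{id}/some-resource".replace("{id}", bucket_id)
    return {
        "method": "POST",
        "url": "https://api.example.com" + path,  # hypothetical host
        "headers": {
            # The specific header shown in the notation
            "X-Idempotency-Token": idempotency_token,
            # One of the standard headers the notation omits
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }


request = build_request("42", "token-123", {"some_parameter": "example value"})
print(request["method"], request["url"])
```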
<p>The term ‘client’ here stands for an application being executed on a user's device, either a native or a web one. The terms ‘agent’ and ‘user agent’ are synonymous with ‘client’.</p>
<p>Some request and response parts might be omitted if they are irrelevant to a topic being discussed.</p>
<p>Simplified notation might be used to avoid redundancies, like <code>POST /some-resource</code> <code>{…,"some_parameter",…}</code> → <code>{ "operation_id" }</code>; request and response bodies might also be omitted.</p>
<p>We will be using expressions like ‘<code>POST /v1/bucket/{id}/some-resource</code> method’ (or simply ‘<code>bucket/some-resource</code> method’ or ‘<code>some-resource</code> method’ if no other <code>some-resource</code>s are specified throughout the chapter, so there is no ambiguity) to refer to such endpoint definitions.</p>
<p>Apart from the HTTP API notation we will employ C-style pseudocode, or, to be more precise, JavaScript-like or Python-like, since types are omitted. We assume such imperative structures are readable enough to skip detailed grammar explanations.</p><div class="page-break"></div>
<h2>Section I. The API Design</h2><h3>Chapter 7. The API Contexts Pyramid</h3><p>The approach we use to design APIs comprises four steps:</p>
<ul>
<li>defining an application field;</li>
<li>separating abstraction levels;</li>
<li>isolating responsibility areas;</li>
<li>describing final interfaces.</li>
</ul>
<p>This four-step algorithm actually builds an API from top to bottom, from common requirements and use case scenarios down to a refined entity nomenclature. In fact, moving this way you will eventually get a ready-to-use API, and that's why we value this approach.</p>
<p>It might seem that the most useful pieces of advice are given in the last chapter, but that's not true. The cost of a mistake differs depending on the level at which it is made. Fixing naming is simple; revising a wrong understanding of what the API stands for is practically impossible.</p>
<p><strong>NB</strong>. Here and throughout we will illustrate API design concepts using a hypothetical example of an API allowing for ordering a cup of coffee in city cafes. Just in case: this example is totally synthetic. If we were to design such an API in the real world, it would probably have very little in common with our fictional example.</p><div class="page-break"></div><h3>Chapter 8. Defining an Application Field</h3><p>The key question you should ask yourself is: what problem do we solve? It should be asked four times, each time putting the emphasis on another word.</p>
<ol>
<li><p><em>What</em> problem do we solve? Could we clearly outline the situation in which our hypothetical API is needed by developers?</p></li>
<li><p>What <em>problem</em> do we solve? Are we sure that the abovementioned situation poses a problem? Does someone really want to pay (literally or figuratively) to automate a solution for this problem?</p></li>
<li><p>What problem do <em>we</em> solve? Do we actually possess the expertise to solve the problem?</p></li>
<li><p>What problem do we <em>solve</em>? Is it true that the solution we propose indeed solves the problem? Aren't we creating another problem instead?</p></li>
</ol>
<p>So, let's imagine that we are going to develop an API for automated coffee ordering in city cafes, and let's apply the key question to it.</p>
<ol>
<li><p>Why would someone need an API to make a coffee? Why is ordering a coffee via a ‘human-to-human’ or ‘human-to-machine’ interface inconvenient, and why have a ‘machine-to-machine’ interface?</p>
<ul>
<li>Possibly, we're solving knowledge and selection problems? To provide humans with full knowledge of what options they have right now and right here.</li>
<li>Possibly, we're optimizing waiting times? To save the time people waste while waiting for their beverages.</li>
<li>Possibly, we're reducing the number of errors? To help people get exactly what they wanted to order, and to stop losing information in imprecise conversational communication or in dealing with unfamiliar coffee machine interfaces.</li></ul>
<p>The ‘why’ question is the most important of all the questions you must ask yourself. And not only about global project goals, but also locally about every single piece of functionality. <strong>If you can't briefly and clearly answer the question ‘what is this entity needed for’, then it's not needed</strong>.</p>
<p>Here and throughout we assume, to make our example more complex and bizarre, that we are optimizing all three factors.</p></li>
<li><p>Do the problems we outlined really exist? Do we really observe unequal coffee machine utilization in the mornings? Do people really suffer from an inability to find the toffee nut latte they long for nearby? Do they really care about the minutes they spend in lines?</p></li>
<li><p>Do we actually have the resources to solve the problem? Do we have access to a sufficient number of coffee machines and users to ensure the system's efficiency?</p></li>
<li><p>Finally, will we really solve the problem? How are we going to quantify the impact our API makes?</p></li>
</ol>
<p>In general, there are no simple answers to those questions. Ideally, you should give answers having all the relevant metrics measured: how much time is wasted exactly, and what numbers are we going to achieve with the given coffee machine density? Let us also stress that in real life obtaining these numbers is only possible when you're entering a stable market. If you try to create something new, your only option is to rely on your intuition.</p>
<h4 id="whyanapi">Why an API?</h4>
<p>Since our book is dedicated not to software development per se but to developing APIs, we should look at all those questions from a different angle: why does solving those problems specifically require an API, not simply specialized software? In terms of our fictional example we should ask ourselves: why provide a service to developers that allows them to brew coffee for end users, instead of just making an app for end users?</p>
<p>In other words, there must be a solid reason to split two software development domains: there are the operators which provide APIs, and there are the operators which develop services for end users. Their interests differ to such an extent that coupling these two roles in one entity is undesirable. We will talk about the motivation to specifically provide APIs in more detail in Section III.</p>
<p>We should also note that you should try making an API when and only when you wrote ‘because that's our area of expertise’ in question 3. Developing APIs is a sort of meta-engineering: you're writing some software to allow other companies to develop software to solve users' problems. You must possess expertise in both domains (APIs and user products) to design your API well.</p>
<p>As for our speculative example, let us imagine that in the near future some tectonic shift happens on the coffee brewing market. Two distinct player groups take shape: some companies provide the ‘hardware’, i.e. coffee machines; other companies have access to the customer audience. It is something like the flights market: there are airlines, which actually transport passengers, and there are trip planning services, where users choose between the trip variants the system generates for them. We're aggregating hardware access to allow app vendors to order freshly brewed coffee.</p>
<h4 id="whatandhow">What and How</h4>
<p>After finishing all these theoretical exercises, we should proceed right to designing and developing the API, having a decent understanding regarding two things:</p>
<ul>
<li><em>what</em> we're doing, exactly;</li>
<li><em>how</em> we're doing it, exactly.</li>
</ul>
<p>In our coffee case, we are:</p>
<ul>
<li>providing an API to services with a larger audience, so their users may order a cup of coffee in the most efficient and convenient manner;</li>
<li>abstracting access to coffee machine ‘hardware’ and delivering methods to select a beverage kind and some location to brew it at, and to make an order.</li>
</ul><div class="page-break"></div><h3>Chapter 9. Separating Abstraction Levels</h3><p>‘Separate abstraction levels in your code’ is possibly the most general advice to software developers. However, we don't think it would be a grave exaggeration to say that separating abstraction levels is also the most difficult task for API developers.</p>
<p>Before proceeding to the theory we should clearly formulate <em>why</em> abstraction levels are so important and what goals we are trying to achieve by separating them.</p>
<p>Let us remember that a software product is a medium connecting two distinct contexts, thus transforming terms and operations belonging to one subject area into another area's concepts. The more these areas differ, the more interim connecting links we have to introduce.</p>
<p>Back to our coffee example. What entity abstraction levels do we see?</p>
<ol>
<li>We're preparing an <code>order</code> via the API: one or more cups of coffee, and we take payment for it.</li>
<li>Each cup of coffee is being prepared according to some <code>recipe</code>, which implies the presence of different ingredients and sequences of preparation steps.</li>
<li>Each beverage is being prepared on some physical <code>coffee machine</code> occupying some position in space.</li>
</ol>
<p>Every level presents a developer-facing ‘facet’ in our API. While elaborating the hierarchy of abstractions, we are first of all trying to reduce the interconnectivity of different entities. That would help us to reach several goals.</p>
<ol>
<li><p>Simplifying developers' work and learning curve. At each moment of time a developer is operating only those entities which are necessary for the task they're solving right now. And conversely, badly designed isolation leads to the situation when developers have to keep in mind lots of concepts mostly unrelated to the task being solved.</p></li>
<li><p>Preserving backwards compatibility. Properly separated abstraction levels allow for adding new functionality while keeping interfaces intact.</p></li>
<li><p>Maintaining interoperability. Properly isolated low-level abstractions help us to adapt the API to different platforms and technologies without changing high-level entities.</p></li>
</ol>
<p>Let's say we have the following interface:</p>
<pre><code>// Returns lungo recipe
GET /v1/recipes/lungo
</code></pre>
<pre><code>// Posts an order to make a lungo
// using the specified coffee machine
// and returns an order identifier
POST /v1/orders
{
  "coffee_machine_id",
  "recipe": "lungo"
}
</code></pre>
<pre><code>// Returns order state
GET /v1/orders/{id}
</code></pre>
<p>Let's consider the question: how exactly should developers determine whether the order is ready or not? Let's say we do the following:</p>
<ul>
<li>add a reference beverage volume to the lungo recipe;</li>
<li>add currently prepared volume of beverage to order state.</li>
</ul>
<p>Then a developer just needs to compare two numbers to find out whether the order is ready.</p>
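<p>In client code, such a check might look roughly like the Python sketch below. The field names <code>volume</code> and <code>volume_prepared</code> and the sample values are assumptions made for illustration; plain dictionaries stand in for the responses of <code>GET /v1/recipes/lungo</code> and <code>GET /v1/orders/{id}</code>.</p>

```python
def parse_ml(volume: str) -> int:
    """Convert a string like '120ml' into a number of milliliters."""
    if not volume.endswith("ml"):
        raise ValueError(f"unexpected volume format: {volume}")
    return int(volume[:-2])


def is_order_ready(recipe: dict, order: dict) -> bool:
    # The client has to know that 'ready' means 'the prepared volume
    # has reached the reference volume stated in the recipe'
    return parse_ml(order["volume_prepared"]) >= parse_ml(recipe["volume"])


# Hypothetical stand-ins for the two API responses
recipe = {"name": "lungo", "volume": "120ml"}
order = {"id": "1", "volume_prepared": "80ml"}

print(is_order_ready(recipe, order))
```

<p>Note that even this tiny check has to fetch and combine two different entities, the recipe and the order.</p>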
<p>This solution intuitively looks bad, and it really is: it violates all the abovementioned principles.</p>
<p><strong>First</strong>, to solve the task ‘order a lungo’ a developer needs to refer to the ‘recipe’ entity and learn that every recipe has an associated volume. Then they need to embrace the concept that the order is ready at the particular moment when the beverage volume becomes equal to the reference one. This concept is simply unguessable, and there is no particular sense in knowing it.</p>
<p><strong>Second</strong>, we will automatically get problems if we need to vary the beverage size. For example, if one day we decide to offer customers a choice of how many milliliters of lungo they desire exactly, then we will have to perform one of the following tricks.</p>
<p>Variant I: we have a fixed list of possible volumes and introduce bogus recipes like <code>/recipes/small-lungo</code> or <code>/recipes/large-lungo</code>. Why ‘bogus’? Because it's still the same lungo recipe, with the same ingredients and the same preparation steps; only the volumes differ. We will have to start mass-producing a bunch of recipes differing only in volume, or introduce some recipe ‘inheritance’ to be able to specify a ‘base’ recipe and just redefine the volume.</p>
<p>Variant II: we modify the interface, declaring the volumes stated in recipes to be just default values. We allow setting a different volume when placing an order:</p>
<pre><code>POST /v1/orders
{
  "coffee_machine_id",
  "recipe": "lungo",
  "volume": "800ml"
}
</code></pre>
<p>For orders with an arbitrary volume requested, a developer will need to obtain the requested volume not from <code>GET /v1/recipes</code>, but from <code>GET /v1/orders</code>. By doing so we get a whole bunch of related problems:</p>
<ul>
<li>there is a significant chance that developers will make mistakes in implementing this functionality if they add arbitrary volume support to the code working with the <code>POST /v1/orders</code> handler, but forget to make the corresponding changes in the order readiness check code;</li>
<li>the same field (coffee volume) now means different things in different interfaces. In the <code>GET /v1/recipes</code> context the <code>volume</code> field means ‘a volume to be prepared if no arbitrary volume is specified in the <code>POST /v1/orders</code> request’; and it cannot simply be renamed to ‘default volume’, so we now have to live with that.</li>
</ul>
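<p>The first of these problems can be sketched in Python. We assume, purely for illustration, that an order carries a <code>volume</code> field only when a custom volume was requested, and that the prepared volume is exposed as <code>volume_prepared</code>; both function names are hypothetical. A readiness check written before the change keeps comparing against the recipe's volume, while a correct one must prefer the order's volume and fall back to the recipe's default.</p>

```python
def parse_ml(volume: str) -> int:
    """Convert a string like '800ml' into a number of milliliters."""
    if not volume.endswith("ml"):
        raise ValueError(f"unexpected volume format: {volume}")
    return int(volume[:-2])


def is_ready_outdated(recipe: dict, order: dict) -> bool:
    # Written before arbitrary volumes existed:
    # always compares against the recipe volume
    return parse_ml(order["volume_prepared"]) >= parse_ml(recipe["volume"])


def is_ready_updated(recipe: dict, order: dict) -> bool:
    # The recipe volume is now only a default; the order may override it
    target = order.get("volume", recipe["volume"])
    return parse_ml(order["volume_prepared"]) >= parse_ml(target)


recipe = {"volume": "120ml"}
order = {"volume": "800ml", "volume_prepared": "200ml"}

print(is_ready_outdated(recipe, order), is_ready_updated(recipe, order))
```

<p>With these sample values the outdated check reports the order as ready after 200 ml have been prepared, although 800 ml were requested.</p>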
<p><strong>Third</strong>, the entire scheme becomes totally inoperable if different types of coffee machines produce different volumes of lungo. To introduce the ‘lungo volume depends on machine type’ constraint we have to do quite a nasty thing: make recipes depend on the coffee machine id. By doing so we start actively ‘stirring’ abstraction levels: one part of our API (recipe endpoints) becomes unusable without explicit knowledge of another part (coffee machine parameters). And what is even worse, developers will have to change the logic of their apps: previously it was possible to choose the volume first and then a coffee machine; but now this step must be rebuilt from scratch.</p>
<p>Okay, we understood how to make things bad. But how do we make them <em>nice</em>?</p>
<p>Abstraction levels separation should go along three directions:</p>
<ol>
<li><p>From user scenarios to their internal representation: high-level entities and their method nomenclature must directly reflect API usage scenarios; low-level entities reflect the decomposition of scenarios into smaller parts.</p></li>
<li><p>From user subject field terms to ‘raw’ data subject field terms — in our case from high-level terms like ‘order’, ‘recipe’, ‘cafe’ to low-level terms like ‘beverage temperature’, ‘coffee machine geographical coordinates’, etc.</p></li>
<li><p>Finally, from data structures suitable for end users to ‘raw’ data structures — in our case, from ‘lungo recipe’ and ‘"Chamomile" cafe chain’ to raw byte data stream from ‘Good Morning’ coffee machine sensors.</p></li>
</ol>
<p>The greater the distance between the programmable contexts which our API connects, the deeper the hierarchy of entities we are to develop.</p>
<p>In our example with coffee readiness detection we clearly face the situation when we need an interim abstraction level:</p>
<ul>
<li>on the one hand, an ‘order’ should not store the data regarding coffee machine sensors;</li>
<li>on the other hand, a coffee machine should not store the data regarding order properties (and its API probably doesn't provide such functionality).</li>
</ul>
<p>A naïve approach to this situation is to design an interim abstraction level as a ‘connecting link’ which reformulates tasks from one abstraction level to another. For example, introduce a <code>task</code> entity like this:</p>
<pre><code>{
  "volume_requested": "800ml",
  "volume_prepared": "200ml",
  "readiness_policy": "check_volume",
  "ready": false,
  "operation_state": {
    "status": "executing",
    "operations": [
      // description of commands
      // being executed on physical coffee machine
    ]
  }
}
</code></pre>
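<p>One possible reading of the <code>readiness_policy</code> field is that it selects the rule which computes <code>ready</code>. The Python sketch below illustrates only that assumption: the <code>READINESS_POLICIES</code> table and the <code>compute_ready</code> function are hypothetical names, and <code>check_volume</code> is the only policy taken from the example above.</p>

```python
def parse_ml(volume: str) -> int:
    """Convert a string like '800ml' into a number of milliliters."""
    if not volume.endswith("ml"):
        raise ValueError(f"unexpected volume format: {volume}")
    return int(volume[:-2])


def check_volume(task: dict) -> bool:
    # Ready when the prepared volume reaches the requested one
    prepared = parse_ml(task["volume_prepared"])
    return prepared >= parse_ml(task["volume_requested"])


# Hypothetical registry mapping policy names to readiness rules
READINESS_POLICIES = {"check_volume": check_volume}


def compute_ready(task: dict) -> bool:
    """Apply the readiness rule named in the task itself."""
    policy = READINESS_POLICIES[task["readiness_policy"]]
    return policy(task)


task = {
    "volume_requested": "800ml",
    "volume_prepared": "200ml",
    "readiness_policy": "check_volume",
}

print(compute_ready(task))
```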
<p>We call this approach ‘naïve’ not because it's wrong; on the contrary, it's quite a logical ‘default’ solution if you don't know yet (or don't understand yet) what your API will look like. The problem with this approach lies in its speculativeness: it doesn't reflect the subject area's organization.</p>
<p>An experienced developer in this case must ask: what options exist? How should we really determine beverage readiness? If it turns out that comparing volumes <em>is</em> the only working method to tell whether the beverage is ready, then all the speculations above are wrong. You may safely include readiness detection by volume in your interfaces, since no other method exists. Before abstracting something we need to learn what exactly we're abstracting.</p>
<p>In our example let's assume that we have studied coffee machines API specs and learned that two device types exist:</p>
<ul>
<li>coffee machines capable of executing programs coded in the firmware, where the only customizable options are some beverage parameters, like desired volume, syrup flavor, and kind of milk;</li>
<li>coffee machines with built-in functions like ‘grind specified coffee volume’, ‘shed specified amount of water’, etc.; such coffee machines lack ‘preparation programs’, but provide access to commands and sensors.</li>
</ul>
<p>To be more specific, let's assume those two kinds of coffee machines provide the following physical API.</p>
<ul>
<li><p>Coffee machines with prebuilt programs:</p>
<pre><code>// Returns a list of programs
GET /programs
{
// program identifier
"program": "01",
// coffee type
"type": "lungo"
}
</code></pre>
<pre><code>// Starts an execution of a specified program
// and returns execution status
POST /execute
{
"program": 1,
"volume": "200ml"
}
{
// Unique identifier of the execution
"execution_id": "01-01",
// Identifier of the program
"program": 1,
// Beverage volume requested
"volume": "200ml"
}
</code></pre>
<pre><code>// Cancels current program
POST /cancel
</code></pre>
<pre><code>// Returns execution status
// Format is the same as in POST /execute
GET /execution/status
</code></pre>
<p><strong>NB</strong>. Just in case: this API violates a number of design principles, starting with a lack of versioning. It's described this way for two reasons: (1) to demonstrate how to design a more convenient API; (2) in real life you really do get something like this from vendors, and this API is actually quite sane.</p></li>
<li><p>Coffee machines with built-in functions:</p>
<pre><code>// Returns a list of functions available
GET /functions
{
"functions": [
{
// Operation type:
// * set_cup
// * grind_coffee
// * shed_water
// * discard_cup
"type": "set_cup",
// Arguments available to each operation.
// To keep it simple, let's limit these to one:
// * volume — a volume of a cup, coffee, or water
"arguments": ["volume"]
},
]
}
</code></pre>
<pre><code>// Takes arguments values
// and starts executing a function
POST /functions
{
"type": "set_cup",
"arguments": [{ "name": "volume", "value": "300ml" }]
}
</code></pre>
<pre><code>// Returns sensors' state
GET /sensors
{
"sensors": [
{
// Values allowed:
// * cup_volume
// * ground_coffee_volume
// * cup_filled_volume
"type": "cup_volume",
"value": "200ml"
},
]
}
</code></pre>
<p><strong>NB</strong>. The example is intentionally contrived to model the situation described above: to determine beverage readiness you have to compare the requested volume with the volume sensor state.</p></li>
</ul>
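<p>To make the second-kind readiness check concrete, here is a minimal sketch in Python. It assumes the <code>GET /sensors</code> response has already been fetched and parsed into a dictionary, and that volumes always look like <code>"200ml"</code>. The helper names <code>parse_ml</code> and <code>is_beverage_ready</code> are hypothetical and not part of any vendor API.</p>

```python
import re


def parse_ml(volume: str) -> int:
    """Parse a volume string such as "200ml" into millilitres."""
    match = re.fullmatch(r"(\d+)ml", volume)
    if match is None:
        raise ValueError(f"unexpected volume format: {volume!r}")
    return int(match.group(1))


def is_beverage_ready(sensors_response: dict, requested_volume: str) -> bool:
    """Compare the cup_filled_volume sensor with the requested volume."""
    for sensor in sensors_response["sensors"]:
        if sensor["type"] == "cup_filled_volume":
            return parse_ml(sensor["value"]) >= parse_ml(requested_volume)
    # The sensor is not reported, so we cannot claim the beverage is ready
    return False


# A parsed GET /sensors response (sample values are made up)
response = {"sensors": [
    {"type": "cup_volume", "value": "300ml"},
    {"type": "cup_filled_volume", "value": "200ml"},
]}
```

<p>Note that even this tiny check needs the requested volume, which the stateless sensors API knows nothing about; somebody has to remember it.</p>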
<p>Now the picture becomes clearer: we need to abstract coffee machine API calls, so that the ‘execution level’ in our API provides general functions (like beverage readiness detection) in a unified form. We should also note that these two kinds of coffee machines belong to different abstraction levels themselves: the first one provides a higher-level API than the second one. Therefore, the ‘branch’ of our API working with second-kind machines will be more intricate.</p>
<p>The next step in separating abstraction levels is determining what functionality we're abstracting. To do so, we need to understand the tasks developers solve at the ‘order’ level, and to learn what problems they would face if our interim level were missing.</p>
<ol>
<li>Obviously, developers want to create orders uniformly: list high-level order properties (beverage kind, volume, and special options like syrup or milk type) without thinking about how a specific coffee machine executes them.</li>
<li>Developers must be able to learn the execution state: is the order ready? If not, when should they expect it to be ready (and is there any sense in waiting in case of execution errors)?</li>
<li>Developers need to address an order's location in space and time — to explain to users where and when they should pick the order up.</li>
<li>Finally, developers need to run atomic operations, like canceling orders.</li>
</ol>
<p>Note that the first-kind API is much closer to developers' needs than the second-kind API. An indivisible ‘program’ is a far more convenient concept than working with raw commands and sensor data. There are only two problems we see in the first-kind API:</p>
<ul>
<li>the absence of an explicit ‘programs’ to ‘recipes’ relation; the program identifier is actually of no use to developers, since there is a ‘recipe’ concept;</li>
<li>the absence of an explicit ‘ready’ status.</li>
</ul>
<p>But with the second-kind API it's much worse. The main problem we foresee is the absence of ‘memory’ for actions being executed. The functions and sensors API is totally stateless, which means we don't even know who called the function currently being executed, when, and which order it relates to.</p>
<p>So we need to introduce two abstraction levels.</p>
<ol>
<li><p>Execution control level, which provides a uniform interface to indivisible programs. ‘Uniform interface’ here means that, regardless of the coffee machine kind, developers may expect:</p>
<ul>
<li>the nomenclature of statuses and other high-level execution parameters (for example, estimated preparation time or possible execution errors) to be the same;</li>
<li>the nomenclature of methods (for example, the order cancellation method) and their behavior to be the same.</li></ul></li>
<li><p>Program runtime level. For the first-kind API it will provide just a wrapper for the existing programs API; for the second-kind API the entire ‘runtime’ concept is to be developed from scratch by us.</p></li>
</ol>
<p>What does this mean in a practical sense? Developers will still be creating orders dealing with high-level entities only:</p>
<pre><code>POST /v1/orders
{
"coffee_machin
"recipe": "lungo",
"volume": "800ml"
}
{ "order_id" }
</code></pre>
<p>The <code>POST /orders</code> handler checks all order parameters, puts a hold for the corresponding sum on the user's credit card, forms a request to run, and calls the execution level. First, the correct execution program needs to be fetched:</p>
<pre><code>POST /v1/programs/match
{ "recipe", "coffee-machine" }
{ "program_id" }
</code></pre>
<p>Now, after obtaining the correct <code>program</code> identifier, the handler runs the program:</p>
<pre><code>POST /v1/programs/{id}/run
{
"order_id",
"coffee_machine_id",
"parameters": [
{
"name": "volume",
"value": "800ml"
}
]
}
{ "program_run_id" }
</code></pre>
<p>Please note that knowing the coffee machine API kind isn't required at all; that's why we're making abstractions! We could make interfaces more specific, implementing different <code>run</code> and <code>match</code> endpoints for different coffee machines:</p>
<ul>
<li><code>POST /v1/programs/{api_type}/match</code></li>
<li><code>POST /v1/programs/{api_type}/{program_id}/run</code></li>
</ul>
<p>This approach has some benefits, like the possibility of providing different sets of parameters specific to the API kind. But we see no need for such fragmentation. The <code>run</code> method handler is capable of extracting all the program metadata and performing one of two actions:</p>
<ul>
<li>call <code>POST /execute</code> physical API method passing internal program identifier — for the first API kind;</li>
<li>initiate runtime creation to proceed with the second API kind.</li>
</ul>
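<p>The branching inside the <code>run</code> handler might be sketched as follows. This is only an illustration under assumptions: the <code>PROGRAMS</code> table, its field names, and the two injected callables standing in for real transports are all hypothetical.</p>

```python
from typing import Callable

# Hypothetical program metadata store; keys and fields are assumptions
PROGRAMS = {
    "lungo-v1": {"api_type": "programs", "internal_program": 1},
    "lungo-v2": {"api_type": "functions"},
}


def run_program(program_id: str, parameters: dict,
                call_execute: Callable[[dict], dict],
                create_runtime: Callable[[dict], dict]) -> dict:
    """Handle a run request without exposing the API kind to the caller."""
    program = PROGRAMS[program_id]
    if program["api_type"] == "programs":
        # First kind: delegate to POST /execute of the physical API
        return call_execute({
            "program": program["internal_program"],
            "volume": parameters["volume"],
        })
    # Second kind: a runtime has to be created to execute raw commands
    return create_runtime({"program": program_id, "parameters": parameters})


# Stand-ins for the real transports; they only record the branch taken
def fake_execute(body: dict) -> dict:
    return {"branch": "execute", "request": body}


def fake_runtime(body: dict) -> dict:
    return {"branch": "runtime", "request": body}


first = run_program("lungo-v1", {"volume": "800ml"}, fake_execute, fake_runtime)
second = run_program("lungo-v2", {"volume": "800ml"}, fake_execute, fake_runtime)
```

<p>The caller passes the same arguments in both cases; the choice of branch is made entirely from the program metadata.</p>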
<p>For general reasons, the runtime level for the second-kind API will be private, so we are more or less free in implementing it. The easiest solution would be to develop a virtual state machine which creates a ‘runtime’ (i.e. a stateful execution context) to run a program and controls its state.</p>
<pre><code>POST /v1/runtimes
{ "coffee_machine", "program", "parameters" }
{ "runtime_id", "state" }
</code></pre>
<p>The <code>program</code> here would look like this:</p>
<pre><code>{
"program_id",
"api_type",
"commands": [
{
"sequence_id",
"type": "set_cup",
"parameters"
},
]
}
</code></pre>
<p>And the <code>state</code> like this:</p>
<pre><code>{
// Runtime status:
// * "pending" — awaiting execution
// * "executing" — performing some command
// * "ready_waiting" — beverage is ready
// * "finished" — all operations done
"status": "ready_waiting",
// Command being currently executed
"command_sequence_id",
// How the execution concluded:
// * "success" — beverage prepared and taken
// * "terminated" — execution aborted
// * "technical_error" — preparation error
// * "waiting_time_exceeded" — beverage prepared
// but not taken; timed out then disposed
"resolution": "success",
// All variables values,
// including sensors state
"variables"
}
</code></pre>
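<p>One possible way to hold such a state in code is a small state machine. The sketch below reuses the status and resolution names from the example above; the transition table and the <code>RuntimeState</code> class are assumptions made for illustration, not a prescribed design.</p>

```python
from dataclasses import dataclass, field
from typing import Optional

# Allowed status transitions. The statuses come from the example above;
# the transition table itself is an assumption.
TRANSITIONS = {
    "pending": {"executing", "finished"},
    "executing": {"ready_waiting", "finished"},
    "ready_waiting": {"finished"},
    "finished": set(),
}


@dataclass
class RuntimeState:
    status: str = "pending"
    command_sequence_id: Optional[int] = None
    resolution: Optional[str] = None
    variables: dict = field(default_factory=dict)

    def move_to(self, status: str, resolution: Optional[str] = None) -> None:
        """Switch the status, rejecting transitions the table forbids."""
        if status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status} to {status}")
        if status == "finished":
            if resolution is None:
                raise ValueError("a finished runtime needs a resolution")
            self.resolution = resolution
        self.status = status


state = RuntimeState()
state.move_to("executing")
state.move_to("ready_waiting")
state.move_to("finished", resolution="success")
```

<p>Keeping the transitions in one table makes it harder for a handler to put the runtime into a state the rest of the system doesn't expect.</p>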
<p><strong>NB</strong>: while implementing the <code>orders</code> → <code>match</code> → <code>run</code> → <code>runtimes</code> call sequence we have two options:</p>
<ul>
<li>either <code>POST /orders</code> handler requests the data regarding recipe, coffee machine model, and program on its own behalf and forms a stateless request which contains all the necessary data (the API kind, command sequence, etc.);</li>
<li>or the request contains only data identifiers, and the handlers further down the chain request the pieces of data they need via some internal APIs.</li>
</ul>
<p>Both variants are plausible; selecting one of them depends on implementation details.</p>
<h4 id="abstractionlevelsisolation">Abstraction Levels Isolation</h4>
<p>A crucial quality of properly separated abstraction levels (and therefore a requirement for their design) is the level isolation restriction: <strong>only adjacent levels may interact</strong>. If ‘jumping over’ is needed in the API design, then clearly mistakes were made.</p>
<p>Let's get back to our example. How would the operation of retrieving an order status work? To obtain the status, the following call chain is to be performed:</p>
<ul>
<li>the user initiates a call to the <code>GET /v1/orders</code> method;</li>
<li>the <code>orders</code> handler completes operations on its level of responsibility (for example, checks user authorization), finds the <code>program_run_id</code> identifier and performs a call to the <code>runs/{program_run_id}</code> endpoint;</li>
<li>the <code>runs</code> endpoint in its turn completes operations corresponding to its level (for example, checks the coffee machine API kind) and, depending on the API kind, proceeds with one of two possible execution branches:<ul>
<li>either calls the <code>GET /execution/status</code> method of the physical coffee machine API, gets the coffee volume and compares it to the reference value;</li>
<li>or invokes <code>GET /v1/runtimes/{runtime_id}</code> to obtain <code>state.status</code> and converts it to the order status;</li></ul></li>
<li>in the case of the second API kind the call chain continues: the <code>GET /runtimes</code> handler invokes the <code>GET /sensors</code> method of the physical coffee machine API and performs some manipulations on the sensor data, like comparing the cup / ground coffee / shed water volume with those requested upon command execution, and changing the state and status if needed.</li>
</ul>
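<p>The status chain above can be sketched as three functions, one per level, each talking only to its neighbor. All status names outside the runtime state, the in-memory dictionaries standing in for storage, and the <code>prepared_ml</code> / <code>requested_ml</code> fields are assumptions made for the sake of the example.</p>

```python
# Status names of the run and order levels are hypothetical
RUN_TO_ORDER_STATUS = {
    "in_progress": "preparing",
    "ready": "ready_for_pickup",
    "done": "completed",
    "failed": "canceled",
}


def get_runtime_state(runtime_id: str, runtimes: dict) -> dict:
    """Runtime level: a real handler would also consult GET /sensors."""
    return runtimes[runtime_id]


def get_run_status(run: dict, runtimes: dict) -> str:
    """Execution control level: hides the coffee machine API kind."""
    if run["api_type"] == "programs":
        # First kind: compare the prepared volume with the reference value
        ready = run["prepared_ml"] >= run["requested_ml"]
        return "ready" if ready else "in_progress"
    state = get_runtime_state(run["runtime_id"], runtimes)
    if state["status"] == "finished":
        return "done" if state["resolution"] == "success" else "failed"
    return "ready" if state["status"] == "ready_waiting" else "in_progress"


def get_order_status(order: dict, runs: dict, runtimes: dict) -> str:
    """Order level: knows about runs, but nothing about runtimes' insides."""
    run = runs[order["program_run_id"]]
    return RUN_TO_ORDER_STATUS[get_run_status(run, runtimes)]


runtimes = {"rt-1": {"status": "ready_waiting", "resolution": None}}
runs = {
    "run-1": {"api_type": "programs", "prepared_ml": 120, "requested_ml": 200},
    "run-2": {"api_type": "functions", "runtime_id": "rt-1"},
}
status_first = get_order_status({"program_run_id": "run-1"}, runs, runtimes)
status_second = get_order_status({"program_run_id": "run-2"}, runs, runtimes)
```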
<p><strong>NB</strong>: ‘Call chain’ wording shouldn't be treated literally. Each abstraction level might be organized differently in a technical sense:</p>
<ul>
<li>there might be explicit proxying of calls down the hierarchy;</li>
<li>there might be a cache at each level, updated upon receiving a callback or an event. In particular, the low-level runtime execution cycle obviously must be independent of the upper levels and renew its state in the background, without waiting for an explicit call.</li>
</ul>
<p>Note what happens here: each abstraction level wields its own status (i.e. order, runtime, and sensors status), formulated in terms of the subject area corresponding to this level. Forbidding ‘jumping over’ makes it necessary to spawn statuses at each level independently.</p>
<p>Let's now look at how the order cancel operation propagates through our abstraction levels. In this case the call chain will look like this:</p>
<ul>
<li>user initiates a call to <code>POST /v1/orders/{id}/cancel</code> method;</li>
<li>the method handler completes operations on its level of responsibility:<ul>
<li>checks the authorization;</li>
<li>resolves money issues, i.e. whether a refund is needed;</li>
<li>finds <code>program_run_id</code> identifier and calls <code>runs/{program_run_id}/cancel</code> method;</li></ul></li>
<li>the <code>runs/cancel</code> handler completes operations on its level of responsibility and, depending on the coffee machine API kind, proceeds with one of two possible execution branches:<ul>
<li>either calls the <code>POST /cancel</code> method of the physical coffee machine API;</li>
<li>or invokes <code>POST /v1/runtimes/{id}/terminate</code>;</li></ul></li>
<li>in the second case the call chain continues: the <code>terminate</code> handler operates on its internal state:<ul>
<li>changes <code>resolution</code> to <code>"terminated"</code>;</li>
<li>runs <code>"discard_cup"</code> command.</li></ul></li>
</ul>
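<p>A rough sketch of this cancellation chain is given below. It only records which action each level performs; the <code>log</code> list, the dictionary-based storage, and the function names are hypothetical, and no real HTTP calls are made.</p>

```python
log = []  # records the actions taken at every level, for illustration


def terminate_runtime(runtime: dict) -> None:
    """Runtime level: 'cancel' becomes a state change plus one more command."""
    runtime["resolution"] = "terminated"
    log.append("run command discard_cup")


def cancel_run(run: dict, runtimes: dict) -> None:
    """Execution control level: pick the branch for the API kind."""
    if run["api_type"] == "programs":
        log.append("call physical API cancel")
    else:
        log.append("terminate runtime")
        terminate_runtime(runtimes[run["runtime_id"]])


def cancel_order(order: dict, runs: dict, runtimes: dict) -> None:
    """Order level: settle the money first, then cancel the execution."""
    log.append("release card hold")
    order["status"] = "canceled"
    cancel_run(runs[order["program_run_id"]], runtimes)


runtimes = {"rt-1": {"status": "executing", "resolution": None}}
runs = {"run-2": {"api_type": "functions", "runtime_id": "rt-1"}}
order = {"program_run_id": "run-2", "status": "preparing"}
cancel_order(order, runs, runtimes)
```

<p>After the call the order is already canceled, while the runtime is still busy discarding the cup: the outer and inner states differ, and coupling them is the job of the middle level.</p>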
<p>Handling state-modifying operations like <code>cancel</code> requires more advanced skills in juggling abstraction levels than non-modifying calls like <code>GET /status</code>. There are two important points:</p>
<ol>
<li><p>At every abstraction level the idea of ‘order canceling’ is reformulated:</p>
<ul>
<li>at the <code>orders</code> level this action in fact splits into several ‘cancels’ of other levels: you need to cancel the money hold and to cancel the order execution;</li>
<li>while at the physical level of the second API kind a ‘cancel’ operation as such doesn't exist: ‘cancel’ means executing a <code>discard_cup</code> command, which is much the same as any other command.
The interim API level is needed to make this transition between the ‘cancels’ of different levels smooth and rational, without jumping over levels.</li></ul></li>
<li><p>From a high-level point of view, canceling an order is a terminal action, since no further operations are possible. From a low-level point of view, processing the request continues until the cup is discarded, and then the machine is to be unlocked (i.e. creation of new runtimes is allowed). It's the task of the execution control level to couple those two states: the outer one (the order is canceled) and the inner one (the execution continues).</p></li>
</ol>
<p>It might look like enforcing abstraction level isolation is redundant and makes interfaces more complicated. In fact, it does: it's very important to understand that flexibility, consistency, readability, and extensibility come at a price. One may construct an API with zero overhead, essentially just providing access to the coffee machine's microcontrollers. However, using such an API would be a disaster, not to mention the inability to extend it.</p>
<p>Separating abstraction levels is first of all a logical procedure: how we explain to ourselves and to developers what our API consists of. <strong>The abstraction gap between entities exists objectively</strong>, no matter what interfaces we design. Our task is just to separate this gap into levels <em>explicitly</em>. The more implicitly abstraction levels are separated (or, worse, blended into each other), the more complicated your API's learning curve is, and the worse the code that uses it.</p>
<h4 id="dataflow">Data Flow</h4>
<p>One useful exercise for examining the entire abstraction hierarchy is to exclude all the particulars and construct (on paper or just in your head) a data flow chart: what data flows through your API entities and how it's altered at each step.</p>
<p>This exercise doesn't just help, but also makes it possible to design really large APIs with huge entity nomenclatures. Human memory isn't boundless; any project which grows extensively will eventually become too big to keep the entire entity hierarchy in mind. But it's usually possible to keep the data flow chart in mind, or at least to keep a much larger portion of the hierarchy.</p>
<p>What data flows do we have in our coffee API?</p>
<ol>
<li><p>Sensor data, i.e. volumes of coffee / water / cups. This is the lowest data level we have, and here we can't change anything.</p></li>
<li><p>A continuous sensor data stream is transformed into discrete command execution statuses, injecting new concepts which don't exist within the subject area. A coffee machine API doesn't provide ‘coffee is being shed’ or ‘cup is being set’ notions. It's our software which treats incoming sensor data and introduces new terms: if the volume of coffee or water is less than the target one, then the process isn't over yet. If the target value is reached, then this synthetic status is to be switched and the next command is to be executed.<br />
It is important to note that we don't simply calculate new variables from sensor data: we need to create a new data set first, a context, an ‘execution program’ comprising a sequence of steps and conditions, and to fill it with initial values. If this context is missing, it's impossible to understand what's happening with the machine.</p></li>
<li><p>Having logical data on the program execution state, we can (again by creating a new, high-level data context) merge two different data streams from two different kinds of APIs into a single stream in the unified form of executing a beverage preparation program, with logical variables like recipe, volume, and readiness status.</p></li>
</ol>
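<p>The second step, turning raw sensor readings into discrete command statuses, could look roughly like this. The pairing of commands with sensors, the <code>target_ml</code> field, and the command status names are assumptions layered on top of the physical API described earlier.</p>

```python
# Which sensor tells us that a command has reached its target.
# This pairing is an assumption built on the sensor list shown earlier.
SENSOR_FOR_COMMAND = {
    "set_cup": "cup_volume",
    "grind_coffee": "ground_coffee_volume",
    "shed_water": "cup_filled_volume",
}


def advance(context: dict, sensors: dict) -> dict:
    """Apply one sensor reading (in ml) to the execution context."""
    commands = context["commands"]
    index = context["current"]
    if index >= len(commands):
        return context  # nothing left to execute
    command = commands[index]
    reading = sensors[SENSOR_FOR_COMMAND[command["type"]]]
    if reading >= command["target_ml"]:
        # Target reached: switch the synthetic status, start the next command
        command["status"] = "done"
        context["current"] = index + 1
        if context["current"] < len(commands):
            commands[context["current"]]["status"] = "executing"
    return context


# The context must exist before any reading makes sense
context = {"current": 0, "commands": [
    {"type": "set_cup", "target_ml": 300, "status": "executing"},
    {"type": "grind_coffee", "target_ml": 20, "status": "pending"},
    {"type": "shed_water", "target_ml": 200, "status": "pending"},
]}
advance(context, {"cup_volume": 300, "ground_coffee_volume": 5,
                  "cup_filled_volume": 0})
advance(context, {"cup_volume": 300, "ground_coffee_volume": 20,
                  "cup_filled_volume": 120})
statuses = [command["status"] for command in context["commands"]]
```

<p>The same reading of 120 ml in the cup means nothing by itself; it becomes ‘water is still being shed’ only against the target stored in the context.</p>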
<p>Each API abstraction level therefore corresponds to a generalization and enrichment of the data flow, converting low-level (and in fact useless to end users) context terms into higher-level context terms.</p>
<p>We may also traverse the tree backwards.</p>
<ol>
<li><p>At the order level we set its logical parameters: recipe, volume, execution place, and the set of possible statuses.</p></li>
<li><p>At the execution level we read order-level data and create a lower-level execution context: a program as a sequence of steps, their parameters, transition rules, and initial state.</p></li>
<li><p>At a runtime level we read target parameters (which operation to execute, what the target volume is) and translate them into coffee machine API microcommands and a status for each command.</p></li>
</ol>
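<p>Going in this top-down direction, the translation might be sketched as follows for a second-kind machine. The <code>RECIPE_STEPS</code> table with its percentages is invented for illustration; only the shape of the resulting <code>POST /functions</code> request bodies follows the physical API shown earlier.</p>

```python
# Hypothetical recipe table: each step and its share of the order volume,
# in percent. Real values would come from the program metadata.
RECIPE_STEPS = {
    "lungo": [("set_cup", 100), ("grind_coffee", 10), ("shed_water", 100)],
}


def build_program(order: dict) -> dict:
    """Execution level: order data becomes a sequence of steps."""
    if not order["volume"].endswith("ml"):
        raise ValueError("volume is expected to look like '800ml'")
    total_ml = int(order["volume"][:-2])
    commands = [
        {"sequence_id": i, "type": step, "volume_ml": total_ml * percent // 100}
        for i, (step, percent) in enumerate(RECIPE_STEPS[order["recipe"]])
    ]
    return {"commands": commands, "status": "pending"}


def to_function_calls(program: dict) -> list:
    """Runtime level: steps become POST /functions request bodies."""
    return [
        {"type": command["type"],
         "arguments": [{"name": "volume",
                        "value": f"{command['volume_ml']}ml"}]}
        for command in program["commands"]
    ]


order = {"recipe": "lungo", "volume": "800ml"}
calls = to_function_calls(build_program(order))
```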
<p>Also, if we take a look at the ‘bad’ decision discussed at the beginning of this chapter (forcing developers to determine the actual order status on their own), we can notice a data flow collision there:</p>
<ul>
<li>on the one hand, ‘leaked’ physical data (the beverage volume prepared) is injected into the order context, therefore mixing abstraction levels irreversibly;</li>
<li>on the other hand, the order context itself is deficient: it doesn't provide new meta-variables non-existent at the lower levels (the order status, in particular), doesn't initialize them, and doesn't provide the rules of the game.</li>
</ul>
<p>We will discuss data contexts in more detail in Section II. Here we will just state that data flows and their transformations can and must be examined as an API facet which, on the one hand, helps us separate abstraction levels properly and, on the other hand, lets us check whether our theoretical structures work as intended.</p><div class="page-break"></div></article>
</body></html>