mirror of https://github.com/twirl/The-API-Book.git synced 2025-08-10 21:51:42 +02:00

proofreading"

Sergey Konstantinov
2023-05-24 02:24:44 +03:00
parent 3e6e9b7fba
commit fb978fbeb9
2 changed files with 73 additions and 74 deletions


@@ -335,7 +335,7 @@ As a useful exercise, try modeling the typical lifecycle of a partner's app's ma
 ##### No Results Is a Result
-If a server processed a request correctly and no exceptional situation occurred — there must be no error. Regretfully, the antipattern is widespread — of throwing errors when no results are found.
+If a server processes a request correctly and no exceptional situation occurs, there should be no error. Unfortunately, the antipattern of throwing errors when no results are found is widespread.
 **Bad**
 ```
@@ -351,7 +351,7 @@ POST /v1/coffee-machines/search
 }
 ```
-`4xx` statuses imply that a client made a mistake. But no mistakes were made by either the customer or the developer: a client cannot know whether the lungo is served in this location beforehand.
+The response implies that a client made a mistake. However, in this case, neither the customer nor the developer made any mistakes. The client cannot know beforehand whether lungo is served in this location.
 **Better**:
 ```
@@ -366,9 +366,9 @@ POST /v1/coffee-machines/search
 }
 ```
-This rule might be reduced to: if an array is the result of the operation, then the emptiness of that array is not a mistake, but a correct response. (Of course, if an empty array is acceptable semantically; an empty array of coordinates is a mistake for sure.)
+This rule can be summarized as follows: if an array is the result of the operation, then the emptiness of that array is not a mistake, but a correct response. (Of course, this applies if an empty array is semantically acceptable; an empty array of coordinates, for example, would be a mistake.)
-**NB**: this pattern should be applied in the opposite case as well. If an array of entities might be am optional parameter to the request, the empty array and the absence of the field must be treated differently. Let's take a look at the example:
+**NB**: this pattern should also be applied in the opposite case. If an array of entities is an optional parameter in the request, the empty array and the absence of the field must be treated differently. Let's consider the example:
 ```
 // Finds all coffee recipes
@@ -401,7 +401,7 @@ POST /v1/offers/search
 }
 ```
-Let's imagine that the first request returned an empty array of results, i.e., there are no known recipes that satisfy the condition. Of course, would be nice if the developer expected this situation and installed a guard that prohibits the call to the offer search function in this case — but we can't be 100% sure they did. If this logic is missing, the application will make the following call:
+Now let's imagine that the first request returned an empty array of results meaning there are no known recipes that satisfy the condition. Ideally, the developer would have expected this situation and installed a guard to prevent the call to the offer search function in this case. However, we can't be 100% sure they did. If this logic is missing, the application will make the following call:
 ```
 POST /v1/offers/search
@@ -411,13 +411,13 @@ POST /v1/offers/search
 }
 ```
-Often, the endpoint implementation ignores the empty recipe array and returns a list of offers just like no recipe filter was supplied. In our case, it means that the application seemingly ignores the user's request to show only milk-free beverages, which we can't consider acceptable behavior. Therefore, the response to such a request with an empty array parameter should be either an error or an empty result.
+Often, the endpoint implementation ignores the empty recipe array and returns a list of offers as if no recipe filter was supplied. In our case, it means that the application seemingly ignores the user's request to show only milk-free beverages, which we consider unacceptable behavior. Therefore, the response to such a request with an empty array parameter should either be an error or an empty result.
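
The distinction between an absent filter and an empty one can be sketched as follows. This is a minimal illustration, not the book's implementation: the `OFFERS` data and the `search_offers` function are hypothetical, and the sketch picks the "empty result" option (raising an error would be equally valid if the contract forbids empty arrays).

```python
from typing import Optional

# Hypothetical in-memory offers, for illustration only.
OFFERS = [
    {"id": "offer-1", "recipe": "lungo"},
    {"id": "offer-2", "recipe": "latte"},
]


def search_offers(recipes: Optional[list[str]] = None) -> list[dict]:
    """Return offers, treating a missing filter and an empty filter differently."""
    if recipes is None:
        # No filter was supplied at all: every offer matches.
        return list(OFFERS)
    # A filter was supplied, possibly empty: an empty list matches nothing.
    allowed = set(recipes)
    return [offer for offer in OFFERS if offer["recipe"] in allowed]
```

With this shape, a client that forgets to guard against an empty recipe list gets an empty list of offers back rather than the unfiltered catalogue.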
 ##### Validate Inputs
-The decision of which of the options to choose in the previous example, an exception or an empty response, directly depends on what's stated in the contract. If the specification prescribes that the `recipes` parameter must not be empty, an error shall be generated (otherwise you violate your own spec).
+The decision of whether to use an exception or an empty response in the previous example depends directly on what is stated in the contract. If the specification specifies that the `recipes` parameter must not be empty, an error should be generated (otherwise, you would violate your own spec).
-This rule applies not only to empty arrays but to every restriction stipulated in the contract. “Silent” fixing of invalid values rarely bears practical sense:
+This rule applies not only to empty arrays but to every restriction specified in the contract. “Silently” fixing invalid values rarely makes practical sense.
 **Bad**:
 ```
@@ -436,7 +436,7 @@ POST /v1/offers/search
 }
 ```
-As we can see, the developer somehow passed the wrong latitude value (100 degrees). Yes, we can “fix” it, e.g., reduce it to the closest valid value, which is 90 degrees, but who got benefitted from this? The developer will never learn about this mistake, and we doubt that Northern Pole coffee offers are relevant to users.
+As we can see, the developer somehow passed the wrong latitude value (100 degrees). Yes, we can “fix” it by reducing it to the closest valid value, which is 90 degrees, but who benefits from this? The developer will never learn about this mistake, and we doubt that coffee offers in the Northern Pole vicinity are relevant to users.
 **Better**:
 ```
@@ -453,7 +453,7 @@ POST /v1/coffee-machines/search
 }
 ```
-It is also useful to proactively notify partners about the behavior that looks like a mistake:
+It is also useful to proactively notify partners about behavior that appears to be a mistake:
 ```
 POST /v1/coffee-machines/search
@@ -479,7 +479,7 @@ POST /v1/coffee-machines/search
 }
 ```
-As it might happen that adding such notices is not possible, we might introduce the debug mode or strict mode, in which notices are escalated:
+If it is not possible to add such notices, we can introduce a debug mode or strict mode in which notices are escalated:
 ```
 POST /v1/coffee-machines/search⮠
@@ -511,7 +511,7 @@ POST /v1/coffee-machines/search⮠
 ##### Default Values Must Make Sense
-Setting default values is one of the most powerful tools that help in avoiding many-wordiness while working with APIs. However, these values must help developers, not hide their mistakes.
+Setting default values is one of the most powerful tools that help avoid verbosity when working with APIs. However, these values should help developers rather than hide their mistakes.
 **Bad**:
 ```
@@ -529,7 +529,7 @@ POST /v1/coffee-machines/search
 }
 ```
-Formally speaking, having such a behavior is feasible: why not have a “default geographical coordinates” concept? In the reality, however, such policies of “silent” fixing of mistakes lead to absurd situations like “the null island” — [the most visited place in the world](https://www.sciencealert.com/welcome-to-null-island-the-most-visited-place-that-doesn-t-exist). The more popular an API, the more chances partners just overlook these edge cases.
+Formally speaking, having such behavior is feasible: why not have a “default geographical coordinates” concept? However, in reality, such policies of “silently” fixing mistakes lead to absurd situations like “the null island” — [the most visited place in the world](https://www.sciencealert.com/welcome-to-null-island-the-most-visited-place-that-doesn-t-exist). The more popular an API becomes, the higher the chances that partners will overlook these edge cases.
 **Better**:
 ```
@@ -546,7 +546,7 @@ POST /v1/coffee-machines/search
 ##### Errors Must Be Informative
-It is not enough to just validate inputs; describing the cause of the error properly is also a must. While writing code developers face problems, many of them quite trivial, like invalid parameter types or some boundary violations. The more convenient the error responses your API return, the less the amount of time developers waste struggling with it, and the more comfortable working with the API.
+It is not enough to simply validate inputs; providing proper descriptions of errors is also essential. When developers write code, they encounter problems, sometimes quite trivial, such as invalid parameter types or boundary violations. The more convenient the error responses returned by your API, the less time developers will waste struggling with them, and the more comfortable working with the API will be for them.
 **Bad**:
 ```
@@ -561,7 +561,7 @@ POST /v1/coffee-machines/search
 → 400 Bad Request
 {}
 ```
-— of course, the mistakes (typo in the `"lngo"`, wrong coordinates) are obvious. But the handler checks them anyway, so why not return readable descriptions?
+— of course, the mistakes (typo in `"lngo"`, wrong coordinates) are obvious. But the handler checks them anyway, so why not return readable descriptions?
 **Better**:
 ```
@@ -597,7 +597,7 @@ POST /v1/coffee-machines/search
 }
 ```
-It is also a good practice to return all detectable errors at once to spare developers time.
+It is also a good practice to return all detectable errors at once to save developers time.
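
One possible way to report every detectable problem in a single response is to accumulate errors instead of failing on the first one. The sketch below is an illustration under assumptions: the `KNOWN_RECIPES` set, the field names, the `reason` codes, and the `validate_search_request` function are all hypothetical and are not taken from the API described in the book.

```python
# Hypothetical catalogue of recipes known to the server.
KNOWN_RECIPES = {"lungo", "espresso", "latte"}


def validate_search_request(body: dict) -> list[dict]:
    """Collect every detectable problem instead of stopping at the first one."""
    errors = []

    recipe = body.get("recipe")
    if recipe not in KNOWN_RECIPES:
        errors.append({
            "field": "recipe",
            "reason": "unknown_recipe",
            "message": f"Unknown recipe: {recipe!r}",
        })

    position = body.get("position")
    if not isinstance(position, dict):
        position = {}

    # Out-of-range values are rejected, not silently clamped.
    for name, limit in (("latitude", 90), ("longitude", 180)):
        value = position.get(name)
        if not isinstance(value, (int, float)) or not -limit <= value <= limit:
            errors.append({
                "field": f"position.{name}",
                "reason": "out_of_range",
                "message": f"'{name}' must be a number between {-limit} and {limit}",
            })

    return errors
```

An empty list means the request is valid; otherwise the whole list can be serialized into the body of a `400 Bad Request` response.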
 ##### Return Unresolvable Errors First
@@ -623,11 +623,11 @@ POST /v1/orders
"reason": "recipe_unknown" "reason": "recipe_unknown"
 }
 ```
-— what was the point of renewing the offer if the order cannot be created anyway? For the user, it will look like meaningless efforts (or meaningless waiting) that will anyway result in an error, whatever they do. Yes, maintaining errors priorities won't change the result — the order still cannot be created — but, first, users will spend less time (also, make fewer mistakes and contribute less to the error metrics) and, second, diagnostic logs for the problem will be much easier readable.
+— what was the point of renewing the offer if the order cannot be created anyway? For the user, it will look like meaningless efforts (or meaningless waiting) that will ultimately result in an error regardless of what they do. Yes, maintaining error priorities won't change the result — the order still cannot be created. However, first, users will spend less time (also make fewer mistakes and contribute less to the error metrics) and second, diagnostic logs for the problem will be much easier to read.
-##### Resolve Error Starting With Big Ones
+##### Prioritize Significant Errors
-If the errors under consideration are resolvable (i.e., the user might carry on some actions and still get what they need), you should first notify them of those errors that will require more significant state update.
+If the errors under consideration are resolvable (i.e., the user can take some actions and still get what they need), you should first notify them of those errors that will require more significant state updates.
 **Bad**:
 ```
@@ -680,7 +680,7 @@ In complex systems, it might happen that resolving one error leads to another on
 ```
 // Create an order
-// with a paid delivery
+// with paid delivery
 POST /v1/orders
 {
 "items": 3,
@@ -697,7 +697,7 @@ POST /v1/orders
"reason": "delivery_is_free" "reason": "delivery_is_free"
 }
 // Create an order
-// with a free delivery
+// with free delivery
 POST /v1/orders
 {
 "items": 3,
@@ -707,7 +707,7 @@ POST /v1/orders
"total": "9000.00" "total": "9000.00"
 }
 → 409 Conflict
-// Error: minimal order sum
+// Error: the minimal order sum
 // is 10000 tögrögs
 {
 "reason": "below_minimal_sum",
@@ -716,13 +716,13 @@ POST /v1/orders
 }
 ```
-You may note that in this setup the error can't be resolved in one step: this situation must be elaborated over, and either order calculation parameters must be changed (discounts should not be counted against the minimal order sum), or a special type of error must be introduced.
+You may note that in this setup the error can't be resolved in one step: this situation must be elaborated on, and either order calculation parameters must be changed (discounts should not be counted against the minimal order sum), or a special type of error must be introduced.
 ##### Specify Caching Policies and Lifespans of Resources
-In modern systems, clients usually have their own state and almost universally cache results of requests — no matter, session-wise or long-term, every entity has some period of autonomous existence. So it's highly desirable to make clarifications; it should be understandable how the data is supposed to be cached, if not from operation signatures, but at least from the documentation.
+In modern systems, clients usually have their own state and almost universally cache results of requests. Every entity has some period of autonomous existence, whether session-wise or long-term. So it's highly desirable to provide clarifications: it should be understandable how the data is supposed to be cached, if not from operation signatures, but at least from the documentation.
-Let's stress that we understand “cache” in the extended sense: which variation of operation parameters (not just the request time, but other variables as well) should be considered close enough to some previous request to use the cached result?
+Let's emphasize that we understand “cache” in the extended sense: which variations of operation parameters (not just the request time, but other variables as well) should be considered close enough to some previous request to use the cached result?
 **Bad**:
@@ -736,10 +736,10 @@ GET /price?recipe=lungo⮠
 { "currency_code", "price" }
 ```
 Two questions arise:
-* until when the price is valid?
+* Until when is the price valid?
-* in what vicinity of the location the price is valid?
+* In what vicinity of the location is the price valid?
-**Better**: you may use standard protocol capabilities to denote cache options, like the `Cache-Control` header. If you need caching in both temporal and spatial dimensions, you should do something like that:
+**Better**: you may use standard protocol capabilities to denote cache options, such as the `Cache-Control` header. If you need caching in both temporal and spatial dimensions, you should do something like this:
 ```
 GET /price?recipe=lungo⮠
@@ -754,8 +754,8 @@ GET /price?recipe=lungo⮠
"conditions": { "conditions": {
 // Until when the price is valid
 "valid_until",
-// What vicinity
+// In what vicinity
-// the price is valid within
+// the price is valid
 // * city
 // * geographical object
 // * …
@@ -767,17 +767,17 @@ GET /price?recipe=lungo⮠
 ##### Keep the Precision of Fractional Numbers Intact
-If the protocol allows, fractional numbers with fixed precision (like money sums) must be represented as a specially designed type like `Decimal` or its equivalent.
+If the protocol allows, fractional numbers with fixed precision (such as money sums) must be represented as a specially designed type like `Decimal` or its equivalent.
-If there is no `Decimal` type in the protocol (for instance, JSON doesn't have one), you should either use integers (e.g., apply a fixed multiplicator) or strings.
+If there is no `Decimal` type in the protocol (for instance, JSON doesn't have one), you should either use integers (e.g., apply a fixed multiplier) or strings.
-If conversion to a float number will certainly lead to losing the precision (let's say if we translate “20 minutes” into hours as a decimal fraction), it's better to either stick to a fully precise format (e.g., opt for `00:20` instead of `0.33333…`), or provide an SDK to work with this data, or as a last resort describe the rounding principles in the documentation.
+If converting to a float number will certainly lead to a loss of precision (for example, if we translate “20 minutes” into hours as a decimal fraction), it's better to either stick to a fully precise format (e.g., use `00:20` instead of `0.33333…`), or provide an SDK to work with this data. As a last resort, describe the rounding principles in the documentation.
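
To make the two JSON-friendly options concrete, here is a small sketch that carries a money sum either as a string or as an integer number of minor units. The helper names (`price_to_wire`, `price_from_wire`, `to_minor_units`) and the default multiplier of 100 are assumptions made for this example only.

```python
import json
from decimal import Decimal


def price_to_wire(amount: Decimal) -> str:
    """Serialize a fixed-precision amount as a plain decimal string."""
    return format(amount, "f")


def price_from_wire(text: str) -> Decimal:
    """Parse the string form back without going through a float."""
    return Decimal(text)


def to_minor_units(amount: Decimal, exponent: int = 2) -> int:
    """Apply a fixed multiplier (10 ** exponent) and return an integer."""
    scaled = amount.scaleb(exponent)
    if scaled != scaled.to_integral_value():
        raise ValueError("amount has more precision than the multiplier keeps")
    return int(scaled)


# The string survives a JSON round trip exactly.
payload = json.dumps({"total": price_to_wire(Decimal("9000.00"))})
restored = price_from_wire(json.loads(payload)["total"])
```

Both forms avoid binary floating point entirely, which is where sums such as `0.1 + 0.2` stop being equal to `0.3`.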
 ##### All API Operations Must Be Idempotent
-Let us remind the reader that idempotency is the following property: repeated calls to the same function with the same parameters won't change the resource state. Since we're discussing client-server interaction in the first place, repeating requests in case of network failure isn't an exception, but a norm of life.
+Let us remind the reader that idempotency is the following property: repeated calls to the same function with the same parameters won't change the resource state. Since we are primarily discussing client-server interaction, repeating requests in case of network failure is not something exceptional but a common occurrence.
-If the endpoint's idempotency can't be assured naturally, explicit idempotency parameters must be added, in a form of either a token or a resource version.
+If an endpoint's idempotency can not be naturally assured, explicit idempotency parameters must be added in the form of a token or a resource version.
 **Bad**:
 ```
@@ -793,9 +793,9 @@ POST /v1/orders
 X-Idempotency-Token: <random string>
 ```
-A client on its side must retain the `X-Idempotency-Token` in case of automated endpoint retrying. A server on its side must check whether an order created with this token exists.
+The client must retain the `X-Idempotency-Token` in case of automated endpoint retrying. The server must check whether an order created with this token already exists.
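
The server-side check might look roughly like the sketch below. It is an in-memory illustration, not a definitive implementation: the `OrderService` class, its method names, and the storage are hypothetical. Keying the lookup by user as well as by token is also an assumption of this sketch, made so that weak client-side token generators cannot collide across users.

```python
import uuid


class OrderService:
    """Hypothetical order store that deduplicates by idempotency token."""

    def __init__(self) -> None:
        # (user_id, token) -> previously created order
        self._orders_by_token: dict[tuple[str, str], dict] = {}

    def create_order(self, user_id: str, token: str, payload: dict) -> tuple[dict, bool]:
        """Return (order, created); a repeated token returns the stored order."""
        key = (user_id, token)
        existing = self._orders_by_token.get(key)
        if existing is not None:
            # The client is retrying: do not create a second order.
            return existing, False
        order = {"id": str(uuid.uuid4()), **payload}
        self._orders_by_token[key] = order
        return order, True
```

A retried request with the same token therefore gets the original order back, while a new token produces a new order.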
-**An alternative**:
+**Alternatively**:
 ```
 // Creates order draft
 POST /v1/orders/drafts
@@ -809,15 +809,13 @@ PUT /v1/orders/drafts⮠
 { "confirmed": true }
 ```
-Creating order drafts is a non-binding operation since it doesn't entail any consequences, so it's fine to create drafts without the idempotency token.
+Creating order drafts is a non-binding operation as it doesn't entail any consequences, so it's fine to create drafts without the idempotency token. Confirming drafts is a naturally idempotent operation, with the `draft_id` serving as its idempotency key.
-Confirming drafts is a naturally idempotent operation, with the `draft_id` being its idempotency key.
+It is also worth mentioning that adding idempotency tokens to naturally idempotent handlers is not meaningless. It allows distinguishing between two situations:
+* The client did not receive the response due to network issues and is now repeating the request.
+* The client made a mistake by posting conflicting requests.
-Also worth mentioning that adding idempotency tokens to naturally idempotent handlers isn't meaningless either, since it allows to distinguish two situations:
-* a client didn't get the response because of some network issues, and is now repeating the request;
-* a client made a mistake by posting conflicting requests.
-Consider the following example: imagine there is a shared resource, characterized by a revision number, and a client tries updating it.
+Consider the following example: imagine there is a shared resource, characterized by a revision number, and the client tries to update it.
 ```
 POST /resource/updates
@@ -827,11 +825,11 @@ POST /resource/updates
 }
 ```
-The server retrieves the actual resource revision and finds it to be 124. How to respond correctly? The `409 Conflict` code might be returned, but then the client will be forced to understand the nature of the conflict and somehow resolve it, potentially confusing the user. It's also unwise to fragment the conflict-resolving algorithm, allowing each client to implement it independently.
+The server retrieves the actual resource revision and finds it to be 124. How should it respond correctly? Returning the `409 Conflict` code will force the client to try to understand the nature of the conflict and somehow resolve it, potentially confusing the user. It is also unwise to fragment the conflict-resolving algorithm and allow each client to implement it independently.
-The server may compare request bodies, assuming that identical `updates` values mean retrying, but this assumption might be dangerously wrong (for example if the resource is a counter of some kind, then repeating identical requests are routine).
+The server can compare request bodies, assuming that identical requests mean retrying. However, this assumption might be dangerously wrong (for example if the resource is a counter of some kind, repeating identical requests is routine).
-Adding the idempotency token (either directly as a random string, or indirectly in a form of drafts) solves this problem.
+Adding the idempotency token (either directly as a random string or indirectly in the form of drafts) solves this problem.
 ```
 POST /resource/updates
@@ -843,7 +841,7 @@ X-Idempotency-Token: <token>
 → 201 Created
 ```
-— the server found out that the same token was used in creating revision 124, which means the client is retrying the request.
+— the server determined that the same token was used in creating revision 124 indicating the client is retrying the request.
 Or:
@@ -857,31 +855,31 @@ X-Idempotency-Token: <token>
 → 409 Conflict
 ```
-— the server found out that a different token was used in creating revision 124, which means an access conflict.
+— the server determined that a different token was used in creating revision 124 indicating an access conflict.
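
The two outcomes above can be sketched in code by remembering which token produced each revision. This is only a sketch under assumptions: the `SharedResource` class, the `base_revision` parameter, and the `Conflict` exception are hypothetical names, and actually applying the update payload is omitted.

```python
class Conflict(Exception):
    """Raised when the update is based on a stale revision (maps to 409)."""


class SharedResource:
    """Hypothetical revisioned resource that tracks the token of each revision."""

    def __init__(self, revision: int) -> None:
        self.revision = revision
        # revision number -> idempotency token that created it
        self._token_by_revision: dict[int, str] = {}

    def apply_update(self, base_revision: int, token: str) -> int:
        """Return the revision created by this update, or raise Conflict."""
        if base_revision == self.revision:
            # Normal case: the client saw the latest revision.
            self.revision += 1
            self._token_by_revision[self.revision] = token
            return self.revision
        if self._token_by_revision.get(base_revision + 1) == token:
            # Same token already created the next revision: this is a retry.
            return base_revision + 1
        # A different request created the next revision: access conflict.
        raise Conflict(f"revision {base_revision} is outdated")
```

Starting from revision 123, the first call creates revision 124, a repeated call with the same token returns 124 again, and a call with a different token against revision 123 raises `Conflict`.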
-Furthermore, adding idempotency tokens not only resolves the issue but also makes advanced optimizations possible. If the server detects an access conflict, it could try to resolve it, “rebasing” the update like modern version control systems do, and return a `200 OK` instead of a `409 Conflict`. This logic dramatically improves user experience, being fully backward-compatible, and helps to avoid conflict-resolving code fragmentation.
+Furthermore, adding idempotency tokens not only fixes the issue but also enables advanced optimizations. If the server detects an access conflict, it could attempt to resolve it by “rebasing” the update like modern version control systems do, and return a `200 OK` instead of a `409 Conflict`. This logic dramatically improves the user experience, being fully backward-compatible, and helps avoid code fragmentation for conflict resolution algorithms.
-Also, be warned: clients are bad at implementing idempotency tokens. Two problems are common:
+However, be warned: clients are bad at implementing idempotency tokens. Two common problems arise:
-* you can't really expect clients generate truly random tokens — they may share the same seed or simply use weak algorithms or entropy sources; therefore you must put constraints on token checking: token must be unique to a specific user and resource, not globally;
+* You can't really expect clients to generate truly random tokens. They might share the same seed or simply use weak algorithms or entropy sources. Therefore constraints must be placed on token checking, ensuring that tokens are unique to the specific user and resource rather than globally.
-* client developers might misunderstand the concept and either generate new tokens each time they repeat the request (which deteriorates the UX, but otherwise healthy) or conversely use one token in several requests (not healthy at all and could lead to catastrophic disasters; another reason to implement the suggestion in the previous clause); writing detailed doc and/or client library is highly recommended.
+* Client developers might misunderstand the concept and either generate new tokens for each repeated request (which degrades the UX but is otherwise harmless) or conversely use a single token in several requests (which is not harmless at all and could lead to catastrophic disasters; this is another reason to implement the suggestion in the previous clause). Writing an SDK and/or detailed documentation is highly recommended.
 ##### Don't Invent Security Practices
-If the author of this book was given a dollar each time he had to implement the additional security protocol invented by someone, he would be already retired. The API developers' passion for signing request parameters or introducing complex schemes of exchanging passwords for tokens is as obvious as meaningless.
+If the author of this book were given a dollar each time he had to implement an additional security protocol invented by someone, he would be retired by now. API developers' inclination to create new signing procedures for requests or complex schemes of exchanging passwords for tokens is both obvious and meaningless.
-**First**, almost all security-enhancing procedures for every kind of operation *are already invented*. There is no need to re-think them anew; just take the existing approach and implement it. No self-invented algorithm for request signature checking provides the same level of preventing the [Man-in-the-Middle attack](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) as a TLS connection with mutual certificate pinning.
+**First**, there is no need to reinvent the wheel when it comes to security-enhancing procedures for various operations. All the algorithms you need are already invented, just adopt and implement them. No self-invented algorithm for request signature checking can provide the same level of protection against a [Man-in-the-Middle attack](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) as a TLS connection with mutual certificate pinning.
-**Second**, it's quite presumptuous (and dangerous) to assume you're an expert in security. New attack vectors come every day, and being aware of all the actual threats is a full-day job. If you do something different during workdays, the security system designed by you will contain vulnerabilities that you have never heard about — for example, your password-checking algorithm might be susceptible to the [timing attack](https://en.wikipedia.org/wiki/Timing_attack), and your webserver, to the [request splitting attack](https://capec.mitre.org/data/definitions/105.html).
+**Second**, assuming oneself to be an expert in security is presumptuous and dangerous. New attack vectors emerge daily, and staying fully aware of all actual threats is a full-time job. If you do something different during workdays, the security system you design will contain vulnerabilities that you have never heard about — for example, your password-checking algorithm might be susceptible to a [timing attack](https://en.wikipedia.org/wiki/Timing_attack) or your webserver might be vulnerable to a [request splitting attack](https://capec.mitre.org/data/definitions/105.html).
-Just in case: any APIs must be provided over TLS 1.2 or higher (better 1.3).
+Just in case: all APIs must be provided over TLS 1.2 or higher (preferably 1.3).
 ##### Help Partners With Security
-It is equally important to provide such interfaces to partners that would minimize possible security problems for them.
+It is equally important to provide interfaces to partners that minimize potential security problems for them.
 **Bad**:
 ```
-// Allows partners for setting
+// Allows partners to set
 // descriptions for their beverages
 PUT /v1/partner-api/{partner-id}⮠
 /recipes/lungo/info
@@ -895,13 +893,13 @@ GET /v1/partner-api/{partner-id}⮠
"<script>alert(document.cookie)</script>" "<script>alert(document.cookie)</script>"
 ```
-Such an interface directly creates a stored XSS that potential attackers might exploit. While it's the responsibility of partners to sanitize inputs and safely display them, the big numbers are working against you: there always be inexperienced developers who are unaware of this vulnerability or haven't thought about it. In the worst case, this stored XSS might affect all the API consumers, not just a specific partner.
+Such an interface directly creates a stored XSS vulnerability that potential attackers might exploit. While it is the partners' responsibility to sanitize inputs and display them safely, the large numbers work against you: there will always be inexperienced developers who are unaware of this vulnerability or haven't considered it. In the worst case, this stored XSS might affect all API consumers, not just a specific partner.
-In these situations, first, we recommend sanitizing the data if it looks potentially exploitable (e.g., meant to be displayed in the UI and/or accessible by a direct link), and second, limit the blast radius so that stored exploits in one partner's data space can't affect other partners. If the functionality of unsafe data input is still required, the risks must be explicitly addressed:
+In these situations, we recommend, first, sanitizing the data if it appears potentially exploitable (e.g. if it is meant to be displayed in the UI and/or is accessible through a direct link). Second, limiting the blast radius so that stored exploits in one partner's data space can't affect other partners. If the functionality of unsafe data input is still required, the risks must be explicitly addressed:
 **Better** (though not perfect):
 ```
-// Allows for setting potentially
+// Allows for setting a potentially
 // unsafe description for a beverage
 PUT /v1/partner-api/{partner-id}⮠
 /recipes/lungo/info
@@ -918,7 +916,7 @@ X-Dangerously-Allow-Raw-Value: true
"<script>alert(document.cookie)</script>" "<script>alert(document.cookie)</script>"
 ```
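
On the server side, the "sanitize by default, allow raw values only on explicit request" policy could be sketched as follows. The `store_description` function and its `allow_raw` flag are hypothetical; the flag stands in for an explicit opt-in such as the `X-Dangerously-Allow-Raw-Value` header, and HTML escaping is only one possible sanitization strategy.

```python
import html


def store_description(text: str, allow_raw: bool = False) -> str:
    """Return the value to persist for a partner-supplied description."""
    if allow_raw:
        # The partner explicitly opted in to storing the raw value.
        return text
    # Default: neutralize markup so the value is safe to embed in HTML.
    return html.escape(text)
```

With the default settings, a `<script>` payload is stored as inert text, and the raw form is kept only when the caller explicitly asks for it.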
-One important finding is that if you allow executing scripts via API, always prefer typed input to unsafe input:
+One important finding is that if you allow executing scripts via the API, always prefer typed input over unsafe input:
 **Bad**:
 ```
@@ -951,38 +949,38 @@ In the second case, you will be able to sanitize parameters and avoid SQL inject
It's considered good practice to use globally unique strings as entity identifiers, either semantic (e.g., "lungo" for beverage types) or random ones (e.g., [UUID-4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random))). This can prove extremely useful if you need to merge data from several sources under a single identifier.
In general, we tend to advise using URN-like identifiers, e.g., `urn:order:<uuid>` (or just `order:<uuid>`). That helps a lot in dealing with legacy systems with different identifiers attached to the same entity. Namespaces in URNs help to quickly understand which identifier is used and whether there is a usage mistake.
One important implication: **never use increasing numbers as external identifiers**. Apart from the above-mentioned reasons, doing so allows anyone to count how many entities of each type there are in the system. Your competitors will be able to calculate the precise number of orders you have each day, for example.
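The identifier scheme above can be sketched in a few lines of Python. This is only an illustration: the `KNOWN_NAMESPACES` set and the `make_id` / `parse_id` helpers are hypothetical names, not part of any API described in this book.

```python
import uuid

# Hypothetical set of entity namespaces; the names are illustrative only.
KNOWN_NAMESPACES = {"order", "recipe", "user"}


def make_id(namespace: str) -> str:
    """Build a URN-like identifier such as 'urn:order:<uuid>'."""
    if namespace not in KNOWN_NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    # UUID-4 is random, so the identifier reveals nothing about entity counts.
    return f"urn:{namespace}:{uuid.uuid4()}"


def parse_id(identifier: str, expected_namespace: str) -> uuid.UUID:
    """Return the UUID part, rejecting identifiers of the wrong entity type."""
    parts = identifier.split(":")
    if len(parts) != 3 or parts[0] != "urn":
        raise ValueError(f"not a URN-like identifier: {identifier!r}")
    _, namespace, raw_uuid = parts
    if namespace != expected_namespace:
        raise ValueError(
            f"expected a {expected_namespace!r} identifier, got {namespace!r}"
        )
    return uuid.UUID(raw_uuid)
```

Because the namespace travels with the identifier, passing an order identifier where a recipe identifier is expected fails immediately instead of silently matching the wrong entity.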
##### Stipulate Future Restrictions
With the growth of API popularity, it will inevitably become necessary to introduce technical means of preventing illicit API usage, such as displaying captchas, setting honeypots, raising “too many requests” exceptions, installing anti-DDoS proxies, etc. None of this can be done if the corresponding errors and messages were not described in the docs from the very beginning.
You are not obliged to actually generate those exceptions, but you might stipulate this possibility in the docs. For example, you might describe the `429 Too Many Requests` error or captcha redirect but implement the functionality only when it's actually needed.
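To illustrate why documenting the error early matters, here is a minimal sketch of client-side code that is already prepared for a `429 Too Many Requests` response, even if the server never sends one today. The `retry_delay` function and its `default_delay` parameter are assumptions made for this example. The sketch also assumes that header names arrive in their canonical spelling and that the server uses the standard `Retry-After` header, which may carry either a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    now: Optional[datetime] = None,
    default_delay: float = 1.0,
) -> Optional[float]:
    """Return how many seconds to wait before retrying, or None if no retry is needed."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    if value is None:
        return default_delay
    if value.strip().isdigit():
        # Retry-After given as a number of seconds.
        return float(value)
    try:
        # Retry-After given as an HTTP date.
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default_delay
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

Partners can only write this kind of handler in advance if the error is stipulated in the docs; otherwise introducing rate limiting later breaks existing integrations.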
It is extremely important to leave room for multi-factor authentication (such as TOTP, SMS, or 3D-secure-like technologies) if it's possible to make payments through the API. In this case, it's a must-have from the very beginning.
**NB**: this rule has an important implication: **always separate endpoints for different API families**. (This may seem obvious, but many API developers fail to follow it.) If you provide a server-to-server API, a service for end users, and a widget to be embedded in third-party apps, all these APIs must be served from different endpoints to allow for different security measures (e.g., mandatory API keys, forced login, and solving captcha respectively).
##### No Bulk Access to Sensitive Data
If it's possible to access the API users' personal data, bank card numbers, private messages, or any other kind of information that, if exposed, might seriously harm users, partners, and/or the API vendor, there must be *no* methods for bulk retrieval of the data, or at least there must be rate limiters, page size restrictions, and ideally, multi-factor authentication in front of them.
Often, making such offloads on an ad-hoc basis, i.e., bypassing the API, is a reasonable practice.
##### Localization and Internationalization
All endpoints must accept language parameters (e.g., in the form of the `Accept-Language` header), even if they are not currently being used.
It is important to understand that the user's language and the user's jurisdiction are different things. Your API working cycle must always store the user's location. It might be stated either explicitly (requests contain geographical coordinates) or implicitly (an initial location-bound request initiates session creation, which stores the location), but no correct localization is possible in the absence of location data. In most cases, reducing the location to just a country code is enough.
The thing is that lots of parameters that potentially affect data formats depend not on language but on the user's location. To name a few: number formatting (integer and fractional part delimiter, digit groups delimiter), date formatting, the first day of the week, keyboard layout, measurement units system (which might be non-decimal!), etc. In some situations, you need to store two locations: the user's residence location and the user's “viewport.” For example, if a US citizen is planning a European trip, it's convenient to show prices in the local currency but measure distances in miles and feet.
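The split between language, residence, and “viewport” can be sketched as follows. Everything here is illustrative: the `UserContext` class, the lookup tables, and both helper functions are hypothetical, and the tables are deliberately incomplete.

```python
from dataclasses import dataclass

# Illustrative, deliberately incomplete lookup tables.
CURRENCY_BY_COUNTRY = {"US": "USD", "FR": "EUR", "GB": "GBP"}
IMPERIAL_COUNTRIES = {"US"}


@dataclass(frozen=True)
class UserContext:
    language: str   # what the user reads, e.g. taken from Accept-Language
    residence: str  # country code of where the user lives
    viewport: str   # country code of the area the user is looking at


def price_currency(ctx: UserContext) -> str:
    """Prices follow the viewport: a trip to France is paid in euros."""
    return CURRENCY_BY_COUNTRY[ctx.viewport]


def format_distance(ctx: UserContext, meters: float) -> str:
    """Units follow the residence: a US resident thinks in miles."""
    if ctx.residence in IMPERIAL_COUNTRIES:
        return f"{meters / 1609.344:.1f} mi"
    return f"{meters / 1000:.1f} km"
```

For a context such as `UserContext(language="en", residence="US", viewport="FR")`, prices come out in euros while distances stay in miles, and neither decision is derived from the language.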
Sometimes explicit location passing is not enough since there are lots of territorial conflicts in the world. How the API should behave when user coordinates lie within disputed regions is, regrettably, a legal matter. The author of this book once had to implement a “state A territory according to state B official position” concept.
**Important**: distinguish between localization for end users and localization for developers. In the examples above, the `localized_message` field is meant for the user; the app should show it if no specific handler for this error exists in the client code. This message must be written in the user's language and formatted according to the user's location. But the `details.checks_failed[].message` is meant to be read by developers examining the problem. So it must be written and formatted in a manner that suits developers best, which usually means “in English,” as English is the *de facto* standard in software development.
It is worth mentioning that the `localized_` prefix in the examples is used to differentiate messages to users from messages to developers. A concept like that must be, of course, explicitly stated in your API docs.
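On the client side, the two kinds of messages might be routed as in the sketch below. The `localized_message` and `details.checks_failed[].message` fields come from the examples above; the `reason` field, the `handle_error` function, and the handler registry are assumptions made for illustration.

```python
import logging
from typing import Any, Callable, Mapping

logger = logging.getLogger("api_client")

ErrorHandler = Callable[[Mapping[str, Any]], str]


def handle_error(
    error: Mapping[str, Any],
    handlers: Mapping[str, ErrorHandler],
) -> str:
    """Return the text to show to the end user for an API error."""
    # Developer-facing messages go to the log, never to the UI.
    for check in error.get("details", {}).get("checks_failed", []):
        logger.warning("check failed: %s", check.get("message"))
    # "reason" is a hypothetical machine-readable error code.
    handler = handlers.get(error.get("reason", ""))
    if handler is not None:
        return handler(error)
    # No specific handler: fall back to the server-provided user message.
    return error["localized_message"]
```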
And one more thing: all strings must be UTF-8, no exceptions.

@@ -190,6 +190,7 @@ POST /v1/bulk-status-change
1. If you can do without such endpoints, do so. In server-to-server integrations, the savings are negligible, and in modern networks that support the [QUIC](https://datatracker.ietf.org/doc/html/rfc9000) protocol and request multiplexing, they are also quite doubtful.
2. If such an endpoint is still needed, it's better to implement it atomically and provide SDKs that help partners avoid typical mistakes.
3. If implementing an atomic endpoint is impossible, design the API carefully to prevent mistakes like those described above.
4. Regardless of the chosen approach, server responses must include a breakdown by sub-request. For atomic endpoints, this means including in the response a list of the errors that caused the request to fail, ideally with all potential errors (i.e., with the results of validating each sub-request). For non-atomic endpoints, the response must return a list with the status of each sub-request and all the errors that occurred.
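A per-sub-request breakdown might look like the following sketch. The response field names (`status`, `errors`, `results`, `index`) and both functions are hypothetical. The "atomic" variant only shows the shape of the response: real atomicity would need a transaction around the apply step.

```python
from typing import Any, Callable, Dict, List, Optional

SubRequest = Dict[str, Any]
Validator = Callable[[SubRequest], Optional[str]]  # returns an error text or None
Applier = Callable[[SubRequest], None]


def atomic_bulk(
    subs: List[SubRequest], validate: Validator, apply: Applier
) -> Dict[str, Any]:
    """All-or-nothing: report every validation error, apply only if there are none."""
    errors = []
    for index, sub in enumerate(subs):
        problem = validate(sub)
        if problem is not None:
            errors.append({"index": index, "error": problem})
    if errors:
        return {"status": "failed", "errors": errors}
    for sub in subs:
        apply(sub)
    return {"status": "success", "errors": []}


def non_atomic_bulk(
    subs: List[SubRequest], validate: Validator, apply: Applier
) -> Dict[str, Any]:
    """Best effort: apply what is valid and report a status for each sub-request."""
    results = []
    for index, sub in enumerate(subs):
        problem = validate(sub)
        if problem is None:
            apply(sub)
            results.append({"index": index, "status": "success"})
        else:
            results.append({"index": index, "status": "failed", "error": problem})
    return {"results": results}
```

In both variants the client can tell exactly which sub-request caused a problem, which is the point of the last recommendation above.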
One approach that helps minimize possible problems is to develop a mixed endpoint in which potentially interdependent operations are grouped, for example, like this: