Diff for "책/RESTfulWebAPIs"

Differences between revisions 42 and 43

Leonard Richardson, Mike Amundsen, Sam Ruby

Contents

Ch 1. Surfing the Web
Ch 2. A Simple API
Ch 3. Resources and Representations
Ch 4. Hypermedia
Chapter 5. Domain-Specific Designs

Ch 1. Surfing the Web

Ch 2. A Simple API

Ch 3. Resources and Representations

REST is not a protocol, a file format, or a development framework. It’s a set of design constraints: statelessness, hypermedia as the engine of application state, and so on. Collectively, we call these the Fielding constraints, because they were first identified in Roy T. Fielding’s 2000 dissertation on software architecture, which gathered them together under the name “REST.”

In this chapter, I’ll finish my explanation of the Fielding constraints in terms of the World Wide Web. My “bible,” as it were, will not be the Fielding dissertation. Instead, I’ll be drawing from the W3C’s guide to the Web, The Architecture of the World Wide Web, Volume One (there is no Volume Two). The Fielding dissertation explains the decisions behind the design of the Web, but Architecture explains the three technologies that came out of those decisions: URL, HTTP, and HTML.

A Resource Can Be Anything

A Representation Describes Resource State

A pomegranate can be an HTTP resource, but you can’t transmit a pomegranate over the Internet. A row in a database can be an HTTP resource; in fact, it can be an information resource, because you can literally send it over the Internet. But what would the client do with a chunk of binary data, ripped from an unknown database without any context?

When a client issues a GET request for a resource, the server should serve a document that captures the resource in a useful way. That’s a representation: a machine-readable explanation of the current state of a resource, such as the size and ripeness of the pomegranate or the data contained in the database fields.

The server might describe a database row as an XML document, a JSON object, a set of comma-separated values, or as the SQL INSERT statement used to create it. These are all legitimate representations; it depends on what the client asks for.
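As a rough illustration of the paragraph above, here is a minimal Python sketch that renders one database row in each of the four formats. The row contents, the table name books, the function names, and the media type strings used as dictionary keys are all assumptions made for this example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical row; the column names and values are made up for illustration.
ROW = {"id": 1, "title": "Example: A Novel", "price": "9.99"}


def as_json(row):
    return json.dumps(row, sort_keys=True)


def as_xml(row, tag="book"):
    root = ET.Element(tag)
    for column, value in row.items():
        ET.SubElement(root, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


def as_csv(row):
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(row.keys())    # header line
    writer.writerow(row.values())  # data line
    return buffer.getvalue()


def as_sql_insert(row, table="books"):
    # Naive quoting: fine for a demonstration, not for real SQL generation.
    columns = ", ".join(row)
    values = ", ".join("'" + str(v).replace("'", "''") + "'" for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


# Assumed mapping from requested media type to renderer.
RENDERERS = {
    "application/json": as_json,
    "application/xml": as_xml,
    "text/csv": as_csv,
    "application/sql": as_sql_insert,
}


def represent(row, media_type):
    """Pick a representation based on what the client asked for."""
    return RENDERERS[media_type](row)
```

The resource (the row) stays the same in every case; only the document describing it changes.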

One application might represent a pomegranate as an item for sale, using a custom XML vocabulary. Another might represent it with a binary image taken by a Pomegranate-Cam. It depends on the application. A representation can be any machine-readable document containing any information about a resource.

Representations Are Transferred Back and Forth

We think of representations as something the server sends to the client. That’s because when we surf the Web, most of our requests are GET requests. We’re asking for representations. But in a POST, PUT, or PATCH request, the client sends a representation to the server. The server’s job is then to change the resource state so it reflects the incoming representation.

The server sends a representation describing the state of a resource. The client sends a representation describing the state it would like the resource to have. That’s representational state transfer.
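The two directions of transfer can be sketched with a toy in-memory "server". This is only an illustration under stated assumptions: the path /pomegranates/1, the state fields, and the handler names are hypothetical, and no real HTTP is involved.

```python
import json

# Toy resource store; the path and the state fields are hypothetical.
RESOURCES = {"/pomegranates/1": {"size": "large", "ripeness": "unripe"}}


def handle_get(path):
    """Server to client: a representation of the resource's current state."""
    return json.dumps(RESOURCES[path], sort_keys=True)


def handle_put(path, body):
    """Client to server: a representation of the state the client wants."""
    RESOURCES[path] = json.loads(body)
    return handle_get(path)


# The client fetches a representation, edits it, and sends it back.
representation = json.loads(handle_get("/pomegranates/1"))
representation["ripeness"] = "ripe"
handle_put("/pomegranates/1", json.dumps(representation))
```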

The Protocol Semantics of HTTP

Overloaded POST

Those two strings are not much to work from. Until recently, application semantics were so poorly understood that I recommended not using overloaded POST at all. But if you follow the advice I give in Chapter 8, you can use a profile to reliably communicate application semantics to your clients. It won’t be as reliable as the protocol semantics—every HTTP client ever made knows what GET means—but you’ll be able to do it.

Since an overloaded POST request can do anything at all, the POST method is neither safe nor idempotent. One particular overloaded POST request may turn out to be safe, but as far as HTTP is concerned, POST is unsafe.
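One way a client might act on these guarantees is to keep a small lookup table of method properties and consult it before retrying a request. The sketch below is a simplification: the table lists the safe and idempotent flags as defined by the HTTP specifications, while the function names and the retry policy itself are assumptions for illustration.

```python
# method -> (safe, idempotent), following the HTTP method definitions.
METHOD_PROPERTIES = {
    "GET": (True, True),
    "HEAD": (True, True),
    "OPTIONS": (True, True),
    "PUT": (False, True),
    "DELETE": (False, True),
    "POST": (False, False),
    "PATCH": (False, False),
}


def is_safe(method):
    """A safe request is not supposed to change resource state."""
    return METHOD_PROPERTIES[method.upper()][0]


def can_retry_blindly(method):
    """Hypothetical policy: only repeat a request if the method is idempotent."""
    return METHOD_PROPERTIES[method.upper()][1]
```

Under this policy a client would never resend a POST automatically, even though one particular overloaded POST might happen to be harmless.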

Which Methods Should You Use?

If you want an API entirely described by HTML documents, then your protocol semantics are limited to GET and POST. If you want to speak to filesystem GUI applications like Microsoft’s Web Folders, you’ll be using HTTP plus the WebDAV extensions. If you need to talk to a wide variety of HTTP caches and proxies, you should stay away from PATCH and other methods not defined in RFC 2616.

Ch 4. Hypermedia

Look closer, and you’ll see a question that hasn’t been answered: how does the client know which requests it can make? There are infinitely many URLs. How does a client know which URLs have representations behind them and which ones will give a 404 error? Should the client send an entity-body with its POST request? If so, what should the entity-body look like? HTTP defines a set of protocol semantics, but which subset of those semantics does this web server support on this URL right now?

The missing piece of the puzzle is hypermedia. Hypermedia connects resources to each other, and describes their capabilities in machine-readable ways. Properly used, hypermedia can solve—or at least mitigate—the usability and stability problems found in today’s web APIs.

Like REST, hypermedia isn’t a single technology described by a standards document somewhere. Hypermedia is a strategy, implemented in different ways by dozens of technologies. I’ll cover several hypermedia standards in the next three chapters, and a whole lot more in Chapter 10. It’s up to you to choose the technologies that fit your business requirements.

The hypermedia strategy always has the same goal. Hypermedia is a way for the server to tell the client what HTTP requests the client might want to make in the future. It’s a menu, provided by the server, from which the client is free to choose. The server knows what might happen, but the client decides what actually happens.

In this chapter, I want to dispel the mystery of hypermedia, so you can create APIs that have some of the flexibility of the Web.

HTML as a Hypermedia Format

To sum up, the familiar HTML controls allow the server to describe four kinds of HTTP requests.

 * The <a> tag describes a GET request for one specific URL, which is made only if the user triggers the control.
 * The <img> tag describes a GET request for one specific URL, which happens automatically, in the background.
 * The <form> tag with method="POST" describes a POST request to one specific URL, with a custom entity-body constructed by the client. The request is only made if the user triggers the control.
 * The <form> tag with method="GET" describes a GET request to a custom URL constructed by the client. The request is only made if the user triggers the control.
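The four controls summarized above can be pulled out of a document mechanically. The following sketch uses Python's standard html.parser module to list the HTTP requests a page describes. The class name, the function name, and the "automatic" and "on user action" labels are inventions of this example, and the sketch ignores the form fields that would shape the entity-body or query string.

```python
from html.parser import HTMLParser


class ControlLister(HTMLParser):
    """Collects the HTTP requests described by <a>, <img>, and <form> tags."""

    def __init__(self):
        super().__init__()
        self.requests = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.requests.append(("GET", attrs["href"], "on user action"))
        elif tag == "img" and "src" in attrs:
            self.requests.append(("GET", attrs["src"], "automatic"))
        elif tag == "form":
            # HTML forms default to GET when no method attribute is given.
            method = (attrs.get("method") or "GET").upper()
            self.requests.append((method, attrs.get("action", ""), "on user action"))


def describe_requests(html):
    """Return (method, url, trigger) tuples for the controls found in html."""
    lister = ControlLister()
    lister.feed(html)
    return lister.requests
```

For a GET form, the URL recorded here is only the base: the client builds the final URL from the values the user enters.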

HTML also defines some more exotic hypermedia controls, and other data formats may define controls that are stranger still. All of them fall under the formal definition of hypermedia given in the Fielding dissertation:

Hypermedia is defined by the presence of application control information embedded within, or as a layer above, the presentation of information.

The World Wide Web is full of HTML documents, and the documents are full of things people like to read—prices, statistics, personal messages, prose, and poetry. But all of those things fall under presentation of information. In terms of presentation of information, the Web isn’t much different from a printed book.

It’s the application control information that distinguishes an HTML document from a book. I’m talking about the hypermedia controls that people interact with all the time, but rarely examine closely. These include the <img> tags that tell the browser to embed certain images, the <a> tags that transport the end user to another part of the Web, and the <script> tags that supply JavaScript for the browser to execute.

An HTML document that contains a poem will probably also feature a link to “Other poems by this author,” or a form that lets the reader “Rate this poem.” This is application control information that couldn’t show up in a printed book of poetry. The presence of application control information can certainly reduce the emotional impact of a poem, but an HTML document containing only the text of a poem is not a full participant in the Web. It’s just simulating a printed book.

URI Templates

RFC 6570, URI Template
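To give a feel for what a URI Template does, here is a minimal sketch of the simplest case in RFC 6570: Level 1 expansion, where each {name} expression is replaced by a percent-encoded value. It deliberately does not handle operators such as {?q} or {/path}, and its variable-name pattern is narrower than the RFC's grammar. The function name and the example templates are assumptions.

```python
import re
from urllib.parse import quote

# Matches only plain {name} expressions (no operators, no modifiers).
_VARIABLE = re.compile(r"\{([A-Za-z0-9_]+)\}")


def expand_simple(template, variables):
    """Expand {name} expressions (RFC 6570 Level 1 simple string expansion)."""

    def substitute(match):
        value = variables.get(match.group(1))
        # An undefined variable expands to the empty string.
        if value is None:
            return ""
        # Everything outside the unreserved set is percent-encoded.
        return quote(str(value), safe="")

    return _VARIABLE.sub(substitute, template)
```

For example, expanding the hypothetical template http://www.example.com/books/{isbn} with isbn set to 9781449358063 yields http://www.example.com/books/9781449358063.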

URI Versus URL

Most web APIs deal exclusively with URLs, so for most of this book, the distinction doesn’t matter. But when it’s important (as it will be in Chapter 12), it’s really important.

A URL is a short string used to identify a resource. A URI is also a short string used to identify a resource. Every URL is a URI. They’re described in the same standard: RFC 3986.

What’s the difference? As far as this book is concerned, the difference is this: there’s no guarantee that a URI has a representation. A URI is nothing but an identifier. A URL is an identifier that can be dereferenced. That is, a computer can somehow take a URL and get a representation of the underlying resource.

Here’s a URI that’s not a URL: urn:isbn:9781449358063. It designates a resource: the print edition of this book. Not any particular copy of this book, but the abstract concept of an entire edition. (Remember that a resource can be anything at all.) This URI is not a URL because… what’s the protocol? How would a computer get a representation? You can’t do it.

Without a URL, you can’t get a representation. Without representations, there can be no representational state transfer. A resource that’s not identified by a URL cannot fulfill many of the Fielding constraints. It can’t fulfill the self-descriptive message constraint, because it can’t send any messages. A representation can link to a URI that’s not a URL (<a href="urn:isbn:9781449358063">), but that won’t fulfill the hypermedia constraint, because a client can’t follow the link.
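In code, the distinction often comes down to whether the client knows a protocol for the identifier's scheme. The sketch below assumes a client that can only speak HTTP and HTTPS; the set of schemes and the function name are choices made for this example, not something the standard defines.

```python
from urllib.parse import urlparse

# Assumption: this particular client only knows how to dereference these schemes.
DEREFERENCEABLE_SCHEMES = {"http", "https"}


def can_dereference(uri):
    """True if this client has a way to fetch a representation for the URI."""
    return urlparse(uri).scheme.lower() in DEREFERENCEABLE_SCHEMES
```

With these assumptions, can_dereference("http://www.example.com/") is true, while can_dereference("urn:isbn:9781449358063") is false: the urn scheme names the resource but gives the client no protocol to use.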

The Link Header

The Link header has approximately the same functionality as an HTML <a> tag. I recommend you use real hypermedia formats whenever possible, but when that’s not an option, the Link header can be very useful.
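A client that wants to follow such links first has to pull the target URL and the relation out of the header value. The following is a simplified sketch, not a complete parser for the header grammar: it handles the common form of a URL in angle brackets followed by name=value parameters, and it does not cope with escaped quotes inside parameter values. The function name and the example URLs are assumptions.

```python
import re

# <target> followed by zero or more "; name=value" parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)')
_PARAM = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def parse_link_header(value):
    """Return one dict per link, with the target under 'href' plus its parameters."""
    links = []
    for target, params in _LINK.findall(value):
        link = {"href": target}
        for name, quoted, bare in _PARAM.findall(params):
            link[name.lower()] = quoted or bare
        links.append(link)
    return links
```

Each resulting dict carries roughly what an HTML <a> tag would: where the link points and, through rel, what following it means.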

What Hypermedia Is For

We need to take a step back and see what hypermedia is for.

Hypermedia controls have three jobs:

 * They tell the client how to construct an HTTP request: what HTTP method to use, what URL to use, what HTTP headers and/or entity-body to send.
 * They make promises about the HTTP response, suggesting the status code, the HTTP headers, and/or the data the server is likely to send in response to a request.
 * They suggest how the client should integrate the response into its workflow.

Beware of Fake Hypermedia!

There are a lot of existing APIs that were designed by people who understood the benefits of hypermedia, but that don’t technically contain any hypermedia. Imagine a bookstore API that serves a JSON representation like this:

HTTP/1.1 200 OK
Content-Type: application/json

{
 "title": "Example: A Novel",
 "description": "http://www.example.com/"
}

This is a representation of a book. The description field happens to look like a URL: http://www.example.com/. But is this a link? Is description supposed to link to a resource that gives the description? Or is it supposed to be a textual description, and some smart aleck typed in some text that happens to be a valid URL?

Formally speaking, "http://www.example.com/" is a string. The application/json media type doesn’t define any hypermedia controls, so even if some part of a representation really looks like a hypermedia link, it’s not! It’s just a string!

If you’re trying to consume an API like this, you won’t get very far dogmatically denying the existence of links. Instead, you’ll read some human-readable documentation written by the API provider. That documentation will explain the conventions the provider used to embed hypermedia links in a format (JSON) that doesn’t support hypermedia. Then you’ll know how to distinguish between links and strings, and you’ll be able to write a client that can detect and follow the hypermedia links.

But your client will only work for that specific API. The documentation you read is the documentation for a one-off fiat standard. The next API you use will have a different set of conventions for embedding hypermedia links in JSON, and you’ll have to do the work all over again.
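The problem shows up as soon as you write the client. In the sketch below, a standard JSON parser sees description as an ordinary string, and the only way to treat it as a link is to hard-code a rule taken from one provider's documentation. The LINK_FIELDS_FOR_THIS_API set and the find_links helper are hypothetical, invented to stand in for such a one-off convention.

```python
import json

BODY = '{"title": "Example: A Novel", "description": "http://www.example.com/"}'

# Hypothetical rule copied from one provider's human-readable documentation:
# "the description field holds a link". Nothing in application/json says so.
LINK_FIELDS_FOR_THIS_API = {"description"}


def find_links(document, link_fields):
    """Treat the named fields as links, purely by out-of-band convention."""
    return {
        name: value
        for name, value in document.items()
        if name in link_fields and isinstance(value, str)
    }


book = json.loads(BODY)
links = find_links(book, LINK_FIELDS_FOR_THIS_API)
```

Point the same client at a different API and the set of link fields has to be rewritten by hand, which is exactly the repeated work described above.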

That’s why API designers shouldn’t design APIs that serve plain JSON. You should use a media type that has real support for hypermedia. Your users will thank you. They’ll be able to use preexisting libraries written against the media type, rather than writing new ones specifically for your API.

JSON has been the most popular representation format for APIs for quite a while, but as recently as a couple years ago, there were no JSON-based hypermedia formats. As you’ll see in the next few chapters, that has changed. Don’t worry that you’ll have to give up JSON to gain real hypermedia.

The Semantic Challenge: How Are We Doing?

The application described by HTML is the World Wide Web, a very flexible application that’s used for all sorts of things.

A hypermedia format doesn’t have to be generic like HTML. It can be defined in enough detail to convey the application semantics of a wiki or a store. In the next chapter, I’ll talk about hypermedia formats that are designed to represent one specific type of problem. Outside that problem space, they’re practically useless. But within their limits, they meet the semantic challenge very well.

Chapter 5. Domain-Specific Designs

Maze+XML: A Domain-Specific Design

The media type of a Maze+XML document is application/vnd.amundsen.maze+xml. If you ever make an HTTP request and see that string used as the Content-Type of the response, you’ll know that you need the Maze+XML specification to fully understand the entity-body. This is how a domain-specific design meets the semantic challenge: by defining a document format that represents the problem (such as the layout of a maze), and by registering a media type for that format, so that a client knows right away when it’s encountered an instance of the problem.
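A client can make that check with a few lines of code. This sketch compares the Content-Type of a response against the Maze+XML media type, ignoring parameters such as charset and differences in letter case. The function names are assumptions; only the media type string comes from the text.

```python
MAZE_XML = "application/vnd.amundsen.maze+xml"


def media_type(content_type_header):
    """Return the bare media type, lowercased, without parameters."""
    return content_type_header.split(";", 1)[0].strip().lower()


def needs_maze_spec(content_type_header):
    """True if the entity-body should be read according to the Maze+XML spec."""
    return media_type(content_type_header) == MAZE_XML
```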

In general, I don’t recommend creating new domain-specific media types. It’s usually less work to add application semantics to a generic hypermedia format—a technique I’ll cover in the next two chapters. If you set out to do a domain-specific design, you’ll probably end up with a fiat standard that doesn’t take advantage of the work done by your predecessors. You probably won’t have the flexibility problems that plague most of today’s APIs, but you’ll have done more work for no real benefit.

But a domain-specific design is the average developer’s first instinct when designing an API. What could be more natural than simply solving the problem at hand? That’s why I’m covering domain-specific designs first. It’s easy to show how a custom hypermedia format can bridge the semantic gap.

How Maze+XML Works

Each cell in a Maze+XML maze is an HTTP resource with its own URL. If you send a GET request to the first cell in this maze, you’ll get a representation that looks like this:

<maze version="1.0">
 <cell href="/cells/M" rel="current">
  <title>The Entrance Hallway</title>
  <link rel="east" href="/cells/N"/>
  <link rel="west" href="/cells/L"/>
 </cell>
</maze>

A link relation is a magical string associated with a hypermedia control like Maze+XML’s <link> tag. It explains the change in application state (for safe requests) or resource state (for unsafe requests) that will happen if the client triggers the control. Link relations are formally defined in RFC 5988 (since obsoleted by RFC 8288), but the idea has been around for a long time, and nearly every hypermedia format supports them.
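To see how a client might use these link relations, here is a minimal sketch that parses the cell representation shown earlier and maps each rel value to an absolute URL. The base URL http://www.example.com/cells/M and the function name exits are assumptions; a real client would use the URL it actually requested.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

DOCUMENT = """<maze version="1.0">
 <cell href="/cells/M" rel="current">
  <title>The Entrance Hallway</title>
  <link rel="east" href="/cells/N"/>
  <link rel="west" href="/cells/L"/>
 </cell>
</maze>"""


def exits(document, base_url):
    """Map each link relation in the cell to the absolute URL it points at."""
    root = ET.fromstring(document)
    return {
        link.get("rel"): urljoin(base_url, link.get("href"))
        for link in root.iter("link")
    }


# Hypothetical base URL for the cell that was fetched.
available = exits(DOCUMENT, "http://www.example.com/cells/M")
```

The client never has to guess at URLs: to go east, it looks up the east relation and sends a GET to whatever URL the server supplied.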

RFC 5988 defines two kinds of link relations: registered relation types and extension relation types. Registered link relations look like the ones you see in the IANA registry: short strings like east and previous. To avoid conflicts, these short strings need to be registered somewhere—not necessarily with the IANA, but in some kind of standard such as the definition of a media type.

Chapter 9 includes a guide explaining when it’s OK to use the shorter names of registered relations. Here’s a summary:

 * You can use extension relations wherever you want.
 * You can use IANA-registered link relations whenever you want.
 * If a document’s media type defines some registered relations, you can use them within the document.
 * If a document includes a profile that defines some link relations (see Chapter 8), you can treat them as registered relations within that document.
 * Don’t give your link relations names that conflict with the names in the IANA registry.
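The first four rules in this summary can be expressed as a small check. The sketch below relies on one fact from RFC 5988, that extension relation types are URIs, and otherwise makes assumptions: IANA_SAMPLE is a deliberately incomplete handful of registered names, and the function and parameter names are invented. It does not enforce the last rule about avoiding name conflicts.

```python
from urllib.parse import urlparse

# A small, deliberately incomplete sample of names from the IANA registry.
IANA_SAMPLE = {"next", "previous", "self", "first", "last", "up"}


def is_extension_relation(rel):
    """Extension relation types are URIs, so they carry a scheme."""
    return bool(urlparse(rel).scheme)


def may_use(rel, media_type_relations=frozenset(), profile_relations=frozenset()):
    """Apply the summary: extension, IANA, media type, or profile relations are OK."""
    if is_extension_relation(rel):
        return True
    return (
        rel in IANA_SAMPLE
        or rel in media_type_relations
        or rel in profile_relations
    )
```

With these definitions, the short name east is only acceptable inside a document whose media type or profile defines it, whereas a hypothetical extension relation such as http://www.example.com/rels/east can appear anywhere.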

Revision 42 as of 2025-02-17 16:30:58 (size: 15896, editor: 정수) → Revision 43 as of 2025-02-17 16:31:26 (size: 16894, editor: 정수)