Working with HAL in PUT requests
At my new company, we’re developing a REST API. We’re trying to strike a balance between ‘easy to use’ and sticking to the rules of REST, and that’s given us more than a few unforeseen benefits. When you work within a framework that a lot of people have spent time thinking about, there are a lot of answers if you know where to look. For this API, we’ve decided to use HAL as the primary format.
HAL really is just a document format for a hypermedia API, like HTML is for hypertext. It doesn’t tell you how to express your domain model, and doesn’t really tell you how to use HAL to submit changes.
Expressing relationships
The way a relationship should be expressed in HAL is using a link. For example, every user might be part of a team, so when I GET a user, I might receive something like this:
{
"firstName" : "Evert",
"lastName" : "Pot",
"_links" : {
"self" : { "href" : "/team/5/user/4234" },
"team" : { "href" : "/team/5" }
}
}
Here we see the self link, which is the URI for this user, and the team it belongs to. This works pretty well. We made a really strong effort to absolutely never expose database ids anywhere in our documents, as the URI is ultimately the real identifier, and we don’t want clients to start composing these URLs on their own. We also don’t want to create two types of unique identifiers (database ids and URIs) and force users to have to think about which to use in which situation.
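For a client, this means looking links up by relation name instead of building URLs. A minimal sketch in Python, using the user document above; the `link_href` helper is a hypothetical name, not part of any HAL library:

```python
import json


def link_href(doc, rel):
    """Return the href of the first link with relation `rel`, or None."""
    link = doc.get("_links", {}).get(rel)
    if link is None:
        return None
    # HAL allows a relation to hold one link object or an array of them.
    if isinstance(link, list):
        return link[0]["href"] if link else None
    return link["href"]


user = json.loads("""
{
  "firstName": "Evert",
  "lastName": "Pot",
  "_links": {
    "self": {"href": "/team/5/user/4234"},
    "team": {"href": "/team/5"}
  }
}
""")

# The client follows the link it was given; it never assembles "/team/" + id.
team_uri = link_href(user, "team")
```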
Adding a new user
In this example, adding a new user to a team is fairly simple. Since the user’s relation to the team is a sort of ‘belongsTo’ relationship, a new user could be added using a request such as this:
POST /team/5/user HTTP/1.1
Content-Type: application/vnd.foo-bar.hal+json
{
"firstName" : "Roxy",
"lastName" : "Kesh"
}
Since the target ‘collection’ is /team/5/user, we can infer that the team for this user will be /team/5.
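On the server side, that inference can be as small as dropping the last path segment of the collection URI. A rough Python sketch; `parent_of_collection` is a made-up helper, and a real server would more likely take the team from its router:

```python
def parent_of_collection(collection_uri):
    """Return the URI that owns a collection by dropping its last path segment."""
    parent, _, _ = collection_uri.rstrip("/").rpartition("/")
    return parent


# The new user gets its team link from where it was POSTed, not from the body.
new_user_links = {"team": {"href": parent_of_collection("/team/5/user")}}
```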
It turns out that most of our relationships actually follow that model. Lots of them are basically a ‘1 to many’ relationship. It’s not always possible to follow this model though.
Expressing relations in a PUT request
I have another fictional example that’s somewhat similar to our real-life problem. Say we have a list of blog posts. They all need to be in one category.
For reasons I won’t go into, it did not make sense to have a structure such as:
/category/personal/post/5
So we have 2 distinct URL structures. Our categories might look a bit like this:
/category/personal
/category/animals
/category/vomit
And the response to a GET request to a blog post might look like this:
{
"title": "Why I ran away",
"date" : "2016-12-14T20:43:23Z",
"contents" : "...",
"_links": {
"self" : { "href" : "/post/5" },
"category" : { "href" : "/category/animals" }
}
}
Pretty simple. There’s a blog post, and it expresses via a category relation type which category it’s in. But now we want to change the category with PUT.
There’s not that much information out there from people who do this.
On the HAL Primer page for PhlyRestfully, the following is straight-up mentioned:
If POST-ing, PUT-ting, PATCH-ing, or DELETE-ing a resource, you will usually use a Content-Type header of either application/json, or some vendor-specific mediatype you define for your API; this mediatype would be used to describe the particular structure of your resources without any HAL “_links”. Any “_embedded” resources will typically be described as properties of the resource, and point to the mediatype relevant to the embedded resource.
This is one of the top hits for this Google search and pretty much implies that a HAL document (with _links) is only meant to be returned from a GET request and not sent along with a PUT: two different media types depending on which direction the data flows.
They can get away with it though, because they express relationships as both ids and links, which I definitely believe is the wrong way to go about it. So when PhlyRestfully updates a resource, it follows a bit of an odd convention. If I followed it, my PUT request should look like this:
{
"title": "Why I ran away",
"date" : "2016-12-14T20:43:23Z",
"contents" : "...",
"category" : { "id" : "animals" }
}
When I asked the author about this, part of his answer was:
@evertp I see HAL more as a response format, not a request format. But there's nothing saying you can't use it in either direction.
— weierophinney (@mwop) September 26, 2016
Anyway, this was a bit unsatisfying. Not only because it meant introducing the id everywhere, but I also really want clients to be able to just do a GET request, make minimal modifications to the document and use the exact same format for PUT.
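What I’m after is a round trip along these lines. This is only a sketch in Python, with the actual GET and PUT calls left out; `with_link` is a hypothetical helper and the document is the blog post from earlier:

```python
import copy


def with_link(doc, rel, href):
    """Return a copy of a HAL document with one link relation pointed at `href`."""
    updated = copy.deepcopy(doc)
    updated.setdefault("_links", {})[rel] = {"href": href}
    return updated


# The document exactly as a GET on /post/5 returned it.
post = {
    "title": "Why I ran away",
    "date": "2016-12-14T20:43:23Z",
    "contents": "...",
    "_links": {
        "self": {"href": "/post/5"},
        "category": {"href": "/category/animals"},
    },
}

# One small change, and the result is what would be sent back in the PUT.
put_body = with_link(post, "category", "/category/personal")
```

Everything except the category link is untouched, so the client never has to learn a second format or an id.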
When reading the HAL mailing list, it certainly does seem that many just provide links in the PUT request. Here’s a good example:
Here the poster asks whether he should use _embedded in PUT requests. Having the _links there seems like a given.
However, looking again at a whole bunch of the public HAL APIs that exist, most of them completely ignore the notion of using _links as a real relationship and either use id properties as well, or just provide _links separately, completely redundantly.
Here’s a bunch:
Apparently both Amazon and Comcast also use HAL, but I had trouble finding their API documentation.
Our decision
We’re gonna stick to our guns and let clients create new relationships using a _links property in a PUT request.
If you are creating a new blog post, and want it to be filed under a certain category, this is what it looks like:
POST /blog/ HTTP/1.1
Content-Type: application/vnd.blog.hal+json
{
"title": "Why I came back",
"date" : "2016-01-15T08:44:23Z",
"contents" : "...",
"_links": {
"category" : { "href" : "/category/animals" }
}
}
_links is optional
Because _links are almost always server-controlled, with a few exceptions, and might be a bit confusing for new users, we decided that we’re going to make specifying the _links in PUT requests optional. For the most part they are ‘meta data’ and not part of the core resource representation, and we want to keep it somewhat simple.
This goes somewhat against the rules if you follow HTTP strictly. After all, a PUT request should completely replace the target resource. This is definitely something I choose to not strictly follow though. It rarely makes real sense.
But making _links optional creates a new problem. What if the category in our blog post is optional, and we want to remove the category from an existing post?
Well, since _links normally is optional, this would not do the trick:
PUT /blog/6 HTTP/1.1
Content-Type: application/vnd.blog.hal+json
{
"title": "Why I came back",
"date" : "2016-01-15T08:44:23Z",
"contents" : "..."
}
Instead, we’re opting for using the about:blank URI to specifically mark the link as removed:
PUT /blog/6 HTTP/1.1
Content-Type: application/vnd.blog.hal+json
{
"title": "Why I came back",
"date" : "2016-01-15T08:44:23Z",
"contents" : "...",
"_links" : {
"category" : { "href" : "about:blank" }
}
}
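On the server, the rule comes down to three cases per relation: absent means leave it alone, about:blank means remove it, anything else means replace it. A minimal Python sketch of that logic, not our actual implementation; `apply_put_links` and the `CLIENT_RELS` set are assumptions made for the example:

```python
REMOVED = "about:blank"

# Assumption: only these relations may be changed by clients; the rest
# (such as "self") stay server-controlled.
CLIENT_RELS = {"category"}


def apply_put_links(stored_links, body):
    """Merge the `_links` of a PUT body into the stored links of a resource."""
    result = dict(stored_links)
    for rel, link in body.get("_links", {}).items():
        if rel not in CLIENT_RELS:
            continue  # ignore links the client does not control
        if link["href"] == REMOVED:
            result.pop(rel, None)  # explicit removal
        else:
            result[rel] = {"href": link["href"]}  # create or replace
    return result


stored = {
    "self": {"href": "/blog/6"},
    "category": {"href": "/category/animals"},
}

# No _links in the body: the existing category is kept.
unchanged = apply_put_links(stored, {"title": "Why I came back"})

# about:blank: the category link is dropped, "self" is untouched.
removed = apply_put_links(stored, {"_links": {"category": {"href": "about:blank"}}})
```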
Is this crazy? It seemed like the sane solution to us. If you have an opinion, I would love to hear it!
Comments
ruFog •
About:blank looks appropriate! Nice solution.
Jorik •
In my opinion, you should express a relation in a json body of a put request by representing it as simple as a http link. Nothing more. This works fine if the related objects already exist. Furthermore get and put do not need to use the same structure of the body. Refer CQRS
Evert •
CQRS has little to do with this imho. What you're suggesting is not a different data-model for GET and PUT, but just a different format. If the actual underlying data-model was different, then you should really just use multiple resources (URIs) for GET and PUT, and not content negotiation.
But that's not what you're suggesting. You're suggesting using the same data-model, but just a different format for GET and PUT. I fail to see the benefit of this. What problem are you trying to solve with this?
Wes Biggs •
Why not use PATCH when updating? The PATCH spec (https://tools.ietf.org/html... works well for well-defined resources like this:
PATCH /blog/6 HTTP/1.1
Content-Type: application/json-patch+json
{
"_links" :{
"category" : null
}
}
would remove the category link.
Evert •
PATCH is not REST for a number of reasons. You are no longer transferring state, and lose idempotency.
Probably a good solution for some systems, but not our design goal.