What is the Block Protocol?

A new standard for describing typed data (and nothing to do with blockchains)

December 1st, 2022

HASH

The Block Protocol Þ is an open standard originally developed by HASH. It allows frontend components built once to be used in many different contexts, and from within any application that supports the protocol.

The Þ works by standardizing the way information is described, with things captured as entities and the relationships between them as links. Links and entities have types, and as of version 0.3 of the Þ an entity can have multiple types.

The Þ 0.3 release also introduces a rich set of primitives for relating schemas that describe entities and links to one another, allowing anybody to extend an existing public schema with their own additional information, without sacrificing the ability to semantically link back to the original definition and benefit from blocks built to work with data of that type.

But more than that, the Þ specifies the actual methods by which frontend components can operate on data.

Yet another standard?

aka. "How does the Block Protocol compare to ________?"

While existing technologies like Storybook Controls and Custom Element Manifests standardize how events and callbacks are described, there's no standardization of what those events actually are. This means that, generally speaking, you often need to know the details of how of a specific element operates in order to implement it.

The goal of the Þ is to make it so that new blocks can be added to an application without any configuration or special required knowledge about how data is handled, and the protocol leverages existing standards such as JSON Schema.

How does it work?

The Block Protocol combines a number of different concepts in a core specification, alongside several additional service specs. One way the core spec enables the zero-knowledge integration of blocks and apps is by defining a standard set of messages that blocks and apps can exchange, each with an explicitly defined purpose (e.g. to update a piece of data).

Related to this are the Þ's data model and type system, which specify that entities should be linked together in a graph. Each entity is described by a type which defines its schema. While technologies such as the Custom Element Manifest describe the schema of a given component, these schemas are isolated and unrelated to a wider pool of data. The Þ aims to provide a means for blocks to be used interchangably between applications to allow both reading and creating complex, interlinked data.

Aligning incentives around interoperability

Previous attempts to create a more interoperable, open web have typically tried to tackle just a few parts of the problem in a piecemeal fashion, without clear handover to other standards or the developer utilities required to make them useful in practice.

Incentive misalignment has also meant that－if anything－the number of walled gardens has increased over time, with semantic web tech facing resistance to adoption when confronted with the real-world.

On 28th February 2023 we'll be releasing the Þ version 0.3, alongside the ability to extend WordPress with the Block Protocol, and use any Þ block in WordPress, the world's most-used CMS, which powers 43% of the public web, and our own HASH entity store and knowledge management app.

What Þ 0.3 contains

→
An ecosystem of blocks on the Þ Hub: frontend components which can be used in any protocol-supporting environment, without either block or app requiring any special knowledge of one another
→
The Core Specification: a technical document which outlines the standardized method of block <-> app communication at the heart of the ecosystem
→
The Graph Service: a common data model for describing information. This allows information created in one application to be used within another, through user-authored types, also hosted and discoverable on the Þ Hub
→
All this in a completely open-source framework. The Þ specification is free and open, and all of the core technology is available under the Apache 2.0 or MIT license (at your option)

How the Block Protocol aligns incentives

Easier, not harder to use semantic types

For people who actually want to make their data and sites semantically meaningful on the web, it's not a great user experience today. To make your data widely machine-readable, you'll have a look to see what published schemas already exist (e.g. on schema.org) and try to find something that matches what you're looking to represent.

It's unlikely you'll find a type definition (aka. 'schema') that exactly represent the data you have in the format you've collected it, and published schemas generally capture the "lowest common denominator" properties of entities only, in a "one size fits all" fashion at that.

Even then, some of the most used schemas might store information in ways you regard as bad practice (e.g. the schema.org/Person entity has familyName and givenName properties, rather than legalName and preferredName ones which tend to both generalize better in international/non-US contexts, and provide more useful information in most cases).

So what do you do? Most likely, you give up, and store your data as it makes most sense to you. Unfortunately, your data now makes no sense to any machine out there (unless it's doing some funky AI inference), and isn't really contributing to a democratized semantic web.

How does the Þ change this? Rather than at this stage give up, with the Þ you can find an existing entity type that mostly matches what you're looking for and do two things. Current RFCs propose enabling users to:

→
extend an existing type with additional properties. We call this type extension. Properties can be created from scratch, or any previously used property can be re-used. For example, you may want to track the location of a Person entity on a map. You might find that a property called "Geolocation" already exists, and rather than create it yourself, use this. Because properties can have data types associated with them which describe the possibility space of valid, accepted inputs, leveraging well-built existing properties is generally preferable to creating your own, providing you with free validation and additional semantic context, helping connect your information to a whole world of linked open data.
→
fork types, to introduce changes, but retain a conceptual link back. Whereas type extension enables adding individual properties to an existing type, forking allows for a type that is semantically or conceptually "the same thing as" another to represent properties differently. In such cases, users will be able to specify machine-resolvable rules and mappings between entity type definitions that refer to the same thing (e.g. "Person") but have conflicting views on how their properties should be represented. Even in this case, starting with an existing entity type that you don't like which happens to represent the same entity type has benefits, allowing applications that use the Block Protocol to more successfully render your arbitrary data in the appropriate types of blocks.

Our belief is that a free-market of types will emerge, with some more centrally connected than others, and the vast majority of widely-used schemas cross-walked to form a single large machine navigable graph. In this ontology of sorts, different people's understanding of the same concepts can be distinctly captured, with common ground between these and existing expressions preserved.

Unconstrained by the pace at which formal, centralized schema-defining organizations can adapt, and liberated in the depth to which they can model their domains, we expect a new, open global ontology to emerge in which all types are versioned, allowing for their evolution and interconnection over time.

This graph is multi-tenant and in contrast to historical approaches is emphatically not centrally maintained.

Improved data portability

While a lot of attention has been given to user ownership of data (e.g. allowing users to "bring their own backend/database"), apps typically don't want to give up control of their datastores. This means that great technologies like Solid remain relatively under-adopted, while traditional owned and operated databases continue to grow.

The Þ is ambivalent about where the data apps interact with is stored. It also doesn't demand that applications expose the data they contain to users in a way that does let them move it off-platform. However, the Þ does ensure that data is stored in accordance with the common model laid out in the protocol, enabling information from one app to port across neatly inside another should that be made possible by the application itself. Increasingly regulators are demanding this of large actors (e.g. the EU's new Digital Markets Act contains much language to this effect).

In this way adopting the Þ makes compliance easy, ensuring data is accessible in a portable format intelligible by others, while preserving a platform developer's ability to choose how and when they expose this to users or other services

More useful applications, without the headache

A plugin ecosystem for apps without their own plugin architectures. A growing community of useful components that extend the functionality of an application, providing utility to users. The ability to access powerful new features that appear in lots of applications at the same time, without being left behind.

Applications that support the Þ will be able to offer their users more. And those users will be able to access new features faster, benefiting from a community of developers building new utilities all the time.

TL;DR we have a plan

As a developer, you get a lot for free by using the Þ. It improves your ability to model information well, and lets you leverage an open-source community of blocks in your application, offering your users far more.

The Þ core team include a behavioral economist, a number of outstanding software engineers, and the creator of Stack Overflow. Our first angel check was from the founder and former CEO of Kaggle, the world's largest data science community. Between us, we believe we have a strong understanding of the incentive structures that need to exist for software engineers to adopt the Block Protocol, and a pragmatic plan to get there. If you'd like to join us, or if you think we're missing something, let us know on the Þ Discord.

Create a free account

By signing up you agree to our terms and conditions and privacy policy