Key Concepts

All the key concepts that you need to understand to get the best out of BitBroker

1: Introduction
2: Users
3: Entity Types
4: Data Connectors
5: The Catalog
6: Data Sharing Policy
7: User Data Access
8: Developer Portal

BitBroker is made up of a series of interlocking concepts. To get the most out of BitBroker, you should understand what these concepts are and how they work together to form a complete system.

In this section, we outline these key concepts in detail.

1 - Introduction

A step-by-step introduction to the BitBroker system

In this section, we will go through all the key concepts which you need to understand to get the best out of BitBroker. The system is made up of a series of interlocking concepts and it’s important to understand how these operate and interact.

Step # 1 - Create Users

All activity within BitBroker is parceled out to known Users and what they can and can’t do is governed by what roles they play. The first step is to learn how to create users and how to assign them roles, depending on what you want them to be able to do.

Step # 2 - Create Entity Types

All information within BitBroker is stored and presented within the context of a high level enumeration called Entity Types. These are the object types which are naturally present within the domain under consideration. The next step is to learn how to create and manage entity types.

Step # 3 - Create Data Connectors

Next, for each entity type, you can create Data Connectors. Each of these has permission to contribute entity instance records for their given type. BitBroker marshals and manages these contributions, through a set of rules which you can define.

Step # 4 - Populate the Catalog

Data connectors provide records to the BitBroker Catalog. This is a place where the existence of all the entity instances in the domain space is stored, enumerable and searchable. It also provides the route for consumers to search, discover and use entity instance data.

Step # 5 - Create Policy

Data Sharing Policies are the backbone of the BitBroker system. They are defined by coordinator users, who use them to specify the exact context in which they permit data to be accessed by consumers. In this step, you can learn how to define and manage these vital policies.

Step # 6 - Grant Access

Once all other elements are in place, you can proceed to grant Access to data. In BitBroker, such grants are only permitted within the context of a consumer user and a policy. The connection between these two system concepts is called an access.

2 - Users

Users and their roles within a BitBroker instance

Read the corresponding API documentation here

All activity within BitBroker is parceled out to known Users and what they can and can’t do is governed by what roles they play. In this section, we will learn about users: how to create, update and delete them and how user roles are employed to partition functionality and responsibility.

User Roles

All user activity within BitBroker is divided between three logical user roles. Each role has its own responsibilities and access control rules. These three user types come together to form a complete system.

These are “logical” roles and it is perfectly possible for one actual person to be adopting more than one role within a deployed instance.

In practice, users assume a role by being in receipt of an authorization token, which grants them the ability to perform actions using the corresponding API.

In this section, we will outline what the three user roles are. Later sections of this documentation will go on to explain how to create users and grant them such roles.

Coordinators

Coordinators have the most wide-ranging rights within a BitBroker system. They alone have the ability to perform key tasks such as: defining the entity types which are present, giving permission to contribute data and creating policy which governs data access. This is achieved by access to the Coordinator API.

It is envisaged that there will only be a small number of coordinator users within any deployed BitBroker instance. Coordinator roles should be limited to people to whom you want to grant wide responsibility over the entire system. Whilst there can be multiple Coordinators, there can never be zero Coordinators (the system will prevent you from deleting the last Coordinator).

Contributors

Contributors are users who have permission to contribute data within the context of a single specified entity type, via access to the Contributor API. Such rights can only be granted by a coordinator user.

If a person or organization is contributing data to many entity types within a single instance, they will be adopting multiple, different contributor roles in order to achieve this. Hence, the number of contributors is linked to the number of data connectors which is linked to the number of entity types.

Consumers

Consumers are the end-users of the system and the people who will be subject to the data access policies created by coordinators.

Consumers come in via the “front door”, which is the Consumer API. They are the people who will be accessing the data to create their own applications and insights - but only ever in the context of a policy declaration which defines what data they can see and how they can access and use it.

There can be an unlimited number of consumer users in an active BitBroker instance.

3 - Entity Types

The types of entities which are the basis of a BitBroker instance

Read the corresponding API documentation here

All information within BitBroker is stored and presented within the context of a high level enumeration called Entity Types.

Entity types are the object types which are naturally present within the domain under consideration. You can create and define any set of entity types which makes sense for your deployed instance. Only coordinator users have the ability to create, delete and modify entity types.

For example, here are some entity type enumerations which may naturally occur in different domains of operation:

Domain	Possible Entity Types
Transport	`bus-stop`, `bus`, `timetable`, `station`, `road`, `route`, `train`, etc
Health	`patient`, `prescription`, `doctor`, `treatment`, `condition`, etc
Manufacturing	`robot`, `tool`, `belt`, `factory`, `shift`, `quota`, `order`, etc

You should choose your entity types with care, since they are difficult to modify once a system is operational.

Entity types should be real-world concepts which an average user would instinctively understand. They can represent physical, logical or virtual concepts within your domain. Typically, an entity type will be a noun within the domain space.

The list of known entity types forms the entire basis of a BitBroker instance. The Coordinator API documentation contains more detail about how to go about naming entity types and the attributes which must be present in order to create them.

Entity Instances

The whole point of entity types is to provide some structure for the list of entity instances, which form the bedrock of the data which BitBroker is managing. Entity instances are submitted into the BitBroker catalog via contributions from data connectors.

Entity Schemas

In some instances, you may be relying upon second and third parties to contribute data into your BitBroker instance. It is entirely possible to have multiple contributors submitting entity instances for a shared entity type.

In scenarios where the contributor community is diverse, it can be helpful to define clear rules as to the nature and quality of the incoming data. Rules can be defined to ensure consistency of data types, formats and representation schemes. It is also vital to ensure semantic alignment between similar concepts being contributed by different users.

This can be achieved within a BitBroker system by specifying a JSON schema per entity type. Once this schema is in place, BitBroker will automatically validate all incoming records against it. Violations will be rejected and contributors will be informed as to the specific reasons why.

Such schemas are an optional extra and may not be required in all instances. You can specify such schemas at the points you create and/or modify entity types.
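
As a rough illustration of the validation step described above, the sketch below checks incoming records against a small schema and reports the reasons for any rejection. It is not BitBroker's implementation: it uses only the Python standard library and handles just the `required` and `type` keywords of JSON Schema. The `bus_schema` content and the `validate` helper are hypothetical.

```python
# Hypothetical schema for a `bus` entity type (illustrative only).
bus_schema = {
    "type": "object",
    "required": ["route", "seats"],
    "properties": {
        "route": {"type": "string"},
        "seats": {"type": "integer"},
        "operator": {"type": "string"},
    },
}

# Maps the JSON Schema type names used above to Python types.
JSON_TYPES = {"string": str, "integer": int, "object": dict}


def matches(value, type_name):
    """Check a value against one JSON Schema type name."""
    if type_name == "integer":
        # In Python, bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, JSON_TYPES[type_name])


def validate(record, schema):
    """Return a list of human-readable violations (empty when valid)."""
    if not isinstance(record, dict):
        return ["record must be an object"]
    errors = []
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"missing required property '{name}'")
    for name, rule in schema.get("properties", {}).items():
        if name in record and not matches(record[name], rule["type"]):
            errors.append(f"property '{name}' must be of type {rule['type']}")
    return errors
```

A record that passes returns an empty list, while a rejected record returns one message per violation, which mirrors the idea of informing contributors of the specific reasons for rejection.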

4 - Data Connectors

Data connectors which allow contribution of data into a BitBroker instance

Read the corresponding API documentation here

BitBroker is a contribution-based system, meaning that data is contributed by a community of users. In some cases, these contributors will be people you have direct control over (and may well be other roles you yourself are playing). However, in other instances, you may be relying upon second and third parties to contribute data into your BitBroker instance.

The vector through which data contribution is managed is the concept of data connectors. Each entity type within the system should have at least one data connector. However, it is entirely possible for an entity type to be sharing multiple data connectors, all contributing entity instance records for that type.

A data connector can only be contributing data to one entity type. If a party wants to contribute data to multiple entity types, you must create a different connector for each one. These can, in practice, point back to one data connector implementation.

A quick way to get going building data connectors is to adapt the example connectors which have been built for a range of data sources.

Managing Data Contribution

Whether or not the contribution is coming from internal or external players, the first step in managing the process is to create a data connector - housed within the entity type for which it has permission to submit records. Creating and managing data connectors can only be done by coordinator users.

As part of the creation process for connectors, the system will generate a connector ID and an authorization token. These will be returned to the coordinator user in response to the creation request. The coordinator should communicate these items in a secure manner to the party responsible for implementing the data connector. Here is an example of these items:

{
    "id":"9afcf3235500836c6fcd9e82110dbc05ffbb734b",
    "token":"ef6ba361-ef55-4a7a-ae48-b4ecef9dabb5.5ab6b7a2-6a50-4d0a-a6b4-f43dd6fe12d9.7777df3f-e26b-4f4e-8c80-628f915871b4"
}

Armed with these two items, data connectors can go ahead and contribute entity instance records into the catalog, by using the Contributor API. You should direct new data connectors to the contributor documentation in order to get them started with their implementation.

Connector authors should be aware of the important distinction between data and metadata within the context of the BitBroker catalog.

Third parties can implement their data connectors in whatever manner they want. The system only requires that they communicate their submissions via the HTTP based mechanisms outlined in the Contributor API. Coordinator users may well have their own requirements which they impose upon contributors. However, this is of no direct concern to BitBroker.

Because of the way the division of responsibility has been separated within the system, BitBroker has no requirements to know or understand how the original source data is stored or secured. This is purely the concern of the data connector authors.

Contribution is scoped to be within the connector’s domain space only. A connector cannot affect records delivered by another connector, even within the same entity type and even if they have a clashing key space.
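
One way to picture this scoping is a store in which every record is keyed by both the connector and the connector's own key. The sketch below is a toy in-memory model under that assumption; the `ScopedCatalog` class and its methods are hypothetical and say nothing about how BitBroker stores records internally.

```python
class ScopedCatalog:
    """Toy in-memory store keyed by (connector_id, local_key)."""

    def __init__(self):
        self._records = {}

    def upsert(self, connector_id, local_key, record):
        # The composite key keeps each connector inside its own key space.
        self._records[(connector_id, local_key)] = record

    def delete(self, connector_id, local_key):
        # Only removes a record owned by the calling connector.
        return self._records.pop((connector_id, local_key), None) is not None

    def records_for(self, connector_id):
        """Return {local_key: record} for a single connector."""
        return {
            key[1]: record
            for key, record in self._records.items()
            if key[0] == connector_id
        }
```

With this layout, two connectors can both submit a record with the key `1` and neither can overwrite or delete the other's entry.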

Coordinators can use the mechanism of entity schemas to ensure alignment around things such as data types, formats, representation schemes and semantics.

Live vs Staging Connectors

By default, newly created connectors are not “live”. This means that data which they contribute will not be visible within the Consumer API. Implementing a data connector can be a tricky operation and may require a few attempts before data is being delivered to a sufficient quality. Hence, it is useful to isolate contributions from certain connectors into a “staging space” - so that it doesn’t pollute the public space visible to consumers.

In order to make a connector’s data visible, it must be promoted to live status. Only coordinator users can promote connectors in this way. They can also demote connectors, if they believe their data should no longer be publicly visible.

When developing a new connector, it is often useful to be able to see data from it alongside other public data. There is a mechanism available in the Consumer API which allows data connectors to see how their records will look alongside other existing public records.
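
The visibility rule can be sketched as a simple filter: records from live connectors are always visible, and records from a staging connector appear only when that connector is explicitly named for preview. This is an illustrative model only; the `visible_records` function, the `connector` field and the preview parameter are assumptions, not the Consumer API's actual mechanism.

```python
def visible_records(records, live_connectors, preview_connectors=frozenset()):
    """Return records from live connectors, plus any staging connectors
    explicitly named for preview."""
    allowed = set(live_connectors) | set(preview_connectors)
    return [record for record in records if record["connector"] in allowed]


# Hypothetical data: one live connector and one still in staging.
records = [
    {"id": "bus-1", "connector": "live-buses"},
    {"id": "bus-2", "connector": "staging-buses"},
]
public_view = visible_records(records, {"live-buses"})
preview_view = visible_records(records, {"live-buses"}, {"staging-buses"})
```

Here `public_view` contains only `bus-1`, whereas `preview_view` shows both records side by side.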

5 - The Catalog

The catalog and how it facilitates search, discovery and access

Read the corresponding API documentation here

The BitBroker catalog is a place where the existence of all the entity instances in the domain space is stored, enumerable and searchable. It also provides the route for consumers to search, discover and use entity instance data.

The BitBroker catalog performs the following specific functions:

Records the existence of all domain entity instances
Marshals all domain data into a published entity type enumeration
Presents a small set of global attributes in a consistent manner
Ensures data representation consistency between entity specific properties
Arbitrates the key spaces of the domain systems managed by data connectors
Offers consistent listing, enumeration and search semantics over all entries via the Consumer API
Acts as the bridge to the connected domain systems
Removes the need for users to understand where the source data originated from
Blends data from multiple data connectors for the same entity type
Provides key arbitration between multiple data connectors
Maximizes data interoperability

Consumer access to the catalog is via its own API, which is a part of the Consumer API.

Data vs Metadata

The catalog is a place to store metadata about entity instances. Consumers use the catalog to search and discover entity instances which are needed for their applications. Once an entity instance has been discovered, its details can be obtained via calls to the relevant part of the Consumer API.

When consumers obtain a detailed record about a particular entity instance, BitBroker will ask its submitting data connector directly for its live and on-demand information via a webhook. It will then merge this with the catalog record and return the whole to the consumer. The consumer is unaware of the existence of data connectors and even that some data is being drawn on-demand from another system.

What exactly is the difference between metadata and data in this context?

What data should you store in the catalog and what data should you retain for webhook callbacks? There is no right or wrong answer to this question. Typically, you should aim to store information which has a slow change rate in the catalog and retain other information for a callback.

Here are some examples of how this might operate in practice:

Entity Type	Catalog Data	Live Webhook Data
`thermometer`	`location`, `range`, `unit`	`value`
`bus`	`route`, `seats`, `operator`	`location`
`car-park`	`name`, `location`, `capacity`	`occupancy`

To be clear, data connectors can store everything in the catalog. This has the advantage that everything becomes searchable and they no longer need to host a webhook.

But, on the flip side, they have taken on the burden of sending constant updates to the catalog in order to keep it up to date. Here, the catalog is always likely to be a temporally lagging copy of the source stores.

Ultimately, the decision on how you use the catalog is up to the coordinators and data connector authors.
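
The merge of catalog data with live webhook data can be sketched as below, using the `thermometer` row from the table above. This is a simplified model, not BitBroker's code: the `fetch_entity` function, the callable standing in for the webhook and the sample values are all hypothetical.

```python
def fetch_entity(catalog_record, webhook=None):
    """Merge slow-changing catalog data with live data from a webhook.

    `webhook` is a callable taking the entity id and returning a dict of
    live attributes; it stands in for an HTTP call to the data connector.
    """
    merged = dict(catalog_record)
    entity = dict(catalog_record.get("entity", {}))
    if webhook is not None:
        entity.update(webhook(catalog_record["id"]))
    merged["entity"] = entity
    return merged


# Hypothetical catalog record holding only slow-changing attributes.
thermometer = {
    "id": "thermometer-1",
    "entity": {"location": "roof", "range": [-40, 60], "unit": "celsius"},
}


def live_reading(entity_id):
    # Stand-in for the connector's webhook response.
    return {"value": 21.5}


detail = fetch_entity(thermometer, live_reading)
```

The consumer receives a single merged record in `detail`. A connector that stores everything in the catalog would simply be called with no webhook.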

6 - Data Sharing Policy

Create, manage and deploy data sharing policies to grant access to data

Read the corresponding API documentation here

Data sharing policies are the backbone of the BitBroker system. They are defined by coordinators, who use them to specify the exact context in which they permit data to be accessed by consumers.

Policy definitions can only be submitted by coordinators via the corresponding end-points in the Coordinator API. Once a policy is deployed, accesses can be created which allow consumers to interact with the data specified in the ways allowed.

A policy definition is made up of three sections. We will outline each in detail.

Data Segment

The data segment section of a policy definition defines the maximum subset of data which will be visible when users access the Consumer API via a policy authorization token. A consumer using a policy authorization token is locked within that policy’s data segment. There is no action they can perform with the Consumer API to break out into the wider catalog of data.

A data segment definition is made up of the following attributes:

Attribute	Necessity	Description
`segment_query`	required	A valid catalog query, from the Consumer API, which defines the visible subset of data
`field_masks`	optional	An array of strings naming the attributes to be masked out of returned documents

Data segments are defined via the same semantics as users searching the catalog using catalog queries. You should think of your data segment as a second query which will be combined with the user’s queries using a boolean AND.

If a consumer asks for entity instance data which is out-of-policy, they will be returned an HTTP/1.1 404 Not Found error. The system will deny the existence of such entities. If two consumers with different policies share a BitBroker link, it is possible that one may see a data document and the other an HTTP/1.1 404 Not Found for the same requesting API end-point.
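
The combination of a data segment with a user's query, and the 404 behavior for out-of-policy lookups, can be sketched as follows. Queries are modeled here as plain Python predicates purely for illustration; the real catalog query syntax is defined in the Consumer API documentation, and the `search` and `get_entity` functions are hypothetical.

```python
def search(catalog, segment_query, user_query):
    """Return records matching both the policy segment and the user query."""
    return [r for r in catalog if segment_query(r) and user_query(r)]


def get_entity(catalog, segment_query, entity_id):
    """Return (status, record); out-of-policy records look like missing ones."""
    for record in catalog:
        if record["id"] == entity_id and segment_query(record):
            return 200, record
    return 404, None


# Hypothetical catalog and policy segment.
catalog = [
    {"id": "GB", "entity": {"continent": "Europe", "landlocked": False}},
    {"id": "CH", "entity": {"continent": "Europe", "landlocked": True}},
    {"id": "BO", "entity": {"continent": "South America", "landlocked": True}},
]


def europe_only(record):
    # The policy's data segment.
    return record["entity"]["continent"] == "Europe"


def landlocked(record):
    # The consumer's own query.
    return record["entity"]["landlocked"]


results = search(catalog, europe_only, landlocked)
```

Only `CH` satisfies both queries. A direct request for `BO` under this policy returns 404, exactly as it would for an identifier that does not exist at all.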

Field masks are a way of removing individual attributes from entity instance data records. You can only remove attributes from within the entity section of the overall document. You do this by specifying an array of strings, where each one is [entity-type].[attribute].

For example, if you specify [ "country.capital", "country.currency.name", "country.location" ] then three attributes will be removed from the entity section of all returned documents. So this record:

{
    "id": "GB",
    "name": "United Kingdom",
    "entity": {
        "area": 242900,
        "calling_code": 44,
        "capital": "London",
        "code": "GB",
        "continent": "Europe",
        "currency": {
            "code": "GBP",
            "name": "Sterling"
        },
        "landlocked": false,
        "location": {
            "coordinates": [
                -3.435973,
                55.378051
            ],
            "type": "Point"
        },
        "population": 66040229,
        "link": "https://en.wikipedia.org/wiki/United_Kingdom"
    }
}

will be returned like this, for in-policy calls:

{
    "id": "GB",
    "name": "United Kingdom",
    "entity": {
        "area": 242900,
        "calling_code": 44,
        "code": "GB",
        "continent": "Europe",
        "currency": {
            "code": "GBP"
        },
        "landlocked": false,
        "population": 66040229,
        "link": "https://en.wikipedia.org/wiki/United_Kingdom"
    }
}
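
A minimal sketch of how field masks of the form [entity-type].[attribute] might be applied is shown below. It assumes that a dotted attribute such as `currency.name` addresses a nested key inside the entity section; the `apply_field_masks` function is hypothetical and is not taken from the BitBroker code base.

```python
import copy


def apply_field_masks(record, entity_type, field_masks):
    """Return a copy of `record` with masked attributes removed from `entity`."""
    masked = copy.deepcopy(record)
    prefix = entity_type + "."
    for mask in field_masks:
        if not mask.startswith(prefix):
            continue  # mask belongs to a different entity type
        path = mask[len(prefix):].split(".")
        node = masked.get("entity", {})
        # Walk down to the parent of the attribute being removed.
        for key in path[:-1]:
            node = node.get(key)
            if not isinstance(node, dict):
                break  # path does not exist in this record
        else:
            node.pop(path[-1], None)
    return masked
```

Working on a deep copy means the stored record is left untouched, so the same record can be served under different policies with different masks.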

Access Control

The access control section of a policy definition defines the ways in which users can interact with the Consumer API via a policy authorization token. An access control definition is made up of the following attributes:

Attribute	Necessity	Description
`enabled`	required	Whether or not to use access control
`quota`	optional	An object describing allowable data quotas
`quota.max_number`	optional	The maximum number of calls permitted in each quota interval
`quota.interval_type`	optional	The quota period of either `day` or `month`
`rate`	optional	The maximum calls-per-second rate

If you do not want to use the access control section, you can simply specify false for the enabled attribute. In this case, all other attributes will be ignored and the consumer will enjoy unrestricted access rates to the Consumer API.

As an example, if you specify the following:

{
    "enabled": true,
    "quota": {
        "max_number": 86400,
        "interval_type": "day"
    },
    "rate": 250
}

Users with a policy authorization token will be able to make calls at a maximum rate of 250-per-second and with a maximum quota of 86,400-per-day.

If a consumer breaches a rate or quota limit, then calls to any part of the Consumer API will respond with an HTTP/1.1 429 Too Many Requests error. This response will persist until the breach has expired.
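
The interplay of `enabled`, `rate` and `quota` can be sketched with a toy limiter. This is only an illustration under stated assumptions: it uses sliding time windows and treats a month as 30 days, neither of which is specified above, and the `AccessControl` class is hypothetical.

```python
class AccessControl:
    """Toy limiter: `rate` calls per second, `max_number` calls per interval."""

    # Interval lengths in seconds; the 30-day month is an assumption.
    INTERVAL_SECONDS = {"day": 86400, "month": 30 * 86400}

    def __init__(self, policy):
        self.enabled = policy["enabled"]
        self.rate = policy.get("rate")
        quota = policy.get("quota") or {}
        self.max_number = quota.get("max_number")
        self.interval = self.INTERVAL_SECONDS.get(quota.get("interval_type"))
        self.calls = []  # timestamps of accepted calls

    def check(self, now):
        """Return the HTTP status for a call arriving at time `now` (seconds)."""
        if not self.enabled:
            return 200  # all other attributes are ignored
        if self.rate is not None:
            recent = sum(1 for t in self.calls if now - t < 1)
            if recent >= self.rate:
                return 429
        if self.max_number is not None and self.interval is not None:
            used = sum(1 for t in self.calls if now - t < self.interval)
            if used >= self.max_number:
                return 429
        self.calls.append(now)
        return 200
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real service would read a clock and would need shared, persistent counters rather than an in-memory list.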

Legal Context

The legal context section of a policy definition defines the legal basis on which data access is permitted. A legal notice will be present at the end of every entity instance record returned by any part of the Consumer API.

These legal notices are in the form of a JSON array of objects, each with three attributes:

Attribute	Necessity	Description
`type`	required	The type of the legal notice
`text`	required	The description of the legal notice
`link`	required	A link for more information about the legal notice

Examples of types of legal notices which may be present are:

Type	Description
`attribution`	How the data should be attributed within your application
`contact`	Who to contact about the data and its use
`license`	The licenses under which this data sharing operates
`note`	Ad hoc notes about the data and/or its use
`source`	Information about the source or origination of the data
`terms`	The terms and conditions of use for the data
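
To make the shape of a legal context concrete, the sketch below builds an array of notices and checks that each one carries the three required attributes. The notice texts, the `example.com` links and the `missing_attributes` helper are placeholders invented for illustration.

```python
REQUIRED_ATTRIBUTES = ("type", "text", "link")


def missing_attributes(legal_context):
    """Map each notice index to the required attributes it lacks."""
    problems = {}
    for index, notice in enumerate(legal_context):
        missing = [name for name in REQUIRED_ATTRIBUTES if name not in notice]
        if missing:
            problems[index] = missing
    return problems


# Placeholder notices; the text and links are illustrative only.
legal_context = [
    {
        "type": "attribution",
        "text": "Data supplied by Example Transport",
        "link": "https://example.com/attribution",
    },
    {
        "type": "license",
        "text": "Shared under the Example Open License",
        "link": "https://example.com/license",
    },
]
```

An empty result from `missing_attributes(legal_context)` indicates that every notice is complete.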

It is possible that you may have more information about the legal basis on the use of your data by consumers. You may, for example, require consumers to perform additional legal steps in order to be given a consumer authorization token. This is outside the current scope of BitBroker.

7 - User Data Access

Managing data consumers and their associated data access tokens

Read the corresponding API documentation here

Access to data within a BitBroker instance is always permitted within the context of a consumer and a policy. The connection between these two system concepts is called an access.

In practice, these are manifested as authorization tokens which authorize the consumer’s calls to the Consumer API. Such accesses can only be created, managed and rescinded by coordinator users.

Since an access is in the context of a policy, it restricts its holder to the designated data segment and access control.
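
Conceptually, an access is a link between one consumer and one policy, identified by a token. The sketch below models that relationship with a toy in-memory registry; the `AccessRegistry` class is hypothetical, and the random hex string is only a stand-in, not BitBroker's token format.

```python
import secrets


class AccessRegistry:
    """Toy registry linking a (consumer, policy) pair to a token."""

    def __init__(self):
        self._by_token = {}

    def create(self, consumer_id, policy_id):
        # Stand-in value, not BitBroker's token format.
        token = secrets.token_hex(16)
        self._by_token[token] = (consumer_id, policy_id)
        return token

    def resolve(self, token):
        """Return (consumer_id, policy_id), or None if the token is unknown."""
        return self._by_token.get(token)

    def rescind(self, token):
        """Remove an access; return True if the token existed."""
        return self._by_token.pop(token, None) is not None
```

Resolving a token yields the policy whose data segment and access control apply to the call, and a rescinded token no longer resolves to anything.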

8 - Developer Portal

Create, deploy and operate a branded developer portal to manage your data user community.

Coming Soon…

APIs can be hard work for non-technical people!

Watch this space for a full portal, which will allow configuration, management and operation of a BitBroker instance with just clicks in a browser.