This documentation explains all the concepts, models and processes that you will need to understand to get started with BitBroker.
This documentation is still work-in-progress and will be updated as the project progresses.
What is BitBroker?
BitBroker is an open source, end-to-end, data sharing solution.
It allows organizations to share data with third-parties in the context of robust policy definitions. These policies ensure that:
… only agreed people can get access to the data
… only the specified data segment is visible to them
… they access the data in the permitted manner
… everything happens within a defined legal framework
BitBroker lets data owners share data with confidence.
Using BitBroker you can leverage the expertise of external players to help analyze, process and realize the latent value within your data. Services provided by the system are:
Connect heterogeneous data sets
BitBroker is a great way to bring order into an otherwise chaotic data environment. It helps organize data from complex and heterogeneous backend systems, presenting a more consistent view to data consumers.
Controlled data contribution
BitBroker allows for a community of people to contribute data; however, it is not a free-for-all. Coordinators decide who can contribute and what they can contribute. They can set schematic expectations per data type, allowing BitBroker to police all contributions to ensure they are compliant.
Comprehensive cataloging
Unless asked to do so, BitBroker does not take a copy of data. Instead, it indexes all the entity instances in the domain and facilitates the search and discovery of these. Policy based access is then brokered via the core engine, meaning that the consumer is never made aware of the source details of any data.
Flexible data sharing policies
Coordinators can create and deploy any number of data sharing policies. These define which data segments are visible, how they can be accessed and the legal basis for any use. Once deployed, BitBroker will police the policy on your behalf, giving you confidence that data is only being accessed and used as you would want it to be.
Reach-for data licenses
BitBroker eases the legal burden of sharing data by allowing Coordinators to select and incorporate from common data licenses, attribution models, terms and conditions, acceptable content, etc. This means they can quickly get started with data consumers, but still within a robust legal framework.
Modern, effective suite of APIs
BitBroker works out-of-the-box with a comprehensive suite of modern APIs offering search, discovery and access to the policy specified entity and time series data.
Complete user and key management
Flexible key workflow configuration means that Coordinators can decide exactly which consumers are granted access and for how long. BitBroker will take care of all users' access and even flag users who are attempting to violate policy.
Branded developer portal
Coming soon… a web portal which will allow you to administer and operate a BitBroker instance via clicks in a browser.
BitBroker is a one-stop-shop for policy based data sharing. As you will see in later documentation, it can be deployed in a variety of configurations, depending upon your data access and control requirements.
Is BitBroker for me?
BitBroker is not for everyone. Sometimes, people just want to share data openly, with minimal control and in the context of open licenses. In that case, you might be able to get away with an open data portal, such as CKAN or Socrata.
However, in situations where more control is required, then BitBroker can become a key enabling resource. Deploying BitBroker makes most sense when:
You want to share data from multiple, complex and heterogeneous backend systems
You want to share data with a wide and diverse set of people
You want to create and deploy policies, such that only defined people can access defined subsets, via defined routes
You want to apply different licenses, to different people, at different times
You want to maintain control and have the ability to rescind access at any time
BitBroker lets you build confidence with your data consumers, by enabling progressive data access as your relationship with them matures and develops.
A modern refresh for legacy applications?
Perhaps your needs are simpler…
BitBroker is also a great solution to bring together different data sources within an organization and to present them in a simple and modern API.
With very little effort, multiple legacy data applications can be plugged into a BitBroker instance - enabling cross-silo information sharing. Users can access this via a simple, RESTful API. You can plug the data into your favorite business applications or build an application from the range of sample applications you will find documented here.
Ready to get started?
Here are some great jumping-off points for BitBroker:
BitBroker is an open source project under the Apache 2 license. We welcome contributions from anyone who is interested in helping to make BitBroker better. If you would like to contribute, then you should start by reading our contribution guide and then get in touch with the core team.
1 - Getting Started
Getting started with BitBroker, including demos, installation and basic configuration
In this section we explore a number of ways to get you started using, deploying, operating and developing BitBroker.
Here you will find information about a range of BitBroker demo applications and connectors to help you understand what BitBroker is and how it can operate in complex data sharing scenarios.
Most importantly, it will help you get started building your own applications which use BitBroker data or your own data connectors to contribute data into the system.
The example connectors are deployed supporting two example datasets: countries and heritage sites.
Demo Applications
We have a number of example applications, which allow users to explore policy based access to data via the Consumer API.
Data Explorer
This application allows you to explore the entire Consumer API by directly trying out a number of interactive scenarios. It has a set of example data and policies already pre-installed and running.
You can explore using the Catalog API to try different, complex catalog queries. You can see how the results of these queries differ in the light of different policies - which you can switch between directly within the application.
Once you have executed a query and obtained entity instance records, you can use the Entity API to browse the whole list and inspect the details of individual entities.
This application allows you to explore the Consumer API via the medium of a mapping application. It has a set of example data and policies already pre-installed and running. The geographical attributes within the example data are used to populate a map view of the data records.
You can explore how the application outputs change in the light of different policies - which you can switch between directly within the application.
Here we provide a range of data connectors to help you understand what they are and how to build your own. Indeed, it is hoped that you can simply modify one of these data connectors to achieve your own data submission aims.
We will endeavor to increase the number of example data connectors here over time, offering more choices of both data sources and implementation languages. We welcome help from the community on this and you are encouraged to submit your own data connectors to the sample set.
We currently have two types of example connector:
File based - dataset loaded directly from a file
RDBMS - data drawn from a relational database
All implementations upload data to the BitBroker catalog. For the example country dataset, they also fetch and return third-party data in their entity webhooks, and both support time series data.
Installing BitBroker on Kubernetes, either for cloud or for local development
There are several ways in which you can install BitBroker, depending on what you are trying to achieve.
Are you trying to deploy to a public cloud or to a local environment? Do you want to use BitBroker in a production environment or as a local sandbox? Perhaps you are developing data connectors, or even trying to enhance BitBroker itself?
In this section, we will cover, in detail and step by step, all the different ways in which you can install a BitBroker instance using Kubernetes, to help you achieve your goals.
Bootstrapping Fresh Installations
For all Kubernetes installations, you should be aware of the ways in which fresh installations perform certain bootstrap operations.
Bootstrap User
Every fresh installation of BitBroker comes with one preinstalled user (uid: 1). This user is automatically created when the system is brought up for the first time.
As we will explain below, this user is linked to the authorization of API calls. The user is also important for the upcoming web portal BitBroker control interface.
Bootstrap Coordinator Token
All interactions with BitBroker stem, ultimately, from interactions with the Coordinator API. This is the main administrative API for the whole system. In order to use this API, you need an access token.
Using this API, new users can be created and then promoted to have coordinator status. This results in the production of a new coordinator access token for them. But this act of promotion itself requires permission. So how do we get started, given this circular dependency?
Whenever a fresh system is installed using Kubernetes, a special bootstrap coordinator authorization token is produced. This token is valid for use with the Coordinator API. You can use this token to get going with the process of then creating your own users and giving them coordinator status.
It is possible, but not recommended, to use the bootstrap coordinator token in normal operation. Instead, you should use it to create your own master coordinator user and then utilize their token for further operations.
The bootstrap coordinator token has a different (longer) format from normal coordinator tokens.
The detailed install steps below contain more information about how to extract and use this bootstrap token.
In all the sample calls below, we use a placeholder value for the bootstrap token. Substitute your own bootstrap coordinator authorization token to adapt the sample calls to your install details.
Cloud Kubernetes Installation
In this section, we will explore how you can use our pre-prepared Helm charts to install a complete BitBroker instance into your cloud of choice. The container images will be downloaded directly from Docker Hub.
This section assumes you are familiar with Kubernetes and Helm. If you are new to these technologies, we advise that you become familiar with them first, before trying out the steps outlined below.
This type of deployment will give you full access to the complete BitBroker feature set, including full policy enforcement. The system will be accessible to anyone to whom you grant access.
In all the sample calls below, we use the placeholder text https://your-cloud-host for the host base URL. Substitute your own cloud host base URL to adapt the sample calls to your install details.
Prerequisites
Start with a clean machine, with no remnants of previous BitBroker installations. Please ensure you have the following software installed, configured and operational:
This step takes a few moments to complete. After it has finished, you will see a series of notes which discuss some key points about your installation. The sections on JWKS, Auth Service and Rate Service are for advanced use cases, and you can ignore these for now.
It can take a few moments for the system to come into existence and for it to complete its initialization steps. You can test that the system is up and ready by using this command:
This will output Not Ready until all the servers are up, after which it will output Ready. Keep trying this command until it signals it’s OK to proceed.
Installing a DNS Alias
If you want to add an alias to this installation into your DNS record, then you need to first get the service load balancer address:
kubectl get svc --no-headers -o custom-columns=":status.loadBalancer.ingress[0].hostname" --selector=app.kubernetes.io/name=bbk-emissary-ingress -n bit-broker | head -1
Then you can use this to add an ALIAS into your DNS record. Depending on the domain and the registrar, the procedure and naming terminology will be different. Here is an example procedure for AWS’s Route 53.
Bootstrap Coordinator Token
A key thing to note in the results output is the section which says: “Here is how to get the Coordinator token”. Extracting and recording this token is a vital step in proceeding with the install. This is the bootstrap coordinator token which we outlined earlier in this document.
As the results output states, you can get hold of this bootstrap coordinator token as follows:
kubectl exec $(kubectl get pods --no-headers -o custom-columns=":metadata.name" --selector=app=bit-broker-bbk-auth-service -n bit-broker | head -1) -n bit-broker -c auth-service -- npm run sign-token coordinator $(kubectl get secret --namespace bit-broker bit-broker-bbk-admin-jti -o jsonpath="{.data.ADMIN_JTI}" | base64 --decode)
This is a long command and it will take a few seconds to complete. It will output the token as a long string. Copy this token and store it in a secure location. Be careful when sharing this token, as it grants full rights to the entire Coordinator API.
Testing Your Installation
If everything worked as expected, the BitBroker API servers will be up-and-running in your cloud waiting for calls. You can test this by using the sample call below:
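A sketch of such a test call is shown below; the host is a placeholder, and the /coordinator/v1 base path is an assumption based on the standard API versioning, so check the notes output by your install for the exact base URL:

```shell
curl https://your-cloud-host/coordinator/v1 \
     -H "x-bbk-auth-token: your-token-will-go-here"
```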
Like all BitBroker API end-points, this call requires a working authorization to be in place. Hence, a successful announcement response verifies both that the servers are up and that your token is valid, which makes it useful for testing or verification purposes.
Uninstallation
If you want to completely uninstall this instance of BitBroker, you should follow these steps:
This will remove all the key elements which were created as part of the installation. If you want to go further and clear out all the images which were downloaded too, use the following:
In this section, we will explore how you can use our pre-prepared Helm charts to install a complete BitBroker instance on your local machine. The container images will be downloaded directly from Docker Hub.
This type of install is suitable for development purposes only.
This section assumes you are familiar with Kubernetes and Helm. If you are new to these technologies, we advise that you become familiar with them first, before trying out the steps outlined below.
This type of deployment will give you full access to the complete BitBroker feature set, including full policy enforcement. But the system will only be available on your local machine.
This section assumes you have already installed the Kubernetes and Helm command line tools. It also assumes you have installed and are running Docker Desktop.
Prerequisites
Start with a clean machine, with no remnants of previous BitBroker installations. Please ensure you have the following software installed, configured and operational:
First, let’s prepare the context we want to install:
helm repo add bit-broker https://bit-broker.github.io/charts
JWKS=$(docker run bbkr/auth-service:latest npm run --silent create-jwks)
kubectl apply -f https://app.getambassador.io/yaml/emissary/2.2.2/emissary-crds.yaml
Our pre-prepared Helm charts are, by default, configured for production, cloud environments. However, here we want to install to localhost only. So we need to make some modifications to the default chart to cater for this:
This step takes a few moments to complete. After it has finished, you will see a series of notes which discuss some key points about your installation. The sections on JWKS, Auth Service and Rate Service are for advanced use cases, and you can ignore these for now.
It can take a few moments for the system to come into existence and for it to complete its initialization steps. You can test that the system is up and ready by using this command:
This will output Not Ready until all the servers are up, after which it will output Ready. Keep trying this command until it signals it’s OK to proceed.
Bootstrap Coordinator Token
A key thing to note in the results output is the section which says: “Here is how to get the Coordinator token”. Extracting and recording this token is a vital step in proceeding with the install. This is the bootstrap coordinator token which we outlined earlier in this document.
As the results output states, you can get hold of this bootstrap coordinator token as follows:
kubectl exec $(kubectl get pods --no-headers -o custom-columns=":metadata.name" --selector=app=bit-broker-bbk-auth-service -n bit-broker | head -1) -n bit-broker -c auth-service -- npm run sign-token coordinator $(kubectl get secret --namespace bit-broker bit-broker-bbk-admin-jti -o jsonpath="{.data.ADMIN_JTI}" | base64 --decode)
This is a long command and it will take a few seconds to complete. It will output the token as a long string. Copy this token and store it in a secure location. Be careful when sharing this token, as it grants full rights to the entire Coordinator API.
Testing Your Installation
If everything worked as expected, the BitBroker API servers will be up-and-running on localhost waiting for calls. You can test this by using the sample call below:
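As a sketch, such a test call might look like the following; this assumes you have mapped the bbk-coordinator name to your local cluster and that the Coordinator API sits under the /coordinator/v1 base path (both assumptions - check your own install notes for the exact URL):

```shell
curl http://bbk-coordinator/coordinator/v1 \
     -H "x-bbk-auth-token: your-token-will-go-here"
```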
Like all BitBroker API end-points, this call requires a working authorization to be in place. Hence, a successful announcement response verifies both that the servers are up and that your token is valid, which makes it useful for testing or verification purposes.
Uninstallation
If you want to completely uninstall this instance of BitBroker, you should follow these steps:
This will remove all the key elements which were created as part of the installation. If you want to go further and clear out all the images which were downloaded too, use the following:
There are several ways in which you can install BitBroker, depending on what you are trying to achieve.
Local installations using Docker Compose or even direct on to your physical machine, can be a useful option for some use cases. For example, where you are developing data connectors or even trying to enhance BitBroker itself.
In this section, we will cover, in detail and step by step, all the different ways in which you can install a BitBroker instance locally, to help you achieve your goals.
Server Naming and Ports
For consistency across the system, in local and development mode we use a set of standard logical server names and ports for addressing the three principal API services.
This helps with readability and removes ambiguity, since some APIs share resource names. It also reduces confusion if you start multiple API servers on the same physical machine.
Logical Server Names
We use a convention of assigning each API service a logical server name as follows:
Coordinator API → bbk-coordinator
Contributor API → bbk-contributor
Consumer API → bbk-consumer
You will find these names used across all the documentation and in the sample code. This is merely a convention; you do not need to use these names in your code. You can, instead, use your cloud URLs or even base IP addresses.
If you choose to stick to this convention, you will need to map these names to their ultimate end-points inside your system hosts file. Here is an example, mapping the standard logical server names to localhost.
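On Linux and macOS these entries live in /etc/hosts (on Windows, in C:\Windows\System32\drivers\etc\hosts):

```
127.0.0.1    bbk-coordinator
127.0.0.1    bbk-contributor
127.0.0.1    bbk-consumer
```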
Each API service listens on a distinct, non-clashing port. Unless you configure it otherwise, even the docker images are designed to start each service on its designated port.
This port mapping makes it simple and unambiguous to bring up multiple (or indeed all) of these API services on the same physical machine, without them interfering with each other. The assigned service ports are as follows:
Coordinator API → port 8001
Contributor API → port 8002
Consumer API → port 8003
Development Only Headers
When installing BitBroker locally, authorization is bypassed. In this scenario, you do not need to supply authorization tokens to any API.
However, when using the Consumer API, you do still need to specify the policy you are using. This is so that BitBroker is aware of the data segment which is in-play for consumer calls. This is achieved by specifying a development header value.
Rather than the authorization header:
x-bbk-auth-token: your-token-will-go-here
You instead use the development header, as follows:
x-bbk-audience: your-policy-id-will-go-here
Failure to specify the development header on Consumer API calls when in local mode will lead to those requests being rejected.
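For example, a local mode Consumer API call might look like the sketch below; the policy ID is a placeholder, and the /v1/catalog path assumes the standard server naming, ports and catalog route described in this section:

```shell
curl http://bbk-consumer:8003/v1/catalog \
     -H "x-bbk-audience: your-policy-id-will-go-here"
```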
Bootstrap User
Every fresh installation of BitBroker comes with one preinstalled user (uid: 1). This user is automatically created when the system is brought up for the first time.
As we will explain below, this user is linked to the authorization of API calls. The user is also important for the upcoming web portal BitBroker control interface.
It is possible, but not recommended, to use the bootstrap user in normal operation. Instead, you should create your own master coordinator user and then use that for further operations.
Docker Compose Local Installation
In this section we will explore how you can use our pre-prepared Docker Compose files to install a BitBroker instance on your local machine.
This type of install is suitable for development purposes only.
This section assumes you are familiar with Containers and Docker. If you are new to these technologies, we advise that you become familiar with them first, before trying out the steps outlined below.
This section assumes you have already installed the Docker command line tools.
Prerequisites
Start with a clean machine, with no remnants of previous BitBroker installations. Please ensure you have the following software installed, configured and operational:
This will have created a bit-broker directory and you should move into it:
cd bit-broker
First, let’s start by preparing our instance’s environment file. For a standard local install, we can simply copy the existing one which came with the repository:
cp .env.example .env
Now we can use our docker compose scripts to install and launch our local instance:
docker-compose -f ./development/docker-compose/docker-compose.yml up
At this point, your BitBroker installation is up and running on your local machine. You can test it by running the steps below.
If you are going to develop on BitBroker, either for building data connectors or for contributing to the core system, then this mode of install can provide a fast and low friction installation technique.
Testing Your Installation
If everything worked as expected, the BitBroker API servers will be up-and-running on localhost waiting for calls. You can test this by using the sample call below. In this local mode, you don’t need to specify any authorization tokens.
curl http://bbk-coordinator:8001/v1
If you have not applied the standard server name and port format, then you should use http://localhost:8001 here as your API host base URL. The base end-points of all the three API servers respond with a small announcement:
Since you will be running BitBroker on your “bare” machine, this section assumes a list of preinstalled technologies, operating within the host OS. These are outlined in detail below.
Prerequisites
Start with a clean machine, with no remnants of previous BitBroker installations. Please ensure you have the following software installed, configured and operational:
This will have created a bit-broker directory and you should move into it:
cd bit-broker
For development purposes, BitBroker has a handy shell script called bbk.sh which can be used to manage the system in local install mode. You can use this to prepare your new clone:
./development/scripts/bbk.sh unpack
The unpack step makes sure that all the dependent node packages needed to operate the system are downloaded and ready. It also creates a .env file automatically, using the settings in the .env.example file.
At this point, your BitBroker installation is up and running on your local machine. You can test it by running the steps below. But first, let’s just see what other commands the bbk.sh script supports:
./development/scripts/bbk.sh <command>
command:
unpack → prepares a fresh git clone
start → starts bbk services
stop → stops bbk services
status → show bbk service status
logs → tails all bbk services logs
db → start a sql session
wipe → resets the bbk database
drop → drops the bbk database
bounce → stop » start
reset → stop » wipe » start
clean → stop » drop
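For example, a typical development session driven by this script (run from the top-level bit-broker folder) might look like:

```shell
./development/scripts/bbk.sh start     # bring up all the bbk services
./development/scripts/bbk.sh status    # check that everything is running
./development/scripts/bbk.sh logs      # tail the service logs while working
./development/scripts/bbk.sh stop      # shut the services down again
```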
If you are going to develop on BitBroker, either for building data connectors or for contributing to the core system, then this mode of install can provide a fast and low friction installation technique.
Testing Your Installation
If everything worked as expected, the BitBroker API servers will be up-and-running on localhost waiting for calls. You can test this by using the sample call below. In this local mode, you don’t need to specify any authorization tokens.
curl http://bbk-coordinator:8001/v1
If you have not applied the standard server name and port format, then you should use http://localhost:8001 here as your API host base URL. The base end-points of all the three API servers respond with a small announcement:
If you want to completely uninstall this instance of BitBroker, you can follow these steps, from the top-level bit-broker folder:
./development/scripts/bbk.sh clean
cd ..
rm -rf bit-broker
This will delete the Git cloned folder created earlier.
1.4 - Configuring BitBroker
Configuration options for a BitBroker instance
In this section, we cover how you can configure your BitBroker instance to align with your specific needs.
Whilst it is sometimes necessary to change some configuration, you should perform such actions with care. Incorrect configuration may mean that your instance fails to start or operates incorrectly.
If you follow the installation guides for either Kubernetes or for local, then these configuration issues will be taken care of for you, in that process. There is no need to hand-configure, unless your needs are specific and unusual.
Environment Settings
In this section, we outline the details of our installation environment file. This file is called .env and resides at the root of the BitBroker file layout.
There is no master record for this file in the repository; however, there is a file called .env.example, which contains all the most common parameters. You can activate this common set by simply copying the file:
cd bit-broker
cp .env.example .env
Here are all the settings in .env which can be modified:
APP_MODE (default: standard) - reserved for a future feature
APP_SERVER_METRICS (default: false) - enables express-prom-bundle metrics on each API server
APP_SERVER_LOGGING (default: false) - enables stdout logging on each API server
APP_FILE_LOGGING (default: false) - enables file based logging on each API server
APP_DATABASE (default: see .env.example) - PostgreSQL connection string; use CREDENTIALS as per the default
APP_SECRET (default: see .env.example) - instance level secret used to create secure hashes
BOOTSTRAP_USER_EMAIL (default: noreply@bit-broker.io) - the email for the bootstrap user
BOOTSTRAP_USER_NAME (default: Admin) - the name of the bootstrap user
BOOTSTRAP_USER_KEY_ID (default: see .env.example) - this parameter is reserved
COORDINATOR_PORT (default: 8001) - the listening port for the coordinator API
COORDINATOR_BASE (default: v1) - the version of the coordinator API
COORDINATOR_USER (default: see .env.example) - database access for the coordinator API
CONTRIBUTOR_PORT (default: 8002) - the listening port for the contributor API
CONTRIBUTOR_BASE (default: v1) - the version of the contributor API
CONTRIBUTOR_USER (default: see .env.example) - database access for the contributor API
CONSUMER_PORT (default: 8003) - the listening port for the consumer API
CONSUMER_BASE (default: v1) - the version of the consumer API
CONSUMER_USER (default: see .env.example) - database access for the consumer API
POLICY_CACHE (default: see .env.example) - Redis connection string for the policy cache
AUTH_SERVICE (default: see .env.example) - end-point for the auth service
RATE_SERVICE (default: see .env.example) - end-point for the rate service
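Pulling the defaults above together, a minimal .env might look like the sketch below; the parameters marked "see .env.example" (database, secret and service settings) are omitted here, since their values are install specific:

```
APP_MODE=standard
APP_SERVER_METRICS=false
APP_SERVER_LOGGING=false
APP_FILE_LOGGING=false
BOOTSTRAP_USER_EMAIL=noreply@bit-broker.io
BOOTSTRAP_USER_NAME=Admin
COORDINATOR_PORT=8001
COORDINATOR_BASE=v1
CONTRIBUTOR_PORT=8002
CONTRIBUTOR_BASE=v1
CONSUMER_PORT=8003
CONSUMER_BASE=v1
```

In practice, copying .env.example as described above is the safer route, since it also carries the reserved and credential bearing parameters.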
You will also see some parameters starting with TESTS_ in the .env.example. These are reserved parameters used by the Mocha test suite. You can ignore these values unless you are developing the core system itself.
Advanced Use Cases
In this section, we outline the details of some more advanced use cases.
Sorry, this section is pending. Come back soon…
2 - Key Concepts
All the key concepts that you need to understand to get the best out of BitBroker
BitBroker is made up of a series of interlocking concepts. To get the most out of BitBroker, you should understand what these concepts are and how they work together to form a complete system.
In this section, we outline these key concepts in detail.
2.1 - Introduction
A step-by-step introduction to the BitBroker system
In this section, we will go through all the key concepts which you need to understand to get the best out of BitBroker. The system is made up of a series of interlocking concepts and it’s important to understand how these operate and interact.
Step # 1 - Create Users
All activity within BitBroker is parceled out to known Users and what they can and can’t do is governed by what roles they play. The first step is to learn how to create users and how to assign them roles, depending on what you want them to be able to do.
Step # 2 - Create Entity Types
All information within BitBroker is stored and presented within the context of a high level enumeration called Entity Types. These are the object types which are naturally present within the domain under consideration. The next step is to learn how to create and manage entity types.
Step # 3 - Create Data Connectors
Next, for each entity type, you can create Data Connectors. Each of these has permission to contribute entity instance records for their given type. BitBroker marshals and manages these contributions, through a set of rules which you can define.
Step # 4 - The Catalog
Data connectors provide records to the BitBroker Catalog. This is a place where the existence of all the entity instances in the domain space is recorded, enumerable and searchable. It also provides the route for consumers to search, discover and use entity instance data.
Step # 5 - Create Data Sharing Policies
Data Sharing Policies are the main backbone of the BitBroker system. They are defined by coordinator users, who use them to specify the exact context in which they permit data to be accessed by consumers. In this step, you can learn how to define and manage these vital policies.
Step # 6 - Grant Access
Once all other elements are in place, you can proceed to grant Access to data. In BitBroker, such grants are only permitted within the context of a consumer user and a policy. The connection between these two system concepts is called an access.
2.2 - Users
All activity within BitBroker is parceled out to known Users and what they can and can’t do is governed by what roles they play. In this section, we will learn about users: how to create, update and delete them, and how user roles are employed to partition functionality and responsibility.
User Roles
All user activity within BitBroker is divided between three logical user roles. Each role has its own responsibilities and access control rules. These three user types come together to form a complete system.
These are “logical” roles and it is perfectly possible for one actual person to be adopting more than one role within a deployed instance.
In practice, users assume a role by being in receipt of an authorization token, which grants them the ability to perform actions using the corresponding API.
In this section, we will outline what the three user roles are. Later sections of this documentation will go on to explain how to create users and grant them such roles.
Coordinators
Coordinators have the most wide-ranging rights within a BitBroker system. They alone have the ability to perform key tasks such as: defining the entity types which are present, giving permission to contribute data and creating policy which governs data access. This is achieved by access to the Coordinator API.
It is envisaged that there will only be a small number of coordinator users within any deployed BitBroker instance. Coordinator roles should be limited to people to whom you want to grant wide responsibility over the entire system. Whilst there can be multiple Coordinators, there can never be zero Coordinators (the system will prevent you from deleting the last Coordinator).
Contributors
Contributors are users who have permission to contribute data within the context of a single specified entity type, via access to the Contributor API. Such rights can only be granted by a coordinator user.
If a person or organization is contributing data to many entity types within a single instance, they will adopt multiple, different contributor roles in order to achieve this. Hence, the number of contributors is linked to the number of data connectors, which is, in turn, linked to the number of entity types.
Consumers
Consumers are the end-users of the system and the people who will be subject to the data access policies created by coordinators.
Consumers come in via the “front door”, which is the Consumer API. They are the people who will be accessing the data to create their own applications and insights - but only ever in the context of a policy declaration which defines what data they can see and how they can access and use it.
There can be an unlimited number of consumer users in an active BitBroker instance.
2.3 - Entity Types
The types of entities which are the basis of a BitBroker instance
All information within BitBroker is stored and presented within the context of a high level enumeration called Entity Types.
Entity types are the object types which are naturally present within the domain under consideration. You can create and define any set of entity types which makes sense for your deployed instance. Only coordinator users have the ability to create, delete and modify entity types.
For example, here are some entity type enumerations which may naturally occur in different domains of operation:
You should choose your entity types with care, since they are difficult to modify once a system is operational.
Entity types should be real-world concepts which an average user would instinctively understand. They can represent physical, logical or virtual concepts within your domain. Typically, an entity type will be a noun within the domain space.
The list of known entity types forms the entire basis of a BitBroker instance. The Coordinator API documentation contains more detail about how to go about naming entity types and the attributes which must be present in order to create them.
Entity Instances
The whole point of entity types is to provide some structure for the list of entity instances - which form the bedrock of the data which BitBroker is managing. Entity instances are submitted into the BitBroker catalog via contributions from data connectors.
Entity Schemas
BitBroker is a contribution based system, meaning that data is contributed by a community of users. In some cases, these contributors will be people you have direct control over (and may well be other roles you yourself are playing).
However, in other instances, you may be relying upon second and third parties to contribute data into your BitBroker instance. It is entirely possible to have multiple contributors submitting entity instances for a shared entity type.
In scenarios where the contributor community is diverse, it can be helpful to define clear rules as to the nature and quality of the incoming data. Rules can be defined to ensure consistency of data types, formats and representation schemes. It is also vital to ensure semantic alignment between similar concepts being contributed by different users.
This can be achieved within a BitBroker system by specifying a JSON schema per entity type. Once this schema is in place, BitBroker will automatically validate all incoming records against it. Violations will be rejected and contributors will be informed as to the specific reasons why.
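To make the idea concrete, here is a minimal sketch of schema-based record validation. BitBroker performs full JSON Schema validation server-side; this hand-rolled check only mimics the concept, and the "country" schema and its field names are invented for the example.

```python
# Illustrative sketch only: the real system validates incoming records
# against a full JSON Schema; this simplified checker captures the idea
# of rejecting non-compliant contributions with specific reasons.

SCHEMA = {
    "required": ["name", "capital"],
    "types": {"name": str, "capital": str, "population": int},
}

def validate(record: dict, schema: dict) -> list:
    """Return a list of human-readable violations (empty means valid)."""
    errors = []
    for key in schema["required"]:
        if key not in record:
            errors.append(f"missing required attribute '{key}'")
    for key, expected in schema["types"].items():
        if key in record and not isinstance(record[key], expected):
            errors.append(f"attribute '{key}' should be of type {expected.__name__}")
    return errors

# a conforming record passes; a bad one is rejected with reasons
assert validate({"name": "France", "capital": "Paris"}, SCHEMA) == []
assert validate({"name": "France", "population": "67m"}, SCHEMA) == [
    "missing required attribute 'capital'",
    "attribute 'population' should be of type int",
]
```

In the real system, these reasons are what contributors see when their submissions are rejected.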
Such schemas are an optional extra and may not be required in all instances. You can specify such schemas at the points you create and/or modify entity types.
2.4 - Data Connectors
Data connectors which allow contribution of data into a BitBroker instance
BitBroker is a contribution based system, meaning that data is contributed by a community of users. In some cases, these contributors will be people you have direct control over (and may well be other roles you yourself are playing). However, in other instances, you may be relying upon second and third parties to contribute data into your BitBroker instance.
The vector through which data contribution is managed is the concept of data connectors. Each entity type within the system should have at least one data connector. However, it is entirely possible for an entity type to have multiple data connectors, all contributing entity instance records for that type.
A data connector can only contribute data to one entity type. If a party wants to contribute data to multiple entity types, you must create a different connector for each one. These can, in practice, point back to one data connector implementation.
A quick way to get going building data connectors is to adapt the example connectors which have been built for a range of data sources.
Managing Data Contribution
Whether or not the contribution is coming from internal or external players, the first step in managing the process is to create a data connector - housed within the entity type for which it has permission to submit records. Creating and managing data connectors can only be done by coordinator users.
As part of the creation process for connectors, the system will generate a connector ID and an authorization token. These will be returned to the coordinator user in response to the creation request. The coordinator should communicate these items in a secure manner to the party responsible for implementing the data connector. Here is an example of these items:
Connector authors should be aware of the important distinction between data and metadata within the context of the BitBroker catalog.
Third parties can implement their data connectors in whatever manner they want. The system only requires that they communicate their submissions via the HTTP based mechanisms outlined in the Contributor API. Coordinator users may well have their own requirements which they impose upon contributors. However, this is of no direct concern to BitBroker.
Because of the way responsibility is divided within the system, BitBroker has no need to know or understand how the original source data is stored or secured. This is purely the concern of the data connector authors.
Contribution is scoped to be within the connector’s domain space only. A connector cannot affect records delivered by another connector, even within the same entity type and even if they have a clashing key space.
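One way to picture this scoping is to think of the catalog as keyed by the pair (connector id, record key), so that identical keys from different connectors never collide. This is a conceptual sketch, not BitBroker's actual storage model; the connector ids and keys are invented.

```python
# Conceptual sketch: records are isolated per connector, so a clashing
# key space across connectors (even for the same entity type) is harmless.

catalog = {}

def upsert(connector_id: str, key: str, record: dict):
    # the effective catalog key includes the submitting connector
    catalog[(connector_id, key)] = record

upsert("conn-alpha", "GB", {"name": "United Kingdom"})
upsert("conn-beta", "GB", {"name": "Great Britain"})   # same key, different connector

assert catalog[("conn-alpha", "GB")]["name"] == "United Kingdom"
assert catalog[("conn-beta", "GB")]["name"] == "Great Britain"  # no clash
```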
Coordinators can use the mechanism of entity schemas to ensure alignment around things such as data types, formats, representation schemes and semantics.
Live vs Staging Connectors
By default, newly created connectors are not “live”. This means that data which they contribute will not be visible within the Consumer API. Implementing a data connector can be a tricky operation and may require a few attempts before data is being delivered to a sufficient quality. Hence, it is useful to isolate contributions from certain connectors into a “staging space” - so that they don’t pollute the public space visible to consumers.
In order to make a connector’s data visible, it must be promoted to live status. Only coordinator users can promote connectors in this way. They can also demote connectors, if they believe their data should no longer be publicly visible.
When developing a new connector, it is often useful to be able to see data from it alongside other public data. There is a mechanism available in the Consumer API which allows data connectors to see how their records will look alongside other existing public records.
2.5 - The Catalog
The catalog and how it facilitates search, discovery and access
The BitBroker catalog is the place where the existence of every entity instance in the domain space is recorded, making them enumerable and searchable. It also provides the route for consumers to search, discover and use entity instance data.
The BitBroker catalog performs the following specific functions:
The catalog is a place to store metadata about entity instances. Consumers use the catalog to search and discover entity instances which are needed for their applications. Once an entity instance has been discovered, its details can be obtained via calls to the relevant part of the Consumer API.
When consumers obtain a detailed record about a particular entity instance, BitBroker will ask its submitting data connector directly for its live and on-demand information via a webhook. It will then merge this with the catalog record and return the whole to the consumer. The consumer is unaware of the existence of data connectors, or even that some data is being drawn on-demand from another system.
What exactly is the difference between metadata and data in this context?
What data should you store in the catalog and what data should you retain for webhook callbacks? There is no right or wrong answer to this question. Typically, you should aim to store information which has a slow change rate in the catalog and retain other information for a callback.
Here are some examples of how this might operate in practice:
| Entity Type | Catalog Data | Live Webhook Data |
|---|---|---|
| thermometer | location, range, unit | value |
| bus | route, seats, operator | location |
| car-park | name, location, capacity | occupancy |
To be clear, data connectors can store everything in the catalog. This has the advantage that everything becomes searchable and they no longer need to host a webhook.
But, on the flip side, they have taken on the burden of sending constant updates to the catalog in order to keep it up to date. Here, the catalog is always likely to be a temporally lagging copy of the source stores.
Ultimately, the decision on how you use the catalog is up to the coordinators and data connector authors.
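The merge step described above can be sketched in a few lines. This is not BitBroker's internal code; it simply illustrates the idea of combining a slow-moving catalog record with fresh webhook values, using the car-park fields from the table above.

```python
# Sketch of the catalog + webhook merge a detail request triggers:
# the stored metadata is combined with live values from the connector.

def merged_record(catalog_record: dict, webhook_payload: dict) -> dict:
    # webhook values win for any overlapping keys, since they are live
    return {**catalog_record, **webhook_payload}

catalog_record = {"name": "Central", "location": "51.5,-0.1", "capacity": 400}
webhook_payload = {"occupancy": 287}          # live, on-demand value

record = merged_record(catalog_record, webhook_payload)
assert record["capacity"] == 400 and record["occupancy"] == 287
```

The consumer only ever sees the merged record, never the two halves.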
2.6 - Data Sharing Policy
Create, manage and deploy data sharing policies to grant access to data
Data sharing policies are the main backbone of the BitBroker system. They are defined by coordinators, who use them to specify the exact context in which they permit data to be accessed by consumers.
Policy definitions can only be submitted by coordinators via the corresponding end-points in the Coordinator API. Once a policy is deployed, accesses can be created which allow consumers to interact with the data specified in the ways allowed.
A policy definition is made up of three sections. We will outline each in detail.
Data Segment
The data segment section of a policy definition defines the maximum subset of data which will be visible when users access the Consumer API via a policy authorization token. The combination of a consumer authorization token and a policy authorization token is locked within a data segment. There is no action they can perform with the Consumer API to break out into the wider catalog of data.
A data segment definition is made up of a catalog query, which selects the entity instances that are visible, and a field mask: an array of strings naming attributes to be masked out of returned documents.
Data segments are defined via the same semantics as users searching the catalog using catalog queries. You should think of your data segment as a second query which will be boolean ANDed with the user’s queries.
If a consumer asks for entity instance data which is out-of-policy, they will be returned an HTTP/1.1 404 Not Found error. The system will deny the existence of such entities. If two consumers share a BitBroker link, where they have different policies, it is possible that one may see a data document and the other an HTTP/1.1 404 Not Found for the same requesting API end-point.
Field masks are a way of removing individual attributes from entity instance data records. You can only remove attributes from within the entity section of the overall document. You do this by specifying an array of strings, where each one is [entity-type].[attribute].
For example, if you specify [ "country.capital", "country.currency.name", "country.location" ], then those three attributes will be removed from the entity section of all returned documents.
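The masking behaviour can be sketched as follows. This is an illustrative implementation only, not BitBroker internals, and the country document below is an invented example:

```python
# Sketch: apply field masks of the form "[entity-type].[attribute]" to
# the entity section of a returned document.

def apply_masks(document: dict, masks: list, entity_type: str) -> dict:
    entity = dict(document.get("entity", {}))   # shallow copy; nested dicts
                                                # are mutated in place here
    for mask in masks:
        etype, _, path = mask.partition(".")
        if etype != entity_type:
            continue                            # mask is for another type
        keys = path.split(".")
        node = entity
        for key in keys[:-1]:                   # walk to the target's parent
            node = node.get(key, {})
        node.pop(keys[-1], None)                # remove the masked attribute
    return {**document, "entity": entity}

doc = {
    "id": "GB",
    "entity": {
        "capital": "London",
        "currency": {"name": "Pound", "code": "GBP"},
        "location": "54,-2",
        "population": 67000000,
    },
}
masked = apply_masks(
    doc, ["country.capital", "country.currency.name", "country.location"], "country"
)
assert masked["entity"] == {"currency": {"code": "GBP"}, "population": 67000000}
```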
Access Control
The access control section of a policy definition defines the ways in which users can interact with the Consumer API via a policy authorization token. An access control definition is made up of the following attributes:
| Attribute | Necessity | Description |
|---|---|---|
| enabled | required | Whether or not to use access control |
| quota | optional | An object describing allowable data quotas |
| quota.max_number | optional | A number of calls for the quota |
| quota.interval_type | optional | The quota period of either day or month |
| rate | optional | The maximum calls-per-second rate |
If you do not want to use the access control section, you can simply specify false for the enabled attribute. In this case, all other attributes will be ignored and the consumer will enjoy unrestricted access rates to the Consumer API.
Users with a policy authorization token will be able to make calls at a maximum rate of 250-per-second and with a maximum quota of 86,400-per-day.
If a consumer breaches a rate or quota limit, then calls to any part of the Consumer API will respond HTTP/1.1 429 Too Many Requests error. This response will persist until the breach has expired.
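The semantics above can be sketched as a simple counter model. The enforcement is done inside BitBroker, not by client code; this sketch just reuses the 250-per-second and 86,400-per-day figures quoted above to show when a 429 would be returned.

```python
# Illustrative model of rate and quota enforcement for a policy token.

RATE_LIMIT = 250        # max calls per second
QUOTA_LIMIT = 86_400    # max calls per day

def check(calls_this_second: int, calls_today: int) -> int:
    """Return the HTTP status BitBroker would respond with."""
    if calls_this_second >= RATE_LIMIT or calls_today >= QUOTA_LIMIT:
        return 429      # Too Many Requests, until the breach expires
    return 200

assert check(calls_this_second=10, calls_today=5_000) == 200
assert check(calls_this_second=250, calls_today=5_000) == 429   # rate breach
assert check(calls_this_second=10, calls_today=86_400) == 429   # quota breach
```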
Legal Context
The legal context section of a policy definition defines the legal basis on which data access is permitted. A legal notice will be present at the end of every entity instance record returned by any part of the Consumer API.
These legal notices are in the form of a JSON array of objects, each with three attributes:
| Attribute | Necessity | Description |
|---|---|---|
| type | required | The type of the legal notice |
| text | required | The description of the legal notice |
| link | required | A link for more information about the legal notice |
Examples of types of legal notices which may be present are:
| Type | Description |
|---|---|
| attribution | How the data should be attributed within your application |
| contact | Who to contact about the data and its use |
| license | The licenses under which this data sharing operates |
| note | Ad hoc notes about the data and/or its use |
| source | Information about the source or origination of the data |
| terms | The terms and conditions of use for the data |
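Putting the two tables together, a legal context is a JSON array of notice objects, each carrying the three required attributes. The notice values below are invented examples, not BitBroker defaults:

```python
import json

# Hypothetical legal_context section matching the attribute table above.
legal_context = [
    {"type": "attribution",
     "text": "Data provided by Example Org",
     "link": "https://example.org/attribution"},
    {"type": "license",
     "text": "Shared under CC BY 4.0",
     "link": "https://creativecommons.org/licenses/by/4.0/"},
    {"type": "contact",
     "text": "Contact the data team",
     "link": "https://example.org/contact"},
]

# every notice must carry all three required attributes
assert all(set(notice) == {"type", "text", "link"} for notice in legal_context)
print(json.dumps(legal_context, indent=2))
```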
It is possible that you may have more information about the legal basis for the use of your data by consumers. You may, for example, require consumers to perform additional legal steps in order to be given a consumer authorization token. This is outside the current scope of BitBroker.
2.7 - User Data Access
Managing data consumers and their associated data access tokens
Access to data within a BitBroker instance is always permitted within the context of a consumer and a policy. The connection between these two system concepts is called an access.
In this section, we outline these foundational principles in detail.
3.1 - API Architecture
How the APIs are structured and formatted
All the API sets within BitBroker conform to the commonly accepted standards of a RESTful API.
There are many resources around the web to learn and understand what RESTful APIs are and how to work with them. If you are unfamiliar with this API architecture, we encourage you to investigate it further before you use the BitBroker APIs.
Resource Manipulation
RESTful APIs use HTTP concepts to access and manipulate resources on the hosting server. Typical manipulations are to Create, Update and Delete resources. As with the standard RESTful convention, BitBroker maps HTTP methods to resource actions as follows:
| HTTP Method | Resource Action |
|---|---|
| HTTP/GET | Read a resource or a list of resources |
| HTTP/POST | Create a new resource |
| HTTP/PUT | Update an existing resource |
| HTTP/DELETE | Delete an existing resource |
Data Exchange
All data exchange with BitBroker APIs, both to and from the server, is in JavaScript Object Notation (JSON) format.
When posting data via an API (most often as part of a create or update action), the message body must be a valid JSON document and all the JSON keys within it should be “double-quoted”. If you are seeing validation errors indicating the JSON is incorrectly formed, you might want to try a JSON validator to get more detailed validation information.
When posting data via an API, the HTTP header Content-Type should always be set to application/json.
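As a sketch of these two rules together, here is how a JSON POST could be prepared in Python. The host, path and body values are placeholders taken from this page's sample conventions, and the request is built but not actually sent:

```python
import json
import urllib.request

# json.dumps guarantees double-quoted keys; Content-Type must be set.
body = json.dumps({"name": "Alice", "email": "alice@domain.com"}).encode("utf-8")

request = urllib.request.Request(
    "http://bbk-coordinator:8001/v1/user",          # placeholder sample URL
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(request) would send it; here we just verify shape
assert request.get_header("Content-type") == "application/json"
assert json.loads(body) == {"name": "Alice", "email": "alice@domain.com"}
```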
API Responses
RESTful APIs use HTTP response codes to indicate the return status from the call. BitBroker uses a subset of the standard HTTP response codes and maps them to call state as follows:
| HTTP Response | Type | API Call State |
|---|---|---|
| HTTP/1.1 200 OK | success | The request completed successfully and data is present in the response body |
| HTTP/1.1 201 Created | success | The requested resource was successfully created - the new resource’s URI will be returned in the Location attribute of the response header |
| HTTP/1.1 204 No Content | success | The request completed successfully, but there is no data in the response body |
| HTTP/1.1 400 Bad Request | error | The request was rejected because it resulted in validation errors - for example, a mandatory attribute was not sent in the request |
| HTTP/1.1 401 Unauthorized | error | The request was rejected because it contains an unapproved context - for example, a supplied authorization token was not valid or has expired (most often a failure of authorization) |
| HTTP/1.1 403 Forbidden | error | The request was rejected because it contains an invalid or expired context - for example, a supplied authorization token referring to a deleted policy |
| HTTP/1.1 404 Not Found | error | The request was rejected because the specified resource is not present |
| HTTP/1.1 405 Method Not Allowed | error | The request was rejected because the action is not permitted - for example, a user deleting themselves |
| HTTP/1.1 409 Conflict | error | The request was rejected because the action would cause a conflict on existing resource state - for example, creating a policy with a duplicate id |
| HTTP/1.1 429 Too Many Requests | error | The request was rejected because a limit has been exceeded - for example, the policy defined call rate or call quota has been breached |
Whenever you receive an error response (HTTP/1.1 4**), the response body will contain further information within a standard error response format.
Responses of HTTP/1.1 500 Server Error should never be returned in normal operation. If you see such an error, do help us to identify and rectify the underlying problem. Please raise an issue in our GitHub repository describing, in as much detail as possible, the circumstances which led to the error.
API Versioning
As the system develops, there may be changes to the API structure to encompass new concepts or features. Where API modifications imply code changes for existing clients, a new version of the API will be released.
For now, there is only one version of each of the API sets.
All the APIs within BitBroker include a version string as the lead resource. For example, the /v1/ part of the following:
http://bbk-coordinator:8001/v1/user
A version string must be present in every API call in each API set.
3.2 - Authorization and Tokens
How to send authorized requests to BitBroker APIs
All the API sets within BitBroker require authorization by callers. Whilst the process for authorization is common across the APIs, the context and reasons for authorization differ:
Coordinator API - needed to administer a BitBroker instance
Contributor API - needed to contribute data to a designated entity type
Consumer API - needed to access data via a policy definition
Except in some development modes, there is no access to BitBroker services without authorization and hence without having a corresponding authorization token.
Authorizing API calls
When you have obtained the relevant authorization token (see below), you can use it to authorize API calls by using a specified HTTP header attribute. The same authorization header structure is used for all three BitBroker API sets.
x-bbk-auth-token: your-token-will-go-here
The header must be exactly as it appears here, with the same casing and without spaces. If you do not correctly specify the authorization header, or you use an invalid authorization token, you will get an HTTP/1.1 401 Unauthorized error.
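Because the header name must be sent with this exact casing, a sketch of the raw request lines makes the shape clear. The token value is a placeholder and no request is actually made here:

```python
# Sketch: assemble the request lines, preserving the exact header casing.

TOKEN = "your-token-will-go-here"   # placeholder
headers = {"x-bbk-auth-token": TOKEN}

def build_request_lines(method: str, path: str, headers: dict) -> list:
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return lines

lines = build_request_lines("GET", "/v1/user", headers)
assert lines[1] == "x-bbk-auth-token: your-token-will-go-here"
```

Note that some HTTP client libraries normalize header casing on the wire; if yours does, verify that the server still accepts the request.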
Testing your Authorization Token
It can be useful to make test calls to the API to check that your authorization token is valid and that your authorization header is formatted correctly.
The base end-points of all three API servers respond with a small announcement. Like all BitBroker API end-points, these require a working authorization to be in place. Hence, this announcement can be used for testing or verification purposes.
See the section on testing installations for Kubernetes or local modes for more details.
You are encouraged to make successful test calls to these end-points before launching into more complex scenarios.
Obtaining an Authorization Token
The method used to obtain an authorization token differs across the three API sets. However, once you have an authorization token, the mechanics of authorization are the same. All authorization tokens are in the form of a long hexadecimal string, such as:
These authorization tokens should be kept by their owner in a secure location and never shared with unauthorized users.
BitBroker does not retain an accessible copy of authorization tokens. If tokens are lost, new tokens will have to be generated as per the instructions below. The lost authorization tokens will then be rescinded.
Obtaining a Coordinator Authorization Token
Coordinator authorization tokens are required to authorize calls to the Coordinator API. This API is used to perform administrative services for a BitBroker instance.
Coordinator authorization tokens are obtained by utilizing end-points on the Coordinator API, in order to promote a user to coordinator status. To do this you must first create a new user and then you must promote that user to be a coordinator.
Once you promote a user to be a coordinator, then their coordinator authorization token will be returned in the body of the response to that API call.
It is possible, but not recommended, to use the bootstrap user in normal operation. Instead, you should use the bootstrap coordinator token to create your own master coordinator user and then utilize their token for further operations.
If a coordinator authorization token is lost, then a new token will have to be generated. This can be done by first demoting the user from being a coordinator and then promoting them again. Note that, in this scenario, the old coordinator authorization token will be rescinded.
Obtaining a Contributor Authorization Token
Contributor authorization tokens are required to authorize calls to the Contributor API. This API is used to contribute data to a designated entity type.
Contributor authorization tokens are obtained by utilizing end-points on the Coordinator API, in order to create a connector on a given entity type.
Once the connector is created, then its contribution authorization token will be returned in the body of the response to that API call. More information about data connectors and data contribution is available in the key concepts section of this documentation.
If a contributor authorization token is lost, then a new token will have to be generated. This can be done by first deleting the connector and then creating it afresh. Note that, in this scenario, the old contributor authorization token will be rescinded.
Losing contributor authorization tokens can cause major disruption. Since the resolution involves deletion and the re-creation of the connector, all the associated data records will be lost and will have to be reinserted into the catalog by the new connector.
Recreating a previously deleted connector will result in the same connector ID, provided the new connector has the exact same name.
It is expected that the coordinator user, who creates the data connector, will distribute the contribution authorization token in a secure manner to the relevant party.
Obtaining a Consumer Token
Consumer authorization tokens are required to authorize calls to the Consumer API. This API is used to access data via a policy definition.
Consumer authorization tokens are obtained by utilizing end-points on the Coordinator API. To do this you must create an access, which is a link between a user and a policy definition.
Once you create such an access, then the consumer authorization token will be returned in the body of the response to that API call.
If a consumer authorization token is lost, then you can reissue the access to obtain a new token. Note that, in this scenario, the old consumer authorization token will be rescinded.
3.3 - Error Handling
How API errors are handled and communicated
All errors returned by all API services are presented in a standard error format. The complete list of errors which you may encounter is given in the earlier page detailing the API architecture.
Standard Error Format
All errors will be present in a common structure as follows:

| Attribute | Presence | Description |
|---|---|---|
| code | always | The HTTP response code |
| status | always | The standard text associated with the HTTP response code |
| message | always | A string (sometimes an array) containing extra information about the error condition |
Validation Error Format
In situations which lead to a validation error on inbound data, the response code will always be HTTP/1.1 400 Bad Request. In the case of such errors, the message attribute will always contain an array of validation errors:
{
  "error": {
    "code": 400,
    "status": "Bad Request",
    "message": [
      { "name": "name", "index": null, "reason": "does not meet minimum length of 1" },
      { "name": "email", "index": null, "reason": "does not conform to the 'email' format" }
    ]
  }
}
Here, the message attributes will be as follows:
| Attribute | Presence | Description |
|---|---|---|
| message | always | An array of validation error objects |
| message.name | always | A string indicating the id of the erroring attribute - this can be present multiple times, when there are multiple validation issues with the same attribute |
| message.index | always | An integer indicating the index of the erroring item, when it is within an array - this will be null for attributes which are not arrays |
| message.reason | always | A string indicating the validation error |
The validation error structure is designed to make it simple to integrate such errors into an end-user experience.
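For instance, a client could fold the structure above into per-field messages for a form. This is an illustrative sketch using the sample error body shown earlier:

```python
import json

# The sample validation error body from above, as a raw response string.
RESPONSE_BODY = """{"error":{"code":400,"status":"Bad Request","message":[
  {"name":"name","index":null,"reason":"does not meet minimum length of 1"},
  {"name":"email","index":null,"reason":"does not conform to the 'email' format"}]}}"""

def field_errors(body: str) -> dict:
    """Group validation reasons by the erroring attribute name."""
    error = json.loads(body)["error"]
    assert error["code"] == 400            # validation errors are always 400
    messages = {}
    for item in error["message"]:
        messages.setdefault(item["name"], []).append(item["reason"])
    return messages

errors = field_errors(RESPONSE_BODY)
assert errors["name"] == ["does not meet minimum length of 1"]
assert errors["email"] == ["does not conform to the 'email' format"]
```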
4 - Coordinator API
The main administrative API for the management of users, entity types, connectors and policies
The Coordinator API is the main administrative API for the BitBroker system. It is the API you will need to use to create and manipulate all the main system elements required to operate a BitBroker instance.
Before you use this API, you should become familiar with the general, system-wide API conventions - which are used across all three BitBroker API sets. This covers topics such as API architecture, authorization, error reporting and handling, etc.
4.1 - Users
APIs for creating and manipulating users
Users are a main component of the BitBroker system. You will find more details about users, within the key concepts section of this documentation.
Every fresh installation of BitBroker comes with one preinstalled user (uid: 1). This user is automatically created when the system is brought up for the first time.
In our sample calls, we use the standard server name and port format outlined for local installations and demos. If you have an alternative API host base URL, then enter it in the box below to update all the sample calls on this page:
Your API Host Base URL
All API calls in BitBroker require authorization. The sample calls below contain a placeholder string where you should insert your coordinator API authorization token. If you already have a token, enter it in the box below to update all the sample calls on this page:
Your Coordinator API Authorization Token
Creating a New User
New users can be created by issuing an HTTP/POST to the /user end-point.
HTTP/1.1 201 Created
Location: http://bbk-coordinator:8001/v1/user/2
The ID for the new user will be returned within the Location attribute of the response header.
The following validation rules will be applied to the body of a new user request.
| Attribute | Necessity | Validation Rules |
|---|---|---|
| name | required | String between 1 and 64 characters long |
| email | required | String between 1 and 256 characters long; must conform to email address format; must be unique across all users in the system |
Email addresses need to be unique across an operating BitBroker instance. If you attempt to create a user with a duplicate email address, it will result in an HTTP/1.1 409 Conflict response.
You cannot update the user’s email address, which was specified when they were created. Email addresses are system-wide unique identifiers. If you need to update an email address, your only option is to create a new user with that address.
The validation rules for updated user information are the same as those for creating new users.
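A client may wish to pre-check a request body against these rules before calling the API. The sketch below mirrors the table above; the email regex is a deliberately simple approximation of the server's email format check, not its actual implementation:

```python
import re

# Loose approximation of an email format check, for pre-validation only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def precheck_user(name: str, email: str) -> list:
    """Return client-side problems mirroring the server validation rules."""
    problems = []
    if not 1 <= len(name) <= 64:
        problems.append("name must be between 1 and 64 characters")
    if not 1 <= len(email) <= 256 or not EMAIL_RE.match(email):
        problems.append("email must be a valid email address up to 256 characters")
    return problems

assert precheck_user("Alice", "alice@domain.com") == []
assert precheck_user("", "not-an-email") == [
    "name must be between 1 and 64 characters",
    "email must be a valid email address up to 256 characters",
]
```

Even with a pre-check, the server remains the authority: uniqueness of the email address, for example, can only be verified by the API itself.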
List of Existing Users
You can obtain a list of all the existing users by issuing an HTTP/GET to the /user end-point.
[
  {
    "id": 1,
    "url": "http://bbk-coordinator:8001/v1/user/1",
    "name": "Admin",
    "email": "noreply@bit-broker.io",
    "coordinator": true,
    "accesses": []
  },
  {
    "id": 2,
    "url": "http://bbk-coordinator:8001/v1/user/2",
    "name": "Bob",
    "email": "alice@domain.com",
    "coordinator": false,
    "accesses": []
  }
  // ... other users here
]
Each user on the system will be returned within this array. Later sections of this document will explain what the coordinator and accesses attributes refer to. Note: There is currently no paging on this API.
Details of an Existing User
You can obtain the details of an existing user by issuing an HTTP/GET to the /user/:uid end-point.
In order to obtain details of a user, you must know their user ID (uid).
Promoting a User to Coordinator
Existing users can be promoted to coordinator status by issuing an HTTP/POST to the /user/:uid/coordinator end-point. In order to promote a user, you must know their user ID (uid).
The body of this response will contain the coordinator authorization token, which the newly promoted user should utilize to authorize their own calls to the Coordinator API. For example:
It is expected that the promoting user will securely distribute this coordinator authorization token to the promoted user.
Promoted users will gain coordinator privileges right away. When getting details for such users, their coordinator status will be reflected in the coordinator attribute:
{
  "id": 2,
  "url": "http://bbk-coordinator:8001/v1/user/2",
  "name": "Bob",
  "email": "alice@domain.com",
  "coordinator": true,     // this is the new coordinator status
  "accesses": [],
  "addendum": {}
}
If you attempt to promote a user who is already a coordinator, it will result in an HTTP/1.1 409 Conflict response. This error is benign and will not impact the user’s status.
Demoting a User from Coordinator
Existing users can be demoted from coordinator status by issuing an HTTP/DELETE to the /user/:uid/coordinator end-point.
In order to demote a user, you must know their user ID (uid).
Demoted users will lose coordinator privileges right away. The coordinator authorization token they were holding will no longer be valid. When getting details for such users, their coordinator status will be reflected in the coordinator attribute:
{
  "id": 2,
  "url": "http://bbk-coordinator:8001/v1/user/2",
  "name": "Bob",
  "email": "alice@domain.com",
  "coordinator": false,    // this is the new coordinator status
  "accesses": [],
  "addendum": {}
}
If you attempt to demote a user who is not a coordinator, it will result in an HTTP/1.1 409 Conflict response. This error is benign and will not impact the user’s status.
Demoting a user has no impact on any policy authorization tokens which they may be holding for the Consumer API.
You cannot demote yourself from being a coordinator. If you attempt this, the system will return an HTTP/1.1 405 Method Not Allowed response. This is done in order to prevent a situation where there are no coordinators present within an operating instance. If you wish to demote yourself, you will need to ask another coordinator to perform this action on your behalf.
Deleting a User
Existing users can be deleted from the system by issuing an HTTP/DELETE to the /user/:uid end-point.
In order to delete a user, you must know their user ID (uid).
Any policy authorization tokens which were assigned to a deleted user will be rescinded as part of the deletion process. Consumer API requests made with such tokens will fail (after a short propagation delay).
You cannot delete yourself from the system. If you attempt this, the system will return an HTTP/1.1 405 Method Not Allowed response. This is done in order to prevent a situation where there are no users present within an operating instance. If you wish to delete yourself, you will need to ask another coordinator to perform this action on your behalf.
User Addendum Information
Observant readers will have noticed an addendum section at the bottom of the user details object:
{
    "id": 2,
    "url": "http://bbk-coordinator:8001/v1/user/2",
    "name": "Bob",
    "email": "bob@domain.com",
    "coordinator": false,
    "accesses": [],
    "addendum": {}   // this is the addendum section
}
This section is reserved for use by the upcoming BitBroker Portal. We request that you do not use this section at this time.
In our sample calls, we use the standard server name and port format outlined for local installations and demos. If you have an alternative API host base URL, substitute it into the sample calls on this page.
All API calls in BitBroker require authorization. The sample calls below contain a placeholder string where you should insert your coordinator API authorization token.
Creating a New Entity Type
New entity types can be created by issuing an HTTP/POST to the /entity/:eid end-point.
In order to create an entity type, you must select a unique entity type ID (eid) for it.
curl http://bbk-coordinator:8001/v1/entity/country \
--request POST \
--include \
--header "Content-Type: application/json" \
--header "x-bbk-auth-token: your-token-goes-here" \
--data-binary @- << EOF
{
    "name": "Countries",
    "description": "All the countries in the world as defined by the UN",
    "schema": {},
    "timeseries": {
        "population": {
            "period": "P1Y",
            "value": "people",
            "unit": "x1000"
        }
    }
}
EOF
This will result in a response as follows:
HTTP/1.1 201 Created
Location: http://bbk-coordinator:8001/v1/entity/country
The following validation rules will be applied to the body of a new entity type request.

Attribute | Necessity | Validation Rules
eid | required | String between 3 and 32 characters long, consisting of lowercase letters, numbers and dashes only, starting with a lowercase letter, conforming to the regex expression ^[a-z][a-z0-9-]+$
timeseries | optional | A list of timeseries which are present on this entity type
timeseries.id | required | String between 3 and 32 characters long, consisting of lowercase letters, numbers and dashes only, starting with a lowercase letter, conforming to the regex expression ^[a-z][a-z0-9-]+$
Details about the meaning of these attributes can be found within the key concepts section of this documentation. The schema attribute is a powerful concept, which is explained in more detail there.
Entity type IDs are required to be unique across an operating BitBroker instance.
Timeseries IDs are required to be unique within their housing entity type.
By convention, entity types are expressed in the singular, non-plural form. So, for example, product is preferred over products.
You should choose your entity type IDs (eid) with care, as these cannot be changed once created.
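Because these IDs are permanent, it can be worth validating a candidate ID on the client side before issuing the creation request. The helper below is our own illustrative sketch (not part of any BitBroker library); it combines the length and pattern rules from the table above, which also apply to connector and policy IDs:

```python
import re

# IDs must be 3-32 characters long, consisting of lowercase letters,
# numbers and dashes only, and must start with a lowercase letter.
ID_PATTERN = re.compile(r'^[a-z][a-z0-9-]+$')

def valid_bbk_id(candidate: str) -> bool:
    """Client-side pre-check of the documented ID validation rules."""
    return 3 <= len(candidate) <= 32 and bool(ID_PATTERN.match(candidate))

assert valid_bbk_id('country')
assert not valid_bbk_id('Countries')   # uppercase is not allowed
assert not valid_bbk_id('1country')    # must start with a letter
assert not valid_bbk_id('io')          # too short
```

Note that passing this check does not guarantee acceptance: the ID must also be unique within its scope, which only the server can verify.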
Updating an Entity Type
Existing entity types can have their profile updated by issuing an HTTP/PUT to the /entity/:eid end-point.
In order to update an entity type, you must know its entity type ID (eid).
Great care should be taken when updating the JSON schema associated with an entity type. Consider whether this breaks an implicit contract that you may have made with existing data connectors for this entity type. It may make more sense to create a new entity type, if the amended schema definition is incompatible with the existing one.
The validation rules for updated entity type information, are the same as that for creating new entity types.
List of Existing Entity Types
You can obtain a list of all the existing entity types by issuing an HTTP/GET to the /entity end-point.
[
    {
        "id": "country",
        "url": "http://bbk-coordinator:8001/v1/entity/country",
        "name": "Countries",
        "description": "This is a new description for countries"
    }
    // ... other entity types here
]
Each entity type on the system will be returned within this array. Note: There is currently no paging on this API.
Details of an Existing Entity Type
You can obtain the details of an existing entity type by issuing an HTTP/GET to the /entity/:eid end-point.
In order to obtain details of an entity type, you must know its entity type ID (eid).
{
    "id": "country",
    "url": "http://bbk-coordinator:8001/v1/entity/country",
    "name": "Countries",
    "description": "This is a new description for countries",
    "schema": {},
    "timeseries": {
        "population": {
            "unit": "x1000",
            "value": "people",
            "period": "P1Y"
        }
    }
}
Deleting an Entity Type
Existing entity types can be deleted from the system by issuing an HTTP/DELETE to the /entity/:eid end-point.
In order to delete an entity type, you must know its entity type ID (eid).
Deleting an entity type will also delete all entity instances associated with that type from the Catalog.
Deleting an entity type will also delete all data connectors associated with that entity type. No further data contribution will be accepted for it.
Any policy authorization tokens which were issued where this entity type formed part of the data segment, will no longer return entity instances of this type. In some circumstances, this could render policy authorization tokens unfit for purpose.
4.3 - Data Connectors
APIs for creating and manipulating data connectors
Connectors are always created within the context of housing entity types, which can be created and manipulated using other parts of this API, described earlier in this documentation.
In order to use the sample calls in this section, first create the housing entity as outlined in the previous section.
Creating a New Connector
New connectors can be created by issuing an HTTP/POST to the /entity/:eid/connector/:cid end-point.
Connectors are always created within the context of a housing entity type and, hence, you must know its ID (eid). In order to create a connector, you must select a unique connector ID (cid) for it.
It is expected that the coordinator user will securely distribute the connector ID and authorization token to the operator of the new data connector.
Connector authorization tokens are not stored within the system. If you lose this connector authorization token, you will be forced to delete and recreate the connector to obtain a new one. This can have major implications for an operating instance.
The following validation rules will be applied to the body of a new connector request.

Attribute | Necessity | Validation Rules
cid | required | String between 3 and 32 characters long, consisting of lowercase letters, numbers and dashes only, starting with a lowercase letter, conforming to the regex expression ^[a-z][a-z0-9-]+$
name | required | String between 1 and 64 characters long
description | required | String between 1 and 2048 characters long
webhook | optional | String between 1 and 1024 characters long, which must conform to URI format
cache | optional | Integer between 0 and 31536000
Details about the meaning of these attributes can be found within the key concepts section of this documentation.
Connector IDs are required to be unique across an operating BitBroker instance.
Connectors which present no webhook are required to submit complete entity instance records to the catalog. Before deciding whether or not to host a webhook, take careful note of the distinction between data and metadata within the catalog.
You should choose your connector IDs (cid) with care, as these cannot be changed once created.
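Two of the optional attributes reward a quick sanity check before you create a connector: webhook must be a well-formed URI, and the upper bound on cache of 31536000 is the number of seconds in a (non-leap) year, which suggests the value is expressed in seconds. The helper below is our own illustrative sketch, not part of BitBroker:

```python
from urllib.parse import urlparse

# 31536000 = 365 days expressed in seconds, the documented 'cache' maximum.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
assert SECONDS_PER_YEAR == 31536000

def valid_webhook(url: str) -> bool:
    """Loose client-side check that a webhook looks like an absolute URI."""
    parts = urlparse(url)
    return 1 <= len(url) <= 1024 and bool(parts.scheme) and bool(parts.netloc)

assert valid_webhook('http://my-domain.com/connectors/1')
assert not valid_webhook('not-a-url')
```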
Updating a connector
Existing connectors can have their profile updated by issuing an HTTP/PUT to the /entity/:eid/connector/:cid end-point.
In order to update a connector, you must know the ID of its housing entity type (eid) and its connector ID (cid).
curl http://bbk-coordinator:8001/v1/entity/country/connector/wikipedia \
--request PUT \
--include \
--header "Content-Type: application/json" \
--header "x-bbk-auth-token: your-token-goes-here" \
--data-binary @- << EOF
{
    "name": "Wikipedia",
    "description": "A new description for the Wikipedia connector",
    "webhook": "http://my-domain.com/connectors/1",
    "cache": 0
}
EOF
This will result in a response as follows:
HTTP/1.1 204 No Content
The validation rules for updated connector information are the same as those for creating new connectors.
List of Existing Connectors
You can obtain a list of all the existing connectors housed within a parent entity type by issuing an HTTP/GET to the /entity/:eid/connector end-point.
[
    {
        "id": "wikipedia",
        "url": "http://bbk-coordinator:8001/v1/entity/country/connector/wikipedia",
        "name": "Wikipedia",
        "description": "A new description for the Wikipedia connector"
    }
    // ... other connectors here
]
Each connector, housed within a parent entity type, will be returned within this array. Note: There is currently no paging on this API.
Details of an Existing Connector
You can obtain the details of an existing connector by issuing an HTTP/GET to the /entity/:eid/connector/:cid end-point.
In order to obtain details of a connector, you must know the ID of its housing entity type (eid) and its connector ID (cid).
{
    "id": "wikipedia",
    "url": "http://bbk-coordinator:8001/v1/entity/country/connector/wikipedia",
    "name": "Wikipedia",
    "description": "A new description for the Wikipedia connector",
    "entity": {
        "id": "country",
        "url": "http://bbk-coordinator:8001/v1/entity/country"
    },
    "contribution_id": "9afcf3235500836c6fcd9e82110dbc05ffbb734b",
    "webhook": "http://my-domain.com/connectors/1",
    "cache": 0,
    "is_live": false,
    "in_session": false
}
Connector authorization tokens are not stored in an accessible way within the system and hence not present in this returned document.
Other sections of this document will explain what the is_live and in_session attributes refer to.
Promoting a Connector to Live
Connectors can be promoted to live status by issuing an HTTP/POST to the /entity/:eid/connector/:cid/live end-point.
In order to promote a connector, you must know the ID of its housing entity type (eid) and its connector ID (cid).
When getting details for promoted connectors, their is_live status will be reflected in the corresponding attribute:
{
    "id": "wikipedia",
    "url": "http://bbk-coordinator:8001/v1/entity/country/connector/wikipedia",
    "name": "Wikipedia",
    "description": "A new description for the Wikipedia connector",
    "entity": {
        "id": "country",
        "url": "http://bbk-coordinator:8001/v1/entity/country"
    },
    "contribution_id": "9afcf3235500836c6fcd9e82110dbc05ffbb734b",
    "webhook": "http://my-domain.com/connectors/1",
    "cache": 0,
    "is_live": true,
    "in_session": false
}
If you attempt to promote a connector which is already live, it will still result in an HTTP/1.1 204 No Content response. Such requests are benign and will not impact the connector’s status.
Promoting connectors will cause all the staged records which they have contributed to the Catalog to become visible for calls to the Consumer API.
Demoting a Connector from Live
Existing connectors can be demoted from live status by issuing an HTTP/DELETE to the /entity/:eid/connector/:cid/live end-point.
In order to demote a connector, you must know the ID of its housing entity type (eid) and its connector ID (cid).
When getting details for such connectors, their is_live status will be reflected in the corresponding attribute:
{
    "id": "wikipedia",
    "url": "http://bbk-coordinator:8001/v1/entity/country/connector/wikipedia",
    "name": "Wikipedia",
    "description": "A new description for the Wikipedia connector",
    "entity": {
        "id": "country",
        "url": "http://bbk-coordinator:8001/v1/entity/country"
    },
    "contribution_id": "9afcf3235500836c6fcd9e82110dbc05ffbb734b",
    "webhook": "http://my-domain.com/connectors/1",
    "cache": 0,
    "is_live": false,
    "in_session": false
}
If you attempt to demote a connector which is not live, this will still result in an HTTP/1.1 204 No Content response. Such requests are benign and will not impact the connector’s status.
Demoting connectors will remove all the records which they have contributed to the Catalog from calls to the Consumer API. Users holding on to fully qualified URLs for such records will instead receive HTTP/1.1 404 Not Found responses.
Deleting a connector
Existing connectors can be deleted from the system by issuing an HTTP/DELETE to the /entity/:eid/connector/:cid end-point.
In order to delete a connector, you must know the ID of its housing entity type (eid) and its connector ID (cid).
Deleting a connector will also remove all the records which it has contributed to the Catalog.
Deleting a connector will also remove its ability to contribute data. No further contribution will be accepted and all subsequent calls to the Contributor API will fail.
Any policy authorization tokens which were issued where this connector’s records formed part of the data segment, will no longer return entity instances contributed by it. In some circumstances, this could render policy authorization tokens unfit for purpose.
4.4 - Data Sharing Policy
APIs for creating and manipulating policies
Data Sharing Policies are a central component of the BitBroker system. You will find more details about policies within the key concepts section of this documentation.
Creating a New Policy
New policies can be created by issuing an HTTP/POST to the /policy/:pid end-point.
In order to create a policy, you must select a unique policy ID (pid) for it.
curl http://bbk-coordinator:8001/v1/policy/over-a-billion \
--request POST \
--include \
--header "Content-Type: application/json" \
--header "x-bbk-auth-token: your-token-goes-here" \
--data-binary @- << EOF
{
    "name": "The Most Populated Countries",
    "description": "Countries with a population of over a billion",
    "policy": {
        "access_control": {
            "enabled": true,
            "quota": {
                "max_number": 86400,
                "interval_type": "day"
            },
            "rate": 250
        },
        "data_segment": {
            "segment_query": {
                "type": "country",
                "entity.population": { "$gt": 1000000000 }
            },
            "field_masks": []
        },
        "legal_context": [
            {
                "type": "attribution",
                "text": "Data is supplied by Wikipedia",
                "link": "https://en.wikipedia.org/"
            }
        ]
    }
}
EOF
This will result in a response as follows:
HTTP/1.1 201 Created
Location: http://bbk-coordinator:8001/v1/policy/over-a-billion
The following validation rules will be applied to the body of a new policy request.

Attribute | Necessity | Validation Rules
pid | required | String between 3 and 32 characters long, consisting of lowercase letters, numbers and dashes only, starting with a lowercase letter, conforming to the regex expression ^[a-z][a-z0-9-]+$
legal_context | | An array of 0 to 100 objects outlining the legal basis of data sharing
legal_context.type | required | One of the enumeration: attribution, contact, license, note, source or terms
legal_context.text | required | String between 1 and 256 characters long
legal_context.link | required | String between 1 and 1024 characters long, which must conform to URI format
Details about the meaning of these attributes can be found within the key concepts section of this documentation.
Policy IDs are required to be unique across an operating BitBroker instance.
You should choose your policy IDs (pid) with care, as these cannot be changed once created.
Data sharing policies are complex objects. You should refer to the detailed explanation of these in the key concepts area, if you are unclear about any aspects of their construction.
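To make the data_segment semantics concrete, here is a toy evaluator for the MongoDB-style segment_query shown in the example policy. It is purely illustrative and supports only equality and the $gt operator; BitBroker's core engine evaluates the real queries on your behalf:

```python
def get_path(record: dict, path: str):
    """Resolve a dotted path such as 'entity.population'."""
    value = record
    for key in path.split('.'):
        value = value.get(key) if isinstance(value, dict) else None
    return value

def matches_segment(record: dict, query: dict) -> bool:
    """Toy subset of the query language: equality and $gt only."""
    for path, condition in query.items():
        value = get_path(record, path)
        if isinstance(condition, dict):
            if '$gt' in condition and not (value is not None and value > condition['$gt']):
                return False
        elif value != condition:
            return False
    return True

# The segment query from the example policy above
query = {'type': 'country', 'entity.population': {'$gt': 1000000000}}

india = {'type': 'country', 'entity': {'population': 1408000000}}
iceland = {'type': 'country', 'entity': {'population': 372000}}
assert matches_segment(india, query)        # inside the data segment
assert not matches_segment(iceland, query)  # outside the data segment
```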
Updating a Policy
Existing policies can have their profile updated by issuing an HTTP/PUT to the /policy/:pid end-point.
In order to update a policy, you must know its policy ID (pid).
Great care should be taken when updating policies, since authorization tokens may have been issued in the previous context. The ability to change the definition of live policies is a powerful feature, but comes with responsibility.
The new policy definition will come into play right away (after a small propagation delay). Authorization tokens issued prior to the change, will automatically become subject to the new definition.
The validation rules for updated policy information are the same as those for creating new policies.
List of Existing Policies
You can obtain a list of all the existing policies by issuing an HTTP/GET to the /policy end-point.
[
    {
        "id": "over-a-billion",
        "url": "http://bbk-coordinator:8001/v1/policy/over-a-billion",
        "name": "This is a new name",
        "description": "This is a new description"
    }
    // ... other policies here
]
Each policy on the system will be returned within this array. Note: There is currently no paging on this API.
Details of an Existing Policy
You can obtain the details of an existing policy by issuing an HTTP/GET to the /policy/:pid end-point.
In order to obtain details of a policy, you must know its policy ID (pid).
{
    "id": "over-a-billion",
    "url": "http://bbk-coordinator:8001/v1/policy/over-a-billion",
    "name": "This is a new name",
    "description": "This is a new description",
    "policy": {
        "data_segment": {
            "field_masks": [],
            "segment_query": {
                "type": "country",
                "entity.population": { "$gt": 1000000000 }
            }
        },
        "legal_context": [
            {
                "link": "https://en.wikipedia.org/",
                "text": "Data is supplied by Wikipedia",
                "type": "attribution"
            }
        ],
        "access_control": {
            "rate": 250,
            "quota": {
                "max_number": 86400,
                "interval_type": "day"
            },
            "enabled": true
        }
    }
}
Deleting a Policy
Existing policies can be deleted from the system by issuing an HTTP/DELETE to the /policy/:pid end-point.
In order to delete a policy, you must know its policy ID (pid).
Accesses are always created within the context of a user and a policy. These can be created and manipulated using other parts of this API, described elsewhere in this documentation.
In order to use the sample calls in this section, you must first create the user and the policy, as outlined in the previous sections.
Creating a New Access
New accesses can be created by issuing an HTTP/POST to the /user/:uid/access/:pid end-point.
Accesses are always created within the context of a user and a policy. Hence, you must know the user ID (uid) and the policy ID (pid) in order to create one.
HTTP/1.1 201 Created
Location: http://bbk-coordinator:8001/v1/user/2/access/over-a-billion
The body of this response will contain the authorization token, which the user should utilize to authorize their calls to the Consumer API. For example:
It is expected that the coordinator user will securely distribute this consumer authorization token to the user.
Reissuing an Access
Existing accesses can be reissued by issuing an HTTP/PUT to the /user/:uid/access/:pid end-point.
Accesses are always created within the context of a user and a policy. Hence, you must know the user ID (uid) and the policy ID (pid) in order to reissue one.
The body of this response will contain the new authorization token, which the user should utilize to authorize their calls to the Consumer API. For example:
[
    {
        "id": "over-a-billion",
        "url": "http://bbk-coordinator:8001/v1/user/2/access/over-a-billion",
        "policy": {
            "id": "over-a-billion",
            "url": "http://bbk-coordinator:8001/v1/policy/over-a-billion"
        },
        "created": "2022-06-01T14:18:30.635Z"
    }
    // ... other accesses here
]
Each access the user has will be returned within this array. Note: There is currently no paging on this API.
A shorter list of accesses by user can also be obtained when you ask for the details of an individual user. For example:
{
    "id": 2,
    "url": "http://bbk-coordinator:8001/v1/user/2",
    "name": "Alice",
    "email": "alice@domain.com",
    "coordinator": false,
    "accesses": [
        {
            "id": "over-a-billion",
            "url": "http://bbk-coordinator:8001/v1/user/2/access/over-a-billion"
        }
        // ... other accesses here
    ],
    "addendum": {}
}
Details of an Existing Access
You can obtain the details of an existing access by issuing an HTTP/GET to the /user/:uid/access/:pid end-point.
Accesses are always created within the context of a user and a policy. Hence, you must know the user ID (uid) and the policy ID (pid) to get its details.
Once the access is deleted, the authorization token issued with it will no longer return any results for any of the end-points which form the Consumer API.
5 - Contributor API
The API for managing the contribution of data to a BitBroker instance
The Contributor API is the API which is used for submitting data contributions into the BitBroker catalog. It is tightly connected with the concepts of entity types and their associated data connectors.
It is important that you understand these, and other key concepts, before you begin using the Contributor API.
Before you use this API, you should become familiar with the general, system-wide API conventions - which are used across all three BitBroker API sets. This covers topics such as API architecture, authorization, error reporting and handling, etc.
5.1 - Contributing Records
How connectors contribute entity instance records to BitBroker
All the data being managed by a BitBroker instance, enters the system via the Contribution API. The process of contributing such data is documented in detail in this section.
In this section, we will consider the basic use case of contributing entity instance records. Later sections of this documentation will detail how you can contribute live, on-demand data and timeseries data.
Contributing data is tightly bound with the concepts of entity types and their associated data connectors. All contributions happen in the context of these important system elements. It is vital that you fully understand these and other key concepts before using this API to contribute records.
A quick way to get going building your own data connectors is to adapt the example connectors which have been built for a range of data sources.
All API calls in BitBroker require authorization. The sample calls below contain a placeholder string where you should insert your contributor API authorization token. This authorization token should have been provided to you by the coordinator user who created your data connector within BitBroker.
The sample calls in this section will not work as-is. Contributor API calls require the use of session IDs, which are generated on-demand. Hence, the sample calls here are merely illustrative.
Contributing Records to the Catalog
We will assume for the purposes of this section that an entity type and its associated data connector have been created and are present within the system. Further, that the connector ID and authorization token, which were obtained when the data connector was created, have been recorded and are available.
Data can now be contributed into the catalog by this data connector, but within the context of its parent entity type only. Hence, we say that a single connector contributes “entity instance records”. If one organization wants to contribute data to multiple entity types, then they must do this via multiple data connectors.
The process of contributing entity instance records into the catalog breaks down into three steps:
Create a data contribution session
Upsert and/or delete records into this session
Close the session
These steps are achieved via an HTTP based API, which we outline in detail below. Each data connector will have a private end-point on this API which is waiting for its contributions.
It is important to understand the distinction between data and metadata in the context of the BitBroker instance. It is an expectation that only metadata is being contributed into the catalog and that live data is kept back for on-demand requests. This distinction is outlined in more detail in the key concepts documentation.
It is important to understand that data connectors might be in a live or staged state. That is, their contribution might be approved for the live catalog, or might be being held back into a staging space only. This concept is outlined in more detail in the key concepts documentation. There is a mechanism available in the Consumer API which allows data connectors to see how their records will look alongside other existing public records.
If your connector is marked as “non-live”, your data contribution will not become visible to consumers. If you want to make your connector “live”, you must ask the coordinator user who created the connector for you.
Sessions
Sessions are used by the Contribution API to manage inbound data coming from the community of data connectors. Sessions allow the connectors to contribute entity instance records in well-defined ways, which are respectful of the state management of the source data store.
BitBroker supports three types of sessions: stream, accrue and replace. Each one provides for different update and delete contexts.
You can only have one session open at a time. If you open a new session without closing a previous one, the previous one is implicitly closed with a rollback request.
The three types of session provide for different application logic in the following areas:
Whether data is available to consumers whilst the session is still open, or only after it is closed.
Whether the data provided within a session adds to or replaces earlier data from your connector.
Here is the detail of how each session type functions:

Area | Stream | Accrue | Replace
Data visibility | as soon as posted | on session close | on session close
Data from previous session | in addition to | in addition to | replaces entirely
Let’s explore each of these in more detail:
Stream Sessions
Stream sessions are likely to be the default mode of operation for most data connectors. Inbound entity instance records arrive in the catalog as soon as they are posted and whilst the session remains open. They are immediately available to consumers to view via the Consumer API.
New records are in addition to existing records in the catalog and removal must be explicitly requested. Closing a stream session is a moot operation, since the session type is essentially an “open pipe” into the catalog. In fact, stream sessions can be opened and left open indefinitely.
Type | Close | Action
stream | open | session data is already visible, in addition to previous data
stream | close true | no operation - session data is already visible, in addition to previous data
stream | close false | no operation - session data is already visible, in addition to previous data
Accrue Sessions
Accrue sessions are useful when entity instance records should only become visible as complete sets. In this scenario, the entity instance records contributed within a session only become visible via the Consumer API when the session is closed - and hence only as a complete set.
New records are in addition to existing records in the catalog and removal must be explicitly requested. When you close an accrue session, you must specify a commit state as true or false. Closing the session with true makes the contributed records visible in the Consumer API, but closing it with false will discard all the records contributed within that session.
Type | Close | Action
accrue | open | session data not visible, but previous data is
accrue | close true | session data now becomes visible, in addition to previous data
accrue | close false | session data is discarded and previous data persists
Replace Sessions
Replace sessions are useful when contributed entity instance records should completely replace the set provided in previous sessions. In this scenario, the entity instance records contributed within a session become visible via the Consumer API as a complete set when the session is closed; all the records contributed in earlier sessions are discarded. Replace sessions are therefore useful when you cannot maintain state about earlier contributions, and hence each contribution is a complete statement of your record set.
New records replace existing records in the catalog, and removal of these "old" records is implicit. When you close a replace session, you must specify a commit state as true or false. Closing the session with true makes the contributed records visible in the Consumer API and deletes records from previous sessions. However, closing it with false will discard all the records contributed within that session, and previously contributed records will remain untouched.
Type | Close | Action
replace | open | session data not visible, but previous data is
replace | close true | session data now becomes visible and replaces all previous data
replace | close false | session data is discarded and previous data persists
As you can see, picking the right session type is vitally important to ensure you make the best use of the catalog. In general, you should aim to use a stream type session where you can, as this is the simplest.
If you don’t want clients to be able to see intermediate updates in the catalog, then accrue and replace may be better options. Where you don’t want to (or can’t) store any state about what you previously sent to the catalog, then replace is probably the best option.
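The session semantics described above can be condensed into a small, in-memory model. This is a toy illustration of the visibility rules only (our own sketch, not BitBroker code):

```python
class ToyCatalog:
    """Models record visibility for stream, accrue and replace sessions."""

    def __init__(self):
        self.visible = {}   # what the Consumer API would return
        self.staged = {}    # records posted in the currently open session

    def open(self, mode):
        assert mode in ('stream', 'accrue', 'replace')
        self.mode, self.staged = mode, {}

    def upsert(self, records):
        # stream: records become visible as soon as posted;
        # accrue/replace: records are held back until close
        target = self.visible if self.mode == 'stream' else self.staged
        for r in records:
            target[r['id']] = r

    def close(self, commit=True):
        if self.mode != 'stream' and commit:   # closing a stream is a no-op
            if self.mode == 'replace':
                self.visible = {}              # discard all previous data
            self.visible.update(self.staged)
        self.staged = {}

cat = ToyCatalog()
cat.open('accrue')
cat.upsert([{'id': 'GB'}])
assert 'GB' not in cat.visible      # accrue: hidden until close true
cat.close(commit=True)
assert 'GB' in cat.visible
cat.open('replace')
cat.upsert([{'id': 'FR'}])
cat.close(commit=True)
assert 'FR' in cat.visible and 'GB' not in cat.visible   # replaced entirely
```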
Using Sessions
There are only three HTTP calls which your data connectors need make in order to contribute records into the catalog.
Opening a Session
New sessions can be created by issuing an HTTP/GET to the /connector/:cid/session/open/:mode end-point.
In order to open a session, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker.
You will also need to select one of the three session modes from stream, accrue and replace. These should be specified in lowercase and without any spaces.
The body of this response will contain a session ID (sid), which should be recorded as it will be needed for subsequent API calls. For example:
4527eff4-d9cf-41c0-9ecc-8e06b57fcf54
Posting Records in a Session
Once you have an open session, you can post two types of actions to it in order to manipulate your catalog entries.
upsert to update or insert a record into the catalog
delete to remove an existing record from the catalog
You can only make changes to your own records within the catalog. Your data connector will have no effect on records which came from other connectors - even if you share an entity type with them.
Entity instance records can be upserted or deleted by issuing an HTTP/POST to the /connector/:cid/session/:sid/:action end-point.
In order to post record actions, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker. You must also know the session ID (sid), which was returned in the previous step where a session was opened.
Finally, you will also need to select one of the two valid actions from upsert and delete. These should be specified in lowercase and without any spaces.
You can specify upsert and/or delete record actions, but these cannot be mixed into a single API call. However, you can upsert and delete as many times as you wish within an open session.
Your upsert and delete actions will be executed in the strict order in which they were sent. You can safely upsert and then delete an entity instance within a single session boundary, if you so wish.
Care should be taken to ensure that the session ID (sid) used to post updates is the ID which was returned in the last call to open a session. If you send an old, invalid or mismatched session ID, it will result in an HTTP/1.1 403 Forbidden response. This will have no impact on any existing open session.
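A minimal Python sketch of posting record actions; the host and the auth header name are assumptions, while cid and sid come from the earlier steps:

```python
import json
import urllib.request

def record_action_url(base, cid, sid, action):
    """Build the record-action URL; action is upsert or delete."""
    assert action in ("upsert", "delete")
    return f"{base}/connector/{cid}/session/{sid}/{action}"

def post_records(base, cid, sid, action, records, token):
    """POST a JSON array of records and return the processing report."""
    req = urllib.request.Request(
        record_action_url(base, cid, sid, action),
        data=json.dumps(records).encode(),
        headers={"Content-Type": "application/json",
                 "x-bbk-auth-token": token},  # header name is an assumption
        method="POST",
    )
    with urllib.request.urlopen(req) as res:
        return json.loads(res.read())

# Upserting an empty array is valid, but achieves nothing:
# post_records(base, cid, sid, "upsert", [], token)
```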
In the example above, we upsert an empty array - this is obviously not useful. Let’s now look in detail at how records are inserted, updated and deleted using this API call.
Upserting records
When you post an upsert request, you should include an array of entity instances in JSON format within your post body. Each record can contain the following attributes:
| Attribute | Necessity | Validation Rules |
|-----------|-----------|------------------|
| id | required | String between 1 and 64 characters long |
| name | required | String between 1 and 64 characters long |
| entity | required | An object conforming to the entity schema for this entity type |
| instance | optional | An object containing other, ancillary information |
Only expected attributes will be stored within the catalog. Any other attributes which are sent will simply be ignored.
It is important to understand the difference between the three classes of attributes which can be present within each entity instance record:
Global Attributes
These attributes are required to be present for every entity instance in the system, regardless of its entity type. This set consists of only these attributes:
| Attribute | Description |
|-----------|-------------|
| id | Your domain key for this entity instance |
| name | A human-readable name describing this entity instance |
Entity Attributes
These attributes are required to be present for every entity instance of a given entity type. This set of attributes will have been communicated to you by the coordinator user who created your connector within BitBroker. It will be presented in the form of a JSON schema.
Instance Attributes
These attributes only exist for a given entity instance in the system. This is a free format object which can be used to store additional or ancillary information.
This simple hierarchy of three classes (global, entity and instance) is designed to give consumers maximum assurance about which data can be expected to be available to them:
They can always expect to find the global data present
They have firm expectations about data availability within an entity type
They understand that instance data is ad-hoc and cannot be relied upon
If any record within the posted record set contains a validation error, then the entire set will be rejected. The call will return an HTTP/1.1 400 Bad Request response and the body of the response will contain details of every record validation error which was encountered in the standard validation error format.
The catalog will decide whether to insert or update your record based upon the domain key which you supplied in the id field of each posted record. If a record already exists with this key, it will be updated - otherwise it will be inserted.
Your records are scoped to be within your data connector space only. You cannot affect records delivered by another data connector, even within the same entity type and even if you have a clashing key space.
Here is the post body for an example upsert request for a set of three records:
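The following is an illustrative reconstruction in Python, using a hypothetical country entity type; the entity fields and values are assumptions, and your own entity schema is the real authority:

```python
import json

# Three illustrative records: each carries the required id and name,
# a schema-governed entity object, and an optional free-format
# instance object.
records = [
    {"id": "GB", "name": "United Kingdom",
     "entity": {"capital": "London", "population": 66040229},
     "instance": {"independence": 1066}},
    {"id": "BR", "name": "Brazil",
     "entity": {"capital": "Brasília", "population": 209469333}},
    {"id": "IN", "name": "India",
     "entity": {"capital": "New Delhi", "population": 1352617328}},
]

body = json.dumps(records)  # this JSON array forms the POST body
```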
Whenever records are upserted into the catalog, it will return a report to the caller with information about how each posted record was processed. For example, for the three records above, you might get a report such as:
In the report, you will see a row for every record that was posted, alongside the BitBroker key which is being used for this entity instance. This is the key which consumers will use in order to retrieve this record via the Consumer API.
There is no expectation that you need to store this consumer key, if you do not wish to do so. You should continue to simply use your own domain key for your catalog interactions.
Deleting records
When deleting records from the catalog, you need to simply post an array of your domain keys for the records to be removed. These should be the same domain keys you specified when you upserted the records. For example, to remove two of the records upserted in the previous step, the post body would need to be:
["GB","BR"]
Whenever records are deleted from the catalog, it will return a report to the caller with information about how each posted ID was processed. For example, for the two IDs above, you might get a report such as:
In the report, you will see a row for every ID that was posted, alongside the BitBroker key which was being used for this (now removed) entity instance. This is the key which consumers will have used in order to retrieve this record via the Consumer API.
If you post a domain key to delete a record which does not exist in the catalog, this will simply be ignored.
Closing a Session
After entity instance records have been posted, you can close a session by issuing an HTTP/GET to the /connector/:cid/session/:sid/close/:commit end-point.
In order to close a session, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker. You must also know the session ID (sid), which was returned in the previous step where a session was opened.
Finally, you will also need to select one of the two valid commits from true and false. These should be specified in lowercase and without any spaces.
The exact mechanics of closing a session depend on the type of session that was specified when it was opened. This was covered in detail in the earlier section on session types.
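A closing sketch in Python; the host and auth header name are assumptions:

```python
import urllib.request

def session_close_url(base, cid, sid, commit):
    """Build the close-session URL; commit is the string 'true' or 'false'."""
    assert commit in ("true", "false")
    return f"{base}/connector/{cid}/session/{sid}/close/{commit}"

def close_session(base, cid, sid, commit, token):
    """Issue the HTTP/GET which closes the session."""
    req = urllib.request.Request(
        session_close_url(base, cid, sid, commit),
        headers={"x-bbk-auth-token": token},  # header name is an assumption
    )
    urllib.request.urlopen(req).close()
```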
5.2 - Hosting a Webhook
How to use webhooks to incorporate live and on-demand data
The BitBroker catalog is expected to contain information which enables search and discovery of entity instances. Hence, it contains key metadata - but it does not normally contain actual entity data. This is pulled on-demand via a webhook hosted by the data connector which contributed the entity record.
The distinction between data and metadata is covered in more detail in the key concepts documentation. Depending on how data and metadata is balanced in a BitBroker instance, there may or may not be a requirement to host a webhook.
In this section, we will outline how to implement a webhook within a data connector.
A quick way to get going integrating a webhook into your own data connector is to adapt the example connectors which have been built for a range of data sources.
It is permitted and valid for one webhook to service the needs of multiple data connectors. Sufficient inbound information will be provided to allow the webhook to be clear about which entity instance data is being requested.
Your webhook should be an HTTP server which is capable of receiving calls from the BitBroker instance. You can host this server in any manner you like, however the coordinator of your BitBroker may have their own hosting and security requirements of it.
You need to maintain your webhook so that it is always available to its connected BitBroker instance. If your webhook is down or inaccessible when BitBroker needs it, this will result in a poor experience for consumers using the Consumer API. In this scenario, they will only see partial records. Information about misbehaving data connectors will be available to coordinator users.
Required End-points
You are required to implement two end-points as part of your webhook deployment.
Whilst BitBroker advertises its own key space to its consumers, there is no need for data connectors to take heed of these. They can continue to concern themselves with only their own domain key space. When BitBroker makes requests of your webhook, it will only ever use its own key space.
Whenever your webhook is called, it will be in the context of an on-demand request - meaning that the call is in the direct line of response to a waiting user of the Consumer API. Hence, you should endeavor to respond to webhook calls in a timely manner. Information about poorly performing data connectors will be available to coordinator users.
Entity End-point
The entity end-point is used by BitBroker to get the full data record for an entity instance. It will be called with the entity type and with your own domain key, which you previously submitted into the catalog.
The entity type is presented here to allow for scenarios where one webhook is servicing the needs of multiple data connectors.
In response to this call, you should return a JSON object consisting of entity and instance attributes only - all other attributes will be ignored. The object you return will be merged with the catalog record which you provided earlier. Hence, there is no need to resupply the catalog information you have already submitted in previous steps.
For example, consider this (previously submitted) catalog record:
The system will then merge this live information with the catalog record to send a combined record to the consumer.
```json
{
    "id": "GB",
    "name": "United Kingdom",
    "type": "country",
    "entity": {
        "area": 242900,
        "calling_code": 44,
        "capital": "London",
        "code": "GB",
        "continent": "Europe",
        "currency": {
            "code": "GBP",
            "name": "Sterling"
        },
        "population": 66040229,
        "inflation": 4.3  // this has been merged in
    },
    "instance": {
        "independence": 1066,
        "temperature": 18.8  // this has been merged in
    }
}
```
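A webhook serving the entity end-point can be sketched with Python's standard library; the /entity/:type/:id path shape and the backing data are assumptions for illustration:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative backing data, keyed by your own domain key.
COUNTRIES = {
    "GB": {"entity": {"inflation": 4.3},
           "instance": {"temperature": 18.8}},
}

def entity_record(etype, key):
    """Return only entity and instance attributes, or None if unknown.

    etype is unused in this sketch; it matters when one webhook
    services several data connectors."""
    return COUNTRIES.get(key)

class Webhook(BaseHTTPRequestHandler):
    def do_GET(self):
        parts = self.path.strip("/").split("/")
        record = (entity_record(parts[1], parts[2])
                  if len(parts) == 3 and parts[0] == "entity" else None)
        self.send_response(200 if record else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        if record:
            self.wfile.write(json.dumps(record).encode())

# To serve: HTTPServer(("", 8080), Webhook).serve_forever()
```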
Timeseries End-point
The timeseries end-point is used by BitBroker to get timeseries information associated with an entity instance previously submitted into the catalog.
Not all entity types will have timeseries associated with them. When they do, this callback is vital, since no timeseries data points are held within the catalog itself. Only the existence of timeseries and key metadata about them is stored.
The timeseries end-point has the following signature, taking these URL parameters:

| Attribute | Presence | Description |
|-----------|----------|-------------|
| id | always | Your own domain key, which you previously submitted into the catalog |
| tsid | always | The ID of the timeseries associated with this entity instance |
| start | sometimes | The earliest timeseries data point being requested. When present, an ISO 8601 formatted date |
| end | sometimes | The latest timeseries data point being requested. When present, an ISO 8601 formatted date |
| limit | always | The maximum number of timeseries points to return. An integer greater than zero |
Further information about the possible URL parameters supplied with this callback:

| Attribute | Information |
|-----------|-------------|
| start | Should be treated as inclusive of the range being requested. When not supplied, assume a start from the latest timeseries point |
| end | Should be treated as exclusive of the range being requested. When present, this will always be after the start. Never present without start also being present. When not supplied, defer to the limit count |
| limit | Takes precedence over the start and end range. The end may not be reached, if limit is breached first |
Then the webhook should respond with timeseries data points as follows:

```json
[
    { "from": 1910, "to": 1911, "value": 5231 },
    { "from": 1911, "to": 1912, "value": 6253 }
    // other timeseries points here
]
```
You should return your timeseries points with the latest first. The first item of a returned array should always represent the latest data point.
Specifying both from and to is rare - in most cases, only a from will be present. You can place any data type which makes sense for your timeseries in the value attribute. But this should be consistent across all the timeseries points you return.
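The start/end/limit semantics above can be sketched as a selection function which a webhook might apply, assuming it holds its points latest-first:

```python
def select_points(points, start=None, end=None, limit=1):
    """Apply the webhook's range semantics to a latest-first timeseries.

    start is inclusive, end is exclusive, and limit always takes
    precedence over the range."""
    selected = [p for p in points
                if (start is None or p["from"] >= start)
                and (end is None or p["from"] < end)]
    return selected[:limit]
```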
6 - Consumer API
The outward facing API which consumers will use to access data from BitBroker
The Consumer API is the main outward facing API for the BitBroker system. It is the API used by consumers to access data from the system, but always in the context of a data sharing policy.
All Consumer API calls happen in the context of a data sharing policy. The policy defines a data segment which you are permitted access to. There are no operations within the Consumer API which allow you to “break out” of this defined data segment.
Before you use this API, you should become familiar with the general, system-wide API conventions - which are used across all three BitBroker API sets. This covers topics such as API architecture, authorization, error reporting and handling, etc.
Paging Lists
It is possible that record lists returned from APIs may run to a great many records. There is a hard limit to how many records will be returned in a single call to any Consumer API.
The hard limit to how many records will be returned in a single call to the Consumer API is 250
It is possible to page access to record lists by using a set of URL query parameters, as follows:
| Attribute | Necessity | Description |
|-----------|-----------|-------------|
| limit | optional | An integer number of data records, between 1 and 250 |
| offset | optional | A positive integer index of the first desired record |
If the paging parameters are incorrect, the API will respond with a standard validation error containing details of the violation.
If the specified offset is greater than the count of records available, the API will return an empty list. Using limit and offset you can page your way through a long list, getting the entire list a page at a time.
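The paging loop can be sketched generically; here fetch_page stands for whatever function issues the HTTP/GET with the given limit and offset parameters:

```python
def fetch_all(fetch_page, page=250):
    """Walk a long list a page at a time.

    fetch_page(limit, offset) -> list of records; a short or empty
    page signals the end of the list."""
    items, offset = [], 0
    while True:
        batch = fetch_page(page, offset)
        items += batch
        if len(batch) < page:
            return items
        offset += page
```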
Timeseries data has separate paging semantics, which are more attuned to its time-value pair data points.
Rate and Quota Limits
All Consumer API calls happen within the context of a data sharing policy. Amongst other things, policy defines a rate and quota limit on calls you can make with the Consumer API. These should have been communicated to you by the coordinator user who gave you your consumer authorization token.
Rate - the maximum calls-per-second that you are allowed to make (e.g. 250)
Quota - the total calls you can make in a given period (e.g. 86,400 per day)
If you breach a rate or quota limit then calls to any part of the Consumer API will respond as follows:
HTTP/1.1 429 Too Many Requests
This response will persist until the breach has expired.
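A client-side sketch of coping with a breach; RateLimited stands in for whatever exception your HTTP layer raises on an HTTP/1.1 429 response:

```python
import time

class RateLimited(Exception):
    """Stands in for an HTTP/1.1 429 Too Many Requests response."""

def with_retry(call, tries=3, wait=1.0):
    """Retry the API call with exponential backoff while rate limited."""
    for attempt in range(tries):
        try:
            return call()
        except RateLimited:
            if attempt == tries - 1:
                raise
            time.sleep(wait * (2 ** attempt))
```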
Legal Notices
All Consumer API calls happen within the context of a data sharing policy. Amongst other things, policy defines the legal context under which data access is permitted. A legal notice will be present at the end of every entity instance record returned by any part of the Consumer API.
These legal notices are in the form of a JSON array of objects, each with three attributes:
| Attribute | Presence | Description |
|-----------|----------|-------------|
| type | always | The type of the legal notice |
| text | always | The description of the legal notice |
| link | always | A link for more information about the legal notice |
Examples of types of legal notices which may be present are:
| Type | Description |
|------|-------------|
| attribution | How the data should be attributed within your application |
| contact | Who to contact about the data and its use |
| license | The licenses under which this data sharing operates |
| note | Ad hoc notes about the data and/or its use |
| source | Information about the source or origination of the data |
| terms | The terms and conditions of use for the data |
Please respect the legal basis under which the data is being shared. Coordinator users have the power to rescind consumer authorization tokens and hence cut-off access to data for users who breach the legal context.
It is possible that the coordinator user who gave you your consumer authorization token will have more information about the legal basis of use of the data. They may require you to perform additional legal steps in order to be given a consumer authorization token.
Accessing “Staged” Records
It is important to understand that data connectors might be in a live or staged state. That is, their contribution might be approved for the live catalog, or might be held back in a staging space only.
When staged, any records they contributed will not be visible in any part of the Consumer API. This concept is outlined in more detail in the key concepts documentation.
When developing a new connector, it is often useful to be able to see data from it alongside other public data. There is a mechanism available in the Consumer API which allows data connectors to see how their records will look alongside other existing public records.
In order to achieve this, connector authors can send a list of connector IDs in a private HTTP header attribute to any part of the Consumer API. Only connector authors will be aware of their connector IDs. They can send up to 16 connector IDs in a single request header. The API will then include in its responses records contributed by those connectors, as if they were live.
The header they should use for this is x-bbk-connectors. This should be a string value which is a comma separated list of connector IDs without any spaces.
6.1 - Catalog API
The catalog API enabling search and discovery services for entity instances
All the entity instances being managed by BitBroker are recorded within the catalog. The Catalog API is used to search this index and to discover entity instances which are needed by applications. The Catalog API returns a list of entity instances, which are then accessible via the Entity API.
In this section, we will explore the capabilities of the Catalog API in depth.
A quick way to get going with building your own applications is to adapt the example apps which use this and the other Consumer APIs.
All Consumer API calls happen within the context of a data sharing policy. The policy defines a data segment which you are allowed to access. Any queries which you make using the Catalog API can only operate within the data segment you are permitted access to.
In our sample calls, we use the standard server name and port format outlined for local installations and demos. If you have an alternative API host base URL, then enter it in the box below to update all the sample calls on this page:
Your API Host Base URL
The Consumer API can be run in production or development mode; each requiring a different header. If you are running a local installation of BitBroker, then you can switch the sample calls on this page to development mode by activating the toggle below. If you are unsure, leave this toggle off.
All API calls in BitBroker require authorization. The sample calls below contain a placeholder string where you should insert your consumer API authorization token. This authorization token should have been provided to you by the coordinator user who administers the BitBroker instance. If you already have a token, enter it in the box below to update all the sample calls on this page:
Your Consumer API Authorization Token
When installing BitBroker locally, authorization is bypassed. In this scenario, you need to specify the policy you are using in a development header. Enter your desired policy in the box below to update all the sample calls on this page:
Your Current Policy
Querying the Catalog
You can query the catalog by issuing an HTTP/GET to the /catalog end-point.
The Catalog API will return a list of entity instances which match the submitted query string. The query string is submitted by adding a q URL parameter to the call. Submitting no query string results in an empty array (as opposed to all items).
Lists retrieved from this API will be returned in pages
Here we have one matching entity instance. The API returns Entity API links to matching instances.
In the examples here we show the query strings in readable JSON format, but when actual calls are made, the query string is required to be url encoded.
Querying Options
Here, we will go through each available query option one-by-one, giving an example of each.
All query strings are url encoded JSON objects. The query string syntax used is loosely based on the query language used within Mongo DB.
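Encoding a query object into the q parameter can be sketched as:

```python
import json
import urllib.parse

def catalog_query(query):
    """URL-encode a JSON query object into the q parameter."""
    return "q=" + urllib.parse.quote(json.dumps(query, separators=(",", ":")))

# e.g. an implicit-equality query:
qs = catalog_query({"name": "United Kingdom"})
```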
Implicit Equality
This query format is a shorthand for using the $eq operator.
The $near operator is used to find entity instances close to a GeoJSON geometry. The $min and $max parameters are specified in meters. You must specify either one of $min or $max or both together. The $geometry attribute must be present.
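An illustrative $near query of this shape; the entity.location property name is an assumption about the entity schema, and distances are in meters:

```python
# Find entity instances within 250 km of a point; $geometry is a
# GeoJSON object and $max is in meters (both illustrative values).
near_query = {
    "entity.location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": [-3.89, 52.35]},
            "$max": 250000,
        }
    }
}
```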
This call will find all entity instances which have a timeseries called noise with a unit of decibel.
6.2 - Entity API
The entity API enabling direct access to entity types and entity instances
The Entity API is used to retrieve information about entity instances which are present within the catalog. You can use this API to get a list of such entity instances or to get details of one particular entity instance. Calls to the Entity API are most often a result of a query to the Catalog API.
In this section, we will explore the capabilities of the Entity API in depth.
A quick way to get going with building your own applications is to adapt the example apps which use this and the other Consumer APIs.
All Consumer API calls happen within the context of a data sharing policy. The policy defines a data segment which you are allowed to access. Any queries which you make using the Entity API can only operate within the data segment you are permitted access to.
In our sample calls, we use the standard server name and port format outlined for local installations and demos. If you have an alternative API host base URL, then enter it in the box below to update all the sample calls on this page:
Your API Host Base URL
The Consumer API can be run in production or development mode; each requiring a different header. If you are running a local installation of BitBroker, then you can switch the sample calls on this page to development mode by activating the toggle below. If you are unsure, leave this toggle off.
All API calls in BitBroker require authorization. The sample calls below contain a placeholder string where you should insert your consumer API authorization token. This authorization token should have been provided to you by the coordinator user who administers the BitBroker instance. If you already have a token, enter it in the box below to update all the sample calls on this page:
Your Consumer API Authorization Token
When installing BitBroker locally, authorization is bypassed. In this scenario, you need to specify the policy you are using in a development header. Enter your desired policy in the box below to update all the sample calls on this page:
Your Current Policy
Entity Types Lists
You can query for a list of known entity types by issuing an HTTP/GET to the /entity end-point.
```json
[
    {
        "id": "country",
        "url": "http://bbk-consumer:8003/v1/entity/country",
        "name": "Countries",
        "description": "All the countries in the world as defined by the UN"
    },
    {
        "id": "heritage-site",
        "url": "http://bbk-consumer:8003/v1/entity/heritage-site",
        "name": "World Heritage Sites",
        "description": "A landmark or area with legal protection by an international convention administered by UNESCO"
    }
    // other entity types here
]
```
Each entity type, which you are permitted to see, will be returned within this array.
Lists retrieved from this API will be returned in pages
Entity Instance Lists
You can query for a list of entity instances of a given entity type by issuing an HTTP/GET to the /entity/:type end-point.
```json
[
    {
        "id": "d38e86a5591b1f6e562040b9189556ff2d190ea7",
        "url": "http://bbk-consumer:8003/v1/entity/country/d38e86a5591b1f6e562040b9189556ff2d190ea7",
        "type": "country",
        "name": "Andorra",
        "legal": []
    },
    {
        "id": "34c3ab32774042098ddc0ffa9878e4a1a60b33c0",
        "url": "http://bbk-consumer:8003/v1/entity/country/34c3ab32774042098ddc0ffa9878e4a1a60b33c0",
        "type": "country",
        "name": "United Arab Emirates",
        "legal": []
    },
    {
        "id": "8c52885171d12b5cda6c77e2b9e9d52ed6bfe867",
        "url": "http://bbk-consumer:8003/v1/entity/country/8c52885171d12b5cda6c77e2b9e9d52ed6bfe867",
        "type": "country",
        "name": "Afghanistan",
        "legal": []
    }
    // other entity instances here
]
```
Each entity instance, which you are permitted to see, will be returned within this array.
Lists retrieved from this API will be returned in pages
Entity Instance Details
You can get the details of a particular entity instance by issuing an HTTP/GET to the /entity/:type/:id end-point.
```json
{
    "id": "34c3ab32774042098ddc0ffa9878e4a1a60b33c0",
    "url": "http://bbk-consumer:8003/v1/entity/country/34c3ab32774042098ddc0ffa9878e4a1a60b33c0",
    "type": "country",
    "name": "United Arab Emirates",
    "entity": {
        "area": 83600,
        "code": "AE",
        "link": "https://en.wikipedia.org/wiki/United_Arab_Emirates",
        "capital": "Abu Dhabi",
        "currency": {
            "code": "AED",
            "name": "Arab Emirates Dirham"
        },
        "location": {
            "type": "Point",
            "coordinates": [53.847818, 23.424076]
        },
        "continent": "Asia",
        "population": 9682088,
        "calling_code": 971
    },
    "instance": {
        "independence": 1971
    },
    "timeseries": {
        "population": {
            "unit": "x1000",
            "value": "people",
            "period": "P1Y",
            "url": "http://bbk-consumer:8003/v1/entity/country/34c3ab32774042098ddc0ffa9878e4a1a60b33c0/timeseries/population"
        }
    },
    "legal": []
}
```
You can only get details of entity instances which you are permitted to see.
6.3 - Time Series API
The time series API enabling access to time-value pair datasets within entity instances
The Timeseries API is used to retrieve timeseries data points from entity types present within the catalog. Not all entity types have timeseries associated with them.
In this section, we will explore the capabilities of the Timeseries API in depth.
A quick way to get going with building your own applications is to adapt the example apps which use this and the other Consumer APIs.
All Consumer API calls happen within the context of a data sharing policy. The policy defines a data segment which you are allowed to access. Any queries which you make using the Timeseries API can only operate within the data segment you are permitted access to.
In the Catalog API documentation, you can find examples of how to search for entity instances which have timeseries associated with them.
In our sample calls, we use the standard server name and port format outlined for local installations and demos. If you have an alternative API host base URL, then enter it in the box below to update all the sample calls on this page:
Your API Host Base URL
The Consumer API can be run in production or development mode; each requiring a different header. If you are running a local installation of BitBroker, then you can switch the sample calls on this page to development mode by activating the toggle below. If you are unsure, leave this toggle off.
All API calls in BitBroker require authorization. The sample calls below contain a placeholder string where you should insert your consumer API authorization token. This authorization token should have been provided to you by the coordinator user who administers the BitBroker instance. If you already have a token, enter it in the box below to update all the sample calls on this page:
Your Consumer API Authorization Token
When installing BitBroker locally, authorization is bypassed. In this scenario, you need to specify the policy you are using in a development header. Enter your desired policy in the box below to update all the sample calls on this page:
Your Current Policy
Getting Timeseries Data
You can query for timeseries data by issuing an HTTP/GET to the /entity/:type/:id/timeseries/:tsid end-point.
Timeseries are always housed within a parent entity type and each has a unique ID on that entity type. Hence, you will need to know the entity type ID (type), the entity instance ID (id) and timeseries ID (tsid), in order to get access to such data points.
```json
[
    { "from": 1960, "value": 89608 },
    { "from": 1961, "value": 97727 },
    { "from": 1962, "value": 108774 },
    { "from": 1963, "value": 121574 }
    // other timeseries data points here
]
```
In this response, the attribute you will see is as follows:
The exact nature of the value attribute will depend upon the context of the timeseries you are using. It might be as simple as an integer, or as complex as an object.
The timeseries will always be returned sorted on the from attribute. The first item of a returned array should always represent the latest data point.
Paging Timeseries
It is possible that timeseries may run to a great many data points. There is a hard limit to how many data points will be returned in a single call to this API.
The hard limit to how many data points will be returned in a single call to this API is 500
It is possible to page access to timeseries data points by using a set of URL query parameters, as follows: