1 - Contributing Records
How connectors contribute entity instance records to BitBroker
All the data managed by a BitBroker instance enters the system via the Contribution API. The process of contributing such data is documented in detail in this section.
In this section, we will consider the basic use case of contributing entity instance records. Later sections of this documentation will detail how you can contribute live, on-demand data and timeseries data.
Contributing data is tightly bound with the concepts of
entity types and their associated
data connectors. All contributions happen in the context of these important system elements. It is vital that you fully understand these and other
key concepts before using this API to contribute records.
A quick way to get going building your own data connectors is to adapt the
example connectors which have been built for a range of data sources.
All API calls in BitBroker require
authorization. The sample calls below contain a placeholder string where you should insert your
contributor API authorization token. This authorization token should have been provided to you by the coordinator user who created your data connector within BitBroker.
The sample calls in this section will not work as-is. Contributor API calls require the use of session IDs, which are generated on-demand. Hence, the sample calls here are merely illustrative.
Contributing Records to the Catalog
We will assume for the purposes of this section that an entity type and its associated data connector have been created and are present within the system. Further, that the connector ID and authorization token, which were obtained when the data connector was created, have been recorded and are available.
Data can now be contributed into the catalog by this data connector, but within the context of its parent entity type only. Hence, we say that a single connector contributes “entity instance records”. If one organization wants to contribute data to multiple entity types, then they must do this via multiple data connectors.
The process of contributing entity instance records into the catalog breaks down into three steps:
- Create a data contribution session
- Upsert and/or delete records into this session
- Close the session
These steps are achieved via an HTTP-based API, which we outline in detail below. Each data connector will have a private end-point on this API which is waiting for its contributions.
It is important to understand the distinction between
data and
metadata in the context of the BitBroker instance. It is an expectation that only metadata is being contributed into the catalog and that live data is kept back for on-demand requests. This distinction is
outlined in more detail in the key concepts documentation.
It is important to understand that data connectors might be in a
live or
staged state. That is, their contribution might be approved for the live catalog, or might be held back in a staging space only. This concept is
outlined in more detail in the key concepts documentation. There is a
mechanism available in the
Consumer API which allows data connectors to see how their records will look alongside other existing public records.
If your connector is marked as “non-live”, your data contribution will
not become visible to
consumers. If you want to make your connector “live”, you must ask the coordinator user who created the connector for you.
Sessions
Sessions are used by the Contribution API to manage inbound data coming from the community of data connectors. Sessions allow the connectors to contribute entity instance records in well-defined ways, which are respectful of the state management of the source data store.
BitBroker supports three types of sessions: stream, accrue and replace. Each one provides for different update and delete contexts.
You can only have one session open at a time. If you open a new session without closing a previous one, the previous one is implicitly closed with a rollback request.
The three types of session provide for different application logic in the following areas:
- Whether data is available to consumers whilst the session is still open, or only after it is closed.
- Whether the data provided within a session adds to or replaces earlier data from your connector.
Here is the detail of how each session type functions:
| Area | Stream | Accrue | Replace |
| --- | --- | --- | --- |
| Data visibility | as soon as posted | on session close | on session close |
| Data from previous session | in addition to | in addition to | replaces entirely |
Let’s explore each of these in more detail:
Stream Sessions
Stream sessions are likely to be the default mode of operation for most data connectors. Inbound entity instance records arrive in the catalog as soon as they are posted and whilst the session remains open. They are immediately available to consumers to view via the Consumer API.
New records are in addition to existing records in the catalog and removal must be explicitly requested. Closing a stream session is a moot operation, since the session type is essentially an “open pipe” into the catalog. In fact, stream sessions can be opened and left open indefinitely.
| Type | Session | Action |
| --- | --- | --- |
| stream | open | session data is already visible, in addition to previous data |
| stream | close true | no operation - session data is already visible, in addition to previous data |
| stream | close false | no operation - session data is already visible, in addition to previous data |
Accrue Sessions
Accrue sessions are useful when entity instance records should only become visible as complete sets. In this scenario, the entity instance records contributed within a session only become visible via the Consumer API when the session is closed - and hence only as a complete set.
New records are in addition to existing records in the catalog and removal must be explicitly requested. When you close an accrue session, you must specify a commit state of true or false. Closing the session with true makes the contributed records visible in the Consumer API, but closing it with false will discard all the records contributed within that session.
| Type | Session | Action |
| --- | --- | --- |
| accrue | open | session data not visible, but previous data is |
| accrue | close true | session data now becomes visible, in addition to previous data |
| accrue | close false | session data is discarded and previous data persists |
Replace Sessions
Replace sessions are useful when contributed entity instance records should completely replace the set provided in previous sessions. In this scenario, the entity instance records contributed within a session become visible via the Consumer API as a complete set when the session is closed - but all the records contributed in earlier sessions are discarded. Replace sessions are useful when you cannot maintain state about earlier contributions, and hence each contribution is a complete statement of your record set.
New records replace existing records in the catalog and removal of these “old” records is implicit. When you close a replace session, you must specify a commit state of true or false. Closing the session with true makes the contributed records visible in the Consumer API and deletes records from previous sessions. However, closing it with false will discard all the records contributed within that session and previously contributed records will remain untouched.
| Type | Session | Action |
| --- | --- | --- |
| replace | open | session data not visible, but previous data is |
| replace | close true | session data now becomes visible and replaces all previous data |
| replace | close false | session data is discarded and previous data persists |
As you can see, picking the right session type is vitally important to ensure you make the best use of the catalog. In general, you should aim to use a stream type session where you can, as this is the simplest.
If you don’t want clients to be able to see intermediate updates in the catalog, then accrue and replace may be better options. Where you don’t want to (or can’t) store any state about what you previously sent to the catalog, then replace is probably the best option.
Using Sessions
There are only three HTTP calls which your data connectors need to make in order to contribute records into the catalog.
Opening a Session
New sessions can be created by issuing an HTTP/GET to the /connector/:cid/session/open/:mode end-point.
In order to open a session, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker.
You will also need to select one of the three session modes: stream, accrue or replace. These should be specified in lowercase and without any spaces.
curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/open/stream \
--include \
--header "x-bbk-auth-token: your-token-goes-here"
This will result in a response as follows:
HTTP/1.1 200 OK
The body of this response will contain a session ID (sid
), which should be recorded as it will be needed for subsequent API calls. For example:
4527eff4-d9cf-41c0-9ecc-8e06b57fcf54
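If you are scripting your connector with curl, one convenient pattern is to capture the returned session ID directly into a shell variable for use in the subsequent calls. This is just a sketch - the connector ID and token are placeholders, and the quote stripping simply covers the case where the ID comes back as a JSON string:
SID=$(curl --silent \
     http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/open/stream \
     --header "x-bbk-auth-token: your-token-goes-here" \
     | tr -d '"')
echo $SID   # for example: 4527eff4-d9cf-41c0-9ecc-8e06b57fcf54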
Posting Records in a Session
Once you have an open session, you can post two types of actions to it in order to manipulate your catalog entries:
- upsert, to update or insert a record into the catalog
- delete, to remove an existing record from the catalog
You can only make changes to your own records within the catalog. Your data connector will have no effect on records which came from other connectors - even if you share an entity type with them.
Entity instance records can be upserted or deleted by issuing an HTTP/POST to the /connector/:cid/session/:sid/:action end-point.
In order to post record actions, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker. You must also know the session ID (sid), which was returned in the previous step where a session was opened.
Finally, you will also need to select one of the two valid actions: upsert or delete. These should be specified in lowercase and without any spaces.
curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/4527eff4-d9cf-41c0-9ecc-8e06b57fcf54/upsert \
--request POST \
--include \
--header "Content-Type: application/json" \
--header "x-bbk-auth-token: your-token-goes-here" \
--data-binary @- << EOF
[ ]
EOF
You can specify upsert and/or delete record actions, but these cannot be mixed into a single API call. However, you can upsert and delete as many times as you wish within an open session.
Your upsert and delete actions will be executed in the strict order in which they were sent. You can safely upsert and then delete an entity instance within a single session boundary, if you so wish.
Care should be taken to ensure that the session ID (sid) used to post updates is the ID which was returned in the last call to open a session. If you send an old, invalid or mismatched session ID, it will result in an HTTP/1.1 403 Forbidden response. This will have no impact on any existing open session.
In the example above, we upsert an empty array - this is obviously not useful. Let’s now look in detail at how records are inserted, updated and deleted using this API call.
Upserting records
When you post an upsert request, you should include an array of entity instances in JSON format within your post body. Each record can contain the following attributes:
| Attribute | Necessity | Validation Rules |
| --- | --- | --- |
| id | required | String between 1 and 64 characters long |
| name | required | String between 1 and 64 characters long |
| entity | required | An object conforming to the entity schema for this entity type |
| instance | optional | An object containing other, ancillary information |
Only expected attributes will be stored within the catalog. Any other attributes which are sent will simply be ignored.
It is important to understand the difference between the three classes of attributes which can be present within each entity instance record:
Global Attributes
These attributes are required to be present for every entity instance in the system, regardless of its entity type. This set consists of only these attributes:
| Attribute | Description |
| --- | --- |
| id | Your domain key for this entity instance |
| name | A human-readable name describing this entity instance |
Entity Attributes
These attributes are required to be present for every entity instance in the system of a given entity type. This set of attributes will have been communicated to you by the coordinator user who created your connector within BitBroker. It will be presented in the form of a JSON schema.
Instance Attributes
These attributes only exist for a given entity instance in the system. This is a free format object which can be used to store additional or ancillary information.
This simple hierarchy of three classes (global, entity and instance) is designed to give consumers maximum assurance about which data can be expected to be available to them:
- They can always expect to find the global data present
- They have firm expectations about data availability within an entity type
- They understand that instance data is ad-hoc and cannot be relied upon
If any record within the posted record set contains a validation error, then the entire set will be rejected. The call will return an HTTP/1.1 400 Bad Request response and the body of the response will contain details of every record validation error which was encountered, in the standard validation error format.
The catalog will decide whether to insert or update your record based upon the domain key which you supplied in the id
field of each posted record. If a record already exists with this key, it will be updated - otherwise it will be inserted.
Your records are scoped to be within your data connector space only. You cannot affect records delivered by another data connector, even within the same entity type and even if you have a clashing key space.
Here is the post body for an example upsert request for a set of three records:
[
{
"id": "GB",
"name": "United Kingdom",
"entity": {
"area": 242900,
"calling_code": 44,
"capital": "London",
"code": "GB",
"continent": "Europe",
"currency": {
"code": "GBP",
"name": "Sterling"
},
"population": 66040229
}
},
{
"id": "IN",
"name": "India",
"entity": {
"area": 3287263,
"calling_code": 91,
"capital": "New Delhi",
"code": "IN",
"continent": "Asia",
"currency": {
"code": "INR",
"name": "Indian Rupee"
},
"population": 1344860000
},
"instance": {
"independence": 1947
}
},
{
"id": "BR",
"name": "Brazil",
"entity": {
"area": 8547403,
"calling_code": 55,
"capital": "Brasilia",
"code": "BR",
"continent": "South America",
"currency": {
"code": "BRL",
"name": "Brazilian Real"
},
"population": 209659000
},
"instance": {}
}
]
Whenever records are upserted into the catalog, it will return a report to the caller with information about how each posted record was processed. For example, for the three records above, you might get a report such as:
{
"GB": "5ebb30afaa6ce33843b00bbff63f63b90e91028c",
"IN": "917d0311c687e5ffb28c91a9ea57cd3a306890d0",
"BR": "d5fa7d9d8e4625399da7771fc0e3e87886f2a5ac"
}
In the report, you will see a row for every record that was posted, alongside the BitBroker key which is
being used for this entity instance. This is the key which consumers will use in order to retrieve this record via the Consumer API.
There is no expectation that you need to store this consumer key, if you do not wish to do so. You should continue to simply use your own domain key for your catalog interactions.
Deleting records
When deleting records from the catalog, you need to simply post an array of your domain keys for the records to be removed. These should be the same domain keys you specified when you upserted the records. For example, to remove two of the records upserted in the previous step, the post body would need to be:
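[
    "GB",
    "BR"
]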
Whenever records are deleted from the catalog, it will return a report to the caller with information about how each posted ID was processed. For example, for the two IDs above, you might get a report such as:
{
"GB": "5ebb30afaa6ce33843b00bbff63f63b90e91028c",
"BR": "d5fa7d9d8e4625399da7771fc0e3e87886f2a5ac"
}
In the report, you will see a row for every ID that was posted, alongside the BitBroker key which was
being used for this (now removed) entity instance. This is the key which consumers will have used in order to retrieve this record via the Consumer API.
If you post a domain key to delete a record which does not exist in the catalog, this will simply be ignored.
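For completeness, a full delete call might look like the sketch below, reusing the connector and session IDs from the earlier examples:
curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/4527eff4-d9cf-41c0-9ecc-8e06b57fcf54/delete \
     --request POST \
     --include \
     --header "Content-Type: application/json" \
     --header "x-bbk-auth-token: your-token-goes-here" \
     --data-binary @- << EOF
[ "GB", "BR" ]
EOF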
Closing a Session
After entity instance records have been posted, you can close a session by issuing an HTTP/GET to the /connector/:cid/session/:sid/close/:commit end-point.
In order to close a session, you must know the connector ID (cid). This should have been communicated to you by the coordinator user who created your data connector within BitBroker. You must also know the session ID (sid), which was returned in the previous step where a session was opened.
Finally, you will also need to select one of the two valid commit states: true or false. These should be specified in lowercase and without any spaces.
curl http://bbk-contributor:8002/v1/connector/9afcf3235500836c6fcd9e82110dbc05ffbb734b/session/4527eff4-d9cf-41c0-9ecc-8e06b57fcf54/close/true \
--include \
--header "x-bbk-auth-token: your-token-goes-here"
This will result in a response as follows:
HTTP/1.1 200 OK
The exact mechanics of closing a session depend on the type of session that was specified when it was opened. This was covered in detail in the earlier section on session types.
2 - Hosting a Webhook
How to use webhooks to incorporate live and on-demand data
It is an expectation that the BitBroker catalog contains information which is useful to enable search and discovery of entity instances. Hence, it contains key metadata - but it does not normally contain actual entity data. This is pulled on-demand via a webhook hosted by the data connector who contributed the entity record.
The distinction between data and metadata is covered in more detail in the key concepts documentation. Depending on how data and metadata is balanced in a BitBroker instance, there may or may not be a requirement to host a webhook.
In this section, we will outline how to implement a webhook within a data connector.
A quick way to get going integrating a webhook into your own data connector is to adapt the
example connectors which have been built for a range of data sources.
It is permitted and valid for one webhook to service the needs of multiple data connectors. Sufficient inbound information will be provided to allow the webhook to be clear about which entity instance data is being requested.
Registering your Webhook
The first step is to register your webhook with BitBroker. This is done when the connector is created or can be done later by updating the connector. These actions are part of the Coordinator API and hence can only be performed by a coordinator user on your behalf.
Your webhook should be an HTTP server which is capable of receiving calls from the BitBroker instance. You can host this server in any manner you like, however the coordinator of your BitBroker may have their own hosting and security requirements of it.
You need to maintain your webhook so that it is always available to its connected BitBroker instance. If your webhook is down or inaccessible when BitBroker needs it, this will result in a poor experience for consumers using the Consumer API. In this scenario, they will only see partial records. Information about misbehaving data connectors will be available to coordinator users.
Required End-points
You are required to implement two end-points as part of your webhook deployment.
Whilst BitBroker advertises its own key space to its
consumers, there is no need for data connectors to take heed of these. They can continue to concern themselves with only their own domain key space. When BitBroker makes requests of your webhook, it will only ever use its own key space.
Whenever your webhook is called, it will be in the context of an
on-demand request - meaning that the call is in the direct line of response to a waiting user of the
Consumer API. Hence, you should endeavor to respond to webhook calls in a timely manner. Information about poorly performing data connectors will be available to coordinator users.
Entity End-point
The entity end-point is used by BitBroker to get a full data record for an entity instance which you previously submitted into the catalog.
The entity end-point has the following signature:
HTTP/GET /entity/:type/:id
Where:
| Attribute | Presence | Description |
| --- | --- | --- |
| type | always | The entity type ID, for this entity instance |
| id | always | Your own domain key, which you previously submitted into the catalog |
The entity type is presented here to allow for scenarios where one webhook is servicing the needs of multiple data connectors.
In response to this call, you should return a JSON object consisting of an entity and instance attribute only - all other attributes will be ignored. The object you return will be merged with the catalog record which you provided earlier. Hence, there is no need to resupply the catalog information you have already submitted in previous steps.
For example, consider this (previously submitted) catalog record:
{
"id": "GB",
"name": "United Kingdom",
"type": "country",
"entity": {
"area": 242900,
"calling_code": 44,
"capital": "London",
"code": "GB",
"continent": "Europe",
"currency": {
"code": "GBP",
"name": "Sterling"
},
"population": 66040229
},
"instance": {
"independence": 1066
}
}
If there is a call for the detail of this record made on the Consumer API, the system will callback on the entity end-point as follows:
HTTP/GET /entity/country/GB
Then the webhook should respond with any extra / live / on-demand entity and instance data:
{
"entity": {
"inflation": 4.3
},
"instance": {
"temperature": 18.8
}
}
The system will then merge this live information with the catalog record to send a combined record to the consumer.
{
"id": "GB",
"name": "United Kingdom",
"type": "country",
"entity": {
"area": 242900,
"calling_code": 44,
"capital": "London",
"code": "GB",
"continent": "Europe",
"currency": {
"code": "GBP",
"name": "Sterling"
},
"population": 66040229,
"inflation": 4.3 // this has been merged in
},
"instance": {
"independence": 1066,
"temperature": 18.8 // this has been merged in
}
}
Timeseries End-point
The timeseries end-point is used by BitBroker to get timeseries information associated with an entity instance previously submitted into the catalog.
Not all entity types will have timeseries associated with them. When they do, this callback is vital, since no timeseries data points are held within the catalog itself. Only the existence of timeseries, and key metadata about them, is stored.
The timeseries end-point has the following signature:
HTTP/GET /timeseries/:type/:id/:tsid?start=:start&end=:end&limit=:limit
Where:
| Attribute | Presence | Description |
| --- | --- | --- |
| type | always | The entity type ID, for this entity instance |
| id | always | Your own domain key, which you previously submitted into the catalog |
| tsid | always | The ID of the timeseries associated with this entity instance |
| start | sometimes | The earliest timeseries data point being requested. When present, an ISO 8601 formatted date |
| end | sometimes | The latest timeseries data point being requested. When present, an ISO 8601 formatted date |
| limit | always | The maximum number of timeseries points to return. An integer greater than zero |
Further information about the possible URL parameters supplied with this callback is as follows:
| Attribute | Information |
| --- | --- |
| start | Should be treated as inclusive of the range being requested. When not supplied, assume a start from the latest timeseries point |
| end | Should be treated as exclusive of the range being requested. When present, this will always be after the start. Never present without start also being present. When not supplied, defer to the limit count |
| limit | Takes precedence over the start and end range. The end may not be reached, if limit is breached first |
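For example, a callback requesting the latest ten points of a hypothetical population timeseries on the GB record from earlier might look like this (the timeseries ID here is purely illustrative):
HTTP/GET /timeseries/country/GB/population?limit=10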
The webhook should then respond with timeseries data points as follows:
[
{
"from": 1910,
"to": 1911,
"value": 5231
},
{
"from": 1911,
"to": 1912,
"value": 6253
},
// other timeseries points here
]
Where:
| Attribute | Necessity | Description |
| --- | --- | --- |
| from | required | An ISO 8601 formatted date |
| to | optional | When present, an ISO 8601 formatted date |
| value | required | A valid JSON data type or object |
You should return your timeseries points with the latest first. The first item of a returned array should always represent the latest data point.
Specifying both from and to is rare - in most cases, only a from will be present. You can place any data type which makes sense for your timeseries in the value attribute, but this should be consistent across all the timeseries points you return.
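To make the shape of a webhook concrete, here is a minimal sketch of the two required end-points using Python and Flask. It is illustrative only - the route shapes follow the signatures above, but the framework choice, the placeholder data and the lookups into your own domain store are assumptions rather than part of BitBroker itself:
# a minimal, illustrative webhook sketch - framework choice and data lookups are assumptions
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/entity/<entity_type>/<entity_id>")
def entity(entity_type, entity_id):
    # look up live / on-demand data for this record in your own domain store, then
    # return only the "entity" and "instance" attributes to be merged by BitBroker
    return jsonify({
        "entity": { "inflation": 4.3 },       # placeholder values
        "instance": { "temperature": 18.8 }   # placeholder values
    })

@app.route("/timeseries/<entity_type>/<entity_id>/<tsid>")
def timeseries(entity_type, entity_id, tsid):
    start = request.args.get("start")             # inclusive, may be absent
    end = request.args.get("end")                 # exclusive, may be absent
    limit = request.args.get("limit", type=int)   # always present, integer greater than zero
    # fetch up to 'limit' points from your own store, latest first
    points = [{ "from": 1911, "value": 6253 }, { "from": 1910, "value": 5231 }]
    return jsonify(points[:limit])

if __name__ == "__main__":
    app.run(port=8080)   # the host and port are up to you and your coordinator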