The Catalog
The BitBroker catalog is the place where the existence of every entity instance in the domain space is recorded, so that instances can be enumerated and searched. It also provides the route by which consumers search, discover and use entity instance data.
The BitBroker catalog performs the following specific functions:
- Records the existence of all domain entity instances
- Marshals all domain data into a published entity type enumeration
- Presents a small set of global attributes in a consistent manner
- Ensures data representation consistency between entity specific properties
- Arbitrates the key spaces of the domain systems managed by data connectors
- Offers consistent listing, enumeration and search semantics over all entries via the Consumer API
- Acts as the bridge to the connected domain systems
- Removes the need for users to understand where the source data originated from
- Blends data from multiple data connectors for the same entity type
- Provides key arbitration between multiple data connectors
- Maximizes data interoperability
Consumer access to the catalog is via its own API, which forms part of the Consumer API.
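The listing, enumeration and search semantics described above can be modeled with a small in-memory sketch. Note that the record shapes and field names below (`type`, `name`, `entity`) are illustrative assumptions for this sketch, not the real Consumer API schema.

```python
# A minimal in-memory model of catalog listing and search semantics.
# The record layout is an illustrative assumption, not BitBroker's schema.

catalog = [
    {"id": "t1", "type": "thermometer", "name": "river-sensor-1",
     "entity": {"location": [51.5, -0.1], "unit": "celsius"}},
    {"id": "b7", "type": "bus", "name": "route-24-bus-7",
     "entity": {"route": "24", "seats": 62, "operator": "metro"}},
    {"id": "c3", "type": "car-park", "name": "station-car-park",
     "entity": {"capacity": 400}},
]

def list_types(catalog):
    """Enumerate the entity types present in the catalog."""
    return sorted({record["type"] for record in catalog})

def search(catalog, entity_type=None, **props):
    """Return records matching an entity type and/or entity properties."""
    hits = []
    for record in catalog:
        if entity_type and record["type"] != entity_type:
            continue
        if all(record["entity"].get(k) == v for k, v in props.items()):
            hits.append(record)
    return hits

print(list_types(catalog))  # ['bus', 'car-park', 'thermometer']
print([r["id"] for r in search(catalog, entity_type="bus", route="24")])  # ['b7']
```

In the real system these operations are HTTP calls to the Consumer API; the point here is only that all entity types share one consistent listing and search model.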
Data vs Metadata
The catalog is a place to store metadata about entity instances. Consumers use the catalog to search for and discover the entity instances their applications need. Once an entity instance has been discovered, its details can be obtained via calls to the relevant part of the Consumer API.
When a consumer obtains the detailed record for a particular entity instance, BitBroker asks its submitting data connector directly, via a webhook, for its live, on-demand information. It then merges this with the catalog record and returns the whole to the consumer. The consumer is unaware of the existence of data connectors, or even that some data is being drawn on-demand from another system.
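The merge step described above can be sketched as follows: the catalog record supplies the slow-changing metadata, the connector's webhook supplies the live values, and the consumer sees only the combined result. All names here (`fetch_live`, `entity_details`) are hypothetical, not BitBroker's actual internals.

```python
# Sketch of how a detailed entity record is assembled: catalog metadata
# is merged with live data drawn on-demand from the data connector.
# Function and field names are illustrative assumptions.

def fetch_live(entity_id):
    """Stand-in for the webhook call to the submitting data connector."""
    return {"occupancy": 287}  # live, on-demand value

def entity_details(catalog_record):
    """Merge the catalog record with live webhook data for the consumer."""
    live = fetch_live(catalog_record["id"])
    merged = dict(catalog_record)
    merged["entity"] = {**catalog_record["entity"], **live}
    return merged

record = {"id": "c3", "type": "car-park",
          "entity": {"name": "station", "capacity": 400}}
print(entity_details(record)["entity"])
# {'name': 'station', 'capacity': 400, 'occupancy': 287}
```

The consumer receives one uniform record and cannot tell which fields came from the catalog and which were fetched live.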
What exactly is the difference between metadata and data in this context?
What data should you store in the catalog, and what should you retain for webhook callbacks? There is no right or wrong answer to this question. Typically, you should store information with a slow rate of change in the catalog and retain the rest for callbacks.
Here are some examples of how this might operate in practice:
Entity Type | Catalog Data | Live Webhook Data
---|---|---
thermometer | `location`, `range`, `unit` | `value`
bus | `route`, `seats`, `operator` | `location`
car-park | `name`, `location`, `capacity` | `occupancy`
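From a data connector's point of view, a split like this amounts to partitioning each source record: slow-changing fields go into the catalog, and the remainder are served live by the webhook. The sketch below shows one way to express such a split; the field lists and function names are illustrative assumptions.

```python
# Sketch of a connector-side split between catalog data (slow-changing)
# and live webhook data. The field choices mirror the car-park example
# and are illustrative assumptions.

CATALOG_FIELDS = {"car-park": {"name", "location", "capacity"}}

def split_record(entity_type, record):
    """Partition a source record into catalog data and live webhook data."""
    slow = CATALOG_FIELDS.get(entity_type, set())
    catalog = {k: v for k, v in record.items() if k in slow}
    live = {k: v for k, v in record.items() if k not in slow}
    return catalog, live

source = {"name": "station", "location": [51.5, -0.1],
          "capacity": 400, "occupancy": 287}
catalog_part, live_part = split_record("car-park", source)
print(catalog_part)  # {'name': 'station', 'location': [51.5, -0.1], 'capacity': 400}
print(live_part)     # {'occupancy': 287}
```

Only `catalog_part` would be submitted to the catalog; `live_part` stays in the source system, to be returned on-demand via the webhook.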
To be clear, data connectors can store everything in the catalog. This has the advantage that everything becomes searchable and they no longer need to host a webhook.
On the flip side, they take on the burden of sending constant updates to the catalog in order to keep it current. In this arrangement, the catalog is always likely to be a temporally lagging copy of the source stores.
Ultimately, the decision on how you use the catalog is up to the coordinators and data connector authors.