4.1 Intake Agents
Intake Agent Applications
Intake Agents are separate applications that integrate various data sources into the Data Context Hub. A data
source can be a database, a file system, or any other source of data. An Intake Agent acts as a standalone service
that bridges an external data source and the Data Context Hub, handling all data retrieval before handing the data
over to GBS for import operations.
To communicate with the Data Context Hub, Intake Agent applications must implement the predefined OpenAPI
specification intake-agent-api.yaml.
For commonly used data sources (e.g., SQL databases, CSV and XML files), ready-to-use Intake Agent
implementations are available.
For other data sources, a custom Intake Agent can be implemented in various programming languages and
frameworks.
Custom Intake Agents must not send a licenseModuleKey in the GET /api/v1/core/info request.
Currently, the fastest and easiest way to develop an Intake Agent is to create a new .NET application and reference
the DCH.IntakeAgentCore NuGet package, which implements the required contract, basic admin functionality, etc.
Security
Except for the data import process, all communication between an Intake Agent and GBS is secured with JWT Bearer
tokens issued by the instance's Keycloak server. A custom Intake Agent must accept and validate these tokens on
its endpoints; leaving endpoints open instead is possible but highly discouraged. It must also send valid tokens
when communicating with GBS.
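As a rough illustration of the receiving side, the following minimal sketch extracts the bearer token from an
Authorization header and runs cheap checks on the exp and iss claims. It uses only the Python standard library and
deliberately does NOT verify the token signature; a real Intake Agent must verify the signature against the
Keycloak server's signing keys with a proper JWT library before trusting any claim. All function names and the
issuer URL used in the usage note are hypothetical.

```python
import base64
import json
import time


def extract_bearer_token(authorization_header):
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def decode_claims_unverified(token):
    """Decode the JWT payload WITHOUT verifying the signature (pre-check only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def passes_basic_claim_checks(token, expected_issuer, now=None):
    """Cheaply reject expired or foreign tokens before full signature verification."""
    now = time.time() if now is None else now
    try:
        claims = decode_claims_unverified(token)
    except ValueError:  # also covers malformed base64 and malformed JSON
        return False
    if not isinstance(claims, dict):
        return False
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)):
        return False
    return claims.get("iss") == expected_issuer and exp > now
```

A request handler could call extract_bearer_token on the incoming header, reject the request if
passes_basic_claim_checks returns False for the expected issuer (for example a hypothetical
https://keycloak.example/realms/dch), and only then proceed to full signature verification.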
Intake Agent registration
Before an Intake Agent can be used, it must be registered in GBS.
An Intake Agent has to store the dataSourceRefId if it plans to create a Data Import Configuration in GBS.
Data import configuration
An Intake Agent exposes its available data configurations, source entities, and their properties via the
api/v1/core/data-source-configurations endpoint. These can be referenced in Target Entity Property Mappings in the
GBS configuration.
Data import processing
When data needs to be imported into GBS (requested by a user, a workflow trigger, etc.), GBS calls the
POST /api/v1/core/data-jobs endpoint on the Intake Agent.
This request specifies the subset of source entities and their properties required for the particular import job.
Sometimes the order in which source entity data is sent matters. To support this, GBS may specify that particular
source entities must not be sent until the api/v1/core/data-jobs/{jobId}/trigger endpoint is called.
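The hold-until-trigger behavior could be tracked per job with a small piece of state, as in the sketch below. How
the job request actually marks an entity as deferred is defined by intake-agent-api.yaml and is not shown here; the
class, its fields, and the entity names in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class JobSendPlan:
    """Tracks which source entities of one job may be sent right now."""
    job_id: str
    immediate: list   # entities that can be sent as soon as the job starts
    deferred: list    # entities held back until the trigger call arrives
    triggered: bool = False
    sent: set = field(default_factory=set)

    def on_trigger(self):
        """To be called by the handler of .../data-jobs/{jobId}/trigger."""
        self.triggered = True

    def sendable(self):
        """Entities that are allowed to go out now and have not been sent yet."""
        allowed = list(self.immediate)
        if self.triggered:
            allowed += self.deferred
        return [entity for entity in allowed if entity not in self.sent]

    def mark_sent(self, entity):
        """Record that all data of this entity has been handed to GBS."""
        self.sent.add(entity)
```

For a plan created as JobSendPlan("job-1", ["Customer"], ["Order"]), sendable() yields only "Customer" until
on_trigger() has been called, after which "Order" becomes available as well.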
After receiving a job request, the Intake Agent should start sending the requested source entity data to GBS. Each
job request contains a unique jobApiKey that is used to secure the data sending process: every request to
POST {gbsUrl}/api/v2/data-import-jobs/batch has to include the jobApiKey in the X-API-Key header.
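A minimal sketch of preparing such a batch request with the Python standard library is shown below. Only the
endpoint path and the X-API-Key header come from the description above; the JSON body shape, the function name, and
the example URL are assumptions, and the real payload schema should be taken from the API specification.

```python
import json
import urllib.request


def build_batch_request(gbs_url, job_api_key, batch):
    """Prepare (but do not send) one batch POST for a running import job."""
    url = gbs_url.rstrip("/") + "/api/v2/data-import-jobs/batch"
    body = json.dumps(batch).encode("utf-8")  # body shape is an assumption
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-Key": job_api_key,  # per-job key taken from the job request
        },
    )


# Sending would then be, for example:
#   with urllib.request.urlopen(build_batch_request(url, key, batch)) as resp:
#       resp.read()
```

Keeping request construction separate from sending makes it easy to check the URL and headers without a running
GBS instance.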
If there is no source data for a requested entity, the Intake Agent must send an empty batch with the isLatest
flag set. This signals to GBS that the job can be completed.
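The empty-batch rule can be folded into the batching logic itself, as in the following sketch. It assumes that
isLatest marks the final batch of an entity in general, not only the empty one; the sourceEntity and rows field
names and the function name are hypothetical.

```python
def make_batches(entity_name, rows, batch_size):
    """Split rows into batches; the final batch carries isLatest=True.

    With no rows at all, a single empty batch flagged isLatest is produced.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    rows = list(rows)
    chunks = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
    if not chunks:
        chunks = [[]]  # nothing to send: still emit one empty, final batch
    return [
        {
            "sourceEntity": entity_name,            # hypothetical field name
            "rows": chunk,                          # hypothetical field name
            "isLatest": index == len(chunks) - 1,   # True only on the last batch
        }
        for index, chunk in enumerate(chunks)
    ]
```

Because the empty case goes through the same code path, an entity without data cannot accidentally be skipped,
which would otherwise leave the GBS job waiting.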
GBS may periodically query the Intake Agent for the job status. Ensure the Intake Agent exposes a stable status
endpoint or response so GBS can reliably determine progress and completion.
Error handling
If the corresponding GBS job fails or is canceled, GBS calls the api/v1/core/data-jobs/{jobId}/cancel endpoint on
the Intake Agent.
If the Intake Agent fails to process the job, it should call the api/v2/data-import-jobs/failure-report endpoint
in GBS.
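Both directions of error handling can be sketched with a small job runner that checks a cancellation flag between
batches and reports its own failures through a callback. The class, the status names, and the callback signatures
are hypothetical; in a real agent the report_failure callback would call the failure-report endpoint in GBS with
whatever payload the API specification defines.

```python
import threading


class JobRunner:
    """Runs one import job; stops on cancel and reports its own failures."""

    def __init__(self, job_id, send_batch, report_failure):
        self.job_id = job_id
        self._send_batch = send_batch          # callable(batch) -> None
        self._report_failure = report_failure  # callable(job_id, message) -> None
        self._cancelled = threading.Event()
        self.status = "pending"                # hypothetical status names

    def cancel(self):
        """To be called by the handler of .../data-jobs/{jobId}/cancel."""
        self._cancelled.set()

    def run(self, batches):
        """Send batches in order, checking for cancellation before each one."""
        self.status = "running"
        try:
            for batch in batches:
                if self._cancelled.is_set():
                    self.status = "cancelled"
                    return
                self._send_batch(batch)
            self.status = "completed"
        except Exception as exc:
            self.status = "failed"
            # A real agent would notify GBS via the failure-report endpoint here.
            self._report_failure(self.job_id, str(exc))
```

The status attribute can also back the job status queries from GBS, so that cancellation, failure, and completion
are all reported from a single source of truth.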