5.2 Workers
Workers are isolated applications that can be used by the Data Context Hub. They are meant to be deployed and run separately from the stack, with as much independence as possible, and are only loosely integrated with GBS and EBS.
It is highly recommended that workers are automatically scaled (e.g. by using horizontal pod autoscaling in Kubernetes) since the load can be quite high.
Custom implementations of the following worker types are currently supported:
ExternalGraphSecurity
General Information
Workers should be isolated applications, which allows them to be developed in any language and framework. GBS currently assumes that every worker added to its workers list provides some basic REST functionality.
Information Endpoint: GET /
This request has to return basic information about a worker:
- class: is used to distinguish different classes of workers meant to perform different types of operations. Supported classes: result
- type: is the type of worker
- uuid: is used together with the URL to validate that a worker is the expected worker when adding it to GBS
- version: is used to display the current version of a worker
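The information endpoint can be sketched with Python's standard library alone. This is a minimal illustration, not a reference implementation: it assumes the four fields are returned as top-level keys of a JSON object (the text above does not spell out the exact shape), and every value in WORKER_INFO, including the type string, the UUID and the port, is a placeholder.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder values for illustration only; a real worker would use its own.
WORKER_INFO = {
    "class": "result",  # the only supported class named in this section
    "type": "ExternalGraphSecurity",  # assumed spelling of the type value
    "uuid": "00000000-0000-0000-0000-000000000000",
    "version": "0.1.0",
}


class InfoHandler(BaseHTTPRequestHandler):
    """Answers GET / with the worker's basic information as JSON."""

    def do_GET(self):
        if self.path != "/":
            self.send_error(404)
            return
        body = json.dumps(WORKER_INFO).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8080) -> HTTPServer:
    """Create (but do not start) an HTTP server for the information endpoint."""
    return HTTPServer(("127.0.0.1", port), InfoHandler)
```

Calling make_server().serve_forever() would start serving the endpoint; the execution endpoint of a concrete worker type would be added to the same handler as a do_POST method.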
This request is important as it is also used as a validation when adding a worker to GBS. The following steps are used for validation:
- GBS performs a GET request to the root of the provided worker URL (i.e. if the provided URL is http://localhost:90/api/filter, http://localhost:90/ is used). If either an error or nothing is returned, it is assumed that the worker is not available.
- When the GET is successful, it checks the response:
  - class, type and version are available in the response
  - The user-provided UUID must match the uuid value in the response
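The validation steps above can be approximated as follows. This is a sketch of the described checks, not GBS code: the function names are hypothetical, and "available in the response" is interpreted here as "present as a key in the decoded JSON object", which is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Fields that must be present in the GET / response according to the steps above.
REQUIRED_FIELDS = ("class", "type", "version")


def worker_root_url(provided_url: str) -> str:
    """Reduce a worker URL to its root (scheme and host/port, path "/")."""
    parts = urlsplit(provided_url)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


def is_expected_worker(info, user_uuid: str) -> bool:
    """Apply the response checks to an already decoded GET / response."""
    # An empty or non-object response is treated as "worker not available".
    if not isinstance(info, dict) or not info:
        return False
    # class, type and version must all be present.
    if any(field not in info for field in REQUIRED_FIELDS):
        return False
    # The UUID entered by the user must match the one reported by the worker.
    return info.get("uuid") == user_uuid
```

For the URL used in the example above, worker_root_url("http://localhost:90/api/filter") returns "http://localhost:90/".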
Execution Endpoints
These worker specific endpoints are used to trigger a worker's logic and each defines its own input and output schema. Independent of which request and response is required by a specific worker type, calls to such endpoints will always include a Keycloak Bearer token that can be validated by the worker.
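As a first step towards validating the token, a worker has to read it from the incoming request. The sketch below assumes the token arrives in the standard HTTP header form "Authorization: Bearer <token>", which this section does not state explicitly; the function name is hypothetical, and the actual verification against Keycloak (signature, expiry, audience) is deliberately left out.

```python
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Bearer token from the request headers, or None if absent."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() != "authorization":
            continue
        scheme, _, token = value.strip().partition(" ")
        token = token.strip()
        if scheme.lower() == "bearer" and token:
            return token
        return None
    return None
```

A worker would typically reject the call with a non-2xx status code when this function returns None or when the subsequent token verification fails.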
ExternalGraphSecurity Worker
This type of worker is called if any nodes or relationships remain after applying Graph Access Rules to graph-related EBS requests. All remaining objects are sent to all ExternalGraphSecurity workers registered in the system.
This allows filtering based on different external systems.
A worker of this type has to provide one POST endpoint that can handle the following requests. Note that the parameters.type field is used to distinguish between nodes and relationships.
Requests (first example: nodes, second example: relationships)
{
"parameters": {
"type": "nodes"
},
"data": [{
"elementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834844",
"labels": ["label"],
"properties": {
"key1": "value1",
"key2": "value2"
}
}, {
"elementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834877",
"labels": ["label"],
"properties": {
"key": "value"
}
}
]
}
{
"parameters": {
"type": "relationships"
},
"data": [{
"startNodeLabel": "label1",
"endNodeLabel": "label2",
"elementId": "5:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:90494",
"type": "CONTAINS",
"startNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834877",
"endNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834844",
"properties": {
"key1": "value1",
"key2": "value2"
}
}, {
"startNodeLabel": "label1",
"endNodeLabel": "label1",
"elementId": "5:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:90484",
"type": "REQUIRES",
"startNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834877",
"endNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834843",
"properties": {
"key": "value"
}
}
]
}
The expected responses can be found below. A worker does not have to distinguish whether a response contains nodes or relationships, since EBS (the calling system) keeps track of that. The error response is only expected when the returned status code is not in the range 200-299.
Responses (in order: nodes, relationships, error)
{
"result": {
"nodes": [{
"elementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834844",
"labels": ["label"],
"properties": {
"key1": "value1",
"key2": "value2"
}
}, {
"elementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834877",
"labels": ["label"],
"properties": {
"key": "value"
}
}
]
}
}
{
"result": {
"relationships": [{
"elementId": "5:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:90494",
"type": "CONTAINS",
"startNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834877",
"endNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834844",
"properties": {
"key1": "value1",
"key2": "value2"
}
}, {
"elementId": "5:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:90484",
"type": "REQUIRES",
"startNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834877",
"endNodeElementId": "4:1d78d5ba-072c-4cfa-96c7-dda7da1c5620:1834843",
"properties": {
"key": "value"
}
}
]
}
}
{
"description": "An error occurred: Signature has expired.",
"error": "invalid_token"
}
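The core of an ExternalGraphSecurity worker can be sketched as a function that maps one request body to one response body. The function names and the is_allowed callback are hypothetical; the callback stands in for whatever lookup against an external system a real worker performs. The sketch mirrors the examples above by returning only the fields shown in the response examples, since this section does not say whether additional fields such as startNodeLabel would be accepted.

```python
from typing import Any, Callable, Dict

Element = Dict[str, Any]

# Fields that appear in the response examples for each element kind.
RESPONSE_FIELDS = {
    "nodes": ("elementId", "labels", "properties"),
    "relationships": (
        "elementId",
        "type",
        "startNodeElementId",
        "endNodeElementId",
        "properties",
    ),
}


def filter_elements(
    request: Dict[str, Any], is_allowed: Callable[[Element], bool]
) -> Dict[str, Any]:
    """Keep only the elements the caller may see and wrap them in "result"."""
    kind = request["parameters"]["type"]
    if kind not in RESPONSE_FIELDS:
        raise ValueError(f"unsupported parameters.type: {kind!r}")
    kept = [
        {key: element[key] for key in RESPONSE_FIELDS[kind] if key in element}
        for element in request.get("data", [])
        if is_allowed(element)
    ]
    return {"result": {kind: kept}}


def error_response(error: str, description: str) -> Dict[str, str]:
    """Build an error body in the shape of the error example above."""
    return {"description": description, "error": error}
```

The error body is meant to be sent together with a status code outside the 200-299 range, as described above.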
RuleTransformation Worker
This worker type is used to extract data from external systems and transform it into Graph Access Rules. RuleTransformation workers have to be registered with GBS so that it knows which workers are available in Load Plans.
A worker of this type has to provide a POST endpoint that can handle the following request. Note that the data sent to a worker depends on the configuration of the corresponding rule transformation step in a Load Plan.
Request sent to worker
{
"users": [
"admin"
],
"clients": [
"service-account-user"
],
"groups": [
"Test Group"
],
"entities": [
{
"id": 414,
"title": "Benchmark",
"properties": [
{
"id": 5303,
"title": "Benchmark_ID"
},
{
"id": 5296,
"title": "BK_IDdd"
},
{
"id": 5301,
"title": "Description"
},
{
"id": 5299,
"title": "Parents123"
},
{
"id": 5300,
"title": "Reference_Date"
},
{
"id": 5302,
"title": "Title"
}
]
}
],
"relationships": [
{
"id": 415,
"entityNameFrom": "Benchmark",
"entityNameTo": "Benchmark",
"name": "PARENTS",
"alias": "PARENTS"
},
{
"id": 1443,
"entityNameFrom": "Benchmark",
"entityNameTo": "A_Benchmark",
"name": "testab",
"alias": "TESTAB"
}
]
}
Workers are supposed to call the GBS endpoint POST /api/graph-access-rule/accessRules to add rules to the system.
The minimum data required to store a rule is one of user, group or service account and one of entity or relationship.
There are different ruleCreationStrategy options available:
- Ignore: if a rule with the same name already exists, it will be ignored (default behavior)
- DeleteOnConflict: if a rule with the same name already exists, it will first be deleted and then recreated based on the request
- DeleteAllForUser: all rules that were created by the access token user will be deleted before processing the request
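To make the difference between the three options concrete, the following toy model applies them to an in-memory dictionary. It is not GBS code and only illustrates the semantics as described above. The store layout, the createdBy key and the function name are hypothetical, and the sketch assumes that DeleteAllForUser treats any remaining name conflicts like Ignore, which this section does not specify.

```python
from typing import Any, Dict, List

STRATEGIES = ("Ignore", "DeleteOnConflict", "DeleteAllForUser")


def apply_rules(
    store: Dict[str, Dict[str, Any]],
    rules: List[Dict[str, Any]],
    strategy: str = "Ignore",
    user: str = "",
) -> Dict[str, Dict[str, Any]]:
    """Apply new rules to store (rule name -> {"rule", "createdBy"}) in place."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown ruleCreationStrategy: {strategy!r}")

    if strategy == "DeleteAllForUser":
        # Remove every rule created by the calling user before processing.
        for name in [n for n, entry in store.items() if entry["createdBy"] == user]:
            del store[name]

    for rule in rules:
        name = rule["name"]
        if name in store and strategy != "DeleteOnConflict":
            # Ignore: the existing rule stays, the incoming one is skipped.
            continue
        # New name, or DeleteOnConflict: the incoming rule replaces any old one.
        store[name] = {"rule": rule, "createdBy": user}
    return store
```

With Ignore, repeated runs of the same Load Plan leave existing rules untouched, while DeleteOnConflict and DeleteAllForUser let a worker refresh rules it created earlier.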
Request sent to GBS
{
"ruleCreationStrategy": "Ignore",
"rules": [
{
"accessLevel": "view",
"active": false,
"clients": [
"service-account-user"
],
"entities": [
{
"id": 414,
"properties": [
5303,
5296,
5301,
5299,
5300,
5302
]
}
],
"groups": [
"Test Group"
],
"name": "rule-transform-mock-rule1",
"relationships": [
415
],
"propertyValueConditions": [
{ "PropertyId": 5303,"Value": "2500" },
{ "PropertyId": 5296,"Value": "Test" }
],
"functions": [
12
],
"users": [
"admin"
]
}
]
}
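A RuleTransformation worker essentially turns the request it receives into a request like the one above. The sketch below shows one possible mapping under several assumptions: the helper names, the rule name and the base URL are hypothetical, "service account" is taken to correspond to the clients field as in the examples, and the access token is assumed to be passed to GBS as an "Authorization: Bearer" header. Optional fields such as propertyValueConditions and functions are omitted.

```python
import json
import urllib.request
from typing import Any, Dict, List

ACCESS_RULES_PATH = "/api/graph-access-rule/accessRules"


def build_rule(request: Dict[str, Any], name: str, access_level: str = "view") -> Dict[str, Any]:
    """Map the request sent to the worker onto a single rule for GBS."""
    rule = {
        "name": name,
        "accessLevel": access_level,
        "active": False,
        "users": list(request.get("users", [])),
        "groups": list(request.get("groups", [])),
        "clients": list(request.get("clients", [])),
        # GBS expects ids only: entity id plus a list of property ids.
        "entities": [
            {"id": e["id"], "properties": [p["id"] for p in e.get("properties", [])]}
            for e in request.get("entities", [])
        ],
        "relationships": [r["id"] for r in request.get("relationships", [])],
    }
    # Minimum data: one of user/group/service account and one of entity/relationship.
    has_subject = any(rule[key] for key in ("users", "groups", "clients"))
    has_object = any(rule[key] for key in ("entities", "relationships"))
    if not (has_subject and has_object):
        raise ValueError("rule needs a user, group or client and an entity or relationship")
    return rule


def build_gbs_request(rules: List[Dict[str, Any]], strategy: str = "Ignore") -> Dict[str, Any]:
    """Wrap rules in the body expected by the accessRules endpoint."""
    return {"ruleCreationStrategy": strategy, "rules": rules}


def make_post_request(base_url: str, token: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) the POST request to GBS."""
    return urllib.request.Request(
        base_url.rstrip("/") + ACCESS_RULES_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
```

The prepared request could then be sent with urllib.request.urlopen; a real worker would derive rule names and access levels from the external system rather than from fixed arguments.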
If the worker does not encounter any errors, it should return status code 200. Otherwise, it should return an appropriate status code and error message, which will be logged in Airflow.