Metadata Microservices

Use the CDAP Metadata Microservices to set, retrieve, and delete the metadata annotations of applications, datasets, and other entities in CDAP.

Note: Metadata for versioned entities, such as applications, programs, schedules, and program runs, is not versioned. Metadata added in one version is reflected in all versions.

Metadata consists of properties (a list of key-value pairs) or tags (a list of keys). Metadata and their use are described in the Metadata and Lineage section.

The Microservices are divided into these sections:

  • Metadata properties

  • Metadata tags

  • Searching metadata

  • Viewing lineage

  • Field level lineage

  • Metadata for a run of a program

Metadata keys, values, and tags must conform to the CDAP alphanumeric extra extended character set, and are limited to 50 characters in length. The entire metadata object associated with a single entity is limited to 10K bytes in size.

There is one reserved word for property keys and values: tags, either as tags or TAGS. Tags themselves have no reserved words.
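
The limits and the reserved word described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of CDAP: the function name check_metadata is hypothetical, "10K bytes" is read as 10,000 bytes, and the object size is assumed to be measured on its UTF-8 JSON encoding. It does not check the character set.

```python
import json

MAX_LENGTH = 50           # limit for each key, value, and tag, as stated above
MAX_TOTAL_BYTES = 10_000  # assumption: "10K bytes" read as 10,000 bytes
RESERVED = "tags"         # reserved for property keys and values, as tags or TAGS


def check_metadata(properties, tags):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key, value in properties.items():
        for role, text in (("key", key), ("value", value)):
            if len(text) > MAX_LENGTH:
                problems.append(f"property {role} {text!r} exceeds {MAX_LENGTH} characters")
            if text in (RESERVED, RESERVED.upper()):
                problems.append(f"property {role} {text!r} is a reserved word")
    for tag in tags:
        # Tags have no reserved words, so only the length is checked.
        if len(tag) > MAX_LENGTH:
            problems.append(f"tag {tag!r} exceeds {MAX_LENGTH} characters")
    # Assumption: the size limit applies to the UTF-8 JSON encoding of the whole object.
    encoded = json.dumps({"properties": properties, "tags": list(tags)}).encode("utf-8")
    if len(encoded) > MAX_TOTAL_BYTES:
        problems.append(f"metadata object is {len(encoded)} bytes, over {MAX_TOTAL_BYTES}")
    return problems
```

A caller would typically refuse to submit the request when the returned list is not empty.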

All methods or endpoints described in this API have a base URL (typically http://<host>:11015 or https://<host>:10443) that precedes the resource identifier, as described in the Microservices Conventions. These methods return a status code, as listed in the Microservices Status Codes.

Note: Datasets are deprecated and will be removed in CDAP 7.0.0.

Metadata Properties

Annotating Properties

To annotate user metadata properties for an application, dataset, or other entities including custom entities, submit an HTTP POST request:

POST /v3/namespaces/<namespace-id>/<entity-details>/metadata/properties

or, for a particular program of a specific application:

POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties

or, for a particular version of an artifact:

POST /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties

or, for a custom entity such as a field of a dataset:

POST /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/properties

with the metadata properties as a JSON string map of string-string pairs, passed in the request body:

{ "key1" : "value1", "key2" : "value2", ... }

New property keys will be added and existing keys will be updated. Existing keys not in the properties map will not be deleted.

Parameter

Description

namespace-id

Namespace ID.

entity-details

Hierarchical key-value representation of the entity.

app-id

Name of the application.

program-type

One of mapreduce, spark, workflows, services, or workers.

program-id

Name of the program.

artifact-id

Name of the artifact.

artifact-version

Version of the artifact.

dataset-id

Name of the dataset.

field-name

Name of the field.

HTTP Responses

Status Codes

Description

200 OK

The properties were set.

Note: When using this API, properties can be added to the metadata of the specified entity only in the user scope.
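
As an illustration of the POST request described above, the following sketch builds the request with the Python standard library without sending it. The host localhost, the property owner=sales-team, the helper name, and the Content-Type header are assumptions made for the example; the application name POS_SALES is borrowed from the example elsewhere on this page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def build_set_properties_request(namespace, entity_path, properties):
    """Build (but do not send) a POST that adds or updates user-scope properties."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/properties"
    body = json.dumps(properties).encode("utf-8")  # JSON map of string-string pairs
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption: not stated in this page
    )


request = build_set_properties_request("default", "apps/POS_SALES", {"owner": "sales-team"})
# To send it against a running instance: urllib.request.urlopen(request)
```

Because existing keys that are absent from the map are left untouched, the same call can be repeated with different keys to build up the properties incrementally.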

Retrieving Properties

To retrieve user metadata properties for an application, dataset, or other entities including custom entities, submit an HTTP GET request:

GET /v3/namespaces/<namespace-id>/<entity-details>/metadata/properties[?scope=<scope>]

or, for a specific application:

GET /v3/namespaces/<namespace-id>/apps/<app-id>/metadata/properties

or, for a particular program of a specific application:

GET /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties[?scope=<scope>]

or, for a particular version of an artifact:

GET /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties[?scope=<scope>]

or, for a custom entity such as a field of a dataset:

GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/properties[?scope=<scope>]

with the metadata properties returned as a JSON string map of string-string pairs, passed in the response body (pretty-printed):

{ "key1" : "value1", "key2" : "value2", ... }

Parameter

Description

namespace-id

Namespace ID.

entity-details

Hierarchical key-value representation of the entity.

app-id

Name of the application.

program-type

One of mapreduce, spark, workflows, services, or workers.

program-id

Name of the program.

artifact-id

Name of the artifact.

artifact-version

Version of the artifact.

dataset-id

Name of the dataset.

field-name

Name of the field.

scope

Optional scope filter. If not specified, properties in the user and system scopes are returned. Otherwise, only properties in the specified scope are returned.

Example

To get the creation time for a deployed data pipeline called POS_SALES, issue the following GET request:

GET /v3/namespaces/default/apps/POS_SALES/metadata/properties

The result is:

{ "Workflow:DataPipelineWorkflow": "DataPipelineWorkflow", "plugin:csv:validatingOutputFormat": "csv:validatingOutputFormat", "entity-name": "POS_SALES", "plugin:GCSFile:batchsource": "GCSFile:batchsource", "plugin:GCS:batchsink": "GCS:batchsink", "creation-time": "1613569893556", "description": "Data Pipeline Application", "plugin:text:validatingInputFormat": "text:validatingInputFormat", "version": "-SNAPSHOT", "schedule:dataPipelineSchedule": "dataPipelineSchedule:Data pipeline schedule", "Spark:phase-1": "phase-1", "plugin:Wrangler:transform": "Wrangler:transform" }

HTTP Responses

Status Codes

Description

200 OK

The requested properties were returned as a JSON string in the body of the response. The body can be empty if no properties are associated with the entity or if the entity does not exist.
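
The retrieval request and its response can be handled along these lines. This is a sketch under stated assumptions: the host localhost and the helper name are hypothetical, the scope value is written in lowercase as a guess at the accepted spelling, and creation-time is assumed to be milliseconds since the Unix epoch, which this page does not state.

```python
import json
import urllib.parse
from datetime import datetime, timezone

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def properties_url(namespace, entity_path, scope=None):
    """Return the retrieval URL, adding ?scope=... only when a scope is given."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/properties"
    if scope is not None:
        url += "?" + urllib.parse.urlencode({"scope": scope})
    return url


# Excerpt of the example response for the POS_SALES application.
response_body = '{"entity-name": "POS_SALES", "creation-time": "1613569893556"}'
properties = json.loads(response_body)

# Assumption: creation-time holds milliseconds since the Unix epoch.
created = datetime.fromtimestamp(int(properties["creation-time"]) / 1000, tz=timezone.utc)
```

Since a missing entity also yields 200 OK with an empty result, code that needs to tell "no properties" apart from "no such entity" has to check for the entity by other means.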

Deleting Properties

To delete all user metadata properties for an application, dataset, or other entities including custom entities, submit an HTTP DELETE request:

DELETE /v3/namespaces/<namespace-id>/<entity-details>/metadata/properties

or, for all user metadata properties of a particular program of a specific application:

DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties

or, for a particular version of an artifact:

DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties

To delete a specific property for an application, dataset, or other entity, submit an HTTP DELETE request with the property key:

DELETE /v3/namespaces/<namespace-id>/<entity-type>/<entity-id>/metadata/properties/<key>

or, for a particular property of a program of a specific application:

DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/properties/<key>

or, for a particular version of an artifact:

DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/properties/<key>

or, for a custom entity such as a field of a dataset:

DELETE /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/properties/<key>

Parameter

Description

namespace-id

Namespace ID.

entity-details

Hierarchical key-value representation of the entity.

app-id

Name of the application.

program-type

One of mapreduce, spark, workflows, services, or workers.

program-id

Name of the program.

artifact-id

Name of the artifact.

artifact-version

Version of the artifact.

dataset-id

Name of the dataset.

field-name

Name of the field.

key

Metadata property key.

HTTP Responses

Status Codes

Description

200 OK

The method was called successfully and the properties were deleted. In the case of a specific key, either the property was deleted, the key was not present, or the entity itself was not present.

Note: When using this API, only properties in the user scope can be deleted.
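
The two DELETE forms above differ only in the trailing key segment, which the following sketch makes explicit. The host, the helper name, and the example key are hypothetical, and percent-encoding the key is a general HTTP precaution rather than something this page specifies.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def build_delete_properties_request(namespace, entity_path, key=None):
    """Build a DELETE for all user-scope properties, or for one key when `key` is given."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/properties"
    if key is not None:
        # Percent-encode the key so it always occupies exactly one path segment.
        url += "/" + urllib.parse.quote(key, safe="")
    return urllib.request.Request(url, method="DELETE")


delete_all = build_delete_properties_request("default", "apps/POS_SALES")
delete_one = build_delete_properties_request("default", "apps/POS_SALES", key="owner")
```

A 200 OK response does not confirm that the key existed beforehand, so the call can be treated as idempotent.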

Metadata Tags

Adding Tags

To add user metadata tags for an application, dataset, or other entities including custom entities, submit an HTTP POST request:

POST /v3/namespaces/<namespace-id>/<entity-details>/metadata/tags

or, for a particular program of a specific application:

POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags

or, for a particular version of an artifact:

POST /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags

or, for a custom entity such as a field of a dataset:

POST /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/tags

with the metadata tags, as a list of strings, passed in the JSON request body:

["tag1", "tag2"]

Parameter

Description

namespace-id

Namespace ID.

entity-details

Hierarchical key-value representation of the entity.

app-id

Name of the application.

program-type

One of mapreduce, spark, workflows, services, or workers.

program-id

Name of the program.

artifact-id

Name of the artifact.

artifact-version

Version of the artifact.

dataset-id

Name of the dataset.

field-name

Name of the field.

HTTP Responses

Status Codes

Description

200 OK

The tags were set.

Note: When using this API, tags can be added to the metadata of the specified entity only in the user scope.
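
Adding tags to a program follows the same pattern as the requests above, with a JSON list as the body. In this sketch the host, the helper name, and the Content-Type header are assumptions; the workflow name DataPipelineWorkflow is taken from the example response on this page and is used only for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def build_add_tags_request(namespace, entity_path, tags):
    """Build (but do not send) a POST that adds user-scope tags to an entity."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/tags"
    body = json.dumps(list(tags)).encode("utf-8")  # JSON list of strings
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption: not stated in this page
    )


# Program path: apps/<app-id>/<program-type>/<program-id>
program_path = "apps/POS_SALES/workflows/DataPipelineWorkflow"
request = build_add_tags_request("default", program_path, ["tag1", "tag2"])
```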

Retrieving Tags

To retrieve user metadata tags for an application, dataset, or other entities including custom entities, submit an HTTP GET request:

GET /v3/namespaces/<namespace-id>/<entity-details>/metadata/tags[?scope=<scope>]

or, for a particular program of a specific application:

GET /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags[?scope=<scope>]

or, for a particular version of an artifact:

GET /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags[?scope=<scope>]

or, for a custom entity such as a field of a dataset:

GET /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/tags[?scope=<scope>]

with the metadata tags returned as a JSON list of strings in the response body:

["tag1", "tag2"]

Parameter

Description

namespace-id

Namespace ID.

entity-details

Hierarchical key-value representation of the entity.

app-id

Name of the application.

program-type

One of mapreduce, spark, workflows, services, or workers.

program-id

Name of the program.

artifact-id

Name of the artifact.

artifact-version

Version of the artifact.

dataset-id

Name of the dataset.

field-name

Name of the field.

scope

Optional scope filter. If not specified, tags in the user and system scopes are returned. Otherwise, only tags in the specified scope are returned.

HTTP Responses

Status Codes

Description

200 OK

The requested tags were returned as a JSON string in the body of the response. The body can be empty if no tags are associated with the entity or if the entity does not exist.
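
A client reading tags needs to cope with the empty case mentioned above. The sketch below builds the URL with the optional scope filter and decodes a response body; the host and helper names are hypothetical, the lowercase scope spelling is a guess, and treating both a blank body and an empty list as "no tags" is an assumption about what "empty" means here.

```python
import json
import urllib.parse

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def tags_url(namespace, entity_path, scope=None):
    """Return the tag-retrieval URL, adding ?scope=... only when a scope is given."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/tags"
    if scope is not None:
        url += "?" + urllib.parse.urlencode({"scope": scope})
    return url


def parse_tags(response_body):
    """Decode a tag response; a blank body or an empty list both yield an empty set."""
    if not response_body.strip():
        return set()
    return set(json.loads(response_body))
```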

Removing Tags

To delete all user metadata tags for an application, dataset, or other entities including custom entities, submit an HTTP DELETE request:

DELETE /v3/namespaces/<namespace-id>/<entity-details>/metadata/tags

or, for all user metadata tags of a particular program of a specific application:

DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags

or, for a particular version of an artifact:

DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags

To delete a specific user metadata tag for an application, dataset, or other entity, submit an HTTP DELETE request with the tag:

DELETE /v3/namespaces/<namespace-id>/<entity-type>/<entity-id>/metadata/tags/<tag>

or, for a particular user metadata tag of a program of a specific application:

DELETE /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/metadata/tags/<tag>

or, for a particular version of an artifact:

DELETE /v3/namespaces/<namespace-id>/artifacts/<artifact-id>/versions/<artifact-version>/metadata/tags/<tag>

or, for a custom entity such as a field of a dataset:

DELETE /v3/namespaces/<namespace-id>/datasets/<dataset-id>/field/<field-name>/metadata/tags/<tag>

Parameter

Description

namespace-id

Namespace ID.

entity-details

Hierarchical key-value representation of the entity.

app-id

Name of the application.

program-type

One of mapreduce, spark, workflows, services, or workers.

program-id

Name of the program.

artifact-id

Name of the artifact.

artifact-version

Version of the artifact.

dataset-id

Name of the dataset.

field-name

Name of the field.

tag

Metadata tag.

HTTP Responses

Status Codes

Description

200 OK

The method was called successfully and the tags were deleted. In the case of a specific tag, either the tag was deleted, the tag was not present, or the entity itself was not present.

Note: When using this API, only tags in the user scope can be deleted.
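
For a custom entity such as a dataset field, the entity path is simply a longer sequence of path segments. The following sketch removes one tag from a field; the dataset name sales, the field name customer_email, the tag pii, the host, and both helper names are hypothetical.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def field_path(dataset_id, field_name):
    """Entity path for a field of a dataset, following the pattern shown above."""
    return f"datasets/{dataset_id}/field/{field_name}"


def build_delete_tag_request(namespace, entity_path, tag=None):
    """Build a DELETE for all user-scope tags, or for one tag when `tag` is given."""
    url = f"{BASE_URL}/v3/namespaces/{namespace}/{entity_path}/metadata/tags"
    if tag is not None:
        # Percent-encode the tag so it always occupies exactly one path segment.
        url += "/" + urllib.parse.quote(tag, safe="")
    return urllib.request.Request(url, method="DELETE")


request = build_delete_tag_request("default", field_path("sales", "customer_email"), "pii")
```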

Searching for Metadata

CDAP supports searching metadata of entities. To find which applications, datasets, etc. have a particular metadata property or metadata tag, submit an HTTP GET request:

GET /v3/namespaces/<namespace-id>/metadata/search?query=<term>[&target=<entity-type>&target=<entity-type2>...][&<option>=<option-value>&...]

Parameter

Description

namespace-id

Namespace ID.

query

Query term, as described below. Query terms are case-insensitive.

entity-type

Restricts the search to either all or specified entity types: all, artifact, app, dataset, program, view.

option

Options for controlling cursors, limits, offsets, the inclusion of hidden and custom entities, and sorting:

Option Name

Option Value, Description, and Notes

sort

The sorting order for the results being returned. Default is to sort search results as a function of relative weights for the specified search query. Specify the sort order as the field name followed by the sort order (either asc or desc) with a space separating the two. Using URL-encoding, an example: &sort=creation-time+asc. Note that this field is only applicable when the search query is *.

offset

The number of search results to skip before including them in the returned results. Default is 0.

limit

The number of metadata search entities to return in the results. By default, there is no limit.

cursor

Cursor to move to in the search results. This would be a value returned in the cursors field of a response of a previous metadata search request. Note that this field is only applicable when the search query is *.

numCursors

Determines the number of chunks of search results of size limit to fetch after the first chunk of size limit. This parameter can be used to roughly estimate the total number of results that match the search query. Only used when the search query is *.

showHidden

By default, metadata search hides entities whose name starts with an _ (underscore) from the search results. Set this to true to include these hidden entities in search results. Default is false.

showCustom

By default, metadata search hides custom entities from the search results for backward compatibility. Set this to true to include these custom entities in search results. Default is false.

entityScope

The scope of entities for the metadata search. By default, all entities will be returned. Set this to USER to include only user entities; set this to SYSTEM to include only system entities.

Format for an option: &<option-name>=<option-value>
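
Because target can be repeated and option values need URL-encoding, it can help to assemble the query string programmatically. The following is a sketch using the Python standard library; the host and the helper name search_url are hypothetical, and options are emitted in alphabetical order only to make the output predictable.

```python
import urllib.parse

BASE_URL = "http://localhost:11015"  # assumption: local instance on the typical HTTP port


def search_url(namespace, query, targets=(), **options):
    """Build a metadata search URL with repeated target parameters and extra options."""
    params = [("query", query)]
    params += [("target", target) for target in targets]
    # Boolean options such as showHidden should be passed as the strings "true" or "false".
    params += sorted(options.items())
    # urlencode encodes a space as "+", matching the sort example above.
    return f"{BASE_URL}/v3/namespaces/{namespace}/metadata/search?" + urllib.parse.urlencode(params)


url = search_url("default", "*", targets=["app", "dataset"], sort="creation-time asc", limit=20)
```

The * query is percent-encoded as %2A by urlencode, which decodes to the same query term on the server side.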

Entities that match the specified query and entity type are returned in the body of the response in JSON format:

{
  "cursors": [ ],
  "limit": 20,
  "numCursors": 0,
  "offset": 0,
  "showHidden": false,
  "sort": "creation-time DESC",
  "total": 2,
  "entityScope": [ "SYSTEM" ],
  "results": [
    {
      "entityId": {
        "application": "WordCount",
        "entity": "PROGRAM",
        "namespace": "default",
        "program": "RetrieveCounts",
        "type": "Service",
        "version": "-SNAPSHOT"
      },
      "metadata": {
        "SYSTEM": {
          "properties": {
            "creation-time": "1482091087438",
            "description": "A service to retrieve statistics, word counts, and associations.",
            "entity-name": "RetrieveCounts",
            "version": "-SNAPSHOT"
          },
          "tags": [ "Realtime", "Service" ]
        }
      }
    }
  ]
}
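
A search response can be reduced to the fields of interest with a few dictionary lookups. This sketch assumes that each result carries an entityId object and a metadata object keyed by scope, with properties and tags nested under the scope; that nesting is inferred from the field names in the example response. The function name summarize is hypothetical, and the embedded response is a trimmed copy of the example.

```python
import json

# Trimmed copy of the example response, with the assumed nesting made explicit.
response_body = """
{
  "total": 2,
  "results": [
    {
      "entityId": {"application": "WordCount", "entity": "PROGRAM", "namespace": "default",
                   "program": "RetrieveCounts", "type": "Service", "version": "-SNAPSHOT"},
      "metadata": {"SYSTEM": {"properties": {"entity-name": "RetrieveCounts"},
                              "tags": ["Realtime", "Service"]}}
    }
  ]
}
"""


def summarize(search_response):
    """Return (entity type, entity name, sorted system tags) for each search result."""
    summary = []
    for result in json.loads(search_response).get("results", []):
        entity = result.get("entityId", {})
        system = result.get("metadata", {}).get("SYSTEM", {})
        name = system.get("properties", {}).get("entity-name")
        summary.append((entity.get("entity"), name, sorted(system.get("tags", []))))
    return summary


rows = summarize(response_body)
```

Using .get with defaults keeps the function tolerant of results that lack a SYSTEM section, for example when only user-scope metadata matched.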

HTTP Responses

Status Codes

Description

Created in 2020 by Google Inc.