Loom REST API V1 - Teradata Developer Exchange

Loom REST API
Document Revision 0.10 - 01 Aug 2014
Loom REST API
API Version ‘V1’
This documents the ‘v1’ version of the Loom API. This version is applicable for Loom 2.2 and
beyond.
API Overview
Operations in Loom are all managed through a set of HTTP-based APIs. While all operations can
be performed through the Loom application, users may also access the APIs directly.
The Loom API is designed to be versioned. The initial version is ‘v1’, accessible through the Loom
server URL from the root:
http://<host>:<port>/api/v1/
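A client typically builds every request URL from this versioned root. The following is a minimal sketch in Python; the helper name `loom_url` and the example host and port are illustrative assumptions, not part of the Loom API.

```python
def loom_url(host: str, port: int, route: str = "", version: str = "v1") -> str:
    """Join a route onto the versioned Loom API root (hypothetical helper)."""
    root = f"http://{host}:{port}/api/{version}/"
    # Routes in this document are relative to the root, so strip any leading slash.
    return root + route.lstrip("/")


# Example with a placeholder host and port.
print(loom_url("localhost", 8080, "/datasets"))
```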
Basic Organization of API
The Loom API is organized into two parts: the Resource API centers around resources managed in
the Loom Registry: sources, datasets, etc; and the Activity API focuses on activities that users can
perform with Loom: executing transforms, accessing data, etc.
Resources and Resource API
The Resource API is focused on entities, which are exposed as resources according to their types.
The routes shown in the following tables are relative to the Loom root, e.g. the ‘datasets’ resources
are accessible from http://<host>:<port>/api/v1/datasets.
Resource | Description | Route
sources | Sources of data, not directly controlled by Loom. | /sources
datasets | Sets of data whose lifecycles are controlled by Loom. | /datasets
processes | Processing performed on datasets. | /processes
jobs | Tracking of asynchronous processes executed through Loom. | /jobs
users | User accounts. | /users
glossaries | Glossaries of business terms. | /glossaries
relationships | Dynamic relationships between entities. | /relationships
The ‘generic’ form of the Resource API is accessed through the following endpoints. Note that
these parts of the API are not as well-developed as the type-specific instance methods shown
above.
Resource | Description | Route
entities | Generic entities (irrespective of type) | /entities
types | Type information. | /types
Activity API
The Activity API is focused on activities that users can perform against the Loom system.
Activity | Description | Route
connection | User login, logout, ping | /connect
search | Related to search - fulltext search, filters for search, etc | /search
data | Data access, reading files and getting data from datasets | /data
execution | Executing transformations against datasets | /execute
environment | Environment interactions - browse the file system, etc | /environ
system | Loom system information | /system
API Standard Response
Every method returns the following standard response:
{
results: { ...the actual result(s) ... },
related: { ...any entities that are referred to by id from within 'results'... },
count: ...the size of 'results' ... ,
errors: [ ...any errors that occurred in processing the request... ]
}
Note that the results will be returned as an array for ‘many’ requests, and as a scalar for ‘individual’
requests (such as when <id> is in URL). When the return value is the unique identifier of an entity,
the results will contain a map with a single key, 'entity/id'.
In the documentation of each method, generally only the ‘results’ part of the response structure is
described.
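As an illustration of how a client might consume the standard response, the sketch below assumes the response body has already been decoded into a Python dict. The names `LoomApiError`, `unwrap`, and `created_id` are hypothetical helpers, not part of the API.

```python
class LoomApiError(Exception):
    """Raised when the 'errors' list of a response is non-empty (hypothetical type)."""


def unwrap(response: dict):
    """Return the 'results' part of a standard response, or raise on errors."""
    errors = response.get("errors") or []
    if errors:
        raise LoomApiError(errors)
    # 'results' is an array for 'many' requests and a scalar for 'individual' ones.
    return response.get("results")


def created_id(response: dict) -> str:
    """Extract the identifier when 'results' is a map with the single key 'entity/id'."""
    return unwrap(response)["entity/id"]
```

Raising on a non-empty 'errors' list is a design choice of this sketch; the document does not state whether partial results can accompany errors.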
Related Section
The ‘related’ section is a map of entity IDs to properties. The entity IDs will match property values
returned in the ‘results’ section. In that way, results containing properties whose values are entity
IDs can be resolved into a more human-readable form. For example, the ‘entity/createdBy’
value is the unique entity identifier of a user in the system; for display purposes, though, the
user’s actual name is usually preferred. To obtain it, use the createdBy value from the results
section as the key into the related section, and read the user’s name from the matching entry.
Example:
{
  "results": {
    "entity/name": "SomeEntity",
    "entity/description": "The entity description.",
    "entity/tags": "tag1, tag2, tag3",
    "entity/folder": "test1/test2",
    "entity/createdBy": "52e28419-2c48-436d-8e7c-643cf331e071",
    "entity/modifiedBy": "52e28419-1725-47bd-9884-6149e7b9b446"
  },
  "related": {
    "52e28419-2c48-436d-8e7c-643cf331e071": { "user/username": "smusial" },
    "52e28419-1725-47bd-9884-6149e7b9b446": { "user/username": "bgibson" }
  }
}
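The lookup described above can be sketched in Python as follows. The function name `resolve_username` is a hypothetical helper, and the response is assumed to be already decoded into a dict.

```python
def resolve_username(response: dict, prop: str):
    """Look up the username for an entity-ID property via the 'related' map."""
    entity_id = response["results"].get(prop)
    related = response.get("related", {})
    # Returns None when the property or its related entry is missing.
    return related.get(entity_id, {}).get("user/username")


response = {
    "results": {
        "entity/name": "SomeEntity",
        "entity/createdBy": "52e28419-2c48-436d-8e7c-643cf331e071",
    },
    "related": {
        "52e28419-2c48-436d-8e7c-643cf331e071": {"user/username": "smusial"},
    },
}
print(resolve_username(response, "entity/createdBy"))  # smusial
```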
Resources and Resource API
This API exposes the Loom registry entities as resources, using standard REST conventions. The
Resource API is organized by type (e.g., Source, Dataset, etc), with two generic parts for entity
instances and entity types.
For each section, the following information is provided:
● Attributes - orange indicates a domain entity, magenta a struct (subordinate to an entity)
● Requests - summary of the methods available for the resource
● Request Details - calling details for each method
Note that every Entity in Loom automatically has all the core Entity attributes (entity/id, entity/name,
etc). See the Model Overview below for details on the core Entity attributes.
Sources
Sources represent sources of data whose lifecycles are not controlled by Loom. Sources are
containers of data units that conform to some structural form (tables are currently supported by
Loom). Sources are similar in structure to Datasets, but are semantically different, as Datasets are
managed by Loom.
Source Entity Attributes
The Source entity has the following attributes, in addition to the core entity attributes.
Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data container. Currently, the only supported form is “table”.
data/structure [array of embedded struct] | Default structure for the data units contained in this source, defined by a schema (or possibly, multiple schemas), of type ‘data/Schema’. Applies to all data units in the source, unless specifically overridden by a data unit.
source/expandable [boolean] | Whether the source is an expandable collection or not.
source/metadataAccessible [boolean] | Whether the source's metadata can be accessed by Loom.
source/dataAccessible [boolean] | Whether the source's data can be accessed by Loom.
source/entityState [string] | Indicator of the state the entity is in. One of ‘potential’, ‘active’, or ‘deleted’.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/Storage.
data/dataUnit [array of references] | References (pointers via entity ID) to one or more data/DataUnits.
DataUnit Attributes
A Source is a data container, represented by the type DataContainer. All data containers own a
set of DataUnits. DataUnits represent the actual data (although in most cases, they are merely
proxies for the data, and do not physically contain it). DataUnits are first-class entities, with unique
identifiers, so they may be referenced from outside the context of the containing data container.
Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data unit. Currently, the only supported form is “table”.
data/structure [array of embedded struct] | Structure for the data, defined by a schema (or possibly, multiple schemas), of type ‘data/Schema’. Overrides the default structure for the containing source, to set the structure on this data unit.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/StorageUnit.
Schema Attributes
Data units contain one or more schemas. A Schema is a structure; it is fully owned by a DataUnit
and does not (currently) have a unique identifier.
Name [Type] | Description
data/structuralForm [string] | The structural form that the schema represents. Currently, the only supported form is “table”.
data/isDefault [boolean] | If true (or nil if one schema), the schema is the default one for the data unit.
The type of the schema depends on the structural form of the data unit. For table data units, with
structural form of ‘table’, the schema type is TableSchema.
Storage Attributes
A Source is physically persisted to some system (HDFS, database, etc). The Storage entity
represents the persistence information. For example, a source may hold its information in a
directory of files, or in a Hive database. Storage is a container; it owns a set of StorageUnits. For
example, if the Storage is a directory, the storage units are individual files within the directory.
Name [Type] | Description
persist/storageType [string] | The type of storage. E.g., ‘file/text’, ‘file/binary’, ‘rdb/hive’, ‘rdb/generic’.
persist/location [string] | Location of storage; used to connect to or otherwise access the storage.
persist/application [string] | Application that can process this type of storage.
persist/storageUnit [array of references] | References (pointers via entity IDs) to the storage units for this storage.
persist/format [embedded struct] | Default format for the storage; applies to all storage units unless overridden by a storage unit.
There are additional properties that apply for extensions to the base Storage. For example, a
FileSet.
StorageUnit Attributes
StorageUnits are proxies for the individual units that hold the data that is exposed from a Source
as a DataUnit. For example, if the Storage is a directory, the storage units are the individual files
within that directory.
Name [Type] | Description
persist/location [string] | Absolute location of storage unit, if applicable.
persist/relativeLocation [string] | Relative location of storage unit in storage.
persist/containsData [boolean] | True if storage unit contains data (will get exposed from source as a DataUnit).
persist/format [string] | Storage format for storage unit; overrides storage-level format.
persist/formatType [string] | Type of storage format.
There are additional properties that apply for extensions to the base StorageUnit. For example, a
FileSetFile.
Format Attributes
Formats define how the bits in persistent storage are to be read and parsed. There is a default
format nested under a Storage, which applies to all Storage Units in that Storage unless
overridden. Each StorageUnit can explicitly define a Format, which takes precedence over the
default stored in its Storage container.
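The precedence rule above can be sketched as a small Python function. It assumes Storage and StorageUnit entities are available as dicts keyed by attribute name, and treats format values as opaque; `effective_format` is a hypothetical helper name.

```python
def effective_format(storage: dict, storage_unit: dict):
    """Return the storage unit's own format if set, else the storage default."""
    unit_format = storage_unit.get("persist/format")
    if unit_format is not None:
        # An explicit unit-level format takes precedence.
        return unit_format
    return storage.get("persist/format")


# Placeholder values for illustration only.
storage = {"persist/format": {"persist/formatType": "text/delim"}}
print(effective_format(storage, {}))
```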
Name [Type] | Description
persist/formatType [string] | Type of storage format. E.g., ‘text/delim’ or ‘text/pattern’ for storage types of ‘file/text’; ‘binary/avro’ for storage types of ‘file/binary’. There is no format for storage types of ‘rdb/*’.
The interesting properties are associated with specific subclasses of Format, e.g., in
DelimitedFormat and PatternFormat.
Source Summary Attributes
The SourceSummary structure captures a ‘view’ of a Source entity, pulling in information from
related entities (such as scan measurements). Instances of these structs are returned from the
‘summary’ API methods.
Name [Type] | Description
summary/entityID [string] | The unique identifier of the entity that the summary is for.
summary/entityName [string] | The name of the entity that the summary is for.
summary/entityDescription [string] | The description of the entity that the summary is for.
summary/entityCreatedAt [instant] | The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy [string] | The username of the person who created the entity.
summary/entityModifiedAt [instant] | The timestamp when the entity was last modified.
summary/entityModifiedBy [string] | The username of the person who last modified the entity.
data/structuralForm [string] | The structural form of the data contained within the data container. E.g., “table”.
persist/storageType [string] | Storage form: file/text, file/binary, rdb/hive, rdb/generic, etc.
persist/location [string] | The location of the source; duplicate of Storage location, for convenience.
data/expandable [boolean] | Whether the source is an expandable collection or not.
data/autoUpdate [boolean] | Whether the source will be auto-updated as new files, etc are created.
source/metadataAccessible [boolean] | Whether the source's metadata can be accessed by Loom.
source/dataAccessible [boolean] | Whether the source's data can be accessed by Loom.
source/entityState [string] | Indicator of the lifecycle state of the source entity. One of ‘active’ or ‘potential’.
summary.source/nDataset [long] | Number of datasets derived from the source.
summary.source/nDataUnit [long] | Number of data units exposed by the source.
summary.data/dataUnit [array of embedded structs] | Relationship to summary items for each data unit (DataUnitSummary structs).
For a particular source instance, a SourceSummary structure contains one DataUnitSummary for
each data unit in the source.
Name [Type] | Description
summary/entityID [string] | The unique identifier of the entity that the summary is for.
summary/entityName [string] | The name of the entity that the summary is for.
summary/entityDescription [string] | The description of the entity that the summary is for.
summary/entityCreatedAt [instant] | The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy [string] | The username of the person who created the entity.
summary/entityModifiedAt [instant] | The timestamp when the entity was last modified.
summary/entityModifiedBy [string] | The username of the person who last modified the entity.
data/structuralForm [string] | The structural form of the data contained within the data container. E.g., “table”.
summary.data/nRow [long] | Number of rows in the (2-dimensional) data item.
summary.data/nCol [long] | Number of columns or fields in the data item.
summary.data/sizeBytes [long] | Size of the data item, in bytes; null if unknown.
summary.source/nameInSource [string] | The name of the data entity in its native source (e.g. file path for files).
Requests
Request | Description
GET sources | Get all sources matching the provided filters.
POST sources | Create a new source, given entity metadata and storage information.
GET sources/default | Get a default source instance, for use when creating a new source.
POST sources/default | Create a new source given a location, using all default settings.
GET sources/summary | Get summaries of all sources matching the provided filters.
GET sources/<id> | Get the source with the specified entity ID.
PATCH sources/<id> | Modify the source with the specified ID with updated attributes and storage information.
DELETE sources/<id> | Delete the source with the specified entity ID.
GET sources/<id>/summary | Get the summary of the specified source.
POST sources/<id>/data_units | Add data unit to an existing source.
The ‘default’ methods are convenience functions that help in defining a source. The ‘GET’ version
provides default settings, which can be edited by users and then saved to Loom using ‘POST
/sources’. The ‘POST’ version simply creates a source using all defaults, with no user interaction.
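The two-step flow (fetch defaults, edit, then create) might look like the sketch below. It is written against a hypothetical transport function `send(method, path, params)` that performs the HTTP call and returns the decoded standard response; how parameters are encoded on the wire is not specified in this document and is left to that function. Reading the new identifier from 'entity/id' follows the standard-response convention for returned identifiers.

```python
from typing import Callable


def register_with_defaults(send: Callable[[str, str, dict], dict],
                           location: str, name: str) -> str:
    """Fetch default settings, edit them, then create the source."""
    # Step 1: GET sources/default returns entity, storage and storage_units.
    defaults = send("GET", "sources/default", {"location": location})["results"]

    # Step 2: edit the defaults (here, only the entity name).
    defaults["entity"]["entity/name"] = name

    # Step 3: POST sources with the edited values; results hold the new ID.
    created = send("POST", "sources", defaults)
    return created["results"]["entity/id"]
```

Injecting the transport keeps the sketch independent of any particular HTTP library and makes the flow easy to exercise with a stub.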
Note on setting Data Unit schemas:
There are several possible ways in which Loom will determine the structure (i.e. schema) with
which to access data in a data unit contained in a given source.
● The schema may be read directly from the physical storage for that source, e.g. a file
header or the Hive metastore.
● The schema may be explicitly set by the user or by the API client.
● The schema may be omitted, and inherited from the default schema set on the source
container.
The API methods which result in creation or modification of data unit entities (POST sources,
PATCH sources/<id>, POST sources/<id>/data_units) all use the following rules to determine the
schema that ultimately gets used for a given data unit entity, listed in priority order:
1. If the 'data/structure' property is set on a data unit entity passed through the API and is a
non-empty array, then this property is stored unchanged on the resulting data unit entity in
the registry.
a. No validation is done of such a manually-supplied schema; if it is inconsistent with
the underlying data (e.g. missing fields, unknown fields) then this may result in
missing data or errors when accessing the underlying data through Loom.
2. If the 'data/structure' property is explicitly set to null on the data unit entity passed through
the API, or is an empty array, then no structure is set on the resulting data unit entity stored
in the registry; the structure is inherited from the parent source.
a. If no default data/structure is set on the source, then an error code is returned and
the operation will fail.
b. No validation is done of the source default schema with respect to the data unit's
data; if it is inconsistent with the underlying data (e.g. missing fields, unknown fields)
then this may result in missing data or errors when accessing the underlying data
through Loom.
3. If no data unit entity is supplied for a table to be created, or a data unit entity is passed
through the API without a 'data/structure' property, and the 'source/metadataAccessible'
property is true for the source, then Loom will infer the table structure from the underlying
storage. This is the recommended mode of operation when working with Loom's
supported storage formats.
a. If Loom can directly access metadata about the structure of the data unit through the
underlying storage format (e.g. an embedded schema in a file header, or a schema
managed by another metastore) then this structure is translated into a schema entity
and stored as part of the data unit entity in Loom, and used whenever the data is
accessed through Loom.
b. If the structure for the underlying data cannot be directly determined by examining
the underlying storage, as might be the case for unstructured or semi-structured text
formats such as CSV or log files, then a structure is generated for the data unit
based on examining a sample of the underlying data, using auto-generated field
names.
i. If a default structure is present on the parent source entity, and matches the
generated structure in number and type of fields, then no structure is set on
the resulting data unit entity in the registry; the source default schema is
used whenever accessing this data unit.
ii. If the parent Source entity has no default structure, or it has one that does
not match the generated structure in number or type of fields, then the
generated structure is stored in the data/structure property of the resulting
DataUnit entity in the registry, and will override the Source default structure.
4. If no DataUnit entity is supplied for a table to be created, or a DataUnit entity is passed
through the API without a 'data/structure' property, and the 'source/metadataAccessible'
property is false for the Source, then Loom will not include a structure on the DataUnit entity
that is stored in the registry.
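The priority rules above can be summarized in a short Python sketch. It is a client-side illustration only, not Loom's implementation. Schemas are assumed to be lists of field dicts with a "type" key (a hypothetical representation), and `storage_schema` and `sampled_schema` stand in for the results of reading storage metadata (rule 3a) and sampling the data (rule 3b). The function returns the structure that would be stored on the data unit, with None meaning that none is stored.

```python
def field_types(schema):
    """List the field types of a schema, used for the rule 3b comparison."""
    return [field["type"] for field in schema]


def stored_structure(data_unit, source_default, metadata_accessible,
                     storage_schema=None, sampled_schema=None):
    """Return the 'data/structure' stored on the data unit, or None."""
    if data_unit is not None and "data/structure" in data_unit:
        structure = data_unit["data/structure"]
        if structure:
            return structure            # rule 1: stored unchanged
        if not source_default:
            raise ValueError("no default data/structure on the source")  # rule 2a
        return None                     # rule 2: inherit from the source
    if not metadata_accessible:
        return None                     # rule 4: no structure is stored
    if storage_schema:
        return storage_schema           # rule 3a: schema read from storage
    sampled_schema = sampled_schema or []
    if source_default and field_types(source_default) == field_types(sampled_schema):
        return None                     # rule 3b.i: source default is used
    return sampled_schema               # rule 3b.ii: generated schema overrides
```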
Request Details
Paths are all relative to the root of the API, including version number.
Get Registered Sources
Get all sources matching the provided filters.
Path: sources
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_source_storagetype: filter on the source’s storage type; values ‘all’ (default) or one of the valid values of persist/storageType
  filter_source_entitystate: filter on the source’s entity state; values ‘all’ (default) or one of the valid values of source/entityState
  filter_ndataunit: filter on the number of data units in the source
  filter_use_frequency: filter on how often the source has been used to create datasets
Returns: array of Source entities
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
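A filtered request path could be assembled as in the sketch below. It assumes the filters are passed as URL query parameters, which this document does not state explicitly; `sources_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def sources_query(**filters) -> str:
    """Build the relative path for GET sources with optional filters."""
    # Drop filters that were not supplied so the server defaults apply.
    params = {name: value for name, value in filters.items() if value is not None}
    query = urlencode(params)
    return "sources?" + query if query else "sources"


print(sources_query(filter_recency="all", filter_source_storagetype="rdb/hive"))
```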
Register (Create) a Source
Create a new source, given entity metadata and storage information. The values returned from
‘GET /sources/default’ can be edited and passed into this method to create a new source.
Path: sources
Method: POST
Parameters:
  entity: a set of core entity properties (e.g., entity/name, entity/description, etc); optional, if name can be derived from the storage location or storage name
  data_units: an array of DataUnits, which define metadata and schemas of data units (tables) corresponding to storage units specified via the storage_units parameter. The correspondence between data units and storage units is through their entity names. Optional - if not specified, all storage units with their persist/containsData property set to true will be exposed as tables named the same as the storage units, and the schemas will be derived from the underlying physical persistence if the source/metadataAccessible property is true and they are not explicitly set in the corresponding data unit entity.
  storage: a Storage instance, which defines the container persistence; required
  storage_units: an array of StorageUnits, which define the unit-level persistence; required - there must be at least 1 storage unit. All storage units with their persist/containsData property set to true will be exposed as tables in the source.
Returns: id: ID of created source
Notes:
  The storage units drive the source container contents. Any storage unit with its persist/containsData property set to true will be exposed as a table (data unit) in the source. If the data_units section is not specified, or if there is not a data unit in that section corresponding to a storage unit, then the name of the data unit will be derived from that of the storage unit, and the schema will be inferred from the physical persistence (e.g., a file on disk) if source/metadataAccessible is true. If there is a data unit associated with a storage unit, then that data unit can define the descriptive properties (such as description, tags), and the schema to be used. The data unit is tied to its corresponding storage unit through their names, which must exactly match.
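The name-based pairing of data units with storage units described in the notes can be sketched as follows. Entities are assumed to be dicts keyed by attribute name, with the flag read from persist/containsData as listed under StorageUnit Attributes; `exposed_tables` is a hypothetical helper, not an API call.

```python
def exposed_tables(storage_units, data_units=()):
    """Map each data-bearing storage unit name to its data unit, or None."""
    by_name = {unit["entity/name"]: unit for unit in data_units}
    tables = {}
    for storage_unit in storage_units:
        if storage_unit.get("persist/containsData"):
            name = storage_unit["entity/name"]
            # None means: table name derived from the storage unit, schema inferred.
            tables[name] = by_name.get(name)
    return tables


# Placeholder entities for illustration only.
units = [
    {"entity/name": "orders.csv", "persist/containsData": True},
    {"entity/name": "README", "persist/containsData": False},
]
print(exposed_tables(units, [{"entity/name": "orders.csv", "entity/description": "Orders"}]))
```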
Get a Source Instance with Default Settings
Get a default source instance, for use when creating a new source. The values returned from this
method can be edited, and used with ‘POST /sources’ to create a new source in Loom.
Path: sources/default
Method: GET
Parameters:
  location: the location of the source (semantics based on storageType); required
  storage_type: the type of storage; optional, if server can derive it
  format_type: the type of format to use; optional (not needed for all storage types)
Returns:
  entity: source object (no entity/id)
  storage: Storage, without embedded StorageUnits
  storage_units: StorageUnits, separate from Storage
Register (Create) a Source using Default Settings
Create a new source given a location, using all default settings. This is a convenience to combine
the calls to ‘GET /sources/default’ and ‘POST /sources’ when no modification of the default source
is required.
Path: sources/default
Method: POST
Parameters:
  location: the location of the source (semantics based on storage_type); required
  storage_type: the type of storage; optional, if server can derive it
  format_type: the type of format used to read and parse the source; optional (not needed for all storage types); defaults to ‘text/delim’ if storage type == ‘file/text’, else null
  entity: a bundle of core properties (e.g., entity/name, entity/description, etc) to be set on the created Source entity; optional (name can be derived from location)
Returns: id: ID of created source
Get Source Summaries
Get summaries of all sources matching the provided filters.
Path: sources/summary
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_source_storagetype: filter on the source’s storage type; values ‘all’ (default) or one of the valid values of persist/storageType
  filter_source_entitystate: filter on the source’s entity state; values ‘all’ (default) or one of the valid values of source/entityState
  filter_ndataunit: filter on the number of data units in the source
  filter_use_frequency: filter on how often the source has been used to create datasets
Returns: array of SourceSummary structs
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
Get a Source Instance
Get the source with the specified entity ID.
Path: sources/<id>
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns:
  entity: source entity; persist/storage matches Storage’s entity/id
  storage: Storage; persist/storage references StorageUnits’ entity/id’s
  storage_units: StorageUnits, separate from Storage
Replace a Source Instance
Replace the source with the specified ID with updated attributes and storage information.
Path: sources/<id>
Method: PATCH
Parameters:
  entity: a set of core entity properties (e.g., entity/name, entity/description, etc); optional, if name can be derived from the storage location or storage name; if entity/id is included (e.g., if using results from GET /sources/<id>), it must match the ID in the URL, or it is considered an error
  data_units: an array of DataUnits, which define metadata and schemas of data units (tables) corresponding to storage units specified via the storage_units parameter. The correspondence between data units and storage units is through their entity names. Optional - if not specified, all storage units with their persist/containsData property set to true will be exposed as tables named the same as the storage units, and the schemas will be derived from the underlying physical persistence if the source/metadataAccessible property is true and they are not explicitly set in the corresponding data unit entity.
  storage: the Storage to replace the existing Storage; optional, if not changing Storage
  storage_units: the StorageUnits to replace the existing StorageUnits, en masse; optional, if not changing StorageUnits
Returns: id: ID of source
Notes:
  The properties specified under the entity parameter are merged in with the existing properties.
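The merge behavior and the entity/id check can be illustrated with the sketch below. It assumes "merged" means that supplied properties overwrite existing ones while all others are kept, which is an interpretation rather than something this document spells out; `merge_entity` is a hypothetical helper.

```python
def merge_entity(existing: dict, updates: dict, url_id: str) -> dict:
    """Merge updated entity properties into the existing ones."""
    supplied_id = updates.get("entity/id")
    if supplied_id is not None and supplied_id != url_id:
        # An entity/id that differs from the ID in the URL is an error.
        raise ValueError("entity/id does not match the ID in the URL")
    # Assumed semantics: supplied properties win, the rest are kept.
    return {**existing, **updates}
```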
Delete a Source Instance
Delete the source with the specified ID.
Path: sources/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing
Get a Source Instance’s Summary
Get the summary of the specified source.
Path: sources/<id>/summary
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: SourceSummary struct
Create a DataUnit in an Existing Source
Creates a new data unit (table) in an existing source. This is useful to represent a table that was
added to the source through an external system (as opposed to through Loom).
Path: sources/<id>/data_units
Method: POST
Parameters:
  location: a relative path to the item in the source (if an absolute path is provided, Loom will attempt to reconcile it with the source location, and extract the relative path if compatible)
  format: a Format object, for interpreting the persisted data pointed to by location. Optional; will use source’s Storage format (the ‘default’ format) if not specified; if specified, must be of the same type as the source’s Storage format.
  entity: a set of properties to assign to the created table. Optional; will derive table name from location if not specified.
  schema: an optional schema to associate with the data unit as the ‘default’ schema for that data unit. This is only allowed if the source that the data unit is being added to has the source/metadataAccessible property set to false. If this parameter is set, then the format parameter is optional, and if specified, will not be used to read metadata from the persistent store.
Returns: id: The ID of the data unit
Notes:
  1. If the source’s source/metadataAccessible property is set to true, then the location and format will be used to read the metadata and infer a schema for the associated data unit. If the source’s source/dataAccessible property is set to true, then the new table’s associated storage unit will have its persist/containsData property set to true; otherwise that property will be set to false.
  2. A schema can only be specified if the source’s source/metadataAccessible property is false. Otherwise, as noted above, the schema will be inferred from the location and format information.
  3. If a schema is specified, the format property is for informational purposes only; it is not used to infer the table metadata.
See Also:
  POST /sources: to create a new source
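The notes above imply a check that a client could run before issuing the request. The sketch below is one way to express it; `check_data_unit_request` and the keys of its return value are hypothetical, and the source entity and request parameters are assumed to be dicts.

```python
def check_data_unit_request(source: dict, request: dict) -> dict:
    """Validate parameters for POST sources/<id>/data_units (client-side sketch)."""
    metadata_ok = bool(source.get("source/metadataAccessible"))
    if "schema" in request and metadata_ok:
        # Note 2: a schema may only be given when metadata is not accessible.
        raise ValueError("schema is only allowed when source/metadataAccessible is false")
    return {
        # Note 1: schema is inferred from location and format when metadata is accessible.
        "infer_schema": metadata_ok,
        # Note 1: the storage unit's persist/containsData follows source/dataAccessible.
        "persist/containsData": bool(source.get("source/dataAccessible")),
    }
```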
___________________________________________
Datasets
Datasets represent sets of data whose lifecycles are controlled by Loom. Datasets originate from
Sources, and from other Datasets through transformations. Datasets are containers of data units
that conform to some structural form (such as tables). Datasets are similar in this way to Sources,
but are semantically different, as Datasets are managed by Loom whereas Sources are controlled
by some external entity (and so may be changed without direct involvement of Loom).
Dataset Entity Attributes
The Dataset entity has the following attributes, in addition to the core entity attributes.
Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data container. E.g., “table”.
dataset/entityState [string] | Indicates the ‘state’ the entity is currently in. One of ‘pending’, ‘active’, or ‘deleted’. A dataset is ‘pending’ if it is in the process of being created from a long-running transformation.
dataset/sourcedFrom [reference] | The Source identifier, if the dataset was created directly from a source. Null if the dataset was created through a transformation.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/Storage.
data/dataUnit [array of references] | References (pointers via entity ID) to data/DataUnits.
DataUnit Attributes
A Dataset is a data container, represented by the type DataContainer. All data containers own a
set of DataUnits. DataUnits represent the actual data (although in most cases, they are merely
proxies for the data, and do not physically contain it). DataUnits are first-class entities, with unique
identifiers, so they may be referenced from outside the context of the containing data container.
Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data unit. Currently, the only supported form is “table”.
data/structure [array of embedded struct] | Structure for the data, defined by a schema (or possibly, multiple schemas), of type ‘data/Schema’.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/StorageUnit.
Schema Attributes
Data units contain one or more schemas. A Schema is a structure; it is fully owned by a DataUnit
and does not (currently) have a unique identifier.
Name [Type] | Description
data/structuralForm [string] | The structural form that the schema represents. Currently, the only supported form is “table”.
data/isDefault [boolean] | If true (or nil if one schema), the schema is the default one for the data unit.
The type of the schema depends on the structural form of the data unit. For table data units, with
structural form of ‘table’, the schema type is TableSchema.
Storage Attributes
A Dataset is physically persisted to some system. The Storage entity represents the persistence
information. For the most part, storage information for datasets is hidden from Loom users (unlike
for Sources).
See the persistence information under Sources.
Dataset Summary Attributes
The DatasetSummary structure captures a ‘view’ of a Dataset entity, pulling in information from
related entities (such as scan measurements). Instances of these structs are returned from the
‘summary’ API methods.
Name [Type]
Description
summary/entityID
string
The unique identifier of the entity that the summary is for.
summary/entityName
string
The name of the entity that the summary is for.
summary/entityDescription
string
The description of the entity that the summary is for.
summary/entityCreatedAt
instant
The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy
string
The username of the person who created the entity.
summary/entityModifiedAt
instant
The timestamp when the entity was last modified.
summary/entityModifiedBy
string
The username of the person who last modified the entity.
data/structuralForm
string
The structural form of the data contained within the data
container. E.g., “table”.
data/expandable
boolean
Whether the dataset is an expandable collection or not.
data/autoUpdate
boolean
Whether the dataset will be auto-updated as its associated
source (sourcedFrom) is updated.
dataset/entityState
string
Indicator of the lifecycle state of the dataset entity. One of
‘active’ or ‘potential’.
persist/storageType
string
Storage form: file/text, rdb/hive, etc.
summary.dataset/nDataUnit
long
Number of ‘data units’ (e.g., tables) in the dataset.
summary.dataset/sourcedFrom
string
Name of Source, if dataset directly derived from source.
summary.dataset/nUses
long
Number of times Dataset has been transformed.
summary.dataset/lastProcessedBy
string
Who last used the dataset in a transformation (username).
summary.dataset/lastProcessedAt
instant
When the dataset was last used in a transformation.
summary.data/dataUnit
array of embedded structs
Relationship to summary items for each data unit
(DataUnitSummary structs).
For a particular dataset instance, a DatasetSummary structure contains one DataUnitSummary
for each data unit in the dataset.
Name [Type]
Description
summary/entityID
string
The unique identifier of the entity that the summary is for.
summary/entityName
string
The name of the entity that the summary is for.
summary/entityDescription
string
The description of the entity that the summary is for.
summary/entityCreatedAt
instant
The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy
string
The username of the person who created the entity.
summary/entityModifiedAt
instant
The timestamp when the entity was last modified.
summary/entityModifiedBy
string
The username of the person who last modified the entity.
data/structuralForm
string
The structural form of the data contained within the data
container. E.g., “table”.
summary.data/nRow
long
Number of rows in the (2-dimensional) data item
summary.data/nCol
long
Number of columns or fields in the data item
summary.data/sizeBytes
long
Size of the data item, in bytes; null if unknown
summary.data/transformedFrom
string
Name of transformation that emitted the table (if from a
Transformation, not ‘from source’).
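As an illustration of how the DatasetSummary and DataUnitSummary attributes fit together, the following is a minimal sketch that totals rows and bytes across the data units of one summary. It assumes the summary arrives as a JSON-like dictionary keyed by the attribute names listed above; that representation, and the sample values, are assumptions made for illustration only.

```python
from typing import Any, Dict, Optional


def total_rows_and_bytes(dataset_summary: Dict[str, Any]) -> Dict[str, Optional[int]]:
    """Sum nRow and sizeBytes over the embedded DataUnitSummary structs."""
    units = dataset_summary.get("summary.data/dataUnit", [])
    rows = sum(unit.get("summary.data/nRow") or 0 for unit in units)
    sizes = [unit.get("summary.data/sizeBytes") for unit in units]
    # sizeBytes is null when unknown, so the total is only known if every size is.
    total_bytes = None if any(size is None for size in sizes) else sum(sizes)
    return {"nRow": rows, "sizeBytes": total_bytes}


# Hypothetical summary with two data units (tables); values are made up.
example_summary = {
    "summary/entityName": "sales",
    "summary.dataset/nDataUnit": 2,
    "summary.data/dataUnit": [
        {"summary/entityName": "orders", "summary.data/nRow": 100,
         "summary.data/nCol": 5, "summary.data/sizeBytes": 2048},
        {"summary/entityName": "returns", "summary.data/nRow": 50,
         "summary.data/nCol": 4, "summary.data/sizeBytes": None},
    ],
}

totals = total_rows_and_bytes(example_summary)
```

Because one data unit has an unknown size, the sketch reports the byte total as unknown rather than silently undercounting.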
Note on setting Data Unit schemas:
Unlike the Source API, where data units can be saved in the registry without a schema and inherit
the source default schema at runtime, the Dataset API explicitly attaches a schema to each data
unit in datasets saved in the registry. The dataset default schema, if present, will be copied onto
each data unit passed to the API that does not have a schema provided by the client. The copying
of the default schema onto the data unit prevents subsequent modifications to the default schema
from causing problems with processing jobs that were written against a previous version of the
schema. If a request is made to store a data unit without a schema, and there is no dataset default
schema, then an error will be raised. This applies to all API methods that result in creation or
modification of a data unit (POST and PUT).
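The schema rule described in this note can be sketched as follows. This is not Loom's implementation; it only mirrors the described behavior on plain dictionaries. The error class name and the ‘columns’ key inside the schema are hypothetical, while ‘data/structure’ is the attribute listed under DataUnit Attributes.

```python
import copy
from typing import Any, Dict, Optional


class MissingSchemaError(ValueError):
    """Hypothetical error for a data unit that ends up with no schema."""


def attach_schema(data_unit: Dict[str, Any],
                  default_schema: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a data unit that carries its own schema."""
    if data_unit.get("data/structure"):
        return data_unit  # the client supplied a schema; keep it as-is
    if default_schema is None:
        raise MissingSchemaError("no schema on the data unit and no dataset default")
    resolved = dict(data_unit)
    # Copy, do not reference: later edits to the default must not leak in.
    resolved["data/structure"] = [copy.deepcopy(default_schema)]
    return resolved


default = {"data/structuralForm": "table", "columns": ["id", "amount"]}
unit = attach_schema({"entity/name": "orders"}, default)
default["columns"].append("currency")  # the default changes afterwards
```

After the default schema is modified, the data unit still holds the two-column copy it was given at save time, which is the protection for existing processing jobs that the note describes.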
Requests
Request
Description
GET datasets
Get all datasets matching the provided filters.
POST datasets
Create a new dataset, given a local instance.
GET datasets/default
Get a default dataset tied to an existing source, for use when
creating a new dataset.
POST datasets/default
Create a new dataset tied to an existing source, using all
default settings.
GET datasets/summary
Get summaries of all datasets matching the provided filters.
GET datasets/<id>
Get the dataset with the specified entity ID.
PUT datasets/<id>
Replace the dataset with the specified ID with a local instance.
DELETE datasets/<id>
Delete the dataset with the specified entity ID.
GET datasets/<id>/summary
Get the summary of the specified dataset.
POST datasets/<id>/data_units
Add a data unit to an existing dataset.
Request Details
Paths are all relative to the root of the API, including version number.
Get Registered Datasets
Get all datasets matching the provided filters.
Path: datasets
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_dataset_obtained_from: filter on where the dataset was directly obtained from; values of ‘all’ (default), ‘source’, or ‘dataset’.
  filter_ndataunit: filter on the number of data units (tables) in datasets; values ‘all’ (default) or one of several discretized facets
  filter_use_frequency: filter on how often the dataset has been used by processes (in transforms, etc)
Returns: array of Dataset entities
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
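A minimal sketch of composing a filtered ‘GET datasets’ URL is shown below. It assumes the filters are passed as query-string parameters, which this document does not state explicitly; the host and port are hypothetical.

```python
from urllib.parse import urlencode

# Filter names documented for GET datasets.
DATASET_FILTERS = {
    "filter_recency",
    "filter_dataset_obtained_from",
    "filter_ndataunit",
    "filter_use_frequency",
}


def datasets_url(api_root: str, **filters: str) -> str:
    """Build the URL for GET datasets with the given filters."""
    unknown = set(filters) - DATASET_FILTERS
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    url = api_root.rstrip("/") + "/datasets"
    query = urlencode(sorted(filters.items()))  # sorted for a stable URL
    return f"{url}?{query}" if query else url


url = datasets_url(
    "http://localhost:8080/api/v1/",  # hypothetical host and port
    filter_recency="all",
    filter_dataset_obtained_from="source",
)
```

Rejecting unknown filter names on the client side is a design choice of the sketch, not documented server behavior.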
Register (Create) a Dataset
Create a new dataset from a local dataset object. The values returned from ‘GET
/datasets/default’ can be edited and passed into this method to create a new dataset.
Path: datasets
Method: POST
Parameters: local Dataset object
Returns: id: ID of created dataset
Get a Dataset Instance with Default Settings
Get a default dataset instance, for use when creating a new dataset. The values returned from this
method can be edited, and used with ‘POST /datasets’ to create a new dataset in Loom.
Path: datasets/default
Method: GET
Parameters: source_id: the ID of a source registered in Loom; required
Returns: dataset object (no entity/id)
Register (Create) a Dataset using Default Settings
Create a new dataset tied to an existing source, using all default settings. This is a convenience to
combine the calls to ‘GET /datasets/default’ and ‘POST /datasets’ when no modification of the
default dataset is required.
Path: datasets/default
Method: POST
Parameters:
  source_id: the ID of a source registered in Loom; required
  entity: a bundle of core properties (e.g., entity/name, entity/description, etc) to be set on the created Dataset entity; optional (will be obtained from source fields if not present)
Returns: id: ID of created dataset
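The two ways of creating a dataset from a source can be compared with the sketch below, which builds the requests without sending them. JSON request bodies, passing source_id in the body of the POST, and the host and port are all assumptions; the ‘default_dataset’ argument stands in for whatever ‘GET datasets/default’ actually returns.

```python
import json
from typing import Any, Dict, Optional, Tuple
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8080/api/v1"  # hypothetical host and port


def json_request(method: str, path: str,
                 payload: Optional[Dict[str, Any]] = None) -> Request:
    """Build (but do not send) a request; JSON bodies are an assumption."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {} if payload is None else {"Content-Type": "application/json"}
    return Request(f"{BASE}/{path}", data=data, headers=headers, method=method)


def two_step_requests(source_id: str, default_dataset: Dict[str, Any],
                      new_name: str) -> Tuple[Request, Request]:
    """GET datasets/default, edit the returned object, then POST datasets."""
    get_default = json_request(
        "GET", "datasets/default?" + urlencode({"source_id": source_id}))
    edited = dict(default_dataset, **{"entity/name": new_name})
    create = json_request("POST", "datasets", edited)
    return get_default, create


def one_step_request(source_id: str,
                     entity: Optional[Dict[str, Any]] = None) -> Request:
    """POST datasets/default: the shortcut when no edits are needed."""
    payload: Dict[str, Any] = {"source_id": source_id}
    if entity is not None:
        payload["entity"] = entity
    return json_request("POST", "datasets/default", payload)
```

The two-step form is only worth the extra round trip when the default dataset needs to be edited before it is registered.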
Get Dataset Summaries
Get summaries of all datasets matching the provided filters.
Path: datasets/summary
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_dataset_obtained_from: filter on where the dataset was directly obtained from; values of ‘all’ (default), ‘source’, or ‘dataset’.
  filter_ndataunit: filter on the number of data units (tables) in datasets; values ‘all’ (default) or one of several discretized facets
  filter_use_frequency: filter on how often the dataset has been used by processes (in transforms, etc)
Returns: array of DatasetSummary structs
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
Get a Dataset Instance
Get the dataset with the specified entity ID.
Path: datasets/<id>
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: entity: dataset, with embedded DataUnits
Replace a Dataset Instance
Replace the dataset with the specified ID with the new dataset object.
Path: datasets/<id>
Method: PUT
Parameters: the dataset to replace the existing dataset
Returns: id: ID of updated dataset
Delete a Dataset Instance
Delete the dataset with the specified ID.
Path: datasets/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing
Get a Dataset Instance’s Summary
Get the summary of the specified dataset.
Path: datasets/<id>/summary
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: DatasetSummary struct
Create a DataUnit in an Existing Dataset
Creates a new data unit (table) in an existing dataset. There are two overloaded options available,
depending on which of the first parameters is passed in. The first creates a data unit in the dataset
corresponding to a new data unit in the associated (‘sourcedFrom’) source, if the dataset was
created directly from a source. The second is for when a table was added to the dataset through
an external process (as opposed to through a Loom process). The latter option should be used
with great care as incorrect use can result in datasets which are not processable by Loom.
Path: datasets/<id>/data_units
Method: POST
These are the parameters for the first overloaded option:
Parameters:
  source_data_unit_name: the name of a table in the source (pointed to by the dataset’s ‘sourcedFrom’ property) to add to the dataset. Optional; if unspecified, then all ‘new’ tables in the source are added. If this is specified and the dataset does not have a sourcedFrom field set, then an error will be raised.
  schema: a Schema struct, to define the structure of the new data unit (table) columns. Required.
  entity: a set of properties to assign to the created table. Optional; the table name will be derived from location if not specified.
Returns: id: The ID of the data unit
These are the parameters for the second overloaded option:
Parameters:
  location: a relative (or absolute) path to the persisted item, either on disk (hdfs) or in Hive. If a relative path is passed in, it will be used in conjunction with location in the dataset’s Storage to determine the location of the persisted item on disk or in Hive. If an absolute path is passed in, it will be checked for consistency with the location in the dataset’s Storage, and then used as the location of the persisted item. Required.
  format: a Format object, for interpreting the persisted data pointed to by location. Optional; will use the dataset’s Storage format (the ‘default’ format) if not specified; if specified, must be of the same type as the dataset’s Storage format.
  schema: a Schema struct, to define the structure of the new data unit (table) columns. Required.
  entity: a set of properties to assign to the created table. Optional; the table name will be derived from location if not specified.
Returns: id: The ID of the data unit
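The two overloaded options can be told apart by which parameters are present. The sketch below assembles the parameters for either option as a dictionary and checks the documented requirements. Treating the two options as mutually exclusive, and the Python parameter name ‘data_format’ (standing in for ‘format’), are assumptions of the sketch; the schema contents shown are placeholders.

```python
from typing import Any, Dict, Optional


def data_unit_params(schema: Optional[Dict[str, Any]],
                     source_data_unit_name: Optional[str] = None,
                     location: Optional[str] = None,
                     data_format: Optional[Dict[str, Any]] = None,
                     entity: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble parameters for POST datasets/<id>/data_units."""
    if schema is None:
        raise ValueError("schema is required for both options")
    if location is not None and source_data_unit_name is not None:
        # Assumption: the options are alternatives, so only one may be used.
        raise ValueError("pass source_data_unit_name or location, not both")
    if data_format is not None and location is None:
        raise ValueError("format only applies to the location-based option")
    params: Dict[str, Any] = {"schema": schema}
    if location is not None:
        # Second option: a table added by an external process.
        params["location"] = location
        if data_format is not None:
            params["format"] = data_format
    elif source_data_unit_name is not None:
        # First option: a named table from the 'sourcedFrom' source.
        params["source_data_unit_name"] = source_data_unit_name
    if entity is not None:
        params["entity"] = entity
    return params


table_schema = {"data/structuralForm": "table"}  # placeholder schema
from_source = data_unit_params(table_schema, source_data_unit_name="orders")
external = data_unit_params(table_schema, location="2014/08/orders")
```

Omitting both source_data_unit_name and location leaves only the schema, which corresponds to the first option's "all new tables in the source" case.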
___________________________________________
Processes
Processes represent processing performed on Datasets (and, to a lesser degree, on Sources).
Process Entity Attributes
The Process entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
process/processType
string
Type of process, corresponding to the ProcessDefinition.
One of ‘sql-query’, ‘hiveql’, ‘dataset-import’, or user-defined
type.
process/processClass
string
General class of this process. One of ‘transform’, ‘importexport’, ‘descriptive’, or user-defined class.
process/processScope
string
The scope of the process in terms of its data inputs and
outputs.
One of ‘container’ or ‘dataunit’.
process/isExecutable
boolean
True if processes conforming to this definition are executable
process/entityState
string
Indicator of the state the entity is in. One of ‘active’ or ‘deleted’.
process/argument
reference
Configuration arguments defining the process configuration
process/context
reference
Optional default data context for input; can be overridden for
ProcessUse
ProcessUse Attributes
A ProcessUse is a snapshot of a Process at the point in time that process is used. A process that
is executable (isExecutable = TRUE) is ‘used’ by virtue of its being executed, through POST
/execute/transform. A process that is not executable (isExecutable = FALSE) can still be used,
establishing a link between two or more data containers or data units; that is done via the POST
/processes/<process_id>/uses method.
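The choice between the two routes can be sketched as a small helper. It assumes a Process is available as a dictionary keyed by attribute names, with ‘entity/id’ holding its entity ID; both the representation and the sample ID are illustrative.

```python
from typing import Any, Dict


def route_for_use(process: Dict[str, Any]) -> str:
    """Pick the POST route that records a 'use' of the given process."""
    if process.get("process/isExecutable"):
        # Executable processes are 'used' by executing them.
        return "execute/transform"
    # Non-executable processes are linked to data through an explicit use.
    return f"processes/{process['entity/id']}/uses"


executable = {"entity/id": "p-1", "process/isExecutable": True}
descriptive = {"entity/id": "p-2", "process/isExecutable": False}
routes = [route_for_use(executable), route_for_use(descriptive)]
```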
The ProcessUse entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
process/process
reference
Reference (pointer via UUID) to the Process that this is a use
of.
process/processClass
string
General class of this process. One of ‘transform’, ‘importexport’, ‘descriptive’, or user-defined class.
process/isExecuted
boolean
True if the process conforming to this definition was executed.
(Corresponds to isExecutable field of corresponding Process).
process/jobIdentifer
string
Stringified identifier of Job that was spawned, for informational
purposes. Not a formal UUID-reference, to avoid bi-directional
dependency.
process/argument
reference
Configuration arguments defining the process configuration.
process/context
reference
The data contexts for input and output.
Argument/ConfigArgument Attributes
Processes and ProcessUses contain one or more arguments. An Argument is a structure; it is
fully-owned by the Process or ProcessUse it is attached to, and does not have a unique identifier.
A ConfigArgument is an Argument with a value.
Name [Type]
Description
entity/name
string
The name of the argument.
process.param/index
long
The index of the argument (1, 2, etc).
process.arg/value
string
The value of the argument, corresponding to the
parameter's valueType
Context Attributes
Processes and ProcessUses contain one or more contexts. A Context is a structure; it is fully-owned
by the Process or ProcessUse it is attached to, and does not have a unique identifier.
Name [Type]
Description
entity/name
string
The name of the context.
process.context/inout
string
Whether the context is an input or an output of the process.
process.context/container
reference
Reference (pointer via UUID) of the data container that
the context is for.
process.context/dataUnitName
string
Name of the data unit in the data container. May be null
(missing) for container-level contexts used with
Processes with processScope = ‘container’.
Contexts associated with Processes (or their ProcessUses) which are container-level
(processScope = ‘container’) will not have a dataUnitName set.
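A minimal sketch of building Context structures that respect the processScope rule follows. The ‘input’ and ‘output’ values for process.context/inout are hypothetical, since the allowed values are not listed here, and requiring a data unit name for ‘dataunit’ scope is an assumption of the sketch.

```python
from typing import Any, Dict, Optional


def make_context(name: str, inout: str, container_id: str, process_scope: str,
                 data_unit_name: Optional[str] = None) -> Dict[str, Any]:
    """Build one Context struct for a Process or ProcessUse."""
    if process_scope not in ("container", "dataunit"):
        raise ValueError(f"unknown processScope: {process_scope}")
    context: Dict[str, Any] = {
        "entity/name": name,
        "process.context/inout": inout,  # hypothetical values: 'input'/'output'
        "process.context/container": container_id,
    }
    if process_scope == "dataunit":
        if not data_unit_name:
            raise ValueError("dataunit-scoped contexts need a data unit name")
        context["process.context/dataUnitName"] = data_unit_name
    # Container-level contexts deliberately carry no dataUnitName.
    return context


table_ctx = make_context("in1", "input", "ds-1", "dataunit", "orders")
container_ctx = make_context("out1", "output", "ds-2", "container", "ignored")
```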
Process Summary Attributes
The ProcessSummary structure captures a ‘view’ of a Process entity, pulling in information from
related entities (such as scan measurements). Instances of these structs are returned from the
‘summary’ API methods.
Name [Type]
Description
summary/entityID
string
The unique identifier of the entity that the summary is for.
summary/entityName
string
The name of the entity that the summary is for.
summary/entityCreatedAt
instant
The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy
string
The username of the person who created the entity.
summary/entityModifiedAt
instant
The timestamp when the entity was last modified.
summary/entityModifiedBy
string
The username of the person who last modified the entity.
process/processType
string
Type of process, corresponding to the ProcessDefinition. One of
‘sql-query’, ‘hiveql’, ‘dataset-import’, or user-defined type.
process/processClass
string
General class of this process. One of ‘transform’, ‘importexport’, ‘descriptive’, or user-defined class.
process/processScope
string
The scope of the process in terms of its data inputs and outputs.
One of ‘container’ or ‘dataunit’.
process/isExecutable
boolean
True if processes conforming to this definition are executable.
summary.process/nUses
long
Number of times the process has been used/executed.
summary.process/lastUse
reference (uuid)
The entity ID of the last process use.
summary.process/lastUsedBy
string
Who last used the process (username).
summary.process/lastUsedAt
instant
When the process was last used (e.g., executed).
Requests
Request
Description
GET processes
Get all processes matching the provided filters.
POST processes
Create a new process, given a local instance.
GET processes/default
Get a default process, for use when creating a new process.
GET processes/summary
Get summaries of all processes matching the provided
filters.
GET processes/<id>
Get the process with the specified entity ID.
PUT processes/<id>
Replace the process with the specified ID with local
instance.
DELETE processes/<id>
Delete the process with the specified entity ID.
GET processes/<id>/summary
Get the summary of the specified process.
GET processes/<id>/uses
Get all the ‘uses’ of the specified process.
POST processes/<id>/uses
Create a new ‘use’ of the specified process.
GET processes/<id>/uses/<pu_id>
Get a specific process use for a specific process.
Request Details
Paths are all relative to the root of the API, including version number.
Get Registered Processes
Get all processes matching the provided filters.
Path: processes
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_process_class: filter on the process class; values ‘all’ (default) or one of the valid values of process/processClass
  filter_executable: filter on whether executable or not; values ‘all’ (default), true, false
  filter_use_frequency: filter on how often the process has been used
Returns: array of Process entities
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
Register (Create) a Process
Create a new process, given entity metadata and storage information. The values returned from
‘GET /processes/default’ can be edited and then passed into this method to create a new process.
Path: processes
Method: POST
Parameters: local instance of process
Returns: id: ID of created process
Get a Process Instance with Default Settings
Get a default process instance, for use when creating a new process. The values returned from this
method can be edited, and used with ‘POST /processes’ to create a new process in Loom.
Path: processes/default
Method: GET
Parameters:
  container_id: the ID of a dataset or source registered in Loom
  data_unit_name: the name of a data unit in the container
Returns: process object (no entity/id)
Get Process Summaries
Get summaries of all processes matching the provided filters.
Path: processes/summary
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_process_class: filter on the process class; values ‘all’ (default) or one of the valid values of process/processClass
  filter_executable: filter on whether executable or not; values ‘all’ (default), true, false
  filter_use_frequency: filter on how often the process has been used
Returns: array of ProcessSummary structs
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
Get a Process Instance
Get the process with the specified entity ID.
Path: processes/<id>
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: entity: process entity
Replace a Process Instance
Replace the process with the specified ID with the new process object.
Path: processes/<id>
Method: PUT
Parameters: the process to replace the existing one
Returns: id: ID of process (same as on input)
Delete a Process Instance
Delete the process with the specified ID.
Path: processes/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing
Get a Process Instance’s Summary
Get the summary of the specified process.
Path: processes/<id>/summary
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: ProcessSummary struct
Create a Process Use
Creates a unique ‘use’ of the specified process. A ProcessUse is a snapshot of a Process at the
time it was used; this ensures the viability of lineage calculations, as ProcessUses are immutable
(whereas Processes are mutable).
Path: processes/<id>/uses
Method: POST
Parameters: contexts: The input and output contexts for this particular use.
Returns: id: The ID of the ProcessUse
Get a Process’s Uses
Get the ‘uses’ of the specified process. This returns a set of ProcessUse instances. A ProcessUse
is a snapshot of a Process at the time it was used; this ensures the viability of lineage calculations,
as ProcessUses are immutable (whereas Processes are mutable).
Path: processes/<id>/uses
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: array of ProcessUse entities
Get a Process Use
Get a specific process use for a specific process.
Path: processes/<id>/uses/<pu_id>
Method: GET
Parameters: none, except for the process entity/id and the process use ID (<pu_id>) that are built into the URL
Returns: ProcessUse
___________________________________________
Jobs
Jobs represent asynchronous activity performed in the system and tracked by Loom. Jobs are linked
to executable Processes.
Job Entity Attributes
Name [Type]
Description
job/processUse
uuid
Pointer to the ProcessUse the job is tracking execution of.
job/process
uuid
Pointer to the Process that the ProcessUse was created from.
job/status
string
Current status of the job execution. One of created, started, inprogress, completed, failed, cancelled, timed-out.
job/executedAt
instant
When the job was initiated. Should be equal to or greater than
the value of entity/createdAt.
job/errorMessage
string
Error message, if an error occurs (status == failed).
job/jobTrackerID
string
Hadoop job tracker ID, if available.
job/jobLog
string
Log associated with the job, if available.
job/progress
embedded struct
Job progress information. Contains a single JobProgress if
available.
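Because jobs are asynchronous, a client typically polls ‘GET jobs/<id>’ until job/status stops changing. The sketch below shows one way to do that with an injectable fetch function so it can run without a server. Treating completed, failed, cancelled, and timed-out as the final statuses is an assumption based on their names, and ‘fetch_job’ is a hypothetical callable standing in for the HTTP call.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal statuses; the others (created, started, inprogress)
# are treated as still running.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "timed-out"}


def wait_for_job(fetch_job: Callable[[str], Dict[str, Any]], job_id: str,
                 interval: float = 1.0, max_polls: int = 60,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal status, then return it."""
    for _ in range(max_polls):
        job = fetch_job(job_id)
        if job["job/status"] in TERMINAL_STATUSES:
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")


# Stand-in for GET jobs/<id>: returns a scripted sequence of statuses.
scripted = iter(["created", "started", "inprogress", "completed"])
final = wait_for_job(lambda job_id: {"job/status": next(scripted)},
                     "job-1", sleep=lambda seconds: None)
```

When the returned status is failed, job/errorMessage is the documented place to look for the cause.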
JobProgress Attributes
A JobProgress is a structure that captures progress information related to a job. It can be
embedded in a Job instance, if the Job is completed, or obtained independently, if a Job is still in
progress.
Name [Type]
Description
job.progress/state
string
Internal state of the processing engine. Corresponds roughly to
the Job’s status field.
job.progress/startedAt
instant
When processing began by the processing engine. Should be
equal to or greater than the Job’s entity/createdAt.
job.progress/finishedAt
instant
When processing was completed by the processing engine.
Should be equal to or greater than the value of startedAt.
job.progress/duration
long
Duration of the job so far, in milliseconds. duration =
(finishedAt - startedAt).
job.progress/cpuTime
long
Total CPU time used so far for the job, in milliseconds.
job.progress/stepMetrics
embedded struct
Job progress metrics related to job steps. JobStepMetrics
struct.
job.progress/dataMetrics
embedded struct
Job progress metrics related to data read and written.
JobDataMetrics struct.
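The duration relationship stated for job.progress/duration can be worked through in a few lines. The sketch below takes timestamps as Python datetime objects, since the wire format of the ‘instant’ type is not described here; using the current time for a job that has not finished is an assumption about what "so far" means.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def duration_ms(started_at: datetime, finished_at: Optional[datetime] = None,
                now: Optional[datetime] = None) -> int:
    """Duration in milliseconds: finishedAt - startedAt, or elapsed so far."""
    if finished_at is not None:
        end = finished_at
    else:
        end = now if now is not None else datetime.now(timezone.utc)
    if end < started_at:
        raise ValueError("end time must be equal to or greater than startedAt")
    # Integer division by one millisecond avoids floating-point rounding.
    return (end - started_at) // timedelta(milliseconds=1)


started = datetime(2014, 8, 1, 12, 0, 0, tzinfo=timezone.utc)
finished = started + timedelta(seconds=90, milliseconds=500)
elapsed = duration_ms(started, finished)
```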
JobStepMetrics Attributes
A JobStepMetrics is a structure that captures low-level details about a Job’s execution and
progress. It is always embedded in a JobProgress.
Name [Type]
Description
job.progress.metrics/stepCount
long
Number of steps.
job.progress.metrics/stepsFailed
long
Number of steps that failed.
job.progress.metrics/stepsPending
long
Number of steps that have not run yet.
job.progress.metrics/stepsRunning
long
Number of steps currently running.
job.progress.metrics/stepsSkipped
long
Number of steps skipped.
job.progress.metrics/stepsStarted
long
Number of steps that have been started.
job.progress.metrics/stepsStopped
long
Number of steps that have been stopped.
job.progress.metrics/stepsSubmitted
long
Number of steps submitted.
job.progress.metrics/stepsSuccessful
long
Number of steps completed successfully.
JobDataMetrics Attributes
A JobDataMetrics is a structure that captures low-level details about a Job’s execution and
progress. It is always embedded in a JobProgress.
Name [Type]
Description
job.progress.metrics/tuplesRead
long
The number of tuples read so far.
job.progress.metrics/tuplesWritten
long
The number of tuples written so far.
Job Summary Attributes
The JobSummary structure captures a ‘view’ of a Job entity, pulling in information from related
entities (such as the Process that was executed to spawn the job). Instances of these structs are
returned from the ‘summary’ API methods.
Name [Type]
Description
summary/entityID
string
The unique identifier of the entity that the summary is for.
summary/entityName
string
The name of the entity that the summary is for.
summary/entityCreatedAt
instant
The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy
string
The username of the person who created the entity.
summary/entityModifiedAt
instant
The timestamp when the entity was last modified.
summary/entityModifiedBy
string
The username of the person who last modified the entity.
job/status
string
Current status of the job execution.
job.progress/startedAt
instant
When the execution engine started processing the job.
job.progress/finishedAt
instant
When the execution engine finished processing the job.
job.progress/duration
long
The duration of the job so far; milliseconds.
(Equals finishedAt - startedAt when job completed).
process/processType
string
Type of process, corresponding to the ProcessDefinition. One
of ‘sql-query’, ‘hiveql’, ‘dataset-import’, or user-defined type.
process/processClass
string
General class of this process. One of ‘transform’, ‘importexport’, ‘descriptive’, or user-defined class.
job/jobTrackerID
string
Hadoop Job Tracker ID, if applicable
job/errorMessage
string
Error message, if an error occurs (status == failed).
job/processUse
reference (uuid)
Pointer to the ProcessUse the job is tracking progress of.
summary.job/processID
reference (uuid)
Pointer to the Process that the ProcessUse is a snapshot of.
summary.job/processName
string
Name of the Process (from Process entity/name).
Requests
Request
Description
GET jobs
Get all jobs matching the provided filters.
GET jobs/summary
Get summaries of all jobs matching the provided filters.
GET jobs/<id>
Get the job with the specified entity ID.
PUT jobs/<id>
Replace the job with the specified ID with the new job object.
GET jobs/<id>/summary
Get the summary of the specified job.
Request Details
Paths are all relative to the root of the API, including version number.
Get Registered Jobs
Get all jobs matching the provided filters.
Path: jobs
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_job_status: filter on the job status; values ‘all’ (default) or one of the valid values of job/status
  filter_job_duration: filter on job duration
Returns: array of Job entities
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
Get Job Summaries
Get summaries of all jobs matching the provided filters.
Path: jobs/summary
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_job_status: filter on the job status; values ‘all’ (default) or one of the valid values of job/status
  filter_job_duration: filter on job duration
Returns: array of JobSummary structs
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.
Get a Job Instance
Get the job with the specified entity ID.
Path: jobs/<id>
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: entity: job entity
Replace a Job Instance
Replace the job with the specified ID with the new job object. This is used to update a job’s core
metadata (name, description, folder, tags); it cannot be used to modify job progress or metrics
information.
Path: jobs/<id>
Method: PUT
Parameters: the job to replace the existing one
Returns: id: ID of job (same as on input)
Get a Job Instance’s Summary
Get the summary of the specified job.
Path: jobs/<id>/summary
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: JobSummary struct
___________________________________________
Users
Users represent users (and in the future, groups) who use Loom.
See also the Connection methods for logging into Loom.
User Entity Attributes
The User entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
user/username
string
User’s unique system user name.
user/password
string
User’s password, in hashed form.
user/email
string
User’s email address.
Requests
Request
Description
POST users
Create a new user, given a local instance. Returns entity ID.
PATCH users/<id>
Modify user’s password and/or email address.
Request Details
New User
Creates a user account, and logs in to that account.
Path: users
Method: POST
Parameters: body: User object to be saved
Returns: id: the entity ID for the created user entity.
Update User
Modifies the specified user account (e.g., password, email, etc).
Path: users/<id>
Method: PATCH
Parameters:
  password: current password, if changing
  body: attributes from User entity, excluding username
Returns: id: the entity ID for the modified user entity.
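A hedged sketch of preparing an ‘Update User’ request is given below. It builds the request without sending it and rejects attributes that the method does not modify. A JSON body and the host and port are assumptions. How the current password parameter is transmitted is not specified here, so the sketch leaves it out and only shows an email change.

```python
import json
from typing import Any, Dict
from urllib.request import Request

BASE = "http://localhost:8080/api/v1"  # hypothetical host and port

# PATCH users/<id> modifies the password and/or email address only.
MODIFIABLE = {"user/password", "user/email"}


def update_user_request(user_id: str, changes: Dict[str, Any]) -> Request:
    """Build (but do not send) a PATCH users/<id> request."""
    rejected = set(changes) - MODIFIABLE
    if rejected:
        raise ValueError(f"cannot modify: {sorted(rejected)}")
    body = json.dumps(changes).encode("utf-8")  # JSON body is an assumption
    return Request(f"{BASE}/users/{user_id}", data=body,
                   headers={"Content-Type": "application/json"},
                   method="PATCH")


request = update_user_request("u-42", {"user/email": "analyst@example.com"})
```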
___________________________________________
Glossaries
Glossary Entity Attributes
A glossary is a container that holds terms. In addition, terms in a glossary can be organized by
‘subject areas’. So, a glossary is also a container of subject areas.
The Glossary entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
glossary/businessUnit
string
The business unit that owns or manages the glossary.
glossary/steward
string
The name of the steward who maintains and manages the
glossary.
glossary/namespace
namespace/Namespace
The set of namespaces the glossary imports. Used for RDF
interoperability.
glossary/import
array of
namespace/Namespace
The set of namespaces the glossary imports. Used for RDF
interoperability.
glossary/subjectArea
array of references
The subject areas in the glossary. Each subject area is owned
by the Glossary. Each term is associated with exactly one of
these.
glossary/term
array of references
The terms in a glossary. Each term is owned by the Glossary,
but is considered to be 'contained' within a single subject area.
SubjectArea Attributes
A glossary is a container that holds terms. In addition, terms in a glossary can be organized by
‘subject areas’. So, a glossary is also a container of subject areas. Subject areas are owned by
their containing glossary; if the glossary is deleted, the subject areas within it are deleted also.
Note that terms are not ‘contained’ by subject areas; they reference them through a property.
The SubjectArea entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
glossary/steward [string]
The name of the steward who maintains and manages the subject area in the glossary.
glossary/namespace [namespace/Namespace]
The namespace of the subject area. Used for RDF interoperability.
glossary/import [array of namespace/Namespace]
The set of namespaces the subject area imports. Used for RDF interoperability.
Term Attributes
A glossary is a container that holds terms. Terms are owned by their containing glossary; if the
glossary is deleted, the terms within it are deleted also.
Terms can reference a subject area in the glossary. However, terms are not ‘contained’ by subject
areas; they reference them through a property.
The Term entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
glossary/acronym [string]
An acronym by which the term may be known. May be the same as entity name or label.
glossary/alternateName [array of string]
Alternate names by which the term may be known.
glossary/longDescription [string]
A verbose description of the term. (entity/description is treated as a short description).
glossary/subjectAreaRef [string]
Reference to a SubjectArea within the glossary, by name.
glossary/termStatusRef [string]
Reference to a TermStatus within the glossary, by name.
glossary/term [array of references]
The terms in a glossary. Each term is owned by the Glossary, but is considered to be
'contained' within a single subject area.
There are sub-types of terms -- SimpleTerms and (in the future) CompositeTerms.
TermStatus Attributes
A glossary term can have a status, indicating some level of ‘governance’ of the term. Typically,
term governance will be managed by a glossary steward.
The TermStatus struct has the following attributes, in addition to the core entity attributes such as
entity/name, entity/label, and entity/description.
Name [Type]
Description
glossary/statusIndex [long]
Index of the status, for ordering.
Namespace Attributes
Glossaries and subject areas can be organized by namespaces. This is for the purposes of RDF
interoperability.
The Namespace struct has the following attributes, in addition to the core entity attributes such as
entity/name, entity/label, and entity/description.
Name [Type]
Description
namespace/namespaceUri [string]
The URI of the namespace for RDF interoperability.
namespace/namespacePrefix [string]
The prefix of the namespace for RDF interoperability.
Requests
Request
Description
GET glossaries
Get all glossaries matching the provided filters.
POST glossaries
Create a new glossary, given glossary metadata.
GET glossaries/<id>
Get the glossary with the specified entity ID.
PATCH glossaries/<id>
Update the glossary with the specified entity ID.
DELETE glossaries/<id>
Delete the glossary with the specified entity ID.
GET glossaries/<gid>/areas
Get all glossary subject areas matching the filters.
POST glossaries/<gid>/areas
Create a new subject area in a glossary.
GET glossaries/<gid>/areas/<id>
Get the subject area with the specified entity ID.
PATCH glossaries/<gid>/areas/<id>
Update the subject area with the specified entity ID.
DELETE glossaries/<gid>/areas/<id>
Delete the subject area with the specified entity ID.
GET glossaries/<gid>/terms
Get all glossary terms matching the filters.
POST glossaries/<gid>/terms
Create a new term in a glossary.
GET glossaries/<gid>/terms/<id>
Get the term with the specified entity ID.
PATCH glossaries/<gid>/terms/<id>
Update the term with the specified entity ID.
DELETE glossaries/<gid>/terms/<id>
Delete the term with the specified entity ID.
Request Details
Note that this API is ‘flat’ with respect to the containment hierarchy of a glossary. That is,
because glossary terms have unique entity IDs, they can be referenced directly, without
referencing the containing subject area. Similarly, subject areas have unique IDs and can be
referenced directly. (The containing glossary is still included in API methods, to be consistent
with other parts of the Loom API and to allow for retrieval by name in addition to by unique ID.)
Also note that while Terms are ‘contained’ within SubjectAreas in terms of the formal data model,
they are serialized as siblings. The ‘whole-part’ relationship is defined indirectly: the
subjectAreaRef property of a Term points to its associated SubjectArea using the name (or
namespace prefix) of the SubjectArea.
Register (Create) a Glossary
Create a new glossary with a set of descriptive properties.
Path: glossaries
Method: POST
Parameters:
entity: properties to assign to the new glossary; must include glossary/namespace if
the namespace parameter is not specified, and may contain glossary/import if the
import parameter is not specified.
namespace: the namespace to assign to the glossary entity; optional, if there is a
glossary/namespace property defined within the entity that is passed in.
imports: the namespaces of other glossaries (or subject areas) to import, so that
terms within the glossary can reference terms in other glossaries; optional, if a
glossary/import property is defined within the entity that is passed in.
Returns:
ID of the new glossary entity.
Errors:
1. If the glossary name (entity/name) matches that of another glossary in Loom.
2. If the glossary’s display name (entity/label) matches that of another glossary in
Loom. (Would be confusing in user interfaces).
3. If the glossary’s namespace prefix or URI match those of another glossary in
Loom. (In reality, the prefix does not have to be unique; this is a convenience, or
sloppiness, initially.)
Notes:
1. The entity parameter is required, and must have at least the entity/name
specified. If the namespace parameter is not specified, the entity parameter must
include the glossary/namespace property. The entity/type need not be specified.
2. The glossary name (entity/name), display name (entity/label), namespace prefix,
and namespace URI must all be unique within the context of the Loom registry.
See Also:
GET glossaries: to get all defined glossaries
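The parameter rules for registering a glossary can be sketched in Python. This is an illustration only, not part of Loom: the host and port in BASE_URL are placeholders, the function names are invented for this example, and sending the parameters as a JSON body is an assumption, since this document does not state how parameters are encoded. The request is built but never sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/"  # hypothetical host and port


def validate_new_glossary(entity, namespace=None):
    """Return a list of problems with the arguments, per the Notes above."""
    problems = []
    if not entity or not entity.get("entity/name"):
        problems.append("entity/name is required")
    if namespace is None and "glossary/namespace" not in (entity or {}):
        problems.append("namespace must be given as a parameter or as glossary/namespace")
    return problems


def build_create_glossary_request(entity, namespace=None, imports=None):
    """Build (but do not send) a POST to the glossaries route."""
    problems = validate_new_glossary(entity, namespace)
    if problems:
        raise ValueError("; ".join(problems))
    payload = {"entity": entity}
    if namespace is not None:
        payload["namespace"] = namespace
    if imports is not None:
        payload["imports"] = imports
    # Assumption: parameters travel as a JSON body; the document does not say.
    return urllib.request.Request(
        BASE_URL + "glossaries",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Uniqueness of the name, label, and namespace can only be checked by the server, so the sketch validates just the rules that are local to a single request.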
Get Registered Glossaries
Get all glossaries registered with Loom. All contained subject areas are also returned. However,
contained terms are not returned. (The focus of this method is to retrieve the ‘organizational
scheme’ consisting of glossaries and subject areas).
Path: glossaries
Method: GET
Parameters: (none)
Returns:
array of Glossary entities, with contained SubjectArea entities.
Notes:
1. When no glossaries are defined, an empty array is returned.
See Also:
POST glossaries: to create a new glossary
GET glossaries/<id>: to get a specific glossary
Get a Glossary
Get the glossary with the specified entity ID. The returned instance will include all subject areas in
the glossary. It will include all terms in the glossary (linked to their respective subject areas) only if
requested.
Path: glossaries/<id>
Method: GET
Parameters:
include_terms: If true, the glossary terms will be included in the response; default
is false.
Returns:
a Glossary entity, with contained SubjectArea entities; contained terms are also
included if the include_terms parameter is true.
Notes:
1. The ‘related’ section should contain the following sections:
● contentSummary, with counts of the total number of terms (‘nTerm’) and the
total number of subject areas (‘nSubjectArea’).
● governanceSummary, an array of structures closely aligned with the actual
TermStatus instances (with their name for identification, label for display,
description for tooltip, and index for ordering), along with a "count" field with
the number of terms in the glossary with that status. See the attached JSON
example. The array elements should be ordered in the same order as the
glossary/statusIndex values.
See Also:
POST glossaries: to create a new glossary
GET glossaries: to get all glossaries
GET glossaries/statuses: to get all term statuses
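The shape of governanceSummary described in the Notes can be illustrated with a small Python sketch that derives it from term statuses and terms held in memory. Only the "count" field name is quoted in this document; the other output keys ("name", "label", "description", "index") and the function name are assumptions made for this example, and the server's actual JSON may differ.

```python
from collections import Counter


def governance_summary(statuses, terms):
    """Build one summary row per TermStatus, ordered by glossary/statusIndex."""
    # Count terms by the status name they reference.
    counts = Counter(t.get("glossary/termStatusRef") for t in terms)
    ordered = sorted(statuses, key=lambda s: s["glossary/statusIndex"])
    return [
        {
            "name": s["entity/name"],                          # identification
            "label": s.get("entity/label", s["entity/name"]),  # display
            "description": s.get("entity/description", ""),    # tooltip
            "index": s["glossary/statusIndex"],                # ordering
            "count": counts.get(s["entity/name"], 0),
        }
        for s in ordered
    ]
```

Statuses that no term references still appear, with a count of zero, so a client can render the full set of governance levels.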
Update a Glossary
Update the properties of the glossary with the specified ID, with the new or updated attributes
specified. This does not change the contents of the glossary: the subject areas and terms.
Path: glossaries/<id>
Method: PATCH
Parameters:
entity: a set of glossary properties (e.g., entity/name, entity/description,
glossary/import, etc), which will replace the existing glossary properties; may
include glossary/namespace if the namespace parameter is not specified, and may
contain glossary/import if the import parameter is not specified.
namespace: the namespace to assign to the glossary entity; optional, if there is a
glossary/namespace property defined within the entity that is passed in, or if the
namespace is not to be changed.
imports: the namespaces of other glossaries (or subject areas) to import, so that
terms within the glossary can reference terms in other glossaries; optional, if a
glossary/import property is defined within the entity that is passed in, or if the imports
are not to be changed.
Returns:
ID of the updated glossary entity.
Errors:
1. If the glossary’s display name (entity/label) matches that of another glossary in
Loom. (Would be confusing in user interfaces).
2. If the glossary’s namespace URI matches that of another glossary in Loom.
Notes:
1. Changing the entity name or the namespace prefix is not allowed.
See Also:
POST glossaries: to create a new glossary instance
GET glossaries/<id>: to get a specific glossary instance
GET glossaries: to get all defined glossaries
Delete a Glossary
Delete the glossary with the specified ID. This will delete all the subject areas and all the terms in
the glossary.
Path: glossaries/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing
___________________________________________
Add a Subject Area to a Glossary
Create a new subject area within a glossary. The subject area will be owned by the glossary; if the
glossary is deleted, the subject area will be deleted also.
Path: glossaries/<glossary_id>/areas
Method: POST
Parameters:
entity: properties to assign to the new subject area; must include
glossary/namespace if the namespace parameter is not specified, and may contain
glossary/import if the import parameter is not specified.
namespace: the namespace to assign to the subject area entity; optional, if there is
a glossary/namespace property defined within the entity that is passed in.
imports: the namespaces of other glossaries (or subject areas) to import, so that
terms associated with the subject area can reference terms in other glossaries;
optional, if a glossary/import property is defined within the entity that is passed in.
Returns:
ID of the new SubjectArea entity.
Errors:
1. If the subject area name (entity/name) matches that of another subject area in the
glossary.
2. If the subject area’s display name (entity/label) matches that of another subject
area in the glossary. (Would be confusing in user interfaces).
3. If the subject area’s namespace prefix or URI match those of another subject area
in the glossary.
Notes:
1. The entity parameter is required, and must have at least the entity/name
specified. If the namespace parameter is not specified, the entity parameter must
include the glossary/namespace property. The entity/type need not be specified.
2. Subject areas are owned by their containing glossary; if the glossary is deleted,
the subject area will be deleted also.
3. The subject area name (entity/name), display name (entity/label), namespace
prefix, and namespace URI must all be unique within the context of the glossary.
See Also:
GET glossaries/<glossary_id>/areas: to get all subject areas in the glossary
Get all Subject Areas in a Glossary
Get all subject areas within a specified glossary. The terms associated with each subject area are
not returned by this method.
Path: glossaries/<glossary_id>/areas
Method: GET
Parameters: (none)
Returns:
an array of SubjectArea entities.
Notes:
1. When no subject areas are defined, an empty array is returned.
See Also:
POST glossaries/<glossary_id>/areas: to add a new subject area to a glossary
GET glossaries/<glossary_id>/areas/<id>: to get a specific subject area
Get a Subject Area and its Terms
Get a specific subject area from the glossary. The returned instance will include the subject area
and all its terms.
Path: glossaries/<glossary_id>/areas/<id>
Method: GET
Parameters: (none)
Returns:
a SubjectArea entity under “subjectArea”, and an array of Terms under “terms”.
Errors:
1. If the ID does not match the entity/id of a SubjectArea in the glossary.
Notes:
1. When a subject area that doesn’t have any terms (such as one newly created) is
retrieved, there should be no terms returned in the results.
See Also:
POST glossaries/<glossary_id>/areas: to add a new subject area to a glossary
GET glossaries/<glossary_id>/areas: to get all the subject areas in the glossary
Update a Subject Area in a Glossary
Update a glossary subject area with the specified ID, with the new or updated attributes specified.
This does not change the terms associated with the subject area.
This method cannot be used to rename the subject area, or to change its namespace. (Renaming
will not be supported in Loom 2.2, although changing the display name will be permitted).
Path: glossaries/<glossary_id>/areas/<id>
Method: PATCH
Parameters:
entity: a set of subject area properties (e.g., entity/name, entity/description,
glossary/import, etc), which will replace the existing subject area properties; may
include glossary/namespace if the namespace parameter is not specified, and may
contain glossary/import if the import parameter is not specified.
namespace: the namespace to assign to the subject area entity; optional, if there is
a glossary/namespace property defined within the entity that is passed in, or if the
namespace is not to be changed.
imports: the namespaces of other glossaries (or subject areas) to import, so that
terms within the subject area can reference terms in other glossaries; optional, if a
glossary/import property is defined within the entity that is passed in, or if the imports
are not to be changed.
Returns:
ID of the updated subject area entity.
Errors:
1. If the subject area’s display name (entity/label) matches that of another subject
area in the glossary. (Would be confusing in user interfaces).
2. If the subject area’s namespace URI matches that of another subject area in the
glossary.
Notes:
1. The name and namespace prefix cannot be modified by this method. That is
because Terms reference their associated subject areas using these properties.
Delete a Subject Area from a Glossary
Delete the subject area with the specified ID from the glossary. Depending on the
term_disposition parameter, this either deletes all the terms associated with the subject area
or leaves them in the glossary with no subject area.
Path: glossaries/<glossary_id>/areas/<id>
Method: DELETE
Parameters:
term_disposition: Indicates how to deal with terms that are associated with the
subject area being deleted. Options are ‘delete’, which will delete the terms; or
‘leave’, which will leave the terms, but set their subject area to null.
Returns: nothing
Errors:
1. Calling this for an entity that does not exist will yield an error.
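The effect of the two term_disposition options can be illustrated with a short Python sketch that works on an in-memory list of terms. It only mimics the documented behavior and is not how Loom implements it; the function name is invented for this example, and terms are represented as plain dictionaries keyed by the attribute names used in this document.

```python
def delete_subject_area(terms, area_name, term_disposition):
    """Return the glossary's terms after the named subject area is deleted."""
    if term_disposition not in ("delete", "leave"):
        raise ValueError("term_disposition must be 'delete' or 'leave'")
    remaining = []
    for term in terms:
        if term.get("glossary/subjectAreaRef") != area_name:
            remaining.append(term)  # term is in another area: untouched
        elif term_disposition == "leave":
            # Keep the term but clear its subject area reference.
            remaining.append({**term, "glossary/subjectAreaRef": None})
        # With 'delete', the associated term is dropped.
    return remaining
```

With ‘leave’, the surviving terms end up with no subject area reference, which corresponds to terms associated with the glossary directly.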
___________________________________________
Add a Term to a Glossary
Create a new term in a glossary. Typically, a term will be associated with a subject area in the
glossary. The term will be owned by the glossary; if the glossary is deleted, the term will be
deleted also. The term is tightly tied to, but not owned by, the associated subject area if one is
defined; if the associated subject area is deleted, it is optional to also delete all the associated
terms.
Path: glossaries/<glossary_id>/terms
Method: POST
Parameters:
entity: Properties to assign to the new term. E.g., entity/name (required),
entity/label, entity/description, etc. May contain a glossary/subjectAreaRef property,
if the area_ref parameter is not specified, whose value must match the name of a
subject area in the glossary. May contain a glossary/termStatusRef property, if the
status_ref parameter is not specified, whose value must match the name of a
termStatus in Loom. May contain glossary/property array, if the
composite_properties parameter is not specified, which denotes the term is a
composite, and specifies the properties to embed in the composite (may be null if no
properties specified for a composite upon create).
composite_properties: An array of business properties for a composite term.
Optional if the glossary/property property is specified or if the term is a simple term.
It is acceptable that this parameter (or the glossary/property property) is specified
without a value, indicating the term is composite but that it does not currently have
any properties.
area_ref: The name of the subject area the term will be in. Optional, if the
glossary/subjectAreaRef property is specified as part of the entity parameter.
status_ref: The name of the term status that will be assigned to the term. Optional,
if the glossary/termStatusRef property is specified as part of the entity parameter.
Returns:
ID of the new Term entity.
Errors:
1. If the combination of the term name (entity/name) and subject area matches that
of another term in the glossary (i.e., same name in same or ‘default’ subject area).
2. If the term display name (entity/label) matches that of another term in the
glossary. (That would be confusing for UI displays).
3. If the glossary/termStatusRef has a value that does not match the name
(entity/name) of a valid TermStatus in the registry.
4. If the area_ref or the glossary/subjectAreaRef property has a value that does not
match the name of a SubjectArea in the glossary.
5. If an attempt is made to create a new term without a valid technical name
(entity/name).
6. If an attempt is made to add a term with the same combination of technical name and
subject area as another term in the glossary.
Notes:
1. If not explicitly specified using entity/type, the type is inferred from the presence or
absence of the ‘glossary/property’ property or ‘composite_properties’ parameter
(either of which may have value ‘null’ (or missing for parameter), which indicates the
term is a composite with no properties defined (currently)).
2. Terms are owned by their containing glossary; if the glossary is deleted, the terms
will be deleted also.
3. Terms are not owned by their associated subject areas; if the subject area is
deleted, it is optional whether or not to delete the associated terms.
4. The entity parameter is required, and must have at least the entity/name
specified. The entity/type need not be specified.
5. If the display name (entity/label) is not specified upon input, then the entity name
should be copied into the entity/label field for use as the display name.
6. Term names within a given SubjectArea must be unique. Term names in different
SubjectAreas can be the same, as their qualified names (namespace:name) will be
unique.
7. Term display names (entity/label) must be unique within the entire glossary. This
is more conservative than the restriction on technical names (which must be unique
within a subject area), for usability reasons (seeing different terms named the same
thing in a glossary might be confusing).
8. It is permissible to add a term to the glossary that has the same technical name
as another term, as long as its subject area is different.
See Also:
GET glossaries/<glossary_id>/terms: to get all terms in the glossary
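The defaulting and uniqueness rules in the Notes above can be sketched as a client-side pre-check in Python. This is an illustration under stated assumptions, not Loom's implementation: the function name is invented, terms are plain dictionaries, and letting the area_ref parameter take precedence over glossary/subjectAreaRef is a choice made for this sketch (the document does not say which wins when both are given).

```python
def prepare_new_term(entity, existing_terms, area_ref=None):
    """Apply the documented defaults and uniqueness checks to a new term."""
    name = entity.get("entity/name")
    if not name:
        raise ValueError("entity/name is required")
    term = dict(entity)
    if area_ref is not None:
        term["glossary/subjectAreaRef"] = area_ref  # assumed precedence
    area = term.get("glossary/subjectAreaRef")  # None means the 'default' area
    # Note 5: the display name defaults to the technical name.
    term.setdefault("entity/label", name)
    # Note 1: a glossary/property key, even with a null value, marks a composite.
    term_kind = "composite" if "glossary/property" in term else "simple"
    for other in existing_terms:
        same_area = other.get("glossary/subjectAreaRef") == area
        # Note 6: technical names are unique within a subject area.
        if same_area and other.get("entity/name") == name:
            raise ValueError("term name already used in this subject area")
        # Note 7: display names are unique within the whole glossary.
        if other.get("entity/label") == term["entity/label"]:
            raise ValueError("display name already used in this glossary")
    return term, term_kind
```

A term named "revenue" can therefore be added to a second subject area only if it is given a display name that differs from the existing term's label.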
Get all Terms in a Glossary Matching some Constraints
Get all terms within a specified glossary that match some input constraints.
Path: glossaries/<glossary_id>/terms
Method: GET
Parameters:
area_ref: The subject area to restrict the terms to. If not specified, terms matching
all subject areas are returned. If the special name ‘default’ is specified, then only
terms associated with the glossary directly (no subject area references) are
returned.
status_ref: The status to restrict the terms to. If not specified, then terms with any
status are returned.
Returns:
an array of Term entities matching the specified input constraints.
Notes:
1. When no terms are defined, or when no defined terms match the input
constraints, this method will return an empty array.
See Also:
POST glossaries/<glossary_id>/terms: to add a new term to a glossary
GET glossaries/<glossary_id>/terms/<id>: to get a specific term
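A brief Python sketch of the two filters follows. The first function builds the route with the filters attached; passing them as query-string parameters is an assumption, as this document does not specify the encoding. The second function restates the matching rules, including the special ‘default’ area, against an in-memory term. Both function names are hypothetical.

```python
from urllib.parse import urlencode


def terms_query_path(glossary_id, area_ref=None, status_ref=None):
    """Build the route (relative to the API root) with the optional filters."""
    params = {}
    if area_ref is not None:
        params["area_ref"] = area_ref
    if status_ref is not None:
        params["status_ref"] = status_ref
    path = f"glossaries/{glossary_id}/terms"
    return path + ("?" + urlencode(params) if params else "")


def term_matches(term, area_ref=None, status_ref=None):
    """Mimic the documented filter semantics for a single term."""
    area = term.get("glossary/subjectAreaRef")
    if area_ref == "default":
        if area is not None:  # 'default' selects terms with no subject area
            return False
    elif area_ref is not None and area != area_ref:
        return False
    return status_ref is None or term.get("glossary/termStatusRef") == status_ref
```

Omitting both filters matches every term, which mirrors the documented behavior of returning terms in all subject areas with any status.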
Get a Term from the Glossary
Get a specific term from the glossary.
There are multiple possible variations of this method, depending on how the get a term using its
Path: glossaries/<glossary_id>/terms/<id>
Method: GET
Parameters: (none)
Returns:
a Term entity
Errors:
1. If the ID does not match the entity/id of a term in the glossary.
See Also:
POST glossaries/<glossary_id>/terms: to add a new term to a glossary
GET glossaries/<glossary_id>/terms: to get a set of terms
Update a Term in a Glossary
Update a specific term in the glossary, with the new or updated attributes specified.
This method can be used to change the subject area that a term is associated with. However, it
cannot be used to rename the term. (Renaming will not be supported in Loom 2.2, although
changing the display name will be permitted).
Path: glossaries/<glossary_id>/terms/<id>
Method: PATCH
Parameters:
entity: A set of term properties (e.g., entity/name, entity/label, entity/description,
etc), which will replace the current set of properties.
composite_properties: An array of properties for a composite term, which will
replace the current set of composite properties. Optional if the glossary/property
property is specified or if the term is a simple term. It is acceptable that this
parameter (or the glossary/property property) is specified without a value, indicating
the term is composite but that it does not currently have any properties.
area_ref: The name of a different subject area the term will be in. Optional, if the
glossary/subjectAreaRef property is specified as part of the entity parameter, or if
the term will remain tied to the original subject area.
status_ref: The name of the term status that will be assigned to the term. Optional,
if the glossary/termStatusRef property is specified as part of the entity parameter, or
if the term status will remain unchanged.
Returns:
ID of the updated term entity.
Errors:
1. Attempting to change the name of the term will yield an error.
2. Attempting to update a term with an area_ref or glossary/subjectAreaRef that does
not match the name of a subject area in the glossary will yield an error.
Notes:
1. The term’s name cannot be modified by this method. That is because
CompositeTerms reference their properties via qnames, formed with the
SubjectArea prefix and Term name. So changing term names would break those
property bindings.
2. The subject area that a term is associated with can be changed with this method
by specifying a new (valid) value for the glossary/subjectAreaRef property of the
entity that is passed in.
3. Updating a term without the area_ref or glossary/subjectAreaRef specified will
result in the term having the same subject area association it had before the call.
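Why renaming would break composite property bindings can be shown with a minimal Python sketch of qname resolution. The prefix:name form follows the qualified-name description in this document; the dictionary keys "prefix" and "name" and both function names are assumptions made for this example only.

```python
def qname(area_prefix, term_name):
    """Form a qualified name from a SubjectArea prefix and a Term name."""
    return f"{area_prefix}:{term_name}"


def dangling_bindings(composite_properties, terms):
    """Return the property qnames that no longer resolve to any term."""
    known = {qname(t["prefix"], t["name"]) for t in terms}
    return [q for q in composite_properties if q not in known]
```

If a term bound as "sales:revenue" were renamed to "income", the composite's stored qname would no longer resolve, which is the breakage the rename restriction avoids.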
Delete a Term from a Glossary
Delete the glossary term with the specified ID from the glossary.
Path: glossaries/<glossary_id>/terms/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing
Errors:
1. Calling this for an entity that does not exist should yield an error.
___________________________________________
Relationships
Relationship Entity Attributes
A relationship is an entity that can relate any two entities. Relationships have a reference to a
RelationshipType, which defines the constraints for relationships.
The Relationship entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
relate/relationshipType [reference to RelationshipType]
The relationship type, defining constraints on the relationship.
relate/role1 [Role struct]
The first role (of two) defining one of the ends of the relationship.
relate/role2 [Role struct]
The second role (of two) defining one of the ends of the relationship.
RelationshipType Attributes
RelationshipTypes define the constraints for relationships.
The RelationshipType entity has the following attributes, in addition to the core entity attributes.
Name [Type]
Description
relate/acronym [string]
An acronym by which the type might be known.
relate/alternateName [array of string]
A set of alternate names by which the relationship type can be known.
relate/role1 [RoleType struct]
The first role type (of two) defining constraints on one of the ends of relationship instances.
relate/role2 [RoleType struct]
The second role type (of two) defining constraints on one of the ends of relationship instances.
Role Attributes
Roles define the ends of a relationship. A role has an implied reference to a RoleType; i.e., the
Role at ‘end 1’ is associated with the RoleType at ‘end 1’ of the associated relationship type;
similarly for the role at ‘end 2’.
The Role struct has the following attributes, in addition to the core entity attributes such as
entity/name, entity/label, and entity/description.
Name [Type]
Description
relate/entity [reference]
The entity that the role is bound to.
RoleType Attributes
Role types define the constraints on the ends of relationships that reference the relationship
type of which the role type is part.
The RoleType struct has the following attributes, in addition to the core entity attributes such as
entity/name, entity/label, and entity/description.
Name [Type]
Description
relate/isNavigable [boolean]
Whether the end of the relationship denoted by the role is navigable or not.
relate/constrainedType [string]
Constraint on type that can fill a role in a relationship instance; null for 'any'.
Requests
Request
Description
GET relationships
Get all relationships matching the provided filters.
POST relationships
Create a new relationship.
GET relationships/<id>
Get the relationship with the specified entity ID.
PATCH relationships/<id>
Update the relationship with the specified entity ID.
DELETE relationships/<id>
Delete the relationship with the specified entity ID.
GET relationships/types
Get all relationship types matching the provided filters.
Request Details
Create a Relationship between Two Entities
Create a new relationship between two entities, optionally overriding the default relationship name
and role names.
Path: relationships
Method: POST
Parameters:
rel_type_id: the entity/id of a relationship type (relate/RelationshipType) registered
with the system; optional - if not specified, the generic 'relatedEntity' relationship will
be used
related1_id: the entity/id of the first entity to relate (required)
related2_id: the entity/id of the second entity to relate (required)
rel_entity: a bucket of properties (entity/name, entity/label, entity/description, etc) to
put on the relationship entity itself; optional - if not specified, the relationship
properties will be set from the associated relationship type
role1: a bucket of properties (entity/name, entity/label, entity/description, etc)
describing the role of the first entity in the relationship; optional - if not specified, the
role properties will be set from the associated role type
role2: a bucket of properties (entity/name, entity/label, entity/description, etc)
describing the role of the second entity in the relationship; optional - if not specified,
the role properties will be set from the associated role type
Returns:
ID of the new Relationship entity.
Errors:
1. If exactly two valid entity IDs are not passed in.
2. If the relationship type is specified, but is invalid.
3. If the specified entities at the ends of the relationship do not correspond to the
type constraints defined by their respective role types (relate/constrainedType).
Notes:
1. If the relationship type is not specified, the generic 'RelatedEntity' relationship will
be used.
2. If the relationship entity properties are not specified, the corresponding properties
from the relationship type will be used.
3. If the role for an end is not specified, the properties from the associated RoleType
(from the RelationshipType) will be used.
4. The type of an entity at a relationship end (related1_id, related2_id), must be
compatible (accounting for inheritance) with the type specified by the
relate/constrainedType field of the corresponding role type.
5. If a label is specified for any of the Relationship or Roles, but no name, then the
label will be ‘sanitized’ and used to generate the name.
See Also:
GET relationships: to get all relationships, given some constraints
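How the required and optional parameters fit together can be sketched in Python. The helper below only assembles a parameter dictionary and is hypothetical; how that dictionary is then encoded on the wire is not covered by this document and is left out on purpose.

```python
def relationship_params(related1_id, related2_id, rel_type_id=None,
                        rel_entity=None, role1=None, role2=None):
    """Assemble the parameters for creating a relationship."""
    # Both ends are required; everything else falls back to server defaults.
    if related1_id is None or related2_id is None:
        raise ValueError("exactly two entity IDs are required")
    params = {"related1_id": related1_id, "related2_id": related2_id}
    optional = {
        "rel_type_id": rel_type_id,  # omitted: generic relationship type is used
        "rel_entity": rel_entity,    # omitted: properties come from the type
        "role1": role1,              # omitted: properties come from the role type
        "role2": role2,
    }
    params.update({k: v for k, v in optional.items() if v is not None})
    return params
```

Leaving the optional arguments out keeps the request minimal and lets the server apply the defaults described in the Notes.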
Get all Relationships Matching some Constraints
Get all relationships, optionally constrained by relationship type or name, or additionally by entity ID
(at either end) or related entity type (at either end if no entity ID specified, else at the opposite end).
Path: relationships
Method: GET
Parameters:
rel_type_id: the entity/id of a relationship type (relate/RelationshipType) to constrain
the relationships returned to; optional.
rel_name: a name to restrict the relationships returned to; optional. Can be used
independent of or in conjunction with rel_type_id.
related_id: the entity/id of an entity at one end of the relationship (can be at either
end); optional.
related_type: the type of entities at one end of the relationship (can be either end, if
no related entity ID is specified, otherwise at the opposite end); optional. If not
specified, there is no constraint on the type of the related entities. Can use the
special values ‘all-technical’, ‘all-glossary’, and ‘all’ (which is the same as
‘core/Entity’).
Returns:
an array of Relationship entities.
Errors:
1. If the specified relationship type ID is invalid.
2. If the specified related ID is invalid.
Notes:
1. The relationship name may be more restrictive than the relationship type
(specified via the rel_type_id). That is because a relationship can be named
independently of the relationship type’s name. Or, in the case where the
relationships were given the same name as the relationship types, then specifying
this is just another way to identify the relationship type (effectively).
2. There is no restriction on what a relationship can be named. The same name can
be used for multiple instances of the same type, or for instances of different
relationship types.
3. The related type constraint should take inheritance into account. E.g., specifying
data/DataContainer should return all instances that have that type, or
source/Source, or dataset/Dataset.
See Also:
POST relationships: to create a new relationship
GET relationships/<id>: to get a relationship
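The inheritance-aware type matching described in Note 3 can be sketched in Python as a walk up a parent table. The table below is a hypothetical fragment: the two children of data/DataContainer come from the example in Note 3, while placing data/DataContainer directly under core/Entity is an assumption. The special values ‘all-technical’ and ‘all-glossary’ are not modeled, because this document does not define which types they cover.

```python
# Hypothetical fragment of the type hierarchy (child -> parent).
PARENT = {
    "source/Source": "data/DataContainer",
    "dataset/Dataset": "data/DataContainer",
    "data/DataContainer": "core/Entity",  # assumed direct parent
}


def is_compatible(entity_type, constrained_type):
    """True if entity_type is constrained_type or inherits from it."""
    if constrained_type in (None, "all"):  # null means 'any'
        return True
    current = entity_type
    while current is not None:
        if current == constrained_type:
            return True
        current = PARENT.get(current)  # step up one level
    return False
```

The same check applies to the relate/constrainedType rule for relationship ends, where a null constraint means any type is accepted.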
Get a Relationship
Get a relationship given its unique identifier.
Path: relationships/<id>
Method: GET
Parameters: none (except for the ID in the URL)
Returns:
a Relationship entity
Errors:
1. If the specified relationship ID is invalid.
See Also:
POST relationships: to create a new relationship
Update a Relationship
Update a relationship. This performs a partial update, not a full replace.
Path: relationships/<id>
Method: PATCH
Parameters:
rel_type_id: the entity/id of a relationship type (relate/RelationshipType) registered
with the system; optional - if not specified, the existing relationship type is left
unchanged
related1_id: the entity/id of the first entity to relate; optional - if not specified, the
existing entity on ‘role 1’ is left unchanged
related2_id: the entity/id of the second entity to relate; optional - if not specified, the
existing entity on ‘role 1’ is left unchanged
rel_entity: a bucket of properties (entity/name, entity/label, entity/description, etc) to
put on the relationship entity itself; optional - if specified, the existing relationship
properties will be fully replaced with the new values; if not specified, the relationship
properties will be left unchanged
role1: a bucket of properties (entity/name, entity/label, entity/description, etc)
describing the role of the first entity in the relationship; optional - if specified, the
existing role properties will be fully replaced with the new values; if not specified, the
-52-
Loom REST API
Returns:
Errors:
Notes:
Error:
See Also:
Document Revision 0.10 - 01 Aug 2014
role properties will be left unchanged
role2: a bucket of properties (entity/name, entity/label, entity/description, etc)
describing the role of the second entity in the relationship; optional - if specified, the
existing role properties will be fully replaced with the new values; if not specified, the
role properties will be left unchanged
ID of the updated Relationship entity.
1. If an invalid entity ID is passed in.
2. If the relationship type is specified, but is invalid.
1. For any of the parameters that is specified, their contents will fully replace the
corresponding values of the current relationship.
2. If a new relationship type is specified, but one or more of the rel_entity and role1
and role2 are not specified, then the existing values of those will be replaced with
the ‘default’ values, obtained from the new relationship type. So, the relationship
name, label, description will be replaced with the corresponding values from the new
relationship type. Similarly for the role name, etc being replaced with the
corresponding values from the role types (for each role).
3. If a entity is specified for one or both ends of the relationship (related1_id and/or
related2_id), the type of the entity must be compatible (accounting for inheritance)
with the type specified by the relate/constrainedType field of the corresponding role
type. (I.e., related1 must be compatible with roleType1, and relalted2 must be
compatible with roleType2). This applies whether a new relationship type is
specified or not.
1. If the specified entities at the ends of the relationship do not correspond to the
type constraints defined by their respective role types (relate/constrainedType).
POST relationships: to create a new relationship
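The partial-update behaviour described above can be sketched with Python's standard library. This is only an illustration: it assumes the parameters are sent as a JSON body, which this section does not state, and the host, port, relationship ID, and the helper name build_relationship_patch are hypothetical. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/"  # hypothetical host and port

# Parameter names taken from the list above.
ALLOWED = {"rel_type_id", "related1_id", "related2_id", "rel_entity", "role1", "role2"}


def build_relationship_patch(rel_id, **changes):
    """Build (but do not send) a PATCH request for relationships/<id>."""
    unknown = set(changes) - ALLOWED
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Assumption: parameters are sent as a JSON document in the request body.
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + f"relationships/{rel_id}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# rel_entity fully replaces the existing properties, so the whole bucket is sent.
req = build_relationship_patch(
    "rel-123",
    rel_entity={
        "entity/name": "derived_from",
        "entity/label": "Derived From",
        "entity/description": "Nightly load output",
    },
)
print(req.get_method(), req.full_url)
```

Parameters that are left out (here rel_type_id, related1_id, related2_id, role1, and role2) are simply absent from the body, which matches the "left unchanged" rule for unspecified parameters.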
Delete a Relationship
Delete a relationship given its unique identifier.
Path: relationships/<id>
Method: DELETE
Parameters: none (except for the ID in the URL)
Returns: nothing
Errors:
1. If the specified relationship ID is invalid.
See Also:
POST relationships: to create a new relationship
___________________________________________
Get Relationship Types
There is a fixed set of ‘global’ RelationshipType instances in the Loom registry. This method
retrieves all the RelationshipType instances that are registered, optionally constrained to those
types that allow a specific type of entity to be related.
Path: relationships/types
Method: GET
Parameters:
related_type: the type of entities allowed at one end of the relationship (can be
either end); optional - if not specified, all relationship types are returned.
Returns: array of RelationshipType entities.
Notes:
1. This method will return all the RelationshipType instances registered if no
parameters are specified.
Entities
This is a ‘generic’ API for accessing instance-related information about the Loom registry model.
Requests
The following ‘polymorphic’ (type-agnostic) methods are currently available. In the future, there will
be more methods for interacting with entity instances without declaring the type.
Request
Description
DELETE entities/<id>
Delete the entity with the specified entity ID.
DELETE entities
Delete all entities in the specified folder; optionally recursive.
Delete an entity
Delete the entity with the specified ID. This performs type-specific delete processing where
appropriate (i.e., delegates to DELETE /sources, etc).
Path: entities/<id>
Method: DELETE
Parameters: none, except for the entity/id that is built into the URL
Returns: nothing
Delete all entities in a folder
Deletes all entities ‘in’ the specified folder. Optionally recurses down the virtual folder hierarchy.
Path: entities
Method: DELETE
Parameters:
folder: The starting folder; use a single slash (‘/’) or an empty string (“”) for the ‘root’
folder.
recurse: Indicates whether to delete recursively or not (default false).
Returns: nothing
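A minimal sketch of building this request with Python's standard library. It assumes that folder and recurse are passed as query-string parameters and that booleans are written as 'true' or 'false'; neither detail is stated above. The host, port, folder name, and the helper name build_folder_delete are hypothetical, and the request is built but not sent.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/"  # hypothetical host and port


def build_folder_delete(folder="/", recurse=False):
    """Build (but do not send) a DELETE request for all entities in a folder."""
    # Assumption: parameters travel in the query string, booleans as 'true'/'false'.
    query = urllib.parse.urlencode(
        {"folder": folder, "recurse": "true" if recurse else "false"}
    )
    return urllib.request.Request(BASE_URL + "entities?" + query, method="DELETE")


req = build_folder_delete("/staging", recurse=True)
print(req.get_method(), req.full_url)
```

Because recurse defaults to false, a caller has to opt in explicitly before a whole folder hierarchy is deleted.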
___________________________________________
Types
This is a ‘generic’ API for accessing type-related information about the Loom registry model.
Requests
Request
Description
GET types/extensions
Get extension attributes for specified entity type.
GET types/attributes/values
Get allowed values for the specified attribute.
Request Details
Type Extensions
Returns all the extension attributes for a data type. Results are an array of attribute definitions.
Path: types/extensions
Method: GET
Parameters:
type: The entity type to retrieve extension attributes for.
Returns:
An array of attribute definitions associated with the entity type; e.g.,
[{"meta.attribute.ref/type":"dataset/Dataset",
"meta.attribute/doc": "Related To",
"meta.attribute/cardinality":"many",
"meta.attribute/valueType":"uuid",
"meta.attribute/name":"dataset.extension/relatedTo"},
{"meta.attribute/doc":"Source",
"meta.attribute/fulltext":"true",
"meta.attribute/cardinality":"one",
"meta.attribute/valueType":"string",
"meta.attribute/name":"dataset.extension/source"}]
Attribute Allowed Values
Returns the allowed values for a specific attribute. This assumes the attribute holds enumerated
values.
Path: types/attributes/values
Method: GET
Parameters:
entity: The groups of attributes of interest, based on an entity type. This parameter
is ignored if attribute is set. Must be a single value or array of: source/Source,
process/Process or data.table/Column
attribute: Defines the attributes of interest. Must be a single value or array
of: persist/storageType, data/structuralForm, process/processType,
data.table/dataType
Returns:
Map of attribute-name to [value,label] pairs; e.g.,
{"data/structuralForm":[["table","Table"]],
"persist/storageType":[["file/text","Text Files"],["rdb/generic","Relational Database"]]}
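As an illustration of consuming this response, the sketch below turns the [value,label] pairs into a lookup from value to display label. The response text is the example shown above; the helper name labels_by_value is hypothetical.

```python
import json

# Response text copied from the example above.
response_text = (
    '{"data/structuralForm":[["table","Table"]],'
    '"persist/storageType":[["file/text","Text Files"],'
    '["rdb/generic","Relational Database"]]}'
)


def labels_by_value(allowed):
    """Turn {attribute: [[value, label], ...]} into {attribute: {value: label}}."""
    return {
        attribute: {value: label for value, label in pairs}
        for attribute, pairs in allowed.items()
    }


labels = labels_by_value(json.loads(response_text))
print(labels["persist/storageType"]["rdb/generic"])  # Relational Database
```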
___________________________________________
Activity API
The Activity API focuses on general user activities, such as executing transforms, managing
sessions, etc. These are independent of metadata-manipulation operations performed through the
Resource API.
Connection Operations
These operations are related to establishing, checking, and terminating connections to Loom.
Note that in Loom 1.0, connection establishes a cookie so that entity creation and modification
events are tied to a specific user (via the entity/createdBy and entity/updatedBy fields). Loom does
not have formal secure authentication in 1.0.
User Entity Attributes
The connection operations deal with users, and as such use User entity information. See the
Users section above for attribute values.
Requests
Request
Description
POST connect/login
Log user in to Loom.
POST connect/logout
Log user out of Loom.
GET connect/ping
Checks if the connection is still valid. If it is, returns
information for the user associated with the session.
Request Details
Login
Provide user credentials to get a session cookie for that user. Both the username and password
parameters are strings. The password is sent in the clear, and will rely on the transport layer for
security.
Path: connect/login
Method: POST
Parameters: username, password
Returns:
entity/id: The entity ID of the user who logged in.
Headers:
Set-Cookie with session-id set.
Logout
Releases the user’s connection. This resets the user’s session cookie.
Path: connect/logout
Method: POST
Parameters: none - retrieves current user from the session-id in cookies.
Returns: empty
Ping
Checks if the caller is currently connected to Loom, and if so, returns the entity containing all of the
user data for the currently logged in user. The current user is determined from the session
information (see /connect/login).
Path: connect/ping
Method: GET
Parameters: session cookie in headers
Returns: User entity.
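The three connection operations can be combined into a cookie-based session. The sketch below uses only Python's standard library and builds the requests without sending them. It assumes that the credentials are sent as a form-encoded body, which is not stated above; the host, port, credentials, and helper names are hypothetical.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/"  # hypothetical host and port


def make_opener():
    """Return an opener that stores and resends cookies such as the session-id."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login_request(username, password):
    # Assumption: credentials are sent as a form-encoded request body.
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        BASE_URL + "connect/login", data=form.encode("ascii"), method="POST"
    )


def ping_request():
    return urllib.request.Request(BASE_URL + "connect/ping", method="GET")


def logout_request():
    return urllib.request.Request(BASE_URL + "connect/logout", data=b"", method="POST")


opener = make_opener()
# Against a live server, the calls would be made in this order:
#   opener.open(login_request("ann", "s3cret"))   # response sets the session cookie
#   opener.open(ping_request())                   # cookie is sent back automatically
#   opener.open(logout_request())
print(login_request("ann", "s3cret").get_method())
```

Since the password is sent in the clear, a deployment exposed beyond a trusted network would need the transport layer (for example HTTPS) to protect it.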
___________________________________________
Search Operations
Related to search - getting filters, defining filters, etc.
Requests
Request
Description
GET search/filters/values
Get the values for specific named filters, used for the multivalue ‘GET’ methods in the Resource API.
POST search/text
Search through text fields for matching entities.
POST search/text/glossaries
Search through all glossaries or a single glossary for terms
matching the search value. (New in 2.2)
GET search/folders
Get all entities under the specified folder.
GET search/lineage/data
Get lineage for a specific dataset, source, or data unit.
GET search/lineage/process
Get lineage starting from a specific process or process use.
GET search/related/entities
Get entities of some type that are traversable in a specified
direction from a single instance.
GET search/related/processes
Get all the processes and process uses that a specific data
unit has been used in.
GET search/related/dataflow
Get all the ‘dataflow segments’ that a specified entity is part of.
GET search/totals
Get the total number of entities of each of the main types
exposed by Loom:
source/Source, dataset/Dataset, process/Process, job/Job
Request Details
Paths are all relative to the root of the API, including version number.
Get Filter Values for a Filter
Get the values for specific named filters, used for the multi-value ‘GET’ methods in the Resource
API.
Path: search/filters/values
Method: GET
Parameters:
context: The name of a filter grouping (e.g., ‘dataset’, ‘source’, etc).
filter: The name of a filter (e.g., ‘filter_recency’)
Returns:
Map of filter key to filter value. E.g.,
{<filter_name> : [ ["<value_1>", "<label_1>"], ... ] ... }
The filter key is passed into Loom API methods, whereas the value is used for
display purposes.
See Also:
GET /types/attributes/values: To get the values for a specific attribute. Filters are often
composed of the values of an enumerated attribute, plus the value ‘all’ (and
sometimes ‘none’).
Perform Full-text Search over all Entities
Search through text fields for matching entities. The following core entity fields can be included in
these searches: entity/name, entity/description, entity/label, entity/tags. In addition, some domain
model attributes (those with the meta-attribute of meta.attribute/fulltext=true) can be included in these
searches.
Path: search/text
Method: POST
Parameters:
types: entity types to return if their properties, or the properties of a related type, are
matched. One of ‘all’, ‘technical’, ‘glossary’, or a list of specific types.
properties: properties to search over. Not all of the specified properties need to be
present on all the instances being matched against; those that are missing for a
given type are ignored. By default, all of the 'core' properties for an entity are included:
entity/name, entity/description, entity/label, entity/tags
match_terms: list of search words or phrases, to be matched individually and
AND’d together
match_type: one of "exact" or "wildcard" (default "wildcard"); a wildcard search can
also be requested by including the wildcard (‘*’) character in the search terms
related_types: entity types of related entities to search over; if not specified,
relationships will not be followed. One of ‘all’, ‘technical’, ‘glossary’, or a list of
specific types.
_offset: Provides an offset into the results.
_limit: Specifies how many results should be returned, starting at the offset position.
Returns:
An array of SearchResults, each containing the following information:
● returnEntity: the entity that was returned as a result of the match; must be
compatible with the ‘types’ parameter
○ entity/id
○ entity/name
○ entity/label
○ entity/type
○ entity/modifiedAt
○ entity/modifiedBy
○ containerEntity
■ entity/id
■ entity/name
■ entity/label
■ entity/type
Notes:
If multiple properties are specified, then a match will occur if the match term is found
in any of those properties, on the target types (types parameter) or on any related
types (if related_types parameter is specified).
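To make the parameter defaults concrete, here is a small sketch that assembles the search/text parameters as a Python dict. It assumes the parameters would be serialized as a JSON request body, which is not stated above; the helper name build_text_search and the search terms are hypothetical.

```python
import json


def build_text_search(match_terms, types="all", properties=None,
                      match_type="wildcard", related_types=None,
                      offset=None, limit=None):
    """Assemble the search/text parameters, omitting unset optional ones."""
    if isinstance(match_terms, str):
        match_terms = [match_terms]  # accept a single term for convenience
    if match_type not in ("exact", "wildcard"):
        raise ValueError("match_type must be 'exact' or 'wildcard'")
    params = {
        "types": types,
        "match_terms": list(match_terms),
        "match_type": match_type,
    }
    if properties is not None:
        params["properties"] = list(properties)
    if related_types is not None:
        # Leaving this out means relationships are not followed.
        params["related_types"] = related_types
    if offset is not None:
        params["_offset"] = offset
    if limit is not None:
        params["_limit"] = limit
    return params


body = build_text_search(["customer", "churn*"], types="technical", limit=25)
# Assumption: the parameters are sent as a JSON document.
print(json.dumps(body, sort_keys=True))
```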
Get Entities in Folder
Lists all entities of a given type that appear within a folder. This recursively includes all sub-folders
under the specified one. If a folder is not specified, then the root folder is presumed. Note that the
type parameter is unnamed in the request, and appears at the end of the path.
Path: search/folders
Method: GET
Parameters:
type: The type of entities to get; ‘all’ for all
folder: The folder to look in. ‘/’ for the root folder.
recurse: If true (default), recurse through the folder hierarchy
Returns:
entities: An array of entities of the requested type, in the folder or
one of its sub-folders.
folders: A sorted array of the folders that the objects appear in.
Get Lineage starting from a Data Container or Unit
Get lineage starting from a specific dataset, source, or data unit within a dataset or source.
Path: search/lineage/data (previous method search/lineage is deprecated)
Method: GET
Parameters:
container_id: The ID of the dataset or source; if specified and no data unit
information is provided, then container-level lineage will be computed, otherwise
data unit-level lineage will be computed
data_unit_name: The name of the data unit within the container (container_id must
be specified); if specified, then data unit-level lineage will be computed
data_unit_id: The ID of the data unit (container_id is optional, but must be
consistent if specified); if specified, then data unit-level lineage will be computed
format: The format the lineage will be returned in. One of ‘graph’ (default) or
‘nested’. The ‘graph’ format returns a node-link structure with all node and link
details in the ‘related’ section of the response. The ‘nested’ format returns a deeply
nested structure.
direction: The direction to compute lineage for; one of 'up' (for upstream), 'down'
(for downstream), or 'both' (default). Optional; if excluded, both directions are given.
steps_max: the maximum number of steps taken in the graph traversal; default is
‘unlimited’
Returns:
‘graph’ option:
nodes: an array of the nodes in the graph, representing entities that are connected
via the links; the ‘node’ key contains the entity ID
■ node ‘node’ is the UUID of the data entity (dataset, source, or data unit)
■ node ‘label’ is the name of the node
■ node ‘type’ is one of ‘data’ or ‘process’
■ node ‘group’ is the UUID of the container for data nodes, or of the process for
process (use) nodes
■ node ‘state’ is the entity state
links: an array of links, with the ‘source’ and ‘target’ keys containing the IDs of the
entities at the start and end of each link.
■ link ‘source’ is the UUID of the node the directed link comes from
■ link ‘target’ is the UUID of the node the directed link goes to
■ link ‘label’ is a string of the form ‘source_type->target_type’
■ link ‘type’ is ‘lineage’
‘nested’ option:
up: the lineage upstream (‘backward’ direction) of the starting entity
down: the lineage downstream (‘forward’ direction) of the starting entity
Notes:
The combinations possible on input are:
■ container_id only - container-level lineage
■ data_unit_id only - data unit-level lineage
■ container_id and data_unit_name - server looks up data_unit_id, computes
data unit-level lineage
■ container_id and data_unit_id - same as previous, but data unit must be in
the container or it is an error
Example return structure for ‘graph’ option (with details keyed off UUIDs in the
‘related’ part of the response):
{"nodes": [
{"node": <uuid>, "label": <string>, "type": <string>, "group": <uuid>},
...
],
"links": [
{"source": <uuid>, "target": <uuid>, "label": <string>, "type": <string>},
...
]}
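The ‘graph’ format is a plain node-link structure, so a client can walk it with a few lines of code. The sketch below collects everything downstream of a starting node by following links from ‘source’ to ‘target’. The sample graph, its IDs and labels, and the helper name downstream_nodes are made up for illustration; real responses carry UUIDs.

```python
from collections import defaultdict


def downstream_nodes(graph, start):
    """Return the IDs of all nodes reachable from start via source -> target links."""
    adjacency = defaultdict(list)
    for link in graph["links"]:
        adjacency[link["source"]].append(link["target"])
    reached, pending = set(), [start]
    while pending:
        current = pending.pop()
        for target in adjacency[current]:
            if target not in reached:  # guard against revisiting nodes
                reached.add(target)
                pending.append(target)
    return reached


# Hypothetical response in the 'graph' format described above.
graph = {
    "nodes": [
        {"node": "src-1", "label": "orders_raw", "type": "data", "group": "c-1"},
        {"node": "use-1", "label": "clean", "type": "process", "group": "p-1"},
        {"node": "ds-1", "label": "orders_clean", "type": "data", "group": "c-2"},
    ],
    "links": [
        {"source": "src-1", "target": "use-1", "label": "data->process", "type": "lineage"},
        {"source": "use-1", "target": "ds-1", "label": "process->data", "type": "lineage"},
    ],
}
print(sorted(downstream_nodes(graph, "src-1")))  # ['ds-1', 'use-1']
```

Reversing the roles of ‘source’ and ‘target’ when building the adjacency map gives the upstream set instead.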
Get Lineage starting from a Process
Get lineage starting from a specific process or process use.
This method always returns lineage in the ‘graph’ format. (The ‘data’ lineage has an option to do
‘nested’ also, but that does not make as much sense when starting from a process.)
Path: search/lineage/process
Method: GET
Parameters:
process_id: The ID of the process or process use (required)
granularity: The granularity at which lineage will be computed. One of ‘container’ or
‘data_unit’ (optional; default is container if not specified).
steps_max: the maximum number of steps taken in the graph traversal; default is
‘unlimited’
Returns:
nodes: an array of the nodes in the graph, representing entities that are connected
via the links; the ‘node’ key contains the entity ID
■ node ‘node’ is the UUID of the data entity (dataset, source, or data unit)
■ node ‘label’ is the name of the node
■ node ‘type’ is one of ‘data’ or ‘process’
■ node ‘group’ is the UUID of the container for data nodes, or of the process for
process (use) nodes
■ node ‘state’ is the entity state
links: an array of links, with the ‘source’ and ‘target’ keys containing the IDs of the
entities at the start and end of each link.
■ link ‘source’ is the UUID of the node the directed link comes from
■ link ‘target’ is the UUID of the node the directed link goes to
■ link ‘label’ is a string of the form ‘source_type->target_type’
■ link ‘type’ is ‘lineage’
Notes:
Example return structure for ‘graph’ option (with details keyed off UUIDs in the
‘related’ part of the response):
{"nodes": [
{"node": <uuid>, "label": <string>, "type": <string>, "group": <uuid>},
...
],
"links": [
{"source": <uuid>, "target": <uuid>, "label": <string>, "type": <string>},
...
]}
Get Related Entities
Get entities of some type that are traversable in a specified direction from a single instance.
Path: search/related/entities
Method: GET
Parameters:
entity_id: the ID of the starting entity; currently, this must be of type source/Source
related_type: the type of the target entities; currently, this must be of type
dataset/Dataset
steps_max: the maximum number of steps taken in the graph traversal; default 1
Returns:
related_entities: array of all entities of the specified type that are related to the
starting entity
steps_taken: the number of steps taken in the graph traversal; will be less than or
equal to steps_max
Get Processes
Get Processes and Process Uses that have a Dataset or DataUnit in their input or output contexts.
NOTE: This method will be merged into /search/related/entities in an upcoming release.
Path: search/related/processes
Method: GET
Parameters:
dataset_id: The ID of the dataset; if specified and no data unit information is
provided, then all processes involving the dataset will be returned.
data_unit_name: The name of the data unit within the dataset (dataset_id must be
specified); if specified, then only processes using the specific data unit will be
returned.
data_unit_id: The ID of the data unit (dataset_id is optional, but must be consistent
if specified); if specified, then only processes using the specific data unit will be
returned.
direction: The direction; one of ‘in’ (for input to process), ‘out’ (for output from
process), or 'both' (default)
class: The class of processes to return; one of ‘process’, ‘process_use’, or
‘all’ (with ‘all’ being the default if not specified).
Returns:
in: array of Processes and ProcessUses that the dataset / data unit is input to
out: array of Processes and ProcessUses that the dataset / data unit is output
from
Notes:
The combinations possible are:
■ dataset_id and data_unit_name - server looks up data_unit_id, returns the
processes using that data unit
■ data_unit_id only - ok, will return the processes using that data unit
■ dataset_id and data_unit_id - same as previous, but data unit must be in the
dataset or it is an error
■ dataset_id only - error
■ data_unit_name only - error
Get Related Dataflows
Get all ‘dataflow segments’ that involve a specified entity. This is a useful way to find not only
which entities are related (within 2 graph links) to a given entity, but also how they are related.
A dataflow segment is of the form:
                           /-> process
                          /
data-input* -> process-use -> data-output*
                          \
                           \-> job
where:
● data-input is a set of ‘Source [DataUnit]’ or ‘Dataset [DataUnit]’ pairs, input to and output
from a ProcessUse
○ DataUnit does not have to be present for all process uses (e.g., source-> dataset)
○ There can in general be multiple input pairs and multiple output pairs
● process-use is a ProcessUse instance
● data-input is a Source/DataUnit or Dataset/DataUnit pair, input to a ProcessUse
● process is the Process entity that the ProcessUse is a snapshot of
● job contains Job information for the executing or executed ProcessUse (empty if not
executable)
The main segment contains the main data flow information: data input to process use and data
output. The process and job information are secondary info, and can be optionally omitted.
Path:
Method:
search/related/dataflow
GET
Parameters:
● entity_id - the ID of the reference entity; must be of type Source, Dataset, Process,
ProcessUse, or Job.
● include_secondary - whether to include ‘secondary’ entities (process and job entities), or
not; default is true, to include the secondary entities.
Returns:
● an array containing structures with the following fields:
○ use-role - the role that the starting entity plays in the segment (one of ‘data-input’,
‘data-output’, ‘process-use’, ‘process’, or ‘job’)
○ use-date - the modifiedAt date from the process-use (copied up for convenience)
○ data-input - the data input entity (and contained entity if there is one)
○ process-use - the process use
○ data-output - the data output entity (and contained entity if there is one)
○ process - the process that the process use is a snapshot of
○ job - the job tracking execution info (empty if not an executable process)
● Each path field (data-input, etc) contains the following core properties:
○ entity/id
○ entity/name
○ entity/type
○ entity/modifiedAt
○ entity/modifiedBy
● In addition, the ‘data’ fields (data-input and data-output) will have an additional sub-field,
called ‘contained-entity’, if the ProcessUse context had a dataUnitName property. In that
case, it will have the same 5 core properties for the data unit.
● All other ‘auxiliary’ information should be in the ‘related’ section:
○ for all types: description, modifiedByUser, entityState
○ for sources: storageType (technical and human readable), location, # tables (calc)
○ for datasets: # tables (calc)
○ for processes: processType, processClass, processScope, # uses (calc)
○ for process uses: job ID, job name, job status, job duration (raw and display)
○ for jobs: process use ID, job name, job status, job duration (raw and display)
Notes:
● If include_secondary is false, the ‘process’ and ‘job’ fields will not be included in the
response.
● The ‘use-role’ and ‘use-date’ are used to extract out particular segments. For example, a
data entity can be an input to or an output from a process use, or both. The role allows the
client to ‘filter’ based on these use contexts. (Similar to the ‘in’ and ‘out’ grouping output
from search/related/processes). The use-date is a copy of the process-use modifiedAt, and
is placed at the top-level to assign that same timestamp to the overall segment, and provide
an easy way to sort segments by timestamp.
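The filtering and sorting that the use-role and use-date fields are meant to support can be sketched as follows. The sample segments are made up, and the sketch assumes that use-date values are ISO 8601 UTC strings (which sort correctly as plain strings); the date format is not stated above. The helper name segments_for_role is hypothetical.

```python
def segments_for_role(segments, role, newest_first=True):
    """Keep segments where the starting entity plays `role`, sorted by use-date."""
    matching = [segment for segment in segments if segment.get("use-role") == role]
    # Assumption: use-date is an ISO 8601 UTC string, so string order is time order.
    return sorted(matching, key=lambda segment: segment["use-date"], reverse=newest_first)


# Hypothetical, heavily abbreviated segments.
segments = [
    {"use-role": "data-input", "use-date": "2014-07-01T10:00:00Z", "process-use": {"entity/name": "clean"}},
    {"use-role": "data-output", "use-date": "2014-07-02T09:30:00Z", "process-use": {"entity/name": "load"}},
    {"use-role": "data-input", "use-date": "2014-07-03T08:15:00Z", "process-use": {"entity/name": "join"}},
]
for segment in segments_for_role(segments, "data-input"):
    print(segment["use-date"], segment["process-use"]["entity/name"])
```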
Get Entity Totals
Get the total number of entities of each of the main types exposed by Loom: source/Source,
dataset/Dataset, process/Process, job/Job.
Path: search/totals
Method: GET
Parameters: none
Returns: map of type to count
___________________________________________
Data Access Operations
These operations focus on basic data access. These do not deal with any kind of filtering,
processing, or transformations; those kinds of operations are handled through the Processing API.
Requests
Request
Description
GET data/file/read_lines
Get the first rows from a text file.
GET data/file/read_parsed
Get the first parsed lines from a text file.
GET data/dataset/head
Get the first rows from an individual data unit within a dataset.
GET data/dataset/stats
Get the statistics for an individual data unit within dataset.
Request Details
Read Lines from Text File
Get the first rows from a text file in HDFS.
Path: data/file/read_lines
Method: GET
Parameters:
location: The absolute path to the file in the file system.
nrow: The number of rows to return; default 10.
Returns:
An array of strings, each of which is a line of text from the file.
Read Parsed Lines from Text File
Get the first parsed lines from a text file in HDFS.
Path: data/file/read_parsed
Method: POST
Parameters:
location: The absolute path to the file in the file system.
file_format: The Format struct, to interpret the bits on disk
nrow: The number of rows to return; default 10.
Returns:
rows: An array of rows from the file. Each row is an array of strings that are the
columns parsed from the lines in the file.
columns: An array of strings that gives the column names used in the file. The
column names are parsed from the file's header if possible, else the columns are
assigned auto-generated names.
Get the First Rows from a Data Unit
Get the first rows from an individual data unit within a dataset.
Path: data/dataset/head
Method: GET
Parameters:
dataset_id: The ID of the dataset; optional if data_unit_id is specified
data_unit_name: the name of the data unit within the dataset (dataset_id must be
specified); optional if data_unit_id is specified
data_unit_id: The ID of the data unit (dataset_id is optional, but must be consistent
if specified); optional, if data_unit_name is specified
nrow: The number of rows to return; default 10.
Returns:
records: the records
column_names: the names of the columns
Notes:
The combinations possible are:
■ dataset_id and data_unit_name - server looks up data_unit_id, returns its
first rows
■ data_unit_id only - ok, will return the first rows for that data unit
■ dataset_id and data_unit_id - same as previous, but data unit must be in the
dataset or it is an error
■ dataset_id only - error
■ data_unit_name only - error
Get the Statistics for a Data Unit
Get the statistics for an individual data unit within a dataset.
Path: data/dataset/stats
Method: GET
Parameters:
container_id: The ID of the dataset; optional if data_unit_id is specified
data_unit_name: the name of the data unit within the dataset (container_id must be
specified); optional if data_unit_id is specified
data_unit_id: The ID of the data unit (container_id is optional, but must be
consistent if specified); optional, if data_unit_name is specified
Returns:
A ‘scan metadata’ structure, consisting of (only primary fields shown):
scan.table/numRecords: the number of records
scan.table/columnMetadata: the columns:
■ scan.table/column - the column that the metadata is for, with sub-fields of
● entity/type
● entity/name
● data.table/dataType
■ scan.table/columnType - one of 'string', 'number', or 'object'
■ scan.table/nullValues - number of null values
■ scan.table/emptValues - number of empty string values, for string columns
■ scan.table/minValue - the minimum value, for numeric columns
■ scan.table/maxValue - the maximum value, for numeric columns
■ scan.table/meanValue - the mean, for numeric columns
■ scan.table/stdDev - the standard deviation, for numeric columns
Notes:
The possible parameter combinations are:
■ container_id and data_unit_name - server looks up data_unit_id, returns
stats
■ data_unit_id only - ok, will return statistics for that
■ container_id and data_unit_id - same as previous, but data unit must be in
the dataset or it is an error
■ container_id only - error
■ data_unit_name only - error
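The parameter combinations listed above can be checked on the client before a request is made. The sketch below is one possible encoding of that list; the helper name resolve_data_unit_lookup and its return values are hypothetical, and treating "no parameters at all" as an error is an assumption, since that case is not listed.

```python
def resolve_data_unit_lookup(container_id=None, data_unit_name=None, data_unit_id=None):
    """Return how the data unit will be identified, or raise for an invalid combination."""
    if data_unit_id:
        # data_unit_id alone is enough; a container_id, if given, must be consistent.
        return "by_id"
    if container_id and data_unit_name:
        # The server looks up data_unit_id from the container and the name.
        return "by_name"
    if container_id:
        raise ValueError("container_id alone does not identify a data unit")
    if data_unit_name:
        raise ValueError("data_unit_name requires container_id")
    # Assumption: supplying nothing is also an error.
    raise ValueError("no data unit specified")


print(resolve_data_unit_lookup(container_id="c-1", data_unit_name="orders"))  # by_name
print(resolve_data_unit_lookup(data_unit_id="u-9"))  # by_id
```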
Calculate the Statistics for a Data Unit
Calculate the statistics for an individual data unit within a dataset.
Path: data/dataset/stats
Method: POST
Parameters:
container_id: The ID of the dataset; optional if data_unit_id is specified
data_unit_name: the name of the data unit within the dataset (container_id must be
specified); optional if data_unit_id is specified
data_unit_id: The ID of the data unit (container_id is optional, but must be
consistent if specified); optional, if data_unit_name is specified
Returns:
ID of the data unit the statistics were computed for.
Notes:
The possible parameter combinations are:
■ container_id and data_unit_name - server looks up data_unit_id
■ data_unit_id only - ok
■ container_id and data_unit_id - same as previous, but data unit must be in
the dataset or it is an error
■ container_id only - error
■ data_unit_name only - error
___________________________________________
Execution Operations
These operations deal with processing data -- executing transformations and tracking job progress.
Requests
Request
Description
POST execute/transform
Execute the specified transformation.
GET execute/status
Get the status of an executed job.
Request Details
Execute a Transform
Execute the specified transformation.
Path: execute/transform
Method: POST
Parameters:
process_id: The ID of the process to execute.
contexts: The input and output data contexts; not required if process has a default
context with both dataset and data unit name defined.
Returns:
id: The ID of a Job to track the progress of the execution
Notes:
If a context is not provided, the process being executed must have a default context
defined. If that is the case, the input context will be the default input context, and the
output context will be automatically generated as follows: the dataset will be the
same as the input dataset, and the output data unit (table) name will be autogenerated.
Get Execution Status
Get the status of an executed job. The job may be in-progress, or may have completed (or failed).
Path: execute/status
Method: GET
Parameters:
job_id: The ID of the job (appended to URL)
Returns:
The job status and progress, as a list with 2 fields:
■ job/status: the job's execution status
■ job/progress: the details about the job's execution, a JobProgress
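Since a transform runs asynchronously, a client typically polls execute/status until the job finishes. The sketch below shows one way to structure that loop. It makes several assumptions that are not confirmed by this document: that the response can be read as a map with the job/status and job/progress keys, and that terminal statuses are spelled "completed" and "failed". The function fetch_status stands in for the actual HTTP call and is hypothetical, as is the job ID.

```python
import time


def wait_for_job(fetch_status, job_id, terminal_states=("completed", "failed"),
                 interval_seconds=2.0, max_polls=30):
    """Poll fetch_status(job_id) until job/status is terminal or polls run out."""
    for _ in range(max_polls):
        result = fetch_status(job_id)
        # Assumption: terminal status values are "completed" and "failed".
        if result["job/status"] in terminal_states:
            return result
        time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")


# Stand-in for GET execute/status/<job_id>; replays canned responses.
_responses = iter([
    {"job/status": "running", "job/progress": {}},
    {"job/status": "completed", "job/progress": {}},
])
final = wait_for_job(lambda job_id: next(_responses), "job-42", interval_seconds=0)
print(final["job/status"])
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client and makes it easy to exercise without a server.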
___________________________________________
Environment Operations
These operations deal with the external environment which Loom interacts with.
Environment Struct Attributes
FileInfo Attributes
A FileInfo is a structure that describes a file or directory in HDFS.
Name [Type]                   Description
path [string]                 The full path of the file or directory.
isDir [boolean]               If true, is a directory; otherwise is a file.
length [integer]              The length of the file in bytes.
blockSize [integer]           The size of a file block in bytes.
modificationTime [instant]    When the file or directory was last modified.
owner [string]                Username in HDFS of the user who owns the file or directory.
group [string]                Name of the group in HDFS that owns the file or directory.
permission [string]           The file permissions, in string format (e.g., “644”).
replication [integer]         How many times the file is replicated across a cluster.
Requests
Request                     Description
GET environ/fs/home_dir     Get the home directory of the file system.
GET environ/fs/list_info    Get a listing of files in a directory. This is not recursive.
GET environ/fs/file_info    Get information about a specific file.
POST environ/fs/files       Upload one or more files to a specified directory in HDFS.
GET environ/hive/dblist     Get a list of databases in a Hive instance.
Request Details
File System Home Directory
Gets the name of the file system home directory. The file system can either be the native file
system on the Loom server, or a Hadoop File System (HDFS) managed by the server. See also
Apache WebHDFS documentation.
Path: environ/fs/home_dir
Method: GET
Parameters: none
Returns: The path of the home directory
File System List Files
Gets all file system details for everything in the provided path. If location is a directory, then lists
the details for everything in that directory. If location is a file, then lists the details for just that file.
The file system can either be the native file system on the Loom server, or a Hadoop File System
(HDFS) managed by the server. See also Apache WebHDFS documentation.
Path: environ/fs/list_info
Method: GET
Parameters: location: The path to get details for. May be a file or a directory.
Returns: An array containing FileInfo objects (see above). E.g.,
[{"path":"file:///tmp/20130710194447","group":"wheel","blockSize":33554432,
"modificationTime":"2013-07-10T23:44:47Z","length":102,"owner":"root","isDir":true,
"replication":1,"permission":"rwxr-xr-x"},
{"path":"file:///tmp/geColladaModelCacheLock","group":"wheel","blockSize":33554432,
"modificationTime":"2013-07-23T02:30:47Z","length":0,"owner":"pag","isDir":false,
"replication":1,"permission":"rw-r--r--"}]
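A client might decode a list_info response into typed records along the following lines. This is a sketch only: the FileInfo class mirrors the attribute table above, the sample body is a one-element array in the same shape as the example response, and the code assumes the server returns exactly these fields (an unexpected extra field would raise a TypeError).

```python
import json
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Client-side mirror of the FileInfo attributes."""
    path: str
    isDir: bool
    length: int
    blockSize: int
    modificationTime: str
    owner: str
    group: str
    permission: str
    replication: int


def parse_file_infos(text):
    """Decode a list_info response body into FileInfo objects."""
    return [FileInfo(**item) for item in json.loads(text)]


sample = (
    '[{"path":"file:///tmp/geColladaModelCacheLock","group":"wheel",'
    '"blockSize":33554432,"modificationTime":"2013-07-23T02:30:47Z",'
    '"length":0,"owner":"pag","isDir":false,"replication":1,'
    '"permission":"rw-r--r--"}]'
)
for info in parse_file_infos(sample):
    kind = "dir" if info.isDir else "file"
    print(kind, info.path, info.length)
```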
File System File Information
Get the file system details for a specified path. If path is a directory, then returns just the
description of that directory. See also Apache WebHDFS documentation.
Path: environ/fs/file_info
Method: GET
Parameters: location: The path to get details for. May be a file or a directory.
Returns: A FileInfo object (see above). E.g.
{"path":"file:///tmp","group":"wheel","blockSize":33554432,
"modificationTime":"2013-07-23T19:44:29Z","length":476,"owner":"root","isDir":true,
"replication":1,"permission":"rwxrwxrwt"}
File Upload
Upload one or more files to a specified directory in HDFS.
Path: environ/fs/files
Method: POST
Parameters: file: The fully-qualified path to the file to upload.
target_directory: The destination directory in HDFS to which the file will be
uploaded.
Returns: No return value.
Hive Database Listing
Get the databases in a Hive instance.
Path: environ/hive/dblist
Method: GET
Parameters: location: If present, list the tables in the database named by the path
all: If true, include databases that are Loom datasets; optional, defaults to
false
Returns: A map containing tables and, if location is empty or not specified, the names of
databases
{tables: [ {
    dbname:
    name:
    owner:
    tableType:          /* "MANAGED" | "EXTERNAL" */
    parameters:
    viewOriginalText:
    viewExpandedText:
  }, ... ]
 paths: [ "dbname_1", "dbname_2", ... ]   /* only if location="" or empty */
}
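The two optional parameters can be turned into a request URL as sketched below. The host, the port, and the database name "sales" are placeholders, and encoding the boolean as the string "true" is an assumption about how the server reads the all parameter.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/api/v1/"  # assumed host and port


def hive_dblist_url(location=None, include_all=False, base_url=BASE_URL):
    """Build the GET URL for environ/hive/dblist."""
    params = {}
    if location:
        params["location"] = location
    if include_all:
        # Assumption: the server accepts "true" for the boolean parameter.
        params["all"] = "true"
    query = urlencode(params)
    url = base_url + "environ/hive/dblist"
    return url + "?" + query if query else url


print(hive_dblist_url())
print(hive_dblist_url(location="sales", include_all=True))
```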
___________________________________________
System Operations
System operations deal with the Loom system itself.
System Struct Attributes
SystemVersion Attributes
A SystemVersion is a structure that describes the Loom version.
Name [Type]                Description
versionDate [instant]      The date and time when the Loom version was released
versionNumber [string]     The Loom release identifier (e.g., 1.0.5)
buildNumber [string]       The Loom build identifier; mostly for internal Revelytix use
versionAndBuild [string]   The Loom release identifier with the build identifier appended
productName [string]       The name of the product (always ‘Loom’)
productEdition [string]    The name of the product edition (e.g., ‘Standard’)
SystemConfig Attributes
A SystemConfig is a structure that describes the Loom configuration. These are the properties
defined in the ‘loom.properties’ file under the ‘config’ directory in your Loom installation.
Name [Type]
    Description
persist.mode [string]
    The mechanism that Loom uses for persisting datasets. If ‘loom’, then Loom manages
    persistence in directories of HDFS files. If ‘hive’, then Loom uses Hive to persist
    datasets as databases in Hive.
    Default: loom
dataset.persist.dir [string]
    The location in HDFS where Loom-managed datasets are created, when running in Loom
    persist mode.
    Defaults to a directory named 'loom-datasets' in the HDFS working directory.
activeScan.hdfs.enabled [boolean]
    If true, active scanning of potential sources in HDFS is enabled.
    Default: false
activeScan.hdfs.baseDir [string]
    Comma-separated list of directories under which to scan for potential sources in HDFS.
    Directories may be specified as an absolute hdfs:// URL or a relative path that will be
    resolved against the Loom working directory. The scan is recursive, so all
    sub-directories of each configured directory will be scanned.
    Defaults to Loom working directory.
activeScan.hdfs.scanIntervalMinutes [integer]
    The interval, in minutes, at which Loom will scan HDFS for potential sources.
    Default: 60
activeScan.hdfs.parseLines [integer]
    The number of records to parse from a file in HDFS to determine whether it's a
    potential source.
    Default: 50
activeScan.hdfs.scoreThreshold [float]
    The threshold that the confidence level must exceed for a file in HDFS to be
    considered a potential source. The confidence level is a computed value between 0 and 1.
    Default: 0.25
activeScan.hdfs.maxBufferSize [long]
    The maximum amount of data to read into memory from an HDFS file to determine whether
    it's a potential source.
    Default: 8388608
security.enabled [boolean]
    Enables or disables Loom security. If security is enabled, user impersonation is
    performed.
    Default: false
security.authentication [string]
    Configures how authentication is done: does the user exist and have permission?
    Username and password will always be requested in order to get a session.
    -- jaas - (default) Use JAAS. JAAS is configured in security-unix.conf. Must have a
    valid session to access API.
    -- loom - Use Loom username/password for a valid session. API rejects requests without
    a valid session.
    -- disabled - Use Loom username/password to get a session. Valid session not required
    to access API.
ssl.enabled [boolean]
    Enables SSL (https) support. Disabled by default.
ssl.port [long]
    Configures the SSL port. Defaults to 8443.
Requests
Request                    Description
GET system/version         Get system information, such as Loom version.
GET system/config          Get system configuration information.
POST system/log_message    Writes a message to the system log.
GET system/backup          Get the contents of the registry; for release migration.
POST system/restore        Fill the registry with the backed up contents from another registry.
Request Details
Get Version
Returns version information for the instance of Loom.
Path: system/version
Method: GET
Parameters: none
Returns: SystemVersion struct (see above)
Get Configuration
Returns the configuration of the Loom instance, as defined in the ‘loom.properties’ file.
Path: system/config
Method: GET
Parameters: none
Returns: SystemConfig struct (see above); e.g.
{"activeScan.hdfs.enabled":true,"activeScan.hdfs.baseDir":["/data/dataset"],
"activeScan.hdfs.scanIntervalMinutes":60,"activeScan.hdfs.parseLines":50,
"activeScan.hdfs.scoreThreshold":0.25,"activeScan.hdfs.maxBufferSize":8388608,
"persist.mode":"hdfs","dataset.persist.dir":"data/loom-datasets",
"security.enabled":false,"security.authentication":"disabled",
"ssl.enabled":false,"ssl.port":8443,"jobService.threadPool.size":10}
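Note that the configuration comes back as a flat map whose keys contain dots, rather than as nested objects. A client might read a few settings from it as sketched below; the sample is a shortened version of the response shown above, the summary format is purely illustrative, and the fallback of 60 minutes is the documented default for the scan interval.

```python
import json

# Shortened system/config response, for illustration.
sample = (
    '{"activeScan.hdfs.enabled":true,'
    '"activeScan.hdfs.baseDir":["/data/dataset"],'
    '"activeScan.hdfs.scanIntervalMinutes":60,'
    '"security.enabled":false,"ssl.enabled":false,"ssl.port":8443}'
)


def summarize_config(text):
    """Return a few human-readable facts from a system/config response."""
    config = json.loads(text)
    lines = []
    if config.get("activeScan.hdfs.enabled", False):
        dirs = ", ".join(config.get("activeScan.hdfs.baseDir", []))
        minutes = config.get("activeScan.hdfs.scanIntervalMinutes", 60)
        lines.append(f"scanning {dirs} every {minutes} min")
    scheme = "https" if config.get("ssl.enabled", False) else "http"
    lines.append(f"scheme: {scheme}")
    return lines


print(summarize_config(sample))
```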
Log a Message
Writes a message to the Loom system log. This is useful for ‘tagging’ activities before or after they
are performed. Will emit the username from the current session.
Path: system/log_message
Method: POST
Parameters: level: The log level. One of ‘info’, ‘debug’, ‘trace’, ‘fatal’, ‘warn’.
message: The message to write
Returns: none; writes a message to the txnlog, e.g.
2013-08-08 10:41:03,701- INFO - [fabric.txnlog] - [nREPL-worker-36] - Logged by <gary>: THIS IS A MESSAGE
Backup Registry
Get the contents of the registry; for release migration. This can be restored to a new registry using
/system/restore.
Path: system/backup
Method: GET
Parameters: none
Returns: array of entity objects
Restore a Backup
Fill the registry with the backed up contents from another registry. The backup contents are
obtained from /system/backup.
Path: system/restore
Method: POST
Parameters: array of entity objects wrapped in ‘results’ (output from backup)
Returns: array of entity/id’s of restored objects
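For a release migration, the output of system/backup is fed to system/restore on the new registry. The helper below sketches how the restore body might be prepared. It is an assumption-laden illustration: it accepts either a map that already carries the entities under ‘results’ or a bare array, since the exact shape of the backup response is not spelled out here, and the entity IDs in the demonstration are placeholders.

```python
import json


def restore_body_from_backup(backup_response):
    """Turn a decoded system/backup response into a system/restore body."""
    if isinstance(backup_response, dict) and "results" in backup_response:
        entities = backup_response["results"]
    else:
        entities = backup_response  # assume a bare array of entity objects
    if not isinstance(entities, list):
        raise TypeError("expected an array of entity objects")
    return json.dumps({"results": entities}).encode("utf-8")


# Placeholder entities; a real backup holds full entity objects.
backup = {"results": [{"entity/id": "a"}, {"entity/id": "b"}]}
body = restore_body_from_backup(backup)
print(len(json.loads(body)["results"]))
```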
___________________________________________
Registry Model Notes
Registry Model Overview
The following figure shows a high-level view of the Loom registry model. The model is comprised
of a set of ‘domain’ models, which define a set of connected Entity Types. The models themselves
have dependencies based on cross-model relationships between entity types in the domains.
Some domain models are extensible, and have specific sub-models to expose specific
functionality.
The primary 3 entity types are Source, Dataset, and Process. Source and Dataset models
represent unmanaged and managed sets of data, respectively; they both are derived from the base
DataContainer, which contains DataUnits (e.g., tables) with schema information. Data containers
and data units have underlying persistent storage, which is proxied by Storage (container) and
storage units. The Process entity represents processing performed on data entities, in a generic
form.
Lineage is derived from the inter-connected instances of data entities (containers or data units) and
processes. In order to compute valid lineage, a ‘snapshot’ of a process must be taken when it is
used (executed, or just used in a relationship). This provides the immutability of processes from a
lineage perspective. From a data perspective, immutability is provided by making dataset units
(i.e., tables) non-modifiable once they have been used.
Core Entity Attributes
All entities have the following core attributes. In addition to these, entities of each type have their
own type-specific attributes.
Name [Type]                        Description
entity/id [string]                 Unique identifier for the entity, in the form of a UUID. Entity IDs are used for references between entities.
entity/name [string]               Name of the entity.
entity/description [string]        Description of the entity.
entity/folder [string]             Registry folder in which the entity is organized.
entity/tags [array of string?]     Tags applied to the entity by users.
entity/createdAt [instant (long)]  Timestamp when the entity was created.
entity/createdBy [string]          User ID of the person who created the entity.
entity/modifiedAt [instant (long)] Timestamp when the entity was last modified.
entity/modifiedBy [string]         User ID of the person who last modified the entity.
Usage Scenarios
Basic Entity/Metadata Operations
Create a Source
A data source is a set of data whose lifecycle is not managed and controlled by Loom. These may
be registered with Loom as a Source. This yields the following benefits:
1. visibility of the source to a wider set of users
2. definition and management of metadata describing the source
3. determination of how to read and parse the data
4. understanding of the data characteristics, through descriptive statistics and other
measurements
5. first step in creating a dataset for more thorough data preparation and analysis
6. ability to define relationships between sets of data
Here is how you create a Source entity in Loom, to represent a data source in its ‘native’ form.
This version allows for user interaction; there is a more streamlined version if the user knows all
information up-front. The example is for a text file-based source.
Step  Description                                       Notes/Methods
1     Identify location                                 Input to API
2     Identify storage type, and format type            Inputs to API
3     Read source metadata, return structures           GET /sources/default
4     Identify which parts of the source to include     Set ‘containsData’
5     Review raw data in files                          GET /data/file/read-lines
6     Review parsed data based on current format        GET /data/file/read-parsed
7     Modify format characteristics to best parse data  Change Format properties
8     Register the source with Loom                     POST /sources
Create a Dataset
A dataset is a set of data that is created by Loom, and whose lifecycle is controlled and managed
by Loom. These are represented by Datasets in Loom. Datasets may be initially created in Loom
from Sources; thereafter, any processing that is performed on a dataset in Loom will result in the
automatic creation and registration of another dataset. Datasets are used for data preparation --
and subsequently data analysis -- using Loom.
The benefits of defining a dataset are:
1. visibility of the dataset to a wider set of users
2. definition and management of metadata describing the dataset
3. Loom controls how the data are persisted, and can optimize this for efficient processing
4. understanding of the data characteristics, through descriptive statistics and other
measurements
5. used for data preparation and cleansing in Loom
6. relationships between datasets automatically captured when processing occurs
Here is how you create a Dataset entity in Loom from an existing Source, to represent a managed
set of data. This version involves user interaction; there is a more streamlined sequence of steps if
no user interaction is required.
Step  Description                                 Notes/Methods
1     Get default dataset from registered source  GET /datasets/default <src>
2     Change which data units to include
3     Change schema information                   Edit data unit Schemas
4     Register the dataset with Loom              POST /datasets
Create a Process
A process represents some processing of data. Data resides in Sources and Datasets, so
processing acts upon those (specifically, upon the data in tables in them). A process is
represented by a Process entity in Loom. A process is similar to a function or method in a
programming language -- it has a name, some input parameters, and some output parameters. In
the case of Loom, there is also the notion of data contexts, which arguments may bind to during
execution.
There are a variety of types of processes (including any type a user wants to define). The most
common when using Loom for data processing is a SQL Query.
Here is how you create a Process entity in Loom. This version involves user interaction; there is a
more streamlined sequence of steps if no user interaction is required.
Step  Description                                    Objects/Methods
1     Define the input arguments (name-value pairs)  Arguments
2     Define the input and output data contexts      Contexts
3     Define the local Process                       Process
4     Register the Process with Loom                 POST /processes
Overlay Metadata on Corporate Resources
Another set of use cases involves using Loom primarily as a metadata registry, to attach metadata,
manage resources, define relationships, and determine lineage between data resources that are
generated and used outside of Loom.
Here is the basic sequence of steps. The full sequencing of each activity is consolidated down to
one primary method for each; the additional details are covered above.
Step  Description                               Objects/Methods
1     Register Sources for each set of data     POST /sources
2     Register Processes                        POST /processes
3     ‘Use’ Processes to link Sources together  POST /processes/<id>/uses
4     Compute lineage relative to a Source      GET /search/lineage
API Examples
Source
The following is an example of the composite structure that is returned from and passed into the
various /sources methods.
JSON Composite Structure
{
"entity":
{
"entity/type": [
"data/DataContainer",
"source/Source"
],
"entity/name": "ModifiedName",
"entity/folder": "test/test2/test3",
"entity/description": "Created by RLoom",
"entity/tags": "RLoom",
"data/structuralForm": "table",
"source/dataAccessible": true,
"source/metadataAccessible": true,
"source/entityState": "active",
"source/expandable": true,
"data/dataUnit": []
},
"storage":
{
"persist/format": {
"persist.file.text/headerRow": true,
"entity/type": [
"persist/StorageFormat",
"persist.file/DelimitedFormat"
],
"persist/formatType": "text/delim",
"persist.file.delim/delimiter": ",",
"persist.file.delim/quoteChar": "\"",
"persist.file.text/skipRows": 0
},
"entity/type": [
"persist/Storage",
"persist.file/FileSet"
],
"persist/storageType": "file/text",
"persist/location": "/data/datasets/earthquakes",
"persist/application": "",
"persist.file/isSingleFile": true,
"persist/storageUnit": []
},
"storage_units":
[
{
"entity/type": [
"persist/StorageUnit",
"persist.file/FileSetFile"
],
"persist/location": "file:///data/datasets/earthquakes/earthquakes.ddl",
"persist/relativeLocation": "earthquakes.ddl",
"persist/containsData": false,
"persist.file/fileExtension": "ddl",
"persist.file/isLogicalFile": false,
"persist.file/isBinary": false
},
{
"entity/name": "eqs7day",
"entity/type": [
"persist/StorageUnit",
"persist.file/FileSetFile"
],
"persist/location": "file:///data/datasets/earthquakes/eqs7day.csv",
"persist/relativeLocation": "eqs7day.csv",
"persist/containsData": true,
"persist.file/fileExtension": "csv",
"persist.file/isLogicalFile": false,
"persist.file/isBinary": false
},
{
"entity/type": [
"persist/StorageUnit",
"persist.file/FileSetFile"
],
"persist/location": "file:///data/datasets/earthquakes/README.txt",
"persist/relativeLocation": "README.txt",
"persist/containsData": false,
"persist.file/fileExtension": "txt",
"persist.file/isLogicalFile": false,
"persist.file/isBinary": false
}
]
}
Process
The following is an example of the structure that is returned from and passed into the various
/processes methods.
JSON for SQL Transform with no Default Input Context
The following is an example of the JSON that can be POSTed to /processes to define an
executable ad-hoc SQL process that can be executed through Loom. This particular example does
not have a default input context defined.
{
"entity/description": "Created by RLoom",
"process/argument": [
{
"entity/type": [ "process/Argument", "process/ConfigArgument" ],
"entity/name": "transformText"
}
],
"entity/type": [ "process/Process" ],
"process/processClass": "transform",
"process/processType": "sql-query",
"process/processScope": "dataunit",
"entity/name": "SQLProcess_NoContexts",
"entity/folder": "test",
"process/isExecutable": true,
"entity/tags": "RLoom"
}
JSON for Linking 3 Sources
The following is an example of the JSON that can be POSTed to /processes to define a
‘descriptive’ (non-executable) process to link 2 sources as inputs with 1 as output.
{
"entity/type": [ "process/Process" ],
"entity/name": "ProcessLinkingSources",
"entity/description": "2 input sources, 1 output",
"entity/folder": "test1/test2",
"entity/tags": "tag1, tag2",
"process/processClass": "descriptive",
"process/processType": "lineage-process",
"process/processScope": "container",
"process/isExecutable": false,
"process/argument": [
{
"entity/type": [ "process/Argument", "process/ConfigArgument" ],
"entity/name": "relationship",
"process.arg/value": "link"
},
{
"entity/type": [ "process/Argument", "process/ConfigArgument" ],
"entity/name": "source1",
"process.arg/value": "Source1"
},
{
"entity/type": [ "process/Argument", "process/ConfigArgument" ],
"entity/name": "source2",
"process.arg/value": "Source2"
},
{
"entity/type": [ "process/Argument", "process/ConfigArgument" ],
"entity/name": "source3",
"process.arg/value": "Source3"
}
],
"process/context": [
{
"entity/type": [ "process/Context" ],
"entity/name": "Source1",
"process.context/inout": "in",
"process.context/container": "51ee9a72-af33-4798-97dd-15b71ea86d49"
},
{
"entity/type": [ "process/Context" ],
"entity/name": "Source2",
"process.context/inout": "in",
"process.context/container": "51ee9a74-fdc2-47c9-a9c3-203f165f353d"
},
{
"entity/type": [ "process/Context" ],
"entity/name": "Source3",
"process.context/inout": "out",
"process.context/container": "51ee9a77-e4d6-4135-9160-2ea07d565927"
}
]
}
Executing/Using Processes
Processes are used, or realized, either through execution (for executable processes) or by
explicitly defining a ‘use’ (for non-executable processes). In both cases, a ProcessUse is created.
A ProcessUse is a snapshot of the Process at the instant the process was used.
When using a process, the data contexts must be specified. (Default input contexts from the
Process may be used, or new input contexts may be specified; output contexts must always be
specified.) A context represents a single set of data involved in the processing. If the process is
scoped at the container level, then the contexts will contain only data container references (i.e.,
references to Sources and Datasets, one per context). If the process is data unit scoped, then the
contexts will have not only a container reference, but also the name of a data unit (i.e., table) within
the container. In addition to a name, a container, and an optional data unit name, each context
must specify whether it is used for input or output.
When executing or using a process through the API, an array of Contexts is provided.
JSON for Array of Contexts, Container-Level Process
This is an example of the JSON for the ‘contexts’ parameter for POST /execute/transform and POST
/processes/<process_id>/uses.
{
"contexts": [
{
"entity/type": [ "process/Context" ],
"entity/name": "Source1",
"process.context/inout": "in",
"process.context/container": "51ee9a72-af33-4798-97dd-15b71ea86d49"
},
{
"entity/type": [ "process/Context" ],
"entity/name": "Source2",
"process.context/inout": "in",
"process.context/container": "51ee9a74-fdc2-47c9-a9c3-203f165f353d"
},
{
"entity/type": [ "process/Context" ],
"entity/name": "Source0",
"process.context/inout": "out",
"process.context/container": "51ee9a77-e4d6-4135-9160-2ea07d565927"
}
]
}
JSON for Array of Contexts, DataUnit-Level Process
This is an example of the JSON for the ‘contexts’ parameter for POST /execute/transform and POST
/processes/<process_id>/uses. This would be similar to contexts used in single-table SQL query
executions (1 table in, 1 table out).
{
"contexts": [
{
"entity/type": [ "process/Context" ],
"entity/name": "input",
"process.context/inout": "in",
"process.context/container": "520d1f28-35e2-47e8-aa80-6509d674bda1",
"process.context/dataUnitName": "eqs7day"
},
{
"entity/type": [ "process/Context" ],
"entity/name": "output",
"process.context/inout": "out",
"process.context/container": "520d1f28-35e2-47e8-aa80-6509d674bda1",
"process.context/dataUnitName": "Result01"
}
]
}
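Context arrays like the ones above are repetitive enough that a client may prefer to generate them. The helper below is a small sketch of that idea; make_context is a hypothetical name, not part of the Loom API, and the container ID and data unit names are taken from the data unit level example.

```python
import json


def make_context(name, inout, container_id, data_unit_name=None):
    """Build one process/Context map in the shape used by the examples."""
    if inout not in ("in", "out"):
        raise ValueError("inout must be 'in' or 'out'")
    context = {
        "entity/type": ["process/Context"],
        "entity/name": name,
        "process.context/inout": inout,
        "process.context/container": container_id,
    }
    if data_unit_name is not None:
        # Only data unit scoped processes name a table within the container.
        context["process.context/dataUnitName"] = data_unit_name
    return context


container = "520d1f28-35e2-47e8-aa80-6509d674bda1"
payload = {
    "contexts": [
        make_context("input", "in", container, "eqs7day"),
        make_context("output", "out", container, "Result01"),
    ]
}
print(json.dumps(payload, indent=2))
```

Omitting data_unit_name yields the container-level form, so the same helper covers both scopes.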
API Notes
Notes on HTTP calls
Each API is accessed by an HTTP URL whose path starts with the root path of the API. The
individual API operations are addressed both by extensions to this root path, and by the HTTP
methods used to access the URL.
Methods
Some HTTP clients (e.g. browsers) may not support all of the methods used by an API. In this
case, a POST operation may be used in conjunction with a URL parameter named _method. The
value of the _method parameter will be used to override the POST method.
As an example, if it is not possible to send a PATCH request to the URL:
http://somehost.com/path/operation
then a POST may be sent instead to the URL:
http://somehost.com/path/operation?_method=PATCH
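A client that cannot issue the intended method directly could derive the override URL as sketched below. The function name is illustrative only; it appends the _method parameter and preserves any query parameters already present.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_method_override(url, method):
    """Return url with a _method query parameter set to the intended method."""
    parts = urlsplit(url)
    # Drop any existing _method value before adding the new one.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "_method"]
    query.append(("_method", method.upper()))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_method_override("http://somehost.com/path/operation", "patch"))
```

The resulting URL is then requested with POST.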
Host and Port
When referencing the Loom Server via its URL, the host name must be resolvable to the server that
Loom is running on, either through DNS or through the local hosts file of the machine making the
request (an explicit IP address may also be used).
The default port is 8080, although that can be changed when the Loom Server is started. Firewalls
should be configured to allow the host machine to accept connections to this port number.
Input Parameters
Most API operations require several parameters. For GET requests, these parameters are
provided as part of the URL. For other HTTP requests (POST, PUT, PATCH), the parameters are
provided in the body, as a JSON map of parameter names to values. Occasionally an unnamed
parameter may appear in the path, as is often seen in REST operations. This is the case, for
example, when performing operations on a specific entity instance, in which case, the entity ID is
part of the URL.
_method Parameter
Due to URL length limitations, it is sometimes not possible to put the required input parameters in
the URL for a GET request. In those cases, the special _method parameter should be placed in
the operation URL, and a POST operation should be used. The back-end service will interpret the
request as an HTTP GET, but will look for the parameters in the request body, rather than in the
URL.
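The fallback can be sketched as follows. The 2000-character threshold is an assumed figure (actual limits depend on the client, proxies, and server), the host and port are placeholders, and sending the parameters as a JSON map in the body follows the Input Parameters note above. Requests are built but not sent.

```python
import json
import urllib.request
from urllib.parse import urlencode

MAX_URL_LENGTH = 2000  # assumed limit; adjust for your environment


def build_get(url, params):
    """Build a GET request, or a POST with _method=GET if the URL is too long."""
    full_url = url + "?" + urlencode(params) if params else url
    if len(full_url) <= MAX_URL_LENGTH:
        return urllib.request.Request(full_url, method="GET")
    # Too long for a URL: move the parameters into a JSON body.
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url + "?_method=GET",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


endpoint = "http://localhost:8080/api/v1/environ/fs/list_info"
request = build_get(endpoint, {"location": "/tmp"})
print(request.get_method(), request.full_url)
```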
Responses
All responses are JSON map structures in the HTTP body. This is consistent even when a
response holds a single value or an array.
Reference properties
Many entities contain references by ID to other entities. For instance, a job currently contains a
reference to a query, an input dataset, and an output dataset. Invariably, a client (e.g. the UI)
requires the name and folder path of the referred-to entity in order to display the reference to the
user. In addition, the ID is used to construct a hyperlink to the referred-to entity. The API provides
a way to return this information, in the ‘related’ part of the standard response structure.
References and composites
Certain kinds of references are not to named entities but are instead references to internal entities.
For instance a DataUnit contains a reference to a Schema entity, which contains references to
Column entities, and so on. In most cases, those kinds of references to internal objects are
expanded into nested JSON objects when passed through the API.
Serialization
Entities are serialized as JSON.
In general, a sub-graph is serialized as a nested JSON structure, where references (UUID ‘pointer’
attributes) and composite structs are treated similarly, as nested values under a parent.