Process Hierarchy Catalogue
One of Architector’s tenets concerning data lineage is that data only changes or moves because of a process. Therefore, Architector maintains a catalogue of the processes that are part of the data lineage. As in other catalogues, a process is defined by a set of required attributes (name, description, etc.), and you can add your own custom attributes as needed.
Why Define a Process Hierarchy?
The process hierarchy is one thing that sets Architector apart from almost all other data lineage tools.
Getting the process hierarchy right makes the data lineage much more meaningful to both business people and technicians. Questions of data ownership and accountability for data issues become much clearer, and risks (e.g. to data quality) become much easier to define and assess.
Note that a lineage team does not need to define the full process hierarchy at the outset. You can define a skeleton hierarchy and develop it as the team’s understanding grows.
Here is a screenshot of a process catalogue, where you can see a typical hierarchy:
Architector’s process catalogue defines three main types of process:
- Atomic: the lowest level of process, often a technical process. Atomic processes have physical data inputs and outputs. To support effective impact and root cause analysis, every output of an atomic process should be derived from every one of its inputs. If this is not the case, then we are probably not yet at the true atomic level.
- Straight-through: this is an aggregation of atomic processes. What characterizes a straight-through process is that there is no derivation of new data. All the inputs simply flow through to corresponding outputs in the output datastore.
- Complex: this is an aggregation of atomic processes, straight-through processes, and other complex processes. The highest level of process in an organization is usually the most complex, as it is made up of many smaller processes.
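The three process types and the atomic-level rule can be illustrated with a small data model. This is only a sketch for reasoning about a hierarchy, not Architector's internal model or API: the names `ProcessKind`, `Process`, `derived_from` and `validate` are hypothetical, and the check that a straight-through process contains only atomic processes is an assumption based on the definitions above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ProcessKind(Enum):
    ATOMIC = "atomic"
    STRAIGHT_THROUGH = "straight-through"
    COMPLEX = "complex"


@dataclass
class Process:
    """Hypothetical catalogue entry; field names are illustrative only."""
    name: str
    kind: ProcessKind
    # Sub-processes; expected to be empty for atomic processes.
    children: list["Process"] = field(default_factory=list)
    # Atomic only: the physical inputs of the process.
    inputs: set[str] = field(default_factory=set)
    # Atomic only: for each physical output, the inputs it is derived from.
    derived_from: dict[str, set[str]] = field(default_factory=dict)


def validate(process: Process) -> list[str]:
    """Return rule violations found in this process and its descendants."""
    problems = []
    if process.kind is ProcessKind.ATOMIC:
        if process.children:
            problems.append(f"{process.name}: atomic process has sub-processes")
        # Atomic rule: every output should be derived from every input.
        for output, sources in process.derived_from.items():
            if sources != process.inputs:
                problems.append(
                    f"{process.name}: output {output} is not derived from all inputs"
                )
    elif process.kind is ProcessKind.STRAIGHT_THROUGH:
        # Assumed rule: a straight-through process aggregates atomic processes only.
        for child in process.children:
            if child.kind is not ProcessKind.ATOMIC:
                problems.append(
                    f"{process.name}: contains non-atomic process {child.name}"
                )
    for child in process.children:
        problems.extend(validate(child))
    return problems


# Example: the second process mixes two independent derivations,
# which suggests it is not yet at the true atomic level.
load = Process(
    "Load customer",
    ProcessKind.ATOMIC,
    inputs={"crm.customer"},
    derived_from={"dw.customer": {"crm.customer"}},
)
mixed = Process(
    "Load and score",
    ProcessKind.ATOMIC,
    inputs={"crm.customer", "risk.score"},
    derived_from={"dw.customer": {"crm.customer"}, "dw.score": {"risk.score"}},
)
onboarding = Process("Customer onboarding", ProcessKind.COMPLEX, children=[load, mixed])

for problem in validate(onboarding):
    print(problem)
```

In this example, "Load and score" is flagged for both outputs because each one depends on only one of the two inputs. Splitting it into two processes, one per output, would satisfy the rule.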
The Data Usage button queries the lineage metadata and pops up a view of the data inputs and outputs for the selected process. Data usages can also be edited directly from this screen.
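Conceptually, the data usage view is a filter over lineage metadata by process. The following sketch assumes the metadata can be represented as simple (process, data item, direction) records. The `DataUsage` record and the `data_usage` function are hypothetical and do not reflect how Architector actually stores or queries its metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUsage:
    """Hypothetical lineage metadata record."""
    process: str
    data_item: str
    direction: str  # "input" or "output"


def data_usage(records: list[DataUsage], process: str) -> dict[str, list[str]]:
    """Collect the inputs and outputs recorded for one process."""
    selected = [r for r in records if r.process == process]
    return {
        "inputs": sorted(r.data_item for r in selected if r.direction == "input"),
        "outputs": sorted(r.data_item for r in selected if r.direction == "output"),
    }


records = [
    DataUsage("Load customer", "crm.customer", "input"),
    DataUsage("Load customer", "dw.customer", "output"),
    DataUsage("Score customer", "dw.customer", "input"),
]
print(data_usage(records, "Load customer"))
```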
Process Hierarchy in Lineage Diagram
The lineage diagram displays the highest level of process by default, which creates the simplest lineage diagram. Then the user can drill down to lower-level processes as required, to focus on particular areas of the lineage. See lineage diagram for more information about how this works.
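The drill-down behaviour described above can be pictured as a view that starts with the top-level processes and replaces a selected process with its sub-processes on request. This is a minimal sketch of that idea under assumed names (`ProcessNode`, `LineageView`, `drill_down`); it is not the diagram's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessNode:
    """Hypothetical node in the process hierarchy."""
    name: str
    children: list["ProcessNode"] = field(default_factory=list)


class LineageView:
    """Tracks which processes are currently shown in the diagram."""

    def __init__(self, roots: list[ProcessNode]):
        # By default only the highest level of process is visible.
        self.visible = list(roots)

    def drill_down(self, name: str) -> None:
        """Replace the named process with its sub-processes, if it has any."""
        expanded = []
        for node in self.visible:
            if node.name == name and node.children:
                expanded.extend(node.children)
            else:
                expanded.append(node)
        self.visible = expanded

    def names(self) -> list[str]:
        return [node.name for node in self.visible]


onboarding = ProcessNode(
    "Customer onboarding",
    [ProcessNode("Capture application"), ProcessNode("Load customer record")],
)
view = LineageView([onboarding, ProcessNode("Reporting")])
print(view.names())  # top-level processes only
view.drill_down("Customer onboarding")
print(view.names())  # onboarding replaced by its sub-processes
```

Drilling down on a process with no sub-processes leaves the view unchanged, which matches the idea that atomic processes are the lowest level.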