Devops Meta - private metadata governing data as context


Contents & topics private metadata Data governance

Patterns as operational constructs.

Private metadata: why is it critical?

There are a lot of questions to answer:
📚 What information is the data describing?
⚙ What are the relationships between data elements?
🎭 Who is using the data, and for what process?
⚖ What is the inventory of information being used?

🔰 Somewhere in a loop of patterns ..
Most logical back reference: previous.


Reference            Topic                                            Squad
Intro                Patterns as operational constructs.              01.01
Foundation           Why using a private metadata.                    02.01
define private meta  Defining a full domain private metadata.         03.01
private meta pull    Use full private metadata & derivates (pull).    04.01
private meta push    Use full private metadata & derivates (push).    05.01
Pattern chain        Defining the chain - Transformations.            06.00
                     Dependencies other patterns                      06.02


Duality service requests

sdlc: the naming and description either get hidden behind user front-ends or become a vocabulary for those who work more directly with the data.

bianl: A complete pull-push delivery of information is possible once all those elements are defined. A description for human readability goes along with it.


Why using a private metadata.

With a "private metadata" approach, my goal is to have a well-defined table holding the descriptions of all data elements being used. This table should be available in the data transforming and data generating process in such a way that coding efforts are minimised by using generic patterns.

The full process circle transforming and generating information.
Information, represented by data, is the input for the process. The output delivery is also data. This is the horizontal flow from left to right. Intermediate data is used during the pull request or push delivery; it is found on the vertical axis.

Information / data is processed by code guided by metadata. When code and metadata cooperate, the structure becomes a standard pattern. Code with the metadata is found on the diagonals. Having four stages, it is a circular process.
In a figure:
[Figure: full data and code process circle]
Stage four is the verification and planning of elements that are needed or nice to have.
The model pattern in this figure acknowledges all four stages. Standard ERP and DWH solutions avoid stage IV and minimize the balances and checks in stage III.

Defining a full domain private metadata.

A full domain has all descriptions, with different versions of an element for unit size corrections and time displacements.
This obviously adds some duplicates, which disappear after adjustments in element value and/or key identifiers.
Using the full private metadata IV III I
Without private metadata all logic must be built using data model structures.
Focussing on code with cooperating private metadata directs the thinking towards what transformations are needed.
A structure for a metadata model hierarchy is like:
  1. 🎭 Specification variable definition
    1. The owner information domain.
    2. High level subarea within the owner.
    3. Within the high level subarea, a collection identifier for a similar topic and similar interval deliveries.
    4. Whether it is a delivered value or one calculated from what has been delivered. (cross check content)
    5. Rubric indicator(s) valid for the collection identifier
  2. 🎭 Specification variable definition
    1. A global element identifier for what the element value is describing.
      eg: male - female - human in numbers, possibly by age groups
      eg: booked amounts, estimates, expected total amount.
    2. Some code, preferably reusing those that are already in use, describing the intention and meaning of the elements.
  3. 🎭 Specification variable definition
    1. The goal is making it a unique detailed code for this element.
      When there are already conventions in use, copy those.
  4. ⚙ Attributes
    1. Version type of the element value for unit corrections
    2. Time (date) displacement used for this element, adjusted to the time (date) key identifier value of the delivery.
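The four-part hierarchy above can be read as a compound key. The following is a minimal Python sketch of that reading, not the original implementation (which used SAS, Excel and Oracle). The class name `ElementKey`, the mapping of the three specification levels to `spec1`, `spec2`, `spec3`, and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ElementKey:
    """Hypothetical compound key for one described data element."""
    spec1: str   # level 1: owner domain / subarea / collection (assumed coding)
    spec2: str   # level 2: global element identifier
    spec3: str   # level 3: unique detailed code for the element
    attrib: str  # attributes: version type / time displacement


# Invented sample keys; "v1" and "v2" stand for two versions of one element.
keys = [
    ElementKey("fin", "amount", "estim", "v1"),
    ElementKey("fin", "amount", "booked", "v2"),
    ElementKey("fin", "amount", "booked", "v1"),
]

# order=True compares field by field, so sorting follows the hierarchy.
ordered_keys = sorted(keys)
```

Because the key is frozen it is hashable, so it can serve directly as a dictionary key for lookups of descriptions.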
With private metadata supporting the logic, data models are built in an assembly line to become useful. There are undefined intermediates that will be a result of process transformations. In a figure:
[Figure: focus on code and metadata in the process circle]
When the input already has multiple columns, the transpose is still a needed construct. This happens when the target object of analysis in the output differs from what is in the input.

Use full private metadata & derivates (pull).

Having set what the key values for the private metadata table should cover and what the basic four identifiers are, realisation starts. The tools used impose limitations. In my real-life experience SAS, Excel and Oracle were used, and the examples here are based on that. There is a limit of 32 characters for a variable name and no limit on the number of columns. Other limits were not relevant.

Variable naming is kept as simple as possible by avoiding spaces and special characters in variable names; only the underscore is allowed. The underscore is used to make a logical separation in the variable name.
In the program code, variable names must be used that do not conflict with the names in the private metadata. The private metadata table and other tables also need a naming convention. The convention for table column variables is a three-level structure, with the levels connected by underscores.
  1. ($*) variable domain, eg: vcd (private metadata) state (data source)
  2. ($*) unique identifier, eg: spec spec1 spec2 spec3 attrib type
  3. ($*) usage type, eg: idvar (variable) idjmd (date content yyyymmdd format)
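The three-level convention can be checked mechanically. Below is a small Python sketch under stated assumptions: the helper names `make_name` and `split_name` are hypothetical, and the rule that each part holds only letters and digits is my reading of "only the underscore is allowed" as a separator. The 32-character limit is the one mentioned in the text.

```python
import re

MAX_NAME_LENGTH = 32  # limit for a variable name mentioned in the text

# Assumption: each level starts with a letter and holds only letters/digits,
# so the underscore appears purely as the separator between levels.
_PART = re.compile(r"^[A-Za-z][A-Za-z0-9]*$")


def make_name(domain: str, identifier: str, usage: str) -> str:
    """Join domain, identifier and usage type into one column name."""
    parts = (domain, identifier, usage)
    for part in parts:
        if not _PART.match(part):
            raise ValueError(f"invalid name part: {part!r}")
    name = "_".join(parts)
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name longer than {MAX_NAME_LENGTH} characters: {name}")
    return name


def split_name(name: str) -> tuple:
    """Split a column name back into its three levels."""
    domain, identifier, usage = name.split("_")
    return domain, identifier, usage
```

For example, `make_name("vcd", "spec1", "idvar")` gives `vcd_spec1_idvar`, and `split_name("state_type_idjmd")` gives the domain `state`, the identifier `type` and the usage type `idjmd`.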

The full private metadata table
The private metadata table looks like:
  1. 🎭 vcd_spec1_idvar $8 , logical key variable
  2. 🎭 vcd_spec2_idvar $8 , logical key variable
  3. 🎭 vcd_spec3_idvar $8 , logical key variable
  4. ⚙ vcd_attrib_idvar $8 , logical key variable
With a compound index on the key variables, and the table sorted in the order of the index, performance will not raise questions.
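As an illustration of the compound index, here is a sketch using Python's built-in `sqlite3` module rather than the SAS or Oracle tooling of the original. The four key column names come from the list above; the description column `vcd_descr`, the index name `vcd_key`, the function `describe` and the sample rows are assumptions. SQLite does not enforce the `$8` width.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vcd (
        vcd_spec1_idvar  TEXT NOT NULL,
        vcd_spec2_idvar  TEXT NOT NULL,
        vcd_spec3_idvar  TEXT NOT NULL,
        vcd_attrib_idvar TEXT NOT NULL,
        vcd_descr        TEXT  -- hypothetical description column
    )
""")
# One compound index over all four logical key variables.
conn.execute("""
    CREATE UNIQUE INDEX vcd_key ON vcd (
        vcd_spec1_idvar, vcd_spec2_idvar, vcd_spec3_idvar, vcd_attrib_idvar
    )
""")
# Invented sample rows.
conn.executemany(
    "INSERT INTO vcd VALUES (?, ?, ?, ?, ?)",
    [
        ("fin", "amount", "booked", "v1", "booked amount"),
        ("fin", "amount", "estim", "v1", "estimated amount"),
    ],
)


def describe(spec1, spec2, spec3, attrib):
    """Key lookup of a description; returns None when the key is unknown."""
    row = conn.execute(
        "SELECT vcd_descr FROM vcd WHERE vcd_spec1_idvar = ? "
        "AND vcd_spec2_idvar = ? AND vcd_spec3_idvar = ? "
        "AND vcd_attrib_idvar = ?",
        (spec1, spec2, spec3, attrib),
    ).fetchone()
    return row[0] if row else None
```

The unique index also guards against describing the same element twice under one key.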

Using private metadata IV (template)
Four columns are required in an Excel table on a reserved sheet whose goal is data transport. After the Excel table has been retrieved, the other columns of the private metadata are joined (key lookup). The four columns are:
  1. ⚙ vcd_spec_idvar $8 , logical key variable

Using private metadata III (staging)
There are only two column variables having values stored in a row. All other columns are identifiers or markers of when the elements have been processed. This is a minimized set of what is needed to store and archive any element. The size of the dataset can be kept small when partitioned, eg on years from state_type_idjmd.
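Partitioning the staging rows on the year of `state_type_idjmd` could look like the sketch below. It assumes the yyyymmdd format named earlier for `idjmd` columns. The source does not name the two value columns, so `val_num` and `val_chr` are hypothetical, as are the function name and the sample rows.

```python
from collections import defaultdict

# Invented staging rows: identifiers plus two value columns.
# val_num / val_chr are hypothetical names; the source leaves them unnamed.
staging_rows = [
    {"vcd_spec3_idvar": "booked", "state_type_idjmd": "20190131",
     "val_num": 100.0, "val_chr": ""},
    {"vcd_spec3_idvar": "estim", "state_type_idjmd": "20190131",
     "val_num": 120.0, "val_chr": ""},
    {"vcd_spec3_idvar": "booked", "state_type_idjmd": "20200131",
     "val_num": 90.0, "val_chr": ""},
]


def partition_by_year(rows):
    """Group rows by the yyyy part of the yyyymmdd date key."""
    partitions = defaultdict(list)
    for row in rows:
        year = row["state_type_idjmd"][:4]
        partitions[year].append(row)
    return dict(partitions)


partitions = partition_by_year(staging_rows)
```

Each partition can then be stored or archived separately, which keeps the working dataset small.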

Use full private metadata & derivates (push).

In the push stage the attribute key must be removed by deduplication. The result is a deduplicated version of the private metadata table (vdc).
Adding calculations (calc) is a new part of private metadata: adjusting values and adding adjusted keys.
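Removing the attribute key by deduplication can be sketched as follows. This is an illustration in Python with invented rows; the function name `drop_attribute_key` is hypothetical, and the row layout (three specification keys followed by the attribute key) is assumed from the table described earlier.

```python
# Invented rows: (spec1, spec2, spec3, attrib)
vcd_rows = [
    ("fin", "amount", "booked", "v1"),
    ("fin", "amount", "booked", "v2"),
    ("fin", "amount", "estim", "v1"),
]


def drop_attribute_key(rows):
    """Drop the attribute key and keep each remaining key once, in order."""
    seen = set()
    result = []
    for spec1, spec2, spec3, _attrib in rows:
        key = (spec1, spec2, spec3)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


deduplicated = drop_attribute_key(vcd_rows)
```

The two versions of the `booked` element collapse into a single row, which is the effect described for the push stage.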

Using private metadata I (semantic)
Before the wanted tables can be created, adjustments in values and keys are done in an intermediate step. Selecting a period reduces the size of the data in the working process. The name of the table this element is going to is defined.

Creating and adding calculation definitions
Additional private metadata is added for calculations on the data. A list of possible elements is checked against the elements that are actually present, removing the ones that do not exist. The calculation metadata table looks like:
  1. 🎭 vcd_spec1_idvar $8 , logical key variable
  2. 🎭 vcd_spec2_idvar $8 , logical key variable
  3. 🎭 vcd_spec3_idvar $8 , logical key variable
  4. ⚙ vcd_attrib_idvar $8 , logical key variable
The resulting metadata table (def) has the same structure. It gets the name of the table that will be created, with the suffix "e.def".
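The check of possible elements against the elements actually present amounts to a filter. A minimal Python sketch, with invented element keys and a hypothetical function name `keep_present`:

```python
# Invented keys: (spec1, spec2, spec3)
possible_elements = [
    ("fin", "amount", "booked"),
    ("fin", "amount", "estim"),
    ("fin", "amount", "total"),
]
present_elements = {
    ("fin", "amount", "booked"),
    ("fin", "amount", "estim"),
}


def keep_present(possible, present):
    """Keep possible elements that exist in the data, preserving order."""
    return [element for element in possible if element in present]


calc_elements = keep_present(possible_elements, present_elements)
```

Here `("fin", "amount", "total")` is dropped because no such element was delivered, so no calculation definition is generated for it.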

Using private metadata II (delivery)
From staging, the goal is delivering tables with columns holding elements to analyse and useful key variables for identification and grouping. The transpose formation in the figure is the logic:
[Figure: transposing information into useful information]
💡 This process is repeated for all wanted tables, identification and grouping. The focus can be on an entity or on elements.
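The transpose from staging rows (one element value per row) to a delivery table (one column per element) can be sketched in plain Python. The function name `transpose`, the column name `val_num` and the sample rows are assumptions for illustration; the original used SAS for this step.

```python
def transpose(rows, key_cols, element_col, value_col):
    """Turn one-value-per-row input into one record per key combination."""
    table = {}
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        # Start each record with its key columns, then add element columns.
        record = table.setdefault(key, dict(zip(key_cols, key)))
        record[row[element_col]] = row[value_col]
    return [table[key] for key in sorted(table)]


# Invented staging rows.
long_rows = [
    {"state_type_idjmd": "20190131", "vcd_spec3_idvar": "booked", "val_num": 100.0},
    {"state_type_idjmd": "20190131", "vcd_spec3_idvar": "estim", "val_num": 120.0},
    {"state_type_idjmd": "20200131", "vcd_spec3_idvar": "booked", "val_num": 90.0},
]

wide_rows = transpose(
    long_rows,
    key_cols=["state_type_idjmd"],
    element_col="vcd_spec3_idvar",
    value_col="val_num",
)
```

Choosing different `key_cols` changes the grouping, which is how the same staging data can feed several wanted tables.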


Defining the chain - Transformations.

Pattern: Information private metadata.
Make the logical processing guided by private metadata.
  1. Define a full private metadata table
  2. Have all data elements being described by the full private metadata table
  3. Add another private metadata table for calculations in the push stage
  4. Let every stage (IV III I II) run under control of the private metadata
  5. Run the process transforming data with a small set of code (sources)

Required other Patterns.
💡 The Information pull request push delivery is complementary to this pattern.

Conflicting other Patterns.
⚠ The transposing of elements is not a standard transformation. It will work well in an analytics environment. For transactional operational systems it will not be responsive enough.
⚠ This approach breaks with the modelling of the intermediate data (semantic). No star schema or data vault, only well-described elements and a minimized set of transformations.
Imagine all data in the following figure being forced into a fixed structure (data model). The transformations can get very complicated. The worst scenario would be using this pattern four times (diagonals) to solve that at the four stages.
[Figure: focus on data in the process circle]
Dependencies other patterns
Requesting data (pull), securing information: this page is a pattern on pull requests for information.

Within the scope of metadata there are more patterns, like exchanging information (data), building private metadata and securing a complete environment.



© 2012,2020 J.A.Karman