dmeta pull push

Devops Meta - governing data as context

🎭 index references    elucidation    metier 🎭
👐 top    mid    bottom   👐

⚙  bpm   sdlc   bianl   data   meta   math   ⚙

Contents & topics: Data governance

Patterns as operational constructs.

Information pull request, why is it critical?

There are a lot of questions to answer:
📚 What information is the data describing?
⚙ What are the relationships between data elements?
🎭 Who is using the data, and for what process?
⚖ What inventory of information is being used?

🔰 Somewhere in a loop of patterns ..
Most logical back reference: previous.


Reference            Topic                                                          Squad
Intro                Patterns as operational constructs.                            01.01
pull-push            Why requesting in a pull before doing the delivery in a push.  02.01
IV III pull          Defining & realising the pull request.                         03.01
I II push            Defining & realising the push delivery.                        04.01
transpose transform  Transposing information and adding derived elements.           05.01
Pattern chain        Defining the chain - Transformations.                          06.00
Dependencies         Other patterns                                                 06.02


Duality service requests

sdlc: The variability in information types is so low that a transactional approach with relational physical modelling, from start to end, is workable for a business process.
When the process retrieves analytical information from other parties, see bianl.

bianl: The variability in information types can be so high that relational physical modelling becomes a problem in itself.

The delivery of information for further processing is best done in a relational format.
In the project with this information structure, the source data was not relational. The reason is the variation in requests and the way the data is collected. Only at the conceptual layer, with naming conventions, was it possible to define what can become relational.
Defining all the unique elements resulted in a table with approximately 6,000 rows. Postponing the technical and physical relational model to the moment of delivery made the process workable with very few lines of code.
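Such a conceptual-layer element table can be sketched as a small private metadata repository: one row per unique logical element name. This is a minimal illustration; the field names and sample elements below are hypothetical, not the project's actual definitions.

```python
# Minimal sketch of a private metadata repository: one entry per unique
# element, identified by a logical name following a naming convention.
# In the described project this table held roughly 6,000 such rows.
ELEMENT_REPOSITORY = {
    "cust_birth_date": {"type": "date",    "unit": None,  "since": "2012"},
    "cust_income_eur": {"type": "numeric", "unit": "EUR", "since": "2012"},
    "cust_region":     {"type": "char",    "unit": None,  "since": "2014"},
}

def lookup_element(name):
    """Return the definition of a logical element, or None if unknown."""
    return ELEMENT_REPOSITORY.get(name)
```

Because only this logical table is fixed up front, the physical relational model can be generated from it at delivery time instead of being designed in advance.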


Why requesting in a pull before doing the delivery in a push.

When the goal for some process has been defined, the needed and available information must be inventoried. In a lean, agile mindset, only the information that really matters will be retrieved.
There is a cycle starting from the question of what is needed, up to the delivery.

This is not the smallest possible set of elements. Some additional artefacts might be included when the effort to include them is minimal and the expectation is that they will be needed sometime in the near future.
The information flow is supported by a process cycle.
The push delivery follows the process flow from input to output.
The pull request analyses the wanted output and from that defines what is needed for the input. Both have functions to support them, and those support functions have strong relationships.

There are functions for monitoring, logging and performance, and there are functions that describe working instructions. Working instructions can be generic on what is to be expected, and detailed when the element artefacts get processed for delivery.
When using only existing, well-defined relational tables, the question of which elements are really needed easily gets lost. When the request is done with a totally different technical approach, the request question is the logical starting point.

For example, use a single sheet in a spreadsheet holding all elements, each with a unique logical identification and the numerical and/or character values associated with it. This datasheet is a single technical table storing all elements in a vertical way. The output delivery is expected to be several relational tables with many columns, each focused on an element type.
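The vertical single-sheet storage described above resembles an entity-attribute-value layout: one technical table, regardless of how many element types exist. A minimal sketch, with hypothetical element names:

```python
# Vertical datasheet: every observation is one row holding a logical
# element identifier plus its numeric and/or character value.
# One technical table covers any number of element types.
vertical_rows = [
    # (recipient_id, element_id, numeric_value, char_value)
    ("R001", "cust_income_eur", 42000.0, None),
    ("R001", "cust_region",     None,    "north"),
    ("R002", "cust_income_eur", 38500.0, None),
]

def elements_in_sheet(rows):
    """List the distinct logical elements present in a vertical sheet."""
    return sorted({element for _, element, _, _ in rows})
```

Adding a new element type is just another row value here; no table structure changes until the relational delivery is generated.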

In an elementary figure:
(Figure: receiving safe data)

Defining & realising the pull request.

The information gathering, doing the inventory, and getting access for delivery is known as the most time-consuming activity.

There are two stages within this:
IV 🎭 The communication within an organisation about the needed information that should become available.
III ⚙ The technical coding and running of the information retrieval.

Data is processed by logic (code) to create new data. When a template has been defined, it can later be filled in by recipients.
Those filled templates are the new information to be processed. In the example of a single sheet there will be many sheets to process. In a figure:
(Figure: push delivery)
When there is a database connection with relational tables as input, the usual hard work is decomposing all the information and composing it into some new model. The belief is that the new composed model will fit all requests in the future.

defining pull request
Determining, defining, and documenting the template for the needed information. (IV)

Defining a template for what information is needed:
  1. Verify all elements are defined in the private metadata repository.
  2. Add new elements to the private metadata repository.
  3. Monitor, log events, and maintain the templates.
Storing all elements vertically is not a standard approach. This results in a private metadata approach, because no tools exist that provide data lineage for this way of working.
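Steps 1 and 2 above can be sketched as a check of a template's element list against the private metadata repository. This is an illustrative sketch; the function names and the placeholder definition are hypothetical.

```python
def verify_template_elements(template_elements, repository):
    """Split a template's elements into known and new ones.

    Known elements are already defined in the private metadata
    repository (step 1); new ones must still be added (step 2)
    before the template can be released.
    """
    known = [e for e in template_elements if e in repository]
    new = [e for e in template_elements if e not in repository]
    return known, new

def add_new_elements(new_elements, repository):
    """Register new elements with a minimal placeholder definition,
    flagged for review by the metadata maintainer."""
    for name in new_elements:
        repository[name] = {"type": "unknown", "status": "to review"}
```

Keeping this check in one place means every template released to recipients is guaranteed to reference only elements the repository knows about.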

executing pull request
Processing the incoming containers holding the information. (III)

Processing all filled templates into a working staging store involves:
  1. Verify all elements conform to the template model.
  2. Extract identifiers from the filled template. Verify the recipients' identifiers.
  3. Store the element values with all identifiers in the working staging store.
  4. Monitor, log events, and add information to the working staging store.
The working storage can be segregated into periods. A yearly cumulation will often do. The yearly cumulation can be moved into a permanent operational store. The goal is that only assured information from a defined moment will be used in the push delivery.
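The staging steps above can be sketched as follows, with period partitioning included. All names are hypothetical; rows are the (recipient, element, value) tuples of a filled vertical template.

```python
def stage_filled_template(rows, valid_recipients, staging, period):
    """Load one filled template into the working staging store.

    rows: (recipient_id, element_id, value) tuples from a filled sheet.
    Unknown recipients are rejected (step 2); accepted rows are appended
    to the partition for the given period (e.g. a year), so that only
    assured information from a defined moment reaches the push delivery.
    Returns the rejected rows for logging (step 4).
    """
    rejected = []
    partition = staging.setdefault(period, [])
    for recipient, element, value in rows:
        if recipient not in valid_recipients:
            rejected.append((recipient, element))
            continue
        partition.append((recipient, element, value))
    return rejected
```

Moving a finished period's partition into the permanent operational store is then a simple dictionary move, with no restructuring of the data itself.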

Defining & realising the push delivery.

The information processing activity is part of a process line. The goal is, with one or more input containers, delivering one or more output result containers. The process may change over time and with progressive insights.

There are two stages within this:
I 🎭 For a defined delivery, select the elements and the information area.
II ⚙ Delivering information in an easy-to-use format.

With the older data in the permanent operational store and the newer data in the working store, the transformation of information can proceed into another defined area for the defined periods. In a figure:
(Figure: push delivery)
The splitting into segments by periods adds some complexity. It decreases complexity in the steps that follow.
prepare push delivery
Preparing stored information into a more useful representation (I).

Transformation at an intermediate store for selected elements:
  1. Adjust temporal indicators (dates) when applicable.
  2. Adjust element values when a new version of an element has changed the unit size.
  3. Connect key values to transformed values; add group values and translated historical values.
  4. Define the destination tables for elements that will transpose into columns.
  5. Keep track of which elements are used, marked by origin and date-time stamps.
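Step 2 above, rescaling values when an element version changed its unit size, can be sketched with a small version-to-factor table. The element name, versions, and factors below are hypothetical illustrations.

```python
# Hypothetical version table: when an element's definition changed its
# unit size, older stored values must be rescaled before delivery.
UNIT_FACTORS = {
    # (element_id, version): factor to bring the value to the current unit
    ("amount", 1): 1000.0,   # version 1 was stored in thousands
    ("amount", 2): 1.0,      # version 2 already uses the current unit
}

def adjust_value(element_id, version, value):
    """Rescale a stored value to the current unit of its element.
    Elements without a registered factor pass through unchanged."""
    return value * UNIT_FACTORS.get((element_id, version), 1.0)
```

Because the factors live next to the element definitions in the private metadata, the adjustment code itself never needs to change when a new element version appears.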

executing push delivery
Transforming stored information into a useful representation (II).

  1. Create a derived private metadata table for each table to deliver, holding only valid elements.
  2. Add to the derived private metadata table all calculations that should be made on elements once they are columns.
  3. Create the tables by doing a transpose transform into the prepared, defined structure.
  4. Add calculations to the new record order before storing it.
  5. When some summary values on the new tables are useful, create those after the new tables exist.
  6. Keep track of which elements are used, marked by origin and date-time stamps.
There can be multiple types of deliveries sharing the same selection. The derived private metadata tables are in those cases shared information.
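The core of step 3, the transpose transform, can be sketched in a few lines: the derived private metadata table supplies the list of valid elements, which become the columns of the wide delivery table. This is an illustrative sketch, not the project's SAS implementation; names are hypothetical.

```python
def transpose_elements(vertical_rows, valid_elements):
    """Transpose vertically stored elements into one wide record per key.

    vertical_rows: (key_id, element_id, value) tuples from the store.
    valid_elements: the derived private-metadata list naming which
    elements become columns in this delivery. Elements outside the
    list are ignored; elements never seen for a key stay None.
    """
    wide = {}
    for key, element, value in vertical_rows:
        if element not in valid_elements:
            continue
        record = wide.setdefault(key, {e: None for e in valid_elements})
        record[element] = value
    return wide
```

Because the column list is data, not code, adding or retiring an element in a delivery changes only the derived metadata table, never the transform itself.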

Transposing information and adding derived elements.

Using elements stored with private metadata definitions simplifies the processing in stages IV, III and I tremendously. The variability of which elements are valid is not something to worry about.

The complexity is in transposing the vertically stored elements into well-defined horizontal columns. This process, in stage II, is the unusual one and needs additional elaboration. This transform doesn't exist in the standard SQL coding approach, and there are no tools supporting it with data lineage.

In a figure:
(Figure: transposing information into useful information)
All information in these delivered tables may not be the end situation. For presentation goals, some additional steps are needed for other tables and for connecting to other tools.
This whole process was successfully implemented using several features of SAS. The code size remained remarkably small in lines of code. Some code is presented for free.
Source     Description
ykwdrnt4   Define the template; check elements against private metadata; correct.
ykwdrnt3   Check the delivery against the template; build up partitioned staging; monitor activity.
ykwdrnt1   Create intermediate selected partitions; define the derived delivery tables.
ykwdrnt2   Create many tables for the push delivery; verify integrity.


Defining the chain - Transformations.

Pattern: information pull request, push delivery.
Four stages in this pattern:
IV 🎭 Define what should become available.
III ⚙ Collect deliveries of the defined elements into a staging area.
I 🎭 For a defined delivery, select the elements with identifiers.
II ⚙ Deliver information in an easy-to-use format.

Required other patterns.
💡 The private metadata pattern is used for most of the logical constructs.
💡 The performance pattern of using sorted data partitions is also used.

Conflicting other patterns.
⚠ This approach with hidden logical star schemas breaks with the common guidelines on how star schemas should be implemented using an SQL DWH and a relational DBMS.
⚠ The transposing of elements is able to create several star schemas. Elements are the goal for analysis and reporting: put those in columns, and a key identifier, or several of them, can be the center to review. Usually this freedom is lost in the classic DWH approach, which is focused on working with a predefined center point.

Dependencies other patterns
This page is a pattern on requesting information in a pull.

Within the scope of metadata there are more patterns, like exchanging information (data), building private metadata, and securing a complete environment.


© 2012,2020 J.A.Karman