DevOps Meta - governing data as context
Solutions based on relationship with context
Patterns as operational constructs.
Implementing solutions normally means using building blocks that have been used before.
The first thing is to understand the reasoning behind those building block patterns.
This understanding is always required, no matter whether the solution is bought or built in house.
- 2020 week:01
- reordering the content as the design level has become more clear.
- Defining sub pages used for the operational patterns.
- 2019 week:19
- starting to get the page filled.
Not very well evaluated at the time, as design concepts were missing.
Duality service requests
From the organisation (bpm) there are two goals behind its requests for solutions and improvements:
- the core business process (sdlc)
- reviewing and governing the business process (bianl)
Solutions for those goals are different, although some tools could be the same.
data governance 101.
You could go somewhere and ask an oracle what to do. That won't relieve you of responsibility for your own actions.
Managing processes requires knowing:
📚 What information is the available data describing?
⚙ What are the relationships between data elements (meta context)?
🎭 Who is using that data -information- for what process?
⚖ What is the inventory of data and information being used?
Naming conventions, when done correctly, narrow down a complex environment into many smaller environments.
Smaller environments with fewer challenges are easier to solve.
A book library has a fine-tuned labelling (naming) convention to be able to find and store a huge number of books.
An information system needs:
- Life Cycle indications on any component
- Unique business process lines
- Classification business artefacts (technical)
- Classification technical artefacts (tools)
- Hierarchy of technical support for administration and monitoring
- The decoupling and connection to other business processes
👓 A proposal for a naming convention. My intention was once to bring this fully into a production environment.
After that, the importance of segregating the tool from the business process was clear.
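To make the idea of a naming convention concrete, here is a minimal sketch. It does not reproduce the proposal mentioned above: the underscore-separated layout, the four segments and the life cycle codes are all assumptions, chosen only to show how a name can carry the life cycle, the business process line and the separation of business artefact from tool.

```python
from dataclasses import dataclass

# Hypothetical life cycle codes; the text only says a life cycle
# indication is needed, not which values to use.
LIFE_CYCLES = {"D": "develop", "T": "test", "A": "acceptance", "P": "production"}


@dataclass(frozen=True)
class ComponentName:
    life_cycle: str      # life cycle indication of the component
    process_line: str    # unique business process line
    artefact_class: str  # classification of the business artefact
    tool: str            # classification of the technical artefact (tool)


def parse_component_name(name: str) -> ComponentName:
    """Split an underscore-separated name into its four assumed segments."""
    parts = name.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 segments, got {len(parts)}: {name!r}")
    life_cycle, process_line, artefact_class, tool = parts
    if life_cycle not in LIFE_CYCLES:
        raise ValueError(f"unknown life cycle code: {life_cycle!r}")
    return ComponentName(life_cycle, process_line, artefact_class, tool)


def same_environment(a: ComponentName, b: ComponentName) -> bool:
    """Two components share a smaller environment when both the life cycle
    and the business process line are equal."""
    return (a.life_cycle, a.process_line) == (b.life_cycle, b.process_line)
```

With such a layout, a name like `P_CLAIMS_REPORT_SAS` (a made-up example) can be sorted into its smaller environment without looking inside the component, and the tool segment stays separate from the business process segment.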
Invented in house or a bought solution.
When the problem to solve is a standard, well-known one, there is a good chance that commercial software is available for it.
Building and maintaining once for many is usually cheaper and gives more functionality than building it yourself.
When buying a solution for a process:
- It will cover only part of the intended target environment.
- Just a part of what is bought will be used; it could be below 10%.
Which parts are used is important for the customer to manage.
- How to ignore all the overhead that is not used is a new problem.
- A switch to a new vendor is something to evaluate as a new problem.
- The challenge still to solve is the integration with all other processes.
There is no way for the organisation to avoid its own responsibility.
Data was once on a single system, everything being shared on a single machine with no real security or isolation.
As everybody liked to have their own machine isolated from the others, the need to transport data arose.
Duplication of data -information- has grown to a level where knowing what it is about has become a huge problem.
The problem is not the technical limitation of storage, but not knowing for sure what information is used by whom, for what purposes, and with what kind of reliability and availability.
Transport of data - information
With systems in place, the usual question is how to propagate the data -information- from one system to another in a reliable way.
A service, or better a "microservice", is a direct way, with no need for storage or for building up an inventory with a time delay.
Not every process is an interactive one needing an immediate response. Perhaps most start with a request at some point in time and a delivery at some later moment.
In those cases a well-defined storage location is needed, with aligned security.
👓 A business application has little value when there are no interactions.
My intention was once to take this from a running production environment to wider use.
The important points were segregating the tool from the business environment and making the interactions with other business processes obvious.
Requesting data -information- as pull
The full-stack request interaction for information starts with questions:
- What information is needed
- Who should deliver the information
- When is the information needed
This is similar to defining the pull question.
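The three questions can be captured as a small record. This is only a sketch: the class name, field names and the lateness rule are assumptions made for illustration, not part of the project described here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class InformationRequest:
    what: str        # what information is needed
    supplier: str    # who should deliver the information
    needed_by: date  # when the information is needed


def is_late(request: InformationRequest,
            delivered_on: Optional[date],
            today: date) -> bool:
    """A delivery is late when it arrived after the needed date, or when
    nothing has arrived and the needed date has already passed."""
    if delivered_on is not None:
        return delivered_on > request.needed_by
    return today > request.needed_by
```

Recording the request before any delivery exists is what makes it a pull: the delivery can be checked against what was asked for, by whom and for when.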
👓 A full-stack information flow including the request and the basic delivery. Executed as a small project at a regulator.
It starts at IV and goes to III, connecting to the better-known I and II. The order got mixed up by the historical BI approaches of I and II.
Realizing with understanding data.
Data lineage is often mentioned as being important. The goal is following the flow of information in the logical process.
During technical realisation this gets lost because of the narrow focus on the technical tools.
A private metadata approach
The challenge is how to manage the metadata when the data -information- doesn't follow the well-known solutions.
The problem being solved:
- Wanting a single object for exchanging data
- Having many types with elements to exchange (complexity)
- The variety and variability can both be high, and differ for each element
- Wanting to have useful information quickly when the deliveries are complete
- Doing trend analyses on selected information elements for a longer period (multiple years)
👓 A private metadata naming solution.
Executed as a small project at a regulator.
Technical details of what was done:
- Using a spreadsheet (Excel) with a single sheet having one table collecting all information
- Using the spreadsheet (Excel) for validating the integrity and consistency for selected elements
- Retrieving the spreadsheet table, organised vertically, and building a basic history of what has been delivered
- Converting the temporal indicators and version indicators to currently valid values, organised with currently valid date-time indicators
- Transposing the vertically organised, adjusted elements to more practical column-oriented tables
- Adding additional computations on transposed elements.
All this is based on private metadata tables. Some have workable "valid-from" and "valid-thru" indicators.
The basic metadata table is based on a naming convention hierarchy for elements that can be a number or a string.
The table being exchanged has just three columns.
Within that exchanged table there are string elements that define the delivery type and origin.
There are many other often used approaches. This one has several advantages over those.
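The transposing step and the "valid-from" / "valid-thru" lookup can be sketched as follows. The text does not say what the three columns are, so the layout `(period, element, value)`, the element names and the metadata rows below are all assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical private metadata: (element, label, valid_from, valid_thru).
METADATA = [
    ("balance.total", "Total balance", date(2018, 1, 1), date(2019, 12, 31)),
    ("balance.total", "Balance sheet total", date(2020, 1, 1), date(9999, 12, 31)),
]


def label_for(element, on):
    """Return the label valid on the given date, or the element name itself
    when the metadata has no matching row."""
    for name, label, valid_from, valid_thru in METADATA:
        if name == element and valid_from <= on <= valid_thru:
            return label
    return element


def transpose(rows):
    """Turn vertical (period, element, value) rows into one dict per period,
    so that each element becomes a column."""
    wide = defaultdict(dict)
    for period, element, value in rows:
        wide[period][element] = value
    return dict(wide)
```

Because the exchanged table stays vertical, a new element needs only a new metadata row, not a change to the exchange format; the column-oriented view is derived afterwards.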
Data lineage, commercial standard tools
There is something weird about commercial solutions for metadata and data lineage.
There is a duality in the type of artefacts being stored and maintained.
- Metadata business elements (objects artefacts) are stored in some hierarchy.
- Technical elements like servers and connections are combined in that same metadata storage.
- Users and groups for access rights, alongside those that are already present elsewhere, are combined in that same metadata storage.
- That storage is disconnected from the normally available technical approaches for backup, restore and availability.
Those dedicated topics should be handled and discussed separately.
Data - Software, Security Access (SAM).
Using standard commercial solutions comes with the external guidelines of the supplier.
These guidelines possibly conflict with the organisation's goals, standards and regulatory compliance.
Some challenges are:
- preventing data breaches (leaking information).
- Having the availability as needed by the organisation.
- Ease of use of the functionality requested by the organisation.
- Underpinning behaviour with monitoring and logs for the organisation.
Example securing SAS
SAS delivers a complex environment with a lot of components. Their solutions are built from components, with some dedicated logic and/or data added to that.
The process of installing and configuring has many steps and is never fully complete.
👓 In my ING era (banking / insurance) a complete design and setup was done.
The requirements of the organisation took the lead over the instructions of the external vendor.
Other issues arose with architects not wanting to comply with the set regulations. Reviewing the issues and my approach, I am convinced it is still correct.
Pitfalls with middleware application systems.
Some generic ones are:
- Defining outbound connections usually means storing a user and password.
Because it is for an external system, it must be possible to decrypt the password.
- A self-service approach allowing business users to code is different from a limited web service functionality.
Some configuration settings must differ according to the intended usage.
- Running processes automated in a batch approach adds another differentiation for business usage. An additional list of functionals.
- The alignment with the organisation's identity and authorisation model is challenging, as is the required administration and logging of security events.
- Processes, once up and running, are not automatically synchronised with external security (host).
The only way of assuring system consistency is by regular full machine restarts.
- Backup, restore, availability and DR (disaster recovery) are not included as standard in the external vendor's setup.
What is required and what is acceptable is to be solved during the business implementation.
- Common words like "backup" are used for what is a database offload.
- ACL Granting access rights: responsible accountable consulted informed.
- "Application" business or supplied tool.
Examples using SAS code.
| Source || Description |
| xkeypsw || Having a password that could be reused in a wrong way. Using a manageable password vault without needing obscurity. |
| xgetsetpsw || Synchronise an account that is stored obfuscated. Reading, re-using and storing an obfuscated user and password combination. |
| xmetadirlst || Definitions of data connections that are stored obfuscated. Reading them into visible, usable syntax. |
| .. || Failing home dir definition, missing saswork, wrong pwd, java /tmp. Correcting run time settings. |
| .. || Dictionary database processing, synchronise (users/rights). |
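The difference between an obfuscated stored password and a vault lookup can be illustrated with a short sketch. It is written in Python rather than SAS and does not reproduce the listed sources; base64 stands in for any reversible obfuscation, and reading an environment variable stands in for a real vault. Both are assumptions chosen for brevity.

```python
import base64
import os


def obfuscate(password: str) -> str:
    """Reversible encoding: hides the password from a casual glance only."""
    return base64.b64encode(password.encode("utf-8")).decode("ascii")


def deobfuscate(stored: str) -> str:
    """Anyone who can read the stored value can recover the password."""
    return base64.b64decode(stored.encode("ascii")).decode("utf-8")


def password_from_vault(name: str) -> str:
    """Stand-in for a vault lookup: read the secret from the environment at
    run time, so it never sits in source code or configuration files."""
    try:
        return os.environ[name]
    except KeyError:
        raise LookupError(f"secret {name!r} is not provisioned") from None
```

The obfuscated value is a password that can be reused in the wrong way by anyone who copies it. A lookup at run time moves the question to who may read the vault, which is something that can be administered and logged.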
DevOps ICT - Transformations
Identity. The questions of who is responsible, who has executed it, who permitted it and who is the owner all need a clear identity.
See: "laws of identity in brief".
Kim Cameron is a Canadian computer scientist who was Microsoft's Chief Architect of Access. He is the originator of the 7 Laws of Identity, and developed the InfoCard architecture.
PIM, Privileged Identity Management
Delegating activities still requires answers to all those questions on identity. Non-Personal Accounts (NPAs) are used with automated systems or for temporary rights.
This requirement is too often missed in the fundamentals of security policy. The growing number of NPAs is not seen as a manageable solution.
Instead, well-defined security is seen as a threat, because the simplicity of dedicated containerisation is not understood.
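The identity questions can be kept answerable for a non-personal account by recording them with every delegated action. This is a minimal sketch; the record layout and field names are assumptions, not a description of any PIM product.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ActionRecord:
    action: str
    executed_by: str   # account that ran it, possibly a non-personal account
    permitted_by: str  # person who permitted the action
    owner: str         # owner of the data or process touched
    responsible: str   # who is responsible for the outcome


def unanswered_questions(record: ActionRecord):
    """Return the names of the fields that were left empty."""
    return [name for name, value in asdict(record).items() if not value.strip()]
```

An action run by an NPA with an empty `permitted_by` is then visible as a gap, instead of the NPA itself being treated as the problem.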
Access Control Lists, attribute based.
A long road to go: multi-tenancy, shared data, shared code as business functions and business approaches. Focusing on the machine means not seeing that pathway.
These are practical data experiences.
© 2012,2020 J.A.Karman