📚    BPM    SDLC    BIAanl    Data    Meta    Math    📚 👐 🎭 index - references    elucidation    metier 🎭
⚒    Intro    data everywhere    collect    deliver    intermezzo 1    transfer    BI report    ALC type3    intermezzo 2    BIA prepare    EDWH 3.0    What next    ⚒ 👐    top bottom   👐

Design Data - Information flow


information, data: enterprise core objects.

Data, gathering information on processes.

Building data: the data explosion. What has changed is the amount we are collecting by measuring processes as new information (edge).

📚 Information questions.
⚙ measurements data figures.
🎭 What to do with new data?
⚖ legally & ethically acceptable?

🔰 Too fast .. previous.

Contents

Reference        Topic                                          Squad
Intro            Data, gathering information on processes.      01.01
data everywhere  Data everywhere, systems everywhere.           02.01
collect          Data: collect - store - deliver (I).           03.01
deliver          Data: collect - store - deliver (II).          04.01
intermezzo 1     Intermezzo (1)                                 05.01
transfer         Transfer information - data.                   06.01
BI report        BI Analytics - reports.                        07.01
ALC type3        Analytics - ALC type3.                         08.01
intermezzo 2     Intermezzo (2)                                 09.01
BIA prepare      Preparing data for BI Analytics.               10.01
EDWH 3.0         Logistics of the EDWH - Data Lake. EDWH 3.0    11.01
What next        Change data - Transformations                  12.00
                 Following steps                                12.02

Progress


Data everywhere, systems everywhere.

Administrative systems have been processing data - information with computers for a long time. The options for measuring those processes were limited, as computer resources were limited and expensive. That is all changing.

etl-reality.jpg
Process flow.
A bigger organisation has several departments. Their work is expected to interact, and some parts are central.
Sales, Marketing, Production lines, bookkeeping, payments, accountancy.

The interactions and actions between all those departments lead to complexity.

df_machines.jpg
Complexity: number of computers.
The number of machines and the differences in stacks are growing fast, no matter where these logical machines are located.

Giving every business service its own dedicated set of machines increases complexity.

olap_star01.jpg
Optimization Operational Data.
The relational SQL DBMS replaced CODASYL network databases (see math).

The goal was simplification of operational data processing by deduplication and normalization (techtarget), using DBMS systems that support the ACID properties of transactions (IBM).

These approaches are necessary when doing database updates with transactional systems. Using this type of DBMS for analytics (read-only) was not the intention.
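A minimal sketch of what the ACID properties buy a transactional system, using Python's built-in sqlite3 module. The table, the account names and the amounts are assumptions made up for illustration. A transfer is two updates; when the second one violates a constraint, the first one is rolled back as well.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move an amount between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused a negative balance; nothing was stored.
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("sales", 100), ("payments", 0)]
    )
    conn.commit()
    ok = transfer(conn, "sales", "payments", 60)
    failed = transfer(conn, "sales", "payments", 60)  # sales would go negative
    balances = dict(conn.execute("SELECT name, balance FROM account"))
    return ok, failed, balances
```

After the refused second transfer the credit to "payments" is gone again. This all-or-nothing behaviour matters for updates and is of little use for read-only analytics.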

Data: collect - store - deliver (I).

Processing objects, information goes along with responsibilities.

Focus on the collect receive side.
There are many different options for receiving something. There are multiple sources of data - information. Not all data is of the same quality.

Diagram_of_Lambda_Architecture_generic_.jpg
All kinds of data (technical) should be supported for all types of information (logical) at all kinds of speed.
Speed (streaming) bypasses the store - batch route for the objects involved; duplication is allowed. Fast delivery (JIT, Just In Time): the lambda architecture (wikipedia).
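The idea of a fast path next to the store - batch path can be sketched as follows. This is a single-process illustration under stated assumptions, not a real streaming setup: the class name LambdaStore and its methods are hypothetical, and the "view" is just a count per key.

```python
from collections import Counter


class LambdaStore:
    """Batch view plus speed view, merged at query time."""

    def __init__(self):
        self.master = []             # append-only store of all events
        self.batch_view = Counter()  # recomputed from the master data
        self.speed_view = Counter()  # events seen since the last batch run

    def ingest(self, key):
        # Every event takes both routes: the duplication is accepted for speed.
        self.master.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # Batch layer: recompute from the complete master data,
        # then drop the speed view that is now covered by the batch view.
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, key):
        # Serving: batch result plus whatever arrived after the batch run.
        return self.batch_view[key] + self.speed_view[key]
```

A query is answered just in time: it does not wait for the next batch run, at the price of storing recent events twice.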

In a picture:
df_collect01.jpg

Data: collect - store - deliver (II).

Processing objects, information goes along with responsibilities.

Focus on the deliver side.
There may be many data consumers. It is all about "operational production data". A classification by consumption type:
  1. ⚒ Operations. For goals where standard systems are not appropriate, or acting as an interface for systems that are not coupled. 💰
    Results are input for other data consumers. Sensitive data allowed (PIA).
  2. ⚒ Archive of data - information that is no longer available in operations, only for limited goals and associated with a retention period. ⚖
  3. ⚒ Business Intelligence (reporting). Developing and generating reports for decision makers. Possibly with the use of analytical tools with DNF. ✅
    Sensitive data is eliminated as much as possible.
  4. ⚒ Analytics, developing Machine Learning. ❗ This is: ALC type3.
    Sensitive data is eliminated as much as possible.
  5. ⚒ Analytics, operations Machine Learning. ❗ This is: ALC type3.
    Sensitive data may be used in a controlled way (PIA).
    Results are input for other data consumers.
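The classification above can be read as a delivery rule per consumer type. A minimal sketch, assuming a record is a plain dictionary and that somebody else decides which fields are sensitive; the names POLICY, deliver and pia_approved are hypothetical. The archive type is left out because the text states no rule on sensitive data for it.

```python
# Rule per consumer type, taken from the classification:
# "pia" = sensitive data allowed when controlled by a PIA,
# "eliminate" = sensitive data removed as much as possible.
POLICY = {
    "operations": "pia",
    "bi_reporting": "eliminate",
    "analytics_development": "eliminate",
    "analytics_operations": "pia",
}


def deliver(record, consumer, sensitive_fields, pia_approved=False):
    """Return the copy of a record that this consumer type may receive."""
    rule = POLICY[consumer]  # an unknown consumer type raises KeyError
    if rule == "pia" and pia_approved:
        return dict(record)
    # Default: strip everything marked as sensitive.
    return {k: v for k, v in record.items() if k not in sensitive_fields}
```

The default is deliberately the strict one: without an approved PIA even an operations consumer gets the stripped record.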

In a picture:
df_delivery01.jpg

Intermezzo (1)


Logistics using containers.
shp_cntr_clct.jpg Containers have rapidly become the standard in physical logistics. It is not the objects that are being transported, but the containers.

shp_cntr_store.jpg The moment of delivery is planned by the transport duration.

shp_cntr_dlv.jpg

Relationships.
More links associated - entry/exit
Is used at:
👓 threats for data & tools Process Life Cycle.
👓 Release management SDLC - release management.
👓 Business Intelligence, Analytics.
Details to be found at:
👓 Meta
👓 Math Software engineering.


Transfer information - data.

SD_enterpriseservicebus.jpg
The service bus (SOA).
source: ESB enterprise service bus
The technical connection between business applications is preferably made through an enterprise service bus. The goal is normalized systems.
Changing or replacing one system should not have any impact on the others.
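Why a bus limits the impact of replacing a system can be shown with a very small in-process sketch. It is an illustration only, not an ESB product: the class ServiceBus and the topic name are assumptions. Senders and receivers only know the bus and a topic, never each other.

```python
from collections import defaultdict


class ServiceBus:
    """Minimal publish/subscribe bus: applications only know topics."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # A receiving application registers a callable for a topic.
        self._subscribers[topic].append(handler)

    def unsubscribe(self, topic, handler):
        # Taking a system out of service touches only the bus registration.
        self._subscribers[topic].remove(handler)

    def publish(self, topic, message):
        # Deliver to every current subscriber; return the number of deliveries.
        handlers = list(self._subscribers[topic])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

Replacing the receiving system is an unsubscribe followed by a subscribe of the new one. The publishing application is not changed at all.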

Microservice_Architecture.png
Microservices with APIs
microservices: (Chris Richardson)
Microservices - also known as the microservice architecture - is an architectural style that structures an application as a collection of services that are: The microservice architecture enables the continuous delivery/deployment of large, complex applications. It also enables an organization to evolve its technology stack.

informatie_mdl_imkad11.jpg
Data in containers.
Data modelling, operational and analytical, using the relational or network concepts, is based on basic elements (artifacts).
An information model like IMKAD uses more complex objects. In the figure every object type has its own color.
The information block is a single message describing the complete states before and after a mutation of an object. The Life Cycle of a data object is added as new metainformation. Every artifact in the message follows that metainformation.
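A single message carrying the complete before and after states can be sketched as a small data structure. This is not the IMKAD model itself: the class MutationMessage, its field names and the life cycle values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

_MISSING = object()  # marks an attribute that is absent in one of the states


@dataclass(frozen=True)
class MutationMessage:
    """One message: complete state before and after a mutation of an object."""

    object_type: str
    object_id: str
    lifecycle: str                    # assumed values: "created", "changed", "ended"
    before: Optional[Dict[str, Any]]  # None when the object did not exist yet
    after: Optional[Dict[str, Any]]   # None when the object has ended

    def changed_fields(self) -> List[str]:
        """Names of the attributes that differ between the two states."""
        before = self.before or {}
        after = self.after or {}
        return sorted(
            key
            for key in before.keys() | after.keys()
            if before.get(key, _MISSING) != after.get(key, _MISSING)
        )
```

Because both states travel together, a consumer does not need to look up the previous version elsewhere to see what changed.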

df_dlv_bi-anl.jpg

BI Analytics - reports.

Business Intelligence has long claimed to be the owner of the E-DWH.
Dimensional OLAP modelling and the Data Vault are used for building up reporting on production data.

df_dlv_alctp3.jpg

Analytics - ALC type3.

Analytics, machine learning and data science have claimed to be the owner of the E-DWH / the "Data Lake" for analytics,
using the Dimensional OLAP modelling and/or the Data Vault for building up DNF datasets, aside from those from streaming and direct input for modelling. Modelling is here the synonym for code development.

 

Intermezzo (2)

wrh_selfsrvc-01.jpg
Selfservice - Managed

Self service sounds very friendly, but it is a euphemism for no service. Collecting your data, processing your data, yourself.

Have it prepared and transported for you, so it can be processed for you.
wrh_cntr_stor.jpg

shp_cntr_load-2.jpg
Containerization.

We are used to the container boxes as they are used these days for all kinds of transport.

The biggest carriers are the ships going around the world: reliable, predictable, affordable.

shp_cntr_liberty.jpg
The first container ships were these liberty ships. Fast and cheap to build. The high loss rate was not a problem: it was solved by building many of them.

For normal economic usage, with reloading, returning and many predictable, reliable journeys, they were not much of a success.

Preparing data for BI Analytics.

etl-simple.jpg
Data from operational systems (I).
Preparing data for analytics started in the 1980s by copying it to another dedicated location.

Steps in the process:

etl_bi_dwh3.jpg
Data from operational systems (II).

Still as a dedicated solution for analytics, the DWH is being built with a new data model, the "Data Vault".

The total execution time and the lack of test approaches such as regression tests result in costly environments that are difficult to manage.

Lans_datavirtualise.jpg
BI data virtualization.

Almost all data in BI is about periods. Adjusting data to match the differences in periods is possible in a standard way.

The data virtualization is placed on the Data Vault DWH that is dedicated to BI reporting usage. It is not virtualization on the ODS or on the original data sources.
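Adjusting for period differences in a standard way can be sketched as a view that is computed on request, with no stored copy per period. A minimal sketch, assuming facts are (date, amount) pairs; the function names period_key and view_by_period and the supported periods are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def period_key(day, period):
    """Label of the period a date falls in."""
    if period == "year":
        return f"{day.year}"
    if period == "quarter":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    if period == "month":
        return f"{day.year}-{day.month:02d}"
    raise ValueError(f"unsupported period: {period}")


def view_by_period(facts, period):
    """Virtual view: totals per requested period, computed when asked for."""
    totals = defaultdict(int)
    for day, amount in facts:
        totals[period_key(day, period)] += amount
    return dict(totals)
```

The same stored facts serve a monthly report and a quarterly report; only the requested period changes.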



💡 Combining the data transfer, microservices, archive requirements and security requirements, and doing it with the maturity of physical logistics, points in the direction of a centrally managed approach.
Reuse of standard solutions in a standard way has always been promoted as better and cheaper. Why would we not do that for data warehousing and data transport? EDWH-3.0
 legal

Logistics of the EDWH - Data Lake. EDWH 3.0

Processing objects, information goes along with responsibilities.

⚠ A data warehouse is allowed to receive semi-finished products for the business process. There is no good reason to do this also for the data warehouse when positioned as a generic business service.

CIA: Confidentiality, Integrity, Availability. Activities.
CSD: Collect, Store, Deliver. Actions on objects.

The two vertical lines manage who has access to what kind of data: authorized by the data owner, for registered data consumers, monitored and controlled.
The confidentiality and integrity steps are not bypassed with JIT (lambda).
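The access control around collect, store and deliver can be sketched as follows. This is a toy model under stated assumptions, not a security design: the class Warehouse, its method names and the audit list are hypothetical. Delivery happens only for a consumer that the data owner has authorized, and every attempt is recorded.

```python
class Warehouse:
    """Collect - store - deliver with an owner-controlled access gate."""

    def __init__(self):
        self._store = {}      # dataset name -> (owner, payload)
        self._grants = set()  # (dataset, consumer) pairs granted by the owner
        self.audit = []       # monitoring: every delivery attempt is recorded

    def collect(self, dataset, owner, payload):
        self._store[dataset] = (owner, payload)

    def authorize(self, dataset, owner, consumer):
        # Only the registered data owner may grant access to a consumer.
        if self._store[dataset][0] != owner:
            raise PermissionError("only the data owner can authorize")
        self._grants.add((dataset, consumer))

    def deliver(self, dataset, consumer):
        allowed = (dataset, consumer) in self._grants
        self.audit.append((dataset, consumer, allowed))
        if not allowed:
            raise PermissionError(f"{consumer} is not authorized for {dataset}")
        return self._store[dataset][1]
```

A fast (JIT) route would have to call the same deliver step, so the gate and the audit trail are not bypassed.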

In a picture:
df_csd01.jpg

 horse sense

Change data - Transformations


DBMS changing types
A mix of several DBMSs is allowed in an EDWH 3.0. The speed of transport and the retention periods are important considerations. Technical engineering covers the details, limited by the state of the art and by cost factors.
dbmsstems_types01.png
etl-elt_01.png
ETL ELT - No Transformation.
Transforming data should be avoided; it is the data-consumer process that should do the logic processing.
The era of offloading data, doing the logic in COBOL before loading, is history.
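A minimal sketch of loading without transformation, with the logic placed at the consumer. The store is a plain dictionary and the names load and revenue_by_country, the "orders" source and its fields are assumptions for illustration.

```python
def load(store, source_name, rows):
    """Load rows unchanged: no business logic on the way in."""
    store.setdefault(source_name, []).extend(dict(row) for row in rows)


def revenue_by_country(store):
    """Consumer-side logic: this report decides how to interpret the raw rows."""
    totals = {}
    for row in store.get("orders", []):
        country = row["country"].strip().upper()  # cleaning is done here
        totals[country] = totals.get(country, 0) + row["amount"]
    return totals
```

The stored rows stay exactly as received, so another consumer with different rules can still start from the original values.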

Following steps

Missing link

These are high level considerations.

Describing the data is data about data. 👓 MetaData.

What is not here: 👓 data & patterns practice.




© 2012,2019 J.A.Karman