jakarman - ICT - My profession
My early working years
jakarman - ICT - My way of thinking
In my working lifetime there have been many periods, each changing the technical details and the attention given to issues waiting to be solved.
way of thinking
👓 click for details. The image here is used at more places to change topic and page.
IVO (ISS, Individual Sales Support)
📚
The business goal was delivering PCs to the sales people (600) at home, containing all the information they needed for their customers (1984).
🎭
The available technology was:
- PCs were just coming onto the market. Specifications: 8086 processor, 640 KB RAM, 10 MB hard disk, PC/DOS 2.1.
- Connected to an IBM mainframe, screen scraping via Irma 3270 cards.
- A modified Viditel modem (1200/75 bps) for calling out to the PCs at home, scheduled in the evening hours and limited to a duration of 4 hours.
- The controlling environment ran as an operator task within IDMS DB/DC (database backend and frontend).
- Languages used: batch script, Assembler, C, Cobol, ADS (IDMS).
⚙
What was done:
- Building, implementing and supporting the protocol for the data exchange with the PCs, optimized within the technical limitation of 0.2 s for switching transfer direction.
- A b-tree (NoSQL-like) DBMS was built in C to provide the DBMS functionality. Extended maintenance was done on indexes and data blocks, improving performance & reliability.
- Connecting the PC world to the mainframe with dedicated call-outs to the PCs.
- Supporting developers, testers, a dedicated helpdesk and educational staff for whatever popped up.
Performance & Tuning, Mainframe
📚
The goal was setting up and delivering Management Information (MIS, EIS) on system resource usage by applications and tools.
Multiple goals:
- Pinpointing possible issues that could be improved, delaying external investments.
- Being able to plan costly hardware upgrades just in time.
- Cost distribution to the several business lines based on measured usage (life and property insurance).
🎭
The available technology was:
- Tools had to be acquired; SAS (5.17) with MXG was chosen to process the SMF records.
- Configuring and adjusting the settings for the SMF record processing.
- In-house available scheduling and system resources.
⚙
Having done:
- Building up a weekly PDB (Performance Database) in several incremental runs, with a weekly archive and cleanup. All detailed weekly information went into this dataset.
- Appending all summaries into an SDDB (System Data Database), with the cost account added and the data minimized and summarized (a sketch of this pattern follows).
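As an illustration of that PDB/SDDB build pattern, a minimal SAS sketch; the library and variable names (rawsmf, pdb, sddb, cpu_sec, io_count) are hypothetical and this is not the original MXG code.
    /* minimal sketch of the incremental PDB / SDDB pattern; all names are hypothetical */
    proc append base=pdb.smf_detail data=rawsmf.smf_today;   /* incremental run adds new SMF detail */
    run;
    proc summary data=pdb.smf_detail nway;                   /* weekly summarization per account    */
       class sysid acct_code;
       var cpu_sec io_count;
       output out=work.week_sum sum=;
    run;
    proc append base=sddb.usage_summary data=work.week_sum;  /* long-term summary database (SDDB)   */
    run;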
JST, a Generic approach for automated testing
📚 It automates manual work of several IT staff lines.
This is a very unusual part of IT optimization, as it targets internal IT itself.
This is a solution I implemented around 1996; it was still in use, unchanged, in 2011.
After all these years it is still considered a very modern approach.
⚖
Experiences:
- With this approach, generating the statements and re-using them in development & testing becomes a highly optimized process.
- It survived several reorganizations over the years, the year 2000 testing, the Euro testing, and outsourcing to India.
- Not everybody accepted the automation of ICT work, which is not really different from any other high-impact change to personal work.
🎭
Used technology and limitations:
- The available programming environment was used: ISPF dialogs with REXX, running under TSO.
- Because of the business-specific optimization, the whole is not available as a commercial solution for rolling out to other environments.
⚙
The solution design and realization:
- see: 👓 JST for details
- Having a well-designed JCL (Job Control Language) setup in a DTAP (Develop Test Acceptance Production) approach.
- The components were set up fully in the DTAP approach and connected to release management and versioning for those processes and tools (in-house built, Endevor).
Database IDD - IDMS DB/DC
This was an advanced approach using Integrated Data Dictionaries (IDDs). The more modern term would be metadata databases doing some kind of master data management.
IDMS DB/DC is an integrated backend (DB, database) and frontend (DC, data communication) middleware system.
How it was executed:
These dictionaries had to be set up in a master (sys:11) and, based on that master, a logically DTAP-segregated configuration (sys:66, sys:88, sys:77, sys:01).
Each of them holds definitions:
- technical definitions like terminals.
- security, e.g. users, groups and access control.
- business applications: menu tailoring, business logic and data.
The supplier Cullinet was positioned as preferred. In earlier years the front end had not been available from the supplier; an in-house built front-end middleware system was still running (VVTS).
📚 Goal: operational support (system programmer) in a small team setting.
🎭
Used technology: from supplier Cullinet, later CA, and the IBM toolset.
⚙
What was done:
- Automating the generation of definitions and the versioning of releases.
- Removing home-built exits; testing combined with JST.
- Supporting developers and testers in achieving their goals.
Security (ACF2 RACF - MS AD )
Bringing access to resources into roles, organizing who is allowed which role, and verifying that a person really is who he says he is, is known as security.
There is a master security administrator task (designing) and a restricted security administrator task (executing requests) for the required segregation of duties.
The old common approach was doing input validation before handing over to the tool, for lack of integrated security.
Roscoe
was a multi-user mainframe program editor. The Roscoe administrator tried to implement security by parsing commands and then rejecting or allowing them. It never became reliable (1985).
In modern times we are parsing code (preventing code injection) in trying to secure browser usage (2020).
📚 Goals implementing security with tools:
- Manageable, explainable, centralized security.
- Monitoring with incident response (CSIRT).
🎭
Used technology: suppliers "ADR" later "CA", IBM, Microsoft (AD).
⚙
What was done:
- Playing an important role in the design of a logical access setup aligned with the segregation needed by DTAP policies (ACF2).
- Helped in the transition from Roscoe to TSO, implementing ACF2. This had a major impact on the way developers worked.
- Built an in-house tool to support a quick roll-out of access-right changes (group/role based) on all involved platforms; a sketch follows this list.
- Helped in the transition from ACF2 (supplier CA) to RACF (IBM).
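Such a roll-out tool can be sketched as generating security commands from a table of group/role assignments. A minimal SAS sketch under assumptions: the dataset work.assign and its columns are hypothetical, and only RACF-style CONNECT commands are shown; the real tool covered multiple platforms.
    /* minimal sketch: generate RACF CONNECT commands from a hypothetical assignment table */
    filename cmds '/tmp/racf_connect.txt';
    data _null_;
       set work.assign;                     /* assumed columns: userid, grp */
       file cmds;
       put 'CONNECT ' userid 'GROUP(' grp +(-1) ')';
    run;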
Scheduling (UCC7 TWS - homebuild)
Scheduling jobs in the old days was hard manual work. Operators had to carry the physical punch cards to the readers in time to execute them.
Planning was done as preparation on paper, using previous experience of load and durations. The operator was the person you should be friends with.
This all changed when that work became automated and partly shifted to persons inside the business (functional support, production support).
There is a big shift over time impacting jobs. Many kinds of jobs have gone and been replaced by others in the era of applying computers, AI and ML.
📚 Goals:
- Replacing manual activity by a robot program (scheduler).
- Reducing missed deadlines for job readiness by learning and predicting what is going on, being aware of daily, weekly and monthly runs.
🎭
Used technology and limitations:
- UCC7, later CA7, was the CA scheduler; it was the first one to be implemented.
- Tivoli (TWS ppc) is the replacement delivered by IBM; it replaced the CA7 software.
- There are many more options to trigger jobs. JST was home-built, using just standard tools.
⚙
The realization was adjusted to the level of acceptance at the departments.
Being an insider at a big company
Operational Risk (OpRisk)
This department, and the work to support it, is interesting.
OpRisk does things like the Advanced Measurement Approach.
Monte Carlo simulation modelling with publicly known situations is the way to go; a sketch follows this paragraph.
It requires release management, full testing (including DR) and security policy alignment. At the moment of the delivery deadline (quarterly) it is a critical process.
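To illustrate the kind of Monte Carlo modelling involved, a minimal frequency/severity sketch in SAS; the distributions and parameters are purely hypothetical and not the solution's actual model.
    /* minimal sketch: simulate yearly operational losses, hypothetical parameters */
    data work.sim;
       call streaminit(20121);
       do year = 1 to 100000;                           /* number of simulated years */
          n = rand('POISSON', 4);                       /* loss event frequency      */
          annual_loss = 0;
          do i = 1 to n;
             annual_loss + exp(rand('NORMAL', 10, 2));  /* lognormal severity        */
          end;
          output;
       end;
       keep year annual_loss;
    run;
    proc univariate data=work.sim noprint;
       var annual_loss;
       output out=work.var999 pctlpts=99.9 pctlpre=P;   /* 99.9% quantile, AMA style */
    run;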
📚 Goals
- Installation & implementation according to in-house risk management policies.
- Helping to run these regularly and reliably, and archiving old versions.
- Supporting developers with guidelines for building & maintaining these special predictive analyses.
🎭
Used technology and limitations:
- SAS 9.1.3 on AIX, SAS 9.3 on Linux. Licensed under a Solution.
- An Oracle connection was used in the first approach (removed later).
⚙
Having done:
- Being able to do most of it successfully for the business line.
- Several issues:
- Issue 1: the internal cost assessment became inexplicably high.
- Issue 2: doing a DR test successfully was almost impossible, due to the number of machines involved and the Oracle database location.
- A strange quirk in dynamic prompting caused difficulties running for the different lines (tenants) having their own internal data.
Hosting, multi tenancy - *AAS - stacks
Software As A Service (SaaS) is a great idea. Implement it yourself if your requirements are stricter than can be fulfilled with "SAS on demand", e.g. cloud services.
Do it yourself when you are big enough to do a SaaS implementation on your own. Another reason can be having multiple business lines (tenants) needing the same solution, the same application.
Sharing computer resources can bring huge benefits; there is a whole industry based on that.
When done the wrong way, the possible risks are also high.
Solving the SAS environment challenges with all my knowledge and experience brought it to a much higher level than is common practice.
📚 Goals
- Defining the borders of segregation in logical security.
- Defining release management for the used tools & middleware.
- Defining release management for business applications.
- Segregation of duties in support and administration regarding release management for the used tools & middleware.
🎭
The used technology doesn't really matter; it is about dependencies in the stack.
⚙
Having implemented:
- Achieved hosting (*AAS) with SAS supporting full services.
- Got alignment with security policies.
Implemented sudo for dedicated functions. The BoKS alternative failed on a technical async terminal line quirk.
- Not getting alignment with external suppliers and with changing internal politics.
Release management (versioning)
Within information technology, guidelines and techniques are evolving fast.
Recently hyped tools like Git are getting the most attention.
A generic DTAP approach, seeing the three stacked layers, is far more important.
Being in a silo, you have just one layer: your own.
Doing middleware support, you see three layers.
📚 Layered DTAP Release management
- Business Applications Logic
- Middleware tooling like SAS, a DBMS or a managed file transfer (MFT)
- Operating system & infrastructure
🎭
The used technology doesn't really matter, as long as the release management goal is met.
💣
That goal alignment was experienced in practice as a problem, because the goal often got lost and was replaced by a tool implementation.
⚙
Having done and being involved with:
- A well set up DTAP environment always required scripting (in some language); a sketch follows this list.
- The approval steps are organisational, mostly managed with other tools (ITIL):
TCAB, DCAB (Technical / Decision Change Advisory Boards).
- The impact of changes in other layers giving unexplained surprises.
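A typical piece of that scripting is an environment-aware startup that points one code base at the right DTAP level. A minimal SAS sketch; the paths, the macro variable env and the member daily_flow.sas are hypothetical.
    /* minimal sketch: one code base, DTAP level selected via a hypothetical &env (D, T, A or P) */
    %let env  = T;                          /* in practice set per environment, not hard-coded   */
    %let root = /appl/myapp/&env;           /* hypothetical directory layout per level           */
    libname appdata "&root/data";
    filename appcode "&root/code";
    %include appcode(daily_flow.sas);       /* the same member name runs on every level          */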
DWH, BI, Data connections, Unix
In a growing environment these topics became the sole working area.
The first problem to be solved was a generic desktop roll-out for SAS clients, as the desktop got another standard.
The next one was adding and consolidating midrange servers using SAS (see hosting, *AAS, stacks).
This was a Unix environment (AIX) using a SAN (not NAS). The approach does not differ much from Linux.
It completes the set of operating systems I have experience with.
📚 Supporting DWH BI Goals:
- Supporting several business lines without completely abandoning the relatively small origin. At that small origin, also supporting actuarial, marketing (CI) and more.
- Doing the complete release lifecycle support at several layers.
- Getting alignment with security policies (risk management).
🎭
Used technology:
- AIX, Windows, mainframes and all dedicated scripting.
- SAS 8.2, SAS 9.1.3 with several solution lines, SAS/IntrNet, SAS/AF.
- Oracle, DB2, SQL Server (SSAS), Oros, Samba.
- ITIL support tools, schedulers (SAS/WA, LSF).
⚙
Implemented:
- From the many siloed approaches, went to a hosting approach, consolidating many old implementations, the ODS and DWHs included.
- Made all kinds of DBMS connections workable and available to business users; a sketch follows this list.
- Changed the SAS version on all systems.
- Conversion of the manually built (SAS macros) schedule flow to SAS/WA.
- Got stuck in making SAS datasets behave like any other DBMS (SAS/SHARE).
- Solved unexplainable errors, like the one caused by the old 16 MB line in the memory setting and the single- to multi-threaded ordering effects on results (2010).
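Making a DBMS connection available to business users mostly came down to a predefined SAS/ACCESS libname. A minimal sketch with hypothetical connection details and table names (the Oracle engine shown; DB2 and SQL Server follow the same pattern).
    /* minimal sketch: predefined Oracle connection, hypothetical credentials and schema */
    libname dwhora oracle path=ORCL schema=dwh
            user=&oruser password="&orpw"     /* resolved from a secured autoexec, not hard-coded */
            readbuff=5000 access=readonly;
    proc sql;
       select count(*) from dwhora.customer;  /* hypothetical table, used as a smoke test */
    quit;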
Policies - Sox-404 ITSM Cobit IEC/ISO 27001, 27002 - GDPR
Policies and standards are becoming mandatory (legal requirements), but bring a lot of documentation work.
You need to know their goals and then how to implement them. That is the top-down way.
The technology has a history of doing things a certain way. External suppliers have their own "best practices"; sometimes they are very bad practices.
That is the bottom-up way.
There is a need for agreement on, and archiving of, what to do and how to do it in one's own organisation.
The mentioned guidelines are starting points; there are many more of them.
📚 Goals
- Get technical realisations aligned with high level guidelines.
- Following or initiating adjustments to the internal policies.
Waivers are exceptions, declaring non-compliance with internal policies.
- Within the technical layers, having those configured and used in conformance with those internal policies.
🎭
It is about compliant processes, not the details of the technology.
⚙
Having worked on:
- Getting hold of those high-level guidelines.
- Documenting (OSG) and giving feedback on those internal policies;
experienced feedback through audit reports.
- Not always successful in getting cooperation from other, lower technical layers that were controlled elsewhere.
Data mining, Customer Intelligence (CI)
Data mining / data science is hyped in 2016. The CI department was, in my early years (the 1990s), one of the business lines to support with tools.
Cross-selling, customer segmentation, churn rate and more are the words they use. Development is indicated with the word modelling; operational usage of a model with the word scoring.
That is using another language to communicate about known processes.
Marketing / Customer Intelligence commonly uses more data sources than are available internally.
Geo-locations and external open and closed data for input processing, brought into correlation with internal business processes.
Bringing these marketing operations into departments executing the normal classic mass operations is a challenge.
The "Analytics Life Cycle" (ALC) is not settled yet.
Customer Intelligence (CI), data mining:
📚 Goals
- The coding tool SAS Eguide being used more, instead of the classic SAS desktop.
- The low-code tool SAS/EM (Enterprise Miner) getting supported.
- Availability of dedicated external data delivery flows.
🎭
Used technology:
- From the early years still running SAS 8.2, migrating to SAS 9.3. Adding SAS/EM, connecting to DBMS systems (Oracle).
⚙
Having done:
- Solved configuration limits on memory, threading and the total maximum number of workspace sessions running at the same moment; a sketch follows this list.
- Searching for a connection to PMML standards. Did an EM course myself.
- Made archiving and restoring of EM projects possible.
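The configuration limits mentioned are largely SAS invocation and session options. A minimal sketch of the kind of settings involved; the values are hypothetical, and the maximum number of concurrent workspace sessions is governed by the object spawner / metadata configuration rather than by these options.
    /* minimal sketch: session options for threading and sorting (hypothetical values) */
    options threads cpucount=4 sortsize=1G;
    /* MEMSIZE can only be set at invocation, e.g. a line in sasv9_usermods.cfg:
       -MEMSIZE 4G                                                               */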
Hired role, limited periods
ML, Machine Learning, Scoring, Explainable AI (I)
The environment is still sensitive information (2016), hence a generic description.
📚 Goals:
- Operationalize an existing scoring model that had been run manually for a long period.
- The production environment being directly connected to streaming data (message-oriented) as input, and delivering the results back within that same daily run.
- Delivery to several other production processes with time limits.
- Replace the several old locally built solutions within the business department with this new approach.
🎭
Used technology and limitations:
- Process the scoring with a limited SAS tooling set: no availability of a scheduler, no release management tools, no versioning tools, no segregated environments.
- The execution window is at most two hours; the whole process should run in about an hour, to leave room for solving possible incidents.
- The source data is built up using multiple flows, in several windows:
- A weekly process that runs several days to complete, followed by four hours of additional preparations.
- Daily updates to complete the too-long-running weekly process.
- Daily changes as feedback of delivered scores and of adjustments made within the production proceedings.
- The number of cases to be processed as message delivery can be 300.000 (quarterly peak). The total number of cases is approx. 2.000.000.
⚙
The solution design and realization:
- Built and used a manually coded scheduler in SAS having all necessary functionality, like triggering on events and monitoring for unexpected behavior; a sketch follows this list.
- The release management issue was bypassed by an approach with segregation of duties.
- Achieved a very stable hands-off process that could easily be transferred to others.
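The event-driven part of such a manually coded SAS scheduler can be sketched as a polling macro. A minimal sketch under assumptions: the trigger file, the job path and the time-out are hypothetical.
    /* minimal sketch: wait for a trigger file, then run the scoring job (names hypothetical) */
    %macro wait_and_run(trigger=/data/in/ready.trg, job=/sas/jobs/score_daily.sas, maxwait=7200);
       %local waited rc;
       %let waited = 0;
       %do %while (%sysfunc(fileexist(&trigger)) = 0 and &waited < &maxwait);
          %let rc = %sysfunc(sleep(60, 1));       /* poll every 60 seconds    */
          %let waited = %eval(&waited + 60);
       %end;
       %if %sysfunc(fileexist(&trigger)) %then %do;
          %include "&job";                        /* event seen: run the job  */
       %end;
       %else %put WARNING: trigger &trigger not seen within &maxwait seconds.;
    %mend;
    %wait_and_run()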
Grid computing - performance load balance
This is a hot topic, for performance reasons, in business solutions.
The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software (Herb Sutter):
"Applications will increasingly need to be concurrent if they want to fully exploit continuing exponential CPU throughput gains. Efficiency and performance optimization will get more, not less, important."
Scaling up:
📚 Goals:
- Improved performance by optimizing usage of the bare iron.
🎭
Used technologies:
- SAS, Linux shell scripting.
⚙
Solutions and realization:
- Getting annoyed by bad performance caused by Linux inodes. Bypassed with planned cleanups and a different way of coding: per-record SAS updates changed to bulk dataset updates; a sketch follows this list.
- Seeing basic tuning issues, once solved on mainframes, occurring again. Bypassed with coding and scheduling avoiding I/O overload.
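The change from per-record updates to bulk updates can be sketched as replacing many single-row operations by one pass over a sorted change set. A minimal SAS sketch with hypothetical datasets master and changes, keyed on id; the master is assumed already sorted by id.
    /* minimal sketch: apply all changes in one bulk pass instead of per-record updates */
    proc sort data=work.changes; by id; run;
    data work.master;
       update work.master work.changes;   /* master assumed sorted by id as well */
       by id;
    run;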
Scaling out:
📚 Goals:
- Performance gains by parallel processing usage.
🎭
Used technologies:
- SAS, Teradata (a parallel DBMS), Linux shell scripting.
⚙
Solutions design and realization:
- Using parallel databases like Teradata. Design thinking: as little data transfer between machines as possible, instead of ease of coding.
- Defining scheduling strategies using multiple machines cooperating in one execution flow; a sketch follows this list.
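One scheduling strategy for cooperating sessions is SAS/CONNECT with asynchronous remote submits. A minimal sketch; the session name, the sascmd value and the datasets are hypothetical.
    /* minimal sketch: run two parts of one flow in parallel with SAS/CONNECT (names hypothetical) */
    signon remhost sascmd="!sascmd";                     /* spawn a second SAS session             */
    rsubmit remhost wait=no inheritlib=(work=locwork);   /* asynchronous, local WORK visible there */
       proc means data=locwork.big_part1; run;
    endrsubmit;
    proc means data=work.big_part2; run;                 /* this part runs locally in parallel     */
    waitfor remhost;                                     /* synchronize before continuing the flow */
    signoff remhost;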
ML, Machine Learning, Scoring, Explainable AI (II)
The environment is sensitive information. I can share some approach details.
📚 Goals:
- Operationalize the scoring of models in a full DTAP approach.
- Improve the data processing so changes can be applied more easily.
- One country is operational; two others should run similarly.
- New score models should be easy to add (being agile).
🎭
Used technology and limitations:
- Scoring is done in a SAS environment (Base 9.4). The LSF scheduler is available, DI is available as data integration tool (ETL, ELT, data lineage), and manual coding (Eguide) is possible.
- Scores should reliably reflect the current situation, as near to real-time as possible with the known information. Incremental updating for data changes happens every 15 minutes.
- There are just two physical machines, but a logical four-level DTAP segregation has been defined (develop machine). The production processing aims to be on one fully isolated machine.
- The information source is gathered in another system (IBM iSeries). Data should be synchronized for up to 7.000.000 objects as facts (2.000.000 active) in an ER relationship with several dimensions.
⚙
The solution design and realization:
- Made the data synchronisation a full daily extract from the data source, avoiding the question of what could be missed by "Change Data Capture".
The execution time, to stay within 15 minutes, needed agreement on the network transfer load.
- Became aware of the parallel development / test situation: modelling scores depends on data delivery, the score delivery depends on scoring.
Changed the setup and achieved something workable.
- The release management was set up with an acceptance (integration, system test) and a shadow production (acceptance and DR) before production.
- The modelling mind-switch to ER instead of 3NF dimensional.
- Prepared the versioning and releasing using Git and Bitbucket.
- Added monitoring on the data change process and the score process, to stop and send notifications when there is an unexpected number of changes. It got the nickname "watch dog"; a sketch follows this list.
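The "watch dog" idea can be sketched as a count-and-threshold check before the scoring step. A minimal SAS sketch; the dataset stage.changes and the threshold value are hypothetical.
    /* minimal sketch: stop the flow when the number of changes is unexpectedly high */
    %let max_changes = 50000;                     /* assumed upper bound per cycle */
    proc sql noprint;
       select count(*) into :n_changes trimmed from stage.changes;
    quit;
    %macro watchdog;
       %if &n_changes > &max_changes %then %do;
          %put ERROR: watch dog: &n_changes changes exceed the limit of &max_changes, stopping.;
          %abort cancel;                          /* halt before scoring runs; notification handled elsewhere */
       %end;
    %mend;
    %watchdog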
XML messages, object life cycle data model
Data modelling is a confusing challenge. The requirement is understanding how information is delivered and what information is needed. These can be completely different worlds of context.
The classic DWH approach is based on modelling very detailed elements, optimizing the transactional database process and saving as much storage as possible. The disadvantage is the complexity of the relationships.
Blockchain is a hyped approach (2018) for archiving and processing all detailed information and all history in chained blocks (a ledger).
In practice, contracts and legal agreements fully describe the most recent situation. Their history is relevant only in special cases.
Those special cases are the most interesting ones when the goal is to detect illegal activity or fraud.
Use case "real estate":
📚 Goals
- Changed delivery: from a database export to an XML message-driven one.
- The size is 10.000.000 objects as a snapshot at a dedicated moment. Updates originating from legal action are up to 10.000 daily, but with technical changes that can grow above 100.000 daily.
- Every message has full details in up to 70 record types. Some record types allow repetitions of 10.000.
- Every message has the "current" and "previous" situation, with a unique key reference creating a chain.
🎭
Used technology and limitations:
- SAS, Linux scripting, Teradata, XML, modernized SQL (XMLTable)
⚙
The solution design and realization:
- Preprocessing, splitting the XML and transforming it, is necessary; a sketch follows this list.
- A split into a tremendous number of small pieces did run when processed.
- Due to complexity and some other issues this could not be transferred to other persons.
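The splitting step can be sketched as a data step that cuts the XML stream into one small file per message. A minimal sketch under assumptions: the file name, the tag <message> and the output directory are hypothetical, and each message start tag is assumed to begin on its own line.
    /* minimal sketch: split one big XML file into one file per <message> (names hypothetical) */
    filename bigxml '/data/in/big.xml';
    data _null_;
       length outfile $200;
       retain msgno 0 outfile;
       infile bigxml lrecl=32767 truncover;
       input line $char32767.;
       if index(line, '<message') then do;
          msgno + 1;
          outfile = cats('/data/split/msg_', put(msgno, z7.), '.xml');
       end;
       if msgno = 0 then delete;                  /* skip header lines before the first message */
       file dummy filevar=outfile lrecl=32767;
       put line;
    run;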
Dwh & datalakes, monitoring, Privacy by design
Storing objects for use at some later moment is warehousing. Just collecting a lot of things without knowing whether you will use them is another approach.
⚖ The data warehouse (Bill Inmon 1970, Ralph Kimball 1996) does not have the same goal and functionality as a physical warehouse.
In essence, the data warehousing concept was intended to provide an architectural model for the flow of data from operational systems to decision support environments.
What it describes is the functional equivalent of a quality assurance laboratory and/or a research laboratory.
💣
The words "data warehouse" and "data lake" are confusing in their associations with their physical counterparts.
Use "data warehouse" and "data lake" in line with their physical counterparts.
📚 Goal - change in thinking
- Multiple customers sharing the same service.
- Solving the alignment of timely availability. Allowing JIT processing (lean).
- Allowing delivery of information for operational use when the data originates from a data warehouse.
🎭   ⚙
Solution and realization:
- The data warehouse is not a single-point dedicated solution:
- Several possible technical implementations coexist.
- Bypassing DWH storage: streaming data, Lambda architecture.
- The data warehouse serves several usage types:
- Operational processing using tools classified as BI tools. The common-sense reason is the best tool fit for the business requirement.
- The classic BI reporting usage (what has happened).
- AI, Machine Learning (ML), divided into two separate flows.
The development of models has "privacy by design" restrictions.
The operational part adds "monitoring" and "evaluation" requirements.
The analytics usages, descriptive BI (Business Intelligence), ML (Machine Learning) and others, are just customers of a warehouse like any other operational usage.
Being another level of "how to do things", these are the invisible parts of running those other projects.
Patterns
Patterns are basic building blocks and should match some issue to solve.
While working on several projects there was a statement that "out of the box" predefined transforms must be used.
In reality those "out of the box" steps lack some really essential logic (2017).
Several examples of poorly suited "out of the box" transforms:
📚 Logical Issues to solve:
- Data synchronisation quality monitoring.
- Monitoring of the scoring process for impacted cases (changes).
- 3NF table conversion to a denormalized (DNF) dataset.
🎭
Used technology: SAS DI (ETL/ELT tool), but generic design thinking.
⚙
Solutions, user generated transforms:
- The issues for data synchronisation are caused by:
- The numbers of processed records to be monitored by a pre- and post-process are not generic.
- Defining an isolated destination table for the monitoring data is not generic.
- Defining the logic for generating dynamic SQL for a full or incremental process, and the key selection used, is not generic; a sketch follows this list.
- The issue for scoring-process monitoring is that the resulting scores have no generic monitoring definition.
- The 3NF to DNF transform can be solved with a lot of steps; this is a complicated approach.
Another disadvantage is the performance overhead of all those steps, and choosing a poorly suited transform gives even more performance penalties.
Made a proposal to do this in a simplified way, depending on the database technology used.
Verified this with a known process: it behaved well, but also showed a system that seemed to behave unpredictably (overload, thrashing).
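The dynamic full/incremental SQL generation can be sketched as a small macro around PROC SQL. A minimal sketch; the source/target tables, the key column and the mode parameter are hypothetical.
    /* minimal sketch: generate full or incremental load SQL (all names hypothetical) */
    %macro sync(source=stage.src, target=dwh.tgt, key=id, mode=incremental);
       proc sql;
          %if &mode = full %then %do;
             delete from &target;                             /* full reload                 */
             insert into &target select * from &source;
          %end;
          %else %do;
             insert into &target                              /* incremental: only new keys  */
                select * from &source s
                where not exists
                   (select 1 from &target t where t.&key = s.&key);
          %end;
       quit;
    %mend;
    %sync(mode=incremental)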
© 2012,2020 J.A.Karman