Sometimes you read a bright opinion that helps you improve your own view and sharpen your perception. On rarer occasions such opinions are so powerful that they change your frame of reference forever. When this happens, and you realize that the impressive opinion was expressed four years ago and has not aged, you fall in love! Here is my love story with Pat Helland, former senior software architect.
Resources
The article that ignited my brain is still on MSDN: Data on the Outside vs. Data on the Inside
I also read the resources available on www.pathelland.com
What really captured my attention is the data typology introduced by this article. I will sum up the concepts I concentrate on, then I will show you how the RioterDeckers employ a small trick to mark reference data, and finally I will share my view on a tool for designing business entities according to the data typology and in compliance with SOA principles.
Data typology
Reference Data
Within a service architecture, reference data is information that is published across service boundaries. For each set of reference data there is one service that creates it, publishes it, and periodically sends it to the others. That service is identified as the data owner.
This kind of data shares specific properties:
· Immutable: once written, it cannot be changed.
· Identified: with a given ID you retrieve the same piece of data, whenever and wherever you ask.
· Stable: the piece of data cannot be ambiguous.
Ex: Product Identifier
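To make the three properties concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the names ProductReference, publish and lookup are hypothetical and do not come from the article, and a plain dictionary stands in for whatever the owning service really uses to publish its data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen=True: fields cannot be reassigned after creation
class ProductReference:
    """Hypothetical piece of reference data owned by a product service."""
    product_id: str  # Identified: the ID always resolves to this same record
    name: str

# Hypothetical in-memory stand-in for what the owner has published so far.
_published = {}

def publish(ref: ProductReference) -> None:
    """Publish a record once; republishing the same ID is refused (Stable)."""
    if ref.product_id in _published:
        raise ValueError(f"{ref.product_id} is already published")
    _published[ref.product_id] = ref

def lookup(product_id: str) -> ProductReference:
    """Any service gets the same record for the same ID."""
    return _published[product_id]
```

Trying to assign to a field of a published ProductReference raises dataclasses.FrozenInstanceError, which is the behaviour you want from immutable data.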
Resource Data
This kind of data is used by services to achieve their goal. It usually outlives a single service execution and can therefore be processed by various services (as long as you master the field's life cycle). Whenever a field can change as a result of process execution, consider it a resource.
Ex: Stock value
Activity Data
This kind of data lives only within the scope of a service. Some of it can be persisted in order to reproduce the context of a service, but even then the data only has meaning for its owner. No other process will be able to work with it.
Ex: Basket Orderline
Other considerations
Validity
The scope of a piece of data can sometimes be bounded in time or space. Expiration defines validity within a period, whereas localization defines it for an area.
Data that carries a validity must be considered reference data, and it must have the three intrinsic characteristics (immutable, identified and stable).
Versioning
Most reference data does not last forever. It requires revision over time. Updates must be periodic and must be easily available to each service that subscribes to them. Versioning is a strategy that can be used to manage reference data over time. It helps to keep the reference data set up to date. It is crucial to know which piece of data is currently being used by others, and a clear versioning scheme lets you keep track of that.
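One possible way to picture such a versioning scheme is sketched below in Python. This is an assumption-laden illustration, not something taken from the article: PriceList and ReferenceStore are hypothetical names, and the idea is simply that a revision is published as a new immutable version instead of overwriting the previous one, so subscribers can say exactly which version they use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PriceList:
    """Hypothetical reference data that gets revised over time."""
    list_id: str
    version: int
    prices: tuple  # tuple of (product_id, price) pairs, immutable

class ReferenceStore:
    """Keeps every published version; nothing is ever overwritten."""

    def __init__(self):
        self._versions = {}  # (list_id, version) -> PriceList

    def publish(self, item: PriceList) -> None:
        key = (item.list_id, item.version)
        if key in self._versions:
            raise ValueError(f"version {item.version} of {item.list_id} already exists")
        self._versions[key] = item

    def get(self, list_id: str, version: int) -> PriceList:
        """A subscriber pinned to a version always reads the same data."""
        return self._versions[(list_id, version)]

    def latest(self, list_id: str) -> PriceList:
        """Return the highest published version for this identifier."""
        versions = [v for (i, v) in self._versions if i == list_id]
        if not versions:
            raise KeyError(list_id)
        return self._versions[(list_id, max(versions))]
```

A service that cached version 1 keeps working on it while newer subscribers pick up version 2 through latest().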
Retirement
Business entities have a lifecycle, so pieces of data have their own lifecycle too. In real life it is very rare for a lifetime to end with actual destruction. Most of the time, data retires into history tables for reporting and metrics purposes.
One behavior common to all retired data is that it must be treated as immutable and so must be presented as read only.
Each type of data can be retired, but under different conditions. Let's sum them up:
· if retirement* is not required for BI or activity monitoring.
As retired data gets older it is accessed less and less frequently, and you can consider archiving it elsewhere in the information system.
A static diagram looks like this:
And with the help of the typology you can identify, within the entity life cycle, pieces of data that we call data fragments:
Object Identifier: OID
While you design business entities you realize that most of them have an identifier and a name that quickly become immutable. I remember many discussions about technical database identifiers and whether or not to model them in the data contract. Several fights later, and thanks to the patience of AvalonBoy, I finally understood that each entity must have its own identifier whether it is used by the business or not, and that this ID must be part of the contract. It is not just a matter for techies. Nothing that resides within a database can be anonymous, and an ID is the cool way to give each item a “name” so that it can be accessed.
Two years ago, AvalonBoy introduced us to a geeky way to mark identifiers and to take advantage of the trick when using distributed data structures. This trick is called OID, for object identifier.
We always use a GUID as the primary key within our databases, as a table design best practice. An OID is a GUID whose third octet has been typed. Each entity that needs to be marked gets its own marker. The marker is a unique value defined within a domain range we call boundaries. This trick lets us define 4^16 possible ObjectType definitions, and we can ask the identifier what ObjectType it is, just like this:
OID.GetTypeId(__oidRelief) // Where __oidRelief is a Guid [OID of course!]
OID source code: download the RD_OID_Demo (17 KB)
Some of you are probably thinking: dude, they broke the GUID algorithm, didn't they? Yes, we did, but tests have shown that across ten million generated OIDs every one was unique. Keep in mind, though, that this trick is for technical purposes.
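The idea can be sketched in a few lines. The snippet below is written in Python for illustration and is not the RD_OID_Demo code. The function names new_oid and get_type_id are hypothetical, and the layout is an assumption: the marker is stored in the third octet (index 2) of the GUID's 16 bytes as Python orders them, which may differ from the byte order used by the original implementation.

```python
import uuid

# Assumption: the type marker lives in the third octet (zero-based index 2).
TYPE_OCTET_INDEX = 2

def new_oid(type_id: int) -> uuid.UUID:
    """Generate a random GUID and overwrite one octet with the type marker."""
    if not 0 <= type_id <= 0xFF:
        raise ValueError("type_id must fit in a single octet (0..255)")
    raw = bytearray(uuid.uuid4().bytes)  # 16 random-based bytes
    raw[TYPE_OCTET_INDEX] = type_id      # stamp the marker
    return uuid.UUID(bytes=bytes(raw))

def get_type_id(oid: uuid.UUID) -> int:
    """Read the marker back from the identifier itself."""
    return oid.bytes[TYPE_OCTET_INDEX]

# Hypothetical marker for one entity type.
RELIEF_TYPE = 0x2A
oid_relief = new_oid(RELIEF_TYPE)
```

Overwriting an octet removes some of the randomness of the GUID, which is the "broken algorithm" concern mentioned above. In this assumed layout the remaining random bits still make collisions very unlikely, but that is a property of the sketch, not a measurement of the original code.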
Data fragment editor
Let’s talk about software engineering and the development process. You have seen how ORL handles Object/Relational mapping and how it is embedded within the code production process. Now that I know the data typology, I want a tool that will help software architects and business analysts model their business domain and improve their ability to master the business entities and their data fragments.
Mini glossary
Business Entities: the broad concepts used by workers to achieve a piece of their job. Each business has its major and minor entities. (Product, Account, Bundle…)
Data Structure: the full representation of a business entity, without consideration of the entity’s lifecycle. Technically, a data structure will differ both from the objects modeled by developers and from the database design.
Data fragment: must be considered as a view on the entity data structure. Each data fragment must have an identified service owner in charge of creating, publishing, and periodically sending the data fragment to other services. The owner is the only one able to mutate the content of the data structure and can publish a read-only version of it to subscribers. (Subscribers consider it as reference data.)
Service Owner: the service in charge of a given data fragment.
Goals
Design the domain model using UML static diagram concepts to identify the relations between entities and the package segmentation.
Describe how a business entity can be divided among the different data fragments that compose its life cycle.
Employ the data typology to mark each data fragment field, so that the behavior of the generated contract matches the data fragment definition.
Animate the state machine to visualize the changes between data fragments. The sequence of data fragments illustrates the business entity lifecycle.
Generate the XML map definition for each data fragment so that it can be mapped to the relational model by using ORL features.
Consequence of the typology
Each data fragment is a contract with one unique service owner, which is able to modify the fields marked as Resource or Activity.
Reference data must be implemented as read-only properties.
Reference data no longer needs to be checked against business rules.
Resource data can become reference data in a newer data fragment (the next step of the business entity lifecycle).
Only the service owner of a data fragment can update the resource data described in the fragment. Other services get notified once the data has been updated, and they work on a read-only version of the data fragment that they cache.
Activity data can never become either of the other two types (Reference or Resource). It is an internal member that never appears in the subscribers' version, so it can only be persisted in a history table for workflow rehydration purposes.
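The rules above can be sketched as a small Python class. This is only an illustration of the consequences listed, under assumptions: Kind, DataFragment, update and published_view are hypothetical names, and the real tool would generate contracts rather than use a dictionary of fields.

```python
from enum import Enum
from types import MappingProxyType

class Kind(Enum):
    REFERENCE = "reference"
    RESOURCE = "resource"
    ACTIVITY = "activity"

class DataFragment:
    """Hypothetical contract: one owner, each field marked with its data type."""

    def __init__(self, owner: str, kinds: dict, values: dict):
        self.owner = owner
        self._kinds = dict(kinds)    # field name -> Kind
        self._values = dict(values)  # field name -> current value

    def update(self, service: str, field: str, value) -> None:
        """Only the owner may write, and never to a Reference field."""
        if service != self.owner:
            raise PermissionError(f"{service} does not own this fragment")
        if self._kinds[field] is Kind.REFERENCE:
            raise AttributeError(f"{field} is reference data and is read only")
        self._values[field] = value

    def published_view(self):
        """Read-only snapshot for subscribers; Activity fields stay internal."""
        visible = {
            name: value
            for name, value in self._values.items()
            if self._kinds[name] is not Kind.ACTIVITY
        }
        return MappingProxyType(visible)
```

A subscriber that receives published_view() can read and cache it, but any attempt to assign to it raises TypeError, which matches the idea that subscribers treat the fragment as reference data.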
*: Pat Helland says “Rarely it’s deleted”