The Information Layer is responsible for presenting a unified representation of the information aspect of an organization, as provided by its IT services, applications, and systems, in support of business needs and processes and aligned with the business vocabulary (glossary and terms). A number of capabilities are associated with this primary objective. The layer covers information architecture, business analytics and intelligence, and metadata considerations. Its information architecture can also serve as the basis for business analytics and business intelligence through data marts and data warehouses. Metadata content is stored in this layer. The layer also supports an information services capability, which enables a virtualized information data layer. This allows the SOA to support consistency of data and of data quality.
In particular, this layer can be thought of as supporting multiple categories of capabilities of the SOA RA:
In particular, an information virtualization and information service capability typically involves the ability to retrieve data from different sources, transform it into a common format, and expose it to consumers using different protocols and formats.
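The retrieve, transform, and expose steps described above can be illustrated with a minimal sketch. It is not taken from the SOA RA itself: the two in-memory payloads, the field names, and every function name are hypothetical stand-ins for real data sources and service interfaces.

```python
import csv
import io
import json

# Hypothetical raw payloads standing in for two differently formatted sources.
CSV_SOURCE = "id,name\n1,Acme\n2,Globex\n"
JSON_SOURCE = '[{"customer_id": 3, "customer_name": "Initech"}]'


def read_csv_source(text):
    """Retrieve rows from the CSV source and map them to the common format."""
    return [{"id": int(row["id"]), "name": row["name"]}
            for row in csv.DictReader(io.StringIO(text))]


def read_json_source(text):
    """Retrieve rows from the JSON source and map them to the common format."""
    return [{"id": item["customer_id"], "name": item["customer_name"]}
            for item in json.loads(text)]


def unified_customers():
    """Combine both sources into one list of common-format records."""
    return read_csv_source(CSV_SOURCE) + read_json_source(JSON_SOURCE)


def expose_as_json(records):
    """Expose the unified records to a consumer that expects JSON."""
    return json.dumps(records, sort_keys=True)


def expose_as_csv(records):
    """Expose the same records to a consumer that expects CSV."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Consumers see only the common `id`/`name` shape, regardless of which source a record came from or which output format they request.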
There are multiple categories of capabilities that the Information Layer needs to support in the SOA RA. These categories are:
This layer features the following capabilities:
The ABBs responsible for providing these sets of capabilities in the Information Layer are:
| Capability Category | ABB Name | Supported Capabilities |
|---|---|---|
| Information Service | Information Services Gateway | 1, 2 |
| | Data Aggregator | 3 |
| | Data Validator | 6 |
| | Information Lifecycle Manager | 4 |
| | Hierarchy and Relationship Manager | 5 |
| | Data Quality Manager | 7 |
| | Quality of Service Layer: Event Manager | 8 |
| Information Integration | Information Integrity Manager | |
| | Data Cleanser | 14 |
| | Data Rationalization Manager | 13 |
| | Data Matcher | 14 |
| | Data Virtualization Manager | |
| | Data Representation Manager | 11 |
| | Data Sourcing Manager | 11 |
| | Data Cache | 15 |
| | Integration Layer: Data Transformer | 12 |
| | Data Consolidator | 9 |
| | Data Federator | 10 |
| Basic Information Management | Information Metadata Manager | 16 |
| | Content Manager | 17 |
| | Master Data Authoring Environment | 18 |
| Information Security and Protection | Quality of Service Layer: Access Controller | 19 |
| | Quality of Service Layer: Data-Driven Access Controller | 20 |
| | Traceability Enabler/Auditor | 21 |
| Business Analytics | Data Miner | 22 |
| | Query, Search, Reporting Engine | 23 |
| | Analytics Visualization Engine | 24 |
| | Quality of Service Layer: Business Activity Monitor | 25 |
| | Quality of Service Layer: Business Activity Manager | 25 |
| | Quality of Service Layer: Activity Correlation Manager | 26 |
| Information Definition and Modeling | Business Information: Business Glossary and Terms, Business Entities | 27 |
| | Common Information: Entities and Data, Messages | 28 |
| | Business Events | 29 |
| Information Repository | Data Repository | 30-32 |
This section describes each of the ABBs in the Information Layer in terms of their responsibilities.
This ABB can be thought of as a service container enforcing and supporting exposure of services, with all the associated supporting capabilities. In particular, it has three main responsibilities:
The Information Services Gateway ABB acts as the gateway to the Information Layer. It enables the hosting and exposure of information services by the SOA RA, forming a virtual data layer. It thus supports interfacing between the Information Layer and consumers of information services and is critical for exposing Information as a Service (IaaS). It provides a consistent entry point to the Information Layer through multiple mechanisms such as messaging, service calls, and batch processing. This ABB leverages capabilities and ABBs from the Integration Layer.
The Data Aggregator ABB is responsible for efficiently joining information (for example, structured and unstructured data) from multiple sources without creating data redundancy, to help form a unified data view/model supported by the Data Virtualization Manager ABB.
Its responsibilities include:
The Data Validator ABB is responsible for validating records against defined business rules.
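Validation of records against business rules, as described above, could be sketched as follows. The rule names, record fields, and the `validate_record` helper are hypothetical and chosen only for illustration. Real rule sets would be defined by the business.

```python
# Hypothetical business rules: each maps a rule name to a predicate on a record.
RULES = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "country_is_two_letters": lambda r: isinstance(r.get("country"), str)
    and len(r["country"]) == 2,
    "credit_limit_non_negative": lambda r: r.get("credit_limit", 0) >= 0,
}


def validate_record(record, rules=RULES):
    """Return the names of all rules that the record violates."""
    return [name for name, check in rules.items() if not check(record)]
```

An empty result means the record satisfies every rule. A non-empty result names the violated rules so that a caller can reject or route the record.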
The Information Lifecycle Manager ABB is responsible for providing lifecycle management support for data (e.g., CRUD operations) and for applying business logic based upon the context of that data.
The Hierarchy and Relationship Manager ABB is responsible for managing data hierarchies, groupings, and relationships between enterprise data, such as parent-child relationships. The Data Virtualization Manager leverages this ABB to build those relationships.
The Data Quality Manager ABB is responsible for validating and enforcing data quality rules, standardizing the data for both value and structure, and performing data reconciliation, including semantic reconciliation. It leverages the Information Integrity Manager ABB to fulfill its responsibilities.
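As an illustration of parent-child relationship management, the sketch below keeps an in-memory hierarchy and answers ancestor and descendant queries. The class name, method names, and sample entities are assumptions, and the sketch assumes the hierarchy contains no cycles.

```python
from collections import defaultdict


class HierarchyManager:
    """Minimal in-memory store of parent-child relationships (no cycles assumed)."""

    def __init__(self):
        self._children = defaultdict(list)  # parent -> list of children
        self._parent = {}                   # child -> parent

    def add_relationship(self, parent, child):
        self._children[parent].append(child)
        self._parent[child] = parent

    def ancestors(self, node):
        """Walk parent links from the node up to the root."""
        path = []
        while node in self._parent:
            node = self._parent[node]
            path.append(node)
        return path

    def descendants(self, node):
        """Collect all nodes below the given node, depth first."""
        result = []
        for child in self._children.get(node, []):
            result.append(child)
            result.extend(self.descendants(child))
        return result
```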
See Event Manager ABB in the Quality of Service Layer.
The Information Integrity Manager ABB is responsible for data profiling, analysis, cleansing, data standardization, and matching. Data profiling and analysis services are critical for understanding the quality of data across enterprise systems, and for defining the data validation, data cleansing, matching, and standardization logic required to improve data quality and consistency.
The Data Cleanser ABB is responsible for cleansing data and applying data quality rules. It enables detection and correction of corrupted or incorrect data.
The Data Rationalization Manager ABB is responsible for performing data rationalization and reconciliation.
The Data Matcher ABB is responsible for matching inbound records to existing data. It supports both deterministic and probabilistic matching of records.
The Data Virtualization Manager ABB is responsible for providing virtual access to, and a unified representation of, enterprise data sources.
The Data Representation Manager ABB is responsible for representing data from various data sources in a unified data format and for creating unified views of data. In other words, it hides the individual data sources and presents data in uniform formats to other ABBs for data handling. It may link to various data sources and handle relationships between them. This “virtualization” of the data makes consumers of information services (exposed through the Information Services Gateway) and other ABBs independent of the source and supports consistency in data.
The Data Sourcing Manager ABB is responsible for enabling access to different data sources using different protocols. It provides unified access to data in files, databases, and so on. It uses an Adapter ABB from the Integration Layer to integrate with data sources on different solution platforms (external data sources).
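The difference between the two matching styles can be sketched as follows. Deterministic matching is shown as exact equality on a trusted key. For the probabilistic side, the sketch substitutes a simple string-similarity score from Python's standard `difflib` module. This is a deliberate simplification, not a full probabilistic record-linkage model. The sample records, field names, and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical existing master records.
EXISTING = [
    {"id": 1, "tax_id": "DE123", "name": "Acme Corporation"},
    {"id": 2, "tax_id": "FR456", "name": "Globex Industries"},
]


def match_deterministic(inbound, existing=EXISTING):
    """Match on exact equality of a trusted key (here, tax_id)."""
    for record in existing:
        if inbound.get("tax_id") and inbound["tax_id"] == record["tax_id"]:
            return record
    return None


def match_by_similarity(inbound, existing=EXISTING, threshold=0.8):
    """Return (best record, score); the record is None below the threshold."""
    best, best_score = None, 0.0
    for record in existing:
        score = SequenceMatcher(
            None, inbound["name"].lower(), record["name"].lower()
        ).ratio()
        if score > best_score:
            best, best_score = record, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score
```

A typical flow would try the deterministic match first and fall back to the similarity score only when no key matches, possibly routing borderline scores to manual review.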
Examples of data sources include relational sources (e.g., DB2, Oracle, or SQL Server databases), other structured data (e.g., Excel or CSV files, web service requests and responses in XML format, and hierarchical stores on mainframes such as IMS), and unstructured data stores (such as images and documents). This ABB manages interactions with the data sources in the Solution Platform and other SOA RA layers, but it is not responsible for data and protocol transformation. It represents the actual data repositories of various types, such as a DB2 database in the Operational Systems Layer or an Excel file. Note that this ABB in the Information Layer holds high-level links, with associated metadata, to the real data sources in the Operational Systems Layer.
This ABB enables optimization of data access through lazy loading or on-demand access to information. For example, instead of containing (e.g., attaching) a huge document, this ABB typically contains an on-demand link to the original document, together with metadata describing the document (e.g., goals, purposes, and short descriptions) that helps users decide whether they need to access the original. A CEO may decide not to download a detailed design document, while a project architect may decide to download and review it. In addition, this ABB typically represents industry-specific data structures; therefore, transformation may be needed for further processing.
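The on-demand link described above can be sketched as a small wrapper that holds descriptive metadata eagerly and fetches the document body only on first access. The `DocumentLink` class, its attributes, and the `loader` callable are hypothetical. In a real system the loader would call into the Operational Systems Layer.

```python
class DocumentLink:
    """Descriptive metadata plus an on-demand loader for a large document."""

    def __init__(self, title, description, loader):
        self.title = title              # cheap metadata, always available
        self.description = description
        self._loader = loader           # callable that fetches the real content
        self._loaded = False
        self._content = None
        self.load_count = 0             # illustrative counter of real fetches

    @property
    def content(self):
        """Fetch the document on first access and reuse it afterwards."""
        if not self._loaded:
            self._content = self._loader()
            self._loaded = True
            self.load_count += 1
        return self._content
```

A consumer can read `title` and `description` without triggering any retrieval. Only reading `content` invokes the loader, and only once.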
The Data Cache ABB is responsible for caching data in support of the data virtualization/information services capability. It helps address variations in the temporal availability of data and improves performance. Temporal availability varies because different data sources make data available on different schedules; for example, one data source could be a time-based file feed, another a mainframe batch program, and a third a real-time relational database. In such a scenario, caching the data in some form helps keep updates and availability consistent. Whether the cache is persistent or non-persistent is an implementation aspect.
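One possible, non-persistent form of such a cache is a time-to-live (TTL) cache: a value delivered by a slow or scheduled source stays available until it expires, and is reloaded on demand afterwards. The class below is a sketch under that assumption. The names and the injectable `clock` parameter are illustrative choices, not part of the SOA RA.

```python
import time


class DataCache:
    """Minimal non-persistent cache with a time-to-live per entry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so that expiry can be tested
        self._entries = {}       # key -> (stored_at, value)

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)

    def get(self, key, loader):
        """Return a fresh cached value, or call loader() and cache its result."""
        entry = self._entries.get(key)
        if entry is not None and self._clock() - entry[0] < self._ttl:
            return entry[1]
        value = loader()
        self.put(key, value)
        return value
```

The TTL would typically be chosen per source, for example longer for a nightly batch feed than for a real-time database.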
See Data Transformer ABB in the Integration Layer.
The Data Consolidator ABB is responsible for extracting relevant information from sources, transforming the information into the appropriate integrated form, and loading the information into the target repository. It supports Extract-Transform-Load (ETL) from one or more source systems into a target system. It also supports real-time ETL, as well as the initial or incremental ETL of volume data into a target repository (e.g., a data warehouse or master data repository).
The Data Federator ABB is responsible for providing Enterprise Information Integration (EII) capabilities for federated query access to structured and unstructured data.
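The extract, transform, and load steps can be sketched with Python's built-in `sqlite3` module standing in for the target repository. The source rows, table layout, and function names are hypothetical. The `INSERT OR REPLACE` statement is one simple way to make an incremental re-run safe for rows that were already loaded.

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system.
SOURCE_ROWS = [
    {"cust_no": "001", "full_name": " acme corp ", "country": "de"},
    {"cust_no": "002", "full_name": "Globex", "country": "FR"},
]


def extract():
    """Extract: read the raw rows from the source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: normalize types, whitespace, and casing."""
    return [
        (int(r["cust_no"]), r["full_name"].strip().title(), r["country"].upper())
        for r in rows
    ]


def load(conn, rows):
    """Load: upsert the rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customer (id, name, country) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def run_etl(conn):
    load(conn, transform(extract()))
```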
See Access Controller ABB in the Quality of Service Layer.
See Data-Driven Access Controller ABB in the Quality of Service Layer.
The Traceability Enabler/Auditor ABB is responsible for monitoring and managing data usage using a log-like facility. It interprets log information and stores it in databases so that the data can be analyzed and threat alerts initiated. This ABB supports the ability to know who has accessed data, when it was accessed, and what was accessed. It also supports data privacy through the obfuscation of sensitive data.
The Information Metadata Manager ABB is responsible for managing and maintaining metadata in a common metadata repository for the enterprise, covering structured and unstructured data; for example, metadata that describes the master data taxonomies, XML schemas, and rules for business logic and data validation. It stores information regarding the transformation of data types and content, and supports the ability to aggregate data from multiple sources. It is used to share canonical forms (common data models) between SOA Integration Layer elements and other layers of the SOA RA. In particular, it supports the ability to store, retrieve, and translate metadata into forms that can be effectively consumed by repositories local to other layers in the SOA RA. It facilitates reuse of metadata assets, semantics, models, templates, rules, etc. across the enterprise. Information integration capabilities are used to support the replication of changes to metadata that is contained in systems across the enterprise.
The Content Manager ABB is responsible for capturing, aggregating, and managing unstructured content in a variety of formats such as images, text documents, web pages, spreadsheets, presentations, graphics, email, video, and other multimedia. It provides the ability to search, catalog, secure, manage, and store unstructured content to support the creation, revision, approval, and publication of content. It provides the ability to identify new categories of content and create taxonomies for classifying enterprise content. This ABB is also responsible for managing the retention, access control and security, auditing and reporting, and ultimate disposition of business records. It provides for the policy-driven movement of content throughout the storage lifecycle and the ability to map content to the storage media type based on the overall value of the content and the context of the business content.
The Master Data Authoring Environment ABB is responsible for authoring, configuring, approving, managing, customizing, and extending master data, and for adding or modifying instance master data, such as product, vendor, and supplier. These services support the collaborative style of Master Data Management (MDM) use and may be invoked as part of a collaborative workflow to complete the creation, updating, and approval of the information for definition or instance master data.
The Data Miner ABB is responsible for analyzing data access history as well as providing optimization algorithms and business intelligence for data optimization. It enables the building of descriptive and predictive models by uncovering previously unknown trends and patterns in vast amounts of data from across the enterprise, in order to support decision-making.
The Query, Search, Reporting Engine ABB is responsible for supporting ad hoc queries, search, reporting, slicing/dicing/drill-downs, and Online Analytical Processing (OLAP) capabilities for enterprise information.
The Analytics Visualization Engine ABB is responsible for providing interactive visualization of analytics results and data analysis, leading to better analyses, faster decisions, and more effective presentation of analytic results. It provides charting and graphing functionality, spatial dashboard reporting (for example, for scorecards), spatial analysis, and rendering for interaction with components that provide user presentation.
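The who/when/what audit trail and the obfuscation of sensitive data can be sketched together. The in-memory `AUDIT_LOG` list, the set of sensitive field names, and the masking policy (keep the last four characters) are all assumptions for illustration. A real implementation would persist the log and apply the organization's own privacy policy.

```python
from datetime import datetime, timezone

AUDIT_LOG = []                               # stand-in for a persistent audit store
SENSITIVE_FIELDS = {"ssn", "card_number"}    # hypothetical sensitive field names


def mask(value):
    """Obfuscate all but the last four characters of a value."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


def read_record(user, record):
    """Log who accessed what and when, then return an obfuscated copy."""
    AUDIT_LOG.append({
        "who": user,
        "when": datetime.now(timezone.utc).isoformat(),
        "what": record["id"],
    })
    return {
        key: mask(value) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }
```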
See Business Activity Monitor ABB in the Quality of Service Layer.
See Business Activity Manager ABB in the Quality of Service Layer.
See Activity Correlation Manager ABB in the Quality of Service Layer.
The Business Information ABB represents business vocabularies (the business glossary and terms) and the key business entities of the organization, together with their definitions.
The Common Information ABB represents the definition of entities and their relationships, the logical data definition for database design, and the message model for service definition and specification. Information is one of the fundamental constructs of an SOA solution and of analysis and design based on the service-oriented paradigm.
The Business Events ABB represents the definition of business events. The event is one of the fundamental constructs of an SOA solution and of analysis and design based on the service-oriented paradigm.
The Data Repository ABB provides the essential foundation for the storage of operational and reshaped information that adds business value. The core data repositories are Analytical Data, Operational Data, Master Data, Unstructured Data, and Metadata.
The ABBs in the Information Layer can be thought of as being logically partitioned into the following categories which support:
ABBs in the Information Layer
The relationship among these ABBs is shown for different scenarios.
The first scenario is for Information as a Service (IaaS), where information is retrieved from multiple sources.
Key Interactions among ABBs in the Information Layer in an IaaS Query Scenario
The second scenario relates to adding and updating information in the context of master data management.
Key Interactions among ABBs in the Information Layer for an Add/Update in an MDM Scenario
The third scenario is updating MDM by extracting deltas from source systems.
Key Interactions among ABBs in the Information Layer for a Delta Extract and Update in an MDM Scenario
Certain relationships exist between the ABBs in the Information Layer and those in other cross-cutting and horizontal layers:
Key Interactions of the Information Layer with Cross-Cutting Layers
The four horizontal layers of the SOA RA that are logically more functional in nature (namely, the Consumer Layer, Business Process Layer, Services Layer, and Service Component Layer) require information (structured and unstructured data, metadata, and messages) to fulfill their respective responsibilities and, therefore, rely on the Information Layer to access information. These horizontal layers depend on the ABBs of the Information Layer to fulfill their information needs.
Key Interactions of the Information Layer with Horizontal Layers
In particular, for industry-specific SOA solutions, this layer captures all the common cross-industry and industry-specific data structures, XML-based metadata architectures (e.g., XML schema), and business protocols for exchanging business data. Some discovery, data mining, and analytic modeling of data are also covered in this layer. These common structures may be standardized for the industry or organization.