The Open Group SOA Source Book

Information Architecture for SOA

Information architecture has been defined in Wikipedia as: “The art of expressing a model or concept of information used in activities that require explicit details of complex systems.” Today, the term is most commonly used in the context of web design. We are using it here in a different sense: not to refer to the way information is presented in a service-oriented architecture (that would be “SOA for Information Architecture”), but to describe how models and concepts of information are expressed in SOA development - “Information Architecture for SOA”.

The immediate customers of this form of information architecture are not the end users of the architected systems (although the end users do benefit indirectly). They are the people engaged in SOA development and implementation: the clients, stakeholders, architects, and implementers.

Information architecture is important to these people because inconsistent use of information leads to poor design, difficult implementation, and inconsistent operation. It is important for all styles of enterprise architecture, but it has a particular prominence for SOA.

This section explains why information architecture is so important for SOA, and describes how to address information architecture for SOA using The Open Group Architecture Framework (TOGAF).

The Importance of Information Architecture for SOA

Information architecture is particularly important for SOA, for a number of reasons.

  • Services must be able to exchange information with each other.

Services exchange information much more than traditional applications do. This helps SOA to deliver business agility, but it is difficult to organize services to support business processes if the services assume different data formats or semantics. In that case, data formats and semantics must be translated in real time, and expensive bespoke programming is needed to enable the services to communicate with each other.

It is not generally possible to standardize data formats across an enterprise. Data formats used by applications bought “off the shelf” cannot be changed, and legal and regulatory frameworks sometimes imply particular information models.
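
As a minimal sketch of the translation problem described above, assume two hypothetical services that label the same customer fields differently. The field labels and the shared terms below are invented for illustration and do not come from TOGAF or from any product. The sketch translates a record from one format to the other by way of shared terms, rather than with a bespoke translator for that one pair of services.

```python
# Hypothetical field labels for two services; none of these names are real.
BILLING_TO_SHARED = {"cust_no": "customer_id", "nm": "customer_name"}
SHARED_TO_SHIPPING = {"customer_id": "CustomerRef", "customer_name": "FullName"}


def translate(record, to_shared, from_shared):
    """Translate a record between two formats via shared business terms."""
    result = {}
    for label, value in record.items():
        term = to_shared.get(label)
        if term is None:
            # No agreed meaning for this label: fail rather than guess.
            raise KeyError(f"no shared term for field {label!r}")
        target = from_shared.get(term)
        if target is None:
            raise KeyError(f"target format has no field for term {term!r}")
        result[target] = value
    return result


billing_record = {"cust_no": "C-1001", "nm": "Ada Lovelace"}
shipping_record = translate(billing_record, BILLING_TO_SHARED, SHARED_TO_SHIPPING)
```

With this approach each format needs one mapping to and one mapping from the shared terms, instead of one translator for every pair of formats.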

  • Each service must process information from different sources.

A traditional “information silo” application uses its own internal data formats. In SOA, a service must process data in a variety of externally-defined formats.

This data is often stored in different locations, and may be inconsistent. For example, it is common for an enterprise to hold customer data in different databases (with different schemas) in different geographical locations.

  • Services should be easy to discover.

In a traditional applications architecture, there are relatively few applications, with relatively few interactions between them. It is easy to identify the applications involved in each interaction, and interactions are generally fixed at design time. In SOA, there can be many services, it can be hard to identify the appropriate service for a particular interaction, and interactions can be determined at run time through dynamic discovery.

Discoverable services make service composition easier, and cut costs through service re-use. Services sold as products must be discoverable. Dynamic discovery improves agility and can simplify operation and maintenance. But it is hard to determine whether a service meets a requirement if the service description and the requirement are expressed in different terms. This means that dynamic service discovery is very difficult, and manual service discovery is slow and expensive.
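
The mismatch of terms can be illustrated with a small sketch. It assumes a hypothetical in-memory registry in which each service is described by a list of business terms, and a hypothetical synonym map from stakeholder terms to enterprise-wide terms. Real registries and discovery mechanisms are considerably richer than this.

```python
# Hypothetical synonym map from stakeholder terms to enterprise-wide terms.
TERM_MAP = {"client": "customer", "purchaser": "customer", "bill": "invoice"}


def normalize(terms):
    """Lower-case each term and map it to its enterprise-wide equivalent."""
    return {TERM_MAP.get(t.lower(), t.lower()) for t in terms}


def find_services(requirement, registry):
    """Return names of services whose described terms cover the requirement."""
    needed = normalize(requirement)
    return sorted(
        name for name, described in registry.items()
        if needed <= normalize(described)
    )


# Illustrative service descriptions, written by different stakeholder groups.
registry = {
    "InvoiceLookup": ["Customer", "Invoice"],
    "ClientBilling": ["client", "bill", "payment"],
    "StockCheck": ["product", "warehouse"],
}
matches = find_services(["purchaser", "invoice"], registry)
```

Without the synonym map, the requirement “purchaser, invoice” would match neither of the two billing-related services, although both deal with the information that is needed.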

A good information architecture should encompass proprietary information models (some of which are unavoidable), and help implementers develop discoverable services that access and exchange information. This will facilitate composition and event processing, enable consistent use of information by services, and lead to consistent system behavior. It will also facilitate use of enterprise services by external systems.

Information Architecture for SOA using TOGAF

An information architecture is developed as part of an enterprise architecture or solution architecture. The Open Group Architecture Framework (TOGAF) can be used for developing enterprise and solution SOAs. Other sections of this Source Book describe Using TOGAF for Enterprise SOA and Using TOGAF for SOA Solutions. This section describes how information architectures are created as part of those enterprise and solution architecture development processes.

TOGAF includes a Data Architecture sub-phase, which defines the major types and sources of data necessary to support the business, as part of its Information Systems Architecture phase (Phase C). However, information architecture development starts before this point: it requires attention in the Preliminary Phase and in the Business Architecture phase (Phase B). Also, although this is not part of the information architecture itself, the Technology Architecture phase (Phase D) includes the definition of technology components and standards to support the information architecture.

The information architecture considerations for these phases of TOGAF are described below.

Preliminary Phase

TOGAF's Preliminary Phase includes the adoption of a set of architecture principles, the identification of appropriate reference models, and a review of the governance regime.

TOGAF has an example set of architecture principles that enterprises can adopt. This includes the principle of Common Vocabulary and Data Definitions:

  • Data is defined consistently throughout the enterprise, and the definitions are understandable and available to all users.

Adoption of this principle, or a similar one, is the basis of information architecture development.

The reference models identified in the Preliminary Phase should include a meta-model for the information used by the enterprise internally and when interworking with other enterprises. Imposing an enterprise-wide data model (for example, saying that every database must have the same structure) might seem desirable, but generally will not work. An overall information meta-model plus a set of specific information models is the best practical way of achieving consistent representation and processing of information.

The overall information meta-model describes how information is modeled and how business vocabularies are developed and used. Typically, each group of stakeholders has its own stakeholder vocabulary: a particular set of terms for describing the aspects of the business in which it is interested. The meta-model contains rules and conventions for codifying these vocabularies and mapping them to each other. It also contains conventions for data field labels (including XML tags), and for mapping these labels to the vocabularies.
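
One way to read this paragraph is as a small set of data structures. The sketch below is an assumption about how such conventions might be captured, not a structure defined by TOGAF. The class names `Vocabulary` and `VocabularyRegistry`, and the sample terms and labels, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """A named set of business terms used by one group of stakeholders."""
    name: str
    terms: set = field(default_factory=set)


@dataclass
class VocabularyRegistry:
    """Holds vocabularies, term-to-term mappings, and label-to-term mappings."""
    vocabularies: dict = field(default_factory=dict)
    term_mappings: dict = field(default_factory=dict)   # (vocab, term) -> (vocab, term)
    label_mappings: dict = field(default_factory=dict)  # field label -> (vocab, term)

    def add_vocabulary(self, vocab):
        self.vocabularies[vocab.name] = vocab

    def _check(self, vocab_name, term):
        # Convention enforced here: only codified terms may be mapped.
        vocab = self.vocabularies.get(vocab_name)
        if vocab is None or term not in vocab.terms:
            raise ValueError(f"{term!r} is not a term in vocabulary {vocab_name!r}")

    def map_terms(self, source, target):
        self._check(*source)
        self._check(*target)
        self.term_mappings[source] = target

    def map_label(self, label, target):
        self._check(*target)
        self.label_mappings[label] = target

    def resolve_label(self, label):
        """Return the (vocabulary, term) pair that a data field label stands for."""
        return self.label_mappings[label]


registry = VocabularyRegistry()
registry.add_vocabulary(Vocabulary("sales", {"client"}))
registry.add_vocabulary(Vocabulary("enterprise", {"customer"}))
registry.map_terms(("sales", "client"), ("enterprise", "customer"))
registry.map_label("CustName", ("enterprise", "customer"))
```

Here `registry.resolve_label("CustName")` returns the enterprise term that the label represents, and an attempt to map a label to an uncodified term raises an error.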

The information meta-model should be determined in the Preliminary Phase. The specific information models should be created as required for subsequent architecture developments in the Business Architecture phases of those developments. The data field labels and their mappings to the vocabularies should be created as required in those developments' Information Systems Architecture phases.

While the TOGAF Architecture Development Method does not include the development of a governance regime, it does assume that one is in place. The Preliminary Phase includes a review of the existing governance and support models. This review should ensure that the governance regime covers information stakeholders, information lifecycle management, and information interoperability goals and metrics.

Phase A: The Architecture Vision

This phase includes identifying stakeholders and describing the overall architecture vision.

The stakeholders will include information stakeholders who have concerns relating to the availability and quality of information. They will also include implementer stakeholders who have concerns relating to the representation and location of information. These stakeholders and concerns should be identified in Phase A, so that the concerns can be addressed in the information architecture.

The architecture vision described in Phase A includes initial versions of the baseline and target Business, Information Systems, and Technology Architectures. These will have Information Architecture components, as described under Phase B: Business Architecture, Phase C: Information Systems Architectures, and Phase D: Technology Architecture below.

Phase B: The Business Architecture

Phase B includes an analysis of the business functions and processes. This is where the information that is central to the business operations is described. This is crucial for SOA, as it forms the basis for the identification and definition of the portfolio of services.

The analysis should result in specific vocabularies, mappings and information models that conform to the meta-model identified in the Preliminary Phase.

The vocabularies include stakeholder vocabularies and enterprise-wide vocabularies. The mappings show how the terms in these vocabularies correspond.

The models use these vocabularies, and show:

  • The major business information object types and relationships; and
  • Intrinsic properties of information objects, such as quality, access rights, and abundance.
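
As an illustration only, an information model of this kind might be recorded along the following lines. The class names, the property values, and the sample object types are assumptions made for the sketch. TOGAF does not prescribe this representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InformationObjectType:
    """A business information object type with its intrinsic properties."""
    name: str
    quality: str            # illustrative values, e.g. "authoritative"
    access_rights: tuple    # stakeholder groups allowed to read the information
    abundance: str          # rough volume, e.g. "thousands" or "millions"


@dataclass
class InformationModel:
    """Object types plus the named relationships between them."""
    object_types: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_type(self, object_type):
        self.object_types[object_type.name] = object_type

    def relate(self, source, relation, target):
        # Both ends of a relationship must already be declared object types.
        for name in (source, target):
            if name not in self.object_types:
                raise KeyError(f"unknown information object type: {name!r}")
        self.relationships.append((source, relation, target))

    def related_to(self, source):
        """Return (relation, target) pairs that start at the given type."""
        return [(rel, tgt) for src, rel, tgt in self.relationships if src == source]


model = InformationModel()
model.add_type(InformationObjectType("customer", "authoritative", ("sales", "billing"), "thousands"))
model.add_type(InformationObjectType("order", "authoritative", ("sales",), "millions"))
model.relate("customer", "places", "order")
```

The object type names would be taken from the vocabularies, so that the model and the vocabularies stay consistent.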

The level of detail in these vocabularies and models depends on the nature of the architecture development. For a solution architecture, the business process analysis should go to a detailed level, the vocabularies should contain all of the business terms relevant to the solution, and the models should show the business objects relevant to the solution. For an enterprise architecture, it is not generally possible to go to this level of detail: the vocabularies should just contain the most important business terms, and the models should show the most important business objects. The vocabularies and models developed for an enterprise architecture will be extended by the solutions that are developed within that architecture.

Phase C: The Information Systems Architectures

While Phase B of TOGAF is concerned with business information in the abstract, Phase C is concerned with the data that represents that information, and the applications that process the data.

The Data Architecture sub-phase of Phase C includes development of:

  • Business data models;
  • Logical data models;
  • Data management process models;
  • Data entity/business function matrices;
  • Data interoperability requirements (e.g., XML schema, security policies); and
  • Data architecture building blocks.
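
To make one of these artifacts concrete, the following sketch builds a simple data entity/business function matrix from a list of usage pairs. The function and entity names are hypothetical, and the matrix only records whether a function uses an entity. A real matrix may carry more detail.

```python
def build_matrix(usages):
    """Build an entity-by-function matrix from (function, entity) pairs."""
    used = set(usages)
    functions = sorted({function for function, _ in used})
    entities = sorted({entity for _, entity in used})
    # One row per entity, one column per function, "X" where the function uses it.
    rows = {
        entity: ["X" if (function, entity) in used else "" for function in functions]
        for entity in entities
    }
    return functions, rows


# Illustrative usage pairs; a real list would come from the Phase B analysis.
usages = [
    ("Billing", "Customer"),
    ("Billing", "Invoice"),
    ("Sales", "Customer"),
    ("Sales", "Order"),
]
functions, rows = build_matrix(usages)
```

In this example the Customer row is marked under both functions, which flags it as data that the corresponding services will need to share.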

The development of these models, requirements, and building blocks is described in the TOGAF specification. It is not substantially different for SOA than for other architecture styles.

The terms used in the data models, including data-store field labels, should relate to the vocabularies developed in Phase B, and the models should conform to the overall information meta-model identified in the Preliminary Phase.

The Applications Architecture sub-phase of Phase C is concerned with the applications that process the data. For SOA, this means groups of loosely-coupled services. The definition of these services, and of the interfaces between them, should be based on the vocabularies and information models defined in Phase B and the data models defined in the Data Architecture sub-phase. In particular, where the services exchange XML messages, the message schemas should be based on the information and data models, and their tags should be related to the terms contained in the vocabularies, as prescribed by the overall information meta-model.
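
A simple check of this relationship between tags and terms might look like the sketch below. It assumes a hypothetical tag-to-term mapping and a sample message invented for illustration, and it uses Python's standard `xml.etree.ElementTree` module to list the element tags that have no corresponding vocabulary term.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XML tags to vocabulary terms.
TAG_TO_TERM = {"Order": "order", "CustName": "customer", "Total": "order value"}


def unmapped_tags(xml_text, tag_to_term):
    """Return, in sorted order, the element tags that map to no vocabulary term."""
    root = ET.fromstring(xml_text)
    # root.iter() visits the root element and all of its descendants.
    # Note: namespaced tags would appear as "{uri}local" and need their own entries.
    return sorted({element.tag for element in root.iter() if element.tag not in tag_to_term})


message = "<Order><CustName>Ada</CustName><Total>42</Total><Misc>x</Misc></Order>"
problems = unmapped_tags(message, TAG_TO_TERM)
```

For the sample message the check reports the `Misc` tag, which has no agreed meaning in the vocabularies.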

The level of detail of modeling depends on the nature of the architecture development. A solution architecture should describe the key data elements that the solution uses, including all of those that are shared with other solutions. It will in general not be possible to go to this level of detail for an enterprise architecture, but the models should be in sufficient detail to enable solution development.

Phase D: The Technology Architecture

Phase D of TOGAF defines technology components, including the hardware and software infrastructure for SOA. This phase is not concerned with the development of information architecture, but it does include the definition of information infrastructure building blocks and information management building blocks that support the information architecture.

In some cases, such as metadata registries, these building blocks form part of the development environment as well as the run-time environment. And, particularly where a model-driven approach is used, the specification of information-based systems to support implementation can often be considered as a part of the architecture.

This blurring of the distinction between design-time and run-time environments is characteristic of SOA in general. It is particularly apparent with information architecture for SOA. As the technology that supports information architecture develops, that technology will be increasingly important for SOA.
