Presentation Integration
Integration Patterns
Contents
Aliases
Context
Problem
Forces
Solution
Example
Resulting Context
Testing Considerations
Security Considerations
Acknowledgments
Aliases
Screen scraping
Context
You have multiple independent applications that are organized
as functional silos. Each application has a user interface.
Problem
How do you integrate information systems that were not
designed to work together?
Forces
To correctly solve this problem, you need to consider the
following forces:
- When integrating multiple applications, you should
minimize changes to existing systems because any change to
an existing production system represents a risk and might
require extensive regression testing.
- Because many computer systems are designed according to
the Layered Application pattern, some of the layers are
typically deployed in physical tiers that are not externally
accessible. For example, security policy may require the
database tier of an application to reside on a separate
physical server that is not directly accessible to other
applications. Most Web applications allow external access
only to the user interface and protect the application and
database tiers behind firewalls.
- Many applications feature little or no separation
between business and presentation logic. In a typical
mainframe application, for example, the only remote
components are simple display and input devices. These
display terminals access the central mainframe computer,
which hosts all business logic and data storage.
- The business logic layer of many applications is
programming-language specific and is not available remotely,
unless it was developed on top of a specific remoting
technology such as DCOM or Common Object Request Broker
Architecture (CORBA).
- Directly accessing an application's database layer can
cause corruption of application data or functionality. In
most applications, important business rules and validations
reside in the business logic, and often in the
presentation layer as well. This logic is intended to prevent
erroneous user entry from affecting the integrity of the
application. Making data updates directly through the data
store bypasses these protection mechanisms and increases the
risk of corrupting the application's internal state.
Solution
Access the application's functionality through the user
interface by simulating a user's input and by reading data from
the screen display. Figure 1 shows the elements of a solution
that is based on the Presentation Integration pattern.
Figure 1. Presentation Integration connects
to an existing application through the presentation layer
The Presentation Integration pattern is sometimes
disparagingly called screen scraping because the
middleware collects (or scrapes) the information that is
displayed on the screen during a user session. Collecting
information from the screen tends to be the simpler part of the
integration. The more difficult part tends to be locating the
correct screen in the application, which the middleware has to
do in the same way that a human user does.
To simulate user interaction, the integration solution has to
use a terminal emulator that appears to the application as a
regular terminal, but that can be controlled programmatically to
simulate user input. Such a terminal emulator is usually
specific to the exact type of user interface device that is
supported by the application. Fortunately, in the mainframe
world, IBM's 3270 terminal standard is so widespread that many
commercial 3270 terminal emulators are available. Instead of
displaying information to the user, these emulators make the
screen data available through an API. In the case of 3270
emulators, a standard API exists that is called the High Level
Language Application Program Interface (HLLAPI). The emulator
can send data to the application to simulate keystrokes that a
user would make on a real 3270 terminal. Because the terminal
emulator mimics a user's actions, it usually does not depend on
the specific application that it is interacting with. Additional
middleware logic must encode the correct keystrokes and extract
the correct fields from the virtual screen.
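The split between a generic emulator and application-specific middleware can be illustrated with a minimal Python sketch. It does not use HLLAPI or any real emulator product: `FakeEmulator`, `send_keys`, `read_field`, the `[ENTER]` key notation, and the field position of the customer name are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

ROWS, COLS = 24, 80  # 24 x 80 character screen, as on a 3270 Model 2 terminal


@dataclass
class FakeEmulator:
    """Hypothetical in-memory stand-in for a programmable terminal emulator."""
    screen: list = field(default_factory=lambda: [" " * COLS for _ in range(ROWS)])
    keystrokes: list = field(default_factory=list)

    def put_text(self, row, col, text):
        """Helper that plays the role of the host application painting the screen."""
        line = self.screen[row - 1]
        start = col - 1
        self.screen[row - 1] = (line[:start] + text + line[start + len(text):])[:COLS]

    def send_keys(self, keys):
        """Record simulated keystrokes (a real emulator would forward them)."""
        self.keystrokes.append(keys)

    def read_field(self, row, col, length):
        """Return the text at a 1-based screen position, without trailing blanks."""
        start = col - 1
        return self.screen[row - 1][start:start + length].rstrip()


# Middleware logic: the only part that knows this application's layout.
CUSTOMER_NAME = (5, 20, 30)  # row, column, length (assumed layout)


def lookup_customer_name(emulator, customer_id):
    emulator.send_keys(customer_id)
    emulator.send_keys("[ENTER]")
    return emulator.read_field(*CUSTOMER_NAME)


emu = FakeEmulator()
emu.put_text(5, 6, "Customer name:")
emu.put_text(5, 20, "Ada Lovelace")
name = lookup_customer_name(emu, "C-1001")
print(name)
```

The emulator class contains nothing specific to the customer application. The keystroke sequence and the field coordinates live entirely in the middleware function.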
The widespread trend of equipping applications with Web-based
interfaces has revived interest in using Presentation
Integration as a viable integration approach. Web
applications are easily accessible over the Internet. However,
the only accessible portion is the user interface, which is
accessed through the relatively simple HTTP. Web
applications transmit presentation data in HTML. Because HTML is
relatively easy to parse programmatically, Presentation
Integration is a popular approach.
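As a rough illustration of how little code is needed to parse HTML, the following sketch uses Python's standard `html.parser` module to turn a two-column table into a dictionary. The page content, the `CellCollector` class, and the label/value layout are assumptions made for this example. A real integration would first fetch the page over HTTP.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:  # tolerate a cell that appears outside any row
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


# Hypothetical page fragment: one label cell and one value cell per row.
PAGE = """
<table>
  <tr><td>Name</td><td>Ada Lovelace</td></tr>
  <tr><td>City</td><td>London</td></tr>
</table>
"""

parser = CellCollector()
parser.feed(PAGE)
record = {label.strip(): value.strip() for label, value in parser.rows}
print(record)
```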
Unfortunately, the ease of collecting information from a
provider's Web page over the Internet has caused some
application providers to intentionally exploit the
biggest weakness of Presentation Integration:
brittleness. Because Presentation Integration usually
depends on the exact geometric layout of the information,
rearranging data fields can easily break a Presentation
Integration solution. The graphical nature of HTML allows a
provider to easily modify the HTML code that describes the
layout of the information on the screen. The layout changes then
block any attempt to collect information from the Web page.
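The brittleness described above can be shown with a short sketch. The two HTML fragments, the `balance_by_position` function, and the assumption that the value of interest is always the fourth table cell are hypothetical. The point is only that a position-based extractor keeps running after a layout change but returns the wrong field.

```python
import re


def cells(html):
    """Return the text of every <td> cell in document order."""
    return [c.strip() for c in re.findall(r"<td[^>]*>(.*?)</td>", html, flags=re.S)]


def balance_by_position(html):
    # Assumes the balance is always the fourth cell on the page.
    return cells(html)[3]


# The same data before and after the provider rearranges the columns.
OLD = "<tr><td>Name</td><td>Ada</td><td>Balance</td><td>120.50</td></tr>"
NEW = "<tr><td>Balance</td><td>120.50</td><td>Name</td><td>Ada</td></tr>"

old_value = balance_by_position(OLD)
new_value = balance_by_position(NEW)
print(old_value, new_value)
```

After the rearrangement the extractor raises no error. It simply reports the customer name where the balance is expected, which is often harder to detect than an outright failure.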
Presentation Integration is based on the interaction
between the components that are described in Table 1.
Table 1: Presentation Integration Components
Component | Responsibilities | Collaborators
Presentation layer | Renders a visual presentation to be displayed on a user terminal; accepts user input and translates it into commands to be executed by the business logic | Terminal emulator
Terminal emulator | Impersonates a user session to the presentation layer; makes screen information available through an API; allows other applications to issue commands to the presentation tier | Presentation layer and other applications
Other applications | Consume application data; issue commands | Terminal emulator
Example
A big challenge faced by government agencies is the lack of
integrated data across multiple state agencies. For example,
integrated data gives an income tax agency a more holistic view
of a business because the integrated data might show the number
of employees that the business has and the amount of sales tax
that the business reports, if any. This type of information can
be used to identify businesses where there is a difference
between the tax owed and the tax actually collected; this common
issue is referred to as a tax gap. However, integrating
information from multiple state agencies is often constrained by
political and security concerns. In most cases, it is easier for
an agency to obtain end-user access to another agency's data as
opposed to obtaining direct database access. In these
situations, you can use Presentation Integration to gain
end-user access to a remote data source in the context of an
automated integration solution.
Resulting Context
Presentation Integration is almost always an option
and has certain advantages, but also suffers from a number of
limitations:
Benefits
- Low-risk. In Presentation Integration, a
source application is the application that the other
applications extract data from. It is unlikely that the
other applications that access the source application can
corrupt it because the other applications access the data
the same way that a user accesses the data. This means that
all business logic and validations incorporated into the
application logic protect the internal integrity of the
source application's data store. This is particularly
important with older applications that are poorly documented
or understood.
- Non-intrusive. Because other applications appear
to be a regular user to the source application, no changes
to the source application are required. The only change that
might be required is a new user account.
- Works with monolithic applications. Many
applications do not cleanly separate the business and
presentation logic. Presentation Integration works
well in these situations because it executes the complete
application logic regardless of where the logic is located.
Liabilities
- Brittleness. User interfaces tend to change more
frequently than published programming interfaces or database
schemas. Additionally, Presentation Integration may
depend on the specific position of fields on the screen so
that even minor changes such as moving a field can cause the
integration solution to break. This effect is exacerbated by
the relative verbosity of HTML.
- Limited access to data. Presentation
Integration only provides data that is displayed in the
user interface. In many cases, other applications are
interested in internal codes and data elements such as
primary keys that are not displayed in the user interface.
Presentation Integration cannot provide access to
these elements unless the source application is modified.
- Unstructured information. In most cases, the
presentation layer displays all data values as a collection
of string elements. Much of the internal metadata is lost in
this conversion. The internal metadata that is lost includes
data types, constraints, and the relationship between data
fields and logical entities. To make the available data
meaningful to other applications, a semantic enrichment
layer has to be added. This layer is typically dependent on
the specifics of the source application and may add to the
brittleness of the solution.
- Inefficient. Presentation Integration
typically goes through a number of unnecessary steps. For
example, the source application's presentation logic has to
render a visual representation of the data even though it is
never displayed. The terminal emulation software in turn has
to parse the visual representation to turn it back into a
data stream.
- Slow. In many cases, the information that you
want to obtain is spread across multiple user screens because
of limited screen space. For example, information may be
split between summary and detail screens. The emulator then
has to visit multiple screens to obtain a coherent set of
information, which requires multiple requests to the source
application and slows down the data access.
- Complex. Extracting information from a screen is
relatively simple compared to locating the correct screen or
screens. Because the integration solution simulates a live
user, the solution has to authenticate to the system, change
passwords regularly according to the system policy, use
cursor keys and function keys to move between screens, and
so on. This type of input typically has to be hard-coded or
manually configured so that external systems can access the
integration solution as a meaningful business function,
such as "Get Customer Address." This translation between
business function and keystrokes can add a significant
amount of overhead. The same issues of complexity also
affect error handling and the control of atomic business
transactions.
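The translation between a business function and keystrokes that the Complex liability describes might look like the following sketch. Every name in it is hypothetical: the `Step` and `ScriptedSession` classes, the `[TAB]` and `[ENTER]` key notation, the menu option, the screen titles, and the `ADDRESS:` label are invented for illustration, and the session answers with canned screens instead of talking to a real application.

```python
from dataclasses import dataclass


@dataclass
class Step:
    keys: str          # keystroke template to send
    expect_title: str  # text that must appear on the resulting screen


# Hand-configured navigation script for one business function.
GET_CUSTOMER_ADDRESS = [
    Step("{user}[TAB]{password}[ENTER]", "MAIN MENU"),
    Step("2[ENTER]", "CUSTOMER INQUIRY"),
    Step("{customer_id}[ENTER]", "CUSTOMER DETAIL"),
]


class ScriptedSession:
    """Hypothetical stand-in that answers each send() with a canned screen."""

    def __init__(self, screens):
        self._screens = iter(screens)
        self.sent = []
        self.current = ""

    def send(self, keys):
        self.sent.append(keys)
        self.current = next(self._screens)


def get_customer_address(session, **params):
    """Expose the keystroke script as a single business function."""
    for step in GET_CUSTOMER_ADDRESS:
        session.send(step.keys.format(**params))
        if step.expect_title not in session.current:
            # Error handling: the application is not on the screen we expected.
            raise RuntimeError(f"expected screen {step.expect_title!r}")
    # Assumes the address follows an "ADDRESS:" label on the detail screen.
    return session.current.split("ADDRESS:", 1)[1].strip()


session = ScriptedSession([
    "MAIN MENU",
    "CUSTOMER INQUIRY",
    "CUSTOMER DETAIL  ADDRESS: 12 Example Street",
])
address = get_customer_address(
    session, user="svc_integration", password="secret", customer_id="C-1001"
)
print(address)
```

Even in this toy form, the function has to log on, navigate a menu, and verify each screen before it can read a single field, which is the overhead the liability refers to.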
Testing Considerations
One advantage of using Presentation Integration is
that most user interfaces execute a well-defined and generally
well-understood business function. This can be an enormous
advantage when dealing with monolithic systems that might be
poorly documented or understood.
Unfortunately, this advantage is often offset by the fact
that testing usually depends on the ability to isolate
individual components so that they can be tested individually
with a minimum of external dependencies. Such a testing approach
is generally not possible when using Presentation Integration.
Security Considerations
Presentation Integration uses the same security model
as an end user who logs into the application. This can be an
asset or a liability depending on the needs of the applications
that are participating in the integration solution. An end-user
security model typically enforces a fine-grained security scheme
that includes the specific data records or fields that a user is
permitted to see. This makes exposing the functions through
presentation integration relatively secure.
The disadvantage of the fine-grained security scheme is that
it can be difficult to create a generic service that can
retrieve information from a variety of data sources. In those
cases, a special user account has to be created that has access
rights to all the data resources that are needed by the external
applications.
Acknowledgments
[Ruh01] Ruh, William. Enterprise Application Integration.
A Wiley Tech Brief. Wiley, 2001.