Traditionally, institutions employ a range of information distribution systems
that can be loosely described as publishing market data content to users.
Institutions choose particular technology solutions on the basis of the
characteristics of the content and the requirements of the user, in conjunction
with the cost/benefit analysis of the solution and a good understanding of the
features and limitations of the technologies in hand.

Common distribution system architectures include traditional database access
(via centralised database servers or through separate application servers),
video switches, terminal solutions, market data distribution systems (MDDS or
platforms) and a range of other techniques.

In recent years, Web technology has begun to deliver systems fully capable of
distributing market data content to large numbers of users. Web Servers are
increasingly becoming a central part of our distribution system strategy.
Dataworks Enterprise is a tool that can help to integrate these technologies using a
common set of paradigms.

In addition, distributed object technology is rapidly developing. Initially,
this is being used in traditional RPC environments and as a basis for middle
tier business functionality. However, this is relatively new technology and,
consequently, its deployment will be limited to very specialised application
domains.

In the near future, MDDSs will continue to form the core real-time distribution
architecture with Web technology increasingly being employed for static or
near-static content. To see why this is the case, we need to review the major
features and limitations of the underlying technologies.

Web systems are identified by the use of specific web-related protocols (i.e.
HTTP 1.0/1.1) used to communicate between clients and servers. Many people use
the term "web protocols" when they mean Internet protocols. To describe
standard Internet Protocols as being Web protocols is a bit like describing all
fish as trout. This confusion arises partly because Web Services employ a range
of protocols taken from existing Internet standards, such as MIME (the
multimedia email extension). It is also partly because web clients (or
browsers) are able to communicate with a range of non-Web Internet services
such as news, file transfer, mail and terminal sessions in addition to Web
Servers themselves.

Using HTTP, Web publishing systems deliver data on a point-to-point basis as a
result of receiving a specific request for that content from an individual
client. Each request from client to server causes a new connection to be made
to the server, the request passed across that connection, and the connection
closed. In HTTP 1.1, the connection may be kept open and reused to reduce TCP
virtual circuit
set-up/tear-down times, but this is an optimisation and does not affect the
basic interaction of the client and the server. As far as the user is
concerned, each element of content is retrieved in a separate conversation with
the server. Typically, a browser will compose a single page of information from
the server using content retrieved via a number of individual requests.

The content of an HTTP request is, in principle, a "blob" of data, typically
text-based. Binary data is often encoded into a textual representation
for transmission although there is no strict requirement that this is the case.
Content is, typically, HTML or XML, although HTTP can be used to distribute
images, documents and other types of content.

Unlike web-based solutions, market data distribution systems are optimised to
deliver small quantities of dynamic content to a large number of users. The
protocols used tend to be proprietary multicast protocols, usually based on
low-level Internet protocols such as IP or UDP. The system is characterised by
the server receiving requests from the system to open a stream on an item of
data for one or more clients. Typically, the server is unaware of the number of
clients and their identity. There is only an indirect relationship between an
individual client request and that made by the system to the server. Since the
connection between the client and the server is maintained, the server can push
new content to the client as required.

The distinction between opening a stream and retrieving an item is an
important one. Most web servers claim to support dynamic content through
re-request of static data, rather euphemistically called "server push". An
example of such an approach is "Web Channel" technology. Web "server push" is
no more than a scheduled "client pull": there are no actual "channels" in "Web
Channel" technology. Real-time updates, however, are only possible where a
stream of content flowing from server to client can be maintained, since it is
only through this stream that servers can "push" content (in the form of
updates) to clients.

In other words, web servers implement client-pull publishing (the client "pulls"
the data on demand from the server). Market Data Distribution Systems implement
"server push" publishing (the client expresses an interest in an open-ended
stream of data, and the server replies with state and changes as they happen).

To ease this limitation, some web servers attempt to multiplex streams of data
over public point-to-point protocols such as FTP. However, these will never
achieve the performance levels of a multicast distribution system. Others use
proprietary and public multicast protocols to distribute data. In both cases,
they cease to be web servers at all.

The relationship of the server and the client is also important. Web servers are
aware of the client making an individual request and are responsible for
applying security mechanisms such as access permissions to individual items of
content. A platform server is not aware of the particular client making the
request and has to delegate the authorisation of users to access particular
data to the platform.

Performance is also a key issue. As a result of the request/response paradigm,
the resources a Web system needs tend to grow linearly with the request rate.
Each new request places
additional and increasing load on the server. As more individuals make requests
of the system, more CPU, memory and disk resources are required on the server
to support the request rate and more network bandwidth is consumed. Adding to
the difficulty of deploying these systems, user requests can cause
unpredictable traffic congestion, resulting in delays and a lowering of the
overall quality of service of the network. The fact that there is a large
number of caching solutions being devised, including local browser caching, Web
Proxies and Web Distribution managers, illustrates the problem. Unlike Web
systems, market data systems scale in a less than linear fashion (i.e. they
scale well).

Web solutions are, however, capable of supporting simultaneous access to a
content set only limited by the power of the server. Since each request is
satisfied on a round trip, only the storage space and CPU resources available
to the server limit content. Web solutions are also capable of creating new
content sets "on the fly". For instance, a Web server may represent the results
of a database query as content unique to the user who issued the query. This
makes the web server well suited to three-tier application architectures.

MDDSs rely on the notion that at any given time the content set being viewed is a
small subset of what is available, and, more particularly, that within any
given group of users, a large proportion will be viewing the same limited set
of items. This makes them well suited to distribution of commonly used data
such as market prices, but less well suited to satisfying a random query from a
database.

The direction of distribution is also a key factor. Platforms have largely been
under the control of large data vendors who depend on the redistribution of
contributed data. As a result, platforms have, traditionally, provided a means
for transferring contributions from the client machine to central servers.
Central contribution servers then provide fan-out and normalisation of the
data, presenting the contributions in more than one vendor format (as required)
and sending the data to the relevant vendor(s).

However, both Web services and distribution systems are optimised to deliver
data from server to client. Neither technology is well suited to scenarios
where data is being transferred "upstream". Better technical solutions may be
found in traditional database technology. However, the cost of deployment
suggests that where a distribution mechanism is in place that can be used for
limited upstream communication, this may be cheaper than installing additional
technologies (the KISS principle).

Also increasingly important is the notion of total cost of ownership (TCO) in
general, and cost to deploy in particular. Traditionally, the MDDS was the core
of the distribution architecture. However, such systems are notoriously
expensive to
install and provide a quality of service far beyond what is required to support
many end users. Web solutions appear much cheaper to deploy and can support
large numbers of users with limited requirements. As a result, many
institutions find themselves assessing the cost of deploying a new
system against the requirements of the end user. For instance, they need to
ask whether the users need real-time updates to the data. Many
institutions run both systems concurrently to suit the wide range of tasks
being performed and find that they then need the means to integrate the two
approaches.

Many of the arguments associated with Web technology have concentrated on client
technology issues, diverting attention from distribution system infrastructure.
Traditional Web clients (or browsers) are general-purpose pieces of software
that allow the end-user to view displays of (and in some ways, to interact
with) web content. Browsers have a lower rollout cost for large numbers of end
users because they are standard pieces of software and are sometimes even
delivered as part of the OS. Unlike browsers, clients to market data
distribution systems are, typically, custom-built applications. These
applications usually provide specific sets of functionality known to be useful
to end-users. However, this difference is not dictated by web technology
itself. There is
no reason at all why specialised applications could not employ web technology,
or why browsers could not be given access to market data directly (you can do
this with Dataworks Enterprise).

Browsers are normally associated with thin client technology. Thin clients are
intended to reduce TCO by reducing management and administration costs.
However, the thin/thick client argument is also a bit misleading in relation to
browsers. A thin client that is populated with large quantities of downloaded
application code ceases to be a thin client at all. Indeed, from a TCO point of
view, this is probably an administrative nightmare, defeating the original
purpose of the technology. True thin client technology is concerned with
restricting functionality to appropriate levels for cost reasons, such as the
provision of view-only terminal positions. In the case of the browser, this
would probably mean extremely tight control over all scripts, Java code and any
other "downloadable" executable unit on the client machine apart from the
displayed page. The easiest way to do this is to disallow it completely. It
would also imply that upstream connections, particularly those that employ
additional virtual circuits, would also have to be tightly controlled to
minimise their impact on the network.

As shown above, the nature of web protocols indicates that Web servers are best
suited to delivering static or near-static content. Equally, the market data
distribution system is best suited to the delivery of real-time content. These
conclusions are implicit in the very definitions of "web service" and
"real-time distribution system". This is not to say that it is never
appropriate to use one for the other. Actual implementation technology will
often depend as much on the availability of a technology as on its
appropriateness. For instance, when deploying a new service on the web, we have
to consider whether the users have the browser installed, what the
implications and potential bottlenecks in the network are, and so on. It may be the
case that this new service, although appearing to suit the web server
environment, will actually be more cost effectively deployed using a central
database or the market data system.

Dataworks Enterprise, The Distribution System and The Web
Web Technology vs. MDDS - A Background
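
The contrast between client-pull and server-push publishing discussed in this
background can be illustrated with a small in-memory simulation. The sketch
below is only an illustration under stated assumptions: the names PullServer,
PushPlatform and simulate are hypothetical, they are not part of Dataworks
Enterprise or of any real MDDS or Web server, and no networking is involved.
It counts how many requests a pull-style server must answer when many clients
watch the same item, compared with the single stream a push-style platform
opens on that item and fans out to its subscribers.

```python
from collections import defaultdict


class PullServer:
    """Hypothetical client-pull server: every refresh is a separate request."""

    def __init__(self, prices):
        self.prices = dict(prices)
        self.requests_served = 0

    def get(self, item):
        # Each call stands in for one complete request/response round trip.
        self.requests_served += 1
        return self.prices[item]


class PushPlatform:
    """Hypothetical server-push platform: one stream per item, fanned out."""

    def __init__(self, prices):
        self.prices = dict(prices)
        self.subscribers = defaultdict(list)
        self.streams_opened = 0

    def subscribe(self, item, callback):
        # A stream is opened only for the first watcher of an item;
        # later watchers share it.
        if not self.subscribers[item]:
            self.streams_opened += 1
        self.subscribers[item].append(callback)
        callback(item, self.prices[item])  # deliver the current state first

    def publish(self, item, price):
        # An update is published once and delivered to every subscriber.
        self.prices[item] = price
        for callback in self.subscribers[item]:
            callback(item, price)


def simulate(n_clients=100, n_updates=5, item="XYZ"):
    """Run both models over the same sequence of price changes."""
    pull = PullServer({item: 100.0})
    push = PushPlatform({item: 100.0})
    received = []

    for _ in range(n_clients):
        pull.get(item)  # initial page load
        push.subscribe(item, lambda i, p: received.append((i, p)))

    for step in range(1, n_updates + 1):
        price = 100.0 + step
        pull.prices[item] = price
        push.publish(item, price)
        for _ in range(n_clients):
            pull.get(item)  # every client must re-request to see the change

    return {
        "pull_requests": pull.requests_served,
        "push_streams_opened": push.streams_opened,
        "push_deliveries": len(received),
    }


if __name__ == "__main__":
    print(simulate())
```

In this sketch the pull server answers one request per client per refresh,
whereas the push platform opens a single stream regardless of the number of
watchers. The per-client deliveries counted for the push model are an artefact
of the simulation; on a multicast network a single transmission could reach
many clients at once.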