Dataworks Enterprise Configuration

At the heart of Dataworks Enterprise is an in-memory high-speed configuration database. Each component is accompanied by configuration information in the form of text files with the ‘.dat’ extension. As each component is loaded, it requests that the central system load its configuration file(s) into the central database. Keeping the database shared reduces the load time of each component of the system.

In a typical system, a series of basic database files is delivered with each component during installation. These files contain system defaults. In many cases, specific configuration information is entered in these files but commented out, which shows the user the hard-coded default for the component. The entries in these configuration files are also documented. The customer should not edit these files. Instead, an entry at the end of the system file indicates that a specific user file (given the extension ‘.usr’) should be loaded. If the system cannot find this file, it places a message in its trace log and continues. Entries in the user file override those in the system files. This allows the system to be upgraded without affecting user configurations.
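
The override behaviour described above can be illustrated with a short Python sketch. This is not the product's implementation: the function load_file, the file names RemoteSvr.dat, RemoteSvr.usr and Other.usr, and the port value 7000 are assumptions made for the example, and only the documented '<configuration name> : <value>' entry form is parsed.

```python
import os
import tempfile


def load_file(database, path, trace_log):
    """Merge one configuration file into the database, if it exists."""
    if not os.path.exists(path):
        # A missing file is reported in the trace log and otherwise ignored.
        trace_log.append("configuration file not found: " + path)
        return False
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            name, sep, value = line.partition(":")
            if sep and name.strip():
                # An entry loaded later replaces an earlier one of the same name.
                database[name.strip()] = value.strip()
    return True


with tempfile.TemporaryDirectory() as folder:
    system_file = os.path.join(folder, "RemoteSvr.dat")   # hypothetical name
    user_file = os.path.join(folder, "RemoteSvr.usr")     # hypothetical name
    missing_file = os.path.join(folder, "Other.usr")      # never created

    with open(system_file, "w", encoding="utf-8") as handle:
        handle.write("Tosca.RemoteSvr.Port: 6511\n")
    with open(user_file, "w", encoding="utf-8") as handle:
        handle.write("Tosca.RemoteSvr.Port: 7000\n")

    database, trace_log = {}, []
    load_file(database, system_file, trace_log)   # system default
    load_file(database, user_file, trace_log)     # user override
    load_file(database, missing_file, trace_log)  # logged, then skipped

print(database["Tosca.RemoteSvr.Port"])  # 7000
print(len(trace_log))                    # 1
```

Because the user file is loaded after the system file, its value wins, and replacing the system file during an upgrade leaves the user's entry in place.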

On launch, the launch application locates each installed component and starts it. See the launcher documentation for details on how this happens. Each component started then registers with the central core system, requesting that its configurations be loaded. If a component is not loaded, its configurations are never placed in the central repository. If the component is reloaded, its configuration data is also reloaded, overwriting any existing entries.

Locating and Installing Configuration Files

The location of configuration files is specified in the registry under the Config key.

Note that all files are located using this Config path setting in the registry. When one file specifies that another should be included, the entry typically does not contain path information. This allows multiple files to be kept containing per-user, per-site and per-system level configurations.

Each configuration file represents both part of the global database and a database in its own right and is, consequently, sometimes referred to as a Resource or Configuration Database.

One key feature of the configuration databases is that they can be merged ‘on the fly’. The system makes use of this facility. As each module is initialised, it instructs the system to load its configuration file(s) prior to full execution. The system finds the specific files and loads them into the database. Each entry overrides any existing value where there is an absolute match in configuration names.

Applications can also enumerate entries under a given key. For instance, the configuration file RTCache.dat is loaded on entry to the Cache component. On loading, it searches the configuration database for all matches with ‘Tosca.DB’ and takes the values as being the names of additional databases to load. Therefore, entries such as:

	Tosca.DB.SiteData: CorpData.dat
	Tosca.DB.DeskData: DeskData.dat

would cause the additional configuration files ‘CorpData.dat’ and ‘DeskData.dat’ to be located and loaded.
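
As an illustration only, the lookup described above can be sketched in Python. The database is modelled here as a plain dictionary and the function name additional_databases is hypothetical; the real component performs this search through the configuration system.

```python
def additional_databases(database, prefix="Tosca.DB"):
    """Return the values of all entries whose names sit under the prefix."""
    wanted = prefix + "."
    return [value for name, value in database.items() if name.startswith(wanted)]


database = {
    "Tosca.DB.SiteData": "CorpData.dat",
    "Tosca.DB.DeskData": "DeskData.dat",
    "Tosca.RemoteSvr.Port": "6511",
}

# Each returned value is the name of a further configuration file to load.
print(additional_databases(database))  # ['CorpData.dat', 'DeskData.dat']
```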

Changes to the configuration of a system can be made ‘on the fly’. However, most modules read their configurations once at start-up. Therefore, it is typical to shut down the system or to restart a particular module after making changes to its configuration files. If a configuration is loaded as a result of a system action such as starting a module, the data is marked as read-only. None of these configurations is ever written by the system; therefore, files saved locally by a user will not overwrite changes to the central configuration files.

Configuration File Syntax

The configuration files consist of a series of entries each containing a ‘resource’. The entry has the general format:

	<configuration name> : <value>

The configuration name consists of a series of words separated by the ‘binding’ characters asterisk (*) and dot (.). The dot character indicates a tight binding, which only matches a query where the two words are immediately adjacent. The asterisk character represents a wildcard, where any number of words may intervene. Applications query the database for a specific configuration by combining a series of words separated by the dot character, i.e. tightly bound. The system returns the value of the entry that best matches the query.

Note that the last element of a configuration name is never a binding character. This last element is the name of the configuration. For instance, Remote Server has a configuration containing the TCP port used to connect to its service. This configuration is given as:

	Tosca.RemoteSvr.Port:6511

The Port is the configuration, in this case being specifically the port for the Thomson Financial component called Remote Server. Thomson Financial reserves all configuration names with the leading word ‘Tosca’ which is used to indicate that the resource is a configuration associated with a standard Thomson Financial component.
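
The entry format and binding characters described in this section can be sketched as a small parser. This is an illustrative reading of the syntax, not the product's parser: parse_entry and split_name are hypothetical names, and pairing the first word with a tight binding is an assumption made so that every word carries a binding.

```python
import re


def parse_entry(line):
    """Split '<configuration name> : <value>' at the first colon."""
    name, sep, value = line.partition(":")
    if not sep or not name.strip():
        raise ValueError("not a configuration entry: " + repr(line))
    return name.strip(), value.strip()


def split_name(name):
    """Break a configuration name into (binding, word) pairs."""
    if not name or name[-1] in ".*":
        # The last element is the configuration itself, never a binding.
        raise ValueError("a configuration name must end with a word")
    pairs = re.findall(r"([.*]?)([^.*]+)", name)
    # Assumption: the first word has no leading binding, so treat it as tight.
    return [(binding or ".", word) for binding, word in pairs]


name, value = parse_entry("Tosca.RemoteSvr.Port:6511")
print(name, value)
print(split_name(name))
print(split_name("MyComps*Timeout"))
```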

Single and Multiple Instances of Components

In some cases, components may be loaded more than once. This may be to provide additional resilience or, in a more complex scenario, to provide additional connectivity. Where this is the case, the module is coded to use its instance name (the name of the executable being loaded) as a configuration name. For instance, the Publishing Client may be loaded more than once to connect to more than one Publishing Server. In this case, the client encodes its name as the configuration name so that each instance of the client can be separately configured to attach to different servers. By default, the name of the configuration is the standard installed name of the executable.

Other components, typically servers, restrict their use to a single instance on any given host. Typically, this is the case where there is no actual benefit in running more than one instance.

Configuration Queries

A configuration is queried either as a simple resource value or by enumerating a value. Most components prefix the configuration name with something that uniquely identifies the component. For instance, all system configurations are prefixed with the word ‘Tosca’ to indicate that this is a configuration defined by Thomson Financial itself.

Simple queries typically take the full name of a known configuration and return its value. For instance, the query for ‘Tosca.Fields.LinkNames’ returns a comma-separated list of field names that provide linkage information.

The system also allows the user to enumerate entries from the database to one or more levels. Enumerating one level returns only the keys immediately below a given key, whereas enumerating all levels returns every sub-key beneath the starting key.
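
The difference between the two kinds of enumeration can be sketched as follows. The sketch assumes a database held as a plain dictionary whose names use only tight (dot) bindings; the function name enumerate_keys and its all_levels flag are hypothetical and are not part of the product API.

```python
def enumerate_keys(database, start, all_levels=False):
    """List the keys below 'start': direct children only, or every sub-key."""
    prefix = start + "."
    found = []
    for name in database:
        if not name.startswith(prefix):
            continue
        if all_levels:
            found.append(name)
        else:
            # Keep only the first word after the starting key.
            child = prefix + name[len(prefix):].split(".", 1)[0]
            if child not in found:
                found.append(child)
    return found


database = {
    "Tosca.Fields.LinkNames": "...",
    "Tosca.DB.SiteData": "CorpData.dat",
    "Tosca.DB.DeskData": "DeskData.dat",
    "Tosca.RemoteSvr.Port": "6511",
}

print(enumerate_keys(database, "Tosca"))
print(enumerate_keys(database, "Tosca.DB", all_levels=True))
```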

Using Wildcards

In some cases, the database may contain wildcards which can also be ‘queried’. For instance, suppose that a component supported a timeout value called Timeout based on an exchange identifier. The component would query the database for the configuration MyComps.<exchange>.Timeout. The database might contain the following entries:

	MyComps.L.Timeout: 30
	MyComps.N.Timeout: 45
	MyComps.F.Timeout: 20
	MyComps*Timeout: 15

A query for London (L) would produce 30, New York (N) 45, Frankfurt (F) 20 and any other exchange 15. Keys are traversed left to right, and a tighter binding match (.) takes precedence over a looser binding match (*). In this way, configuration files support user-defined defaults. The component would probably also provide its own default if no configuration was found.
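
The matching rule in this example can be sketched in Python. This is one possible reading of the precedence rule, not the product's algorithm: each query word is scored 2 when matched by a tightly bound word, 1 when matched by a loosely bound word and 0 when absorbed by a wildcard, and scores are compared left to right. The scoring scheme and the names split_name, match_score and lookup are assumptions made for the example.

```python
import re


def split_name(name):
    """Break a configuration name into (binding, word) pairs."""
    pairs = re.findall(r"([.*]?)([^.*]+)", name)
    return [(binding or ".", word) for binding, word in pairs]


def match_score(tokens, query):
    """Return a per-word score tuple if the entry matches, otherwise None."""
    if not tokens:
        return () if not query else None
    if not query:
        return None
    (binding, word), rest = tokens[0], tokens[1:]
    best = None
    if query[0] == word:
        tail = match_score(rest, query[1:])
        if tail is not None:
            best = ((2 if binding == "." else 1),) + tail
    if binding == "*":
        # Let the wildcard absorb one query word, then try again.
        tail = match_score(tokens, query[1:])
        if tail is not None and (best is None or (0,) + tail > best):
            best = (0,) + tail
    return best


def lookup(database, query_name, default=None):
    """Return the value of the best-matching entry, or the caller's default."""
    query = query_name.split(".")
    best_score, best_value = None, default
    for name, value in database.items():
        score = match_score(split_name(name), query)
        if score is not None and (best_score is None or score > best_score):
            best_score, best_value = score, value
    return best_value


database = {
    "MyComps.L.Timeout": "30",
    "MyComps.N.Timeout": "45",
    "MyComps.F.Timeout": "20",
    "MyComps*Timeout": "15",
}

print(lookup(database, "MyComps.L.Timeout"))      # 30
print(lookup(database, "MyComps.T.Timeout"))      # 15
print(lookup(database, "Other.T.Timeout", "60"))  # 60
```

The last call shows the component supplying its own default (the value 60 is hypothetical) when no entry matches.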

Configuration API

The TOSC API to the Real-Time Cache exposes methods that allow programmers direct access to the configuration system. These appear in the Visual Basic environment as global functions such as RTGetConfiguration(). In C++ they are members of the standard CRTModule class.

The API allows applications to load configuration files into the database, to query the database and to enumerate part or all of the database. The API also allows applications to write information to the database; however, wildcards are not supported when writing. Optionally, values that are written into the database at run-time may be stored by the configuration system at shutdown in a user database file and automatically restored when the system is restarted. The user database file name is constructed by the system from the first entry in the CONFIG_PATH combined with the name <username>.dat. Storing written configurations on a per-user basis ensures that the actions of one user cannot affect those of another.
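
The construction of the user database file name can be sketched as follows. The sketch assumes that CONFIG_PATH holds a semicolon-separated list of Windows directories, which the text above does not state; the function name user_database_path, the example directories and the user name are hypothetical.

```python
import ntpath  # Windows path rules, available on every platform


def user_database_path(config_path, username, separator=";"):
    """Combine the first CONFIG_PATH entry with '<username>.dat'."""
    # Assumption: entries are separated by ';' as in a Windows search path.
    first_entry = config_path.split(separator)[0].strip()
    if not first_entry:
        raise ValueError("CONFIG_PATH has no first entry")
    return ntpath.join(first_entry, username + ".dat")


# Hypothetical directories and user name, for illustration only.
print(user_database_path(r"C:\Dataworks\Config;D:\Site\Config", "alice"))
```

Since each user's written values go to a file named after that user, two users sharing the same first CONFIG_PATH entry still write to separate files.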

Applications should not attempt to overwrite configuration information provided for another component in the system since this may result in unpredictable behaviour due to the inconsistencies that can arise.

The VB sample program ‘config’ can be used to illustrate the API and to test the results of changing configurations.