
SOAP and UNO

OpenOffice

  1. What is SOAP?
  2. SOAP and object models
  3. Web services
  4. Server-side infrastructure
  5. Programming considerations
  6. UNO and SOAP

1 What is SOAP?

SOAP (Simple Object Access Protocol) is a communication protocol that specifies how structured and typed data are exchanged in a decentralized, distributed environment. The specification consists of three parts: how messages have to look, how data types are encoded, and how remote procedure calls are represented. It provides bindings for the protocols HTTP and SMTP. SOAP can be used with other underlying protocols, but these two are the most useful because they allow SOAP data to be routed through standard internet ports. This is a decisive point for network administrators whose firewalls only let HTTP requests or e-mail pass through. SOAP uses XML to describe data, which brings some inherent benefits:

  • SOAP calls are human readable. That removes the need for translation tools during development and thus makes development easier.
    Calls between peers can be filtered with commonplace languages and tools, such as Perl.
  • SOAP calls do not depend on programming languages or object models such as DCOM or CORBA. Clients can therefore use web services that speak SOAP as their network protocol even when those services reside on different platforms, that is, in a heterogeneous environment. Clients need only support standard protocols such as HTTP in order to exploit the huge number of web services the internet offers.

Because SOAP builds on standards such as XML, HTTP, SMTP, and TCP/IP, and is about to become a standard itself, it is widely expected to find broad acceptance within the IT industry. Remote procedure calls are now possible in a language- and platform-independent way, which greatly facilitates the deployment of distributed applications. For example, database requests can be performed by clients that do not speak any proprietary database protocol and do not have a database client installed. This opens new alternatives for booking and e-business applications, which deal to a large extent with database requests.

As an example, consider a simple booking system. A travel company runs a web service that is accessed through SOAP over HTTP. The web service offers two functions: one queries for a vacant hotel room on a given date, the other books a room. To make booking easy for customers, the company sets up two sales channels: the internet and travel agencies. Travel agencies usually use special booking software that connects to external databases. Because the booking software is in permanent use, one would create a small client for it that calls the functions of the web service through the internet and thus completely avoids the hassle of dealing with an external database. The customer at home would be offered a booking solution through the web site of the travel company: a small applet that lets the customer query for vacant rooms and finally book them.

In both scenarios, it is completely irrelevant to the client what the infrastructure of the server hosting the service looks like. The client application and the applet take the textual input and form SOAP messages out of it, which is a purely textual operation. The message is sent over the internet, so no further means of transportation is necessary. The response message from the web service contains the return values of the called function embedded in XML. The client, whether application or applet, cuts the return value out of the message and presents it to the user. No further conversion is needed because the value is already plain text.
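
The textual nature of the booking example can be illustrated with a minimal Python sketch. It only builds a request message and cuts the return value out of a canned response; nothing is sent over the network. The function name queryVacantRoom, the namespace urn:example:hotel-booking, the parameter name date, and the layout of the sample response are assumptions made for illustration and do not come from any real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:hotel-booking"                # hypothetical service namespace


def build_request(function, params):
    """Wrap a function name and its textual parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{function}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def extract_return_value(response_xml):
    """Cut the first return value out of a SOAP response message."""
    root = ET.fromstring(response_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    response = body[0]       # the element that represents the function result
    return response[0].text  # its first child carries the value as plain text


# A canned response, standing in for what a (hypothetical) service might send back.
SAMPLE_RESPONSE = (
    f'<soap:Envelope xmlns:soap="{SOAP_ENV}">'
    f'<soap:Body><m:queryVacantRoomResponse xmlns:m="{SERVICE_NS}">'
    "<return>Room 12</return>"
    "</m:queryVacantRoomResponse></soap:Body></soap:Envelope>"
)

request_text = build_request("queryVacantRoom", {"date": "2001-07-14"})
vacant_room = extract_return_value(SAMPLE_RESPONSE)
```

Both directions are plain string and XML handling, which is the point the example makes: the client needs no knowledge of the server's infrastructure.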

The main points are that SOAP

  • is a simple communication protocol that enables RPC
  • uses HTTP and SMTP
  • uses XML
  • does not depend on programming languages or object models
  • enables communication in heterogeneous environments


2 SOAP and object models

The main purpose of object models is to give easy, transparent access to components, without the programmer having to worry about lifetime control, inter-object communication over the network, or where the components are located. The last point is very important: for example, a DCOM programmer does not need to know whether a component is local or actually resides on a different machine. The creation process is exactly the same. The object model ensures that remote objects appear as local objects to the programmer. Some synchronization issues are also handled automatically by the object model itself.

SOAP does not support object models such as DCOM or CORBA. This means that one cannot use such convenient techniques by simply speaking SOAP. Those features could still be implemented by building an extra layer on top of SOAP. This would be similar to DCOM, which is based on RPC (Remote Procedure Calls); but whereas RPC supports bidirectional calls, SOAP does not. This does not mean that inter-object communication is impossible, but since it is not defined, clients cannot receive calls from a web service in a standardized way.

SOAP does not support any advanced object model techniques. Instead it follows the rather simple approach of remote procedure calls. Building higher-level functionality on top of SOAP would enable an object model, but clients would then become dependent on model-specific implementations such as additional libraries or a runtime. This would defeat SOAP's aim of acting as a communication bridge between different platforms.

In summary, SOAP

  • does not support object models (lifetime, object identity, inter-object communication, object creation, thread identity)
  • can be used by object models

3 Web services

SOAP, as a simple communication protocol, can be deployed in a variety of scenarios. One of them is access to web services. One can think of a web service as a unit of application logic that provides services and data by exposing a set of functions. What makes web services stand out is that they are accessible through the internet and do not impose much logic on the clients. In fact, a client just needs to be able to connect to the internet and handle HTTP requests and responses.

The SOAP specification stipulates that a Uniform Resource Identifier (URI) identifies a web service. That is, to access a web service one sends an HTTP POST request that carries the service's URI as its "address".
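
As a rough sketch of such a request, the following Python fragment prepares (but does not send) an HTTP POST addressed to a service URI, using the standard urllib.request module. The URI http://example.org/soap/hotel and the abbreviated message body are placeholders. The SOAPAction header is part of the SOAP 1.1 HTTP binding; its value here is left empty for illustration.

```python
import urllib.request


def make_soap_post(service_uri, soap_message, soap_action=""):
    """Prepare an HTTP POST request that carries a SOAP message to a service URI."""
    return urllib.request.Request(
        service_uri,                          # the URI identifies the web service
        data=soap_message.encode("utf-8"),    # the SOAP envelope is the request body
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{soap_action}"',  # quoted URI, or "" if none is given
        },
        method="POST",
    )


# Placeholder URI and a deliberately abbreviated envelope.
request = make_soap_post(
    "http://example.org/soap/hotel",
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body/></soap:Envelope>",
)
```

Passing the prepared request to urllib.request.urlopen would actually send it; that step is omitted because the address above does not point to a real service.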

4 Server-side infrastructure

The SOAP specification does not say what the infrastructure on the server hosting the web services has to look like. In other words, it leaves open how an HTTP request that addresses a web service is processed by the receiving host.

Because of the dominance of the internet, an infrastructure that handles HTTP messages already exists. On the server side, that infrastructure consists of web server applications. Current implementations of SOAP and web services make use of that: they do not deal with HTTP directly but rely on available web server solutions.

How a web server processes HTTP requests that address web services depends on the extension mechanism the web server provides. For example, the SOAP implementation for the Tomcat server uses servlets, while Microsoft's Internet Information Server uses ISAPI (extensions are DLLs). Those infrastructures are obviously incompatible with each other. The difference is nevertheless completely hidden from the client; it only sees the URI.

The main aspect here is that no matter what the server-side infrastructure looks like, the client always accesses the web services in the same way.

5 Programming considerations

Although SOAP promises to be simple, this does not imply that accessing web services is simple in general. One point that is constantly emphasized is that SOAP messages are human readable. But this does not mean that a user simply writes down a request and sends it off. In fact, an appropriate infrastructure is required on the client's side. This could be as simple as a program that accepts text files and dispatches them as HTTP requests. That approach would still be inadequate where programs want to use web services. A programmer typically expects an interface description to program against, such as header files in C++. For that to work one needs a proxy mechanism. The proxy lets programmers make calls in the language they are using rather than generate SOAP messages themselves.
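
The proxy idea can be sketched in a few lines of Python. The class below turns an ordinary method call into a SOAP message and hands it to a transport function. The class name SoapProxy, the transport callable, and the service namespace are all hypothetical; a real proxy would typically be generated from a service description and would also decode the response.

```python
from xml.sax.saxutils import escape


class SoapProxy:
    """Hypothetical proxy: method calls are translated into SOAP messages."""

    def __init__(self, service_ns, transport):
        self._service_ns = service_ns
        self._transport = transport  # callable that takes the message text

    def __getattr__(self, function):
        # Only reached for names that are not ordinary attributes.
        if function.startswith("_"):
            raise AttributeError(function)

        def call(**params):
            args = "".join(
                f"<{name}>{escape(str(value))}</{name}>"
                for name, value in params.items()
            )
            message = (
                '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
                f'<soap:Body><m:{function} xmlns:m="{self._service_ns}">'
                f"{args}</m:{function}></soap:Body></soap:Envelope>"
            )
            return self._transport(message)

        return call


# A stand-in transport that records the message instead of sending it.
sent_messages = []


def fake_transport(message):
    sent_messages.append(message)
    return "ok"


proxy = SoapProxy("urn:example:hotel-booking", fake_transport)
result = proxy.bookRoom(date="2001-07-14", guest="A & B")
```

The calling code never touches XML: it writes proxy.bookRoom(...) as it would for a local object, and the message generation stays inside the proxy.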

SOAP does not specify how services have to be described. However, service descriptions are valuable for giving clients a picture of a web service's functionality, and they can be used to enable proxy mechanisms. The latter is important for reducing client application development time.

WSDL (Web Services Description Language, see http://www.w3.org/TR/wsdl) is a means of describing services, and since it is widely supported it appears to be evolving into the prevailing description language.
A WSDL service description can be used to generate proxies or to resolve calls to a proxy at run time.

Applications using SOAP inherently perform worse than those using protocols based on binary data. Since requests are represented in XML, the data packet for each call is rather big. Moreover, the server has to parse every message it receives. When parameters consist of binary data, each byte has to be converted into a readable format (hex or base64). To reduce the performance impact, one should design interfaces that do not require frequent calls, and one should avoid transferring big chunks of binary data. That means a client should be able to get a maximum of information with one call (but no binary data). Iteration or enumeration interfaces are therefore not useful, since they need one call per element.
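
The cost of making binary data readable can be checked with a small Python sketch that compares the size of a payload with its hex and base64 text forms. The 1024-byte payload is an arbitrary example.

```python
import base64
import binascii


def encoded_sizes(payload):
    """Return the size of a binary payload and of its two textual encodings."""
    return {
        "raw": len(payload),
        "hex": len(binascii.hexlify(payload)),     # two characters per byte
        "base64": len(base64.b64encode(payload)),  # four characters per three bytes
    }


sizes = encoded_sizes(bytes(1024))  # 1024 zero bytes as an arbitrary payload
```

Hex doubles the size and base64 adds roughly a third, before the surrounding XML markup is even counted.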

Interfaces for web services should be kept simple so that they do not impose additional logic on the client.
The lifetime of a web service cannot be controlled by the client. Instead the web service must specify how its lifetime is managed. For example, the SOAP implementation for the Tomcat web server expects services to provide a deployment descriptor, which is an XML file that contains information about the service. In that descriptor the service determines how its lifetime is to be handled.

The following notes summarize the major points of this section:

  • Clients need a minimum infrastructure
  • Language integration and proxies for easier development
  • WSDL service descriptions to support proxies
  • Performance (avoid frequent calls and binary data)
  • Client has no control over the lifetime of the web service and cannot hold references to it

6 UNO and SOAP

SOAP could certainly be used as a communication protocol for object models. Although it was not designed for that, the extensibility of SOAP, along with a runtime implementation, could be the means for a UNO-SOAP bridge. The question is whether this is desirable. SOAP is definitely not the protocol of choice when it comes to network performance, and as one can infer from its minimal approach (no object model support), it would be a major task to build UNO on top of it.
A simple mapping from UNO interfaces to a web service interface would not always work, because performance issues could call for a complete redesign of the interfaces.
UNO components control their lifetime by a reference count. This is inappropriate for web services because the object's lifetime is not specified by SOAP. Enabling reference counting regardless would force this non-standardized mechanism on the client. Instead, UNO components must define their lifetimes themselves and rely on the server-side infrastructure to control them. How long a web service remains active could depend on factors such as the duration of an HTTP connection, the lifetime of a web application, or a specific servlet session.

Web services guarantee neither object identity nor thread identity, and SOAP does not specify how this information would be transferred. To satisfy the demands of UNO one could create a proxy object for a web service which then could ensure object and thread identity.

It appears that current UNO components tend to be unfit for use as web services, mainly because UNO interfaces have not been designed for web services. It would be possible to implement UNO services anew so that they support special web service interfaces. That still leaves the question of how to make those services available to clients. For that, one has to decide what infrastructure is to be used (Tomcat, servlets, etc.).

The main conclusions are

  • SOAP is not useful as a communication protocol between UNO components
  • UNO interfaces need to be redesigned

Using Apache SOAP and the Tomcat web server, one could easily access UNO web services (a UNO service with one special interface). The writer of the web service would have to provide the web service itself (e.g. in C++), a dummy Java web service, and a service description in a descriptor.xml file. The descriptor file is used for registering the service with the SOAP implementation and carries information about who is responsible for the type conversion.

Whenever a client calls a UNO web service, the web server receives an HTTP POST request. The target of the request is a special servlet from the SOAP implementation. The servlet recognizes that the HTTP request contains a SOAP message and determines which function of which service is called. It instantiates the Java web service if necessary and calls the function. The servlet converts the SOAP data types to Java types, so that the actual Java web service does not have to deal with SOAP at all. In fact, the service is completely unaware that the function call originated from a SOAP message.
The Java web service can use a C++ language binding to forward the call to the C++ web service. Alternatively, one could put all the logic into the Java web service. That would be a reasonable approach because in many cases UNO services must be redesigned anyway to meet the requirements of being a web service.
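
The dispatching step described above can be sketched as follows. This is a simplified Python illustration of the idea, not the Apache SOAP servlet: the registry layout, the HotelService class, the bookRoom function, and the per-parameter type table are all hypothetical stand-ins for what a deployment descriptor and the implementation's type mapping would provide.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


class HotelService:
    """Hypothetical service; it never sees SOAP, only native values."""

    def __init__(self):
        self.bookings = []

    def bookRoom(self, room, nights):
        self.bookings.append((room, nights))
        return f"room {room} booked for {nights} nights"


# Hypothetical registry: namespace -> (service class, parameter types per function).
REGISTRY = {
    "urn:example:hotel-booking": (
        HotelService,
        {"bookRoom": {"room": int, "nights": int}},
    ),
}


def dispatch(soap_message, registry, instances):
    """Find the addressed service and function, convert the arguments, and call."""
    root = ET.fromstring(soap_message)
    call = root.find(f"{{{SOAP_ENV}}}Body")[0]
    # An element tag has the form "{namespace}function".
    namespace, _, function = call.tag[1:].partition("}")
    service_class, signatures = registry[namespace]
    if namespace not in instances:          # instantiate the service if necessary
        instances[namespace] = service_class()
    converters = signatures[function]       # unknown functions raise KeyError
    args = {child.tag: converters[child.tag](child.text) for child in call}
    return getattr(instances[namespace], function)(**args)


SAMPLE_CALL = (
    f'<soap:Envelope xmlns:soap="{SOAP_ENV}"><soap:Body>'
    '<m:bookRoom xmlns:m="urn:example:hotel-booking">'
    "<room>12</room><nights>3</nights>"
    "</m:bookRoom></soap:Body></soap:Envelope>"
)

instances = {}
reply = dispatch(SAMPLE_CALL, REGISTRY, instances)
```

Because the dispatcher converts the textual arguments before the call, the service method receives ordinary integers and has no way of telling that the call arrived as a SOAP message.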

The situation looks different when web services are used from UNO. Because no UNO client wants to deal with network APIs and SOAP directly, we need a mechanism that makes the web service appear to the UNO client as an object whose functions can be called in the ordinary way. That is, a client written in C++ expects an object with a C++ interface. At the programming level, the programmer would include a header file in order to use the web service functions.
UNO uses interfaces, so it seems reasonable to represent a web service as an interface. The type description could either be written by hand or, if a WSDL description is available, generated from it.

The object that the client uses instead of the web service itself is called the "proxy" in this document.
To create a proxy one would employ a UNO factory service. The factory service must be capable of finding the web service based on the web service name, which is provided by the user of the factory service. Currently, there are different ways of discovering web services; for example, .NET uses DISCO (Discovery) and IBM uses UDDI (Universal Description, Discovery and Integration). The latter is a promising standard because it is supported by many companies.

The procedure for calling a function of a web service from a UNO client would be like this:

  1. get the factory
  2. create the proxy object
  3. call the proxy object
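
The three steps can be sketched in plain Python. None of the names below belong to the UNO API: ProxyFactory, WebServiceProxy, the name-to-URI directory (a stand-in for a discovery mechanism), and the transport callable are hypothetical. Returning the same proxy for the same service name is one possible way for the client side to provide object identity.

```python
class WebServiceProxy:
    """Hypothetical client-side stand-in for a web service."""

    def __init__(self, uri, transport):
        self.uri = uri
        self._transport = transport

    def call(self, function, **params):
        # A real proxy would build a SOAP message here and decode the reply.
        return self._transport(self.uri, function, params)


class ProxyFactory:
    """Hypothetical factory that finds a service by name and creates its proxy."""

    def __init__(self, directory, transport):
        self._directory = directory  # service name -> URI (stand-in for discovery)
        self._transport = transport
        self._proxies = {}

    def create_proxy(self, service_name):
        # One proxy per service name, so repeated lookups yield the same object.
        if service_name not in self._proxies:
            uri = self._directory[service_name]
            self._proxies[service_name] = WebServiceProxy(uri, self._transport)
        return self._proxies[service_name]


# A stand-in transport that records calls instead of using the network.
calls = []


def fake_transport(uri, function, params):
    calls.append((uri, function, params))
    return "confirmed"


# 1. get the factory
factory = ProxyFactory({"HotelBooking": "http://example.org/soap/hotel"}, fake_transport)
# 2. create the proxy object
proxy = factory.create_proxy("HotelBooking")
# 3. call the proxy object
answer = proxy.call("bookRoom", room=12)
```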

The proxy object is actually more than an ordinary proxy in that it, rather than the web service, guarantees object identity. The proxy also provides thread identity, although this is not strictly necessary because a web service does not call back to the client.

Currently there are different implementations of SOAP and different frameworks for web services, and the differences could cause some trouble. For example, when Tomcat is used, all web services are addressed by the same URI. That URI points to a servlet that eventually delegates the call to the appropriate service. This actually contradicts the SOAP specification, which states that the URI should identify the service.

The picture below shows how a client calls a web service provided by the web top server (left side) and how a UNO client uses a web service (right side).


Author: Joachim Lingner ($Date: 2003/01/23 10:59:15 $)
Copyright 2001 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, CA 94303 USA.