Data Aggregation Support

Querying and Aggregation Across Disparate Data Sources

The Data Integration Suite provides data aggregation across enterprise data sources via standards-based interfaces. It is optimized for performance and scalability, offers cross-platform flexibility, and can plug into virtually any infrastructure. Data sources include relational databases (Oracle, DB2, SQL Server, Sybase, and MySQL), XML documents, EDI, flat files, and other legacy formats.

 

Report generation is made much easier by the Data Integration Suite. XQuery is well suited to XML reporting, and the resulting XML is easily published to many different formats.

This makes Data Integration Suite a good fit for database publishing applications: XQuery can create many different logical views of the data, and those views can be used in many different environments by publishing them as HTML, PDF, PostScript, or other formats. The following diagram illustrates this process.

 

XML Reporting

There are many useful ways to look at your data — for instance, if you have a collection of invoices, you may want to show the orders for a given customer, the orders for a given product, or the totals for a given region or sales agent.

Just as a relational report writer can be used to create different logical views of relational data, Data Integration Suite can be used as an XML report writer, creating different logical views of XML and relational data in an XML format. XML reporting makes your data more useful by letting you see it from different perspectives. And if you are exchanging data as XML, XML reporting lets you more easily create XML documents with the required structure.

XML Publishing

XML is easily formatted and converted for human readers. Many applications can use XQuery as a report writing language, then publish the report using standard XML publishing tools. Data Integration Suite extends XML publishing with XML report writing because it can access any non-XML file as XML, so that data can also use the report generation features inherent in XQuery.

"The XQuery language has become an integral technology for connecting heterogeneous data sources within service-oriented solutions. It's a natural fit for the typical XML data interchange format used in most SOA implementations and it provides powerful, cross-repository data access features without compromising scalability or performance." (Excerpt from "SOA and the Importance of XQuery" by Dr. Carlo Innocenti)

SOA Data Management

The following "Top Tips for Simplifying Data Management in SOA" by Dr. Carlo Innocenti explain how the XQuery feature set can be leveraged in support of SOA:

  1. XQuery and XML can be used effectively to provide an abstraction layer between the multitude of different data sources IT organizations need to deal with and the variety of client applications that need to be built.
  2. While Data Integration Suite allows application developers to ignore the physical details of the data sources being accessed, scalability and performance are preserved.
  3. Data Integration Suite is able to provide fast and scalable access to relational data stores, very large XML documents, Web services and EDI or flat files thanks to its ability to deal with each data source in a dedicated and highly optimized manner.
  4. As most SOA implementations rely heavily on XML (just think about Web services based on WSDL/SOAP or REST interfaces), XQuery is a natural language to access and provide SOA endpoints.
  5. Data Integration Suite provides the ability out of the box to consume Web services and to expose XQueries developed by users as WSDL/SOAP or REST based Web services without any additional coding and in the context of a variety of application servers.

Additional Resources

Web Publishing is made much easier by Data Integration Suite. Many web sites are created from database data, often using PHP, ASP or JSP. XQuery can be used to create dynamic HTML pages directly, to create XML to be transformed by XSLT to create HTML, or to create HTML to be consumed by AJAX applications.

XQuery on the Middle Tier

Most web sites need to protect critical resources such as databases from outside users, so they ensure that web clients cannot directly access the code that runs in the web server or application server. For instance, Java servlets can be installed in a directory that is not accessible to the outside world and then invoked only through a URL. The same strategy can be used with XQuery. Suppose we want the user to be able to call a query named "portfolio", specifying the name of the user whose portfolio should be returned:

http://tagsalad.org/TomcatXQJExecute?query=portfolio&user=jonathan

On the middle tier, we need some Java code to call the appropriate query and bind the parameters to external variables. This code uses the XQJ API (the XQuery API for Java), which plays the same role for XQuery that JDBC plays for SQL. The code also sets up the connections for the query. The query itself must be installed on the middle tier, where it is protected from the outside world.

// Set up the data source, connection, and expression objects
DDXQDataSource dataSource = new DDXQDataSource(new FileInputStream(configFile));
XQConnection connection = dataSource.getConnection();
XQExpression xqExpression = connection.createExpression();

// Bind URL parameters to XQuery external variables
Iterator i = parameters.entrySet().iterator();
while (i.hasNext()) {
  Map.Entry entry = (Map.Entry) i.next();
  xqExpression.bindString(new QName((String) entry.getKey()), (String) entry.getValue());
}

// Execute the query file named by the "query" URL parameter
String xquerySourceFile = request.getParameter("query") + ".xquery";
XQSequence xqSequence = xqExpression.executeQuery(new FileReader(xquerySourceFile));
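
Because the query file name above is built directly from a request parameter, a middle tier would normally check that name before touching the file system. The following is a minimal sketch of such a check and is not part of the product: the class name QueryNameResolver, the allowed-character rule, and the "queries" directory are all assumptions made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper: maps the "query" URL parameter to a file in one directory.
class QueryNameResolver {
    // Assumed policy: letters, digits, underscore, and hyphen only (no dots or slashes).
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z0-9_-]+");

    private final Path queryDir;

    QueryNameResolver(Path queryDir) {
        this.queryDir = queryDir.toAbsolutePath().normalize();
    }

    // Returns the path of <name>.xquery, or throws if the name is not acceptable.
    Path resolve(String name) {
        if (name == null || !SAFE_NAME.matcher(name).matches()) {
            throw new IllegalArgumentException("Invalid query name: " + name);
        }
        Path file = queryDir.resolve(name + ".xquery").normalize();
        // Defensive check: the resolved file must still be inside the query directory.
        if (!file.startsWith(queryDir)) {
            throw new IllegalArgumentException("Query outside query directory: " + name);
        }
        return file;
    }

    public static void main(String[] args) {
        QueryNameResolver resolver = new QueryNameResolver(Paths.get("queries"));
        System.out.println(resolver.resolve("portfolio").getFileName());
        try {
            resolver.resolve("../secret");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a helper like this, the servlet would pass the resolved path to the FileReader instead of concatenating the raw parameter value.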

XQueries that return HTML

In some applications, XQuery is used to return HTML. These applications use XQuery much like PHP, ASP or JSP, to generate HTML with dynamic content. Here is an XQuery that returns a portfolio in HTML.

declare variable $user external;

<html>
   <head>
       <title>Portfolio for {$user}</title>
   </head>
   <body>
       <h1>Portfolio for {$user}</h1>
       <table>
         <thead>
            <tr>
              <td>Ticker</td>
              <td>Shares</td>
            </tr>
         </thead>
         <tbody>
           {
             for $st in collection('stock.dbo.HOLDINGS')/HOLDINGS
             where xs:string($st/USERID) = xs:string($user)
             return
               <tr>
                 <td>{ data($st/STOCKTICKER) }</td>
                 <td>{ data($st/SHARES) }</td>
               </tr>
           }
         </tbody>
       </table>
   </body>
</html>

This query can be invoked using a URL as discussed above.

XQueries that return XML for AJAX Applications or XSLT Transformation

Many applications create dynamic content for web pages by generating XML, then translating the XML to HTML using XSLT. Other applications use XML as input to JavaScript or Java on the client side, especially in AJAX applications. For instance, an application might use the following XML to represent a portfolio:

<portfolio>
   <user>Jonathan</user>
   <stock>
       <ticker>AMZN</ticker>
       <shares>3000.00</shares>
   </stock>
   <stock>
       <ticker>PRGS</ticker>
       <shares>23.00</shares>
   </stock>
</portfolio>

This XML can be generated by the following XQuery, which builds it from relational data.

declare variable $user as xs:string external;

<portfolio>
  <user>{ $user }</user>
  {
    for $h in collection('stock.ts.HOLDINGS')/HOLDINGS
    where $h/USERID eq $user
    return
      <stock>
        <ticker>{ xs:string($h/STOCKTICKER) }</ticker>
        <shares>{ xs:string($h/SHARES) }</shares>
      </stock>
  }
</portfolio>

This query can be invoked using a URL as discussed above. In summary, XQuery can be used like PHP, JSP or ASP to publish data to the web.
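
To make the XSLT step mentioned above concrete, here is a minimal sketch that turns the portfolio XML into an HTML table using the XSLT processor that ships with the JDK. The stylesheet and the class name PortfolioToHtml are assumptions for illustration, and a real application would load the XML from the query result rather than from a string.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PortfolioToHtml {
    // Illustrative stylesheet (an assumption): one table row per <stock> element.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/portfolio'>"
      + "<table>"
      + "<xsl:for-each select='stock'>"
      + "<tr><td><xsl:value-of select='ticker'/></td>"
      + "<td><xsl:value-of select='shares'/></td></tr>"
      + "</xsl:for-each>"
      + "</table>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given portfolio document and returns the markup.
    static String transform(String portfolioXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(portfolioXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<portfolio><user>Jonathan</user>"
            + "<stock><ticker>AMZN</ticker><shares>3000.00</shares></stock>"
            + "<stock><ticker>PRGS</ticker><shares>23.00</shares></stock>"
            + "</portfolio>";
        System.out.println(transform(xml));
    }
}
```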

Data Integration Suite XQueryWebService Framework

XQueryWebService is a framework that allows you to expose an XQuery as a Web service. Implemented as a Java servlet and tested on numerous Java servlet containers, including Apache Tomcat, JBoss, IBM WebSphere, and BEA WebLogic, the XQueryWebService framework simplifies the design and implementation of Web service applications.

Each XQuery exposed as a Web service provides a single operation; this operation is expressed in the query body through a function that takes the name of the XQuery file without the extension. For example, the file emp.xquery provides the emp operation. Parameters (external variables) expressed in the XQuery, if any, are reflected in the operation’s prototype.
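
The naming rule can be sketched in a few lines of Java. This only illustrates the convention described above; the class OperationNames is hypothetical and is not the framework's own code.

```java
// Hypothetical helper illustrating the naming rule; not part of the framework.
class OperationNames {
    private static final String SUFFIX = ".xquery";

    // Derives the operation name from an XQuery file name by dropping the extension.
    static String fromFileName(String fileName) {
        if (!fileName.endsWith(SUFFIX) || fileName.length() == SUFFIX.length()) {
            throw new IllegalArgumentException("Not an XQuery file name: " + fileName);
        }
        return fileName.substring(0, fileName.length() - SUFFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(fromFileName("emp.xquery")); // prints "emp"
    }
}
```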


XQueryWebService Framework Architecture Overview

Let's take a look at the XQueryWebService framework architecture before getting into more of the details.

A high-level illustration of the XQueryWebService framework architecture shows all the pieces at work to expose an XQuery as a Web service:

 

 

To start, an HTTP request is submitted to a Web server — a Tomcat Web Server in this case. The URL used to invoke the Web service takes the following form:

http://examples.xquery.com/employee-lookup/emp.xquery?id=A-C71970F

Where:

  • http://examples.xquery.com/employee-lookup/emp.xquery is the location of the XQuery Web service. The Web service was created by saving an XQuery to the /employee-lookup folder where the Tomcat Web Server is running.
  • id=A-C71970F is a parameter passed to the XQuery. The variable that takes this parameter is defined in emp.xquery:

declare variable $id as xs:string external;
<root>{
   for $employee in collection("employee")/employee
   where $employee/emp_id = $id
   return $employee
}</root>

The query body is a single FLWOR (for, let, where, order by, return) expression.

When the XQuery processing is finished, the result is returned in the HTTP response, as shown in the following illustration:

 

 

Most of the work performed by the Web service takes place in the DataDirect XQueryWebService servlet, a close-up of which is shown here:

 

 

The browser (or an application) submits the Web service request using SOAP or HTTP GET for the XQuery stored on the Web server. Next, DataDirect XQuery unpacks the Web service request and binds its parameters, if any, to the XQuery. In our example, the parameter passed with the Web service request is an ID. The XQuery is then executed and its result (an XML document) is returned to the client.
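
From the client side, the HTTP GET form of the request can be sketched with the JDK's java.net.http API (Java 11 or later). This is an illustrative client, not part of the product. The class name EmpGetRequest is an assumption, and the request is only built here, not sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class EmpGetRequest {
    // Builds the service URL with one URL-encoded query parameter.
    static URI buildUri(String serviceUrl, String name, String value) {
        String query = URLEncoder.encode(name, StandardCharsets.UTF_8)
            + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
        return URI.create(serviceUrl + "?" + query);
    }

    // Builds (but does not send) the HTTP GET request for the Web service.
    static HttpRequest buildRequest(String serviceUrl, String name, String value) {
        return HttpRequest.newBuilder(buildUri(serviceUrl, name, value)).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
            "http://examples.xquery.com/employee-lookup/emp.xquery", "id", "A-C71970F");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually call the service, the request could be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), which returns the XML result as a string.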

The Web Services Description Language (WSDL) is a language for describing Web services. If we copy emp.xquery to a folder where our Java servlet container is running (/employee-lookup, for example), we can use the following URL to access a WSDL document that describes the Web service that results from our XQuery:

http://examples.xquery.com/employee-lookup/WSDL

Using this URL, we can take a closer look at how our XQuery is described by the WSDL document. In particular, let's look at these:

Service Element

The service element (only one per WSDL document) is named after the query file name without its extension. It contains two port definitions that are always named SOAPPort and HTTPGETPort: one for SOAP over HTTP and one for HTTP GET.

<wsdl:service name="Service">
   <wsdl:port binding="dd:SOAPBinding" name="SOAPPort">
       <wsdlsoap:address
          location="http://examples.xquery.com/employee-lookup/WSDL"/>
   </wsdl:port>
   <wsdl:port binding="dd:HTTPGETBinding" name="HTTPGETPort">
       <http:address
          location="http://examples.xquery.com/employee-lookup/WSDL"/>
   </wsdl:port>
</wsdl:service>

Notice that the service address or end point is the same for both ports.

For each element wsdl:port under the element wsdl:service there is an attribute called binding=; the attribute value matches the value of attribute name= of one of the binding elements.

HTTPGETBinding

The HTTPGETBinding describes the HTTP verb (in this case it is GET), which operations are exposed, and how the input/output are encoded. The attribute location= in the element wsdl:operation is particularly important — it represents the query function to invoke in our query; in this case emp means the query body.

<wsdl:binding name="HTTPGETBinding" type="dd:HTTPGETPort">
   <http:binding verb="GET"/>
   <wsdl:operation name="emp">
       <http:operation location="/emp"/>
       <wsdl:input>
           <http:urlEncoded/>
       </wsdl:input>
       <wsdl:output>
            <mime:mimeXML part="Body"/>
       </wsdl:output>
   </wsdl:operation>
</wsdl:binding>
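
A client can read these details out of the WSDL with any namespace-aware XML parser. The sketch below uses the JDK's DOM parser on a copy of the binding above. The namespace declarations are added by hand because the excerpt omits them (the URIs are the standard WSDL 1.1 ones), and the class name HttpGetBindingReader is an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class HttpGetBindingReader {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";
    static final String HTTP_NS = "http://schemas.xmlsoap.org/wsdl/http/";

    // The binding from the text, with namespace declarations added for parsing.
    static final String BINDING =
        "<wsdl:binding name='HTTPGETBinding' type='dd:HTTPGETPort'"
      + " xmlns:wsdl='" + WSDL_NS + "' xmlns:http='" + HTTP_NS + "'"
      + " xmlns:mime='http://schemas.xmlsoap.org/wsdl/mime/'>"
      + "<http:binding verb='GET'/>"
      + "<wsdl:operation name='emp'>"
      + "<http:operation location='/emp'/>"
      + "<wsdl:input><http:urlEncoded/></wsdl:input>"
      + "<wsdl:output><mime:mimeXML part='Body'/></wsdl:output>"
      + "</wsdl:operation>"
      + "</wsdl:binding>";

    // Returns { HTTP verb, operation name, operation location } of the first operation.
    static String[] describe(String bindingXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(bindingXml)));
            Element httpBinding =
                (Element) doc.getElementsByTagNameNS(HTTP_NS, "binding").item(0);
            Element operation =
                (Element) doc.getElementsByTagNameNS(WSDL_NS, "operation").item(0);
            Element httpOperation =
                (Element) doc.getElementsByTagNameNS(HTTP_NS, "operation").item(0);
            return new String[] {
                httpBinding.getAttribute("verb"),
                operation.getAttribute("name"),
                httpOperation.getAttribute("location")
            };
        } catch (Exception e) {
            throw new IllegalStateException("Could not read binding", e);
        }
    }

    public static void main(String[] args) {
        String[] info = describe(BINDING);
        System.out.println(info[0] + " " + info[1] + " " + info[2]);
    }
}
```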

SOAPBinding

The SOAPBinding (in the following code sample) describes which encoding style will be used by the service; the value can be either rpc or document (in our case it is always document). The document style is driven entirely by the schema definition associated with the message, so the resulting XML fragment follows the schema directly. The rpc style assumes the creation of a wrapper element that matches the underlying function name to encapsulate the function arguments. The XML on the wire may look the same, but it is conceptually different.

Each wsdlsoap:operation defines the attribute soapAction= that, similar to the attribute location= in http:operation, represents the function name; soapAction= must be encoded as an HTTP header in the Web service request.

The attribute use= in the element wsdlsoap:body can be either literal or encoded. (In the generated WSDL it is always literal, as recommended by the WS-I Basic Profile 1.0, to improve interoperability between different client implementations.) On the wire, the child element of the SOAP body matches the global element that is defined in the XML Schema and declared in the related message part.

The attribute type= in the element binding matches the attribute name= of one of the portType elements. The element portType associates each operation with one message for the input and one for the output.

<wsdl:binding name="SOAPBinding" type="dd:SOAPPort">
   <wsdlsoap:binding transport="http://schemas.xmlsoap.org/soap/http"
                    style="document"/>
   <wsdl:operation name="emp">
       <wsdlsoap:operation soapAction="emp.xquery" style="document"/>
       <wsdl:input>
           <wsdlsoap:body use="literal"/>
       </wsdl:input>
       <wsdl:output>
            <wsdlsoap:body use="literal"/>
       </wsdl:output>
   </wsdl:operation>
</wsdl:binding>

For each query function there is a pair of messages (input and output) for each binding (SOAPPort and HTTPGETPort). Having different messages for each binding allows, for instance, simple types like xs:string or xs:integer to be used for HTTP GET, which can be easily expressed inline as a URL.

<wsdl:portType name="SOAPPort">
   <wsdl:operation name="emp">
       <wsdl:input message="dd:empInputMsg"/>
       <wsdl:output message="dd:OutputMsg"/>
       <wsdl:fault name="nmtoken" message="dd:FaultMsg"/>
   </wsdl:operation>
</wsdl:portType>

<wsdl:portType name="HTTPGETPort">
   <wsdl:operation name="emp">
       <wsdl:input message="dd:empInputMsg"/>
       <wsdl:output message="dd:OutputMsg"/>
       <wsdl:fault name="nmtoken" message="dd:FaultMsg"/>
   </wsdl:operation>
</wsdl:portType>

The element wsdl:message may have multiple sub-elements called wsdl:part; each part references either an XML Schema global type or a global element. The WS-I Basic Profile 1.0 recommends using only one part and a global element. This mirrors validation against an XML Schema, which always starts from a global element: the document root.

<wsdl:message name="empInputMsg">
   <wsdl:part name="parameters" element="dd:emp"/>
</wsdl:message>

<wsdl:message name="OutputMsg">
   <wsdl:part name="Output" element="dd:Output"/>
</wsdl:message>

<wsdl:message name="FaultMsg"/>

Finally, the WSDL contains the types element, where the XML Schema types are defined. For each operation, the XML Schema defines two global elements: one for the input and one for the output.

<wsdl:types>
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
              targetNamespace="http://www.datadirect.com"
              attributeFormDefault="unqualified"
              elementFormDefault="qualified">
       <xs:import schemaLocation="employee.xsd"
                  namespace="http://www.employee.com"/>
       <xs:element name="emp">
           <xs:complexType>
               <xs:all>
                   <xs:element type="xs:string" name="id"/>
               </xs:all>
           </xs:complexType>
       </xs:element>
       <xs:element type="xs:anyType" name="Output"/>
   </xs:schema>
</wsdl:types>
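
Putting the SOAP binding and the schema together, a document/literal request for the emp operation might be built as in the following sketch. It is an illustration under stated assumptions, not the product's client code: the class name EmpSoapRequest, the soapenv prefix, and the SOAP 1.1 envelope namespace are choices made here, while the dd namespace, the element names, and the SOAPAction value come from the WSDL fragments above. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class EmpSoapRequest {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"; // SOAP 1.1
    static final String DD_NS = "http://www.datadirect.com"; // schema targetNamespace

    // Escapes the characters that are not allowed in XML element content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // The body child is the global element "emp"; "id" is qualified because the
    // schema sets elementFormDefault="qualified".
    static String envelope(String id) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\""
             + " xmlns:dd=\"" + DD_NS + "\">"
             + "<soapenv:Body><dd:emp><dd:id>" + escape(id) + "</dd:id></dd:emp>"
             + "</soapenv:Body></soapenv:Envelope>";
    }

    // Builds (but does not send) the SOAP-over-HTTP POST request.
    static HttpRequest buildRequest(String endpoint, String id) {
        return HttpRequest.newBuilder(URI.create(endpoint))
            .header("Content-Type", "text/xml; charset=utf-8")
            .header("SOAPAction", "\"emp.xquery\"") // value from wsdlsoap:operation
            .POST(HttpRequest.BodyPublishers.ofString(envelope(id)))
            .build();
    }

    public static void main(String[] args) {
        // Endpoint taken from the address in the service element.
        HttpRequest request = buildRequest(
            "http://examples.xquery.com/employee-lookup/WSDL", "A-C71970F");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(envelope("A-C71970F"));
    }
}
```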

 
