Friday, October 26, 2012

Flexibility in generating PDF's from BPEL by using IText and XHTML

I needed a quick but flexible way to generate PDF files from BPEL. The project I was on, was using the iText PDF library; http://itextpdf.com/. I encountered this library before in the Oracle SOA Suite 11g Handbook from Lucas Jellema. Here the library was used amongst other things to demonstrate the Spring component. I decided to use this library but took a different approach as to what Lucas describes in his book since I did not want to hardcode the layout of my PDF in Java code.

The IText library has code to convert XHTML to a PDF. XHTML can be manipulated like any other XML in BPEL by using a transformation. This way I could put the logic for the layout in an XSL file and make layouting easy because there are few programmers who don't know HTML.

Implementation

You can download the code/samples by using the links in the Example part of the post.

GeneratePdf webservice

First I created a JAX-WS webservice to write my PDF on the file system and a servlet to download the PDF. The servlet is not secure as it allows downloading files from the filesystem if you know the filename. Do not use this in production environments!

I included the following JAR files for this;
itextpdf-5.3.1.jar
itextpdfa-5.3.1.jar
itext-xtra-5.3.1.jar
xmlworker-1.1.5.jar

The code I used was the following;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

import java.io.File;
import java.io.FileOutputStream;
import java.io.PrintWriter;
import java.io.StringReader;

import java.io.StringWriter;

import javax.jws.WebMethod;
import javax.jws.WebParam;
import javax.jws.WebResult;
import javax.jws.WebService;

@WebService
public class GeneratePdf {
  public GeneratePdf() {
    super();
  }
 
  @WebMethod
  @WebResult(name = "status")
  public String HtmlToPdf(@WebParam(name = "pdfpath")String pdfpath, @WebParam(name = "xhtml")String html) {
    Document document = new Document();
    PdfWriter writer;
    long size=0;
    try {
      writer = PdfWriter.getInstance(document, new FileOutputStream(pdfpath));
      document.open();
      XMLWorkerHelper.getInstance().parseXHtml(writer, document, new StringReader(html));
      document.close();
      size=new File(pdfpath).length();
    } catch (Exception e) {
      return "NOK: "+stackTraceToString(e);
    }
    return "OK: "+"File created. Size: "+size;
  }
 
  @WebMethod(exclude=true)
  public static String stackTraceToString(Throwable e) {
    StringWriter sw = new StringWriter();
    e.printStackTrace(new PrintWriter(sw));
    return sw.toString();

  }
  /*
  public static void main(String[] args) {
    GeneratePdf myPdf = new GeneratePdf();
    System.out.println(myPdf.HtmlToPdf("c:\\temp\\output3.pdf", "<html><head/><body><p>Hello world</p></body></html>"));
  }
  */
}


DownloadPdf servlet

The code I used for the servlet to download the PDF is the following;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import java.io.InputStream;

import java.io.OutputStream;

import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class DownloadPdf extends HttpServlet {
  private static final long serialVersionUID = 4537516508557609572L;

  public DownloadPdf() {
    super();
  }
 
  public void doGet(HttpServletRequest request, HttpServletResponse response) 
  throws ServletException, IOException, FileNotFoundException  { 

   System.out.println("parameter filename: "+request.getParameter("filename"));
   long length = new File(request.getParameter("filename")).length();
   System.out.println("size: "+length);
   InputStream is = new FileInputStream(new File(request.getParameter("filename"))); 
   
   response.addHeader("content-disposition", "attachment; filename=mypdf.pdf;");

      response.setContentType("application/pdf");
   response.setContentLength(Long.valueOf(length).intValue());
    int read =0; 
   byte[] bytes = new byte[1024];
     OutputStream os = response.getOutputStream(); 
    while((read = is.read(bytes)) != -1) 
   {    os.write(bytes, 0, read); 
   } 
   os.flush(); 
   os.close(); 
  } 
}


Converting to XHTML

The above webservice and servlet are preparation to be able to be flexible in BPEL with generating PDF's. To be able to transform messages in BPEL to XHTML, it's quite useful XHTML has XSD's. See for example http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd.

Of course this XSD is quite complex and it did not immediately work in JDeveloper so I had to make some alterations. I removed the following import;

<xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="http://www.w3.org/2001/xml.xsd"/>

And I removed all references to the xml namespace. These were only 2 lines with a small number of occurences;
<xs:attribute ref="xml:lang"/>
<xs:attribute ref="xml:space" fixed="preserve"/>

Now I was able to define a variable of the type specified in the XHTML XSD. I used the variable for a transformation. In this example I created a Hello [name] XHTML. This XHTML I used to call the webservice to create the PDF. The below example contains the XSL transformation I created.

<?xml version="1.0" encoding="UTF-8" ?>
<?oracle-xsl-mapper
  <!-- SPECIFICATION OF MAP SOURCES AND TARGETS, DO NOT MODIFY. -->
  <mapSources>
    <source type="WSDL">
      <schema location="../HelloWorldPDF_BPEL.wsdl"/>
      <rootElement name="process" namespace="http://xmlns.oracle.com/PdfUtils/HelloWorldPDF/HelloWorldPDF_BPEL"/>
    </source>
  </mapSources>
  <mapTargets>
    <target type="WSDL">
      <schema location="../HelloWorldPDF_BPEL.wsdl"/>
      <rootElement name="html" namespace="http://www.w3.org/1999/xhtml"/>
    </target>
  </mapTargets>
  <!-- GENERATED BY ORACLE XSL MAPPER 11.1.1.6.0(build 111214.0600.1553) AT [FRI OCT 26 09:44:31 CEST 2012]. -->
?>
<xsl:stylesheet version="1.0"
                xmlns:bpws="http://schemas.xmlsoap.org/ws/2003/03/business-process/"
                xmlns:xp20="http://www.oracle.com/XSL/Transform/java/oracle.tip.pc.services.functions.Xpath20"
                xmlns:mhdr="http://www.oracle.com/XSL/Transform/java/oracle.tip.mediator.service.common.functions.MediatorExtnFunction"
                xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
                xmlns:oraext="http://www.oracle.com/XSL/Transform/java/oracle.tip.pc.services.functions.ExtFunc"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:dvm="http://www.oracle.com/XSL/Transform/java/oracle.tip.dvm.LookupValue"
                xmlns:hwf="http://xmlns.oracle.com/bpel/workflow/xpath"
                xmlns:plnk="http://docs.oasis-open.org/wsbpel/2.0/plnktype"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:med="http://schemas.oracle.com/mediator/xpath"
                xmlns:ids="http://xmlns.oracle.com/bpel/services/IdentityService/xpath"
                xmlns:bpm="http://xmlns.oracle.com/bpmn20/extensions"
                xmlns:xdk="http://schemas.oracle.com/bpel/extension/xpath/function/xdk"
                xmlns:xref="http://www.oracle.com/XSL/Transform/java/oracle.tip.xref.xpath.XRefXPathFunctions"
                xmlns:client="http://xmlns.oracle.com/PdfUtils/HelloWorldPDF/HelloWorldPDF_BPEL"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:ns1="http://www.w3.org/1999/xhtml"
                xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                xmlns:bpmn="http://schemas.oracle.com/bpm/xpath"
                xmlns:ora="http://schemas.oracle.com/xpath/extension"
                xmlns:socket="http://www.oracle.com/XSL/Transform/java/oracle.tip.adapter.socket.ProtocolTranslator"
                xmlns:ldap="http://schemas.oracle.com/xpath/extension/ldap"
                exclude-result-prefixes="xsi xsl plnk client xsd ns1 wsdl bpws xp20 mhdr bpel oraext dvm hwf med ids bpm xdk xref bpmn ora socket ldap">
  <xsl:template match="/">
    <ns1:html>
      <ns1:body>
        <ns1:p>
          <xsl:value-of select="concat('Hello ',/client:process/client:input)"/>
        </ns1:p>
      </ns1:body>
    </ns1:html>
  </xsl:template>
</xsl:stylesheet>



You can also observe in the XSL that the format used for the actual layout of the PDF is XHTML.

Example

Now putting it all together;
Download and deploy the PDF webservice/servlet: https://dl.dropbox.com/u/6693935/blog/PdfUtils.zip

Download and deploy the sample BPEL project; https://dl.dropbox.com/u/6693935/blog/PdfUtilsBPEL.zip. Keep in mind that you will most likely need to change the URL of the PDF webservice.

Test the webservice

Check the PDF is created by looking at the /tmp folder of your server running the SOA Suite. I used /tmp since I used an Oracle Enterprise Linux SOA Suite installation. You might want to change the path if you're running the server on Windows.

After you've determined the filename, you can look at the result. Download the PDF by going to (in my case); http://soabpm-vm:7001/PdfUtils//DownloadPdf.do?filename=/tmp/PDF2012-10-26T00:51:19.pdf

 
Final thoughts

The conversion from XHTML to PDF is not perfect. Complex layouts cause problems. I did some try-outs with tables and they were only partially succesful. For more complex layouts, it's worthwhile to look at Apache FOP; http://xmlgraphics.apache.org/fop/ and base your PDF generation on that. Apache FOP can also be used with Oracle APEX. A drawback is that it introduces it's own layouting language.

If you need to convert newlines to <br/> statements (for example when converting texts), you can look at; http://www.danrigsby.com/blog/index.php/2008/01/03/preserving-line-breaks-in-xml-while-transforming-to-html-with-xslt/ for an XSL template to use.

Friday, October 5, 2012

Liquibase; an interesting option for efficient database change management

Often relevant when working with Oracle SOA Suite; database change management. There are several issues with database scripts which are rolled out over different environments in releases;

Take the following example;
Development environment has version 3 of the database software installed. The test environment has version  2 installed and the acceptance test environment has version one installed. The software hasn't reached production yet. In version 1, a table is created. In version 2 data is inserted in the table and in version 3, the data in the table is updated.

When releases are created and deployment is not automated (or automated using custom code), a DBA would have to install 1 followed by 2, followed by 3 in the production environment if the environment is to be updated to version 3. Also when installation is manual, the DBA has to have an SQL client installed (or run the scripts from the DB server) and most likely the scripts in release 1,2 and 3 have different names and installation instructions such as prerequisite checks. Some scripts such as the release 2 script which inserts data may only be executed once on an environment. A custom registration has to be kept in order to keep track of changes in an environment. Not only is this expensive on DBA time, but it's also prone to errors. And what if the installation went wrong? Then often the rollback is restoring a backup or clone from before the installation started. Creating and restoring these can be cumbersome, especially for relatively small changes.

The open source product Liquibase (http://www.liquibase.org) provides a solution for automating database change management. I've tried this product only for a couple of days and this post describes some first experiences with it. I must admit I haven't used it yet at a customer but I might in the near future and will certainly suggest it as an option when the question arises.

Because my current customer shuns fancy things like automated builds, I've used batch files and command-lines in the examples below.

Liquibase

What is Liquibase?

Liquibase is a standalone JAR file which can executed using a commandline or from for example Maven using a plugin. A property file can be used to specify database connection settings. Below an example of such a property file;

driver: oracle.jdbc.OracleDriver
classpath: ojdbc6.jar;liquibase-oracle-1.2.0.jar
url: jdbc:oracle:thin:@localhost:1521:XE
username: testuser
password: testuser
contexts: dev


It is also possible to specify these properties on the command-line;

liquibase --driver=oracle.jdbc.OracleDriver \
     --classpath=ojdbc6.jar;liquibase-oracle-1.2.0.jar \
     --changeLogFile=changelogs\testchangelog.xml \
     --url="jdbc:oracle:thin:@localhost:1521:XE" \
     --username=testuser \
     --password=testuser \
     --contexts="dev"
     update


Database changes are described using XML files called ChangeLogs. An example is shown below;

 <?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ora="http://www.liquibase.org/xml/ns/dbchangelog-ext"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-2.0.xsd
    http://www.liquibase.org/xml/ns/dbchangelog-ext http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd">
     <changeSet id="1" author="ms">
        <createTable tableName="params">
            <column name="paramname" type="varchar(50)">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="paramvalue" type="varchar(50)">
                <constraints nullable="false"/>
            </column>
        </createTable>
    </changeSet>
    
    <changeSet author="ms" id="2" context="dev">
        <insert tableName="params">
            <column name="paramname" value="environment"/>
            <column name="paramvalue" value="dev"/>
        </insert>
        <rollback>
            <delete tableName="params">
                <where>paramname='environment'</where>
            </delete>
        </rollback>
    </changeSet>

    <changeSet author="ms" id="3" context="tst">
        <insert tableName="params">
            <column name="paramname" value="environment"/>
            <column name="paramvalue" value="tst"/>
        </insert>
        <rollback>
            <delete tableName="params">
                <where>paramname='environment'</where>
            </delete>
        </rollback>
    </changeSet>
 
 </databaseChangeLog>

When a changelog is executed against an environment, a table entry is kept which says the changelog has been executed. When the same changelog is executed again, it refrains from re-execution thus avoiding problems with for example insert statements which are executed repeatedly. Most actions described in the changelogs have an automated rollback action. Some such as insert or update statements don't however. They need to be specified explicitly.
To apply a ChangeLogfile the following can be used (if connection properties are specified in liquibase.properties);

liquibase --changeLogFile=changelogs\testchangelog.xml update

Oracle specific tips

There is a specific tutorial for using Oracle in combination with Liquibase; http://www.liquibase.org/tutorial-using-oracle. In my opinion you should consider the parts from this tutorial you want to implement. Liquibase can make database change management easier but you can also make the process complicated. In my opinion you should avoid that a small group or even only one developer knows how to deal with the tool and the process. To achieve this, I'd suggest to keep it simple and the manual scripting / steps for releasing to a minimum.

Oracle extensions

The property file in the example above includes a classpath referring to the Oracle extension which can be used to provide support for some Oracle specific database features; https://liquibase.jira.com/wiki/display/CONTRIB/Oracle+Extensions

These features include amongst others XML tags to enable and disable triggers, to truncate tables, add check constraints, etc.

JDBC drivers

Also you should download the Oracle JDBC driver from for example; http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html

How to deal with packages, grants and synonyms?

Packages, grants and synonyms are more or less Oracle specific and not supported directly using XML tags in ChangeLog files. You can however use the sql tag such as in the example below;

    <changeSet author="ms" id="6" context="tst">
        <sql>
            grant select on params to public
        </sql>
        <rollback>
            <sql>
                revoke select on params from public
            </sql>
        </rollback>
    </changeSet>


Developer tips

Test your installation using apply, rollback, apply scenario in your own schema to avoid hindering other developers. This can be done with the updateTestingRollback option (use instead of update). You can change changelogfiles but a checksum is kept in the database and rerunning the changelog will cause errors if the checksum mismatches. If you want the change to be reapplied and avoid checksum failures, you can use the runOnChange attribute. Also avoid making changes not using Liquibase because that can cause conflicts.

How do you deal with OTAP?

Liquibase has the option to specify contexts as part of the property file. You can see this in the example property file above. In the definition of the ChangeSet a context can be specified. If no context is specified, the changelog is always executed. In the above example, if the environment is dev, the dev environment entry is added to the params table. This will not happen if the context is tst. Then the entry tst will be inserted. If you want to change an environment from dev to tst, you should first rollback 'dev' and then apply 'tst'. Converting dev to tst requires more code; rollback is easier. You can use a precondition to check if an entry is already present and update instead, but if you're loading large datasets, this is not workable.

How to deal with multiple schema's/users?

When using the Oracle database, a database user is also called a schema. Often when the datamodel is implemented, different users/schema's are used to implement different 'modules' of the application or different 'applications'. These schema's are strung together with grants, views, synonyms or even databaselinks when the users are in physically different databases. For the person installing database objects into different schema's, this can be cumbersome. Liquibase can be used with a little bit of scripting to help automate this process.

You can specify commandline which property file is used and which ChangeSets are applied. This can be done with the option defaultsFile like;

liquibase --changeLogFile changelogs\testchangelog.xml --defaultsFile="liq.props" update

You can use this to use a different property file per environment (for example specifying different databases).

builddir
- liquibase_dev.properties
- liquibase_tst.properties
- liquibase_acc.properties
- liquibase_prd.properties
--schema1
---master.xml
---testchangelog.xml

Usually passwords of database users are guarded by DBA-ers and need to be asked during the installation. When using the above directory structure, you can loop over the schema's, ask for the password for the specific schema and use the connection settings specific for the purpose of the environment.

How to deal with errors during the installation

A nice suggestion was done by Jelle Hoekstra. You can create a tag during the start of the release. Next you can execute your changes. You can check the ERRORLEVEL environment variable to determine if a change went successful or has failed. On failure, you can call the rollback action to the previously created tag. This is implemented in the below script.

Example script to run Liquibase

The below example script shows some of the suggestions earlier in this post.

@echo off
set current_dir=%CD%
set liquibase_directory=%current_dir%
set build_directory=%current_dir%\build

set DATESTAMP=%DATE:~3,2%_%DATE:~6,2%_%DATE:~9,4%
set TIMESTAMP=%TIME:~0,2%_%TIME:~3,2%_%TIME:~6,2%
set DATEANDTIME=%DATESTAMP%_%TIMESTAMP%
set SCHEMA=%2%
set PASSWORD=%3%
:BEGIN
IF "%1%" == "" GOTO USAGE
IF "%2%" == "" GOTO USAGE
IF "%3%" == "" GOTO USAGE
SET ENV=%1%
IF "%ENV%" == "" GOTO USAGE
SET filename=%build_directory%\liquibase_%1.properties
echo Property file : %filename% 1> install.log
java -jar %liquibase_directory%/liquibase.jar --defaultsFile=%filename% --changeLogFile=%build_directory%/%SCHEMA%/master.xml --contexts=%ENV% --username=%SCHEMA% --password=%PASSWORD% tag %DATEANDTIME% -Dchangelogpath=%build_directory%/%SCHEMA% 1>>install.log 2>>&1
IF NOT %ERRORLEVEL% == 0 GOTO ERROR
java -jar %liquibase_directory%/liquibase.jar --defaultsFile=%filename% --changeLogFile=%build_directory%/%SCHEMA%/master.xml --contexts=%ENV% --username=%SCHEMA% --password=%PASSWORD% --logLevel=info update -Dchangelogpath=%build_directory%/%SCHEMA% 1>>install.log 2>>&1
IF NOT %ERRORLEVEL% == 0 GOTO ROLLBACK
java -jar %liquibase_directory%/liquibase.jar --defaultsFile=%filename% --changeLogFile=%build_directory%/%SCHEMA%/master.xml --contexts=%ENV% --username=%SCHEMA% --password=%PASSWORD% dbDoc %build_directory%/docs/%SCHEMA% -Dchangelogpath=%build_directory%/%SCHEMA% 1>>install.log 2>>&1
IF NOT %ERRORLEVEL% == 0 GOTO ROLLBACK
ECHO Script executed on %ENV%:SCHEMA %SCHEMA%
GOTO END
:ROLLBACK
java -jar %liquibase_directory%/liquibase.jar --defaultsFile=%filename% --changeLogFile=%build_directory%/%SCHEMA%/master.xml --username=%SCHEMA% --password=%2 rollback %DATEANDTIME% -Dchangelogpath=%build_directory%/%SCHEMA% 1>>install.log 2>>&1
GOTO ERROR
:ERROR
ECHO Errors found during execution of script on %ENV%:SCHEMA %SCHEMA%
ECHO See install.log
GOTO END
:USAGE
ECHO Error during execution of script on %ENV%:SCHEMA %SCHEMA%
ECHO Incorrect parameters
ECHO USAGE : install environment schema password
ECHO environment   = dev,tst,acc,prd
ECHO schema     = Schema/user for which an update needs to be performed
ECHO password = Schema/user password
GOTO END
:END

You can start the script like;

runlb.bat dev testuser testuser

This executes the builddir\testuser\master.xml changelog file and uses the builddir\liquibase_dev.properties file for it's database connection properties. It executes against user testuser with password testuser. In the master.xml file there is a reference to the changelogdir so it can include other changelogfiles.

In this case, the master.xml script looks as followed;

<?xml version="1.0" encoding="UTF-8"?> 
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ora="http://www.liquibase.org/xml/ns/dbchangelog-ext"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-2.0.xsd
    http://www.liquibase.org/xml/ns/dbchangelog-ext http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd">
  <include file="${changelogpath}/testchangelog.xml"/> 
</databaseChangeLog> 

The liquibase-dev.properties looks as follows;

driver: oracle.jdbc.OracleDriver
classpath: ojdbc6.jar;liquibase-oracle-1.2.0.jar
url: jdbc:oracle:thin:@localhost:1521:XE

Notice in the above script the following;
- differentiation between Liquibase location and Build location (they can be taken apart so Liquibase doesn't have to be part of the release itself)
- usage of ERRORLEVEL to perform rollback if needed
- usage of a property to determine the correct changelogpath in the master.xml file
- usage of the contexts parameter
- usage of a log file for the output

Conclusion

Liquibase is very flexible and provides enough options to be able to take care of database change management (with the help of some scripts and version control of course). It provides various integration options but can also be run stand-alone. Also if implemented well, it can reduce the amount of time DBA needs to spend on installations and reduces the amount of human errors during the installation process.

You need to think well about how you are going to implement it because of this flexibility. There are some best practices online which are a good idea to take into account;

Liquibase can do a lot more then what I've mentioned in this post such as generating an initial changelogset from a database or generating diffs. Also how for example version control can be used efficiently together with Liquibase is an important topic I haven't touched yet. These are however customer process specific and should be determined by the person implementing Liquibase at a customer.