Monday, March 24, 2014

Collaborating on Talend Open Studio Routines: Part 4 - Assembly

Pulling Routine code development out of Talend Open Studio means that you can enlist supporting technologies like Git and Maven to foster collaboration.  This post describes how to assemble Java source into a zip file that can be imported into Talend Open Studio.
Team-based development requires source code management, a capability that Talend Open Studio (TOS) doesn't provide.  However, factoring the source code of a TOS Routine out of TOS means that some other technology is required to build the code; this building is normally done within TOS, starting with the Create Routine command.  In my Routine development, I use the Maven build tool to compile the TOS Routine into .class files.

I also use Maven to package the compiled product along with library dependencies (JARs) and TOS metadata (talend.project, .properties files) into a zip file for distributing to users.  Maven has great out-of-the-box functionality for creating well-known artifacts like library JARs or web application WARs, but requires customization for artifacts like a TOS Routine zip file.

Assembly Plugin

Out-of-the-box, Maven will produce a JAR file from a set of .java source files with very little pom.xml configuration besides the project identification.  Code in src/main/java is compiled and zipped, producing a JAR named after the project's artifactId and version.  If you want to produce something that deviates from the standard JAR format, say to include third-party dependencies or to change the file hierarchy inside the archive, you can use a Maven extension called a Plugin, the Assembly Plugin in particular.
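As a rough illustration of how little configuration the default JAR build needs, here is a minimal pom.xml sketch.  The groupId, artifactId, and version are placeholders chosen for this example, not the actual BRules coordinates.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Project identification; these coordinates are hypothetical -->
  <groupId>com.example</groupId>
  <artifactId>my-routine</artifactId>
  <version>1.6.0</version>

  <!-- jar is the default packaging, shown here only for clarity -->
  <packaging>jar</packaging>
</project>
```

With sources under src/main/java, running "mvn package" against a pom like this should leave target/my-routine-1.6.0.jar, since the default JAR name is built from the artifactId and version.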

In the TOS Routine that I maintain, BRules, I produce a JAR so that BRules can be used in any Java project.  However, to import the Routine into TOS, the format has to change.  For example, the file BRules.java is placed under the TALENDPROJECT/code/routines/bekwam folder and is named "BRules_1.6.item" where 1.6 is the version of the Routine.

The following XML is the assembly descriptor, bin.xml, which is stored in the BRules project under src/main/assembly (the path that the pom.xml excerpt later in this post points to).  The BRules project still produces a JAR using the standard layout, but a zip file -- brules-bin.zip -- is also produced using a layout that allows the Routine to be imported into TOS.

<assembly
    xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
  <id>bin</id>
  <formats>
    <format>zip</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <files>
    <file>
      <source>${project.build.directory}/TALENDROUTINE/talend.project</source>
      <outputDirectory>TALENDPROJECT</outputDirectory>
      <destName>talend.project</destName>
    </file>
    <file>
      <source>src/main/java/routines/BRules.java</source>
      <outputDirectory>TALENDPROJECT/code/routines/bekwam</outputDirectory>
      <destName>BRules_${parsedVersion.majorVersion}.${parsedVersion.minorVersion}.item</destName>
    </file>
    <file>
      <source>${project.build.directory}/TALENDROUTINE/BRules_${parsedVersion.majorVersion}.${parsedVersion.minorVersion}.properties</source>
      <outputDirectory>TALENDPROJECT/code/routines/bekwam</outputDirectory>
      <destName>BRules_${parsedVersion.majorVersion}.${parsedVersion.minorVersion}.properties</destName>
    </file>
  </files>
  <dependencySets>
    <dependencySet>
      <outputDirectory>TALENDPROJECT/lib</outputDirectory>
      <useProjectArtifact>false</useProjectArtifact>
      <useTransitiveDependencies>false</useTransitiveDependencies>
    </dependencySet>
  </dependencySets>
</assembly>

The bin.xml does four things that a standard JAR build does not do.
  1. Copies a generated file called talend.project into a subfolder in the zip file called TALENDPROJECT.
  2. Renames the source file BRules.java to BRules_1.6.item where 1.6 is the major and minor version of the project.  The renamed file is copied to the TALENDPROJECT/code/routines/bekwam subfolder in the zip file.
  3. Copies a generated file called BRules_1.6.properties to TALENDPROJECT/code/routines/bekwam.
  4. Copies all of the dependent libraries to the zip file under the TALENDPROJECT/lib folder.
The bin.xml file is a collection of directives that lets you pull resources from the build, such as input source files, generated class files, library dependencies, or other generated files, and transform them into the desired output layout.  For example, the <file> directive under the <files> section allows you to specify a source file and map it to a destination name and location.  The <dependencySet> directive allows you to manipulate the JAR file dependencies in a similar fashion.
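To illustrate the same source-to-destination mapping with other directives, the sketch below uses a <fileSet> to copy a group of files by pattern and narrows a <dependencySet> to a single library.  The directory names, the "example" id, and the commons-lang coordinates are assumptions made up for this illustration; they are not part of the BRules descriptor, and nothing here implies that TOS expects this layout.

```xml
<assembly
    xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
  <id>example</id>
  <formats>
    <format>zip</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <fileSets>
    <!-- Copy every .properties file from a (hypothetical) generated folder -->
    <fileSet>
      <directory>${project.build.directory}/generated</directory>
      <outputDirectory>config</outputDirectory>
      <includes>
        <include>*.properties</include>
      </includes>
    </fileSet>
  </fileSets>
  <dependencySets>
    <!-- Copy only one named runtime dependency (groupId:artifactId) -->
    <dependencySet>
      <outputDirectory>lib</outputDirectory>
      <useProjectArtifact>false</useProjectArtifact>
      <scope>runtime</scope>
      <includes>
        <include>commons-lang:commons-lang</include>
      </includes>
    </dependencySet>
  </dependencySets>
</assembly>
```

A <file> entry is the better fit when a single file has to be renamed, as with the .item file; a <fileSet> suits copying many files whose names stay the same.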

Using the Assembly Plugin

To use the Assembly Plugin, which is a standard Maven plugin, you need to create an assembly descriptor as in the previous section.  There are predefined descriptors that you can use -- there's one to capture all of the source and binaries for a "project" distribution -- but for TOS Routines, you'll need your own, similar to the one I wrote for BRules.  Additionally, you'll need to configure your project pom.xml to invoke the Assembly Plugin at the package phase of Maven.
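For comparison, a predefined descriptor is referenced by name through <descriptorRefs> rather than by file path.  The sketch below is not taken from the BRules build; the execution id is a made-up name, and it only shows how the "project" descriptor could be wired in.

```xml
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <executions>
    <execution>
      <!-- hypothetical id for this illustration -->
      <id>project-distribution</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
      <configuration>
        <descriptorRefs>
          <!-- predefined descriptor shipped with the Assembly Plugin -->
          <descriptorRef>project</descriptorRef>
        </descriptorRefs>
      </configuration>
    </execution>
  </executions>
</plugin>
```

A custom descriptor such as bin.xml is configured the same way, except that it uses <descriptors> with a path to the file.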


From the build/plugins section of the project pom.xml:

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <executions>
    <execution>
      <id>talend-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
      <configuration>
        <descriptors>
          <descriptor>src/main/assembly/bin.xml</descriptor>
        </descriptors>
      </configuration>
    </execution>
  </executions>
</plugin>


This will hook the goal "single" into the package phase so that each time you run "mvn install" (or "mvn package"), the build produces brules-bin.zip along with the JAR.
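The ${parsedVersion.majorVersion} and ${parsedVersion.minorVersion} properties used in bin.xml are not defined by Maven itself.  One common way to get them is the parse-version goal of the Build Helper Maven Plugin, which splits the project version into parts.  Whether the BRules build uses exactly this setup is an assumption on my part, and the plugin version shown is only an example; treat the following as a sketch for the same build/plugins section.

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <!-- example version only; pin whichever release suits your build -->
  <version>3.4.0</version>
  <executions>
    <execution>
      <!-- hypothetical id for this illustration -->
      <id>parse-version</id>
      <!-- run early so the properties exist before the package phase -->
      <phase>validate</phase>
      <goals>
        <goal>parse-version</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

For a project version of 1.6.0, this goal sets parsedVersion.majorVersion to 1 and parsedVersion.minorVersion to 6, which is how a destName like BRules_1.6.item can be derived from the version.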

Producing this packaging took some reverse engineering of the TOS exported Routine.  As such, it may need to be updated for future versions of TOS; Talend has no agreement with me to adhere to this file structure in the future.  However, the structure has held for the past few years, so testing the zip against each new TOS release as it comes out should catch any changes early.  The next post will expand on the reverse engineering of the TOS Routine, how to produce the metadata associated with a TOS Routine, and how you can take advantage of the existing TalendRoutine Maven Plugin in your Routine development.
