Featured Post

Applying Email Validation to a JavaFX TextField Using Binding

This example uses the same controller as in a previous post but adds a use case to support email validation.  A Commons Validator object is ...

Sunday, February 13, 2011

Getting the Source Filename in CloverETL

CloverETL uses a special property of Readers like XLSDataReader to retrieve the data source filename.

When processing a file or a collection of files, the file name often contains valuable information about the contents.  Also, it's important to capture the file name for quality control purposes.  CloverETL allows you to access the filename of a file being processed using the special auto_filling property on Readers like XLSDataReader.

For example, in this graph, I record the filename used in XLSDataReader in the DBOutputTable.  The XLSDataReader will process more than one file.  It's file URL contains the wildcarded value "*_enrollments_*.xlsx" which will expand to filenames like "us_md_enrollments_fy2006.xlsx".

CloverETL Graph Using auto_filling

XLSDataReader's output port has the following metadata.

CloverETL XLSDataReader Output Port Metadata
Fields like County, State, and System were taken from the column headers in an Excel file.  Filename was added using the Edit metadata screen by pressing the green plus arrow.  After setting the field name and type, scroll to the Advanced section of the Field Properties pane.  Set Autofilling to 'source_name'.

Since it's added to the schema, the source_name field is able to be used in other component.  For example, the Graph in this post uses source_name (carried in the "Filename" field), in a Reformat column.

There are other file-based items that may be of interest.  Check the CloverETL Wiki for a list of what else is available.

1 comment:

  1. You can use autofilling property to get even sheet names, see: http://www.cloveretltalks.com/2016/08/how-to-get-all-sheet-names-from-excel.html (shameless plug here :))

    ReplyDelete