A reader recently asked about manipulating schedule data for more efficient presentation. I've worked with schedulesas part of the Open Travel Alliance which is an XML standard for sharing travel industry information. In both cases, the data structures involved in the intake of the data weren't suited for the display. Talend Open Studio and Jaspersoft can resolve the discrepancy.
Operations Schedule
The hours in which a business operates is important to customers. In most cases, a business with a storefront will not operate 24x7, with weekend hours differing from weekday hours. There may be different times within the week (closed Mondays).
To input information like this is a structured way -- that is, not echoing back text strings -- means that the presentation can be standardized across an applications, validation is easier, and back-end computations and analytics can be used. A UI that might capture this information would look like this:
An Operations Schedule UI Helps Maintain Structured Data |
Operations Schedule Data |
Target Presentation
Even though the operations schedule data is structured, we'd still like to display the information in a concise format for the user. Instead of
- Monday 09:00 - 17:00
- Tuesday 09:00 - 17:00
- Wednesday 09:00 - 17:00
- Thursday 09:00 - 17:00
- Friday 09:00 -17:00
- Saturday 10:00 - 17:00
- Sunday 10:00-17:00
We'd like
- Monday - Friday 09:00 - 17:00
- Saturday and Sunday 10:00-17:00
Consolidating data in this fashion is easy if the data is in a relational database. The data can be grouped by key (hotel, restaurant), open time, and close time. This reduces the number of lines in the previous paragraph from 7 down to 2.
An additional manipulation is required in order to support the group by operation. This is to convert the boolean values into a numeric which makes the values available to the SQL aggregate operation "MAX". This is a workaround because there is no SQL aggregate operation for logic like "OR".
Schema Replacing Boolean Values with Numerics for MAX() Function |
Talend Open Studio
If the input data is not provided directly by an application, say it's provided through an external interface, you should load your data into an RDBMS. To transform this input data from a spreadsheet into something ready to be grouped is easy with Talend Open Studio. A simple 3-component job will work.
Talend Job to Load Spreadsheet into Access |
tMap Converting Boolean Values to Numerics |
Notice that Updatable and Insertable are unchecked.
Preventing an Insert on an Auto Increment Column |
The result of running the Talend Open Studio job is the following. All of the Talend Open Studio and Access types are String or Text, but Access's display seems to coerce the openTime and closeTime columns into a date.
Data Loaded and Transformed into an RDBMS |
No comments:
Post a Comment