tHashOutput and tHashInput work with data held in internal memory, and they do so in a way consistent with other Talend components. The Hash components let you define flows that retrieve, for use in a map, data that has been stored by some other part of the job. In a simple scenario, this is done with a single input/output pair.
Multiple Sources
This screenshot shows a job that will merge two data sources -- a tRowGenerator and a tFileInputDelimited -- into a single Hash data structure using two tHashOutputs. The first tHashOutput will be referenced by subsequent tHashOutputs in the "Link with a tHashOutput" control.
Configuration of Linked tHashOutput
tHashOutput Referring to Prior Component
tHashInput Configuration
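The idea of two writers feeding one shared in-memory structure can be sketched in plain Java. This is only a conceptual analogy, not the code Talend generates: the class `HashMergeSketch`, its `STORE` map, and the `write` and `read` helpers are hypothetical names chosen for illustration, and the row values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashMergeSketch {
    // Hypothetical stand-in for the job-wide in-memory store (not a Talend class).
    private static final Map<String, List<String>> STORE = new HashMap<>();

    // Plays the role of a tHashOutput: appends rows to the buffer called `name`.
    // A linked tHashOutput would pass the name of the first tHashOutput here.
    public static void write(String name, List<String> rows) {
        STORE.computeIfAbsent(name, k -> new ArrayList<>()).addAll(rows);
    }

    // Plays the role of a tHashInput: returns a copy of everything stored so far.
    public static List<String> read(String name) {
        return new ArrayList<>(STORE.getOrDefault(name, Collections.emptyList()));
    }

    public static void main(String[] args) {
        // First source (think tRowGenerator) writes to the primary buffer.
        write("tHashOutput_1", Arrays.asList("generated-1", "generated-2"));
        // Second source (think tFileInputDelimited) is linked to the same buffer.
        write("tHashOutput_1", Arrays.asList("file-1"));
        // A single reader sees the merged rows from both sources.
        System.out.println(read("tHashOutput_1"));
    }
}
```

Both writers target the same buffer name, which mirrors selecting the first tHashOutput in the "Link with a tHashOutput" control of the second one.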
Clearing When Iterating
This job iterates over a data set, clearing the backing RAM structure defined in the tHashOutput with each iteration. This is done by unchecking Append. If Append were left checked, each iteration would produce more and more output, because the tHashOutput would still hold the values gathered in the preceding iterations.
Clearing After Each Iteration
Results with Append Unchecked
If Append is checked, the output is repeated as it accrues through the iterations.
Iterating in Append Mode
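The difference between the two Append settings can be illustrated with a small plain-Java sketch. It is an analogy under stated assumptions rather than Talend's implementation: `AppendModeSketch` and `rowsSeenPerIteration` are hypothetical names, and the buffer is modeled as a simple list that is cleared at the start of an iteration when Append is off.

```java
import java.util.ArrayList;
import java.util.List;

public class AppendModeSketch {
    // Returns how many rows a reader would see after each iteration.
    // `append` models the Append checkbox on the tHashOutput (an assumption of this sketch).
    public static List<Integer> rowsSeenPerIteration(int iterations, int rowsPerIteration, boolean append) {
        List<String> buffer = new ArrayList<>();   // stands in for the backing RAM structure
        List<Integer> counts = new ArrayList<>();
        for (int i = 1; i <= iterations; i++) {
            if (!append) {
                buffer.clear();                    // Append unchecked: start each iteration empty
            }
            for (int r = 1; r <= rowsPerIteration; r++) {
                buffer.add("iteration" + i + "-row" + r);
            }
            counts.add(buffer.size());             // what the reader sees in this iteration
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(rowsSeenPerIteration(3, 2, false)); // prints [2, 2, 2]
        System.out.println(rowsSeenPerIteration(3, 2, true));  // prints [2, 4, 6]
    }
}
```

With Append off, each iteration reads back only its own rows; with Append on, the rows from earlier iterations are still in the buffer and show up again.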
Why did you use the tFixedFlowInput inside the tForEach iteration?
This is for demonstration purposes, to consolidate the display. The tFixedFlowInput is meant to simulate an input component like a tOracleInput. The tForEach shows repeated invocations of an input component subjob. The same demonstration could be rewritten without the tForEach using multiple input (tFixedFlowInput) components in multiple subjobs.
Can the tHBase components only be used in Big Data Batch jobs?