Course recordings on DaDesktop for Training platform
Visit NobleProg websites for related course
Visit outline: Talend Open Studio for ESB (Course code: talendesb)
Categories: SOA ESB Integration · Talend
Summary
Overview
This course session provides a comprehensive hands-on tutorial on using Talend (referred to as "Talent" in the transcript) for data integration tasks. The instructor demonstrates key components and workflows including row and column filtering using the tMap component, conditional job execution via triggers (on job OK/on job error), data formatting with variable expressions in Java, and database migration between MySQL and PostgreSQL using tDBInput and tDBOutput components. The session concludes with a practical example of migrating a customer table from MySQL to PostgreSQL, including schema synchronization, table creation configuration, and data validation via pgAdmin.
Topic (Timeline)
1. Introduction and Environment Issues [00:00:02 - 00:01:03]
- Instructor acknowledges persistent server issues affecting virtual machines.
- Notes lack of progress in troubleshooting; anticipates resolution by next session.
- Confirms audio and participant engagement; transitions to continuing previous session’s example.
2. tMap Component Overview and Basic Setup [00:01:03 - 00:04:21]
- Introduces
tMapas a unified component replacing separatetFilterRowandtFilterColumnoperations. - Demonstrates adding
tMapto canvas and connecting it between input (employeetable) and output (tRowOutput). - Explains necessity of defining output names to avoid naming conflicts.
- Highlights red exclamation mark on
tMapas indicator of incomplete configuration.
3. Configuring tMap: Column Filtering [00:04:21 - 00:06:09]
- Guides participants through double-clicking
tMapto open configuration. - Explains left-to-right mapping: input columns (left) → output columns (right).
- Recommends expanding output panel height for usability.
- Demonstrates selecting
firstName,city, andcountryfrom input and dragging to output to filter columns.
4. Configuring tMap: Row Filtering with Java Expressions [00:06:12 - 00:11:19]
- Explains row filtering requires Java code in the
tMapexpression area. - Expands the expression panel and demonstrates dragging
countryinto the Java field. - Clarifies string comparison syntax:
row.country.equals("Canada")(case-sensitive). - Shows combining conditions using
&&(Java logical AND) to filter forcountry = "Canada"ANDcity = "Calgary". - Final expression:
row.country.equals("Canada") && row.city.equals("Calgary"). - Confirms configuration removes red exclamation mark upon successful entry.
5. Job Sub-Flow Deactivation for Testing [00:11:40 - 00:13:32]
- Teaches how to disable entire sub-job (e.g.,
tFilterRowblock) via right-click → “Deactivate Sub-job”. - Executes job to verify only the active sub-job runs, outputting 5 rows (from 8 input rows).
- Confirms filtering logic works: 3 rows filtered out (not meeting Canada + Calgary criteria).
6. Data Formatting with tMap Variables [00:15:31 - 00:22:32]
- Demonstrates creating a new output column:
fullNameby concatenatingfirstNameandlastName. - Two methods shown:
- Dragging columns into variable expression and manually writing Java:
row.firstName + " " + row.lastName - Using the variable expression menu to double-click column names and auto-insert them with
+operators.
- Dragging columns into variable expression and manually writing Java:
- Shows renaming variable to
fullName_underscore(naming convention). - Explains how to remove unwanted output columns using the “X” button at bottom of output panel.
- Teaches reordering output columns using up/down arrows in output panel.
- Emphasizes expanding the tMap panel to prevent accidental column merging.
7. Job Flow Control with Triggers [00:23:46 - 00:27:36]
- Explains default sequential execution order of sub-jobs based on creation order.
- Introduces
Triggerconnections:on job OK(proceed on success) andon job error(proceed on failure). - Demonstrates connecting
tFilterFile(sub-job below) totDBInput(sub-job above) usingon job OK. - Executes job to confirm execution order now follows trigger chain, not creation order.
- Notes
on job errorbehaves identically but triggers on failure.
8. Database Migration: PostgreSQL Setup [00:28:16 - 00:31:34]
- Introduces new job: “Job_Migracion_Base_Underscore_Datos” to migrate from MySQL to PostgreSQL.
- Guides participants to open pgAdmin, connect to server using
rootpassword. - Creates new database named
Chinookvia right-click → “Create Database”. - Confirms database creation in pgAdmin under public schema.
9. Configuring Talend Database Connection [00:31:34 - 00:34:51]
- Navigates to Talend Metadata → “Create Connection”.
- Selects PostgreSQL as DB type; uses default user
postgres, passwordroot, host127.0.0.1, databaseChinook. - Leaves schema field blank to default to
publicschema. - Performs “Test Connection” to auto-download JDBC drivers.
- Confirms successful connection and finishes setup.
10. Migrating Table: tDBInput to tDBOutput [00:35:08 - 00:40:53]
- Drops
customertable from MySQL into canvas → auto-createstDBInput. - Drops PostgreSQL connection into canvas → creates
tDBOutput(nottDBInput). - Connects
tDBInputdirectly totDBOutputfor exact migration. - Explains color coding: purple = input, green = output.
- Reviews schema mapping:
DB column(source) →Column(Java) →Type(Java type) →DB Type(source type). - Confirms schema auto-syncs between input and output (with caveat: may require manual check).
11. Configuring tDBOutput: Table Creation and Data Action [00:40:53 - 00:45:19]
- Sets
Tablename intDBOutputtoartist(note: must be in quotes as Java string). - Sets
Action on tableto “Create table if not exists” to avoid errors on re-run. - Explains
Action on dataoptions:Insert: adds rows; fails on duplicate PK.Update: modifies existing rows; fails if row doesn’t exist.Insert or update: tries insert first, then update if PK exists.Update or insert: tries update first, then insert if missing (more efficient for updates).
- Selects
Insertfor initial migration.
12. Execution and Validation of Migration [00:45:30 - 00:48:16]
- Executes job: reports 59 rows migrated in 2.36 seconds.
- Validates in pgAdmin: refreshes database → navigates to
publicschema → findsartisttable. - Uses “View/Edit Data” → “All Rows” to confirm data integrity.
- In Talend, refreshes PostgreSQL connection →
artisttable now visible under public schema.
13. Class Schedule Clarification and Closing [00:48:16 - 00:49:53]
- Instructor clarifies next session is Wednesday (not tomorrow) due to Mexican holiday.
- Notes technical issues were resolved during session; encourages questions.
- Announces next class will focus on Talend Integration and Talend ISB.
- Ends with well-wishes and confirmation that session data auto-saves.
Appendix
Key Principles
- tMap is the central component for filtering rows/columns and transforming data in Talend.
- Java expressions are required for complex row filtering; string comparisons use
.equals()and are case-sensitive. - Output column order and naming must be explicitly managed in tMap to avoid errors.
- Sub-jobs can be controlled via triggers (
on job OK/on job error) to override default execution order. - Database schema mapping in Talend distinguishes between source (
DB column,DB type) and target (Column,Type). Action on tableandAction on dataintDBOutputdetermine table creation and DML behavior (insert/update).
Tools Used
- Talend Open Studio (referred to as “Talent”)
- pgAdmin (PostgreSQL administration tool)
- MySQL (source database)
- PostgreSQL (target database)
Common Pitfalls
- Red exclamation mark on tMap = unconfigured output name or invalid Java expression.
- Forgetting to set
Action on tableto “Create table if not exists” causes job failure on re-run. - Not expanding tMap panel can cause accidental column merging during drag-and-drop.
- Using
insertinstead ofupdate or insertfor large datasets with frequent updates leads to performance issues. - Leaving schema field blank in PostgreSQL connection defaults to
public— critical for correct table access.
Practice Suggestions
- Recreate the
tMapfiltering example with different conditions (e.g.,country = "USA"andcity = "New York"). - Try migrating a second table (e.g.,
order) from MySQL to PostgreSQL using the same pattern. - Experiment with
update or insertvsinsert or updateon a table with existing data. - Use
tRowOutputto preview transformed data before writing to database. - Practice disabling sub-jobs to isolate and debug specific components.