12 videos 📅 2024-09-30 09:00:00 America/Bahia_Banderas
48:57
2024-09-30 11:03:24
30:21
2024-09-30 12:02:29
3:29
2024-09-30 12:34:41
1:04
2024-09-30 12:43:50
1:22
2024-09-30 13:06:34
5:52
2024-09-30 13:08:57
5:02
2024-09-30 13:30:59
12:33
2024-09-30 13:46:10
49:55
2024-09-30 15:00:01
6:21
2024-10-08 11:02:02
8:16
2024-10-08 12:05:49
3:03
2024-10-08 12:26:15

Course recordings on DaDesktop for Training platform

Visit NobleProg websites for related course

Visit outline: Talend Open Studio for ESB (Course code: talendesb)

Categories: SOA ESB Integration · Talend

Summary

Overview

This course session provides a comprehensive hands-on tutorial on using Talend (referred to as "Talent" in the transcript) for data integration tasks. The instructor demonstrates key components and workflows including row and column filtering using the tMap component, conditional job execution via triggers (on job OK/on job error), data formatting with variable expressions in Java, and database migration between MySQL and PostgreSQL using tDBInput and tDBOutput components. The session concludes with a practical example of migrating a customer table from MySQL to PostgreSQL, including schema synchronization, table creation configuration, and data validation via pgAdmin.


Topic (Timeline)

1. Introduction and Environment Issues [00:00:02 - 00:01:03]

  • Instructor acknowledges persistent server issues affecting virtual machines.
  • Notes lack of progress in troubleshooting; anticipates resolution by next session.
  • Confirms audio and participant engagement; transitions to continuing previous session’s example.

2. tMap Component Overview and Basic Setup [00:01:03 - 00:04:21]

  • Introduces tMap as a unified component replacing separate tFilterRow and tFilterColumn operations.
  • Demonstrates adding tMap to canvas and connecting it between input (employee table) and output (tRowOutput).
  • Explains necessity of defining output names to avoid naming conflicts.
  • Highlights red exclamation mark on tMap as indicator of incomplete configuration.

3. Configuring tMap: Column Filtering [00:04:21 - 00:06:09]

  • Guides participants through double-clicking tMap to open configuration.
  • Explains left-to-right mapping: input columns (left) → output columns (right).
  • Recommends expanding output panel height for usability.
  • Demonstrates selecting firstName, city, and country from input and dragging to output to filter columns.

4. Configuring tMap: Row Filtering with Java Expressions [00:06:12 - 00:11:19]

  • Explains row filtering requires Java code in the tMap expression area.
  • Expands the expression panel and demonstrates dragging country into the Java field.
  • Clarifies string comparison syntax: row.country.equals("Canada") (case-sensitive).
  • Shows combining conditions using && (Java logical AND) to filter for country = "Canada" AND city = "Calgary".
  • Final expression: row.country.equals("Canada") && row.city.equals("Calgary").
  • Confirms configuration removes red exclamation mark upon successful entry.

5. Job Sub-Flow Deactivation for Testing [00:11:40 - 00:13:32]

  • Teaches how to disable entire sub-job (e.g., tFilterRow block) via right-click → “Deactivate Sub-job”.
  • Executes job to verify only the active sub-job runs, outputting 5 rows (from 8 input rows).
  • Confirms filtering logic works: 3 rows filtered out (not meeting Canada + Calgary criteria).

6. Data Formatting with tMap Variables [00:15:31 - 00:22:32]

  • Demonstrates creating a new output column: fullName by concatenating firstName and lastName.
  • Two methods shown:
    1. Dragging columns into variable expression and manually writing Java: row.firstName + " " + row.lastName
    2. Using the variable expression menu to double-click column names and auto-insert them with + operators.
  • Shows renaming variable to fullName_underscore (naming convention).
  • Explains how to remove unwanted output columns using the “X” button at bottom of output panel.
  • Teaches reordering output columns using up/down arrows in output panel.
  • Emphasizes expanding the tMap panel to prevent accidental column merging.

7. Job Flow Control with Triggers [00:23:46 - 00:27:36]

  • Explains default sequential execution order of sub-jobs based on creation order.
  • Introduces Trigger connections: on job OK (proceed on success) and on job error (proceed on failure).
  • Demonstrates connecting tFilterFile (sub-job below) to tDBInput (sub-job above) using on job OK.
  • Executes job to confirm execution order now follows trigger chain, not creation order.
  • Notes on job error behaves identically but triggers on failure.

8. Database Migration: PostgreSQL Setup [00:28:16 - 00:31:34]

  • Introduces new job: “Job_Migracion_Base_Underscore_Datos” to migrate from MySQL to PostgreSQL.
  • Guides participants to open pgAdmin, connect to server using root password.
  • Creates new database named Chinook via right-click → “Create Database”.
  • Confirms database creation in pgAdmin under public schema.

9. Configuring Talend Database Connection [00:31:34 - 00:34:51]

  • Navigates to Talend Metadata → “Create Connection”.
  • Selects PostgreSQL as DB type; uses default user postgres, password root, host 127.0.0.1, database Chinook.
  • Leaves schema field blank to default to public schema.
  • Performs “Test Connection” to auto-download JDBC drivers.
  • Confirms successful connection and finishes setup.

10. Migrating Table: tDBInput to tDBOutput [00:35:08 - 00:40:53]

  • Drops customer table from MySQL into canvas → auto-creates tDBInput.
  • Drops PostgreSQL connection into canvas → creates tDBOutput (not tDBInput).
  • Connects tDBInput directly to tDBOutput for exact migration.
  • Explains color coding: purple = input, green = output.
  • Reviews schema mapping: DB column (source) → Column (Java) → Type (Java type) → DB Type (source type).
  • Confirms schema auto-syncs between input and output (with caveat: may require manual check).

11. Configuring tDBOutput: Table Creation and Data Action [00:40:53 - 00:45:19]

  • Sets Table name in tDBOutput to artist (note: must be in quotes as Java string).
  • Sets Action on table to “Create table if not exists” to avoid errors on re-run.
  • Explains Action on data options:
    • Insert: adds rows; fails on duplicate PK.
    • Update: modifies existing rows; fails if row doesn’t exist.
    • Insert or update: tries insert first, then update if PK exists.
    • Update or insert: tries update first, then insert if missing (more efficient for updates).
  • Selects Insert for initial migration.

12. Execution and Validation of Migration [00:45:30 - 00:48:16]

  • Executes job: reports 59 rows migrated in 2.36 seconds.
  • Validates in pgAdmin: refreshes database → navigates to public schema → finds artist table.
  • Uses “View/Edit Data” → “All Rows” to confirm data integrity.
  • In Talend, refreshes PostgreSQL connection → artist table now visible under public schema.

13. Class Schedule Clarification and Closing [00:48:16 - 00:49:53]

  • Instructor clarifies next session is Wednesday (not tomorrow) due to Mexican holiday.
  • Notes technical issues were resolved during session; encourages questions.
  • Announces next class will focus on Talend Integration and Talend ISB.
  • Ends with well-wishes and confirmation that session data auto-saves.

Appendix

Key Principles

  • tMap is the central component for filtering rows/columns and transforming data in Talend.
  • Java expressions are required for complex row filtering; string comparisons use .equals() and are case-sensitive.
  • Output column order and naming must be explicitly managed in tMap to avoid errors.
  • Sub-jobs can be controlled via triggers (on job OK/on job error) to override default execution order.
  • Database schema mapping in Talend distinguishes between source (DB column, DB type) and target (Column, Type).
  • Action on table and Action on data in tDBOutput determine table creation and DML behavior (insert/update).

Tools Used

  • Talend Open Studio (referred to as “Talent”)
  • pgAdmin (PostgreSQL administration tool)
  • MySQL (source database)
  • PostgreSQL (target database)

Common Pitfalls

  • Red exclamation mark on tMap = unconfigured output name or invalid Java expression.
  • Forgetting to set Action on table to “Create table if not exists” causes job failure on re-run.
  • Not expanding tMap panel can cause accidental column merging during drag-and-drop.
  • Using insert instead of update or insert for large datasets with frequent updates leads to performance issues.
  • Leaving schema field blank in PostgreSQL connection defaults to public — critical for correct table access.

Practice Suggestions

  • Recreate the tMap filtering example with different conditions (e.g., country = "USA" and city = "New York").
  • Try migrating a second table (e.g., order) from MySQL to PostgreSQL using the same pattern.
  • Experiment with update or insert vs insert or update on a table with existing data.
  • Use tRowOutput to preview transformed data before writing to database.
  • Practice disabling sub-jobs to isolate and debug specific components.