Skip to content

Learn about our organization's purpose, values, and history that define who we are and how we make a difference.

Who we are

why-we-are

Discover how the Mastech InfoTrellis ecosystem is enabling customers to make well-informed decisions faster than ever and how we stand apart in the industry.

Delve into our wealth of insights, research, and expertise across various resources, and uncover our unique perspectives.

Thrive in a supportive and inclusive work environment, explore diverse career options, grow your skills, and be a part of our mission to excellence.

Table of Content

Seamless integration of Informatica Data Quality with Informatica MDM

Data cleansing and standardization is an important aspect of any Master Data Management (MDM) project. Informatica MDM Multi-Domain Edition (MDE) provides reasonable number of cleanse functions out-of-the-box. However, there are requirements when the OOTB cleanse functions are not enough and there is a need for comprehensive functions to achieve data cleansing and standardization, for e.g. address validation, sequence generation. Informatica Data Quality (IDQ) provides an extensive array of cleansing and standardization options. IDQ can easily be used along with Informatica MDM.

This blog post describes the various options to integrate Informatica MDM and IDQ, explains the advantages and disadvantages of each approach to aid in deciding the optimal approach based on the requirements.

Informatica MDM-IDQ Integration Options

There are three options through which IDQ can be integrated with Informatica MDM.

  1. Informatica Platform staging
  2. IDQ Cleanse Library
  3. Informatica MDM as target

Option 1: Informatica Platform Staging

Starting with Informatica MDM’s Multi-Domain Edition (MDE) version 10.x, Informatica has introduced a new feature called “Informatica Platform Staging” within MDM to integrate with IDQ (Developer Tool). This feature enables to directly stage/cleanse data using IDQ mappings to MDM’s Stage tables bypassing Landing tables.

informatica-platform-staging-process-1000x409

Figure 1: Informatica Platform Staging Process

Advantages

  • Stage tables are immediately available to use in the Developer tool after synchronization eliminating the need to manually create physical data objects.
  • Changes to the synchronized structures are reflected into the Developer tool automatically.
  • Enables loading data into Informatica MDM’s staging tables bypassing the landing tables.

Disadvantages

  • Creating connection for each Base Object folder in Developer tool can be cumbersome to maintain.
  • Hub Stage options like Delta detection, hard delete detection, and audit trails are not available.
  • System generated columns need to be populated manually.
  • Rejected records are not captured in _REJ table of the corresponding stage table but get captured in .bad file.
  • Invalid lookup values are not rejected while data loads to stage unlike in Hub Stage Process. The record with invalid value gets rejected and captured by the Hub Load process.

Option 2: IDQ Cleanse Library

IDQ allows creating functions as operation mappings and deploy them as web service which can then be imported in Informatica MDM Hub implementation as a new type of cleanse library defined as IDQ cleanse library. This functionality allows usage of the imported IDQ cleanse functions, just like any other out-of-the-box cleanse function. Informatica MDM Hub acts as a Web service client application that consumes IDQ’s web services.

Hub Stage Process design using IDQ imported function

Figure 2: Hub Stage Process design using IDQ imported function

Advantages

  • Easily build transformations in IDQ’s Informatica Developer tool rather than creating complex java functions.
  • Unlike Informatica Platform staging, Hub Stage process options such as delta detection, hard delete detection, audit trail are available for use.

Disadvantages

  • Physical data objects need to be manually created for each staging table and manually updated for any changes to the table.
  • IDQ function must contain all transformation logic to leverage batching of records. If any transformation logic is additionally defined in the MDM map, then calls to the IDQ web service provider will be single record leading to performance issue.
  • Web service invocations are synchronous only, which can be a concern for large data volume.

Option 3: Informatica MDM as target

3.1 Loading data landing tables

Informatica MDM can be used as target for loading the data to landing tables in Informatica MDM.

Using IDQ as ETL tool for loading data into landing table

Figure 3: Using IDQ as ETL tool for loading data into landing table

Advantages

  • Single connection created in Developer tool for Informatica MDM is less cumbersome when compared to creating multiple connections with Informatica platform staging.
  • No need to standardize data in the Hub Stage Process.
  • Unlike Informatica Platform staging, Hub Stage process options – delta detection, hard delete detection, audit trail are available to use.

Disadvantages

  • Physical data objects need to be manually created for each landing table and manually updated for any changes to the table.
  • Need to create mappings at two levels (i) source to landing and (ii) landing to staging (direct mapping).

3.2 Loading data staging tables (bypassing landing tables)

Informatica MDM can be used as target for loading the directly to staging tables in Informatica MDM, bypassing landing tables.

Using IDQ as ETL tool to load data directly into staging tables

Figure 4: Using IDQ as ETL tool to load data directly into staging tables

Advantages

  • Single connection created in Developer tool for Informatica MDM is less cumbersome when compared to creating multiple connections with Informatica platform staging.
  • Can be used for lower version of Informatica MDM where Informatica Platform staging option is not available.

Disadvantages

  • Physical data objects need to be manually created for each staging table and manually updated for any changes to the table.
  • Hub Stage Delta detection, hard delete detection, and audit trails options are not available.
  • System generated columns need to be populated manually.
  • Rejected records are not captured in _REJ table of the corresponding stage table but get captured in .bad file.
  • Invalid lookup values are not rejected while data loads to stage unlike in Hub Stage Process. The record with invalid value gets rejected and captured by the Hub Load process.

Conclusion

While there are multiple options to integrate IDQ with Informatica MDM, based on client’s requirements, the ideal approach is analyzed and recommended.

avatar

Meenatchi D

Consulting Enterprise Integration Specialist

Meenatchi D is an accomplished Consulting Enterprise Integration Specialist with a strong ETL and MDM technical consulting background. With a proven track record in Data Warehousing and Data Management projects, Meenatchi brings expertise in seamlessly integrating and optimizing data systems.