Informatica Tutorial: Understanding Informatica ‘Inside Out’

Nov 21, 2016

In the last blog, we learnt what Informatica is and how it is applied in real life. In this Informatica Tutorial blog, let us dive deeper into Informatica, its architecture, and a use case. As discussed in the last blog, Informatica PowerCenter is the flagship product of Informatica, and the two names are often used interchangeably. To recap, Informatica PowerCenter is a single, unified enterprise data integration platform that allows companies and government organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed. It is an ETL (Extract, Transform and Load) tool, and its main advantages over other ETL tools are as follows:

  • It is robust and can be used on both Windows and UNIX-based systems
  • It is high performing, yet very simple to develop with, maintain, and administer

Informatica Tutorial: Understanding Informatica PowerCenter

To understand how Informatica works in practice, we need to understand the Informatica architecture and its other components in depth. By the end of this Informatica Tutorial blog, you will be able to understand the following:

  1. What is Informatica Architecture?
    1. Client Component of Informatica
      1. Informatica PowerCenter Repository Manager
      2. Informatica PowerCenter Designer
      3. PowerCenter Workflow Manager
      4. PowerCenter Workflow Monitor
      5. Administrator Console
    2. Server Component of Informatica
      1. Repository Service
      2. Integration Service
      3. SAP BW Service
      4. Webservices Hub
  2. Flow of data in Informatica
  3. Informatica Domain & Nodes
  4. Informatica Services & Service Manager
  5. Use Case: How to load product dimension table using SCD

What is Informatica Architecture?

The architecture of Informatica PowerCenter is based on the Service Oriented Architecture (SOA) concept. A service oriented architecture can be defined as a group of services that communicate with each other. The communication may involve simple data transfer, or it may involve two or more services coordinating the same activity.

Development in Informatica is based on component-based development techniques. Component-based development is a technique in which predefined components or functional units, each with specific functionality, are assembled into the final product. PowerCenter follows this methodology by letting you build a data flow from a source to a target using different components (called transformations) and linking them to each other as required. A good way to proceed is to first understand the components of Informatica and then learn, through a use case, how to apply Informatica to a typical business problem.
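To make the component idea concrete, here is a minimal Python sketch. It is not PowerCenter code: the functions `filter_active`, `uppercase_name`, and `run_mapping`, as well as the sample rows, are hypothetical names invented for illustration. It only shows how small, single-purpose components can be linked in order to carry rows from a source to a target.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Transformation = Callable[[Iterable[Row]], Iterator[Row]]


def filter_active(rows: Iterable[Row]) -> Iterator[Row]:
    """Pass through only rows flagged as active (plays the role of a filter component)."""
    for row in rows:
        if row.get("active"):
            yield row


def uppercase_name(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield a copy of each row with the name upper-cased (an expression-style component)."""
    for row in rows:
        yield {**row, "name": str(row["name"]).upper()}


def run_mapping(source: Iterable[Row], transformations: List[Transformation]) -> List[Row]:
    """Link the components in the given order and collect the rows that reach the target."""
    stream: Iterable[Row] = source
    for transform in transformations:
        stream = transform(stream)
    return list(stream)


# Hypothetical source rows, for illustration only.
source_rows = [
    {"name": "mary", "active": True},
    {"name": "john", "active": False},
]
target_rows = run_mapping(source_rows, [filter_active, uppercase_name])
print(target_rows)
```

Each function knows nothing about the others, so components can be reordered or reused in another data flow, which is the property the component-based approach is after.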

So, the Informatica PowerCenter tool consists of 2 components. They are:

  • Client component
  • Server component

Fig: Informatica Architecture Overview

Client Components of Informatica PowerCenter:

  • PowerCenter Repository Manager:

Repository Manager is used to administer repositories. It manages users and groups: we can create, delete, and edit repository users and user groups, and we can assign and revoke repository privileges and folder permissions.

The Repository Manager has the following windows:

  • Navigator: It displays all objects that you create in the Repository Manager, the Designer, and the Workflow Manager. It is organized first by repository and then by folder.
  • Main: It provides properties of the object selected in the Navigator. The columns in this window change depending on the object selected in the Navigator.
  • Output: It provides the output of tasks executed within the Repository Manager.

Fig: Repository Manager

                               

  • Informatica PowerCenter Designer

The PowerCenter Designer is the client where we specify how to move data between various sources and targets. This is where we implement the business requirements by passing the data through different PowerCenter components called transformations. The Designer is used to create source definitions, target definitions, and transformations, which can then be used to develop mappings.

Fig: Informatica PowerCenter Designer

  • Informatica PowerCenter Workflow Manager

A workflow is an ordered set of one or more sessions and other tasks, designed to accomplish an overall operational purpose. It executes a series of mappings (as sessions) and other tasks.

Fig: Workflow Manager

The Workflow Manager is the PowerCenter application that enables designers to build and run Workflows. It can be opened as follows:

  • From the Designer, by clicking the “W” icon
  • Independently, from the path Start > All Programs > Informatica PowerCenter 9.6.1 > Client > PowerCenter Client > PowerCenter Workflow Manager
  • From the Workflow Designer, the tool you use to create workflow objects

Fig: Workflow Manager Interface

The Workflow Manager displays the following windows to help you create and organize workflows:

  • Navigator: You can connect to and work in multiple repositories and folders. In the Navigator, the Workflow Manager displays a red icon over invalid objects.
  • Workspace: You can create, edit, and view tasks, workflows, and worklets.
  • Output: It contains tabs to display different types of output messages. The Output window contains the following tabs:
    • Save. Displays messages when you save a workflow, worklet, or task. The Save tab displays a validation summary when you save a workflow or a worklet.
    • Fetch Log. Displays messages when the Workflow Manager fetches objects from the repository.
    • Validate. Displays messages when you validate a workflow, worklet, or task.
    • Copy. Displays messages when you copy repository objects.
    • Server. Displays messages from the Integration Service.
    • Notifications. Displays messages from the Repository Service.

  • Workflow Designer

It maps the execution order and dependencies of Sessions, Tasks and Worklets, for the Informatica Server.

Fig: Workflow Designer

  • Task Developer

It creates Session, Shell Command, and Email tasks. Tasks created in the Task Developer are reusable.

  • Worklet Designer

It creates objects that represent a set of tasks. Worklet objects are reusable.

The Workflow Manager also displays a status bar that shows the status of the operation you perform.

The following figure illustrates what a typical workflow looks like, including the Start task, Link, and Session task components.

Fig: Example of Workflow Manager
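The idea of links fixing the execution order of tasks can be sketched in a few lines of Python. This is only an illustration of dependency ordering, not how the Integration Service is implemented. The task names (`s_load_staging`, `s_load_dimension`, `email_on_success`) are hypothetical, and the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Each key is a task; its set holds the tasks that must finish first,
# i.e. the tasks it is linked from. All task names are hypothetical.
links = {
    "Start": set(),
    "s_load_staging": {"Start"},
    "s_load_dimension": {"s_load_staging"},
    "email_on_success": {"s_load_dimension"},
}

# static_order() yields every task only after all of its predecessors.
execution_order = list(TopologicalSorter(links).static_order())
print(execution_order)
```

A workflow with a cycle in its links would make `TopologicalSorter` raise `graphlib.CycleError`, which mirrors the requirement that tasks form an ordered set starting from the Start task.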

  • Informatica PowerCenter Workflow Monitor

The Workflow Monitor, a PowerCenter tool, is used to monitor the execution of workflows and tasks.

Workflow Monitor can be used to:

  • View details about a workflow or task run in Gantt chart view or task view
  • Run, stop, abort, and resume workflows or tasks

The Workflow Monitor displays workflows that have run at least once. It continuously receives information from the Integration Service and Repository Service, and it also fetches information from the repository to display historic information.

Fig: Workflow Monitor

How to Open Informatica Workflow Monitor:

To open the Workflow Monitor, go to:

Start > All Programs > Informatica PowerCenter 9.6.1 > Client > PowerCenter Client > PowerCenter Workflow Monitor

The monitor can also be opened:

  • From the Workflow Manager Navigator
  • Automatically, if the Workflow Manager is configured to open the Workflow Monitor when a workflow is run from the Workflow Manager
  • From Tools > Workflow Monitor in the Designer, Workflow Manager, or Repository Manager
  • From the Workflow Monitor icon on the Tools toolbar

Fig: Workflow Monitor sections

  • Informatica Administrator Console

The Informatica Administrator console (the Administrator tool) is used to administer the Informatica domain and Informatica security. It is available once Informatica has been installed.

Fig: Informatica Administrator Console

The Administrator Console performs the following tasks in the domain:

  • Managing application services: It manages all application services in the domain, including the Integration Service and Repository Service.
  • Configuring nodes: It configures node properties, including the backup directory and resources. It also allows nodes to be shut down and restarted when required.
  • Managing domain objects: It creates and manages objects such as services, nodes, licenses, and folders.
  • Viewing and editing domain object properties: It allows the properties of all objects in the domain to be viewed and edited.
  • Security administrative tasks: It manages users, groups, roles, and privileges.
  • Viewing log events: It uses the log viewer to show log events for the domain, Integration Service, SAP BW Service, Web Services Hub, and Repository Service.

Fig: Administrator Console interface

                           

So, in a nutshell, the client side of Informatica comprises 5 components: Informatica Repository Manager, Informatica PowerCenter Designer, Informatica Workflow Manager, Informatica Workflow Monitor, and Informatica Administrator Console. Together they form the framework of the entire tool. Let's now try to understand the server components of Informatica PowerCenter.

Server Components of Informatica PowerCenter

The PowerCenter server components comprise the following services:

  • Repository service: The Repository service manages the repository. It retrieves, inserts, and updates metadata into the repository database tables.
  • Integration service: The Integration service runs sessions and workflows.
  • SAP BW service: The SAP BW service listens for RFC requests from SAP BW and initiates workflows to extract data from, or load data into, SAP BW.
  • Web services hub: The Web services hub receives requests from web service clients and exposes PowerCenter workflows as services.

Now that we have understood both the client and server components of Informatica, the following infographic explains the flow of data in Informatica, i.e. how data is processed:

Fig: Data flow in Informatica

At this point, it makes sense to understand the other fundamental units in Informatica: Domain and Node, and Service and Service Manager. Let's take a moment to understand them before we go hands-on with Informatica.

Informatica Domain & Nodes: 

The salient features of a Domain are as follows:

  • A Domain is a logical collection or set of nodes and services
  • The PowerCenter Domain is the fundamental administrative unit of PowerCenter
  • A Domain can be a single PowerCenter installation, or it can consist of multiple PowerCenter installations

The salient features of a node are as follows:

  • A node is a logical representation of a physical machine. It has physical attributes such as a hostname and a port number
  • Each node runs a service manager which is responsible for the application and core services
  • A node can be a gateway node or a worker node, but it can belong to only one Domain

Fig: Informatica Domain and Node
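The relationship between a Domain and its nodes can be modelled in a short Python sketch. This is a toy data model, not an Informatica API: the classes `Node`, `Domain`, and `DomainRegistry`, and the host names and port numbers, are all assumptions made up for illustration. It captures two rules stated above: a node is either a gateway or a worker, and a node can belong to only one Domain.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Logical representation of a machine: a name plus hostname and port."""
    name: str
    hostname: str
    port: int
    is_gateway: bool = False  # False means the node is a worker node


@dataclass
class Domain:
    """A logical collection of nodes (services are left out of this sketch)."""
    name: str
    nodes: Dict[str, Node] = field(default_factory=dict)


class DomainRegistry:
    """Tracks which Domain owns each node, enforcing one Domain per node."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}  # node name -> domain name

    def add_node(self, domain: Domain, node: Node) -> None:
        owner = self._owner.get(node.name)
        if owner is not None and owner != domain.name:
            raise ValueError(f"{node.name} already belongs to domain {owner}")
        self._owner[node.name] = domain.name
        domain.nodes[node.name] = node


def gateway_nodes(domain: Domain) -> List[str]:
    """Return the names of the gateway nodes in a Domain."""
    return [n.name for n in domain.nodes.values() if n.is_gateway]


# Hypothetical names, hosts, and ports, for illustration only.
registry = DomainRegistry()
dev = Domain("Domain_Dev")
registry.add_node(dev, Node("node01", "etl-host-1", 6005, is_gateway=True))
registry.add_node(dev, Node("node02", "etl-host-2", 6005))
print(gateway_nodes(dev))
```

Trying to add `node01` to a second Domain through the same registry raises a `ValueError`, which is the sketch's way of expressing the one-Domain-per-node rule.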

Informatica Services & Service Manager:

A service is a resource that provides specialized functions. All PowerCenter processes run as services on a node.

Informatica PowerCenter has two types of services:

  • Application Services represent server based functions including Repository and Integration Services.
  • Core Services represent functions that manage and maintain the environment in which PowerCenter operates and include services like Log Service, Licensing Service, and Domain Service amongst many others.

Service Manager

  • The Service Manager is a service that manages all Domain operations and runs on each node within a Domain
  • On the gateway node, the Service Manager is responsible for the following:
    • Controlling the Domain
    • Managing the services running on the Domain
    • Providing service lookup
  • On all nodes, the Service Manager is meant to control the core services and application services

How different components of PowerCenter interact: 

Fig: Informatica Component Interaction

             

Use Case: How to load a Product Dimension Table using SCD   

Problem statement: Our aim is to load a Product Dimension table using Slowly Changing Dimension (SCD) Type 2 with effective dates.

Given a customer source system that contains the Customer ID, Name, City, State, and Country details of the customers, we need to create a new entry in the target dimension table every time a customer comes with a different value.

To understand this better: if a customer returns with a different value for state or city compared to the value already present in the target dimension table, a new entry has to be created with the updated value. This is achieved by using an SCD-based target table.
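Before going through the PowerCenter steps, it may help to see the SCD Type 2 logic itself in a small, runnable form. The sketch below uses Python's built-in `sqlite3` module rather than PowerCenter. The column names (`customer_key`, `eff_start_date`, `eff_end_date`), the helper `apply_scd2`, and the sample values for Mary are assumptions made for illustration; the actual target table in this use case may be laid out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scd_customer_target (
        customer_key   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_id    INTEGER NOT NULL,
        name TEXT, city TEXT, state TEXT, country TEXT,
        eff_start_date TEXT NOT NULL,
        eff_end_date   TEXT                                -- NULL marks the current version
    )
    """
)


def apply_scd2(conn, row, load_date):
    """Insert a new version of `row` if the customer is new or a tracked attribute changed."""
    current = conn.execute(
        "SELECT customer_key, city, state, country FROM scd_customer_target "
        "WHERE customer_id = ? AND eff_end_date IS NULL",
        (row["customer_id"],),
    ).fetchone()
    tracked = (row["city"], row["state"], row["country"])
    if current is not None:
        if current[1:] == tracked:
            return  # nothing changed, keep the current version
        # Close the old version instead of overwriting it.
        conn.execute(
            "UPDATE scd_customer_target SET eff_end_date = ? WHERE customer_key = ?",
            (load_date, current[0]),
        )
    conn.execute(
        "INSERT INTO scd_customer_target "
        "(customer_id, name, city, state, country, eff_start_date) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (row["customer_id"], row["name"], *tracked, load_date),
    )


# Hypothetical sample data: Mary moves from one state to another.
mary = {"customer_id": 1, "name": "Mary", "city": "Austin", "state": "TX", "country": "USA"}
apply_scd2(conn, mary, "2016-01-01")
apply_scd2(conn, {**mary, "city": "Denver", "state": "CO"}, "2016-06-01")
apply_scd2(conn, {**mary, "city": "Denver", "state": "CO"}, "2016-07-01")  # unchanged, no new row

history = conn.execute(
    "SELECT state, eff_start_date, eff_end_date FROM scd_customer_target "
    "WHERE customer_id = 1 ORDER BY eff_start_date"
).fetchall()
print(history)
```

The old row is kept and only its end date is filled in, so the table holds one closed version and one current version for Mary. That is the behaviour the mapping in the steps below is meant to produce.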

Below is a step-by-step process for loading the product dimension table using SCD.

Step 1: Open PowerCenter Designer.

Informatica-tutorial-domain-creation

Step 2: Connect to the repository

Fig: Establishing connection to Repository

Step 3: Launch the Designer

Fig: Launching PowerCenter Designer

Step 4: Load the source from Database

Fig: Various options to load Source data set

Step 5: Connect to Database

Informatica-tutorial-Connect-to-database- Informatica Tutorial

Step 6: Select SCD_INPUT_DATA table 

Informatica-tutorial-input-table- Informatica Tutorial

Step 7: Similarly load target set from database

Fig: Various options to load Target sets

Step 8: Design a workflow to perform the required operation as seen below

Fig: Workflow Design for Database

Step 9: Launch Oracle SQL Developer and load SCD_CUSTOMER table 

Fig: SCD_CUSTOMER table

Step 10: Modify the values of state for customers Mary and Hannah

Fig: Modifying values of Mary

Fig: Modifying values of Hannah

Step 11: Launch Workflow monitor and execute the workflow

Fig: Executing workflow

 

Fig: Workflow Output

Step 12: Execute the query below to view the target table

  • select * from scd_customer_target;

Fig: Executing SQL query for targeted output
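When reading an SCD Type 2 target, it is often useful to separate the full history from the versions that are currently valid. The following sketch shows both queries against an in-memory SQLite table using Python's `sqlite3` module. The columns `eff_start_date` and `eff_end_date` and all sample rows are assumptions for illustration, with a NULL end date standing for "current"; the real `scd_customer_target` table in this use case may use different column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scd_customer_target ("
    "customer_id INTEGER, name TEXT, state TEXT, "
    "eff_start_date TEXT, eff_end_date TEXT)"
)
# Hypothetical rows: both customers changed state once.
conn.executemany(
    "INSERT INTO scd_customer_target VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Mary", "TX", "2016-01-01", "2016-06-01"),    # superseded version
        (1, "Mary", "CO", "2016-06-01", None),            # current version
        (2, "Hannah", "NY", "2016-01-01", "2016-06-01"),  # superseded version
        (2, "Hannah", "NJ", "2016-06-01", None),          # current version
    ],
)

# Full history, oldest version first for each customer.
full_history = conn.execute(
    "SELECT * FROM scd_customer_target ORDER BY customer_id, eff_start_date"
).fetchall()

# Only the versions that are valid right now.
current_rows = conn.execute(
    "SELECT customer_id, name, state FROM scd_customer_target "
    "WHERE eff_end_date IS NULL ORDER BY customer_id"
).fetchall()

print(len(full_history), current_rows)
```

The unfiltered `select *` returns every version of every customer, while the filter on the end date returns exactly one row per customer.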

Step 13: Product Dimension table output

Fig: Product Dimension table Output

To conclude, the loaded product table contains the historical values of the data, including the changes made to the existing values, and this is achieved using Informatica PowerCenter.

I hope this Informatica Tutorial blog helped build your foundation in Informatica and created enough interest to learn more about it.

If you have already decided to take up Informatica as a career, I would recommend having a look at our Informatica training course page. The Informatica Certification training at Edureka will make you an expert in Informatica through live instructor-led sessions and hands-on training using real-life use cases.

Got a question for us? Please mention it in the comments section and we will get back to you.

              

Comments
  • Sheikh Uddin

    For a beginner like me after the Step 2, Step 3 does not appear in that way unless you create folder in the repository and then start the source analyzer. This whole thing is missing and a newbie can not figure it out. You need to revise this steps.

    • EdurekaSupport

      +Sheikh Uddin, thanks for checking out our blog and for sharing your pain point. We will update the example. Cheers!
