Pentaho Data Integration (PDI) is intended mainly for Extract, Transform, Load (ETL) work. It consists of the following elements:
DI Server (Server Application)
The Data Integration server executes jobs and transformations using the PDI engine. It provides default user- and role-based security and can also be integrated with an existing LDAP/Active Directory security provider. It also lets us store transformations and jobs in one common place.
Design Tool (standalone) – It is for designing jobs and transformations
Spoon – GUI Tool to develop all jobs & transformations
Kitchen – Command-line tool to run jobs
Pan – Command-line tool to run transformations
Carte – Remote ETL Server
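As a rough illustration of how the command-line tools are invoked, the following Python sketch builds (but does not run) a Kitchen or Pan command line. It assumes the Linux shell scripts kitchen.sh and pan.sh with their -file and -level options; on Windows the equivalent scripts are Kitchen.bat and Pan.bat. The install directory and the job and transformation paths are hypothetical placeholders.

```python
from pathlib import PurePosixPath

# Hypothetical install location; adjust to wherever PDI is unpacked.
PDI_HOME = PurePosixPath("/opt/pentaho/data-integration")


def build_pdi_command(path, level="Basic"):
    """Return the argument list that runs a job (.kjb) with Kitchen
    or a transformation (.ktr) with Pan."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".kjb":
        script = "kitchen.sh"   # Kitchen runs jobs
    elif suffix == ".ktr":
        script = "pan.sh"       # Pan runs transformations
    else:
        raise ValueError(f"expected a .kjb or .ktr file, got: {path}")
    return [str(PDI_HOME / script), f"-file={path}", f"-level={level}"]


# Hypothetical file paths, used only for illustration.
job_cmd = build_pdi_command("/etl/jobs/load_dw.kjb")
trans_cmd = build_pdi_command("/etl/trans/DIM_Product.ktr", level="Minimal")
print(" ".join(job_cmd))
print(" ".join(trans_cmd))
```

The resulting list could be passed to subprocess.run, or the joined string pasted into whatever scheduler is in use.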
In a data warehouse, the historical data that the organization already holds is loaded in one go. Since we cannot reload the entire data set into the data warehouse every day, we use an incremental load for the ongoing updates.
An incremental load brings in only the data that has changed at the source. Since we cannot sit and run the job and transformation manually every day, we must schedule the job. In this example we schedule it on a weekly basis using the Windows scheduler, which runs the particular job at a specific time to load the incremental data into the data warehouse. This relies on PDI's ability to run jobs from the command prompt.
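The idea of an incremental load can be sketched outside PDI with a simple "watermark" approach: remember the latest change timestamp already loaded and copy only newer rows. This is one common technique, not necessarily how a given PDI job implements it. The table names, column names, and the in-memory SQLite database below are all assumptions made for illustration.

```python
import sqlite3

# In-memory database standing in for both source and warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_product (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT);
    CREATE TABLE dw_product  (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT);
    INSERT INTO src_product VALUES
        (1, 'Widget', '2024-01-01'),
        (2, 'Gadget', '2024-01-02');
""")


def incremental_load(conn):
    """Copy only rows changed since the newest row already in the warehouse."""
    # The watermark is the latest change timestamp already loaded.
    (watermark,) = conn.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM dw_product"
    ).fetchone()
    changed = conn.execute(
        "SELECT id, name, updated_at FROM src_product WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE overwrites warehouse rows that share a primary key.
    conn.executemany("INSERT OR REPLACE INTO dw_product VALUES (?, ?, ?)", changed)
    conn.commit()
    return len(changed)


first_run = incremental_load(conn)   # empty warehouse: behaves like the full load
conn.execute(
    "UPDATE src_product SET name = 'Gadget v2', updated_at = '2024-01-05' WHERE id = 2"
)
conn.execute("INSERT INTO src_product VALUES (3, 'Gizmo', '2024-01-06')")
second_run = incremental_load(conn)  # only the changed and new rows
third_run = incremental_load(conn)   # nothing changed since the last run
print(first_run, second_run, third_run)
```

A strict "greater than" comparison can miss rows that share the watermark's exact timestamp, so a real job would need a finer-grained timestamp or a different change-detection rule.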
Data Connections – Used for making connections from the source to the target database.
Transformation – Extracts data and loads it into the data warehouse.
What is Spoon?
Spoon is a GUI tool for developing jobs and transformations. It is easy to learn and user friendly. In this example, a transformation named ‘DIM_Product’ is already open. On the left side there are two tabs, View and Design. In the View tab, we build a database connection to get data from, or load data into, the data warehouse. In the Design tab we have different nodes such as:
Input – Steps that extract data from a source.
Output – Steps that load data into a target.
Transform – Steps that apply connectors and logic to the data.
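To make the three node categories concrete, here is a minimal Python sketch of rows streaming from an input, through a transform, to an output. It is only an analogy for how steps are chained in a transformation, not PDI code; the CSV data, the field names, and the list standing in for a target table are hypothetical.

```python
import csv
import io

# Hypothetical source data standing in for a file or table read by an Input step.
SOURCE_CSV = "product_id,name,price\n1,widget,2.50\n2,gadget,10.00\n"


def input_step(text):
    """Input: extract rows from the source."""
    yield from csv.DictReader(io.StringIO(text))


def transform_step(rows):
    """Transform: apply logic to each row as it streams through."""
    for row in rows:
        yield {
            "product_id": int(row["product_id"]),
            "name": row["name"].title(),
            "price": float(row["price"]),
        }


def output_step(rows, target):
    """Output: load rows into the target (a list standing in for a table)."""
    target.extend(rows)
    return len(target)


dim_product = []
loaded = output_step(transform_step(input_step(SOURCE_CSV)), dim_product)
print(loaded, dim_product[0])
```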