The architecture of a software product serves as a skeletal system that enables the structured flow of data and processes. To get the most out of your software, you must have a clear understanding of its architecture. Through this blog on Talend architecture, I am going to give you a complete insight into the internal as well as the functional architecture of Talend.
In this Talend architecture blog, I will cover what Talend is, its major products, the internal architecture of Talend Open Studio, and its functional architecture.
Talend is an open source software integration vendor. It provides software and services for areas such as data integration, data quality, Big Data and cloud integration.
According to the Gartner Magic Quadrant 2017, Talend is recognized as a global leader in big data and cloud integration solutions.
In case you want to know more about the features Talend offers, you may refer to this blog on What Is Talend.
Taking all of this into consideration, you can easily conclude that Talend is highly unlikely to leave the market anytime soon. More and more companies are using Talend, which has strengthened its hold on the market. Currently, Talend holds 19.3% of the total market share.
Talend provides software that helps companies become data driven by making data more accessible, improving its quality, and quickly moving it to where it’s needed for real-time decision making. Talend is also known as the non-programmer’s Swiss Army knife for Big Data. It makes working with Big Data technologies such as Hadoop, Hive, Spark and Pig really simple, as you do not need to write even a single line of code.
Since its founding in 2005, Talend has released a wide range of products and services. In the next section of this blog on Talend architecture, let’s take a look at a few of its major products.
The list of products includes licensed versions, open source versions, and platforms.
Among all these products, the Talend Open Studio editions are the most commonly used, because they are open source and therefore free to download and use. Open Studio is the best tool to get you started and comes with almost all the functions you need to process your data. If you want to increase your productivity, collaboration and return on investment, you can go for the enterprise versions. As the name suggests, the enterprise products are best suited for commercial use. However, the enterprise versions are not on our discussion list for today, so let’s focus on the Open Studios and move ahead with this blog on Talend architecture.
In the next section, I will try to explain the internal architecture of Talend Open Studio, which makes Talend so powerful yet user-friendly.
But before I explain the internal workings of Talend Open Studio (TOS), let me give you a quick overview of it.
Talend Open Studio is built on Eclipse RCP and supports ETL-oriented implementations. It is generally used for on-premises deployment and is widely used for integration between operational systems, ETL processes and much more. Through its GUI, you can access the metadata repository, which contains the definitions and configurations for each process performed in Talend. As you might know, Talend’s GUI is interactive and user-friendly: all you need to do is drag, drop and link the components to build a task. To execute the task, just click the ‘Run’ button in the Run tab, and TOS handles the rest.
But have you ever wondered what happens at the back end? The basic Talend architecture described below shows how TOS handles Jobs internally.
Well, at the back end, TOS stores the Jobs and the business models that you create in its GUI in an XML format. Whenever you execute a Job, the code generator converts it into Java code (older TOS releases could also generate Perl). Business models, by contrast, are design diagrams that document the process and are not executed themselves.
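To make the idea of a code generator a little more concrete, here is a minimal sketch in Python. Please note that the XML layout, the `generate_java` function and the job name are all made up for illustration; Talend's real job files and its real generator are far richer. The sketch only shows the general pattern: read a stored XML description of a Job and emit Java source text from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified job description. Talend's actual job
# files use a much richer schema; this layout exists only for illustration.
JOB_XML = """
<job name="CopyCustomers">
  <component id="in" type="tFileInputDelimited"/>
  <component id="out" type="tLogRow"/>
  <link from="in" to="out"/>
</job>
"""


def generate_java(job_xml: str) -> str:
    """Turn the toy XML job description into Java source text."""
    job = ET.fromstring(job_xml)
    # Map each component id to its component type for the comments below.
    types = {c.get("id"): c.get("type") for c in job.findall("component")}
    lines = [
        f"public class {job.get('name')} {{",
        "    public static void main(String[] args) {",
    ]
    # Emit one comment and one statement per link between components.
    for link in job.findall("link"):
        src, dst = link.get("from"), link.get("to")
        lines.append(f"        // {src} ({types[src]}) -> {dst} ({types[dst]})")
        lines.append(f'        System.out.println("row: {src} -> {dst}");')
    lines += ["    }", "}"]
    return "\n".join(lines)


java_source = generate_java(JOB_XML)
print(java_source)
```

Running the script prints a small Java class named after the Job, which is roughly the kind of artifact a generator hands over to the Java runtime.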
Now that you have a basic understanding of how Talend Open Studio works, let’s take a look at the functional architecture of Talend.
The functional architecture is a model that identifies Talend’s functions, how they interact with each other, and the IT needs they respond to. It breaks Talend Open Studio down into the following functional blocks:
The first block is responsible for administering and monitoring the Jobs. Here, you can find at least one Studio to carry out various data integration processes, irrespective of data volumes and process complexity. One thing you must note is that you need proper authorization to work on any project in Talend Studio.
This block contains a web-based Administration Center (i.e., an application server) with two shared repositories. One of these is based on an SVN server, while the other is based on a database server. The Administration Center is responsible for the management and administration of all projects. The database server stores the administration metadata, such as user accounts, access rights and project authorization, whereas the SVN server stores the project metadata, such as Jobs, Business Models, Routines, Routes and Services. This makes it easier for end users to share data.
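The split between the two repositories can be sketched as follows. This is only an in-memory illustration and not Talend's API: the `AdminStore` and `ProjectStore` classes, the `open_project` function, and the user and project names are all hypothetical. It shows the idea that authorization is checked against the administration metadata before any project metadata is handed out.

```python
from dataclasses import dataclass, field


@dataclass
class AdminStore:
    """Stand-in for the database server: accounts and project authorization."""
    authorizations: dict = field(default_factory=dict)  # user -> set of projects

    def is_authorized(self, user: str, project: str) -> bool:
        return project in self.authorizations.get(user, set())


@dataclass
class ProjectStore:
    """Stand-in for the SVN server: Jobs, Routines and other project items."""
    items: dict = field(default_factory=dict)  # project -> list of item names


def open_project(admin: AdminStore, projects: ProjectStore,
                 user: str, project: str) -> list:
    """Return the project's items only if the user is authorized for it."""
    if not admin.is_authorized(user, project):
        raise PermissionError(f"{user} is not authorized for {project}")
    return projects.items.get(project, [])


# Hypothetical sample data.
admin = AdminStore(authorizations={"asha": {"sales_dw"}})
projects = ProjectStore(items={"sales_dw": ["Job: load_orders",
                                            "Routine: clean_names"]})

items = open_project(admin, projects, "asha", "sales_dw")
print(items)
```

Keeping the two stores separate means access rights can change without touching the versioned project items, and vice versa.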
This block is responsible for the execution and deployment of the Jobs. You can deploy one or more Job Servers inside your information system. These servers run the Jobs, or technical processes, according to the scheduled time, date or event set in the Talend Administration Center web application. An end user can also transfer any Job to a remote execution server directly from a Studio, which is called a ‘distant run’ in Talend.
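As a rough illustration of time-based scheduling across several Job Servers, consider the sketch below. It is not how Talend implements its scheduler: the `dispatch_due_jobs` function, the round-robin assignment, and the job and server names are all assumptions made for this example.

```python
from datetime import datetime
from itertools import cycle


def dispatch_due_jobs(schedule: dict, servers: list, now: datetime) -> dict:
    """Assign every job whose trigger time has passed to a server, round-robin."""
    assignments = {}
    server_cycle = cycle(servers)
    # Walk the jobs in order of their trigger time, earliest first.
    for job_name, run_at in sorted(schedule.items(), key=lambda kv: kv[1]):
        if run_at <= now:
            assignments[job_name] = next(server_cycle)
    return assignments


# Hypothetical schedule: job name -> time at which it should run.
schedule = {
    "load_orders": datetime(2024, 1, 1, 2, 0),
    "refresh_dims": datetime(2024, 1, 1, 1, 0),
    "weekly_report": datetime(2024, 1, 7, 6, 0),
}
servers = ["jobserver-a", "jobserver-b"]

assignments = dispatch_due_jobs(schedule, servers, now=datetime(2024, 1, 1, 3, 0))
print(assignments)
```

Here the two Jobs that are already due are spread over the two servers, while `weekly_report` stays unassigned until its trigger time arrives.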
With this, we come to the end of this blog on Talend architecture. I hope it was informative and that you enjoyed reading it. To know more about Talend, you can refer to this Talend Tutorial blog.