What is ETL?
Extract Transform and Load (ETL) typically done by using ETL tools to Extract Data from source systems, Transform and Load in to Data-warehouse for BI Reporting, such as,
- Extract - Extracting the data from Multiple Source Systems (OLTP) containing data in different Format.
- Transform - Applying the Transformations as per the business requirements.
- Load -
Loading the extracted and transformed data in to single format in Data ware
House (Single Centralized Data base (OLAP)).
Fig 1. ETL Data Flow Process
List of Oracle Products which can be used for ETL:
Oracle Data Integrator Enterprise Edition (ODI)
Oracle Data Integrator for Big Data (DIBD)
Oracle Enterprise Data Quality. (Integrated with ODI)
Oracle Warehouse Builder (OWB)
Why ETL?
This is the big question exist in BI World always. When reporting tools has the ability to extract from multiple data sources what is the need for ETL?
Below are the main features which makes ETL as a mandatory
Process in BI Reporting:
Data Cleansing and Data Profiling:
Reporting Tools will provide the template and data flow as per the Report requirements. But it won't Process the data at column or row level in a table. ETL Process ensures that the good and valid data as per the business requirements are available for BI Reporting increasing the credibility and quality of the reports to the Users.
Increasing the Performance of Reporting:
When we fetching the data from single Data ware house (single node) for reporting instead of from multiple sources (multiple nodes), it reduces the time taken for latest data availability in BI Reporting. When we plan to create and run thousands of reports every day for large organizations, this becomes an important aspect in BI. De-normalization and caching Process also plays the vital role which is achieved though ETL and ensures good Performance of BI Reports.
Avoiding the Usage of Live Transaction sources for Reporting:
Since we have Data ware house (OLAP) in a different node (database), we are not directly querying from the Transaction Database and hence there will be no impact on source systems performance due to BI Reporting.
Need of Tools for ETL Process in BI
Can't we achieve this through traditional Programming and scripting languages?
Yes, it is possible to achieve this Process through Programming
Languages. But the ETL tools provide many advantages that are hard to ignore,
not to mention the automation and scheduling capabilities.
Below are the some of the key reasons for using ETL tools:
Data Integration Utilities:
To extract the data from different databases we need to import the respective drivers and packages in the programing code. But the ETL tools are enriched with the utilities to connect to different databases which reduces the ETL developer tasks in data Extraction Part .It also Provides the parallel data extraction which reduces the time consumption for data extraction.
Reducing the dependency on Programming languages:
ETL tools will remove the dependency on Programming languages which reduces the need for complex coding requirements. Also ETL Developers can learn the different ETL tools easily in a short time period. It also makes the Debugging process easier one for Developers as these tools comes up with the enriched logs about failures.
Enriched User Interface in ETL Tools:
ETL tools are built with enriched Interface which will enable BI Techies to achieve the complex business requirements easily. Developers can complete the ETL work by drag and drop of the utilities which are provided in the ETL tools. In this Case, it enables the Developers to focus in the business requirement rather than the ETL code. To conclude this, people who are working in a functional business domain also can learn and work in ETL easily.
Conclusion
In future, ETL Process may evolve
in a different ways. For instance, many ETL tools are enriched with Cloud
Utilities (ODI Cloud Service) and have capabilities to handle the Unstructured
Data. But, if we want to ensure the Data Quality and Report Performance for the
BI Users, then ETL Process is always required. Also Data quality plays a vital
role in BI Reporting as the business decisions and predictions are based on the
Reports generated on BI data.