Tuesday, November 23, 2010

What is DAC?

Data Warehouse Administration Console (DAC)
The DAC is a metadata-driven “ETL process management tool” that provides a framework for the entire life cycle of a data warehouse implementation. It enables ETL administrators to create, configure, execute, and monitor modular data warehouse applications in a parallel, high-performing environment. The DAC complements the ETL platform by providing application-specific capabilities that are not prebuilt into ETL platforms. For example, ETL platforms are not aware of the semantics of the subject areas being populated in the data warehouse, nor of the method by which they are populated. The DAC provides the following application capabilities at a layer of abstraction above the ETL execution platform:

• Dynamic generation of subject areas and execution plans
• Dynamic settings for parallelism and load balancing
• Intelligent task queue engine based on user-defined and computed scores
• Automatic full and incremental mode awareness
• Index management for ETL and query performance
• Embedded high-performance Siebel OLTP change capture techniques
• Ability to restart at any point of failure
• Phase-based analysis tools for isolating ETL bottlenecks
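The intelligent task queue engine mentioned above can be pictured as a scheduler that orders ETL tasks by a computed score and dispatches them up to a parallelism limit. The following is a minimal sketch, assuming hypothetical task names, a made-up scoring formula (user priority, then longest average runtime first), and a fixed parallelism setting; none of these come from the DAC itself.

```python
# Hypothetical task records: (name, user_priority, avg_runtime_seconds).
# The field names and scoring formula are illustrative assumptions,
# not actual DAC metadata.
tasks = [
    ("SDE_OrderLines", 1, 1800),
    ("SDE_Products", 2, 300),
    ("SIL_OrderFact", 1, 2400),
    ("SDE_Customers", 2, 600),
]

def score(task):
    """Higher-priority, longer-running tasks dispatch first, so the longest
    chains start early and the overall load window shrinks."""
    name, priority, runtime = task
    return (priority, -runtime)  # lower tuple sorts first

queue = sorted(tasks, key=score)
MAX_PARALLEL = 2  # stands in for a dynamic parallelism setting

running = queue[:MAX_PARALLEL]   # dispatched now
waiting = queue[MAX_PARALLEL:]   # dequeued as slots free up
```

Scoring by expected runtime is one common heuristic for shrinking a load window; the DAC additionally computes scores from execution history and user-defined weights.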



The benefits of the DAC include:
• Pin-point deployment of required Oracle BI Applications modules
• Load balancing / parallel loading
• Reduced load windows
• Fine-grained failure recovery
• Index management
• Database statistics collection

ETL Process Overview

In the Oracle BI Apps architecture we have seen that ETL is one of the key components. In this post, let us see what exactly ETL does...

The following figure illustrates the ETL overview:


In the figure above we can see two stages: Source Dependent Extract (SDE) and Source Independent Load (SIL).
Let us look at each component of Source Dependent Extract (SDE) in detail.

Business Component: 
The Business Component presents source data elements in commonly understood business terms. It translates operational data from your sources into meaningful business terms, hiding the complexity of the source systems. The Business Component performs joins in the data source to help extract data efficiently and to facilitate incremental extraction. The Business Component is part of Source Dependent Extract (extract mapping) that enables data extracts into universal staging tables in preparation for the source independent loads. Business components and SDE mappings provide significant out-of-the-box value as described below:


• Each operational system contains thousands of tables, and it takes significant time and effort to determine which of them are required for integrating data from the operational source into the Business Analytics Data Warehouse. The prebuilt business components identify the required tables out of the box, providing significant time and effort savings.

• The mechanism to extract data from various operational systems differs: some have a proprietary API (e.g., ABAP in SAP R/3); others need access to underlying database tables. Oracle BI Applications’ business component configuration has a uniform architecture that leverages different tools from the data integration platform (e.g., Informatica PowerConnect for SAP R/3, Informatica PowerConnect for PeopleSoft, etc.) to build the extraction layer, which hides these complexities.

• The components are architected to extract data from the operational sources in the least intrusive form. For example, rather than joining tables in the operational systems again and again for different data warehouse entities, the extract components get the data once, which is then leveraged wherever required. This reduces the data warehouse load on the operational systems, freeing up cycles for supporting operations, which is what they are meant to do.

• Wherever possible, SDE mappings are designed to extract data from the operational sources in an incremental fashion and refresh the enterprise data warehouse. This reduces the load on the operational systems considerably and significantly reduces the time required to refresh the data warehouse.

• The SDE mappings are designed to leverage indexes in the operational systems that can speed up the data extraction.

• SDE mappings have a configuration layer where business rules specific to the implementation can be configured. This avoids spaghetti code and results in a consistent, highly maintainable data integration environment.

• The data integration components are provided with a sample extension column (X_CUSTOM) that can be used as a reference for bringing additional custom fields into the Business Analytics Data Warehouse. Hence the task for extending the data model is reduced to identifying the right source columns, adding them to the data model and connecting it in the extraction layer using the sample column as a reference. This significantly reduces the time and effort required to extend the data model with required fields.

• Adherence to strict naming conventions for the data integration components helps increase understandability and maintainability. The data integration components are organized such that integration tasks can be run in parallel wherever possible, depending on the availability of resources. This can help reduce the time required for extracting data from the operational systems and refreshing the enterprise data warehouse.
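The incremental extraction described above typically compares a source audit column against the date of the last successful refresh: no refresh date means a full load, otherwise only changed rows are pulled. A minimal sketch in Python, assuming a `LAST_UPD` audit column and made-up row data (the names are illustrative, not actual Oracle BI Applications metadata):

```python
from datetime import datetime

# Illustrative source rows; LAST_UPD mimics the audit column that Siebel
# and other operational sources maintain on each record.
source_rows = [
    {"ROW_ID": "1-A", "NAME": "Acme",   "LAST_UPD": datetime(2010, 11, 20)},
    {"ROW_ID": "1-B", "NAME": "Globex", "LAST_UPD": datetime(2010, 11, 22)},
]

def extract(rows, last_refresh_date=None):
    """Full mode when no refresh date is recorded; otherwise incremental:
    pull only rows changed since the last successful run."""
    if last_refresh_date is None:
        return list(rows)  # full load
    return [r for r in rows if r["LAST_UPD"] > last_refresh_date]

full_load = extract(source_rows)
delta = extract(source_rows, last_refresh_date=datetime(2010, 11, 21))
```

In the real mappings the filter is pushed into the source query (and can use an index on the audit column, per the point above about leveraging indexes), rather than filtering rows after extraction.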

Source Adapter:
The Source Adapter Mapplet, which is part of the SDE (Source Dependent Extract) mapping, converts source-specific data elements from the source tables into the source-independent data elements required by the Source Independent Load mappings, standardizing codes for the data warehouse.
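The code standardization the Source Adapter performs can be sketched as a translation of source-specific codes into a common domain. The status codes and column names below are assumptions for illustration only, not the actual conformed values used by Oracle BI Applications:

```python
# Hypothetical source-specific status codes mapped to one
# source-independent domain.
EBS_STATUS = {"BOOKED": "ORDER_BOOKED", "CLOSED": "ORDER_CLOSED"}
SIEBEL_STATUS = {"Open": "ORDER_BOOKED", "Complete": "ORDER_CLOSED"}

def adapt(row, code_map):
    """Source Adapter step: translate the source status code into the
    source-independent domain value; pass other columns through."""
    out = {k: v for k, v in row.items() if k != "STATUS"}
    out["W_STATUS_CODE"] = code_map.get(row["STATUS"], "UNSPECIFIED")
    return out

ebs_row = adapt({"ORDER_ID": 101, "STATUS": "BOOKED"}, EBS_STATUS)
siebel_row = adapt({"ORDER_ID": 202, "STATUS": "Open"}, SIEBEL_STATUS)
```

After this step, downstream SIL mappings see the same domain value regardless of which source the row came from.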

Universal Staging Area:
Universal staging tables are used as temporary storage for extracted source data prior to transformation in the Source Independent Load process for loading into the Business Analytics Data Warehouse. The two main purposes of the staging tables are:
• Quick and efficient data extraction: Staging tables allow for faster data extraction, leaving complex transformations for later and allowing for a shorter operational window.
• Flexibility with data load: Staging tables integrate data from different source systems.
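The staging pattern above amounts to dumping extracted rows as-is and deferring transformation. A small sketch using an in-memory SQLite database as a stand-in staging area; the table and column names are illustrative, not the actual `W_*_DS` staging definitions:

```python
import sqlite3

# In-memory stand-in for a universal staging table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE W_ORDER_DS "
    "(ORDER_ID INTEGER, AMOUNT REAL, DATASOURCE_NUM_ID INTEGER)"
)

# Extract phase: insert rows from each source as-is, with no joins or
# lookups yet. This keeps the window on the operational systems short.
conn.executemany(
    "INSERT INTO W_ORDER_DS VALUES (?, ?, ?)",
    [(101, 250.0, 1), (202, 90.0, 2)],  # rows from two different sources
)

# Transformation happens later, off the operational systems, reading
# the integrated data back out of the staging table.
staged_count = conn.execute("SELECT COUNT(*) FROM W_ORDER_DS").fetchone()[0]
```

A source identifier column (here `DATASOURCE_NUM_ID`) is what lets one staging table integrate rows from multiple systems, matching the second purpose listed above.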

For Source Independent Load (SIL), refer to the previous post...



Source Independent Load (SIL) in ETL

The Source Independent Load Mapplet is the key component of the Source Independent layer of the Oracle BI Applications. It represents fundamental business concepts such as sales orders, purchase orders, products, customers, and suppliers from various operational systems such as Oracle EBS, PeopleSoft, Oracle’s Siebel CRM, and SAP R/3. These systems often vary widely in how each represents or defines these concepts, creating challenges for an analytic solution that must integrate data from many operational sources. The Source Independent Mapplet provides core consistency through a consolidation layer that “normalizes” these differences into a single common definition of each fundamental business concept across the entire value chain. The transformations packaged inside the Source Independent Mapplet consolidate the data from multiple source systems by enforcing commonalities and by preparing the data for source-independent loading into the enterprise data warehouse.

The transformations performed by the SIL (Source Independent Load) Mapplet include:
• Looks up descriptions for data elements that are normally present as codes in the operational sources. The SIL Mapplet provides the flexibility to bypass the lookup when descriptions are provided in the input. To truly integrate data from multiple sources, the domain values for data elements should be standardized, and source-supplied values need to be transformed to the corresponding values from the standardized set. The SIL Mapplet supports this functionality.
• The Business Analytics Data Warehouse maintains relationships between fact tables and dimension tables using surrogate keys for better query performance. The SIL Mapplet generates the surrogate keys used for maintaining dimension tables.
• The SIL Mapplet supports Type I and Type II slowly changing dimensions. The SIL Mapplet of a dimension provides a Type II flag, which can be configured to indicate a Type II change depending on the data elements for which history tracking is required. The SIL Mapplet handles the creation of a new record in the dimension table for each Type II change, and all records are effective-dated to facilitate dimension key lookups.
• The SIL Mapplet has a dimension key lookup transformation for resolving the relationships between fact tables and dimension tables. Since the Business Analytics Data Warehouse has Type II support for slowly changing dimensions, the SIL Mapplet performs the lookup using the transaction date to preserve history.
• Currency conversions are performed by exchange rate lookups depending on the transaction date. SIL Mapplets provide complete flexibility and bypass the currency conversion logic when the amounts in all three global currency denominations are provided as input.
• The SIL Mapplet has transformations to determine the update strategy for an input record: whether it needs to be inserted into the data warehouse, updated in place, or handled as a Type II slowly changing dimension change. In the latter case, a new record is inserted and the corresponding existing record in the data warehouse is updated (end-dated).
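The surrogate-key, Type I / Type II, and update-strategy behavior described in the bullets above can be sketched together. This is a minimal illustration, not the actual SIL logic: the column names (`ROW_WID`, `INTEGRATION_ID`, `CURRENT_FLG`, `EFFECTIVE_FROM_DT`/`EFFECTIVE_TO_DT`) echo common warehouse conventions but are assumptions here, as is the in-memory list standing in for the dimension table.

```python
from datetime import date

dimension = []   # stand-in for the dimension table
next_key = [1]   # surrogate key generator

def load(natural_key, attrs, effective_date, type2_cols):
    """Decide insert vs. Type I update vs. Type II change for one record."""
    current = next((r for r in dimension
                    if r["INTEGRATION_ID"] == natural_key and r["CURRENT_FLG"]),
                   None)
    if current is None:
        action = "INSERT"
    elif any(current[c] != attrs[c] for c in type2_cols):
        # Type II: end-date the old row, then insert a new current row.
        current["CURRENT_FLG"] = False
        current["EFFECTIVE_TO_DT"] = effective_date
        action = "SCD2"
    else:
        current.update(attrs)  # Type I: overwrite in place, keep the key
        return "UPDATE"
    row = {"ROW_WID": next_key[0], "INTEGRATION_ID": natural_key,
           "CURRENT_FLG": True, "EFFECTIVE_FROM_DT": effective_date,
           "EFFECTIVE_TO_DT": None, **attrs}
    next_key[0] += 1
    dimension.append(row)
    return action

a1 = load("CUST-1", {"NAME": "Acme", "REGION": "EAST"}, date(2010, 1, 1), ["REGION"])
a2 = load("CUST-1", {"NAME": "Acme Corp", "REGION": "EAST"}, date(2010, 6, 1), ["REGION"])
a3 = load("CUST-1", {"NAME": "Acme Corp", "REGION": "WEST"}, date(2010, 11, 1), ["REGION"])
```

Because each Type II change leaves behind an effective-dated row, a fact load can resolve the dimension key by comparing its transaction date against `EFFECTIVE_FROM_DT`/`EFFECTIVE_TO_DT`, which is the history-preserving lookup described above.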

The SIL Mappings provide several key benefits:
• Data integration from every operational source uses the SIL Mapplet to load data into the Business Analytics Data Warehouse, enforcing consistency.
• Most of the transformations required to load the Business Analytics Data Warehouse are encapsulated in the SIL Mapplet, which is reused for building data integration components from every source. This results in significant savings of time and effort required to build the data integration components, and reduces overall risk.
• The SIL Mapplet provides a lot of flexibility in the transformations that it encapsulates. The behavior can be controlled by configuring the inputs according to need. The SIL’s ability to resolve integration issues and move data across different platforms makes it an invaluable resource for any data warehouse administrator.