Thursday, November 11, 2010

To achieve better performance in OBIEE

The following are a few techniques that improve the performance of OBIEE:
  1. Alias Tables
  2. Using Aggregate Tables
  3. Using Cache
  4. Connection Pool
  5. Turning Off Log Level
  6. Using Database Hints
  7. Using Where Clause Filter

1.) Alias Tables
An alias table (alias) is a physical table of type Alias. It is a reference to a physical table, and it inherits all of its column definitions and some of its properties from that physical table. Alias tables can be an important part of designing the physical layer.

OBIEE doesn’t support self-joins directly, so we use alias tables to implement a self-join in OBIEE.
The following is a list of the main reasons to create an alias table:
  • To reuse an existing table more than once in your physical layer (without having to import it several times)
  • To set up multiple alias tables, each with different keys, names, or joins
  • To help you design sophisticated star or snowflake structures in the business model layer.
To create an alias table in the Physical layer, right-click a table -> New Object -> Alias, and select the source table if required.


NOTE: In previous versions we had to synchronize the alias table with its source tables manually, but in OBIEE an alias is automatically synchronized with its source table; any change to the source table is immediately reflected in the alias table.
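
Alias tables are also what make the self-join mentioned above possible. Below is a minimal sketch of the physical SQL that two aliases of the same table allow the BI Server to generate; the table and column names (W_EMPLOYEE_D, EMP_ID, MGR_ID, EMP_NAME) are hypothetical:

    -- Two aliases of the same physical table (say Dim_Employee and Dim_Manager,
    -- both pointing at the hypothetical W_EMPLOYEE_D) implement an
    -- employee-to-manager self-join:
    SELECT emp.EMP_NAME,
           mgr.EMP_NAME AS MANAGER_NAME
    FROM   W_EMPLOYEE_D emp   -- alias Dim_Employee
    JOIN   W_EMPLOYEE_D mgr   -- alias Dim_Manager
           ON emp.MGR_ID = mgr.EMP_ID;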


2.) Using Aggregate Tables
When we create a summary report, data from the fact table is rolled up to the appropriate level and then shown to the user. The Oracle BI Server lets you register aggregate (summary) tables that contain pre-computed sums, averages, and so on for a fact table; it then uses these in preference to rolling up the detail-level fact table whenever that would speed up a query.
Aggregate tables are registered in the Physical layer; if you need to create the aggregate tables themselves, you can use the Aggregate Persistence Wizard.
First, import the aggregate fact and dimension tables into the Physical layer.
Once the aggregate tables are in the Physical layer, use the Physical Diagram to create foreign key joins between the aggregate fact tables and the aggregate dimension tables.

We’re now at the point where we can map these aggregate tables to the existing logical tables in the business model layer.
We need to identify the existing logical tables in the business model whose columns correspond to the incoming data, and then drag the columns we want to match on over from the new physical table.
We then create a new Logical Table Source to map the logical columns to the physical columns from the aggregate table.
The final step is to tell the BI Server that this new data source for the units measure is only valid at the Month, Sales Rep, and Type levels, by setting the logical levels on the Content tab of the Logical Table Source.
Now, when a query comes in against the Dollars measure at the product Type or customer Sales Rep level, the BI Server will use the D1_ORDER_AGG1 table instead of D1_ORDER2.
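
To sketch what this aggregate navigation means in terms of generated SQL (the fact table names come from the example above; the dimension table and column names are hypothetical):

    -- Without the aggregate mapping, the BI Server rolls up the detail fact table:
    SELECT   d.PROD_TYPE, SUM(f.DOLLARS)
    FROM     D1_ORDER2 f
    JOIN     D1_PRODUCTS d ON f.PROD_ID = d.PROD_ID
    GROUP BY d.PROD_TYPE;

    -- With the mapping in place, the same logical request is answered from the
    -- pre-summarized table, with far fewer rows scanned:
    SELECT   a.PROD_TYPE, SUM(a.DOLLARS)
    FROM     D1_ORDER_AGG1 a
    GROUP BY a.PROD_TYPE;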

3.) Using Cache
Cache in OBIEE:
Cache is a component that improves performance by transparently storing data so that future requests for that data can be served faster. Unlike many application caches, the OBIEE cache is not held in virtual memory; it is stored on the hard disk in the form of files.
To use the cache we need to enable it: this means editing NQSConfig.ini, and also making the relevant physical tables cacheable.
The following NQSConfig.ini parameters are used to enable and control the cache:
ENABLE 
To enable the cache, set the ENABLE parameter to YES.
DATA_STORAGE_PATHS 
This parameter specifies one or more directory paths where the cached query results are stored and are accessed when a cache hit occurs, together with the maximum capacity of each path in bytes, kilobytes, megabytes, or gigabytes. The maximum capacity for each path is 4 GB. For optimal performance, the directories specified should be on high-performance storage systems. Each directory listed needs to be an existing, fully qualified, writable directory pathname, with double quotes ( " ) surrounding the pathname. Specify mapped directories only.
UNC path names (\\server.name.edu\somefolder) and network-mapped drives are allowed only if the service runs under a qualified user account. To change the account under which the service is running, see the corresponding topic in the Oracle Business Intelligence Enterprise Edition Deployment Guide.
Specify multiple directories with a comma separated list.
MAX_ROWS_PER_CACHE_ENTRY 
Specifies the maximum number of rows in a query result set to qualify for storage in the query cache. Limiting the number of rows is a useful way to avoid using up the cache space with runaway queries that return large numbers of rows. If the number of rows a query returns is greater than the value specified in the MAX_ROWS_PER_CACHE_ENTRY parameter, the query is not cached. When set to 0, there is no limit to the number of rows per cache entry.
MAX_CACHE_ENTRY_SIZE 
Specifies the maximum size for a cache entry. Potential entries that exceed this size are not cached. The default size is 1 MB.
Specify GB for gigabytes, KB for kilobytes, MB for megabytes, and no units for bytes.
MAX_CACHE_ENTRIES
Specifies the maximum number of cache entries allowed in the query cache. Limiting the total number of cache entries provides another parameter with which to manage your cache storage. The actual limit of cache entries might vary slightly depending on the number of concurrent queries.
POPULATE_AGGREGATE_ROLLUP_HITS 
Specifies whether to aggregate data from an earlier cached query result set and create a new entry in the query cache for rollup cache hits.
USE_ADVANCED_HIT_DETECTION 
When caching is enabled, each query is evaluated to determine whether it qualifies for a cache hit. A cache hit means that the server was able to answer the query from the cache without going to the database at all. The Oracle BI Server can use the query cache to answer queries at the same or a later level of aggregation; setting this parameter to YES enables an expanded, but more expensive, search of the cache for hits.
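
Putting these parameters together, the [ CACHE ] section of NQSConfig.ini might look like the sketch below; the path and size values are illustrative only, not recommendations:

    [ CACHE ]
    ENABLE = YES;
    // One or more <directory, capacity> pairs, comma separated
    DATA_STORAGE_PATHS = "C:\OracleBIData\cache" 500 MB;
    MAX_ROWS_PER_CACHE_ENTRY = 100000;   // 0 means no row limit
    MAX_CACHE_ENTRY_SIZE = 1 MB;
    MAX_CACHE_ENTRIES = 1000;
    POPULATE_AGGREGATE_ROLLUP_HITS = NO;
    USE_ADVANCED_HIT_DETECTION = NO;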


The following properties are used to make a table cacheable; to set them, edit the physical table's properties in the Administration Tool.
Cacheable 
To include the table in the Oracle BI Server query cache, select this check box. When you select this check box, the Cache persistence time settings become active.
Cache never expires
When you select this option, cache entries do not expire. This can be useful when a table is important to a large number of the queries users might run. For example, if most of your queries reference an account object, keeping it cached indefinitely could actually improve performance rather than compromise it.
Cache persistence time
Specifies how long table entries should persist in the query cache. The default value is Infinite, meaning that cache entries do not automatically expire. However, this does not mean that an entry will always remain in the cache. Other invalidation techniques, such as manual purging, LRU (Least Recently Used) replacement, metadata changes, and use of the cache event polling table, result in entries being removed from the cache. If a query references multiple physical tables with different persistence times, the cache entry for the query will exist for the shortest persistence time set for any of the tables referenced in the query. This makes sure that no subsequent query gets a cache hit from an expired cache entry.

4.) Connection Pool
Most of the time, not much thought is given to defining connection pools while developing the RPD. An improperly defined connection pool can hurt OBIEE performance and the user experience. Here are some of the things to consider when defining connection pools:

  • Change the default maximum connections. The default is 10; based on your system usage, change this to a value that is realistic for your workload.
  • Create a separate connection pool for execution of session variables
  • Create a separate connection pool for the execution of the Aggregate Persistence Wizard. Remember that you need to give this connection pool the schema owner's credentials, as the wizard creates and drops tables.
  • If need be, create a separate connection pool for VVIPs. You can control who gets to use a connection pool through the connection pool permissions.
5.) Turning Off Log Level
Though query logging has immeasurable development value, do not use it for regular production users, as the runtime logging cost is extremely high. Every log item is flushed to disk, which in turn hurts query response. Also note that query log files are not created on a per-user or per-query basis; there is only one query log per OBIEE server, and the server holds an exclusive lock on the log file, which kills concurrent performance. The log file is NQQuery.log, which resides in the OracleBI directory.

Logging Level - Information That Is Logged

Level 0: No logging.

Level 1: Logs the SQL statement issued from the client application; elapsed times for query compilation, query execution, query cache processing, and back-end database processing; the query status (success, failure, termination, or timeout); and the user ID, session ID, and request ID for each query.

Level 2: Logs everything logged in Level 1. Additionally, for each query, logs the repository name, business model name, presentation catalog (called Subject Area in Answers) name, SQL for the queries issued against physical databases, queries issued against the cache, the number of rows returned from each query against a physical database and from queries issued against the cache, and the number of rows returned to the client application.

Level 3: Logs everything logged in Level 2. Additionally, adds a log entry for the logical query plan, for when a query that was supposed to seed the cache was not inserted into the cache, for when existing cache entries are purged to make room for the current query, and for when the attempt to update the exact match hit detector fails. Do not select this level without the assistance of Technical Support.

Level 4: Logs everything logged in Level 3. Additionally, logs the query execution plan. Do not select this level without the assistance of Technical Support.

Level 5: Logs everything logged in Level 4. Additionally, logs intermediate row counts at various points in the execution plan. Do not select this level without the assistance of Technical Support.

Levels 6 and 7: Reserved for future use.
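
The level can also be overridden for a single request: logical SQL sent to the BI Server (for example through the Issue SQL utility) can set the LOGLEVEL request variable. A sketch, with a hypothetical subject area and columns:

    -- Run this one request with Level 2 logging; "Sales", "Products" and
    -- "Dollars" are hypothetical names.
    SET VARIABLE LOGLEVEL = 2;
    SELECT Products."Prod Type", Measures."Dollars"
    FROM "Sales"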

6.) Using Database Hints
Database hints are instructions placed within a SQL statement that tell the database query optimizer the most efficient way to execute the statement. Hints override the optimizer's execution plan, so you can use hints to improve performance by forcing the optimizer to use a more efficient plan.
Hints are database specific. The Oracle BI Server supports hints only for Oracle 8i, 9i, and 10g servers.

You can apply hints to the following objects:
  1. Physical Complex Join
  2. Physical foreign key
  3. Physical Table – Object type Alias
  4. Physical Table – Object type None
Hints that are well researched and planned can result in significantly better query performance. However, hints can also negatively affect performance if they result in a suboptimal execution plan.
You should only add hints to a repository after you have tried to improve performance in the following ways:
  • Added physical indexes (or made other physical changes) to the Oracle database.
  • Made modeling changes within the server.
Also, avoid creating hints for physical table and join objects that are queried often.
To create a hint:
1. Navigate to one of the following dialog boxes:
  • Physical Table—General tab
  • Physical Foreign Key
  • Physical Join—Complex Join
2. Type the text of the hint in the Hint field and click OK.
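
As an illustration, suppose the hint index(D1_ORDER2 D1_ORDER2_PROD_IDX) is entered on the physical table (the index name is hypothetical). The hint then appears inside the physical SQL that the BI Server generates, in standard Oracle hint-comment form:

    -- Physical SQL with the hint embedded; D1_ORDER2_PROD_IDX is a
    -- hypothetical index on D1_ORDER2.
    SELECT /*+ index(D1_ORDER2 D1_ORDER2_PROD_IDX) */
           PROD_ID, SUM(DOLLARS)
    FROM   D1_ORDER2
    GROUP BY PROD_ID;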

7.) Using Where Clause Filter
The WHERE clause filter is used to constrain the physical tables referenced in the logical table source. If there are no constraints on the aggregate source, leave the WHERE clause filter blank.
A filter applied in the WHERE clause is applied to the physical table, so each query against that table fetches only the restricted, required data; rows that are not needed are never fetched.
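
For example, a logical table source that should only ever expose shipped orders might carry a filter like the one below (the column and value are hypothetical); every physical query against that source then includes the condition automatically:

    -- Filter entered in the logical table source's WHERE clause:
    --   D1_ORDER2.STATUS = 'Shipped'
    -- The resulting physical SQL always carries the restriction:
    SELECT PROD_ID, SUM(DOLLARS)
    FROM   D1_ORDER2
    WHERE  STATUS = 'Shipped'
    GROUP BY PROD_ID;
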
Some more ways to achieve better performance in OBIEE:
  • Proper Design: There is simply no substitute for a proper data model design. Follow the rules of Dimensional Modeling and optimize your data model for query performance; this is what Dimensional Modeling is all about to begin with.
  • Move Complex Logic to ETL: In some cases the structure of the data model is sound, but some of the calculations require very complex statements or possibly even multiple passes; move these calculations into the ETL so they are computed once, at load time.
  • Use Database Performance Features: Every database at least has indexes that can be used to improve certain queries with minimal effort. The major databases go much further with advanced features and parameters - use them.
  • Use Fewer Reports: If possible, use one report with multiple views instead of multiple reports that contain similar data. This reduces the number of queries executed, but beware - it could increase the complexity of the queries sent to the database, which may negate any benefit.
  • Reduce Record Volumes: Wherever possible, try to reduce the number of records that any report returns. Although this means eliminating data dumps, it also includes having default values in prompts and reports, so that when a user first enters a page a minimal record set is returned instead of an unrestricted query.
The size of the RPD and the size of the WebCat really have no bearing on performance, aside from the startup times of the server components.

Post Load Processing (PLP) in ETL

The information loaded into the Business Analytics Data Warehouse is further summarized and enriched, to facilitate quick retrieval for analysis, by pre-aggregating voluminous information and performing complex calculations. This task is performed by the post-load processing layer packaged in Oracle BI Applications. Since the processing involves taking information from the Business Analytics Data Warehouse base/transactional tables and loading it into other aggregate warehouse tables, this layer is source independent. Post Load Processing is designed to reduce extraction time windows by freeing up source systems as quickly as possible, to support today’s complex business environments with 24x7 usage.

Since the processing is focused on supporting quick and complex analysis needs, the nature of the processing required depends on the specific analysis. Hence, a lot of flexibility is built into this layer so it can be configured for specific analysis needs - e.g. instead of monthly aggregates, weekly aggregates might be more beneficial to analysts, or rather than building periodic snapshots for products it might be better to have periodic snapshots for product families. The post load processing layer is architected to make such requirements easy to configure. Additionally, other summaries can be built with the building blocks used in the post load processing layer.
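
As a sketch of the kind of work this layer does, a post-load aggregate build might boil down to SQL like the following; all table and column names here are hypothetical:

    -- Summarize the transactional fact table W_SALES_F into a monthly
    -- aggregate W_SALES_MONTH_A (hypothetical BI Apps-style names):
    INSERT INTO W_SALES_MONTH_A (MONTH_WID, PRODUCT_WID, SALES_AMT)
    SELECT   d.MONTH_WID, f.PRODUCT_WID, SUM(f.SALES_AMT)
    FROM     W_SALES_F f
    JOIN     W_DAY_D d ON f.DAY_WID = d.ROW_WID
    GROUP BY d.MONTH_WID, f.PRODUCT_WID;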

The post load processing layer provides several benefits:

• Since it is a source-independent layer, configuration of post load processing is shielded from the operational sources, regardless of the number of source systems from which data is extracted and loaded into the Business Analytics Data Warehouse.
• Complex metrics that require data from many periods, or complex statistical or data-mining algorithms, are computed in the post load processing layer – for example, forecasts, profitability indexes, and sensitivity indexes.
• Normally, transaction-grain tables capture facts that actually occurred. However, for many analyses, what did not happen may be more interesting - e.g. which products on promotion did not show a sales improvement, or which customers that were offered incentives did not buy. Such information can be derived in the post load processing layer.