Rules of Data Normalization
________________________________________
1NF Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.
2NF Eliminate Redundant Data - If an attribute depends on only part of a multi-valued key, remove it to a separate table.
3NF Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
1NF Eliminate Repeating Groups
Moving the known databases into a seperate table helps a lot. Separating the repeating groups of databases from the member information results in first normal form.
2NF Eliminate Redundant Data
Suppose there is a database table, the primary key is made up of the MemberID and the DatabaseID ,separate the attributes depending on both parts of the key from those depending only on the DatabaseID. Then there would be two tables: "Database" which gives the name for each DatabaseID, and "MemberDatabase" which lists the databases for each member.
3NF Eliminate Columns Not Dependent On Key
Each column must depend on *directly* on the primary key
The star schema (sometimes referenced as star join schema) is the simplest style of data warehouse schema. The star schema consists of a few fact tables (possibly only one, justifying the name) referencing any number of dimension tables. The star schema is considered an important special case of the snowflake schema
snowflake schema dimensions are normalized into multiple related tables whereas the star schema's dimensions are denormalized with each dimension being represented by a single table.
Fragmentation
Essbase database fragmentation occurs when blocks are "re-written" within the database
The methods are-
• Sending data to a dense dimension that previously did not have any data - This happens when we have a write back cube where users are adding new data to a dense dimension, Eg.. when adding data to the current month that previously did not have any data.
• Loading data into a cube – sometimes Load rules can also cause fragmentation if it is not set up properly. When loading data into Essbase, we make the sparse dimensions first in the data loads. Here we must make sure that the data source is sorted by the columns that represent the sparse dimensions first.
• Adding incremental data to a cube - Adding recent or new data to an Essbase cube can cause fragmentation if the time dimension is tagged as dense.
There are two ways to "un-fragment" an Essbase cube. One is to clear and reload the data into the cube. The other is to perform a dense restructure
Essbase Restructure
Essbase restructures database files after an outline is changed using Outline Editor or Dimension Build. There are 2 ways to do it.
• Dense restructure: If a member of a dense dimension is moved, deleted, or added, Essbase restructures the blocks in the data files and creates new data files. When Essbase restructures the data blocks, it regenerates the index automatically so that index entries point to the new data blocks. Empty blocks are not removed. Essbase marks all restructured blocks as dirty, so after a dense restructure we must recalculate the database. Dense restructuring, the most time-consuming of the restructures, can take a long time to complete for large databases.
STEPS
1. Creates temporary files that are copies of the following files:
2. Reads the blocks from the database files copied in step 1, restructures the blocks in memory, and stores them in the new temporary files. This step takes the most time.
3. Removes the database files that were copied in step 1
4. Renames the temporary files created in step 1 to the correct file names
Sparse restructure: If a member of a sparse dimension is moved, deleted, or added, Essbase restructures the index and creates new index files. Restructuring the index is relatively fast; the time required depends on the index size.
STEPS
To perform a sparse restructure, Essbase does the following:
1. Renames the dbame.esm file to dbname.esr.
2. Renames the essxxxxx.ind files to essxxxxx.inm.
3. Creates index files (essxxxxx.ind) to store index information that is changed by the restructuring operation.
4. Removes dbname.esr and essxxxxx.inm created in step 1
Outline-only restructure: If a change affects only the database outline, Essbase does not restructure the index or data files. Member name changes, creation of aliases, and dynamic calculation formula changes are examples of changes that affect only the database outline.
OPTIMIZATION AND DATA LOADING IN ESSBASE
Optimization.
We need to optimize loading data and calculation process and retrieval process.
Rules for the optimization:
• We need to choose sparse and dense dimensions carefully.
• We need to arrange the sparse and dense dimensions in hour glass model. That is most dense dimensions in the top and less dense dimensions, less sparse in the middle and most sparse in the last.
• The block size must be between 8k and 100k
Optimizing Data Loading:
• Arrange the source data in the same order of sparse dimensions as mentioned in the outline, because the index is arranged in the same order as they appear in the sparse dimensions.
• Group all sparse members together, so that the data related to a block will be together. We can use order by clause in the select statement.
Parallel Data Load Processing
Essbase provides these essbase.cfg settings: DLTHREADSPREPARE, DLTHREADSWRITE, and DLSINGLETHREADPERSTAGE.
Optimization of Restructure Operations
If a database outline changes frequently we must analyze the outline and the types of changes that we make. Changes to sparse dimensions or attribute dimensions are fast, because only the index changes. Changes to dense dimensions are relatively slow, because data blocks are rebuilt.
These types of restructure operations are listed from fastest to slowest:
• Outline only (no index or data files)
• Sparse (only index files)
• Dense (index files and data files) as a result of adding, deleting, or moving members and other operations.
• Dense (index and data files) as a result of changing a dense dimension to sparse or changing a sparse dimension to dense
Optimization in Hyperion Essbase
Optimization:
Dimension building won't take much time. We need to optimize loading data and calculation process and retrieval process.
Rules for the optimization:
• We need to choose sparse and dense dimensions carefully.
• We need to arrange the sparse and dense dimensions in hour glass model. That is most dense dimensions in the top and less dense dimensions, less sparse in the middle and most sparse in the last.
• The block size must be between 8k and 100k
Optimizing Data Loading:
Apart from the above, we need to follow the below steps to optimize the performance.
• Arrange the source data in the same order of sparse dimensions as mentioned in the outline, because the index is arranged in the same order as they appear in the sparse dimensions.
• Group all sparse members together, so that the data related to a block will be together. We can use order by clause in the select statement.
Optimizing Calculations:
Apart from the above, we need to follow the below steps to optimize the calculations.
• Use intelligence calculation.
• Use label only whenever the parent member don't need any calculations.
• Use parallel calculations when possible.
• Use Dynamic calculations on some members, if we further want to reduce the calc time. But use dynamic calculations limitedly.