The best technique to make large data loads faster is to use the optimal order of dimensions in the source file. Order the fields in your source file (or SQL statement) according to the hourglass outline: list dimensions from the bottom of the outline upwards, so sparse dimensions come first and dense dimensions last, and if you have multiple data columns, those columns should be members of a dense dimension. Sorted this way, blocks are created and filled with data in sequence, which makes the data load faster and leaves the cube less fragmented.
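The sorting idea can be sketched in a few lines of Python. This is an illustrative model, not an Essbase API: the dimension and member names are made up, and the point is only that sorting records by their sparse member combination makes all rows for one block contiguous.

```python
# Illustrative sketch (not Essbase code): sort source records so that all
# rows belonging to one block (one sparse member combination) are adjacent.
# Dimension names below are hypothetical examples.
SPARSE_DIMS = ["Market", "Product"]  # assumed outline order of sparse dimensions

records = [
    {"Market": "East", "Product": "Cola", "Jan": 100, "Feb": 110},
    {"Market": "West", "Product": "Cola", "Jan": 90, "Feb": 95},
    {"Market": "East", "Product": "Cola", "Mar": 120},
    {"Market": "West", "Product": "Root Beer", "Jan": 50},
]

def block_key(rec):
    """Key a record by its sparse member combination, in outline order."""
    return tuple(rec[d] for d in SPARSE_DIMS)

records.sort(key=block_key)
# All East/Cola rows are now adjacent, so the corresponding block is
# created and filled once instead of being revisited later in the load.
```

Note that the dense values (Jan, Feb, Mar) ride along as data columns, matching the advice above that multiple data columns should be dense dimension members.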
Calculation order of the dimensions.
As part of optimization, re-order the dimensions in the outline in hourglass order:
- Dense dimension with the largest number of members
- Dense dimension with the smallest number of members
- Sparse dimension with the smallest number of members
- Sparse dimension with the largest number of members
- Attribute dimensions
Essbase then calculates the dimensions in the following order:
- The dimension tagged Accounts, if it is dense
- The remaining dense dimensions, in outline or CALC DIM statement order
- The dimension tagged Accounts, if it is sparse
- The remaining sparse dimensions, in outline or CALC DIM statement order
- Two-pass calculations on members of the dimension tagged Accounts
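The hourglass arrangement above amounts to a simple two-part sort: dense dimensions from largest to smallest, then sparse dimensions from smallest to largest. A sketch, using made-up dimension names and member counts:

```python
# Illustrative sketch: arrange dimensions in hourglass order.
# Each tuple is (name, member_count, is_dense); the data is hypothetical.
dims = [
    ("Product", 250, False),   # sparse
    ("Measures", 40, True),    # dense
    ("Market", 60, False),     # sparse
    ("Time", 17, True),        # dense
]

# Dense dimensions, largest member count first.
dense = sorted((d for d in dims if d[2]), key=lambda d: -d[1])
# Sparse dimensions, smallest member count first.
sparse = sorted((d for d in dims if not d[2]), key=lambda d: d[1])

hourglass = [name for name, _, _ in dense + sparse]
print(hourglass)  # wide at the top, narrow in the middle, wide at the bottom
```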
Here are some more optimization techniques used in Essbase.
For data loading:
- Grouping Sparse Member Combinations
- Positioning Data in the Same Order As the Outline
- Loading from the Essbase OLAP Server
- Making the Data Source As Small As Possible
- Making Source Fields As Small As Possible
- Managing Parallel Data Load Processing
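One of the techniques above, making the data source as small as possible, can be illustrated by pre-aggregating duplicate intersections before the load so Essbase has fewer records to process. A sketch with made-up member names:

```python
from collections import defaultdict

# Sketch: collapse duplicate member combinations into a single record,
# summing their values, so the load file contains fewer rows.
rows = [
    (("East", "Cola", "Jan"), 100.0),
    (("East", "Cola", "Jan"), 25.0),   # duplicate intersection
    (("West", "Cola", "Jan"), 90.0),
]

totals = defaultdict(float)
for key, value in rows:
    totals[key] += value

compact = sorted(totals.items())  # three input rows shrink to two
```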
For calculation:
- Using Parallel Calculation
- Using Formulas
- Managing Caches to Improve Performance
- Using Two-Pass Calculation
- Aggregating #MISSING Values
- Removing #MISSING Blocks
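As an example of the aggregation behavior above, aggregating #MISSING values lets a parent stay #MISSING unless at least one child has data, rather than treating missing children as zeros. The sketch below models #MISSING as Python's `None`; it is a conceptual illustration, not Essbase's engine:

```python
# Sketch: a parent is #MISSING (None) only when every child is missing;
# otherwise, missing children are skipped rather than counted as zero.
def aggregate(children):
    present = [v for v in children if v is not None]
    return sum(present) if present else None  # None stands in for #MISSING

aggregate([None, 10.0, 5.0])  # -> 15.0
aggregate([None, None])       # -> None, i.e. #MISSING
```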
Data Compression
Essbase allows you to choose whether data blocks that are stored on disk are compressed, as well as which compression scheme to use. When data compression is enabled, Essbase compresses data blocks when it writes them out to disk.
Types of data compression:
- Bitmap compression (the default): Essbase uses a bitmap, one bit per cell, to track which cells in a block contain data; only the non-#MISSING values are written to disk.
- Run-length encoding (RLE): Essbase compresses repetitive, consecutive values -- any value that repeats three or more times consecutively, including zeros and #MISSING values. Each data value that is repeated three or more times uses 8 bytes plus a 16-byte repetition factor.
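The three-or-more-repeats rule can be modeled with a short sketch. This is only a conceptual illustration of run-length encoding, not Essbase's on-disk format:

```python
# Sketch of RLE: runs of three or more identical values collapse into a
# (value, repetition factor) pair; shorter runs are emitted verbatim.
def rle_encode(values):
    out, i = [], 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        run = j - i
        if run >= 3:
            out.append(("run", values[i], run))     # value + repetition factor
        else:
            out.extend(("lit", v) for v in values[i:j])
        i = j
    return out

rle_encode([0, 0, 0, 0, 7, None, None, None])
# -> [("run", 0, 4), ("lit", 7), ("run", None, 3)]
```

Note that zeros and `None` (standing in for #MISSING) compress just like any other repeated value, matching the description above.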
- zlib compression: Essbase compresses the block using the zlib library's deflate algorithm.
- Index Value Pair compression: Selected automatically by Essbase, not by the user. Essbase applies this compression when the block density is less than 3%. Index Value Pair addresses compression on databases with larger block sizes, where the blocks are highly sparse.
- No compression.
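The 3% threshold can be illustrated with a block-density check. This is a sketch of the decision rule as described above; Essbase's actual selection logic is internal:

```python
# Sketch: block density = fraction of non-#MISSING cells; very sparse
# blocks favor index-value-pair style storage over a bitmap.
def block_density(block):
    return sum(v is not None for v in block) / len(block)

def choose_compression(block, threshold=0.03):
    return "index-value-pair" if block_density(block) < threshold else "bitmap"

block = [None] * 99 + [42.0]   # 1% dense
choose_compression(block)      # -> "index-value-pair"
```

Storing (cell index, value) pairs is cheaper than a full bitmap only when almost every cell is missing, which is why the automatic cutoff sits at such a low density.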