Michel's profileMichel Caradec's BI BlogBlogLists Tools Help

Blog


    March 19

    UPDATE CUBE in subcubes

    Introduction

    Excerpt from BOL: "subcube is a subset of a cube on representing a filtered view of the underlying data".

    Subcubes can be used for different purposes:

    • Simplify user navigation by displaying only needed cube items.
    • Improve query performance by limiting query evaluation space.

    Implementing security

    Subcubes can be interesting for implementing application-level security, similar to dimension security (I talked about this in a recent article dedicated to SSAS 2008 personalization extensions).

    For example, one user can have different positions in a company:

    • As a head of the IT department, he will need to see IT related data for all the branches of the enterprise.
    • As a head of a branch, he will need to have only access to his branch's data.

    This requirement could be achieved with SSAS security by defining two roles (where role is specified at connection time with Roles connection property), but would require SSAS administration operations each time a filter needs to be modified, and could lead to a huge amount of roles in SSAS database.

    Note: Indeed, application-level security will never replace server-side security. We are more in a situation where cube filters needs to be changed dynamically, as in a Budget and Planning application for instance.

    Now that we decided to use subcubes for handling application-level security, what about UPDATE CUBE use in such a context?

    Case study

    For simplicity, we will use a very simple write-enabled cube called Organization, containing just one dimension. This dimension (also called Organization) represents company structure:

    In our case, our user has access to all divisions in USA Operations, but only wants to see east divisions (northeast and southeast).

    Creating subcube

    The following MDX command will be used:

    CREATE SUBCUBE Organization AS SELECT {[Organization].[Northeast Division],[Organization].[Southeast Division]} ON 0 FROM Organization

    Now if we query dimension members, we will only see our two divisions (and its ascendants):

    Using UPDATE CUBE

    Let's issue the following command to spread a value on our two divisions:

    UPDATE CUBE Organization SET ([Organization].[USA Operations])=1000

    We expect our value to be evenly distributed on our two divisions (i.e. 500 on each). This is not what we get:

    We are the rest of the values gone?

    If we temporarily drop our subcube, and query again, here is what we see:

    What happened is UPDATE CUBE distributed the amount 1000 on [USA Operations] descendants, including out of subcube members (highlighted in blue).

    What have we learned?

    UPDATE CUBE can't be used in a subcube when it's issued on a cell having filtered descendants.

    Using COMMIT TRAN

    Regarding what we've previously learnt, we will now work on individual divisions.

    Let's issue the following commands to update Northeast division (at this stage, we are back in a subcube context):

    BEGIN TRAN

    UPDATE CUBE Organization SET ([Organization].[Northeast Division])=500

    The command works fine:

    At some time, we will want to validate (i.e. persist) our updates. The following command will do so:

    COMMIT TRAN

    The command seems to work properly, as a success message is returned. For a write-enable cube, COMMIT TRAN statement as the effect of inserting records in a writeback table. Here we should get one record, as our UPDATE CUBE command was applied on a leaf cell.

    If we check the content of the writeback table, we see no records were inserted: COMMIT TRAN has been ignored!

    We notice the same effect with a ROLLBACK TRAN command: updated values are not discarded!

    Does this mean we will never be able to persist our updates?

    The solution

    If we recall the previous scenario, here is what we did:

    1. Create subcube
    2. Start transaction
    3. Update cube
    4. Commit transaction

    Now, let's slightly change our scenario:

    1. Start transaction
    2. Create subcube
    3. Update cube
    4. Drop subcube
    5. Commit transaction

    This time it works, and records are inserted in writeback table.

    What have we learned?

    UPDATE CUBE commands can be sent in the context of the subcube, but transaction must be handled outside of the subcube.

    This means that each time we need to validate or cancel our updates, we will have to temporarily drop our subcube.

    Note the behavior is the same with SSAS 2005 and SSAS 2008.

    Conclusion

    While there are some limitations with UPDATE CUBE in a subcube context (updates only on cells not having filtered descendants, transaction outside subcube), there is still an interest at using it, when business logic such as application-level security needs to be implemented.

    March 11

    Simulating Parent-Child dimension with data on leaf members only

    Introduction

    Parent-child dimensions

    When SQL Server introduced Parent-Child dimensions with Analysis Services 2000, it came with different options regarding members associated data, also called Data Members.

    This option could be set via Members With Data property.

    Excerpt from SSAS 2000 BOL: "The Members With Data property indicates whether nonleaf members of a parent-child dimension are allowed to have associated fact table data. By default, nonleaf members are not allowed to have associated fact table data, so the property is initially set to Leaf members only. The related Data Member Caption Template property controls the names of data members when the Members With Data property is set to Nonleaf data visible. For more information about these properties, see Properties Pane (Cube Editor Data View) and Properties Pane (Dimension Editor Data View)".

    There were three possible values for Members With Data property: Leaf members only, Nonleaf data visible and Nonleaf data hidden.

    Analysis Services 2005 changes

    With the coming of SSAS 2005, Leaf members only value for Parent-Child dimensions isn't implemented anymore.

    This shouldn't be a real issue, as Nonleaf data hidden can be used as a workaround to give the illusion only leaf members have data.

    This workaround is satisfying for most of the situations, except for the following case.

    When Leaf members only is required

    Let's consider Organization Parent-Child dimension in SSAS 2005 Adventure Works DW sample database.

    Let's assume we have data only for leaf members (i.e. non leaf members are aggregating leaf ones; even if this is not the exact business meaning in the sample).

    If we execute the following UPDATE CUBE command at [European Operations] member level:

    UPDATE CUBE [Organization] SET ([Organization].[AdventureWorks Cycle].[European Operations])=60

    We expect to have an equal distribution of 60 on [France] and [Germany], 30 each. Instead, we get 20 on [France], and 20 on [Germany] (but still an aggregated value of 60).

    Where are the missing 20?

    Remember our dimension is set with Non leaf data hidden option set (as a replacement to the disappeared Leaf member only option). This means member [European Operations] has its own data, so aggregated [European Operations] has in reality three children (and not two, as shown in query result). That's where the missing 20 is gone: on [European Operations] Data Member (20+20+20=60).

    This is a serious issue in a Budget and Planning context.

    One could say the problem shouldn't occur if the cube is pre-filled with fact data for all Leaf members. An UPDATE CUBE command combined with the USE_WEIGHTED_ALLOCATION option would then distribute data as expected.

    The point here is we don't always have fact data. If a company creates a new division (a new member in the dimension), it will start a budget from scratch, i.e. with no fact data.

    So what can we do?

    Workaround

    Obviously, we must avoid Parent-Child dimensions, and come back to standard ones.

    The hierarchy level HideMemberIf property can be used to give the illusion of an unbalanced structure (as with Parent-Child).

    Data source

    The difficult thing is to get the dimension data source fit the new dimension type.

    Our Organization Parent-Child dimension data source had a structure based on a key field (OrganizationKey) and a parent field (ParentOrganizationKey).

    With a standard dimension, we need a table structure with as many fields as levels.

    This structure change can be handled quite easily if we have control on the data source.

    This is not always the case:

    • Data source can already exist when this change is required.
    • Parent-Child table structure must be kept for compatibility and usability reasons.

    On the fly transformation

    There are different techniques to transform a Parent-Child structure to a hierarchical one.

    Many stored procedures have been published on the internet by T-SQL gurus. One of these stored procedures could be executed by a SQL Server Integration Services (SSIS) package before dimension processing.

    For some reasons, we may need/want to keep all the logic at SSAS level, without relying on external packages. In this case, our standard dimension data source can be linked to a named query in the Data Source View. Our named query will be defined using a recursive query, which is possible since SSQL 2005 with the use of Common Table Expressions.

    Using such recursive query on Parent-Child table structure will result in the following structure and dataset for our Organization dimension data source:

    If we analyze in detail the transformed table structure, we notice it's organized in groups of three fields, one group corresponding to a level:

    • k_Level<N>: OrganizationKey for level <N>, the member key.
    • n_Level<N>: OrganizationName for level <N>, the member name.
    • PercentageOfOwnership_Level<N>: for level <N>, a member property.

    For leaf members with a smaller depth, values are duplicated on descendant levels.

    Dimension design

    Dimension design is straight forward. Each group of fields will be used to create an attribute. Then, attributes will be used to build a user hierarchy.

    Finally, the HideMemberIf property will be set for each hierarchy level with value OnlyChildWithParentName. This will make those descendant (see before) levels invisible to the client.

    Using it

    If we browse the new dimension, it will look the same as the original Parent-Child dimension.

    If we execute the same UPDATE CUBE command at [European Operations] member level:

    UPDATE CUBE [Organization] SET ([Organization].[AdventureWorks Cycle].[European Operations])=60

    We get the expected result, with value 60 equally distributed on [France] and [Germany]:

    Conclusion

    The lack of a Leaf members only value for Members With Data property in SSAS 2005 can be solved with unbalanced standard dimensions.

    Advantages

    • Query performance: it is recommended to avoid Parent-Child dimension, as values are not aggregated.

    Drawbacks

    • Process performance: depending on the dimension data source size and depth, recursive query execution can be time consuming when dimension needs to be processed.
    • Display: SSAS doesn't properly estimate member children count for unbalanced standard dimensions. This can be seen when browsing dimension. On screen shot below, highlighted leaf members not being at the deepest level have an associated [+] symbol, indicating they can be drilled-down (which is not the case).