Installation
Download and extract to your repository:
.github/skills/powerbi-modeling/ Extract the ZIP to .github/skills/ in your repo. The folder name must match powerbi-modeling for Copilot to auto-discover it.
Skill Files (6)
SKILL.md 5.8 KB
---
name: powerbi-modeling
description: 'Power BI semantic modeling assistant for building optimized data models. Use when working with Power BI semantic models, creating measures, designing star schemas, configuring relationships, implementing RLS, or optimizing model performance. Triggers on queries about DAX calculations, table relationships, dimension/fact table design, naming conventions, model documentation, cardinality, cross-filter direction, calculation groups, and data model best practices. Always connects to the active model first using power-bi-modeling MCP tools to understand the data structure before providing guidance.'
---
# Power BI Semantic Modeling
Guide users in building optimized, well-documented Power BI semantic models following Microsoft best practices.
## When to Use This Skill
Use this skill when users ask about:
- Creating or optimizing Power BI semantic models
- Designing star schemas (dimension/fact tables)
- Writing DAX measures or calculated columns
- Configuring table relationships (cardinality, cross-filter)
- Implementing row-level security (RLS)
- Naming conventions for tables, columns, measures
- Adding descriptions and documentation to models
- Performance tuning and optimization
- Calculation groups and field parameters
- Model validation and best practice checks
**Trigger phrases:** "create a measure", "add relationship", "star schema", "optimize model", "DAX formula", "RLS", "naming convention", "model documentation", "cardinality", "cross-filter"
## Prerequisites
### Required Tools
- **Power BI Modeling MCP Server**: Required for connecting to and modifying semantic models
- Enables: connection_operations, table_operations, measure_operations, relationship_operations, etc.
- Must be configured and running to interact with models
### Optional Dependencies
- **Microsoft Learn MCP Server**: Recommended for researching latest best practices
- Enables: microsoft_docs_search, microsoft_docs_fetch
- Use for complex scenarios, new features, and official documentation
## Workflow
### 1. Connect and Analyze First
Before providing any modeling guidance, always examine the current model state:
```
1. List connections: connection_operations(operation: "ListConnections")
2. If no connection, check for local instances: connection_operations(operation: "ListLocalInstances")
3. Connect to the model (Desktop or Fabric)
4. Get model overview: model_operations(operation: "Get")
5. List tables: table_operations(operation: "List")
6. List relationships: relationship_operations(operation: "List")
7. List measures: measure_operations(operation: "List")
```
### 2. Evaluate Model Health
After connecting, assess the model against best practices:
- **Star Schema**: Are tables properly classified as dimension or fact?
- **Relationships**: Correct cardinality? Minimal bidirectional filters?
- **Naming**: Human-readable, consistent naming conventions?
- **Documentation**: Do tables, columns, measures have descriptions?
- **Measures**: Explicit measures for key calculations?
- **Hidden Fields**: Are technical columns hidden from report view?
### 3. Provide Targeted Guidance
Based on analysis, guide improvements using references:
- Star schema design: See [STAR-SCHEMA.md](references/STAR-SCHEMA.md)
- Relationship configuration: See [RELATIONSHIPS.md](references/RELATIONSHIPS.md)
- DAX measures and naming: See [MEASURES-DAX.md](references/MEASURES-DAX.md)
- Performance optimization: See [PERFORMANCE.md](references/PERFORMANCE.md)
- Row-level security: See [RLS.md](references/RLS.md)
## Quick Reference: Model Quality Checklist
| Area | Best Practice |
|------|--------------|
| Tables | Clear dimension vs fact classification |
| Naming | Human-readable: `Customer Name` not `CUST_NM` |
| Descriptions | All tables, columns, measures documented |
| Measures | Explicit DAX measures for business metrics |
| Relationships | One-to-many from dimension to fact |
| Cross-filter | Single direction unless specifically needed |
| Hidden fields | Hide technical keys, IDs from report view |
| Date table | Dedicated marked date table |
## MCP Tools Reference
Use these Power BI Modeling MCP operations:
| Operation Category | Key Operations |
|-------------------|----------------|
| `connection_operations` | Connect, ListConnections, ListLocalInstances, ConnectFabric |
| `model_operations` | Get, GetStats, ExportTMDL |
| `table_operations` | List, Get, Create, Update, GetSchema |
| `column_operations` | List, Get, Create, Update (descriptions, hidden, format) |
| `measure_operations` | List, Get, Create, Update, Move |
| `relationship_operations` | List, Get, Create, Update, Activate, Deactivate |
| `dax_query_operations` | Execute, Validate |
| `calculation_group_operations` | List, Create, Update |
| `security_role_operations` | List, Create, Update, GetEffectivePermissions |
## Common Tasks
### Add Measure with Description
```
measure_operations(
operation: "Create",
definitions: [{
name: "Total Sales",
tableName: "Sales",
expression: "SUM(Sales[Amount])",
formatString: "$#,##0",
description: "Sum of all sales amounts"
}]
)
```
### Update Column Description
```
column_operations(
operation: "Update",
definitions: [{
tableName: "Customer",
name: "CustomerKey",
description: "Unique identifier for customer dimension",
isHidden: true
}]
)
```
### Create Relationship
```
relationship_operations(
operation: "Create",
definitions: [{
fromTable: "Sales",
fromColumn: "CustomerKey",
toTable: "Customer",
toColumn: "CustomerKey",
crossFilteringBehavior: "OneDirection"
}]
)
```
## When to Use Microsoft Learn MCP
Research current best practices using `microsoft_docs_search` for:
- Latest DAX function documentation
- New Power BI features and capabilities
- Complex modeling scenarios (SCD Type 2, many-to-many)
- Performance optimization techniques
- Security implementation patterns
MEASURES-DAX.md 4.8 KB
# DAX Measures and Naming Conventions
## Naming Conventions
### General Rules
- Use human-readable names (spaces allowed)
- Be descriptive: `Total Sales Amount` not `TSA`
- Avoid abbreviations unless universally understood
- Use consistent capitalization (Title Case recommended)
- Avoid special characters except spaces
### Table Naming
| Type | Convention | Example |
|------|------------|---------|
| Dimension | Singular noun | Customer, Product, Date |
| Fact | Business process | Sales, Orders, Inventory |
| Bridge | Combined names | CustomerAccount, ProductCategory |
| Measure Table | Underscore prefix | _Measures, _KPIs |
### Column Naming
| Type | Convention | Example |
|------|------------|---------|
| Keys | Suffix with "Key" or "ID" | CustomerKey, ProductID |
| Dates | Suffix with "Date" | OrderDate, ShipDate |
| Amounts | Descriptive with unit hint | SalesAmount, QuantitySold |
| Flags | Prefix with "Is" or "Has" | IsActive, HasDiscount |
### Measure Naming
| Type | Convention | Example |
|------|------------|---------|
| Aggregations | Verb + Noun | Total Sales, Count of Orders |
| Ratios | X per Y or X Rate | Sales per Customer, Conversion Rate |
| Time Intelligence | Period + Metric | YTD Sales, PY Total Sales |
| Comparisons | Metric + vs + Baseline | Sales vs Budget, Growth vs PY |
## Explicit vs Implicit Measures
### Always Create Explicit Measures For:
1. Key business metrics users will query
2. Complex calculations with filter manipulation
3. Measures used in MDX (Excel PivotTables)
4. Controlled aggregation (prevent sum of averages)
### Implicit Measures (Column Aggregations)
- Acceptable for simple exploration
- Set correct SummarizeBy property:
- Amounts: Sum
- Keys/IDs: None (Do Not Summarize)
- Rates/Prices: None or Average
## Measure Patterns
### Basic Aggregations
```dax
Total Sales = SUM(Sales[SalesAmount])
Order Count = COUNTROWS(Sales)
Average Order Value = DIVIDE([Total Sales], [Order Count])
Distinct Customers = DISTINCTCOUNT(Sales[CustomerKey])
```
### Time Intelligence (Requires Date Table)
```dax
YTD Sales = TOTALYTD([Total Sales], 'Date'[Date])
MTD Sales = TOTALMTD([Total Sales], 'Date'[Date])
PY Sales = CALCULATE([Total Sales], SAMEPERIODLASTYEAR('Date'[Date]))
YoY Growth = DIVIDE([Total Sales] - [PY Sales], [PY Sales])
```
### Percentage Calculations
```dax
Sales % of Total =
DIVIDE(
[Total Sales],
CALCULATE([Total Sales], REMOVEFILTERS(Product))
)
Margin % = DIVIDE([Gross Profit], [Total Sales])
```
### Running Totals
```dax
Running Total =
CALCULATE(
[Total Sales],
FILTER(
ALL('Date'),
'Date'[Date] <= MAX('Date'[Date])
)
)
```
## Column References
### Best Practice: Always Qualify Column Names
```dax
// GOOD - Fully qualified
Sales Amount = SUM(Sales[SalesAmount])
// BAD - Unqualified (can cause ambiguity)
Sales Amount = SUM([SalesAmount])
```
### Measure References: Never Qualify
```dax
// GOOD - Unqualified measure
YTD Sales = TOTALYTD([Total Sales], 'Date'[Date])
// BAD - Qualified measure (breaks if home table changes)
YTD Sales = TOTALYTD(Sales[Total Sales], 'Date'[Date])
```
## Documentation
### Measure Descriptions
Always add descriptions explaining:
- What the measure calculates
- Business context/usage
- Any important assumptions
```
measure_operations(
operation: "Update",
definitions: [{
name: "Total Sales",
tableName: "Sales",
description: "Sum of all completed sales transactions. Excludes returns and cancelled orders."
}]
)
```
### Format Strings
| Data Type | Format String | Example Output |
|-----------|---------------|----------------|
| Currency | $#,##0.00 | $1,234.56 |
| Percentage | 0.0% | 12.3% |
| Whole Number | #,##0 | 1,234 |
| Decimal | #,##0.00 | 1,234.56 |
## Display Folders
Organize measures into logical groups:
```
measure_operations(
operation: "Update",
definitions: [{
name: "YTD Sales",
tableName: "_Measures",
displayFolder: "Time Intelligence\\Year"
}]
)
```
Common folder structure:
```
_Measures
โโโ Sales
โ โโโ Total Sales
โ โโโ Average Sale
โโโ Time Intelligence
โ โโโ Year
โ โ โโโ YTD Sales
โ โ โโโ PY Sales
โ โโโ Month
โ โโโ MTD Sales
โโโ Ratios
โโโ Margin %
โโโ Conversion Rate
```
## Variables for Performance
Use variables to:
- Avoid recalculating the same expression
- Improve readability
- Enable debugging
```dax
Gross Margin % =
VAR TotalSales = [Total Sales]
VAR TotalCost = [Total Cost]
VAR GrossProfit = TotalSales - TotalCost
RETURN
DIVIDE(GrossProfit, TotalSales)
```
## Validation Checklist
- [ ] All key business metrics have explicit measures
- [ ] Measures have clear, descriptive names
- [ ] Measures have descriptions
- [ ] Appropriate format strings applied
- [ ] Display folders organize related measures
- [ ] Column references are fully qualified
- [ ] Measure references are not qualified
- [ ] Variables used for complex calculations
PERFORMANCE.md 5.3 KB
# Performance Optimization for Power BI Models
## Data Reduction Techniques
### 1. Remove Unnecessary Columns
- Only import columns needed for reporting
- Remove audit columns (CreatedBy, ModifiedDate) unless required
- Remove duplicate/redundant columns
```
column_operations(operation: "List", filter: { tableNames: ["Sales"] })
// Review and remove unneeded columns
```
### 2. Remove Unnecessary Rows
- Filter historical data to relevant period
- Exclude cancelled/void transactions if not needed
- Apply filters in Power Query (not in DAX)
### 3. Reduce Cardinality
High cardinality (many unique values) impacts:
- Model size
- Refresh time
- Query performance
**Solutions:**
| Column Type | Reduction Technique |
|-------------|---------------------|
| DateTime | Split into Date and Time columns |
| Decimal precision | Round to needed precision |
| Text with patterns | Extract common prefix/suffix |
| High-precision IDs | Use surrogate integer keys |
### 4. Optimize Data Types
| From | To | Benefit |
|------|-----|---------|
| DateTime | Date (if time not needed) | 8 bytes to 4 bytes |
| Decimal | Fixed Decimal | Better compression |
| Text with numbers | Whole Number | Much better compression |
| Long text | Shorter text | Reduces storage |
### 5. Group and Summarize
Pre-aggregate data when detail not needed:
- Daily instead of transactional
- Monthly instead of daily
- Consider aggregation tables for DirectQuery
## Column Optimization
### Prefer Power Query Columns Over Calculated Columns
| Approach | When to Use |
|----------|-------------|
| Power Query (M) | Can be computed at source, static values |
| Calculated Column (DAX) | Needs model relationships, dynamic logic |
Power Query columns:
- Load faster
- Compress better
- Use less memory
### Avoid Calculated Columns on Relationship Keys
DAX calculated columns in relationships:
- Cannot use indexes
- Generate complex SQL for DirectQuery
- Hurt performance significantly
**Use COMBINEVALUES for multi-column relationships:**
```dax
// If you must use calculated column for composite key
CompositeKey = COMBINEVALUES(",", [Country], [City])
```
### Set Appropriate Summarization
Prevent accidental aggregation of non-additive columns:
```
column_operations(
operation: "Update",
definitions: [{
tableName: "Product",
name: "UnitPrice",
summarizeBy: "None"
}]
)
```
## Relationship Optimization
### 1. Minimize Bidirectional Relationships
Each bidirectional relationship:
- Increases query complexity
- Can create ambiguous paths
- Reduces performance
### 2. Avoid Many-to-Many When Possible
Many-to-many relationships:
- Generate more complex queries
- Require more memory
- Can produce unexpected results
### 3. Reduce Relationship Cardinality
Keep relationship columns low cardinality:
- Use integer keys over text
- Consider higher-grain relationships
## DAX Optimization
### 1. Use Variables
```dax
// GOOD - Calculate once, use twice
Sales Growth =
VAR CurrentSales = [Total Sales]
VAR PriorSales = [PY Sales]
RETURN DIVIDE(CurrentSales - PriorSales, PriorSales)
// BAD - Recalculates [Total Sales] and [PY Sales]
Sales Growth =
DIVIDE([Total Sales] - [PY Sales], [PY Sales])
```
### 2. Avoid FILTER with Entire Tables
```dax
// BAD - Iterates entire table
Sales High Value =
CALCULATE([Total Sales], FILTER(Sales, Sales[Amount] > 1000))
// GOOD - Uses column reference
Sales High Value =
CALCULATE([Total Sales], Sales[Amount] > 1000)
```
### 3. Use KEEPFILTERS Appropriately
```dax
// Respects existing filters
Sales with Filter =
CALCULATE([Total Sales], KEEPFILTERS(Product[Category] = "Bikes"))
```
### 4. Prefer DIVIDE Over Division Operator
```dax
// GOOD - Handles divide by zero
Margin % = DIVIDE([Profit], [Sales])
// BAD - Errors on zero
Margin % = [Profit] / [Sales]
```
## DirectQuery Optimization
### 1. Minimize Columns and Tables
DirectQuery models:
- Query source for every visual
- Performance depends on source
- Minimize data retrieved
### 2. Avoid Complex Power Query Transformations
- Transforms become subqueries
- Native queries are faster
- Materialize at source when possible
### 3. Keep Measures Simple Initially
Complex DAX generates complex SQL:
- Start with basic aggregations
- Add complexity gradually
- Monitor query performance
### 4. Disable Auto Date/Time
For DirectQuery models, disable auto date/time:
- Creates hidden calculated tables
- Increases model complexity
- Use explicit date table instead
## Aggregations
### User-Defined Aggregations
Pre-aggregate fact tables for:
- Very large models (billions of rows)
- Hybrid DirectQuery/Import
- Common query patterns
```
table_operations(
operation: "Create",
definitions: [{
name: "SalesAgg",
mode: "Import",
mExpression: "..."
}]
)
```
## Performance Testing
### Use Performance Analyzer
1. Enable in Power BI Desktop
2. Start recording
3. Interact with visuals
4. Review DAX query times
### Monitor with DAX Studio
External tool for:
- Query timing
- Server timings
- Query plans
## Validation Checklist
- [ ] Unnecessary columns removed
- [ ] Appropriate data types used
- [ ] High-cardinality columns addressed
- [ ] Bidirectional relationships minimized
- [ ] DAX uses variables for repeated expressions
- [ ] No FILTER on entire tables
- [ ] DIVIDE used instead of division operator
- [ ] Auto date/time disabled for DirectQuery
- [ ] Performance tested with representative data
RELATIONSHIPS.md 3.4 KB
# Relationships in Power BI
## Relationship Properties
### Cardinality
| Type | Use Case | Notes |
|------|----------|-------|
| One-to-Many (*:1) | Dimension to Fact | Most common, preferred |
| Many-to-One (1:*) | Fact to Dimension | Same as above, direction reversed |
| One-to-One (1:1) | Dimension extensions | Use sparingly |
| Many-to-Many (*:*) | Bridge tables, complex scenarios | Requires careful design |
### Cross-Filter Direction
| Setting | Behavior | When to Use |
|---------|----------|-------------|
| Single | Filters flow from "one" to "many" | Default, best performance |
| Both | Filters flow in both directions | Only when necessary |
## Best Practices
### 1. Prefer One-to-Many Relationships
```
Customer (1) --> (*) Sales
Product (1) --> (*) Sales
Date (1) --> (*) Sales
```
### 2. Use Single-Direction Cross-Filtering
Bidirectional filtering:
- Impacts performance negatively
- Can create ambiguous filter paths
- May produce unexpected results
**Only use bidirectional when:**
- Dimension-to-dimension analysis through fact table
- Specific RLS requirements
**Better alternative:** Use CROSSFILTER in DAX measures:
```dax
Countries Sold =
CALCULATE(
DISTINCTCOUNT(Customer[Country]),
CROSSFILTER(Customer[CustomerKey], Sales[CustomerKey], BOTH)
)
```
### 3. One Active Path Between Tables
- Only one active relationship between any two tables
- Use USERELATIONSHIP for role-playing dimensions:
```dax
Sales by Ship Date =
CALCULATE(
[Total Sales],
USERELATIONSHIP(Sales[ShipDate], Date[Date])
)
```
### 4. Avoid Ambiguous Paths
Circular references cause errors. Solutions:
- Deactivate one relationship
- Restructure model
- Use USERELATIONSHIP in measures
## Relationship Patterns
### Standard Star Schema
```
[Date]
|
[Product]--[Sales]--[Customer]
|
[Store]
```
### Role-Playing Dimension
```
[Date] --(active)-- [Sales.OrderDate]
|
+--(inactive)-- [Sales.ShipDate]
```
### Bridge Table (Many-to-Many)
```
[Customer]--(*)--[CustomerAccount]--(*)--[Account]
```
### Factless Fact Table
```
[Product]--[ProductPromotion]--[Promotion]
```
Used to capture relationships without measures.
## Creating Relationships via MCP
### List Current Relationships
```
relationship_operations(operation: "List")
```
### Create New Relationship
```
relationship_operations(
operation: "Create",
definitions: [{
fromTable: "Sales",
fromColumn: "ProductKey",
toTable: "Product",
toColumn: "ProductKey",
crossFilteringBehavior: "OneDirection",
isActive: true
}]
)
```
### Deactivate Relationship
```
relationship_operations(
operation: "Deactivate",
references: [{ name: "relationship-guid-here" }]
)
```
## Troubleshooting
### "Ambiguous Path" Error
Multiple active paths exist between tables.
- Check for: Multiple fact tables sharing dimensions
- Solution: Deactivate redundant relationships
### Bidirectional Not Allowed
Circular reference would be created.
- Solution: Restructure or use DAX CROSSFILTER
### Relationship Not Detected
Columns may have different data types.
- Ensure both columns have identical types
- Check for trailing spaces in text keys
## Validation Checklist
- [ ] All relationships are one-to-many where possible
- [ ] Cross-filter is single direction by default
- [ ] Only one active path between any two tables
- [ ] Role-playing dimensions use inactive relationships
- [ ] No circular reference paths
- [ ] Key columns have matching data types
RLS.md 5.3 KB
# Row-Level Security (RLS) in Power BI
## Overview
Row-Level Security restricts data access at the row level based on user identity. Users see only the data they're authorized to view.
## Design Principles
### 1. Filter on Dimension Tables
Apply RLS to dimensions, not fact tables:
- More efficient (smaller tables)
- Filters propagate through relationships
- Easier to maintain
```dax
// On Customer dimension - filters propagate to Sales
[Region] = "West"
```
### 2. Create Minimal Roles
Avoid many role combinations:
- Each role = separate cache
- Roles are additive (union, not intersection)
- Consolidate where possible
### 3. Use Dynamic RLS When Possible
Data-driven rules scale better:
- User mapping in a table
- USERPRINCIPALNAME() for identity
- No role changes when users change
## Static vs Dynamic RLS
### Static RLS
Fixed rules per role:
```dax
// Role: West Region
[Region] = "West"
// Role: East Region
[Region] = "East"
```
**Pros:** Simple, clear
**Cons:** Doesn't scale, requires role per group
### Dynamic RLS
User identity drives filtering:
```dax
// Single role filters based on logged-in user
[ManagerEmail] = USERPRINCIPALNAME()
```
**Pros:** Scales, self-maintaining
**Cons:** Requires user mapping data
## Implementation Patterns
### Pattern 1: Direct User Mapping
User email in dimension table:
```dax
// On Customer table
[CustomerEmail] = USERPRINCIPALNAME()
```
### Pattern 2: Security Table
Separate table mapping users to data:
```
SecurityMapping table:
| UserEmail | Region |
|-----------|--------|
| joe@co.com | West |
| sue@co.com | East |
```
```dax
// On Region dimension
[Region] IN
SELECTCOLUMNS(
FILTER(SecurityMapping, [UserEmail] = USERPRINCIPALNAME()),
"Region", [Region]
)
```
### Pattern 3: Manager Hierarchy
Users see their data plus subordinates:
```dax
// Using PATH functions for hierarchy
PATHCONTAINS(Employee[ManagerPath],
LOOKUPVALUE(Employee[EmployeeID], Employee[Email], USERPRINCIPALNAME()))
```
### Pattern 4: Multiple Rules
Combine conditions:
```dax
// Users see their region OR if they're a global viewer
[Region] = LOOKUPVALUE(Users[Region], Users[Email], USERPRINCIPALNAME())
|| LOOKUPVALUE(Users[IsGlobal], Users[Email], USERPRINCIPALNAME()) = TRUE()
```
## Creating Roles via MCP
### List Existing Roles
```
security_role_operations(operation: "List")
```
### Create Role with Permission
```
security_role_operations(
operation: "Create",
definitions: [{
name: "Regional Sales",
modelPermission: "Read",
description: "Restricts sales data by region"
}]
)
```
### Add Table Permission (Filter)
```
security_role_operations(
operation: "CreatePermissions",
permissionDefinitions: [{
roleName: "Regional Sales",
tableName: "Customer",
filterExpression: "[Region] = USERPRINCIPALNAME()"
}]
)
```
### Get Effective Permissions
```
security_role_operations(
operation: "GetEffectivePermissions",
references: [{ name: "Regional Sales" }]
)
```
## Testing RLS
### In Power BI Desktop
1. Modeling tab > View As
2. Select role(s) to test
3. Optionally specify user identity
4. Verify data filtering
### Test Unexpected Values
For dynamic RLS, test:
- Valid users
- Unknown users (should see nothing or error gracefully)
- NULL/blank values
```dax
// Defensive pattern - returns no data for unknown users
IF(
USERPRINCIPALNAME() IN VALUES(SecurityMapping[UserEmail]),
[Region] IN SELECTCOLUMNS(...),
FALSE()
)
```
## Common Mistakes
### 1. RLS on Fact Tables Only
**Problem:** Large table scans, poor performance
**Solution:** Apply to dimension tables, let relationships propagate
### 2. Using LOOKUPVALUE Instead of Relationships
**Problem:** Expensive, doesn't scale
**Solution:** Create proper relationships, let filters flow
### 3. Expecting Intersection Behavior
**Problem:** Multiple roles = UNION (additive), not intersection
**Solution:** Design roles with union behavior in mind
### 4. Forgetting About DirectQuery
**Problem:** RLS filters become WHERE clauses
**Solution:** Ensure source database can handle the query patterns
### 5. Not Testing Edge Cases
**Problem:** Users see unexpected data
**Solution:** Test with: valid users, invalid users, multiple roles
## Bidirectional RLS
For bidirectional relationships with RLS:
```
Enable "Apply security filter in both directions"
```
Only use when:
- RLS requires filtering through many-to-many
- Dimension-to-dimension security needed
**Caution:** Only one bidirectional relationship per path allowed.
## Performance Considerations
- RLS adds WHERE clauses to every query
- Complex DAX in filters hurts performance
- Test with realistic user counts
- Consider aggregations for large models
## Object-Level Security (OLS)
Restrict access to entire tables or columns:
```
// Via XMLA/TMSL - not available in Desktop UI
```
Use for:
- Hiding sensitive columns (salary, SSN)
- Restricting entire tables
- Combined with RLS for comprehensive security
## Validation Checklist
- [ ] RLS applied to dimension tables (not fact tables)
- [ ] Filters propagate correctly through relationships
- [ ] Dynamic RLS uses USERPRINCIPALNAME()
- [ ] Tested with valid and invalid users
- [ ] Edge cases handled (NULL, unknown users)
- [ ] Performance tested under load
- [ ] Role mappings documented
- [ ] Workspace roles understood (Admins bypass RLS)
STAR-SCHEMA.md 3.1 KB
# Star Schema Design for Power BI
## Overview
Star schema is the optimal design pattern for Power BI semantic models. It organizes data into:
- **Dimension tables**: Enable filtering and grouping (the "one" side)
- **Fact tables**: Enable summarization (the "many" side)
## Table Classification
### Dimension Tables
- Contain descriptive attributes for filtering/slicing
- Have unique key columns (one row per entity)
- Examples: Customer, Product, Date, Geography, Employee
- Naming convention: Singular noun (`Customer`, `Product`)
### Fact Tables
- Contain measurable, quantitative data
- Have foreign keys to dimensions
- Store data at consistent grain (one row per transaction/event)
- Examples: Sales, Orders, Inventory, WebVisits
- Naming convention: Business process noun (`Sales`, `Orders`)
## Design Principles
### 1. Separate Dimensions from Facts
```
BAD: Single denormalized "Sales" table with customer details
GOOD: "Sales" fact table + "Customer" dimension table
```
### 2. Consistent Grain
Every row in a fact table represents the same thing:
- Order line level (most common)
- Daily aggregation
- Monthly summary
Never mix grains in one table.
### 3. Surrogate Keys
Add surrogate keys when source lacks unique identifiers:
```m
// Power Query: Add index column
= Table.AddIndexColumn(Source, "CustomerKey", 1, 1)
```
### 4. Date Dimension
Always create a dedicated date table:
- Mark as date table in Power BI
- Include fiscal periods if needed
- Add relative date columns (IsCurrentMonth, IsPreviousYear)
```dax
Date =
ADDCOLUMNS(
CALENDAR(DATE(2020,1,1), DATE(2030,12,31)),
"Year", YEAR([Date]),
"Month", FORMAT([Date], "MMMM"),
"MonthNum", MONTH([Date]),
"Quarter", "Q" & FORMAT([Date], "Q"),
"WeekDay", FORMAT([Date], "dddd")
)
```
## Special Dimension Types
### Role-Playing Dimensions
Same dimension used multiple times (e.g., Date for OrderDate, ShipDate):
- Option 1: Duplicate the table (OrderDate, ShipDate tables)
- Option 2: Use inactive relationships with USERELATIONSHIP in DAX
### Slowly Changing Dimensions (Type 2)
Track historical changes with version columns:
- StartDate, EndDate columns
- IsCurrent flag
- Requires pre-processing in data warehouse
### Junk Dimensions
Combine low-cardinality flags into one table:
```
OrderFlags dimension: IsRush, IsGift, IsOnline
```
### Degenerate Dimensions
Keep transaction identifiers (OrderNumber, InvoiceID) in fact table.
## Anti-Patterns to Avoid
| Anti-Pattern | Problem | Solution |
|--------------|---------|----------|
| Wide denormalized tables | Poor performance, hard to maintain | Split into star schema |
| Snowflake (normalized dims) | Extra joins hurt performance | Flatten dimensions |
| Many-to-many without bridge | Ambiguous results | Add bridge/junction table |
| Mixed grain facts | Incorrect aggregations | Separate tables per grain |
## Validation Checklist
- [ ] Each table is clearly dimension or fact
- [ ] Fact tables have foreign keys to all related dimensions
- [ ] Dimensions have unique key columns
- [ ] Date table exists and is marked
- [ ] No circular relationship paths
- [ ] Consistent naming conventions
License (MIT)
View full license text
MIT License Copyright GitHub, Inc. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.