DATA MINING PROJECT

Right click on the column and 'Rename' it "Date". On the 'Modeling' tab, click 'New Column' to create a calculated column in the new table, using this expression:

DateKey = FORMAT('Calendar'[Date], "YYYYMMDD")

Right click on the new column "DateKey" and select 'Hide In Report View'. In the 'Modeling' tab, under 'Formatting' and 'Data Type', change the "DateKey" column's data type to "Whole Number".

Note: A warning will inform you that changing the data type affects how the data is stored and so may have an impact elsewhere. This is an important note; however, it should not affect your current tasks.

Now that a Calendar table is created, you can switch to the 'Relationships' perspective and drag Calendar[DateKey] over to Orders[DateKey] to create their relationship.

You can extend the Calendar by creating additional calculated columns using the following table of expressions.

Note: Please read through the table and try to see the differences in the DAX. To save time, select just a few columns to add into your model; this is only a proof of concept, and you can add additional columns at any time. Realistically you would do as little modeling as required, so there is no need to create fields that may be redundant and unnecessary.

Column Title | Formula | Comments
Year | Year = YEAR('Calendar'[Date]) | Extracts the year from date/time
Year2 | Year2 = FORMAT('Calendar'[Date], "YYYY") | Alternative - isolates the year from date/time
Year3 | Year3 = LEFT('Calendar'[DateKey], 4) | Alternative - extracts the first 4 characters from the left of a text datatype
Month | Month = MONTH('Calendar'[Date]) | Extracts month numbers from date/time
Month2 | Month2 = MID('Calendar'[DateKey], 5, 2) | Alternative - extracts characters from text based on a specific starting point
Month Name | Month Name = FORMAT('Calendar'[Date], "MMMM") | Extracts the month name from date/time
Week Number | Week Number = "Week " & WEEKNUM('Calendar'[Date]) | Shows the number of the week in the year
Week Number2 | Week Number2 = "W" & FORMAT(WEEKNUM('Calendar'[Date]), "00") | Alternative - adds a leading zero for the first nine weeks
Day Of Month | Day Of Month = DAY('Calendar'[Date]) | Displays the number of the day of the month
Day Of Month2 | Day Of Month2 = FORMAT(DAY('Calendar'[Date]), "00") | Alternative - adds a leading zero for the first nine days
Day Of Week | Day Of Week = WEEKDAY('Calendar'[Date], 2) | Displays the number of the day of the week, with the week starting on Monday
Week Day Name | Week Day Name = FORMAT('Calendar'[Date], "dddd") | Displays the name of the weekday
Week Day Name2 | Week Day Name2 = FORMAT('Calendar'[Date], "ddd") | Alternative - displays the weekday name as a three-letter abbreviation
Weekday/Weekend | Weekday/Weekend = IF('Calendar'[Day Of Week] <= 5, "Weekday", "Weekend") | Calculates whether the day is a weekday or weekend - relies on 'Calendar'[Day Of Week]
ISO Date | ISO Date = [Year] & [Month2] & [Day Of Month2] | Displays the date in the ISO (internationally recognised) format YYYYMMDD
Full Date | Full Date = [Day Of Month2] & " " & [Month Name] & " " & [Year] | Displays the full date with spaces
Full Date2 | Full Date2 = FORMAT('Calendar'[Date], "DD MMMM YYYY") | Alternative - a user-defined date/time format via the FORMAT function
Quarter | Quarter = "Quarter " & ROUNDUP(MONTH('Calendar'[Date])/3, 0) | Displays the quarter
Quarter Abbr. | Quarter Abbr. = "Qtr " & ROUNDUP(MONTH('Calendar'[Date])/3, 0) | Displays the quarter as a three-letter abbreviation plus the quarter number
Quarter Year | Quarter Year = [Year] & " Qtr" & ROUNDUP(MONTH('Calendar'[Date])/3, 0) | Shows the year and quarter abbreviation
Current Year | Current Year = IF(YEAR('Calendar'[Date]) = YEAR(TODAY()), TRUE(), FALSE()) | Tests whether the date is in the current year
Current Year Month | Current Year Month = IF(YEAR('Calendar'[Date]) = YEAR(TODAY()), IF(MONTH('Calendar'[Date]) = MONTH(TODAY()), TRUE(), FALSE()), FALSE()) | Tests whether the date is in the current year and current month
Current LTM | Current LTM = IF('Calendar'[Date] > DATE(YEAR(TODAY())-1, MONTH(TODAY()), 1)-1 && 'Calendar'[Date] <= DATE(YEAR(TODAY()), MONTH(TODAY()), 1)-1, TRUE(), FALSE()) | Tests whether the date is between the last day of last month last year and the last day of last month this year. Note: LTM stands for Last Twelve Months
LTM Group | LTM Group = CONCATENATE("LTM Group: ", IF(MONTH('Calendar'[Date]) > MONTH(TODAY()-DAY(TODAY())), (YEAR(TODAY()-DAY(TODAY())) - YEAR('Calendar'[Date]))*12, (YEAR(TODAY()-DAY(TODAY())) - YEAR('Calendar'[Date])+1)*12)) | Groups dates together based on twelve-month periods. Note: LTM stands for Last Twelve Months

Please only read methods 7A and 7B as alternatives to the above.

7A) Creating a Date Table via External Sources
An alternative to manually creating a calendar table is to import an existing one. This is no different from importing any other data source or file: simply click 'Get Data', import, then link in the relationships.
Note: You do not need to follow this section. This method will only work if you have a database to connect to and that database includes a Date table. You will additionally need to modify the T-SQL statement accordingly.
Click 'Get Data', then navigate to 'SQL Database' and open the 'Advanced Options'.
Note: this allows you to write a SQL query manually.
Enter the following syntax:

select * from dbo.DimDate
where FullDate >= '2015-01-01' and FullDate < '2018-01-01'

After inserting the new table, navigate to the 'Relationships' perspective to create the relevant relationships.
Note: Ensure the 'Cross Filter Direction' is set to 'Both'. This is the most common, default direction; it means that for filtering purposes, both tables are treated as if they were a single table. More information on cross filtering can be found in the Power BI documentation.

7B) Extracting Date Table from Columns or Measures
A second alternative to the above methods is to extend an existing table with date information extracted from existing data using DAX's text and format functions. This is most likely used in scenarios that need a 'quick and dirty' result, saving us from having to import another source. An example would be to turn Orders[DateKey] into three calculated columns, Day, Month and Year, via:

Days = RIGHT(Orders[DateKey], 2)
Month = MID(Orders[DateKey], 5, 2)
Year = LEFT(Orders[DateKey], 4)

Or by first breaking Orders[DateKey] down into sections and then concatenating the string back together using a combination of DAX functions:

OrderDate = CONCATENATE(CONCATENATE(CONCATENATE(CONCATENATE(RIGHT(Orders[DateKey], 2), "/"), MID(Orders[DateKey], 5, 2)), "/"), LEFT(Orders[DateKey], 4))

Note: This method is not necessary for manipulating dates, because of the vast set of time intelligence functions in DAX; however, these principles apply well to text. E.g. extracting details from an email address, such as emails ending in ".co.uk" indicating a "UK" customer.

8) Filtering
Generally, BI looks at numbers (i.e. facts - elements that can be measured) against descriptions (i.e. dimensions - elements that provide context); for example, looking at the total 'Orders Value' by multiple 'Market Names'.
Due to this nature, it is essential to enable users to filter facts by descriptions to provide context-specific data.
In Power BI there are a number of filtering features. It is important to note each method and know how each is used individually, but also how they can be combined to drive an interactive report:
- DAX filters - filters programmed into calculations
- Slicers - filters that are visually represented on the report page
- Visual level filters - filters that are off the page and only affect a specific visual
- Page level filters - filters that are off the page and affect all visuals on the report page
- Report level filters - filters that are off the page and affect all visuals on all report pages

8A) DAX Filters:
DAX can be used to create calculations that take a filter context into account; thus DAX can create filtered elements such as measures that evaluate the order value for a specific year. To do this, on the ribbon click on the dropdown labelled 'New Measure' and select 'New Measure' from the list. In the formula bar that appears, insert the following DAX script:

2014 Orders = CALCULATE(SUM(Orders[OrderValue]), 'Calendar'[Year] = 2014, Orders[LostReasonKey] = -2)

Note: the CALCULATE function runs an expression (i.e. the sum of the value) where specific filter contexts apply, i.e. filtering down to the given 'Year' and then further filtering down to only show "LostReason = -2", which means the order was "Not lost". This context illustrates the importance of filtering the data. As seen above, the aggregate of 'OrderValue' as the standard column from the 'Orders' table is meaningless without filtering out the lost orders and providing a time context.
Repeat these steps to create measures for both '2014 Orders' and '2015 Orders'. The two new measures should now be available in the field list. If the measures are not created in the 'Orders' table, find them and then move them onto the Orders table.
This is done by clicking on the measure in the fields pane and then, whilst the element is highlighted, going to the 'Modeling' tab on the ribbon and, under the 'Properties' section, changing the 'Home Table' through the dropdown.
Note: Measures can be moved between tables as they are calculated entities and therefore not dependent on any table. Ideally measures should be kept on a table that represents a logical relationship; for example, keeping the filtered '2015 Orders' and '2014 Orders' measures in the 'Orders' table. Similar DAX filtering can also be applied when creating calculated columns or tables.
Add a new page to the report. Then drag out the '2015 Orders' measure into a table. Then, from the 'Product' table, add 'ProductGroupDesc' to the table. Now the order value is split into the relevant product groups. This shows that although measures are aggregates, they can still be split by attributes.

8B) Slicers:
Slicers are selected from the Visualizations pane as a type of visual. Like their counterparts, these visuals are added to the canvas and provide an interaction point for users. These on-canvas filters allow anyone to segment the data by particular values.
Note: Slicers are one of the few visuals that can't have visual level filters applied to them; however, a page or report level filter does affect slicers. If you must filter a slicer, a workaround may be to use a hidden slicer to filter the user-facing slicer, or to use a filtered calculated column rather than the original comprehensive column.
As two separate visuals, insert slicers from the 'Calendar' table on 'Date' and another on 'Month Name'. Although they are one type of visual, these two visuals are presented and interact differently. This is because 'Date' is a 'Date/time' datatype and 'Month Name' is a 'Text' datatype. This highlights why it's important to check every column and its datatype when initially inserting the table into the model.
Also note the 'Date' slicer is presenting "01/02/2012" to "31/12/2016", so the table showing the measure '2015 Orders' is presented. If we change the range to exclude 2015 dates, such as "01/01/2016" to "31/12/2016", no data will be returned for the measure. This is because the measure is already filtered through DAX, so the new slicers only add extra conditions on top of the calculation.
Note: The form of the slicer depends on the datatype of the field selected, e.g. text columns will show up as check-box style slicers, whereas a date/time column will display a timeline style slicer and a date picker.
Note: A 'Search' box can also be added to slicers on text fields. This is very useful for filters such as 'product code', as some shop floor and telesales users will know specific codes.

8C) Visual, Page and Report Level Filters:
These three levels of filtering are off-the-page filters. They essentially function like slicers but appear on a collapsible pane. The three versions do as their names suggest, each filtering either a single visual, all the visuals on the page, or all visuals on all pages of a report.
Simply drag-and-drop 'Sector' from the 'Markets' table onto the 'Page Level Filters' area under the 'Filters' section on the 'Visualizations' pane.
A good example of these filters being used is when creating an annual report. There is no need to filter each calculation or each visual; we just apply a report level filter to the pack, ensuring all the numbers reflect the year in question. An example where this is not a great fit is an ad-hoc report where not all visuals run on the same context; a page or report level filter can then become confusing for users.
Note: In most ad-hoc style reports, designers apply filters as they develop. This approach is okay for a power user, but it is not ideal for wider business users. Filters should be grouped together, i.e. utilize page or report level filters as much as possible.
This is because modifications to the visuals then become centralized and there is no need to check each visual individually. Planning is key here; it's easy to tell which reports are well planned just from this simple concept.
As standard, the user can also add 'Advanced Filtering' options to the fields. These allow for conditional filtering, much like the options previously seen in the likes of Excel. On a visual level, this is taken further: as the visual is isolated, this allows for greater control over its filtering. Here we also see more advanced options such as the 'Top N' filter. Previously DAX was needed to calculate this; now it is a standard feature in the tool.
Note: The 'Top N' filter was voted in by the Power BI community as a common requirement and thus developed into the tool.

9) Tooltips
As users hover over visuals, tooltips open a temporary window to display relevant information. Due to the small display area, tooltips are limited to displaying aggregated numbers, first/last values, or the count/distinct count of records.
DAX can provide a workaround for this limitation, as calculations can convert rows of data into a string separated by delimiters to form a measure. Having collapsed the rows into a single string, the measure can display the full information within a tooltip. A variety of DAX calculations could be used to do this; let's examine some of them to see the differences and learn more about the arrangement of DAX.
The simple solution would be to collapse all the rows into a single string.
Note: DAX measure to display multiple values split by the delimiter ", ":

Tooltip1 = CALCULATE(CONCATENATEX(VALUES('Stock'[Subcategory]), 'Stock'[Subcategory], ", "))

The weakness of this method is that when the number of values to display is very high, details are lost or the tooltip may not display at all. The calculation below limits the number of values returned and displays the first 3 values.
If additional values are present, it also follows the list with " and more…".
Note: This measure currently displays 3 values and a note; however, it could be modified to display any set number, or even a variable number.

Tooltip2 =
VAR itemcount = DISTINCTCOUNT(StockItem[Color])
RETURN
    IF(itemcount > 3,
        CONCATENATEX(TOPN(3, VALUES(StockItem[Color])), StockItem[Color], ", ") & " and more…",
        CALCULATE(CONCATENATEX(VALUES(StockItem[Color]), StockItem[Color], ", ")))

To advance on the last step, the DAX can be further improved to show the top 3 values rather than the first 3 values; this is more likely to be required in the business world. This is a great example of the power of DAX, and it also shows that users can layer logic to advance their calculations. Advanced logic can be gained by layering up simple functions.
Note: In DAX, "VAR" sets a variable; variables optimize performance, among other benefits. However, there is no need to worry about such advanced elements; as long as you get the right numbers, you're in a good position.

Tooltip3 =
VAR SubcategoriesCount = DISTINCTCOUNT('Stock'[Subcategory])
RETURN
    IF(SubcategoriesCount >= 3,
        CALCULATE(CONCATENATEX(TOPN(3, VALUES('Stock'[Subcategory])), 'Stock'[Subcategory], ", ")) & " and more…",
        CALCULATE(CONCATENATEX(VALUES('Stock'[Subcategory]), 'Stock'[Subcategory], ", ")))

10) Row Level Security
Row level security in Power BI is configured in Power BI Desktop at the data model level and then published. Through the Power BI Service, the logged-in user is then identified through Active Directory and aligned with the roles and rules, so that they are only presented with the permitted data.
Rule - the DAX formula that limits data visibility on each table, e.g. Orders[Region] = "UK"
Role - the user group that the rules affect, e.g. "UK Market"
Under the 'Modeling' tab click on 'Manage Roles'. This wizard will assist you with creating the roles.
In this example, we need three roles based on the table EBSType. These should be defined as:

Role: AirCare - Table: EBSType - Rule: [EBSType] = "AirCare"
Role: Non EBS - Table: EBSType - Rule: [EBSType] = "Non EBS"
Role: Total Solutions - Table: EBSType - Rule: [EBSType] = "Total Solutions"

Note: The checkmark on the top right validates the DAX expression for syntax errors. Also note, at a later stage the members for these groups are added through the Power BI Service.
To test these roles out, we can create a new page by clicking the yellow "+" tab on the bottom left side of the canvas, then creating a table with EBSType[EBSType], Orders[Count] and Orders[OrderValue]. Under the 'Modeling' tab click on 'View As Roles', select the role "AirCare", then click 'OK'. You should now see that the orders table is limited to only show lines associated with "AirCare". Click "Stop Viewing" when you have finished your testing.
Note: Multiple roles can be selected for preview at one time.
Assuming we have finalised the report, we press the 'Publish' button to send the model and report to the Power BI Service (go to powerbi.com). In the service, in the 'Navigation' pane on the left, hover over the new dataset we just published, click the ellipsis "…" button to open the options, then select 'Security'.
Note: As you can see, the ellipsis "…" option allows for important configurations for each dataset. The ellipses are also available on Reports and Dashboards with various settings. To better understand these options, experiment with them.
Here we add users to the roles.
Note: Refer to the section 'Creating a Demo Environment' to see how to set up an environment with test users. If you've already done this, or you are using a pre-configured setup, then you will see users pop up as you start to type in names or email addresses.
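The three roles above are static: each role hard-codes one segment. An alternative worth knowing is a single dynamic role whose rule filters per signed-in user. The sketch below is a hypothetical example, not part of this model: it assumes a Users table with an [Email] column holding each user's login address.

```DAX
-- Hypothetical dynamic rule, applied to an assumed Users table.
-- USERPRINCIPALNAME() returns the signed-in user's login (UPN) in the
-- Power BI Service, so each user only sees rows matching their own address.
[Email] = USERPRINCIPALNAME()
```

One such role can replace a separate role per segment, at the cost of maintaining the Users table and its relationships.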
Limitations: Row level security is only available for data imported into the model via Power BI Desktop, or for DirectQuery sources other than SQL Server Analysis Services. Also, row level security for Q&A and Cortana is not yet available but is on the roadmap. CRM and AX7 carry their security settings forward into Power BI; you will, however, need to thoroughly research your specific versions and setups - eBECS can help if required.

What is DAX?
DAX is a collection of functions, operators, and constants that can be used in a formula, or expression, to calculate and return one or more values. Stated more simply, DAX helps you create new information from data already in your model.

Why is DAX so important?
You can create reports that show valuable insights without using any DAX formulas at all. But what if you need to analyse growth percentage across product categories and for different date ranges? Or calculate year-over-year growth compared to market trends? DAX formulas provide these capabilities and many others. Learning how to create effective DAX formulas will help you get the most out of your data. Once you have the information you need, you can begin to solve the business problems that affect your bottom line.

DAX Expressions vs. Excel Formulas
In Excel, we reference cells or arrays. DAX, however, is much more like relational data: we reference only tables, columns, or columns that have been filtered. Excel also has fewer data types than are available in DAX; some Excel data types have an implicit conversion to a data type supported within DAX.

DAX Syntax
DAX expressions contain values, operators and functions. Expressions always begin with "=". Operators have a precedence order; however, this can be modified by using parentheses "( )".

DAX Operators
Operators are of four types:
Arithmetic - Addition "+". Subtraction "-". Multiplication "*".
Division "/". Exponentiation "^".
Comparison - Equal "=". Greater/Lesser ">" / "<". Greater/Lesser or equal ">=" / "<=". Not equal "<>".
Text concatenation - Ampersand "&", e.g. [First Name] & [Last Name] & " from " & [Country]
Logical - combine expressions into a single unit:
Logical 'AND' "&&": if both expressions evaluate to true then true, otherwise false. E.g. (([Cost] > 125.00) && ([State] = "TX"))
Logical 'OR' "||": if either expression evaluates to true then true, otherwise false. E.g. (([Cost] > 125.00) || ([State] = "TX"))

DAX Datatypes
DAX has a variety of datatypes. Implicit conversion occurs on import, and users can change or set datatypes. This is then enhanced via data formats, enabling additional abilities when using these values, e.g. setting the 'calendar year' column to data type [DateTime] and format [(yyyy)] so that only the year is displayed.
Most functions need parameters to be passed in, and these parameters mostly need to be of a certain datatype. In most cases DAX will perform an implicit cast to transform the value into a suitable format; if this fails, an error is returned.
DAX datatypes: Whole Number. Decimal Number. Currency. Date. Text. True/False (i.e. Boolean). N/A. Table (a datatype used by functions, e.g. the SUMX function).

DAX Errors
There are two types of errors that occur when creating DAX expressions:
Syntactical errors (easiest to deal with), e.g. a forgotten comma or an unclosed parenthesis.
Semantic errors (more difficult), e.g. referring to a non-existent column, function or table.

DAX Functions
Mathematical functions in DAX are very much like their siblings in Excel. Logical functions (by far the most commonly utilized) include the IF and OR statements. Information functions often test values as an input to another function, and are commonly nested inside other functions.
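As a quick illustration of the nesting just described, the sketch below combines an information function (ISBLANK) inside a logical function (IF) together with the logical operators from the examples above. The Orders[Cost] and Orders[State] column names are illustrative, echoing the [Cost] and [State] examples, not columns from this model:

```DAX
-- Flag a row for review when its cost is missing, or when it is a
-- high-value Texas order; ISBLANK is nested inside IF as part of its test.
Order Flag =
IF (
    ISBLANK ( Orders[Cost] )
        || ( Orders[State] = "TX" && Orders[Cost] > 125.00 ),
    "Review",
    "OK"
)
```

The pattern is typical: information functions rarely stand alone and usually feed a logical function that decides the output.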
DAX References
To build on functions, please refer to the DAX function reference site, which provides detailed information including syntax, parameters, return values, and examples for each of the 200+ DAX functions.

DAX Evaluation Context
Calculated columns look at every row throughout the entire table using a row context. Measures (a.k.a. metrics) utilize individual cells within the table, so the evaluation context of a measure depends on the function; the functions declare which individual cells are used.
Note: Measures are preferred over calculated columns wherever possible, as they compute faster and allow the evaluation context to be modified through filters or the CALCULATE function.
Evaluation context does not pass relationships between tables by itself. E.g. if we tried =Table1[Column1] + Table2[Column2] to create a calculated column from two different tables, a syntactical error would be generated, even with established relationships between the tables. The workaround is to use the RELATED function; the correct syntax in this case would be =Table1[Column1] + RELATED(Table2[Column2]). The RELATED function passes the row context through to the related table for each row of the calculated column.
The ALL function removes filter context from either the table or the column(s) passed in. It can be thought of as "remove all filters from a table, or from a table and columns". Syntax layout: ALL( {<table> | <column>[, <column>[, …]]} ).
The FILTER function performs a Boolean evaluation for each row of the table. Syntax layout: FILTER( <Table>, <Filter> ).

Measures vs. Columns vs. Tables
DAX can output three key results:
- Calculated measures
- Calculated columns
- Calculated tables
When using DAX, a key foundation you must understand is when to use measures, columns and tables.
Calculated measures return a value; this can be a number, date or text.
Note: Generally, best practice dictates that when results are numeric and thus work with aggregation, we should use measures. These are processed on the fly and so take up fewer resources (although cases may differ). E.g. to calculate gross profit in £ via "Gross Profit = SUMX(Sales, Sales[Revenue] - Sales[Cost of Goods Sold])".

Calculated columns return an additional value for each row of the table.
Note: Generally, best practice dictates using columns when the results are unique for each row and cannot be aggregated. This generally covers cases where the outcome is a date or text, or needs conditional formatting. As these columns are calculated and stored on the table, they take up more resources than measures. E.g. to extract the month name out of a full date, we first extract the month number and then translate it into text via "Month Name = SWITCH([Month Number], 1, "Jan", 2, "Feb", 3, "Mar", …, BLANK())".

Calculated tables return a table of rows based on a new calculation or an extract from a related table.
Note: Generally these are used when we need to restructure data, group it, or extract other list forms from existing datasets. E.g. to create a list of customers from historical purchase records, we could extract each distinct customer code via "Customers = DISTINCT('Table'[Customer Code])".

Measures, Columns and Tables working together
In this case example we have a table with transactional values and their corresponding dates; to gain better granularity on our date filtering, we can create a calendar table and link it to the original transactional table.
Note: Here we are required to generate a table consisting of rows with unique values based on the transaction dates recorded in our original dataset.
First, we create two calculated measures on the transactional table to find the earliest and latest dates present, via "Earliest Date = MIN('Table'[Date])" and "Latest Date = MAX('Table'[Date])".
Second, we create a calculated table that uses the measures to define the start and end of the date range for the function that generates the table, via "Calendar = CALENDAR([Earliest Date], [Latest Date])".
Then we connect the two tables using the [Date] columns, hiding the transactional [Date] column and subsequently also hiding the two measures from report view in the transactional table.
Finally, we can create calculated columns to break the Calendar[Date] column down into columns such as [Year], via "Year = YEAR('Calendar'[Date])".

Aggregators vs. Iterators
i.e. "SUM()" vs. "SUMX()"
First, let's start with what both of these functions are: SUM is an aggregator and SUMX is an iterator. They can both end up giving you the same result, but they do it in very different ways.
In short, SUM() operates over a single column of data to give you the result (the aggregation of the single column). SUMX(), on the other hand, is capable of working across multiple columns in a table. It will iterate through a table, one row at a time, complete a calculation (like Quantity x Price Per Unit), and then add up the totals of all the row level calculations to get the grand total.

e.g. "SUMX()"
If your Sales table contains a column for Quantity and another column for Price Per Unit, then you will necessarily need to multiply Quantity by Price Per Unit in order to get Total Sales. It is no good adding up the total quantity SUM(Quantity) and multiplying it by the average price AVERAGE(Price Per Unit), as this will give the wrong answer. If your data is structured in this way, then you simply must use SUMX(), via "Total Sales = SUMX(Sales, Sales[Qty] * Sales[Price Per Unit])".
Note: You can always spot an iterator function, as it always has a table as the first input parameter. This is the table that is iterated over by the function.

e.g. "SUM()"
If your data contains a single column with the extended Total Sales for each line item, then you can use SUM() to add up the values. There is no need for an iterator in this example because it is just a simple calculation across a single column. Note, however, that you 'could' still use SUMX() and it will give you the same answer, via "Total Sales Alternate = SUMX(Sales, Sales[Total Sales])".

Totals Don't Add Up!
There is another, less obvious use case where you will need SUMX. When you encounter this problem, you will need an iterator to solve it.
In this example, the data shows 10 customers' shopping behavior over a 14-day period. Each customer shops on a different number of days (anywhere from 5 to 9 days in this example). The column "Average Visits Per Day 1" calculates how many times each customer shopped on average (for the days they actually shopped). Customer 10001 shopped on average 1.3 times each day, and customer 10002 only came in once for each day shopped.
But do you spot the problem? The grand total of 6.8 doesn't make any sense. The average across all customers is not 6.8 (it is actually 1.4). The problem is that the grand total is calculated as 95 (total visits) divided by 14 (total days). Aggregating the visits and the count of days BEFORE calculating the average means that you lose the uniqueness of the customer level detail.
This problem would not occur with an iterator, because an iterator calculates each line one at a time. The aggregator, on the other hand, adds up all the numbers BEFORE the calculation - effectively losing the line level detail required to do the correct calculation.
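One way the grand-total problem just described could be addressed is with the iterator AVERAGEX, which computes each customer's average first and only then averages those results. The sketch below is illustrative only: it assumes a Visits table with [Customer], [Date] and [Visits] columns, which may not match the actual column names in the example data.

```DAX
-- Iterate over the distinct customers; for each one, CALCULATE turns the
-- current row into a filter, so the inner expression sees only that
-- customer's rows. The grand total then averages the per-customer
-- averages (about 1.4 in the example) instead of dividing total visits
-- by total days (6.8).
Average Visits Per Day 2 =
AVERAGEX (
    VALUES ( Visits[Customer] ),
    CALCULATE (
        SUM ( Visits[Visits] ) / DISTINCTCOUNT ( Visits[Date] )
    )
)
```

At the detail level this returns the same per-customer figures as before; only the totals row changes, because the iteration preserves the customer-level detail that the plain aggregator threw away.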
Lookups in DAX
When you define a calculated column, you are writing a DAX expression that will be executed in a row context. Since USERELATIONSHIP requires a CALCULATE, and CALCULATE applies a context transition when executed within a row context, obtaining the expected behavior is not easy.

Applying USERELATIONSHIP to RELATED
If you create a calculated column in FactInternetSales, you might want to use RELATED while choosing the relationship to use. Unfortunately, this is not possible. For example, if you want to denormalize the day name of week of the order date, you write:

FactInternetSales[DayOrder] = RELATED ( DimDate[EnglishDayNameOfWeek] )

But what if you want to obtain the day name of week of the due date? You cannot use CALCULATE and RELATED together, so you have to use this syntax instead:

FactInternetSales[DayDue] =
CALCULATE (
    CALCULATE (
        VALUES ( DimDate[EnglishDayNameOfWeek] ),
        FactInternetSales
    ),
    USERELATIONSHIP ( DimDate[DateKey], FactInternetSales[DueDateKey] ),
    ALL ( DimDate )
)

There are two CALCULATEs required in this case: the outermost CALCULATE applies the USERELATIONSHIP to the innermost CALCULATE, and the ALL ( DimDate ) filter removes the existing filter that would be generated by the context transition. The innermost CALCULATE applies FactInternetSales to the filter condition and, thanks to the active USERELATIONSHIP, its filter propagates to the lookup DimDate table using the DueDateKey relationship instead of the OrderDateKey one.
Even though this syntax works, I strongly discourage you from using it, because it is hard to understand and it is easy to write wrong DAX code here. A better approach is to use LOOKUPVALUE instead, which does not require the relationship at all.
FactInternetSales[DayDue] =
LOOKUPVALUE (
    DimDate[EnglishDayNameOfWeek],
    DimDate[DateKey],
    FactInternetSales[DueDateKey]
)
Test your understanding – use File: SourceA
This source looks at basic financial data to keep things simple. Use your new Power BI and DAX skills to create the required calculations. Note: the outcomes are already in the file, so check that you have reached the same figures.
SourceA
Get Data > "SourceA"
Then try to create the following:
Q1 – Create a 'Gross Sales' calculated column
[Units Sold] * [Sales Price]
Q2 – Create a 'Net Sales' measure
( [Units Sold] * [Sale Price] ) - [Discounts]
Q3 – Create a 'Profit' measure
=[Sales] - [COGS]
Q4 – Create a Calendar/Date table
Q4a – Split the Date into Day, Month and Year using the appropriate functions
=DAY([Date])
=MONTH([Date]) <OR> =FORMAT([Date],"MMM")
=YEAR([Date]) <OR> =FORMAT([Date],"YYYY")
Q4b – Replace the month number with month names
=SWITCH([MonthNum], 1, "Jan", 2, "Feb", 3, "Mar", 4, "Apr", 5, "May", 6, "Jun", 7, "Jul", 8, "Aug", 9, "Sep", 10, "Oct", 11, "Nov", 12, "Dec", BLANK())
Q4c – Insert quarters as a calculated column
=CONCATENATE("Q",ROUNDUP(MONTH([Date])/3,0))
Q4d – Insert the week numbers as a calculated column
=CONCATENATE("Week ",RIGHT(CONCATENATE("0",WEEKNUM([Date])),2))
Q4e – Grab the system date as a measure
=TODAY()
Creating a Demo Environment
A demo environment can be created to test most Power BI features and to experiment with the service, apart from the gateways, which need further setup and configuration. The demo environment can be created with the following steps: In a search engine, look for "Free Trial: Microsoft Dynamics CRM". Sign up for a free 30-day trial. This will be the main admin account, so you can also test the admin features. In the Office 365 Admin Centre you can create a few test users. Ensure that the Product Licenses for Power BI are enabled on these user accounts.
Finally, go into Power BI online, log in using these accounts, and upgrade them to a free 60-day trial of Power BI Pro. In the Power BI Service, a workspace can then be created to include the members added.
Data Mashup – Advanced
This exercise looks at combining data from a range of flat files and then exploring that data through intermediate to advanced calculations. Through this exercise we shall explore DAX, M queries and R.
1) Get Data
Note: use the folder "FX"
In a new Power BI file, go to 'Get Data', 'File', 'Folder', 'Browse', then select "FX Samples\Data Sources".
Note: Four options are presented at this point:
'Edit' will open the 'Query Editor' with the list of details for the binary files, i.e. .xlsx
'Load' will import the list of details for the binary files, i.e. .xlsx, into the data model
'Combine & Load' will open the data model and automatically set up the binary files as appended/combined data sources, as long as the files' columns structurally match
'Combine & Edit' will open the 'Query Editor' and automatically set up the binary files as appended/combined data sources, as long as the files' columns structurally match
Select 'Combine & Edit', then in 'select the object to be extracted from each file' select "Sheet1". Note: this step allows the selection of various sheets/objects if these are present in the files; in this case there is only one sheet available in the data files.
Note: in the 'Queries' area, in the left-hand pane, notice the various queries generated by selecting this option. Folders have also been generated here to organize the new elements. Under the 'Other Queries' folder, the query 'Data Sources' is the result of the combine effort. Also note that a new column, 'Source.Name', has been generated to help identify the source of each row.
Note: The next step is to clean up the files. Here there are two options:
Edit/clean the result of the query – under the folder 'Other Queries', start adding steps onto the 'Data Sources' query.
Edit/clean each file before the query systematically – use the folder 'Transform File from Data Sources' and the sample query to programmatically apply transformations to each file.
Note: There are four elements generated here:
Sample File Parameter – hosts the location of each file in the M query for the combine function
Sample File – indicates which of the files from the folder is presented as the sample file
Transform Sample File from Data Sources – the sample file that the steps are added to
Transform File from Data Sources – the function that results from the steps added to the sample file
Select 'Transform Sample File from Data Sources', then from the ribbon, under the 'Query' section, select 'Advanced Editor'. In the editor, paste the following M syntax:
let
    Source = Excel.Workbook(#"Sample File Parameter1", null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    #"Replace ""null"" with Blank" = Table.ReplaceValue(#"Promoted Headers",null,"",Replacer.ReplaceValue,{"GRG", "Gambling", "Company Number", "Company Name", "Client Type", "Deal Number", "Option Leg Number", "Linked Deal Number", "Original Deal Number", "Department Code", "Department Name", "Trade Date", "Input Date", "Value Date", "Client Group Code", "Client Group Name", "Client Code", "Client Name", "Client Contact", "Client Country ID", "Client Country Code", "Client Country", "Industry Code", "Industry", "Client Size Category Group", "Client Size Category", "Client Size Sub-Category", "First Trade Date", "First Trade Year", "First Trade Month", "First Trade Deal Number", "Client Registration Date", "FCE Contract Type", "FCE Sub Contract Type", "GRP Contract Type Code", "Contract Type", "Contract Type Group", "Broker Code", "Broker Name", "GRP Method Code", "GRP Medium Code", "GRP Lead Source Code", "Main Source", "Sub Source", "Partner Type", "Partner Group Code", "Partner Group Name",
"Partner Code", "Partner Name", "Reason for Trade", "Dealer ID", "Dealer Name", "Salesman ID", "Salesman Name", "Profit Currency", "Buy Currency Code", "Buy Currency", "Buy Amount", "Sell Currency Code", "Sell Currency", "Sell Amount", "Option Notional Currency", "TX Fee Nominal", "TX Fee Currency", "28 Day Profit Marker", "Buy / Sell", "Client Buy (Profit Ccy)", "Client Sell (Profit Ccy)", "Broker Buy (Profit Ccy)", "Broker Sell (Profit Ccy)", "Option Notional (Notional Ccy)", "Option Premium - Bank Side (Profit Ccy)", "Option Premium - Client Side (Profit Ccy)", "Category Fee (Profit Ccy)", "Cross Refrence Fee (Profit Ccy)", "TX Fee (TX Fee Ccy)", "Partner Commission (Profit Ccy)", "Average FX Rate (Profit Ccy)", "Average FX Rate (Option Notional Ccy)", "Average FX Rate (TX Ccy)", "Notional", "Turnover", "Cost of Sales", "Total Trading Revenue", "Category Fee", "Cross Refrence Fee", "TX Fee", "Partner Commission", "Net Profit"}), #"Replace ""NULL"" with Blank" = Table.ReplaceValue(#"Replace ""null"" with Blank","NULL","",Replacer.ReplaceValue,{"GRG", "Gambling", "Company Number", "Company Name", "Client Type", "Deal Number", "Option Leg Number", "Linked Deal Number", "Original Deal Number", "Department Code", "Department Name", "Trade Date", "Input Date", "Value Date", "Client Group Code", "Client Group Name", "Client Code", "Client Name", "Client Contact", "Client Country ID", "Client Country Code", "Client Country", "Industry Code", "Industry", "Client Size Category Group", "Client Size Category", "Client Size Sub-Category", "First Trade Date", "First Trade Year", "First Trade Month", "First Trade Deal Number", "Client Registration Date", "FCE Contract Type", "FCE Sub Contract Type", "GRP Contract Type Code", "Contract Type", "Contract Type Group", "Broker Code", "Broker Name", "GRP Method Code", "GRP Medium Code", "GRP Lead Source Code", "Main Source", "Sub Source", "Partner Type", "Partner Group Code", "Partner Group Name", "Partner Code", "Partner Name", 
"Reason for Trade", "Dealer ID", "Dealer Name", "Salesman ID", "Salesman Name", "Profit Currency", "Buy Currency Code", "Buy Currency", "Buy Amount", "Sell Currency Code", "Sell Currency", "Sell Amount", "Option Notional Currency", "TX Fee Nominal", "TX Fee Currency", "28 Day Profit Marker", "Buy / Sell", "Client Buy (Profit Ccy)", "Client Sell (Profit Ccy)", "Broker Buy (Profit Ccy)", "Broker Sell (Profit Ccy)", "Option Notional (Notional Ccy)", "Option Premium - Bank Side (Profit Ccy)", "Option Premium - Client Side (Profit Ccy)", "Category Fee (Profit Ccy)", "Cross Refrence Fee (Profit Ccy)", "TX Fee (TX Fee Ccy)", "Partner Commission (Profit Ccy)", "Average FX Rate (Profit Ccy)", "Average FX Rate (Option Notional Ccy)", "Average FX Rate (TX Ccy)", "Notional", "Turnover", "Cost of Sales", "Total Trading Revenue", "Category Fee", "Cross Refrence Fee", "TX Fee", "Partner Commission", "Net Profit"}), #"Replace ""n/a"" with Blank" = Table.ReplaceValue(#"Replace ""NULL"" with Blank","n/a","",Replacer.ReplaceValue,{"GRG", "Gambling", "Company Number", "Company Name", "Client Type", "Deal Number", "Option Leg Number", "Linked Deal Number", "Original Deal Number", "Department Code", "Department Name", "Trade Date", "Input Date", "Value Date", "Client Group Code", "Client Group Name", "Client Code", "Client Name", "Client Contact", "Client Country ID", "Client Country Code", "Client Country", "Industry Code", "Industry", "Client Size Category Group", "Client Size Category", "Client Size Sub-Category", "First Trade Date", "First Trade Year", "First Trade Month", "First Trade Deal Number", "Client Registration Date", "FCE Contract Type", "FCE Sub Contract Type", "GRP Contract Type Code", "Contract Type", "Contract Type Group", "Broker Code", "Broker Name", "GRP Method Code", "GRP Medium Code", "GRP Lead Source Code", "Main Source", "Sub Source", "Partner Type", "Partner Group Code", "Partner Group Name", "Partner Code", "Partner Name", "Reason for Trade", "Dealer ID", 
"Dealer Name", "Salesman ID", "Salesman Name", "Profit Currency", "Buy Currency Code", "Buy Currency", "Buy Amount", "Sell Currency Code", "Sell Currency", "Sell Amount", "Option Notional Currency", "TX Fee Nominal", "TX Fee Currency", "28 Day Profit Marker", "Buy / Sell", "Client Buy (Profit Ccy)", "Client Sell (Profit Ccy)", "Broker Buy (Profit Ccy)", "Broker Sell (Profit Ccy)", "Option Notional (Notional Ccy)", "Option Premium - Bank Side (Profit Ccy)", "Option Premium - Client Side (Profit Ccy)", "Category Fee (Profit Ccy)", "Cross Refrence Fee (Profit Ccy)", "TX Fee (TX Fee Ccy)", "Partner Commission (Profit Ccy)", "Average FX Rate (Profit Ccy)", "Average FX Rate (Option Notional Ccy)", "Average FX Rate (TX Ccy)", "Notional", "Turnover", "Cost of Sales", "Total Trading Revenue", "Category Fee", "Cross Refrence Fee", "TX Fee", "Partner Commission", "Net Profit"}), #"Replace ""0"" with blank on text columns" = Table.ReplaceValue(#"Replace ""n/a"" with Blank",0,"",Replacer.ReplaceValue,{"GRG", "Gambling", "Company Number", "Company Name", "Client Type", "Deal Number", "Option Leg Number", "Linked Deal Number", "Original Deal Number", "Department Code", "Department Name", "Trade Date", "Input Date", "Value Date", "Client Group Code", "Client Group Name", "Client Code", "Client Name", "Client Contact", "Client Country ID", "Client Country Code", "Client Country", "Industry Code", "Industry", "Client Size Category Group", "Client Size Category", "Client Size Sub-Category", "First Trade Date", "First Trade Year", "First Trade Month", "First Trade Deal Number", "Client Registration Date", "FCE Contract Type", "FCE Sub Contract Type", "GRP Contract Type Code", "Contract Type", "Contract Type Group", "Broker Code", "Broker Name", "GRP Method Code", "GRP Medium Code", "GRP Lead Source Code", "Main Source", "Sub Source", "Partner Type", "Partner Group Code", "Partner Group Name", "Partner Code", "Partner Name", "Reason for Trade", "Dealer ID", "Dealer Name", "Salesman 
ID", "Salesman Name", "Profit Currency", "Buy Currency Code", "Buy Currency", "Sell Currency Code", "Sell Currency"}), #"Replace Blank with ""0"" on num cols" = Table.ReplaceValue(#"Replace ""0"" with blank on text columns","",0,Replacer.ReplaceValue,{"Notional", "Turnover", "Cost of Sales", "Total Trading Revenue", "Category Fee", "Cross Refrence Fee", "TX Fee", "Partner Commission", "Net Profit"}), #"Changed Type" = Table.TransformColumnTypes(#"Replace Blank with ""0"" on num cols",{{"GRG", type text}, {"Gambling", type text}, {"Company Number", type text}, {"Company Name", type text}, {"Trade Date", type date}, {"Input Date", type date}, {"Value Date", type date}, {"First Trade Date", type date}, {"Client Registration Date", type date}, {"Buy Amount", type number}, {"Sell Amount", type number}, {"Option Notional Currency", type number}, {"TX Fee Nominal", type number}, {"TX Fee Currency", type number}, {"Notional", type number}, {"Turnover", type number}, {"Cost of Sales", type number}, {"Total Trading Revenue", type number}, {"Category Fee", type number}, {"Cross Refrence Fee", type number}, {"TX Fee", type number}, {"Partner Commission", type number}, {"Net Profit", type number}, {"Client Type", type text}, {"Deal Number", type text}, {"Option Leg Number", type text}, {"Linked Deal Number", type text}, {"Original Deal Number", type text}, {"Department Code", type text}, {"Department Name", type text}, {"Client Code", type text}, {"Client Name", type text}, {"Client Contact", type text}, {"Client Country ID", type text}, {"Client Country Code", type text}, {"Client Country", type text}, {"Industry Code", type text}, {"Industry", type text}, {"Client Size Category Group", type text}, {"Client Size Category", type text}, {"First Trade Deal Number", type text}, {"FCE Contract Type", type text}, {"FCE Sub Contract Type", type text}, {"GRP Contract Type Code", type text}, {"Contract Type", type text}, {"Contract Type Group", type text}, {"Broker Code", type text}, 
{"Broker Name", type text}, {"GRP Method Code", type text}, {"GRP Medium Code", type text}, {"GRP Lead Source Code", type text}, {"Main Source", type text}, {"Sub Source", type text}, {"Partner Type", type text}, {"Partner Group Code", type text}, {"Partner Group Name", type text}, {"Partner Code", type text}, {"Partner Name", type text}, {"Reason for Trade", type text}, {"Dealer Name", type text}, {"Salesman Name", type text}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"GRG", "Gambling", "Company Name", "Client Type", "Deal Number", "Original Deal Number", "Department Name", "Trade Date", "Client Group Name", "Client Code", "Client Name", "Client Contact", "Client Country", "Industry", "Client Size Category Group", "First Trade Date", "First Trade Year", "First Trade Month", "Client Registration Date", "Contract Type", "Contract Type Group", "Broker Name", "Main Source", "Sub Source", "Partner Type", "Partner Group Code", "Partner Group Name", "Partner Name", "Reason for Trade", "Dealer Name", "Salesman Name", "28 Day Profit Marker", "Notional", "Turnover", "Cost of Sales", "Total Trading Revenue", "Category Fee", "Cross Refrence Fee", "TX Fee", "Partner Commission", "Net Profit"})
in
    #"Removed Other Columns"
Note: All the features in the Query Editor resolve to 'M' syntax. Here we have used an existing script to recreate steps achieved previously. Altering the M syntax can take the query to a more advanced level; however, it can also break the query if not executed correctly. Be careful, as the editor has neither Auto Save nor an Undo feature. Generally, it is advised to use the wizard to generate the M rather than manually inputting code, although, as you can now see, manual input sometimes has its advantages. At this point the steps have been applied to all the files, and the files have been combined to form 'Data Sources'. Select the outcome from 'Other Queries'.
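The script's cleanup passes follow one simple pattern, which can be sketched outside M. The Python below (column names and rows are illustrative, not the full FX column list) shows the same three moves: blank out placeholder text in text columns, default blanks to 0 in numeric columns, then change their type to numbers:

```python
# Illustrative rows standing in for the combined FX data
rows = [
    {"Client Name": "NULL", "Net Profit": ""},
    {"Client Name": "Acme", "Net Profit": "120"},
]
text_cols = ["Client Name"]   # columns treated as text in the M script
num_cols = ["Net Profit"]     # columns typed as numbers in the M script

for row in rows:
    for col in text_cols:
        if row[col] in ("null", "NULL", "n/a"):
            row[col] = ""     # Replace "null"/"NULL"/"n/a" with blank
    for col in num_cols:
        # Replace blank with 0, then change the type to a number
        row[col] = float(row[col]) if row[col] != "" else 0.0
```

The M script expresses each of these moves as one Table.ReplaceValue or Table.TransformColumnTypes step applied over an explicit column list, which is why the script looks long despite doing little.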
An error message appears: "The column 'Company Number' of the table wasn't found". To resolve this, in 'Applied Steps' select the last step, "Changed Type", and delete it. Then select all the columns and, from the 'Transform' tab, section 'Any Column', select 'Detect Data Type'. Note: The error occurs because the step tried to apply data types to columns from the source; since we removed some of those columns in our transformations, we need to overwrite this step so that it only references columns that still exist.
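The cause of the failure can be sketched in a few lines. In this hypothetical Python sketch, a recorded type map still references a column a later transformation removed; guarding on column existence mirrors what deleting the stale step and re-detecting data types achieves:

```python
# Illustrative table after "Removed Other Columns": "Company Number" is gone
table = {"Company Name": ["Acme"], "Net Profit": ["10"]}

# Stale "Changed Type" step: still references the removed column
stale_type_map = {"Company Number": str, "Net Profit": float}

for col, to_type in stale_type_map.items():
    if col in table:  # only convert columns that still exist
        table[col] = [to_type(v) for v in table[col]]
```

Without the existence check, looking up "Company Number" would raise an error, just as the unguarded Table.TransformColumnTypes step fails when its column list no longer matches the table.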