Let’s get a little bit deeper in Power Query by looking at the M language. If you’ve not seen our first article on creating queries in Power Query, you might want to check it out.

Creating Basic M Language Code and Functions

As we mentioned before, the recommended practice is to let Power Query generate the code. In some situations however, it might be interesting to add your own custom touch.

For instance, let’s start from the following basic Excel table where we have three columns (see Figure 1): ItemID (of food), QtyServed, QtyConsumed.

To keep it simple, we are starting from an Excel table, but what we are going to illustrate applies to virtually any data source supported by Power Query.

We would like Power Query to create an additional column, LeftOver.  LeftOver will be the difference between QtyServed and QtyConsumed, but we need the end result in percentage.

Right-click any cell of the table, go to the Power Query ribbon, and select from Table. Accept the default values. Power Query generates a query with the following code in the editor (see Figure 2).

Next, right-click any cell and select Insert Custom column in the Add Colum menu. This sets it up so we can create the LeftOver column (see Figure 3).

Now we will create the LeftOver custom column. It must be a percentage; the M language library provides a function Number.ToTest() where the second parameter is the formatting code. We will choose “p.” More information on this function here at the Microsoft Office site.

Right-click any cell and select Create Custom Column. Name it LeftOver. Fill in the following code (see Figure 4).

Click OK and you will get the expected result (see Figure 5).

Click the Home ribbon and the Apply & Close menu in order to apply the code to the worksheet (see Figure 6).

We can make our code reusable by creating a custom function called LeftoverFunc that will take two parameters (QtyServed and QtyConsumed) and return the difference in percentage.

Go to the Power Query ribbon, select the From Other Sources button and the Blank Query menu item (see Figure 7).

 

In View Advanced Editor (see Figure 9), type the following code:

Click done. We now have to add a new LeftOver Column to the initial query and to apply the function to each row.

Edit the query, add a new custom column (in the query editor, select Column-Insert Custom Column), (Figure 10) name it LeftOver2 (Figure 11).

 

Save the query modifications (File menu-Apply & Close) (see Figure 12).

It is important to keep in mind, that just like queries, functions can be shared (the data catalog) across the company.

On-Premises Data Transformations

Even if the sharing functionalities require Power BI licenses, Power Query can still play the role of an easy-to-use on-premises ETL.

Let’s take the following use case: We regularly receive log files that need to be parsed and transformed. We stored the log files in a specific folder (see Figure 13).

There are different types of log files (here event logs, and SQL Server log files); each SQLServer log file looks like this (see Figure14).

We want to transform all log files stored in a folder and to combine all error lines in a worksheet like this (see Figure 15).

Let’s create a new query with Power Query. Select the From Folder menu item (in the Power Query ribbon) (see Figure 16).

Select the folder location and the following data (file information), will appear in the Power Query editor (see Figure 17).

Some columns, like Attributes provide the possibility to expand each line and to bring additional information to the query (see Figure 18).

For instance, let’s select Size and we will get an additional field (see Figure 19).

Now, we just want to parse and format SQL Server log files--how can we ask the query to forget the other types of files (here we want to skip files with an .evtx extension) that are stored in the same folder?

Select the Extension column, select the extension you don’t want to see anymore, and filter the column with a Does Not Contain rule (see Figure 20).

Don’t forget: The activities that we’ve performed so far are automatically scripted with the M language and can be replayed later on.

Let’s focus on the file content. Since we want our instructions to be applied on all files, let’s select the Content Column and right-click Drill Down (see Figure 21).

 

Click a binary file, right-click it, and select Text. (See Figure 22).

You will now see the file content (in text format) (see Figure 23).

The first 20 lines can be removed (in all log files!). Click the Remove Top Rows menu and specify 20. (See Figure 24).

Let’s create a dedicated column called Date & Time (first 23 characters); select the menu Split Column-By Number of Characters (see Figure 25).

We get the following (see Figure 26).

It is interesting to notice that Power Query has created a new column but has also changed its type, which is now (see Figure 27) a Date/Time:

Let’s rename the new column as Date (just edit the header).

Split the next column (12 characters), select Once, as far left as possible and name it Source.

Since we want to analyze the errors in the log file, we can now filter the last column (see Figure 28).

Edit the M code to filter on the “Error” string (see Figure 29).

 

The rows look like this (see Figure 30).

We will split this column again by using comma as a delimiter (see Figure 31).

After renaming the columns, we get the following result (see Figure 32).

We still have to do the following:

  • remove the dot in the State column
  • remove the string “Error:” in the Error column,
  • remove “Severity:” and “State:” in the other columns.

Let’s use the Replace menu item and here is the final result (Figure 33).

The query can be shared in the Office 365 Power BI Data Catalog or can remain hosted in a workbook if you work on-premises or don’t have a Power BI license. It can be executed as long as a folder in the specific location is provided (see Figure 34).

 

Foundation of Self-Service BI

Power Query plays a key role in the Power BI/ self-service BI process by enabling any power user to find, create, edit, and share queries that can be used in a data model. This is the foundation of reporting, as we will illustrate in our future articles.

Most features provided by Power Query require a Power BI software license as long as the queries need to be shared. Some features can still be operated locally (without any license) on your (on-premises) data center.

Power Query is a very ambitious add-on, and Microsoft regularly updates it. The generated “M” code can easily be extended thanks to a great library of objects and functions that Microsoft will enhance in the near future.

Keep in mind, however, that the huge majority of operations in Power Query don’t require any coding skills, and the UI is rich enough to generate the code accordingly. All operations are reordered in such a way that they can be replayed and reused by other users if needed, and they can be edited and modified and canceled by the user.

Using this tool has been a real pleasure since the very first iterations. We are looking forward to its next evolutions.

Read More

Microsoft Power Query web site

Power Query For Excel Formula Language specification (“M language”)

Power Query for Power BI and Excel by Chris Webb (Apress, 2014)