In our previous article, “Power BI in Office 365: What You Need to Know” we positioned Power Query in the Power BI general scheme with the following image (see Figure 1).
Any user willing to create a new report, or a new data model (that serves as the foundation of a report), needs a set of data. This set of data is returned by a “query” that the user can create himself or that he can reuse from a shared pool of queries (created by other users).
Basically, Power Query is a Microsoft Excel 2013 add-in that lets almost anyone create queries from data coming from a wide variety of data sources and data types. Power Query also allows users to manage these queries and to share them across the company.
Power Query can be seen as a self-service ETL tool for cleaning up and transforming data from different kinds of data sources, a role that SQL Server Integration Service has been playing very well for years in the traditional Microsoft BI stack.
Keep in mind that what we are going to illustrate in this article are just two sequences of the BI process we illustrated in our previous article.
Installing Power Query
Download the add-in from the following Microsoft download web site and once installed, activate it in the File-Option menu (see Figure 2).
When activated, it will show-up in the ribbon (see Figure 3).
Queries can be created from many types of data sources (see Figure 4), which is very convenient.
Queries can be created from the following sources:
External Data :
- Sites with HTML tables: The HTML tables can be used as data source.
- Files: Excel, csv, xml, text; files can be single files or a group of files in a folder.
- Database: SQL Server, SQL Azure, Access, Oracle, DB2, MySQL, PostGreSQL, Sybase, Teradata; however, if a query runs in the cloud and needs to access data stored in your on-premises data center (hybrid scenarios), only SQL Server and Oracle databases are currently supported; (we will cover this in a future article dedicated to the Data Management Gateway).
- Other sources: SharePoint Lists, oData feed, Windows Azure MarketPlace, Hadoop (for Big Data), Windows Azure Blog Storage, Windows Azure Table storage, Active Directory, Exchange, SAP Business Object, Facebook.
Excel Data: A table within the current Excel workbook.
Let’s now go through the whole process by showing how to create queries, and how to share and retrieve them.
Let’s take a very simple example: We have created a list of food items in a SharePoint List and we want to query this list.
Connect Excel 2013 to this SharePoint Online list by providing the SharePoint site url (see Figure 5).
Power Query fetches and displays the SharePoint lists of the web site (see Figure 6).
Let’s select the list (here “PicNicItems”) and create a query by displaying the Power Query Editor (Figure 7).
Excel will then load the selected data (the SharePoint List data) in the Power Query editor (see Figure 8). We still have to transform the data (or to reorganize them) in order to focus only on what we need.
The Power Query editor is a really amazing tool for creating queries, but it also allows many kinds of data manipulations; all actions in this editors will be (automatically) scripted in a language called “M” and will also be recorded in such way they can be repeated when the query will be triggered by the user or by a data model.
In our specific case for instance, we will rename the column “Name1” as “Name”, rename “Type1” as “Type”, and switch “ItemID” and “Type” as illustrated in Figure 9.
Let’s remove the unnecessary columns (see Figure 10).
As mentioned before, all activities in the editor are recorded (see “Applied Steps” in Figure 11), we can always remove some recorded steps and get back to a previous one.
An advanced editor (see Figure 12) allows code editing if needed, even if the good practice is to let Excel generate the “M” code.
The query can be documented and loaded in an Excel worksheet (“Load to worksheet”) or stored in the Excel 2013 data Model (“Load to Data Model”) (see Figure 13).
The queries associated with the workbook can be visualized in the Workbook Queries panel.
As illustrated in Figure 14, the query at this stage is still locally stored in the workbook. But how can we share a query?
Once the query is created, we have the possibility to share our work with other users by saving it in(see Figure 15). However this option requires a Power BI software License in order to have the business intelligence infrastructure in place in Office 365.
Additional details can be provided, like who can use the query (“Share with”) (see Figure 16), or even a link to a document describing the query.
The option “Certify this query for others”, provides an additional indication that the query has been tested. This option is only available to members of a specific BI group: the “data steward” group. Again, this option is available if a Power BI license has been purchased. In the Power BI admin center (see Figure 17), you can add users to the data steward group (see Figure 18).
Finding Shared Queries
Finding an existing query can easily be done by using Power Query. The user must first log in in the organization (with an Office 365 account) (see Figure 19).
The user can access the “Online Search” button (see Figure 20) in the Power Query ribbon.
The Online search panel will show up (see Figure 21). There, the user can type any keyword that matches the query name or its description.
The search result can be narrowed down to queries shared in the data catalog (within the organization, see “2” in Figure 21) or to queries shared by (US) governmental or public bodies (see “PUBLIC” in Figure 21). The user can easily choose the query needed by previewing the data returned by the query (see “3” in Figure 21).
A certified icon appears alongside the query name to let users easily distinguish a certified query (see “4” in Figure 21).
The query can then be selected, loaded to Excel, or added to the data model (by going to the PowerPivot ribbon as Figure 22 illustrates).
Managing Shared Queries
Managing queries means updating, listing, and sharing the existing shared queries.
The “Shared” button (Figure 23) is the usual way to list shared queries that can seamlessly be grabbed from the Office 365 data catalog. As we mentioned before, any sharing activity requires a sign-in and therefore a Power BI software license.
Query contributors can also manage their own queries by leveraging the Power BI App functionalities. This app can be added to any SharePoint site (and works as long as you have at least one Power BI license).
To access the queries the user shared, the user can just right-click a query and select View Statistics as Figure 24 illustrates, to be shown Query Analytics (Figure 25).
Clicking my queries (on the left rail, see Figure 26) will display the list of queries the current user has shared in the data catalog as well as the associated Data Source Status.
Several queries can potentially be associated to the same data source. In the case of the “Contoso Food” query that we created in the beginning of this article, the data source is a SharePoint list.
Queries in Depth
Let’s get a little bit deeper in Power Query by looking at the M language. See Part 3.