If you want a quick read on the quality of every column in a table, how many rows there are, how many blanks, how many distinct values, Table.Profile gives you all of it in one shot. It returns a small summary table with one row per column, so you can catch data problems before you build anything on top of them.
In this article I’ll show you how to use Table.Profile and how to read each statistic it returns.
Syntax of Table.Profile Function
Table.Profile(table as table, optional additionalAggregates as nullable list) as table
where
table(required, table). The table whose columns you want to profile.additionalAggregates(optional, nullable list). Lets you add your own statistic columns to the profile output. Omit it for the standard set of statistics. This argument depends on the engine version (more on that in Example 5).
Returns: a table with one row per column in the source. The columns are Column, Min, Max, Average, StandardDeviation, Count, NullCount, and DistinctCount.
In plain terms, you hand it a table and it hands back a one-row summary of each column’s spread, its blanks, and how many distinct values it holds.
Example 1: Profile every column in a table
Say you have an EmployeeData table with a Department, Salary, and Bonus column. Some cells are blank.
Here is the starting data:
| Department | Salary | Bonus |
|---|---|---|
| Sales | 50000 | 5000 |
| Sales | 60000 | |
| Engineering | 90000 | 9000 |
| Engineering | 90000 | 8000 |
| Marketing | 55000 | |
| 70000 | 6000 |
Set the column types first, then run Table.Profile on the typed table:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeData"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Department",type text},{"Salary",Int64.Type},{"Bonus",Int64.Type}}),
Profile = Table.Profile(Typed)
in
Profile
You get one row per source column. The engine returns those rows in alphabetical column order:
| Column | Count | NullCount | DistinctCount | Min | Max |
|---|---|---|---|---|---|
| Bonus | 6 | 2 | 5 | 5000 | 9000 |
| Department | 6 | 1 | 4 | ||
| Salary | 6 | 0 | 5 | 50000 | 90000 |
The full output also includes an Average and a StandardDeviation column for the numeric columns. Those are floating-point values, so they are left out of the grid above.
Notice that Count is 6 for every column. That is the full row count, blanks included. The next few examples pull these statistics out one or two at a time so each one is easy to read.
Example 2: Pull just the row count and the blanks
The full profile has eight columns, which is a lot to scan. You can keep only the ones you care about with Table.SelectColumns, then sort by Column for a stable view.
Here we keep Count and NullCount:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeData"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Department",type text},{"Salary",Int64.Type},{"Bonus",Int64.Type}}),
Profile = Table.Profile(Typed),
Keep = Table.SelectColumns(Profile,{"Column","Count","NullCount"}),
Sorted = Table.Sort(Keep,{{"Column",Order.Ascending}})
in
Sorted
The Table.Sort step is only there to give a clean, predictable order. The numbers are the point:
| Column | Count | NullCount |
|---|---|---|
| Bonus | 6 | 2 |
| Department | 6 | 1 |
| Salary | 6 | 0 |
Count is 6 across the board because it counts every row, including the blank ones. NullCount then tells you how many of those rows are blank in each column. Bonus has two blanks, Department has one, Salary has none.
Example 3: Spot data-quality issues with NullCount and DistinctCount
This pairing is the most useful one for cleaning data. Keep NullCount and DistinctCount:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeData"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Department",type text},{"Salary",Int64.Type},{"Bonus",Int64.Type}}),
Profile = Table.Profile(Typed),
Keep = Table.SelectColumns(Profile,{"Column","NullCount","DistinctCount"}),
Sorted = Table.Sort(Keep,{{"Column",Order.Ascending}})
in
Sorted
| Column | NullCount | DistinctCount |
|---|---|---|
| Bonus | 2 | 5 |
| Department | 1 | 4 |
| Salary | 0 | 5 |
Look at Department. It holds three real names (Sales, Engineering, Marketing) plus one blank, and its DistinctCount is 4.
That is because Table.Profile counts null as one of the distinct values. So a DistinctCount that is one higher than you expected is usually telling you the column has blanks in it.
Example 4: Read the smallest and largest numeric values
Min and Max are only meaningful for the numeric columns, so filter the Department row out with Table.SelectRows, then keep Min and Max:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeData"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Department",type text},{"Salary",Int64.Type},{"Bonus",Int64.Type}}),
Profile = Table.Profile(Typed),
Numeric = Table.SelectRows(Profile,each [Column]<>"Department"),
Keep = Table.SelectColumns(Numeric,{"Column","Min","Max"}),
Sorted = Table.Sort(Keep,{{"Column",Order.Ascending}})
in
Sorted
| Column | Min | Max |
|---|---|---|
| Bonus | 5000 | 9000 |
| Salary | 50000 | 90000 |
Salary runs from 50000 to 90000, and Bonus from 5000 to 9000. This is a fast way to catch a stray value, like a salary that should be five digits showing up as six.
Example 5: Add your own statistic with additionalAggregates
The optional second argument lets you bolt on a custom statistic. Each entry is {newColumnName,columnSelector,aggregateFunction}. Here we add a Median for the numeric columns:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeData"]}[Content],
Typed = Table.TransformColumnTypes(Source,{{"Department",type text},{"Salary",Int64.Type},{"Bonus",Int64.Type}}),
Profile = Table.Profile(Typed,{{"Median",each Type.Is(_,type number),List.Median}}),
Keep = Table.SelectColumns(Profile,{"Column","Median"})
in
Keep
The selector each Type.Is(_,type number) limits the aggregate to numeric columns, and List.Median computes the median. This argument only works on newer Power Query engines. Older ones reject the second argument, so test it in your own environment before relying on it.
Things to keep in mind with Table.Profile
- Type your columns first or the numeric stats come back blank.
Min,Max,Average, andStandardDeviationonly fill in for columns Power Query knows are numbers. A freshly imported column is typeany, so set the types withTable.TransformColumnTypes(as Example 1 does) before profiling. DistinctCountincludesnull. A column with blanks counts those blanks as one distinct value, so the number can read one higher than the real categories. If you want a clean list of the actual values,Table.Distinctis the function to reach for.Countis the full row count, blanks and all. It is the same value for every column and equals the table’s row count. To count only filled cells, subtractNullCount.- It scans every row. On a large table that is real work, and it breaks query folding, so the source has to send all the data to Power Query first.
- The UI has a lighter version. The “Column quality” and “Column profile” views in the Power Query Editor give a similar read without writing any M, handy for a quick eyeball.
Performance and query folding
Table.Profile reads every row to compute its statistics, so it does not fold back to the source. Against a database, that means Power Query pulls the whole table down and profiles it locally. On a big table that can be slow, so profile a sample, or lean on the Editor’s column-profile view, while you are still exploring.
Common questions about Table.Profile
What is the difference between Table.Profile and Table.Schema?
Table.Profile reports statistics about the data itself (counts, blanks, distinct values, min and max). Table.Schema reports metadata about the columns (their names, positions, and data types), not the values inside them.
Does Table.Profile look at every row of a large table?
Written in M, yes, it reads all rows. The Power Query Editor’s column-profile view is different. By default it profiles only the first 1000 rows, and you can switch it to the whole dataset with the status-bar option at the bottom of the Editor.
List of All Power Query Functions
Related Power Query Functions / Articles: