• Ben Watt

Calculating Business Hours in Power Query

When analysing business processes which have a start time & end time it's not uncommon for KPIs to require the business duration between those times. Until now, if I was measuring business days between dates, I was doing it in DAX using my DimCalendar table, similar to this article at SQLBI.com. It's very fast and is not a big performance hit on your data model.




Recently I had a requirement to do business hours calculations, accurate to minute level. I chose Power Query to attack this problem, it performed well so I thought I'd share it with you.


Approach


Here's the overall approach:

  • With a Start Datetime and End Datetime, I generate a list of every timestamp between the two.

  • The increments of the timestamps are in 1 minute intervals.

  • Each of the timestamps in this list is evaluated and flagged as being within business hours, or not.

  • The timestamps flagged as being within business hours are counted to give the total number of minutes

  • The business minutes can feed into KPIs analysing business processes


The intervals of minutes in the above can be replaced by hours, depending on how granular your source data is, and how accurate your reporting requirements are. The trade-off being performance to calculate the business duration is 60X better with hours, but less accurate.


The coding part


In Power Query, it only took two steps in a function to achieve it!


It boils down to two key functions. List.DateTimes and List.Accumulate


List.Datetimes is relatively straight forward. It generates a list object containing all of the timestamps between the start & end.


List.Accumulate is a very clever function that iterates through a list object, applying logic for each item and accumulating a total until you reach the end.


Without further ado, here is the simple function I used. The below is named as fn_Get_BusinessMinutes in my list of queries.


You can also take public holidays into account. I generated a list of Irish holidays, which you can edit to your own location



let
  HolidayList = {
    #date(2022,1,3),#date(2022,3,17),#date(2022,4,18),
    #date(2022,5,2),#date(2022,6,6),#date(2022,8,1),
    #date(2022,10,31),#date(2022,12,27),#date(2022,12,26)
    }
in
  HolidayList


Now, here's the function!


(StartDateTime as datetime, EndDateTime as datetime) => 

let
    Source = List.DateTimes(
        StartDateTime,
        Duration.TotalMinutes(EndDateTime-StartDateTime),
        #duration(0,0,1,0)),
    GetDays = List.Accumulate(
        Source,
        0,
        (state,current) => 
            if List.Contains(HolidayList, Date.From(current)) then state+0
            else if Date.DayOfWeek(current, Day.Monday) <5 and Time.Hour(current) >=9 and Time.Hour(current) <17 then state+1
            else state+0)
in
    GetDays



Enhancements


The above query can be modified to be more dynamic/customisable. For example, I hardcoded the business hours being 9-5. They could become parameters.

Performance


It's a relatively meaty calculation so I put a few tests together to see how it performed. The total time to calculate this for a dataset is the combination of both the total row count and the number of increments between your start & end dates.


For example, if your dataset is 1000 rows and the average number of minutes between the start and end of each row is 600, then the List.Accumulate function (which does most of the work) is going to iterate over 1000 lists with 600 items in each, a total of 600,000 increments on average.


I did ten different performance tests, starting with 1000 rows, then 2000, up to 10,000. I wanted to see if the performance was affected linearly, or exponentially. My start & end dates averaged 5.5 days apart, so about 7,900 minutes for each row. With that, my first test of 1000 rows processed 7.9 million increments, the second 15.8 million, and so on up to 79 million increments in a 10,000 row dataset.


Here is a view of the dataset:


I did the test using Power BI dataflows and exported the run times. I also used the Power Query Diagnostics to see where the time was spent. Let's have a look at the results.



As you can see, the processing time increases almost perfectly linearly. That said, there is going to be a point where this is impacted by processor limitations. I do think just under 80 million list items churned by List.Accumulate() in 60 seconds is quite good!


Adding a little more to this, as I've just re-assessed the way Power BI dataflows scale with the increasing workload.


Just after posting this blog, I did the above tests again with 50K and 100K rows, meaning around 395 and 790 million increments across up to 100,000 lists. Power BI actually performs faster, on a thousands of rows per second metric. The 100,000 row dataset with 790 million list items took just over 8 minutes. Colour me impressed!


Just to note, this is done in a Premium Per User workspace.


Performance Improvements


If you have larger datasets or wider distance in time between your start & end dates, there is a couple of things you can do.


  1. Reduce your increments from minutes to hours - assuming the lower fidelity still meets your business requirements. For example, if your start & end dates are weeks/months apart, its probably less likely you'll need accuracy down to the minute so you can move to an hourly granularity.

  2. Incremental refresh will be very helpful, so you're only processing recent rows, as opposed to the whole dataset each time.

When I changed my tests above to hourly increments, it dropped the total number of increments for the 10,000 rows dataset from 79 million to 1.3 million, and it finished in 5 seconds.


Just to add


My recommendation is to also include the Total Duration in your data model, along with the Business Hours duration. Sometimes your Business Hours calculation will result in a zero value, if the activity started & finished outside business hours.


Also, the Total Duration gives the full end-to-end duration of an activity (i.e. user experience), whereas Business Hours is generally used for KPIs or SLAs. Both have their value in different ways.


Try it yourself


I created a simple demo .pbix file you can download and edit to your own desire. If you have any useful tweaks, I'd be happy to hear from you.


Links


Sample PBIX file for you

https://github.com/datalineo/CodeShare/tree/master/Power_BI_CalcBusinessHours


Documentation on the two key functions

List.Accumulate

https://docs.microsoft.com/en-us/powerquery-m/list-accumulate


List.DateTimes

https://docs.microsoft.com/en-us/powerquery-m/list-datetimes