The Lightbulb Went Off… Sort/Break/Summary (Totals)

This is one of those tricks in Query that typically comes around later on in the learning process. It’s an odd gray area of just a little too complex to effectively teach right away, but useful enough that it really can be a key time-saving and simplifying tactic of many projects. I was just fine for about a year of early query projects without really knowing how to do this. But if I had, I would have been able to navigate complex tables much more comfortably, and create better, cleaner, reports lining up more closely to what an end user hopes to see.

Once you master this trick, you’ll find yourself wondering how you ever did queries before!

“Sort – Break – Summary” + Totals & Sub-totals

The interesting thing about Sort/Break/Summary is that it feels complex, and it feels advanced, and it feels difficult to understand, and it really is all those things if you’re trying to get really serious about understanding the database and why it works. But the way 90% of us are going to use it is truly simple enough for an early learner to pick it up quickly and get huge value.

In short, Sort/Break/Summary will help you summarize, re-organize, and calculate information across the rows of data you’re working with.

TACTIC #1

De-duping to unique members – Getting rid of multiple rows per Account Base

Let’s take the example of online banking logins. You’ve been tasked with compiling a list of memberships who have logged into online banking specifically via the mobile app at least once within the last 3 months. Your data source is the AUDICC table which holds an activity log of all successful online banking logins.

Activity log = 1 row per login. Member 1234999 (let’s call him Dave) has logged in via mobile app 26 times in the last 3 months! He qualifies for my list but I don’t want him on it 26 times. Here’s how the data results look with all 26 of his logins in raw form – we have 1 row for each login along with the date each occurred:

Line…                    Account Base                     Call Date 

0001                       1234999                                20180901
0002                       1234999                                20180906
…                             …                                             …
…                             …                                             …
0025                       1234999                                20181015
0026                       1234999                                20181029

So the solution to cleaning this up is 1, 2, 3.

1 – Sort by Account Base.

Access the Sort button and set a priority of 10 on just the Account Base field.

Now, all the rows with information on Dave’s 26 logins are grouped together, regardless what date they occurred.

2 – Break by Account Base.

Access the Break button and set a level of 1 on just the Account Base field. (As you exit this screen, do not do anything with pop-up windows other than move forward through them. They are not relevant.)

Now, there is a line break inserted between his 26 rows and the next sequential account that also qualifies since they logged in via mobile app once in this time period (maybe they have 11 rows for 11 logins).

Each unique account base becomes its own segment of rows within the data, each with a line break between them.

3 – Summary Output.

Access the Choose Output button and toggle the ‘Detail’ selection over to the ‘Summary’ option instead.

Preview the report. Now, every segment that has been broken apart with a line break is consolidated into just 1 row per segment. Since our segments are set up on the Account Base you have 1 row per member. You’ll still see the line breaks here, but if you are generating a database file with this query the line breaks will not be present in the new database file.

Alternatively, if you had segmented (sort & break) by date, or branch, or maybe the type of mobile app (iOS vs Android), then you would have 1 row for each date, branch, or app version. That could be very helpful if you are also trying to count the number of logins on each date, for each branch, or via each type of app. That kind of calculation is very easy to do at this point.

TACTIC #2

Totaling or Counting __Online Logins__ (Or # Loans, Debit Card Transactions, Credit Report Pulls, etc.)

Let’s continue with our example of Dave, member 1234999. So far we’ve maneuvered the database table into only keeping his account number once even though it’s on 26 rows. Now we want to have our results actually show that he logged in 26 times.

Already having done the Sort/Break/Summary by Account Base… all we need to do now is apply the calculation that counts the # of rows that we summarized into the 1 row for his account.

4 – Totals & Sub-Totals

Access the Totals and Sub-Totals button and use any of the 5 available drop-downs on Account Base to choose the Count option. Actually you could apply the Count calculation to any field because it doesn’t matter what’s in the field, it simply counts the # of rows that the field is present in each break segment.

Now, preview the report to see both his account number and a count of all his logins (rows).

NOTE – the count value (26 for Dave) will be displayed on yet another row when you preview the report. If printing, it will print exactly as you see it displayed. But if you’re creating a database file then everything will roll up into a single row for each account. Here’s an example:

What it looks like in your Query via Preview Report after Sort/Break/Summary/Totals:

Line…                    Account Base 

0001                       1234999
0002       COUNT 26
0003
0004                       1235000
0005       COUNT 11
0006
0007                       1235005
0008       COUNT 39

 

What it looks like in your final database file:

Line…                    Account Base                    ACCTBS05  

0001                       1234999                                26
0002                       1235000                                11
0003                       1235005                                39

 

Don’t forget, this is what it looked like back when the raw data was still 1 row per login:

Line…                    Account Base                     Call Date 

0001                       1234999                                20180901
0002                       1234999                                20180906
0003                       1234999                                20180906
0004                       1234999                                20180906
…                             …                                             …
…                             …                                             …
0025                       1234999                                20181015
0026                       1234999                                20181029
0027                       1235000                                20180901
0028                       1235000                                20180903

 

How amazing is that! You’ve successfully performed aggregation and calculations across the rows of relevant data to get meaningful information results which you can easily use for further analysis or filtering in your report.

Too Complicated?

I personally love this strategy and I’m still learning new ways and new data sets to use it with.  But I want to address a potential complaint – I’ve heard from some that it’s frustrating to remember 4 different pieces of the Query tool to achieve the final result of 1 row per member with info on his total logins. If it’s such a good option to use, why can’t it have a single place to apply all these things to the Account Base field at once?

First, believing that this strategy is only best for finding 1 row per member (Account Base) is short-sighted. It can be used to aggregate information for so MANY other data points – 1 row per branch with total member count, 1 row per dealer with count of write-offs and total written off balance, per loan product, 1 row per loan category with count of loans, total disbursement balance and total current balance for each category.

Second, when I was a new learner on the Query tool I did also struggle to remember this strategy. Which one comes first? How come it’s not doing what I want? Oh yes it’s because I forgot to sort first… and so forth. But truly, it’s only a learning curve and for me it helped to memorize the simple Sort/Break/Summary (SBS) term.

Third, you can’t deny that 3 of the 4 are practically effortless to manage. Pop in and type one thing to set up the sort – done! Pop in and pick one thing to set up the break – done! Pop in and choose the summary setting – done!

The only one which really takes a bit of time and brain power is the Totals and Sub-Totals section. At first, you’ll want to play with these BEFORE you choose the summary setting. That way you can still see the details of everything in the report to evaluate if the calculation you chose makes sense.

And that’s a key point – You really want to be cognizant of what your goal is and what each calculation will be doing across the columns and rows you have in your report. But if you’re not sure, don’t let that stop you. Just set up some options and preview the report to see if it looks like it makes sense. If not, go back and take them off then try something else.

You got this!

Leave a Reply

Your email address will not be published. Required fields are marked *