Information or Insight?
For an asset owner, or indeed anyone managing an investment portfolio, regular reporting is a fact of life. While some may rush to their inboxes with excitement for the latest update, others see reporting as a kind of necessary evil. Teams charged with generating reports lament the repetition, drain of time, and mechanical work required, while recipients are often frustrated with timeliness and lack of flexibility.
Perhaps even more importantly, regular reporting, by its nature, tends to provide information rather than insight. A good report merely provides hints by informing hypotheses a recipient may have. Due to the fixed nature of this reporting, it becomes very difficult for the recipient to discover new perspectives – and new perspectives drive insights.
This has been the case forever; however present-day reporting and analytics is evolving to be about the shortest path to insight – not just about information. To form insight, it helps to be able to discover hidden behaviours and patterns in your data, by essentially ‘having a play’ informed by hypotheses.
I’m sure you’ve heard a lot about these topics over the years – but my intent in this article is to make it more real. So, we are going to illustrate these concepts through a build of an interactive dashboard using Microsoft PowerBI. We will wrangle data, build the dashboard, and then use it to play with and test hypotheses. Finally, we will demonstrate some examples of obtained insights. The best part – you can play with the dashboard yourself to form your own insights, or challenge mine.
The data we will be using is the APRA published Australian Superannuation Fund history, however in the context of our clients we more commonly apply these concepts to investment portfolios providing insights on exposure, performance, risk, pacing, and other analytics – building on & enhancing existing regular reporting.
Australian Superannuation – An Interactive Dashboard
APRA (that’s the Australian Prudential Regulator) publish Superannuation Fund data along with a decade or so back history covering member dynamics, fees, performance and other particulars. What I like about it, aside from its general good quality, relevance to Asset Management & free availability, is that I don’t have preconceptions about insights that I may find – I’m exploring it through this process just as a recipient of a report would do – and I intend to share both successful, and unsuccessful hypotheses I explore along the way.
To build this dashboard we will be using PowerBI, Excel, Amazon Web Services (AWS), SQL Server and DAX Studio, with a small cameo appearance from DataRobot Machine Learning. At any stage, please feel free to…
Never Underestimate Data Wrangling
The data comes in an 11MB Excel spreadsheet (a rather rare format in Finance, I’m sure you’ll agree). We find that many models we look at are either done completely in Excel, or at the very least have an Excel front end. Regardless of whether our data comes from a database or a spreadsheet, our journey starts here – remember, to arrive at insight one must first be good at wrangling data.
Excel is a wonderful and extensively prolific platform across Finance, but its exceptional flexibility can also be its downfall. Just to cite one example, even in this exceptionally clean dataset you will find that asterisks (*) are used to indicate ‘no data available’.
This isn’t a problem per se, except that as PowerBI (and other platforms) interpret this data, in trying to be helpful they see asterisks as text and classify the whole column accordingly. Working around these types of challenges is an example of ‘data wrangling’.
In no particular order, other wrangling examples included treatment of special characters, uniqueness in fund identifiers (just name? ABN? or both?), duplicates caused by differing reporting periods, and simply representing the data with unique column names. In the end, I decided to process the data through Excel VBA and load it into a SQL server hosted on Amazon Web Services. The advantage of SQL server is that it runs certain checks on the data as it is loaded to ensure we don’t run into issues later.
Data practitioners will know that this type of work can be tedious and manual, but if done correctly can unlock significant analytical potential in your data. It goes without saying that good data practices support this process and reduce effort required to prepare data for analytics.
For now though, after a few days of data wrangling, we return to our quest for insights.
PowerBI Interactivity & ‘DAX’
PowerBI is a powerful platform that can ingest, process, transform and present data interactively. It’s a popular and well supported platform, but there are also others that are commonly used. Each have their own pros and cons making them more / less appropriate for a given organisation and task at hand.
To demonstrate the building of the dashboard I’ll start at the first page. Here, I want to play with a well known trend in Australian Superannuation – Fund consolidation. To illustrate the consolidation, lets break up the Funds into 5 buckets sorted by Assets Under Management (AUM). The first bucket should hold the Top 10 funds by AUM for a given year, the next bucket should hold the next 10, and so on. The last bucket contains the ‘rest’. So far so good.
This sets up a perfect example of both the power, and the ‘challenge’ of interactivity – the ranking has to be calculated separately and dynamically for every year as a given fund may be in the Top 10 one year but slip out of the Top 10 in the next. Further, the buckets also change as the dashboard user makes filter selections & interacts with the dashboard. For example, if we select ‘Retail’ Funds only, we want to run the bucket ranking over this narrow selection in real time. PowerBI allows the writing of ‘DAX’ (Data Analysis Expressions) to do this. They’re super powerful, but can get tricky!
Beyond Reporting – Posing Hypotheses
Having completed a first draft of the dashboard, I now had something I could ‘play’ with. As a user of the dashboard, I now wanted to see what I could discover. For example, clicking around fees I discovered that Corporate Funds in 2021 had a significant spike in Admin fees per member (more than doubled) – something that wasn’t visible as I clicked around at headline level, or applicable to other Fund types.
Taking a step back, I was interested in testing some hypotheses – for example, compare fees, performance, and member movements – members should be attracted to lower fees and better performance, but perhaps better performance comes at the cost of higher fees. Well, no, the dashboard showed no correlation between fees and performance, or subsequent member movements.
Not wanting to give up on this line of inquiry just yet, I looked to apply Machine Learning (ML) – perhaps the correlations are nonlinear, or come with a time lag. The point here is that playing with data interactively gives rise to other ideas or avenues for investigation, that static reports may not.
Outside PowerBI, after completing some light ‘feature engineering’ on the dataset, I set up the ML algorithms through DataRobot to attempt to predict member growth based on fee and performance data for preceding years.
Details of this I may cover in a future post as Machine Learning is a whole separate topic – but suffice to say there was too little data & too much noise. The results were inconclusive. So, I returned to the dashboard – let’s look at more basic insights first.
Jumping onto the returns page, I found my own Super Fund and proceeded to (very scientifically) compare its returns to other Funds – by clicking to select the row corresponding to my Fund, and then clicking it again to ‘unselect’ it, repeatedly, while watching the mean return circles jump on the Box Plot at the bottom right. The good news was that the circles hardly moved at all – my Fund was in line with average returns over the last 7 years. Now, what else was there to see?
Finance professionals will know that higher returns often mean higher risk was taken – one measure of such risk is ‘volatility’ or Standard Deviation of Returns. Ordinarily, you would want at least monthly return data for a few years (so like 30+ datapoints) to make statistically significant conclusions about risk / return, however I have only 7 annual data points. Still, I thought I’d see if the hypothesis held, and, as long as we’re being ‘rough’, yes it does!
Ideally, a risk / return plot would show compounded returns over time against their Standard Deviation – but as a shortcut I’m showing Average annual returns – close enough for the purpose. We can see in the middle chart that higher risk does appear to translate to higher return, with Industry Funds the riskiest / most volatile, and Retail ERF funds the least. Retail Funds interestingly seem to underperform – but note the lack of data to make substantial conclusions. Clicking ‘Expand All Down’ (hover mouse towards the top of the chart) shows the ‘cloud’ of all Funds, with a correlation visible to the naked eye – excellent, at least one of my more ambitious hypotheses seems to be confirmed by the data!
So after all that, what is the insight that we managed to extract from 11MB of excel data? Here are some of my highlights.
“Fund consolidation is real: 5x less Funds, 6x more AUM since 2004.”
- Top 10 Funds have grown from 30% to 50% AUM
- Outside of the Top 40 Funds: 501 Funds used to make up 35% AUM; Now, 78 Funds make up 8% AUM
- A few top Funds are eating the many bottom Funds; Middle bands – Funds ranked 11-40 have stayed largely unchanged in AUM terms.
“Admin Fees dominate. Retail Funds are generally the most expensive, but more members mean lower fees. Higher fees do not correlate with better performance.”
- In 2021, Corporate Funds charged higher Average Fees per Member ($842) than Retail Funds ($646) – for the first time ever.
- Corporate Fund Fees rose significantly (~$350 to ~$850) caused basically by Telstra Superannuation Scheme. If I was a member there, I’d have some questions.
- Something interesting is happening with Retail ERF Funds – they’ve lost almost all their members, and their fees per member in 2021 have increased substantially.
“More risk means more return, as theory suggests. Industry Funds outperform Retail Funds in both absolute, and risk adjusted terms. 2020 and 2021 were atypical years for returns, likely due to Covid-19.”
- What a year – The average return for 2021 was 16%. This is higher than the maximum return for any of the preceding 6 years (with one small exception).
- “Compare the Pair” (this is a long running catchphrase in Industry Funds marketing) – Industry Funds really do outperform Retail Funds, and their Fees really are lower (these are possibly related as fees may be ‘netted’ against returns).
- Returns distribution, particularly for any single year, approximates a nice Normal bell curve – with some interesting skews (left / right) between years.
We started this piece with a discussion of regular reporting and how it plays a critical role in running investment portfolios. Broadly speaking, this type of reporting suffers from two major challenges – it is manual and time consuming to produce, and it is too rigid and static to allow alternative perspectives that drive insights.
Interactive dashboards address both challenges. Not only are they interactive, and actively encourage insights through data ‘play’, they are also typically set up to refresh automatically. As an example, I originally built this dashboard before 2021 data was made available. Updating the dashboard with the newly published data took all of 15 minutes – but could have been automated even further.
In terms of Insights – while it’s possible that the summary I’ve shared above could have been deduced from a static report alone, interactivity made it significantly easier. The ability to quickly change selections & compare varying ‘slices’ of the dataset naturally leads to the identification and discovery of patterns.
Did you discover anything unexpected or interesting as you played with the dashboard?
Shoreline provides a wide range of Investment Analytics services – like presenting results of existing analytics and models (say, those done in Excel) as fully interactive, automatically refreshed dashboards. We can help you create a capability to create dashboards within your own organisation, or even create and host specific individual dashboards for you. We also love doing “proof of concept” engagements using light cloud hosted environments – less logistics, more insights.
If you’re ready to start your interactive dashboard journey or are looking to add some sophistication to existing analytics practices, we would love to chat. Get in touch!
The dashboard and our analysis in this article are for illustrative purposes only. Shoreline accepts no liability whatsoever arising from, or connected to, the provision or use of the data in the Dashboard. Colours attributed to Funds, and our analysis in no way implies any form of Shoreline endorsement or advice.