CASE STUDY. |
OVERCOMING CONCURRENCY ISSUES, BUSINESS INEFFICIENCIES AND REDUCING TECHNOLOGY COSTS.
CLIENT OVERVIEW – E-COMMERCE
The client is a global e-commerce company focused on high-end retail. It delivers value to customers by analysing and understanding large volumes of web and app data covering billions of interactions. It has over $1B in global revenue per year (2019) and more than 4,500 employees across 30+ regions. A growing number of its business users rely on accurate reporting.
THE CHALLENGE. |
The core aim was to enable revenue growth, align data strategy and reduce rapidly growing technology costs.
Seeking a way to combat significant query timeouts and rising costs with their current data solution, the client needed an evaluation of an alternative data warehouse. We also recommended an audit of their existing data estate, which we would run as part of the evaluation process.
The internal teams lacked expertise in the technologies being evaluated, and the evaluation itself had no clearly defined success criteria or holistic view. The client therefore needed to define how each technology would be measured as a true fit for the business. This was compounded by out-of-date architecture and a lack of ownership and documentation across the current data ecosystem.
Disparate data sources added to the problem, and many of these were deemed obsolete. There was a lack of governance around who owned each source, its true purpose and its data pipelines. Data was known to be inaccurate but was still being used because there were few alternatives.
During peak usage times all available technical resources were being consumed, with no clear way to identify what was consuming them or the root causes. This resulted in query timeouts and extremely poor performance of the BI & reporting solution.
Over 20 days, BI:PROCSI undertook a rapid, complete data discovery and POC engagement. The aim was to give the client a set of recommendations that would help them fully understand the issues, alleviate them and effectively reduce technical debt.
THE SOLUTION. |
The first step in the solution process was to identify the main offending queries and use these as benchmarks in a BI:PROCSI-led POC build.
The POC was built by adding resource monitors, applying correct roles and user management methods, removing duplicated users, setting correct warehouse constraints and requirements, using data cloning and perfecting the ingestion process.
In parallel, BI:PROCSI ran a Data Discovery workstream with the aim of capturing the entire data estate, allowing us to identify the following:
Data sources, data volumes, operational tools, storage, reporting tools, every table within the business, transformations, lineage and pipelines.
Starting by analysing every single query that was run, BI:PROCSI was able to work back to determine the above details.
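As an illustration of how a query log can be worked back to the tables it touches, here is a minimal Python sketch. It is not the tooling used in the engagement: the function names, the sample log entries and the simple FROM/JOIN pattern are all assumptions made for this example, and a real estate with subqueries, CTEs or quoted identifiers would need a proper SQL parser.

```python
import re
from collections import Counter

# Deliberately simplistic: captures the identifier that follows FROM or JOIN.
# Subqueries, CTEs and quoted identifiers are not handled (assumption).
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def tables_referenced(sql: str) -> set[str]:
    """Return the lower-cased table names a single query reads from."""
    return {name.lower() for name in TABLE_REF.findall(sql)}


def table_usage(query_log: list[str]) -> Counter:
    """Count how many logged queries touch each table."""
    usage: Counter = Counter()
    for sql in query_log:
        usage.update(tables_referenced(sql))
    return usage


# Hypothetical log entries, for illustration only.
sample_log = [
    "SELECT * FROM sales.orders o JOIN sales.customers c ON o.cid = c.id",
    "select count(*) from sales.orders",
    "SELECT sku FROM stock.items",
]

usage = table_usage(sample_log)
for table, query_count in usage.most_common():
    print(f"{table}: referenced by {query_count} queries")
```

Tables that never appear in such a count are candidates for review as obsolete, while the most frequently referenced ones show where lineage and ownership matter most.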
The BI & reporting estate was scrutinised and evaluated to identify offending design flaws and improvement opportunities.
THE RESULT. |
BI:PROCSI demonstrated that the BI & reporting architecture was fundamentally flawed, and many derived tables were being built every time a query was run. In addition, more than 1,000 dashboards running nearly 250,000 database queries per month were a significant factor in the concurrency issues, as these queries consumed technical resources unnecessarily.
Added to this, single-user dashboards accounted for 58% of all dashboards.
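A figure like this can be derived from a dashboard usage log. The sketch below shows one possible calculation, assuming that "single-user" means a dashboard viewed by exactly one distinct user; the function name, the `(dashboard_id, user_id)` record shape and the sample data are hypothetical and not taken from the client's platform.

```python
from collections import defaultdict


def single_user_share(views: list[tuple[str, str]]) -> float:
    """Fraction of dashboards viewed by exactly one distinct user.

    `views` holds (dashboard_id, user_id) pairs from a usage log.
    """
    users_by_dashboard: dict[str, set[str]] = defaultdict(set)
    for dashboard_id, user_id in views:
        users_by_dashboard[dashboard_id].add(user_id)
    if not users_by_dashboard:
        return 0.0
    single = sum(1 for users in users_by_dashboard.values() if len(users) == 1)
    return single / len(users_by_dashboard)


# Hypothetical usage records, for illustration only.
sample_views = [
    ("dash_a", "u1"), ("dash_a", "u1"),  # repeat views by the same user
    ("dash_b", "u1"), ("dash_b", "u2"),  # shared dashboard
    ("dash_c", "u3"),
    ("dash_d", "u2"),
]

share = single_user_share(sample_views)
print(f"Single-user dashboards: {share:.0%}")
```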
BI:PROCSI highlighted that a defined leaver process was missing. As a result, at least 50 active users still had scheduled queries running even though the recipients and/or owners were no longer with the business.
Due to a lack of governance, derived tables had been used to create many ad hoc Views, with extremely large queries run every time a dashboard was loaded. The largest query was 63TB in size, putting an enormous strain on resources.
Hundreds of tables and views were queried directly from the BI platform using many hundreds of Views over multiple Projects. Outdated tables were identified, and pain points and owners were captured throughout BI:PROCSI’s evaluation process so that action could be taken. A key table used for a large portion of company-wide reporting was found to be years out of date and to contain hundreds of columns, many of which were redundant.
In establishing the extent of technical debt, BI:PROCSI found several disparate data sources which, through tens of operational tools, fed many different, non-unified reporting tools.
Multiple platforms hosted multiple petabytes of ever-growing data storage, either on-premises or in the cloud. The BI:PROCSI-led Data Platform POC was redesigned and successfully completed. It identified several improvements to ways of working and a huge saving in data storage costs, alongside a 73% saving in query costs alone. Based on half of the data storage, a saving of over $24,000 a month could be achieved across the technical estate.
A 65% increase in performance over the current solution was achieved by applying best-in-class architectural principles, using the right tools for the job and starting with the basics: understanding the problem and what needed to be achieved.
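A leaver check of this kind can be approximated by comparing scheduled queries against a list of current staff. The following is a minimal sketch under stated assumptions: the `Schedule` record, the function names and the sample data are hypothetical, and a real check would read schedules from the BI platform and staff from an HR or identity system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """A scheduled query with one owner and one or more recipients."""
    schedule_id: str
    owner: str
    recipients: tuple[str, ...]


def departed_parties(schedule: Schedule, current_staff: set[str]) -> set[str]:
    """Owner and recipients of a schedule who are not in the staff list."""
    return ({schedule.owner} | set(schedule.recipients)) - current_staff


def schedules_needing_review(
    schedules: list[Schedule], current_staff: set[str]
) -> dict[str, set[str]]:
    """Map each schedule id to its departed owner/recipients, if any."""
    flagged: dict[str, set[str]] = {}
    for schedule in schedules:
        gone = departed_parties(schedule, current_staff)
        if gone:
            flagged[schedule.schedule_id] = gone
    return flagged


# Hypothetical data, for illustration only.
staff = {"ana", "ben"}
schedules = [
    Schedule("s1", owner="ana", recipients=("ben",)),
    Schedule("s2", owner="carl", recipients=("ana",)),       # owner has left
    Schedule("s3", owner="ben", recipients=("dee", "ana")),  # a recipient has left
]

flagged = schedules_needing_review(schedules, staff)
for schedule_id, gone in sorted(flagged.items()):
    print(schedule_id, "->", sorted(gone))
```

Running a check like this as part of a defined leaver process would let schedules be reassigned or retired when someone leaves, rather than continuing to consume resources.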