Analytics 1.0 — Business Intelligence
Structured data is organized for business intelligence applications, which mainly provide visualizations and dashboards. The structured data is kept in a data warehouse, and to improve turnaround time, data marts and operational data stores (ODS) are used to create snapshots of the data at a specific point in time.
BI tools present visual, summarized data to business teams, and data pipelines were built to automate data ingestion. Automation was crucial because time-sensitive data had to be ingested into the data warehouse using ETL tools, and because storage capacity and processing power were limited, effort was focused on capturing only the data that could describe how the business was doing; hence, descriptive analytics only.
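As a minimal sketch of that kind of ingestion pipeline, the snippet below extracts from a hypothetical sales_export.csv file, applies a light transformation, and loads the result into a warehouse table, with SQLite standing in for the warehouse:

```python
import csv
import sqlite3

# Stand-in warehouse table (SQLite used here purely for illustration).
conn = sqlite3.connect("warehouse.db")
conn.execute("""CREATE TABLE IF NOT EXISTS daily_sales (
                    sale_date TEXT, store_id TEXT, amount REAL)""")

with open("sales_export.csv", newline="") as f:               # extract
    rows = [
        (r["sale_date"], r["store_id"], float(r["amount"]))   # transform: cast types
        for r in csv.DictReader(f)
        if r["amount"]                                         # transform: drop blanks
    ]

conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)  # load
conn.commit()
conn.close()
```

The file name, columns and target table are assumptions for the sketch; in practice this step is usually handled by a dedicated ETL tool on a schedule.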
Analytics 2.0 — Big Data Analytics
The evolution continued with the arrival of storage virtualization and distributed processing, thanks to Hadoop. With Hadoop's low cost, companies now had no reason not to start collecting, storing and processing data at massive scale. This was accompanied by a growing understanding of how to apply statistics to computer science (laying the foundation for the rise of Data Science) and new progress in Artificial Intelligence (AI) algorithms, which yielded impressive results.
During this period, Data Scientists wanted to process unstructured data (images, audio, video, geospatial data, tweets, etc.), and existing databases were not suited to this because of their built-in constraints, designed to ensure the consistency required by traditional applications. However, given the sheer size of the data collected and, more importantly, the fact that the data was used mainly for analytics, those constraints mattered much less, so database engines with far more relaxed constraints were needed; hence NoSQL. Many of these engines were also designed with analytics in mind, storing data in columnar form rather than by record. Row-oriented storage suits traditional applications, where each transaction is recorded as a row, whereas analytics is mostly concerned with summaries over a particular field, so columnar storage is the better fit.
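A small illustration of that columnar advantage: a summary over a single field only needs that one column read from disk, which columnar formats such as Parquet make possible (the file name here is hypothetical):

```python
import pandas as pd

# Only the "amount" column is read from the columnar file; the other
# fields of each record are never touched.
amounts = pd.read_parquet("transactions.parquet", columns=["amount"])
print(amounts["amount"].sum())
```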
Distributed processing, which started with MapReduce and is now dominated by Apache Spark, has enabled large-scale parallel processing without supercomputers, using clusters of everyday computers instead. This allows large datasets to be processed far faster. With all this storage and processing, coupled with the application of statistics and AI, companies can now do predictive analytics as well as optimization for prescriptive analytics.
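A sketch of what such a parallel summary might look like in PySpark, assuming a hypothetical Parquet dataset of transactions; the group-by is split across the cluster's workers rather than run on a single machine:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-sales-summary").getOrCreate()

# Read a (hypothetical) columnar dataset; the work is distributed over the cluster.
txns = spark.read.parquet("s3://example-bucket/transactions/")

# A summary over a couple of fields -- the kind of query columnar storage favours.
daily = (txns.groupBy("transaction_date")
             .agg(F.sum("amount").alias("total_amount"),
                  F.countDistinct("customer_id").alias("unique_customers")))

daily.write.mode("overwrite").parquet("s3://example-bucket/marts/daily_sales/")
```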
Analytics 3.0 — Data Enriched Products and Services
In the previous generation, organizations collected, stored and processed massive amounts of data. They made data-informed decisions based on the prescriptions produced by the analytics. Much of that generation was focused on driving the business, such as improving customer service, sales and marketing, all of which were internally focused. The next generation of analytics improves the company's products and services by infusing them with data, making them "smarter" as a differentiation strategy.
Building smarter products and services can start from many angles, but it begins with either what customers want or how the company can improve profits.
Examples of smart products and services
As an example, a smart insurance product can take the form of lower premiums for safer drivers and drivers with lower mileage. Data is collected on the distance driven in order to calculate a mileage-based discount. AI is applied to automatically read the actual miles driven from photos of the odometer sent in by customers, and fraud monitoring is applied to manage risk.
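A toy sketch of the pricing idea; the discount tiers below are invented purely for illustration and are not taken from any real product:

```python
def discounted_premium(base_premium: float, annual_miles: float) -> float:
    """Usage-based discount: the mileage tiers here are hypothetical."""
    if annual_miles < 5_000:
        discount = 0.20
    elif annual_miles < 10_000:
        discount = 0.10
    else:
        discount = 0.0
    return base_premium * (1 - discount)

# A low-mileage driver on a 1,200 base premium would pay 960 under these tiers.
print(discounted_premium(1200, 4_300))
```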
Ten Requirements for Capitalizing on Analytics 3.0 by Davenport
- Combine data from various sources and types
- Adopt new technologies that enable the first requirement
- Adopt agile methodologies based on iterative design
- Embed analytics everywhere within the organization
- Build a data discovery platform
- Foster cross-disciplinary data teams
- Senior management oversight — Chief Analytics Officer
- Focus on prescriptive analytics, which involves large-scale testing and optimization
- Industrialize data analytics by focusing on scalable processes
- Insist on data driven decision making and managing
A data-driven organization must adhere to a new paradigm of decision making and management by fostering experimentation and leveraging statistical tools to confirm findings. Insist that decisions cannot be made on "gut" feel alone; while gut feel makes decisions faster, data-driven leaders should anticipate the decisions they will have to make so they can gather the data and conduct rigorous analysis before the decision is due. Not doing so is just plain lazy.
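As a sketch of "confirming the gut feel with data", here is a simple two-sample test on a hypothetical experiment; the numbers are simulated and the 0.05 threshold is just the conventional choice:

```python
import numpy as np
from scipy import stats

# Compare a metric between a control group and a treatment group (simulated data).
rng = np.random.default_rng(42)
control   = rng.normal(loc=0.52, scale=0.10, size=500)   # hypothetical baseline
treatment = rng.normal(loc=0.55, scale=0.10, size=500)   # hypothetical new idea

t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The data supports the idea; act on it.")
else:
    print("No significant effect; the gut feel is not confirmed.")
```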
Case Study
This is how I would describe a financial institution that is leveraging Analytics 3.0.
Bank A has created a new seat on the board for its newly hired Chief Analytics Officer, Linda. She was tasked with creating an organizational culture in which decisions are made using data and daily management activities are guided by data. Gone are the days of using gut feel to drive decisions. Although gut feel was successful in the past, some calls were disastrous, and management had been willing to accept that for lack of an alternative, until now. If an idea comes from gut feel, data must be analyzed to confirm it; the bigger the cost and impact, the more data is needed. This is the culture. If time does not permit, then more effort is needed to anticipate the decision much earlier.
To speed up analytics, a data platform was built to store massive amounts of data, and data engineers were hired to build pipelines to new data points. Self-service platforms were deployed so that as many team members as possible could conduct data discovery. Organization-wide training was conducted and written into HR policies, and cross-functional sharing between teams was encouraged.
The organization built a center of excellence (COE) focused on defining scalable processes, ensuring analytics could be applied throughout the organization with the fewest resources.
From this foundation, the deposit team was able to conduct deep dives into current depositors and identify the key predictors of churn. The model was back-tested to confirm its efficacy, and the marketing team was roped in to build a sales funnel to engage the newly identified likely churners. Models are continuously monitored for performance drift and retrained on new data, with an analytics team responsible for the monitoring and retraining. Processes mandate quarterly reviews, with the results presented to the monthly committee meetings.
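A minimal sketch of such a churn model, assuming a hypothetical extract of depositor features from the data platform; the hold-out score is a simple stand-in for the back-test described above:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical depositor extract and feature names.
df = pd.read_csv("depositor_features.csv")
features = ["balance_trend", "txn_count_90d", "product_count", "tenure_months"]

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["churned"], test_size=0.2,
    stratify=df["churned"], random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("hold-out AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Score current depositors; the highest-risk ones feed the marketing funnel.
df["churn_risk"] = model.predict_proba(df[features])[:, 1]
```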
Existing products were enhanced to include the elements identified as most appealing to the widest segment of existing depositors, while campaigns built around the elements that appealed to the smallest segments were generated to target the underserved.
The team also conducted a deep dive into every existing segment and built personas for each one, with a marketing strategy focused on each persona.
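One way the segmentation step might look, sketched with k-means over hypothetical behavioural features; the cluster profiles become the raw material for persona writing:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical depositor extract and feature names.
df = pd.read_csv("depositor_features.csv")
features = ["avg_balance", "txn_count_90d", "digital_logins_30d", "age"]

X = StandardScaler().fit_transform(df[features])
df["segment"] = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# Per-segment averages describe each group and seed its persona.
print(df.groupby("segment")[features].mean())
```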
The existing data discovery platform is fed with new data points built by the data engineers, and project processes are governed by the COE to keep them scalable. The board is kept up to date through committees that track all the models built, and every business line reports its revenue uplift through these committees.
KPIs and metrics are in place to give Linda visibility into her effort to drive data-driven decision making and data-infused products throughout the organization.