Data Discovery and Classification: Why Now is the Time to Address the Gap
- David Houghton

- Sep 7, 2025
David Houghton, Senior Consultant, BearingNode APAC + Jana — AI Consultant
Key Takeaways
AI value is throttled today by immature data discovery & classification.
Regulatory pressure (privacy, sovereignty, financial reporting) is increasing the urgency.
Mature discovery reduces risk surface, improves AI model quality & accelerates compliance evidence.
Invest now: cheaper to build proactively than remediate later (3–5x cost delta).
Start with inventory, standard classification, automation, stewardship, continuous improvement.
In an era where data drives every business decision and artificial intelligence promises to unlock unprecedented productivity gains, many organisations are rushing headlong into transformation without addressing a fundamental weakness: non-existent or immature data discovery and classification capabilities. This oversight isn't just a technical gap; it increases the risk of non-compliance, diminishes operational integrity, hinders effective AI deployment, and erodes competitive advantage.
The AI Productivity Promise and the Data Foundation Reality
The Australian Productivity Commission's recent interim report on "Harnessing Data and Digital Technology" paints an optimistic picture of AI's potential impact. The report finds that AI will likely add more than $116 billion to Australian economic activity over the next decade.
However, this transformative potential comes with a critical prerequisite that many organisations overlook: comprehensive understanding and control of their data assets.
The report also recommends that the Government establish simple, flexible regulatory pathways to give individuals and businesses greater access to data that relates to them. This regulatory shift toward enhanced data access rights makes robust data discovery and classification not just beneficial but legally essential.
We agree with the Commission's report. And in working with clients, we've consistently found that the foundational elements of Data Discovery and Classification are often overlooked, immature and/or not integrated with risk management or efforts to capture the AI productivity advantage.
In many cases, efforts have been narrowly focused on regulatory compliance, such as GDPR, without establishing a broader, more strategic approach. But with the rise of generative AI, both structured and unstructured data are rapidly becoming critical assets. These assets must be discovered, classified, and governed with precision to unlock their full value and mitigate risk.

The Choice Is Clear
The Australian Productivity Commission's research confirms what data professionals have long understood: the future belongs to organisations that can effectively harness their data assets. With better access to this data, businesses can get more value out of their products and services and get insights and advice that could help them make better decisions.
However, this future is only available to organisations that first understand what data they have, where it resides, and how it should be protected and leveraged. Those that continue to operate with immature data discovery capabilities will find themselves increasingly vulnerable to regulatory penalties, operational failures, and competitive displacement.
The question isn't whether your organisation can afford to invest in mature data discovery and classification capabilities; it's whether you can afford not to. In a world where data is the new oil, organisations operating blind to their own data assets are essentially drilling in the dark, hoping for the best whilst risking everything.
The time for half-measures and incremental improvements has passed. The organisations that will thrive in the AI-driven economy are those that start building comprehensive data discovery capabilities today. The rest will be left wondering what data they had, where it went, and why they didn't act sooner.
The Foundation Problem: What Is Immature Data Discovery and Classification?
Immature data discovery and classification capabilities manifest in several dangerous ways:
Incomplete Data Inventories
Organisations operate with partial visibility into their data landscape, missing critical data sources across cloud environments, legacy systems, and third-party applications. This incomplete picture means decisions are made on partial information, and regulatory requirements cannot be fully met.
Inconsistent Classification Standards
Without standardised data classification frameworks, the same type of sensitive information may be treated differently across business units, creating compliance gaps and security vulnerabilities.
Manual and Error-Prone Processes
Reliance on manual data identification and classification processes introduces human error, delays, and scalability limits that become increasingly problematic as data volumes grow.
Siloed Data Understanding
Different business units maintain separate, incompatible views of data assets, leading to duplicated efforts, conflicting classifications, and missed opportunities for data leverage.
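To make the contrast with manual classification concrete, here is a minimal Python sketch of rule-based classification applied consistently to any text. Everything in it is an assumption for illustration: the two regular expressions, the four sensitivity labels, and the default of "internal" are hypothetical and are not drawn from any particular tool or standard. Real discovery tools use far richer detection and validation.

```python
import re

# Illustrative detectors only (assumed for this sketch); production
# detectors need validation, tuning and checks such as Luhn for card numbers.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

# Hypothetical sensitivity labels, ordered from least to most sensitive.
LEVELS = ["public", "internal", "confidential", "restricted"]

# Hypothetical mapping from detector name to the label it implies.
LABEL_FOR = {"email": "confidential", "credit_card": "restricted"}


def classify(text: str) -> str:
    """Return the most sensitive label implied by any matching detector."""
    label = "internal"  # assumed default when nothing sensitive is found
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            candidate = LABEL_FOR[name]
            if LEVELS.index(candidate) > LEVELS.index(label):
                label = candidate
    return label


if __name__ == "__main__":
    for sample in ("Contact jane@example.com",
                   "card 4111 1111 1111 1111",
                   "quarterly meeting notes"):
        print(sample, "->", classify(sample))
```

Because the same rules run everywhere, two business units scanning the same content arrive at the same label, which is the consistency that the manual, siloed approach struggles to deliver.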
The Cascading Risks and Blockers to Capturing AI Productivity Gains
Regulatory Compliance Nightmares
The regulatory landscape is evolving rapidly, with privacy laws becoming more stringent and data governance requirements expanding globally. Organisations with immature data discovery face several compliance catastrophes:
Privacy Regulation Violations: The report finds that parts of the Privacy Act are too focused on prescribing actions or procedures businesses must take, rather than outcomes, and can place the burden of privacy protection on individuals, rather than businesses. Without comprehensive data discovery, organisations cannot identify where personal data resides, making compliance with GDPR, CCPA, or Australia's Privacy Act virtually impossible.
Financial Services Regulatory Gaps: Banking and financial institutions face challenges with Basel Committee requirements for comprehensive risk data aggregation. Incomplete data discovery makes accurate regulatory reporting impossible and exposes institutions to significant penalties.
Data Localisation Challenges: As governments implement data sovereignty requirements, organisations must know exactly where their data resides and how it moves across borders, which is impossible without mature discovery capabilities.
Operational Risk Amplification
Poor data discovery doesn't just create compliance issues; it fundamentally undermines business operations:
Flawed Decision-Making: Executive decisions based on incomplete data sets can lead to strategic miscalculations, failed investments, and missed market opportunities. When leadership doesn't know what data they have, they can't leverage it effectively.
AI and Analytics Failures: The report finds that improving people's ability to access data that relates to them could spur competition and innovation and deliver productivity gains worth as much as $10 billion a year. However, AI models trained on incomplete or misclassified data produce unreliable results, undermining the productivity gains that AI promises to deliver.
Inefficient Resource Allocation: Organisations may duplicate data collection efforts, over-invest in unnecessary systems, or under-protect critical data assets due to incomplete understanding of their data landscape.
Cybersecurity Vulnerabilities
Unknown data assets represent unknown security risks:
Shadow Data Exposure
Unidentified data repositories often lack proper security controls, creating easy targets for cybercriminals. The 2023 MOVEit breach, which exploited a vulnerability in a widely used file transfer product and affected hundreds of organisations, illustrates how hard it is to scope exposure when organisations don't know where their sensitive data is stored.
Incomplete Incident Response
When a security breach occurs, organisations with poor data discovery cannot quickly identify what data was compromised, hampering incident response and potentially violating breach notification requirements.
Access Control Failures
Without comprehensive data classification, organisations cannot implement appropriate access controls, leading to over-privileged access and insider threat vulnerabilities.
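As a rough illustration of how a classified inventory supports both incident response and access control, the Python sketch below scopes a hypothetical breach and applies a deny-by-default access check. The asset names, system names, labels, and role mapping are all invented for this example and do not describe any real environment or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    system: str
    classification: str


# Hypothetical inventory; in practice this would come from a discovery tool.
INVENTORY = [
    Asset("customer_profiles", "crm-db", "restricted"),
    Asset("marketing_copy", "crm-db", "public"),
    Asset("payroll_2024", "hr-share", "confidential"),
]

# Labels assumed to trigger breach assessment in this sketch.
SENSITIVE = {"confidential", "restricted"}

# Hypothetical mapping of classification label to roles allowed to read it.
ALLOWED_ROLES = {
    "public": {"analyst", "steward", "dpo"},
    "confidential": {"steward", "dpo"},
    "restricted": {"dpo"},
}


def exposed_sensitive_assets(inventory, compromised_system):
    """List sensitive assets held on the compromised system."""
    return [
        a.name
        for a in inventory
        if a.system == compromised_system and a.classification in SENSITIVE
    ]


def can_access(role, asset):
    """Deny by default: unknown labels map to an empty set of roles."""
    return role in ALLOWED_ROLES.get(asset.classification, set())


if __name__ == "__main__":
    print(exposed_sensitive_assets(INVENTORY, "crm-db"))
    print(can_access("analyst", INVENTORY[0]))
```

Neither question can be answered this quickly when the inventory is incomplete or the labels are missing, which is the gap the three failure modes above describe.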
Strategic and Competitive Disadvantages
In today's data-driven economy, poor data discovery creates lasting competitive disadvantages:
Missed Innovation Opportunities
Organisations cannot leverage data assets they don't know they have. This blindness prevents data-driven innovation and limits the ability to develop new products, services, or business models.
Customer Experience Degradation
Without understanding customer data across all touchpoints, organisations cannot deliver personalised experiences or respond effectively to customer needs.
Partnership and M&A Complications
Due diligence processes become complex and risky when organisations cannot provide comprehensive data inventories, potentially derailing valuable business opportunities.
The Financial Reality: Quantifying the Cost of Inaction
The financial implications of immature data discovery are staggering:
Regulatory Penalties
GDPR fines can reach 4% of global annual turnover or EUR 20 million, whichever is higher. The $1.3 billion fine against Meta and the $877 million penalty against Amazon demonstrate the scale of potential financial impact.
Operational Inefficiencies
Organisations with poor data discovery typically spend 40-60% more on data management activities due to redundant efforts, manual processes, and system integration challenges.
Lost Revenue Opportunities
McKinsey research suggests that organisations with mature data capabilities generate 20% more revenue from data-driven initiatives compared to those with immature capabilities.
Remediation Costs
Retroactively implementing comprehensive data discovery and classification can cost 3-5 times more than building these capabilities proactively.
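The arithmetic behind two of these figures can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a financial model: the turnover and programme cost inputs are hypothetical, the EUR 20 million floor comes from GDPR Article 83(5) rather than from this article, and the 3x to 5x multipliers are simply the range quoted above.

```python
def gdpr_max_fine(global_annual_turnover_eur: float) -> float:
    """Upper bound for the most serious GDPR infringements:
    the higher of EUR 20 million or 4% of global annual turnover."""
    return max(20_000_000.0, 0.04 * global_annual_turnover_eur)


def remediation_range(proactive_cost: float,
                      low: float = 3.0,
                      high: float = 5.0) -> tuple[float, float]:
    """Estimated cost range of retrofitting discovery later,
    using the 3x to 5x multipliers quoted in the text."""
    return (proactive_cost * low, proactive_cost * high)


if __name__ == "__main__":
    # Hypothetical organisation: EUR 2 billion turnover,
    # EUR 1 million proactive discovery programme.
    print(gdpr_max_fine(2_000_000_000))
    print(remediation_range(1_000_000))
```

Even with rough inputs, putting the maximum exposure next to the cost of a proactive programme gives boards a simple way to frame the investment decision.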
The Australian Context: Regulatory Evolution and Economic Opportunity
The Australian Productivity Commission's findings highlight both the opportunity and the urgency. The PC recommends the Government introduce an alternative compliance pathway for businesses to meet their privacy obligations, focused on outcomes rather than controls-based rules. This shift toward outcomes-based regulation makes robust data discovery even more critical, as organisations will need to demonstrate effective data protection outcomes rather than simply following prescribed procedures.
Furthermore, Australia is one of the only countries that still requires companies to submit financial reports in non-digital formats like hardcopy or PDF. This makes extracting data from Australian financial reports expensive, time-consuming and more prone to error. As Australia moves toward mandatory digital financial reporting, organisations will need comprehensive data discovery capabilities to meet these new requirements efficiently.
Building Resilient Discovery and Classification: A Path Forward
Organisations can no longer afford to treat data discovery and classification as technical afterthoughts. Building mature capabilities requires:
Board and Executive Commitment
Data discovery must be recognised as a strategic imperative with board-level oversight (who is the Non-Executive Director sponsoring data?), Executive Committee alignment, and investment.
Technology Investment
Modern AI-powered discovery tools can automate much of the heavy lifting, but they require significant upfront investment and ongoing maintenance.
Cultural Change
Organisations must shift from data hoarding to data stewardship, with clear roles, responsibilities, and accountability for data management.
Continuous Improvement
Data landscapes evolve constantly, requiring ongoing discovery efforts and adaptive classification frameworks.
What is Data Discovery and Classification?
To learn more about how discovery and classification capabilities fit within an enterprise framework for information asset management and observability, explore our interactive Connected Operating Model Cube. You can also find more on this topic over on the BearingNode blog.


