
Data’s next frontier: Why architecture, not just AI, will define the future of analytics

By: Subhashis Manna

The age of big data is no longer novel. What’s novel—and urgent—is how organisations navigate the fast-emerging crossroads of real-time analytics, architectural retooling, and privacy-led regulation. In an era where data is both abundant and surveilled, and where AI capabilities are evolving faster than policy frameworks, the true frontier is not in the algorithms. It is in the infrastructure.

This is a first-principles view of how the data landscape is shifting—not as a tech trend, but as a systemic transition grounded in architecture, governance, and the rebalancing of power between design, platforms, and subject matter expertise.

From data lakes to data products: The rise of open, programmable infrastructure

What began as a problem of data scarcity is now a problem of data accessibility. India’s National Data and Analytics Platform (NDAP), led by NITI Aayog, marks a fundamental shift in how governments think about data—not as closed departmental records, but as public infrastructure. The NDAP reorganises hundreds of ministry-level datasets into a coherent, queryable interface, complete with APIs and semantic layers. This is no mere dashboard; it is the groundwork for democratised data and analytics.

In parallel, we see growing momentum behind data mesh architectures: decentralised systems that treat data as a product, owned and maintained by the teams that generate it. First formalised by Zhamak Dehghani in Data Mesh: Delivering Data-Driven Value at Scale, this architectural principle is quietly transforming how enterprises govern, share, and scale their analytics workflows.
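
To make the data-as-a-product idea concrete, here is a minimal sketch in Python with entirely illustrative names (DataProduct, owner_team, freshness_sla_minutes): a domain team publishes a contract that declares ownership, schema, freshness, and quality expectations alongside the data itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DataProduct:
    """Illustrative contract for a domain-owned data product."""
    name: str                      # e.g. "orders.daily_settlements"
    owner_team: str                # the domain team accountable for the data
    schema: Dict[str, str]         # column name -> declared type
    freshness_sla_minutes: int     # how stale the product is allowed to become
    quality_checks: List[str] = field(default_factory=list)  # named validation rules

# A hypothetical product registered by the team that generates the data
settlements = DataProduct(
    name="orders.daily_settlements",
    owner_team="payments-domain",
    schema={"order_id": "string", "settled_at": "timestamp", "amount_inr": "decimal"},
    freshness_sla_minutes=60,
    quality_checks=["no_null_order_id", "amount_non_negative"],
)
```

The mechanics matter less than the principle: the team that generates the data also owns its fitness for consumption.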

Real-time governance: When dashboards become decision engines

Data isn’t just historical anymore; it’s operational. Platforms like Apache Kafka and Flink, built for millisecond latency, are being adopted across public and private systems to handle event streams in real time. One under-reported example lies in India’s public sector: the Andhra Pradesh Real-Time Governance System (RTGS) integrates live feeds from health, education, and municipal systems, transforming policymaking from post-mortem analysis to dynamic, feedback-driven governance.
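
As a minimal sketch of the streaming pattern, assuming a local Kafka broker and an invented topic name, the snippet below uses the kafka-python client to process events as they arrive rather than in nightly batches; handle_event stands in for whatever downstream logic applies.

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

def handle_event(payload: dict) -> None:
    # Placeholder for downstream logic: alerting, aggregation, writing to a store, etc.
    print(payload)

# Invented topic carrying live service events; the broker address is illustrative.
consumer = KafkaConsumer(
    "service-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="latest",
)

for event in consumer:
    # Each record is handled within moments of being produced,
    # not at the end of a nightly batch window.
    handle_event(event.value)
```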

This is what the literature describes as the Lambda Architecture — a system combining both batch (historical) and stream (real-time) layers to give administrators both hindsight and foresight (Marz & Warren, 2015). The challenge now is not collecting the data — it’s building systems that can act on it while it’s still relevant.
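
A toy illustration of the Lambda idea, with invented ward names and counts: queries are answered from a precomputed batch view merged with an incremental real-time view, so fresh events become visible before the next batch run.

```python
from collections import defaultdict

# Batch layer: precomputed counts from last night's full recomputation (illustrative data).
batch_view = {"ward_12": 1840, "ward_07": 960}

# Speed layer: counts accumulated from the stream since the last batch run.
realtime_view = defaultdict(int)

def record_event(ward: str) -> None:
    """Called by the streaming layer for each incoming event."""
    realtime_view[ward] += 1

def query(ward: str) -> int:
    """Serving layer: merge hindsight (batch) with what happened minutes ago (stream)."""
    return batch_view.get(ward, 0) + realtime_view[ward]

record_event("ward_12")
print(query("ward_12"))  # 1841: the batch total plus the event seen moments ago
```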

In the race to harness AI, it's not just the intelligence of algorithms that will shape the future, but the intelligence of the architecture that supports them. Data without design is noise; with the right architecture, it becomes actionable insight. AI may be the engine, but data architecture is the road. Without it, even the smartest algorithms go nowhere. AI learns. Architecture enables. Together, they define the future of data.

 

Subhashis Manna, Partner, New & Emerging Tech, Grant Thornton Bharat

AI is powerful, but uncertainty is inevitable

The conversation around AI tends to centre on power: more data, more computing, more accuracy. But increasingly, the frontier lies in how models express uncertainty.

Once a niche domain in academia, Bayesian approaches are now central to modern probabilistic programming frameworks like PyMC3 and TensorFlow Probability. Research from MIT, Cambridge, and elsewhere shows how these models allow for reasoned, interpretable estimates, especially under ambiguity (van de Meent et al., An Introduction to Probabilistic Programming, 2018). As AI enters regulated sectors such as healthcare, finance, and justice, this probabilistic reasoning isn’t a feature; it’s a necessity.
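
A minimal sketch of what expressing uncertainty looks like in practice, assuming PyMC3 and an invented toy dataset: the output is a posterior distribution over the quantity of interest, not a single point estimate.

```python
import numpy as np
import pymc3 as pm

# Toy data: 87 approvals out of 100 hypothetical loan applications.
approvals = np.concatenate([np.ones(87), np.zeros(13)])

with pm.Model():
    # Prior belief about the approval rate before seeing the data.
    rate = pm.Beta("rate", alpha=1, beta=1)
    # Likelihood of the observed outcomes given that rate.
    pm.Bernoulli("obs", p=rate, observed=approvals)
    # Sample the posterior: a distribution, not a point estimate.
    trace = pm.sample(2000, tune=1000, return_inferencedata=False)

# Report the estimate together with its uncertainty, e.g. a 94% credible interval.
print(trace["rate"].mean(), np.percentile(trace["rate"], [3, 97]))
```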

Meanwhile, the arrival of Large Language Models (LLMs) is remaking analytics from the ground up. LLMs don’t just generate text; they interpret, translate, and synthesise unstructured information into code, SQL queries, and synthetic datasets. Their integration into enterprise analytics pipelines—documented by researchers like Liu et al. (2023)—means that the entire interface between humans and data systems is now shifting from dashboards to dialogue.
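
A sketch of the dashboards-to-dialogue shift, deliberately vendor-neutral: complete() below is a placeholder for whichever LLM an organisation uses, and the schema and question are invented. The point is the pattern (natural language in, executable SQL out), not any specific API.

```python
import sqlite3

def complete(prompt: str) -> str:
    """Placeholder for a call to an LLM; swap in the provider client of your choice."""
    raise NotImplementedError

def ask(question: str, schema: str, conn: sqlite3.Connection):
    # The model translates a business question into SQL against a known schema.
    sql = complete(
        f"Schema:\n{schema}\n\nWrite a single SQL query answering: {question}\nSQL:"
    )
    # Guardrail: in production the generated SQL would be validated before execution.
    return conn.execute(sql).fetchall()

# Illustrative usage:
#   ask("Which district had the most clinic visits last week?",
#       schema="visits(district TEXT, visited_on DATE)", conn=conn)
```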

And yet, the most profound shift may be architectural: “Agentic” AI systems, capable of planning, reasoning, and using external tools across tasks, are not speculative—they are being operationalised in open-source research (Yao et al., 2022). These systems no longer simply respond—they act.
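
In the same spirit, a highly simplified sketch of an agentic loop, with an invented tool registry and the same placeholder idea for the model call: the system alternates between choosing an action and observing the result of an external tool until it judges the task complete.

```python
from typing import Callable, Dict

def complete(prompt: str) -> str:
    """Placeholder for an LLM call; the loop, not the model, is the point here."""
    raise NotImplementedError

# Invented tool registry: the agent can only act through functions we expose to it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_datasets": lambda q: f"3 datasets matched '{q}'",
    "run_sql": lambda q: "query executed (placeholder result)",
}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = f"Task: {task}"
    for _ in range(max_steps):
        step = complete(history + "\nNext action as 'tool: input', or 'FINISH: answer'.")
        if step.startswith("FINISH:"):
            return step.removeprefix("FINISH:").strip()
        tool_name, _, tool_input = step.partition(":")
        observation = TOOLS.get(tool_name.strip(), lambda _: "unknown tool")(tool_input.strip())
        history += f"\n{step}\nObservation: {observation}"
    return "stopped: step limit reached"
```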

The regulatory horizon: Law as architecture

Every leap in data infrastructure eventually confronts a second infrastructure: regulation.

The General Data Protection Regulation (GDPR) reshaped global data practices, not through punitive intent but through architectural enforcement: data minimisation, storage limitation, explainability, and user agency are now design constraints, not afterthoughts. Similarly, the Digital Markets Act (DMA) extends the debate to platform interoperability, seeking to rebalance power from digital gatekeepers to digital citizens.

India’s own Digital Personal Data Protection Act (DPDPA), 2023, mirrors these principles, though tailored to the Indian federal and sectoral landscape. It mandates purpose limitation, consent-based access, and the creation of an independent Data Protection Board, marking a departure from self-regulation toward adjudicated data governance.

And in the US, California’s CCPA (California Consumer Privacy Act) and its successor, the CPRA, codify user rights to access, delete, and opt out of the sale of their data. While narrower than the GDPR in some respects, these laws have forced firms to rethink consent capture, data inventories, and sharing agreements, especially those involving third-party analytics vendors.

Together, these frameworks mark a pivot: data is no longer just a resource; it is a regulated space. Compliance is no longer a matter of policy—it is a matter of systems design.

Data quality and security are the new scarcity

The abundance of data has not eliminated scarcity — it has shifted it. Today’s constraints are around quality, integration, and security.

  • On quality: Open-source frameworks like ‘Great Expectations’ now serve as automated validation engines, enforcing expectations on data integrity before it enters production systems.
  • On integration: MLOps workflows using tools like MLflow and FastAPI ensure that models don’t just work in notebooks but integrate cleanly into enterprise-scale APIs (a minimal sketch follows this list).
  • On security: Differential privacy, homomorphic encryption, and zero-trust architecture (as codified by NIST SP 800-207) are no longer theoretical—they are prerequisites in health, finance, and government sectors.
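
On the integration point above, a minimal sketch assuming a model already logged to an MLflow registry under an illustrative models:/churn/Production URI, served as an enterprise API with FastAPI:

```python
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI

app = FastAPI()

# Illustrative registry URI; in practice this points at your MLflow model registry.
model = mlflow.pyfunc.load_model("models:/churn/Production")

@app.post("/predict")
def predict(records: list[dict]):
    # Incoming JSON records are scored by the registered model, not a notebook copy of it.
    frame = pd.DataFrame(records)
    return {"predictions": model.predict(frame).tolist()}
```

The design choice is that serving, versioning, and validation live in one governed pipeline rather than in scattered notebooks.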

In each case, the challenge is not that tools don’t exist — it’s that most organisations have not yet integrated them into a single, coherent system.

Edge analytics and augmented intelligence: The next layer of locality

As analytics moves from centralised warehouses to real-time systems, the next frontier lies in edge analytics — processing data closer to where it's generated. Whether in manufacturing systems, smart cities, or health devices, edge computing reduces latency, increases privacy, and enables more context-specific decision-making. NVIDIA’s Jetson platform and Google's Coral Edge TPU are already being integrated into industrial analytics workflows — turning inference into an on-site function rather than a cloud dependency.
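
As an illustrative sketch of on-device inference (the model file and input are invented), using the TensorFlow Lite runtime commonly deployed on such edge boards; on a Coral device an Edge TPU delegate could be passed in as well:

```python
import numpy as np
import tflite_runtime.interpreter as tflite  # lightweight runtime for edge devices

# Illustrative model file; on a Coral board you could additionally pass
# experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")] to use the Edge TPU.
interpreter = tflite.Interpreter(model_path="defect_detector.tflite")
interpreter.allocate_tensors()

inputs = interpreter.get_input_details()
outputs = interpreter.get_output_details()

# A hypothetical sensor frame, shaped to match the model's expected input.
frame = np.zeros(inputs[0]["shape"], dtype=inputs[0]["dtype"])

interpreter.set_tensor(inputs[0]["index"], frame)
interpreter.invoke()  # inference happens on-site; no round trip to a cloud endpoint
score = interpreter.get_tensor(outputs[0]["index"])
```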

In parallel, augmented analytics is democratising access by embedding ML-driven suggestions, automated insights, and natural language queries directly into BI tools. Instead of technical analysts translating dashboards, business users interact directly with the data — an evolution driven by tools like ThoughtSpot, Tableau Pulse, and Microsoft’s Copilot. This is not just usability — it’s an architectural shift toward assistive interfaces that support human cognition, not just automation.

Challenges and potential solutions

Despite innovation, several structural challenges remain. Each challenge presents not just a technical bottleneck but an architectural opportunity:

  1. Data quality and accuracy
    • Challenge: Ensuring the accuracy and quality of data remains a significant challenge. Poor data quality can lead to incorrect insights, misguided decisions, and operational inefficiencies.
    • Solution: Implementing robust data governance frameworks, data cleansing processes, and quality assurance practices can help address these issues. Regular audits, automated data validation tools, and clear data management policies are essential for maintaining high data quality.
  2. Scalability of analytics solutions
    • Challenge: As data volumes grow, scaling analytics solutions to handle larger datasets and more complex analyses becomes increasingly difficult.
    • Solution: Adopting cloud-based analytics platforms and distributed computing technologies can provide the scalability needed. Scalable data storage (like data lakes) and data partitioning enhance performance and manageability.
  3. Data privacy and compliance
    • Challenge: Navigating complex data privacy regulations and ensuring compliance across jurisdictions is a growing concern.
    • Solution: Developing a comprehensive data privacy strategy—compliant with laws like GDPR, DPDPA, and CCPA—alongside regular audits and employee training can mitigate risks. Privacy-enhancing technologies (PETs) and compliance tools also help streamline adherence.
  4. Integration of AI and ML models
    • Challenge: Integrating AI/ML into existing systems is difficult due to compatibility issues, data integration complexities, and maintenance challenges.
    • Solution: Using standardised APIs, modular AI frameworks, and continuous monitoring ensures smoother integration. Cross-functional collaboration between data scientists, IT, and business units is key to deployment success.

The strategic view: Architecture as competitive advantage

The dominant narrative around data remains algorithmic: better models, bigger transformers, faster inference. But that narrative is incomplete without the data, architectural design, and infrastructure that sit beneath it. The real advantage lies not in the AI layer, but in the system that surrounds it: the infrastructure that governs, secures, and deploys it at scale.

Just as it was railroads, not locomotives alone, that enabled commerce, the future of analytics will be determined not by individual models but by the architecture that connects them: semantic layers, real-time engines, federated governance, and trust-by-design systems.

Governments have begun treating data platforms as national assets. Enterprises must now follow suit—reimagining their data infrastructure not as back-office plumbing, but as competitive infrastructure in a world that will increasingly reward data trust, data fluency, and data agility.

 

Santanak Datta, Manager, Grant Thornton Bharat, has also contributed to this article.