Astra: AI Safety, Trust, & Risk Assessment
This paper argues that existing global AI safety frameworks exhibit contextual blindness towards India’s unique socio-technical landscape. With a population of 1.5 billion and a massive informal economy, India’s AI integration faces specific challenges that Western, market-centric narratives frequently overlook: caste-based discrimination, linguistic exclusion of vernacular speakers, and infrastructure failures in low-connectivity rural zones. We introduce ASTRA, an empirically grounded AI Safety Risk Database designed to categorize risks through a bottom-up, inductive process. Unlike general taxonomies, ASTRA defines AI Safety Risks specifically as hazards stemming from design flaws, such as skewed training sets or a lack of guardrails, that can be mitigated through technical iteration or architectural changes. The framework employs a tripartite causal taxonomy that evaluates risks by their implementation timing (development, deployment, or usage), the responsible entity (the system or the user), and the nature of the intent (unintentional vs. intentional). Central to the research is a domain-agnostic ontology that organizes 37 leaf-level risk classes into two primary meta-categories: Social Risks and Frontier/Socio-Structural Risks. By focusing initial efforts on the Education and Financial Lending sectors, the paper establishes a scalable foundation for a “living” regulatory utility intended to evolve alongside India’s expanding AI ecosystem.
💡 Research Summary
The paper argues that existing global AI safety frameworks suffer from “contextual blindness” when applied to India’s uniquely complex socio‑technical environment. With a population of nearly 1.5 billion, hundreds of languages, a massive informal economy, and a digital public‑goods (DPG) ecosystem exemplified by Aadhaar and UPI, India faces AI‑related hazards that are rarely addressed in Western, market‑centric narratives. These include caste‑based discrimination, exclusion of vernacular speakers, and infrastructure failures in low‑connectivity rural zones.
To fill this gap, the authors introduce ASTRA (AI Safety Risk Assessment), an empirically grounded database that classifies AI safety risks (ASRs) as hazards arising from design flaws—such as biased training data or missing guardrails—that can be mitigated through technical iteration. ASTRA adopts a tripartite causal taxonomy: timing (development, deployment, usage), responsible entity (system or user), and intent (unintentional vs. intentional). Building on this taxonomy, 37 leaf-level risk classes are organized into two meta-categories—Social Risks and Frontier/Socio-Structural Risks—capturing both direct harms (e.g., discrimination, privacy breaches) and structural vulnerabilities (e.g., reliance on foreign APIs, digital sovereignty concerns).
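A taxonomy of this shape lends itself to a simple data model. The sketch below is an illustrative Python encoding of the three causal axes and the two meta-categories; the class and field names are assumptions for clarity, not a schema published in the paper.

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative encoding of ASTRA's tripartite causal taxonomy.
# All names below are hypothetical; the paper does not publish a schema.

class Timing(Enum):
    DEVELOPMENT = "development"
    DEPLOYMENT = "deployment"
    USAGE = "usage"

class ResponsibleEntity(Enum):
    SYSTEM = "system"
    USER = "user"

class Intent(Enum):
    UNINTENTIONAL = "unintentional"
    INTENTIONAL = "intentional"

class MetaCategory(Enum):
    SOCIAL = "Social Risks"
    FRONTIER_SOCIO_STRUCTURAL = "Frontier/Socio-Structural Risks"

@dataclass
class RiskClass:
    """One of the 37 leaf-level risk classes, tagged along the three causal axes."""
    name: str
    meta_category: MetaCategory
    timing: Timing
    responsible_entity: ResponsibleEntity
    intent: Intent

# Example: linguistic exclusion traced back to skewed training data.
vernacular_exclusion = RiskClass(
    name="Linguistic exclusion of vernacular speakers",
    meta_category=MetaCategory.SOCIAL,
    timing=Timing.DEVELOPMENT,
    responsible_entity=ResponsibleEntity.SYSTEM,
    intent=Intent.UNINTENTIONAL,
)
```

Encoding each axis as an enumeration keeps the leaf-level classes mutually comparable, which is what allows the database to be queried by timing, entity, or intent rather than by domain alone.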
A key methodological contribution is the “remediability heuristic”: a risk is deemed an ASR if a new version of the AI system could plausibly eliminate the harm. This shifts evaluation away from purely probabilistic actuarial models—where AI failures are often black-swan events—toward a focus on severity and the feasibility of technical remediation.
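As a rough illustration of how such a heuristic might be operationalized, the toy functions below use remediability as the classification criterion and severity, rather than estimated likelihood, as the triage key. The `Hazard` fields and the severity scale are assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Hazard:
    """Hypothetical incident record; field names are illustrative, not ASTRA's schema."""
    description: str
    severity: int                     # assumed scale: 1 (minor) .. 5 (catastrophic)
    remediable_by_new_version: bool   # could a revised system plausibly eliminate the harm?

def is_ai_safety_risk(hazard: Hazard) -> bool:
    """Remediability heuristic: treat a hazard as an ASR only if a new
    system version could plausibly remove the harm."""
    return hazard.remediable_by_new_version

def triage(hazards: list[Hazard]) -> list[Hazard]:
    """Rank remediable hazards by severity instead of by incident probability."""
    asrs = [h for h in hazards if is_ai_safety_risk(h)]
    return sorted(asrs, key=lambda h: h.severity, reverse=True)
```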
The paper demonstrates the taxonomy with two pilot domains: Education and Financial Lending. In education, risks include LLMs that cannot recognize regional dialects, leading to unequal access to learning resources, and AI‑assisted tutoring that may inadvertently encode caste bias. In financial lending, biased credit‑scoring models can exclude smallholder farmers, and reliance on digital payment infrastructure can disenfranchise users in poorly connected areas. For each case, ASTRA identifies the causal chain, assigns the appropriate risk class, and proposes concrete mitigations such as diversified data collection, robust guardrails, and infrastructure investment.
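A database entry for the dialect-recognition case might then look something like the following sketch, tying the causal chain, risk class, and mitigations together in one record. The keys and values are illustrative assumptions, not rows from the actual ASTRA database.

```python
# Hypothetical ASTRA-style record for one education-domain case.
dialect_gap_entry = {
    "domain": "Education",
    "hazard": "LLM tutor fails to recognize regional dialects",
    "causal_chain": [
        "training corpus dominated by high-resource languages",
        "model misinterprets vernacular queries",
        "students in dialect-speaking regions receive lower-quality help",
    ],
    "risk_class": "Linguistic exclusion of vernacular speakers",
    "meta_category": "Social Risks",
    "timing": "development",
    "responsible_entity": "system",
    "intent": "unintentional",
    "mitigations": [
        "diversified, dialect-inclusive data collection",
        "guardrails that flag low-confidence handling of vernacular input",
        "offline or low-bandwidth fallbacks for poorly connected regions",
    ],
}
```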
The authors critically compare ASTRA with several international frameworks: the EU AI Act (a rigorous tiered-risk approach, but one that assumes strong regulatory capacity), Singapore’s Model AI Governance (practical guidance but less suited to India’s scale), UNESCO’s AI Ethics Guidance (high-level human-rights focus but lacking operational detail), and the World Bank’s Responsible AI for Development (emphasizes low-capacity contexts but provides only a broad perspective). They argue that these models variously assume uniform digital literacy, homogeneous institutional capacity, or a commercial-centric deployment model, none of which aligns with India’s DPG-driven, multilingual, and resource-constrained reality.
Limitations acknowledged include the early stage of the ASTRA database—empirical incident counts and quantitative severity metrics are still sparse—and the deliberate exclusion of “systemic risks” such as labor‑market disruption, which the authors treat as beyond the technical scope of ASRs. Future work is outlined as continuous data collection, integration of systemic risk categories, and the creation of feedback loops between regulators, developers, and civil‑society stakeholders.
In sum, ASTRA offers a context‑aware, design‑focused risk taxonomy that can be operationalized across Indian public‑sector AI deployments. By grounding risk identification in India’s linguistic diversity, caste dynamics, and infrastructural constraints, and by emphasizing remedial feasibility, the framework aims to balance rapid AI innovation with the protection of fundamental rights and national digital sovereignty.