LexChronos: An Agentic Framework for Structured Event Timeline Extraction in Indian Jurisprudence
Understanding and predicting judicial outcomes demands nuanced analysis of legal documents. Traditional approaches treat judgments and proceedings as unstructured text, limiting the effectiveness of large language models (LLMs) in tasks such as summarization, argument generation, and judgment prediction. We propose LexChronos, an agentic framework that iteratively extracts structured event timelines from Supreme Court of India judgments. LexChronos employs a dual-agent architecture: a LoRA-instruct-tuned extraction agent identifies candidate events, while a pre-trained feedback agent scores and refines them through a confidence-driven loop. To address the scarcity of Indian legal event datasets, we construct a synthetic corpus of 2000 samples using reverse-engineering techniques with DeepSeek-R1 and GPT-4, generating gold-standard event annotations. Our pipeline achieves a BERT-based F1 score of 0.8751 against this synthetic ground truth. In downstream evaluations on legal text summarization, GPT-4 preferred structured timelines over unstructured baselines in 75% of cases, demonstrating improved comprehension and reasoning in Indian jurisprudence. This work lays a foundation for future legal AI applications in the Indian context, such as precedent mapping, argument synthesis, and predictive judgment modelling, by harnessing structured representations of legal events.
💡 Research Summary
LexChronos introduces an agentic framework designed to automatically extract structured event timelines from Supreme Court of India judgments. The authors identify two major gaps in the current legal AI landscape: (1) large language models (LLMs) are typically applied to legal texts as monolithic blocks, which prevents them from capturing the intricate temporal, causal, and hierarchical relationships that are essential for legal reasoning; and (2) there is a severe shortage of publicly available, event‑level annotated datasets for Indian jurisprudence, which hampers reproducibility and benchmarking.
To address these challenges, LexChronos adopts a dual‑agent architecture that iteratively refines candidate events. The first component, the extraction agent, is a LoRA‑instruct‑tuned LLM with fewer than 4 billion parameters. LoRA (Low‑Rank Adaptation) trains small low‑rank update matrices while keeping the pretrained weights frozen, which makes adapting the model to the legal domain comparatively cheap. Given a raw judgment, the extraction agent proposes a set of candidate events, each described by four attributes: a timestamp, a narrative description, the presiding judge(s), and any cited precedent. This schema, termed the LexChronos Event Schema (LES), is derived from a systematic analysis of Indian Supreme Court judgments and is intended to capture the eight canonical components of a judgment that legal scholars routinely reference, among them facts, issues, arguments, analysis, precedent, reasoning, and conclusion.
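A minimal sketch of how one LES event might be represented and serialized, assuming only the four attributes listed above. The class name `LegalEvent`, the field names, and the ISO date format are illustrative assumptions, not the authors' actual schema definition.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class LegalEvent:
    """One timeline entry with the four attributes named in the summary."""
    timestamp: str                                        # assumed ISO date (YYYY-MM-DD)
    description: str                                      # narrative description
    judges: List[str] = field(default_factory=list)       # presiding judge(s)
    precedents: List[str] = field(default_factory=list)   # cited precedents, if any


def timeline_to_json(events: List[LegalEvent]) -> str:
    """Serialize events chronologically (ISO dates sort lexicographically)."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    return json.dumps([asdict(e) for e in ordered], indent=2)


def timeline_from_json(payload: str) -> List[LegalEvent]:
    """Rebuild event objects from the JSON produced above."""
    return [LegalEvent(**item) for item in json.loads(payload)]


# Placeholder values only; not taken from any real judgment.
events = [
    LegalEvent("2019-03-14", "Appeal admitted for hearing.", ["Judge A"]),
    LegalEvent("2018-11-02", "High Court dismisses the petition.",
               ["Judge B"], ["Case X v. Case Y"]),
]
round_tripped = timeline_from_json(timeline_to_json(events))
print(round_tripped[0].description)
```

Keeping the timestamp as a sortable string makes the timeline order explicit in the serialized form, which is the property a downstream summarizer would rely on.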
The second component, the feedback agent, is a pre‑trained LLM of comparable size that evaluates each candidate event, assigns a confidence score, and either accepts, modifies, or discards it. The feedback loop continues until a confidence‑driven stopping criterion is met, with the aim of producing a final timeline that is both semantically coherent and factually accurate. This meta‑cognitive feedback mechanism distinguishes LexChronos from prior single‑pass extraction pipelines and aligns with recent trends in self‑refining LLM systems.
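The accept, modify, or discard loop can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the thresholds `accept_at` and `drop_below`, the round limit `max_rounds`, and the two callable signatures are all hypothetical, and the toy agents stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    text: str
    confidence: float = 0.0


# Hypothetical agent signatures; the real agents would be LLM calls.
Extractor = Callable[[str], List[str]]
# Reviewer returns (confidence in [0, 1], optional revised text).
Reviewer = Callable[[str, str], Tuple[float, Optional[str]]]


def refine_timeline(judgment: str, extract: Extractor, review: Reviewer,
                    accept_at: float = 0.8, drop_below: float = 0.3,
                    max_rounds: int = 3) -> List[Candidate]:
    """Review candidates until none are pending or max_rounds is reached."""
    pending = [Candidate(t) for t in extract(judgment)]
    accepted: List[Candidate] = []
    for _ in range(max_rounds):
        if not pending:
            break
        next_pending: List[Candidate] = []
        for cand in pending:
            score, revised = review(judgment, cand.text)
            if score >= accept_at:
                accepted.append(Candidate(cand.text, score))     # accept
            elif score >= drop_below and revised is not None:
                next_pending.append(Candidate(revised, score))   # modify, re-review
            # otherwise the candidate is discarded
        pending = next_pending
    # Candidates still pending after max_rounds are dropped.
    return accepted


def toy_extract(judgment: str) -> List[str]:
    return ["appeal filed in 2018", "later allowed ???", "unrelated noise"]


def toy_review(judgment: str, text: str) -> Tuple[float, Optional[str]]:
    if text in judgment:
        return 0.9, None                        # supported by the source text
    if text.endswith("???"):
        return 0.5, text.replace(" ???", "")    # plausible but needs cleanup
    return 0.1, None                            # unsupported


result = refine_timeline("The appeal filed in 2018 was later allowed.",
                         toy_extract, toy_review)
print([c.text for c in result])
```

Bounding the number of rounds keeps the loop from cycling on a candidate that the reviewer keeps revising without ever accepting.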
Because event‑level annotated corpora for Indian law are scarce, the authors construct a synthetic dataset of 2,000 judgment‑timeline pairs. The dataset creation pipeline consists of three stages: (i) random selection of one of 25 Supreme Court case categories (e.g., criminal, civil, cyber law), (ii) generation of a structured event timeline using prompts fed to DeepSeek‑R1 and GPT‑4, and (iii) generation of a full‑text judgment that narratively realizes the timeline. Each event is encoded as a JSON object with the four LES attributes. DeepSeek‑R1 contributed 1,000 samples with an average of 27 events and six precedents per case, while GPT‑4 contributed the remaining 1,000 samples with slightly fewer events, providing diversity in style and complexity.
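The three stages above can be sketched as a small pipeline. The generator callables are hypothetical stand-ins for the DeepSeek‑R1 and GPT‑4 prompts, which the summary does not reproduce, and only the three category names it mentions are listed here.

```python
import json
import random
from typing import Callable, Dict, List

# Only three of the 25 categories are named in the summary.
CATEGORIES = ["criminal", "civil", "cyber law"]

TimelineGen = Callable[[str], List[Dict]]        # category -> list of events
JudgmentGen = Callable[[str, List[Dict]], str]   # category, events -> judgment text


def build_sample(rng: random.Random, gen_timeline: TimelineGen,
                 gen_judgment: JudgmentGen) -> Dict:
    category = rng.choice(CATEGORIES)              # stage (i): pick a category
    timeline = gen_timeline(category)              # stage (ii): timeline first
    judgment = gen_judgment(category, timeline)    # stage (iii): text realizes it
    return {"category": category, "timeline": timeline, "judgment": judgment}


# Stub generators; a real pipeline would call an LLM here.
def stub_timeline(category: str) -> List[Dict]:
    return [{"timestamp": "2020-01-01",
             "description": f"{category} matter filed",
             "judges": ["Judge A"],
             "precedents": []}]


def stub_judgment(category: str, timeline: List[Dict]) -> str:
    return " ".join(f"On {e['timestamp']}, {e['description']}." for e in timeline)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
corpus = [build_sample(rng, stub_timeline, stub_judgment) for _ in range(4)]
print(json.dumps(corpus[0], indent=2))
```

Generating the timeline before the judgment is what makes the annotations available by construction: the text is written to fit the events rather than the events being extracted from the text.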
For evaluation, the authors compare the extracted timelines with the synthetic ground truth and report a BERT‑based F1 score of 0.8751. Because the reference annotations are themselves synthetic, this figure measures agreement with generated timelines rather than with human annotation. The result is presented as an improvement over prior legal event extraction approaches, which often struggle with document‑level temporal reasoning. To assess downstream utility, they run a legal text summarization evaluation in which GPT‑4 compares output built from structured timelines against a baseline built from the unstructured judgment text. GPT‑4 prefers the timeline‑based version in 75% of cases, with the reported advantages being better coverage of key facts, clearer chronological flow, and more accurate citation of precedents.
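The summary does not spell out how the BERT‑based F1 is computed. One common reading is a BERTScore‑style greedy matching over embeddings, sketched below under that assumption. The toy two‑dimensional vectors stand in for contextual BERT embeddings, and `greedy_f1` is a hypothetical helper name.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def greedy_f1(cand: List[Vector], ref: List[Vector]) -> Dict[str, float]:
    """Match each vector to its most similar counterpart on the other side."""
    if not cand or not ref:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    # Precision: how well each candidate item is supported by the reference.
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    # Recall: how well each reference item is covered by the candidates.
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy embeddings: the candidate covers one of two reference items exactly.
reference = [(1.0, 0.0), (0.0, 1.0)]
candidate = [(1.0, 0.0)]
scores = greedy_f1(candidate, reference)
print(scores)
```

In this toy case precision is 1.0 and recall is 0.5, so the F1 is 2/3, which shows how a timeline that misses events is penalized even when every extracted event is correct.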
The paper also discusses threats to validity, acknowledging that synthetic data may not fully capture the linguistic nuance, ambiguity, and procedural intricacies of authentic judgments. The reliance on sub‑4 B models may limit performance compared to larger LLMs, and the legal implications of erroneous event extraction (e.g., mis‑attributing a precedent) are highlighted as an ethical concern. Future work is outlined: (a) validation on a small, manually annotated real‑world subset, (b) scaling the agents to larger models such as Llama‑2‑70B, (c) extending the LES to include additional legal dimensions (e.g., statutory provisions, procedural stages), and (d) integrating the framework into legal practice tools for precedent mapping, argument synthesis, and predictive judgment modeling.
In summary, LexChronos offers a pipeline that addresses the data scarcity gap in Indian legal AI through synthetic data and provides evidence that structured event timelines can improve LLM comprehension in downstream tasks such as summarization. By combining LoRA‑based domain adaptation with a confidence‑driven feedback loop, the framework offers a reference point for document‑level legal event extraction and paves the way for more transparent, accountable, and effective AI assistance in the Indian judicial system.