Fine-Tuned Large Language Models for Logical Translation: Reducing Hallucinations with Lang2Logic

Reading time: 4 minutes
...

📝 Original Info

  • Title: Fine-Tuned Large Language Models for Logical Translation: Reducing Hallucinations with Lang2Logic
  • ArXiv ID: 2512.02987
  • Date: 2025-12-02
  • Authors: Muyu Pan, Dheeraj Kodakandla, Mahfuza Farooque

📝 Abstract

Recent advances in natural language processing (NLP), particularly large language models (LLMs), have motivated the automatic translation of natural language statements into formal logic without human intervention. This enables automated reasoning and facilitates debugging, finding loop invariants, and adhering to specifications in software systems. However, hallucinations (incorrect outputs generated by LLMs) pose a challenge, particularly for logical translation tasks that require precision. This work introduces a novel framework that inputs English sentences, converts them into logical expressions, and then translates them into Conjunctive Normal Form (CNF) for satisfiability solving. It employs classical NLP techniques with a self-defined grammar, symbolic computation libraries, and a fine-tuned language model to reduce hallucinations. In early experiments, we observed that the fine-tuned model, trained on different grammar settings, could consistently correct the same types of hallucinations made by the original model, thus providing reliable CNF generation.
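
The back half of this pipeline (logical expression → CNF → satisfiability) maps directly onto off-the-shelf tooling. The sketch below is our illustration, not the paper's released code: it assumes NLTK for the self-defined grammar and SymPy as the symbolic computation library, and uses a toy one-rule grammar in place of the paper's grammar, which this excerpt does not give.

```python
import nltk
from sympy import symbols
from sympy.logic.boolalg import Implies, to_cnf
from sympy.logic.inference import satisfiable

# Toy self-defined grammar for one conditional sentence shape
# (a stand-in for the paper's grammar, which is not shown here).
GRAMMAR = nltk.CFG.fromstring("""
    S    -> 'if' ATOM 'then' ATOM
    ATOM -> 'it' 'rains' | 'the' 'ground' 'is' 'wet'
""")
PARSER = nltk.ChartParser(GRAMMAR)

def atom(subtree):
    # One propositional variable per distinct atomic clause.
    return symbols("_".join(subtree.leaves()))

sentence = "if it rains then the ground is wet"
tree = next(PARSER.parse(sentence.split()))   # parse English with the grammar

# Children of S are: 'if', ATOM subtree, 'then', ATOM subtree.
expr = Implies(atom(tree[1]), atom(tree[3]))  # intermediate logical expression
cnf = to_cnf(expr)                            # e.g. ~it_rains | the_ground_is_wet
print(cnf)
print(satisfiable(cnf))                       # a satisfying assignment, or False
```

Everything downstream of the parse is deterministic symbolic computation, so on this reading the fine-tuned LLM is only needed where a rigid grammar falls short, while the CNF and SAT stages stay exact.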


📄 Full Content

Fine-Tuned Large Language Models for Logical Translation: Reducing Hallucinations with Lang2Logic

Muyu Pan, Computer Science and Engineering, Pennsylvania State University, mfp5696@psu.edu
Dheeraj Kodakandla, Computer Science and Engineering, Pennsylvania State University, djk6439@psu.edu
Mahfuza Farooque, Computer Science and Engineering, Pennsylvania State University, University Park, USA, mff5187@psu.edu

Index Terms: Logics, LLM Hallucinations, Natural Language Processing, LLM Fine-Tuning

I. INTRODUCTION

Natural Language Processing (NLP) [1] was initially conceptualized by Swiss linguist Ferdinand de Saussure, who introduced the idea that language meaning is created through internal relationships and contrasts; shared linguistic structures enable communication. In 1950, Alan Turing proposed the concept of a "thinking machine," suggesting that a machine capable of communicating with humans through a teleprinter demonstrates cognitive capability. Contemporary NLP plays a critical role in understanding human language and generating contextually appropriate responses, exemplified by intelligent assistants like Apple's Siri and Amazon's Alexa, which provide personalized assistance and process user requests autonomously.

Large Language Models (LLMs) [2] are sophisticated artificial intelligence models constructed using deep learning methodologies, trained on extensive datasets, and capable of generating human-like textual content. Grounded in the transformer architecture, these models are designed to capture complex linguistic nuances and long-range textual dependencies, enabling advanced capabilities such as machine translation, conversational interaction, and content generation. LLMs not only comprehend human languages but also demonstrate applicability across diverse research and industrial domains. OpenAI's ChatGPT [3] serves as a prominent example of LLM technology utilized extensively in daily applications.

Hallucination [4] in language models is a phenomenon where, based on memorized training-data patterns, the model generates outputs containing fabricated, plausible-sounding information when confronted with unseen scenarios.
The consequences of hallucinations can range from minor inconsistencies that cause user confusion to critically significant errors in sensitive domains such as language translation, software development, or autonomous systems. Mitigating hallucinations [5] in LLMs is paramount for ensuring reliability, safety, and practical applicability, particularly when deploying these models in critical or sensitive contexts.

To address the challenge of hallucinations, fine-tuned models [3] have emerged as an effective solution. These models are pre-trained machine learning models optimized for specialized task domains, demonstrating superior performance compared to generalized models through targeted training on smaller, domain-specific datasets. During the fine-tuning process, model parameters are carefully adjusted to enhance precision and generalization capabilities. This approach leverages the foundational language understanding acquired during initial large-scale training, subsequently refining the model's focus on specific target tasks.

Recent works such as LogicLLaMA [6] and LOGIC-LM [7] have pioneered advancements in logical reasoning by fine-tuning LLMs for specialized tasks. LogicLLaMA fine-tunes LLaMA on a dataset of verified NL-FOL pairs to translate natural language to first-order logic (FOL) and mitigates hallucinations using reinforcement learning with human feedback (RLHF). Similarly, LOGIC-LM integrates LLMs with symbolic solvers, converting natural language into structured symbolic formulations for deterministic inference while using solver feedback to self-refine and improve accuracy on logical reasoning benchmarks. These studies h...
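
The solver-feedback self-refinement that LOGIC-LM relies on can be sketched as a small loop. In the sketch below, `query_llm` is a hypothetical placeholder for the fine-tuned translator model, not an API from either work, and SymPy again stands in for the symbolic solver:

```python
from sympy import SympifyError, sympify
from sympy.logic.boolalg import to_cnf
from sympy.logic.inference import satisfiable

def query_llm(sentence: str, feedback: str = "") -> str:
    """Hypothetical stand-in for the fine-tuned translator model.

    A real implementation would prompt the model with the sentence
    (plus any solver feedback) and return a formula string such as
    "(rains | sprinkler) & ~dry" over agreed-upon atom names."""
    raise NotImplementedError  # placeholder only

def nl_to_sat(sentence: str, max_rounds: int = 3):
    """Translate, check, refine: an unparseable candidate formula is
    reported back to the model for another attempt, in the style of
    LOGIC-LM's solver-feedback self-refinement."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = query_llm(sentence, feedback)
        try:
            cnf = to_cnf(sympify(candidate))    # parse, then normalize to CNF
        except (SympifyError, TypeError, ValueError) as err:
            feedback = f"previous output was not a valid formula: {err}"
            continue                            # self-refine and retry
        return cnf, satisfiable(cnf)            # satisfying assignment, or False
    return None                                 # give up after max_rounds
```

The key property is that every accepted output has survived a deterministic symbolic check, so a hallucinated formula fails loudly instead of silently propagating downstream.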


Reference

This content is AI-processed based on open access ArXiv data.
