

How to Stop AI Hallucinations: Using Context Programming and Chain-of-Thought


What is Model Context Programming?

Model Context Programming is the art and science of "designing the environment" in which an AI thinks. Instead of just asking a question, you are building a structured digital workspace, complete with the specific data, rules, and step-by-step logic needed to ensure the AI reaches the right conclusion every time.


Think of it this way:

  • Traditional Prompting is like shouting a question at a stranger on the street.

  • Model Context Programming is like sitting a specialist down at a desk filled with the exact files, calculators, and instructions they need to solve a complex problem.



In LLMs "Context" actually means an AI's working memory.

  • Context Window: The hard limit on how many tokens (words or parts of words) the model can "see" at one time (a token-counting sketch follows this list).

  • In-Context Learning (ICL): The ability of a model to learn a task purely from the examples provided in the prompt, without updating its permanent weights.
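
To make the context window concrete, here is a minimal sketch that counts tokens with OpenAI's open-source tiktoken library. The 128,000-token limit is purely illustrative; actual window sizes vary by model, as do tokenizers.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the tokenizer used by several recent OpenAI models;
# other model families use their own tokenizers.
encoding = tiktoken.get_encoding("cl100k_base")

prompt = "John has 5 apples. He gives 2 to Mary. How many does he have?"
token_count = len(encoding.encode(prompt))
print(f"Prompt uses {token_count} tokens.")

# Illustrative figure only -- the real limit depends on the model.
CONTEXT_WINDOW = 128_000
if token_count > CONTEXT_WINDOW:
    print("Prompt won't fit: trim, chunk, or summarize the context first.")
```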


While LLMs can seem magical during initial testing, they frequently hallucinate when faced with multi-step logic. Transitioning to a Chain of Thought design allows us to explicitly architect the model's internal monologue, turning a black-box response into a traceable, programmed reasoning path.


Why Pair it with Chain of Thought (CoT)?

In the realm of model context programming, Chain of Thought (CoT) is a technique used to improve the reasoning capabilities of an LLM by prompting it to generate a sequence of intermediate steps before arriving at a final answer.



While Context Programming sets the stage (the where and what), Chain of Thought provides the script (the how). Together, they transform an AI from a "text predictor" that guesses the next word into a "reasoning engine" that follows a logical path. By "programming" the context to require a chain of thought, you aren't just getting an answer; you're getting a verifiable, step-by-step solution.


Types of CoT:

Standard prompting often fails at complex tasks because the model attempts to predict the most likely "final answer" token without calculating the underlying logic. CoT forces the model to allocate "compute time" (in the form of generated tokens) to the reasoning process.


The "Few-Shot" CoT Method

The most effective way to program CoT is by providing worked examples in the context (a code sketch follows the comparison below).

  • Standard Prompting:

    • Input: "John has 5 apples. He gives 2 to Mary. How many does he have?"

    • Output: "3."

  • Chain of Thought Prompting:

    • Input: "John has 5 apples. He gives 2 to Mary. How many does he have?"

    • Output: "John started with 5 apples. He gave 2 away, so . John has 3 apples remaining."


Zero-Shot CoT (The "Magic" Phrase)

You don't always need to provide examples. Researchers discovered that simply appending the phrase "Let’s think step by step" to a prompt triggers a significant increase in accuracy for logic, math, and symbolic reasoning tasks. This "programs" the model to enter a reasoning state automatically.
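
In code this is a one-line change, reusing the hypothetical call_llm stub from the previous sketch:

```python
question = "John has 5 apples. He gives 2 to Mary. How many does he have?"

# Zero-shot CoT: no worked examples, just the trigger phrase.
zero_shot_prompt = f"{question}\n\nLet's think step by step."

# response = call_llm(zero_shot_prompt)
```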


Advanced CoT Variations

As context programming has evolved, several sophisticated versions of CoT have emerged:

  • Self-Consistency: Sample several independent reasoning chains and take a majority vote over their final answers (sketched below).

  • Tree of Thoughts (ToT): Let the model branch into multiple candidate reasoning paths, evaluate them, and expand the most promising ones.

  • Least-to-Most Prompting: Decompose a hard problem into simpler sub-problems and solve them in order, feeding each answer into the next step.
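
To illustrate the first variation, here is a minimal sketch of self-consistency. It reuses the hypothetical call_llm stub from the earlier sketches, and extract_answer assumes the model ends its reply with "Answer: ..." as in the few-shot format above.

```python
from collections import Counter

# call_llm: the hypothetical provider stub from the earlier sketch.

def extract_answer(text: str) -> str:
    # Assumes the reply ends with "Answer: ..." per our few-shot format.
    return text.rsplit("Answer:", 1)[-1].strip()

def self_consistent_answer(question: str, samples: int = 5) -> str:
    prompt = f"{question}\n\nLet's think step by step."
    # Sample several independent reasoning chains (temperature > 0),
    # then keep the most common final answer.
    answers = [extract_answer(call_llm(prompt)) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]
```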



Why it Works: The "Working Memory"

In technical terms, CoT works because LLMs are autoregressive. This means each token they generate becomes part of the "context" for the next token.

By forcing the model to write out its reasoning, it is essentially writing its own "scratchpad" or "working memory." If the model correctly identifies a sub-problem in Step 1, that correct information is now physically present in its context window, making it much easier to solve Step 2.
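
Schematically, generation is a loop in which the model's own output is fed back in as input; next_token below is just a stand-in for one forward pass of the model:

```python
def next_token(context: list[str]) -> str:
    """Stand-in for one forward pass of the model."""
    raise NotImplementedError

def generate(prompt_tokens: list[str]) -> list[str]:
    context = list(prompt_tokens)    # the prompt is the initial context
    while context[-1] != "<end>":
        token = next_token(context)  # the prediction conditions on ALL
        context.append(token)        # prior tokens -- including the
    return context                   # model's own reasoning so far
```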


Implementation Tips
  • Limit the Verbosity: While you want logic, "overthinking" can lead to token waste. Instruct the model to be "concise but logical."

  • Use Delimiters: When programming CoT, ask the model to wrap its thoughts in tags like <reasoning> so your software can easily strip them out before showing the final answer to an end user (see the sketch after this list).

  • Check for "Drift": Sometimes a model will start a chain of thought correctly but "drift" into a wrong conclusion. Providing a strong persona (e.g., "You are a logical analyst") helps keep the chain on track.
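
A sketch of the stripping step mentioned above, using Python's standard re module:

```python
import re

raw_output = """<reasoning>
The refund is 14 days old, which exceeds the 10-day policy window.
</reasoning>
Your refund is overdue; I have escalated it to our billing team."""

# Remove the <reasoning> block before showing the reply to the user.
final_answer = re.sub(
    r"<reasoning>.*?</reasoning>\s*", "", raw_output, flags=re.DOTALL
).strip()

print(final_answer)
# -> Your refund is overdue; I have escalated it to our billing team.
```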


A Real-Time Example:

"The Missing Refund"

A customer messages a retail brand: "I returned my boots two weeks ago and haven't received my $120 refund. What's going on?"


1. Model Context Programming (The "Stage")

Instead of just sending the user's message to the AI, the developer "programs the context" by injecting specific real-time data into the prompt before the AI even sees it.

  • System Prompt: "You are a Senior Billing Agent. You have access to Order_History and Shipping_Logs. You must verify the return status before discussing refunds."

  • Dynamic Context Injection: The system automatically pulls the customer's recent data (sketched in code after this list):

    • Order #8821: Status: Returned.

    • Tracking: Delivered to Warehouse Feb 5th.

    • Refund Status: Pending (Manual Review Required).
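
A sketch of that injection step. The record fields mirror the example above, and fetch_customer_record is a hypothetical lookup into your order and shipping systems:

```python
SYSTEM_PROMPT = (
    "You are a Senior Billing Agent. You have access to Order_History "
    "and Shipping_Logs. You must verify the return status before "
    "discussing refunds."
)

def fetch_customer_record(customer_id: str) -> dict:
    """Hypothetical lookup into the order/shipping systems."""
    return {
        "Order": "#8821 (Status: Returned)",
        "Tracking": "Delivered to Warehouse Feb 5th",
        "Refund Status": "Pending (Manual Review Required)",
    }

def build_prompt(customer_id: str, message: str) -> str:
    record = fetch_customer_record(customer_id)
    context_block = "\n".join(f"{k}: {v}" for k, v in record.items())
    # The model sees verified data BEFORE it sees the question.
    return (f"{SYSTEM_PROMPT}\n\n[Customer Data]\n{context_block}"
            f"\n\n[Customer Message]\n{message}")
```

The key design choice is that the data is injected by your software, not recalled from the model's memory, so the model cannot hallucinate the order details.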


2. Chain of Thought (The "Reasoning")

Without CoT, the AI might just say, "I see it's pending, please wait." With CoT, the model is instructed to "think step by step" before responding. Its internal "thought" process looks like this (a plain-Python version of the same check follows the list):

[CoT Internal reasoning - hidden from user]

  • Step 1: Check if the item was received. Result: Yes, Feb 5th (14 days ago).

  • Step 2: Check the company policy. Result: Refunds are processed within 10 business days.

  • Step 3: Compare dates. Result: It has been 14 days, which exceeds the 10-day window.

  • Step 4: Identify the bottleneck. Result: The 'Manual Review' flag is still active.

  • Step 5: Formulate a solution. I need to apologize for the delay, explain the manual review, and escalate this to the billing team.
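
The same five steps can be expressed as a plain, checkable procedure. The dates and the 10-day window come from the example above; the year is arbitrary, and calendar days are used for simplicity even though the stated policy counts business days.

```python
from datetime import date

received = date(2025, 2, 5)     # Step 1: item received at the warehouse
today = date(2025, 2, 19)       # "14 days ago" in the example
POLICY_DAYS = 10                # Step 2: the refund policy window

days_elapsed = (today - received).days   # Step 3: compare dates
manual_review = True                     # Step 4: the bottleneck flag

if days_elapsed > POLICY_DAYS and manual_review:   # Step 5: the fix
    print(f"{days_elapsed} days elapsed (> {POLICY_DAYS}-day window): "
          "apologize, explain the manual review, escalate to billing.")
```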


3. The Final Output (The "Result")

"I’ve looked into this for you. I can see your boots arrived at our warehouse on February 5th. Since it’s been 14 days, you should have received your refund by now. It appears your return is currently flagged for a manual review. I have escalated this to our billing team to expedite the payment. You should see an update in 24 hours."


In this example, Context Programming gave the AI the eyes to see the warehouse data, and Chain of Thought gave it the logic to realize the 10-day policy had been violated.


Conclusion: From Prompting to Programming

The biggest hurdle to AI adoption is trust. We can’t trust a "black box" that might hallucinate at any moment. Chain of Thought (CoT) solves this by making the AI’s logic transparent and traceable. We are moving past the era of throwing prompts at a wall and seeing what sticks. By mastering Model Context Programming and Chain of Thought, we are finally treating LLMs like the sophisticated reasoning engines they are. Whether you’re using the Model Context Protocol (MCP) to pipe in real-time data or forcing a model to "think" before it speaks, you are doing more than just chatting; you are architecting intelligence. The magic isn't in the machine anymore; it’s in how you program the context.


