In today’s AI-powered software landscape, developers increasingly want to run LLMs (Large Language Models) locally for privacy, performance, and offline capabilities. With .NET 9, Microsoft continues to enhance its ecosystem, making integration with local AI models more streamlined. In this blog post, we’ll walk through how to run an LLM locally (using models like LLaMA, Mistral, or TinyLlama) and build .NET 9 clients that consume them.


Prerequisites

To follow along, you should have:

  • .NET 9 SDK (Preview or Release Candidate)
  • A local LLM runtime (e.g., llama.cpp, Ollama)
  • Linux, macOS, or Windows (via WSL2)
  • Basic knowledge of C# and REST APIs

Step 1: Running the LLM Locally

We will use Ollama for its simplicity. Install it:

curl -fsSL https://ollama.com/install.sh | sh

Then pull and run a model:

ollama pull mistral
ollama run mistral

Ollama now serves the model over an HTTP API at http://localhost:11434.
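
To confirm the server is up, you can query its HTTP API directly, for example by listing the locally installed models:

curl http://localhost:11434/api/tags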


Step 2: Create a .NET 9 Client Project

Create a new console app using the .NET 9 SDK:

dotnet new console -n LlmClient
cd LlmClient

Update your project to target .NET 9 (edit LlmClient.csproj):

<TargetFramework>net9.0</TargetFramework>
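
If you created the project with an older SDK, the full project file after that change should look roughly like this (the remaining properties are just the console template defaults):

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net9.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

</Project>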

Step 3: Define a Data Contract

Ollama's /api/generate endpoint accepts a simple JSON payload. By default it streams partial results back, so for a plain request/response client we also set stream to false:

{
  "model": "mistral",
  "prompt": "What is the capital of France?",
  "stream": false
}
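
With stream set to false, Ollama answers with a single JSON object. Trimmed to the fields we care about, it looks roughly like this:

{
  "model": "mistral",
  "response": "The capital of France is Paris.",
  "done": true
}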

Create the following C# classes. The System.Net.Http.Json helpers serialize these with camelCase property names by default, which is exactly what Ollama expects:

public class LlmRequest
{
    public string Model { get; set; } = "mistral";
    public string Prompt { get; set; } = string.Empty;
    public bool Stream { get; set; } = false; // request a single response instead of a token stream
}

public class LlmResponse
{
    public string Response { get; set; } = string.Empty;
    public bool Done { get; set; }
}

Step 4: Build the HTTP Client Logic

Using HttpClient and the System.Net.Http.Json extensions, we'll send the request and deserialize the response:

using System.Net.Http.Json;

var client = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };

var request = new LlmRequest
{
    Prompt = "Explain quantum computing in simple terms."
};

var response = await client.PostAsJsonAsync("/api/generate", request);
response.EnsureSuccessStatusCode();

var result = await response.Content.ReadFromJsonAsync<LlmResponse>();
Console.WriteLine($"AI: {result?.Response}");

Step 5: Add Streaming Support (Advanced)

Ollama streams its results as newline-delimited JSON: each line of the response body is a complete JSON object carrying the next fragment of text. Here's how to consume that stream with HttpClient in .NET 9:

using System.Text.Json;

var streamingRequest = new LlmRequest
{
    Prompt = "Explain quantum computing in simple terms.",
    Stream = true
};

using var requestMessage = new HttpRequestMessage(HttpMethod.Post, "/api/generate")
{
    Content = JsonContent.Create(streamingRequest)
};

using var streamingResponse = await client.SendAsync(requestMessage, HttpCompletionOption.ResponseHeadersRead);
streamingResponse.EnsureSuccessStatusCode();

using var stream = await streamingResponse.Content.ReadAsStreamAsync();
using var reader = new StreamReader(stream);

string? line;
while ((line = await reader.ReadLineAsync()) != null)
{
    if (string.IsNullOrWhiteSpace(line)) continue;

    // Each line looks like {"response":" partial text","done":false}
    var chunk = JsonSerializer.Deserialize<LlmResponse>(line, JsonSerializerOptions.Web);
    Console.Write(chunk?.Response);

    if (chunk?.Done == true) break;
}
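
If you'd rather hand the stream to callers instead of writing straight to the console, one option is to wrap the loop above in an IAsyncEnumerable<string>. The helper below is only a sketch of that idea; OllamaStreaming and StreamCompletionAsync are illustrative names, not part of Ollama or any library, and it reuses the LlmRequest/LlmResponse DTOs from Step 3:

using System.Net.Http.Json;
using System.Text.Json;

public static class OllamaStreaming
{
    // Yields response fragments as they arrive from Ollama's /api/generate stream.
    public static async IAsyncEnumerable<string> StreamCompletionAsync(HttpClient client, LlmRequest request)
    {
        request.Stream = true;

        using var message = new HttpRequestMessage(HttpMethod.Post, "/api/generate")
        {
            Content = JsonContent.Create(request)
        };

        using var response = await client.SendAsync(message, HttpCompletionOption.ResponseHeadersRead);
        response.EnsureSuccessStatusCode();

        using var stream = await response.Content.ReadAsStreamAsync();
        using var reader = new StreamReader(stream);

        string? line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            var chunk = JsonSerializer.Deserialize<LlmResponse>(line, JsonSerializerOptions.Web);
            if (!string.IsNullOrEmpty(chunk?.Response)) yield return chunk.Response;
            if (chunk?.Done == true) yield break;
        }
    }
}

// Usage:
// await foreach (var token in OllamaStreaming.StreamCompletionAsync(client, new LlmRequest { Prompt = "Tell me a joke." }))
//     Console.Write(token);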

Pros and Cons

✅ Pros

  • Privacy: All prompts stay local
  • Performance: Low-latency inference without internet
  • Cost: No cloud usage fees
  • Customizability: You control model choice and deployment

❌ Cons

  • Hardware Requirements: Local inference needs substantial RAM and CPU/GPU resources, especially for larger models
  • Model Size: Can consume several GBs of disk
  • Limited Model Selection: Smaller open models may not match GPT-4/Claude performance

Summary

Running LLMs locally and integrating them with modern .NET 9 clients is easier than ever. Whether you want to build a privacy-focused chatbot, a local code assistant, or simply prototype AI features, this architecture offers a great foundation:

  • Use Ollama to simplify local LLM serving
  • Use HttpClient and System.Net.Http.Json for async interactions
  • Leverage streaming for better UX

With .NET 9’s improved performance and language features, integrating AI into native apps or microservices is not only possible, but productive.

🧠 Bonus Tip: Wrap your LlmClient into a minimal API or a Blazor app to bring AI to your UI!
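
To sketch the first of those ideas: the snippet below hosts a single minimal-API endpoint in an ASP.NET Core project and forwards the prompt to the local Ollama instance. The /ask route and the AskRequest record are illustrative names, not part of Ollama or the code above, and the LlmRequest/LlmResponse types come from Step 3:

using System.Net.Http.Json;

var builder = WebApplication.CreateBuilder(args);

// Named HttpClient pointing at the local Ollama instance
builder.Services.AddHttpClient("ollama", c => c.BaseAddress = new Uri("http://localhost:11434"));

var app = builder.Build();

app.MapPost("/ask", async (AskRequest ask, IHttpClientFactory factory) =>
{
    var client = factory.CreateClient("ollama");

    var response = await client.PostAsJsonAsync("/api/generate", new LlmRequest { Prompt = ask.Prompt });
    response.EnsureSuccessStatusCode();

    var result = await response.Content.ReadFromJsonAsync<LlmResponse>();
    return Results.Ok(new { answer = result?.Response });
});

app.Run();

// Illustrative request DTO for the /ask endpoint
public record AskRequest(string Prompt);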

Stay tuned for follow-up posts on embedding these clients into MAUI, Blazor, and ASP.NET 9 applications.
