Local AI

From project managers to directors, decision-makers are increasingly blinded by the buzz surrounding AI. They're sitting in the front seat of the AI train, clutching the promise of AI like a pilgrim clutching sacred pearls, oblivious to whether it truly serves their business. In the rush to adopt "the next big thing", they overlook whether AI is actually the right answer - or whether there's even a real problem to solve.

Software Engineers offer a necessary dose of reality, a grounded counterpoint to this pervasive enthusiasm. The cycle strongly echoes the dot-com speculative bubble, and its inevitable correction promises to deliver a severe global economic shock. This is precisely why AI must be treated as a tool to be integrated - a sophisticated component within the delivery mechanism - not as a monolithic, all-encompassing solution. When the AI house of cards collapses, those who mastered the tool will maintain stability, while those who relied solely on the illusion will be buried in the debris.

Taking a step back, taking it offline and identifying the actual problem gives a more balanced and level-headed approach to using AI. Running Large Language Models (LLMs) locally or in-house offers unparalleled flexibility. It's possible to switch models as needed, create new models, apply RAG and in-context training, and tune system prompts to achieve the best performance for each specific task – without being locked into a single model for everything or being forced to pay for premium features.

In my Home Lab, I can do all of this: from basic questions, discussions and queries (even running uncensored models) to context queries and RAG. I have access to it all by configuring models for my specific needs - like my chess-playing AI, KnightShift, and my XP coding buddy, Coder AI.

Coder AI is grounded in qwen2.5-coder:7b and is my go-to coding assistant. I have refined its system prompt to ensure that answers are succinct, code samples are constructive, and, when suggested solutions are repetitive or failing regularly, a new approach is considered without prompting. I can also provide supporting documents to RAG-train the model or query in context. This level of control prevents - or at least drastically reduces - the stupidity and hallucinations that mainstream models like ChatGPT and Copilot regularly present.
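As a rough sketch of what this looks like in practice: Ollama exposes a REST API on localhost (port 11434 by default), and a custom system prompt can travel with every chat request. The prompt text below is a hypothetical stand-in for Coder AI's real one:

```python
import requests

# Hypothetical stand-in for Coder AI's actual system prompt.
SYSTEM_PROMPT = (
    "You are a succinct coding assistant. Keep answers short, make code "
    "samples constructive, and if a suggested approach keeps failing, "
    "propose a different one without being asked."
)

def ask_coder_ai(question: str) -> str:
    """Send one chat turn to a local Ollama server (default port 11434)."""
    response = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "qwen2.5-coder:7b",
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": question},
            ],
            "stream": False,
        },
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["message"]["content"]

print(ask_coder_ai("Suggest a cleaner way to flatten a nested list in Python."))
```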

Privacy

Local LLMs are pre-trained, so they run with no external internet access. Whatever we ask of them, they share no data with OpenAI, AI datacentres or dirty data brokers!

Dedicated Coding Models

As a Software Engineer, I may have complicated questions or tasks, or need another set of eyes on a problem. I have LLMs configured specifically for this.

Model Hot-Swapping

There is a stack of models available for download for different purposes. Running LLMs locally with Ollama, I can hot-swap these to use the LLM best suited to the task in hand, as sketched below.
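To illustrate, hot-swapping against Ollama's API is just a matter of changing the model field per request. The routing table here is a made-up example; the tags depend on what has actually been pulled:

```python
import requests

OLLAMA = "http://localhost:11434"

# Illustrative routing table; the tags depend on what `ollama pull` has fetched.
MODELS = {
    "code": "qwen2.5-coder:7b",
    "general": "llama3:8b",
}

def installed_models() -> list[str]:
    """List the models the local Ollama server currently holds."""
    tags = requests.get(f"{OLLAMA}/api/tags", timeout=10).json()
    return [m["name"] for m in tags.get("models", [])]

def ask(task: str, prompt: str) -> str:
    """Route the prompt to the model registered for this kind of task."""
    reply = requests.post(
        f"{OLLAMA}/api/generate",
        json={"model": MODELS[task], "prompt": prompt, "stream": False},
        timeout=300,
    )
    reply.raise_for_status()
    return reply.json()["response"]

print(installed_models())
print(ask("code", "One-line shell command to count files in a directory?"))
```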

RAG Capability

The ability to use RAG (Retrieval-Augmented Generation) on any personal files or datasets, knowing the data source never leaves the network. A minimal sketch of the flow follows.
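As a minimal sketch of the idea - assuming an embedding model such as nomic-embed-text has been pulled into Ollama - embed the question and each document locally, pick the closest match and feed it to the generator as context. Nothing leaves the machine:

```python
import math
import requests

OLLAMA = "http://localhost:11434"
EMBED_MODEL = "nomic-embed-text"   # example embedding model, pulled locally
CHAT_MODEL = "llama3:8b"           # example generation model

def embed(text: str) -> list[float]:
    """Embed a passage with a locally hosted embedding model."""
    r = requests.post(
        f"{OLLAMA}/api/embeddings",
        json={"model": EMBED_MODEL, "prompt": text},
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def answer(question: str, documents: list[str]) -> str:
    """Retrieve the most relevant document and ground the answer in it.
    Embeds every document per call - fine for a sketch, not for scale."""
    q_vec = embed(question)
    best = max(documents, key=lambda d: cosine(q_vec, embed(d)))
    prompt = f"Answer using only this context:\n{best}\n\nQuestion: {question}"
    r = requests.post(
        f"{OLLAMA}/api/generate",
        json={"model": CHAT_MODEL, "prompt": prompt, "stream": False},
        timeout=300,
    )
    r.raise_for_status()
    return r.json()["response"]
```

A real pipeline would chunk the documents and cache the embeddings in a vector store, but the flow is the same.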

Better Resilience

Local LLMs reduce dependency on external services and remove connectivity issues, ensuring greater reliability and true offline use.

Multi-model Queries

Run multiple models against a single query, see the responses side by side and select the best response from the results.
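Open WebUI handles this in its interface; against the raw Ollama API the same idea looks roughly like this (the model tags are illustrative):

```python
import requests

OLLAMA = "http://localhost:11434"

def side_by_side(prompt: str, models: list[str]) -> None:
    """Run the same prompt against several local models and print each reply."""
    for model in models:
        r = requests.post(
            f"{OLLAMA}/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=300,
        )
        r.raise_for_status()
        print(f"--- {model} ---")
        print(r.json()["response"].strip(), "\n")

# Model tags are illustrative; use whatever is installed locally.
side_by_side("Summarise RAG in one sentence.", ["llama3:8b", "qwen2.5-coder:7b"])
```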

Learn More

Running Large Language Models (LLMs) locally with Open WebUI offers unparalleled flexibility, allowing me to switch models as needed to achieve the best performance for each specific task – without being locked into a single model for everything or being forced to pay for premium features.

This approach also prioritises my privacy. It keeps my prompts and other data out of large AI datacentres and away from ruthless and untrustworthy data brokers, keeping personal information secure and under my control.

And with Open WebUI's slick interface, modelled on the popular ChatGPT UI, I can seamlessly interact with LLMs in a way that feels familiar and natural, regardless of the model selected.

And for fun, I've taken a qwen model and, with some manipulation, developed a chess-playing AI called KnightShift. I play on a physical board and KnightShift plays virtually. The question is: have I beaten it?
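KnightShift's exact setup isn't shown here, but the core loop is easy to sketch: keep the move history in the conversation and have the model reply in algebraic notation. The system prompt and model tag below are hypothetical stand-ins, and a serious version would validate the moves for legality (e.g. with python-chess):

```python
import requests

OLLAMA = "http://localhost:11434"

# Hypothetical stand-in for KnightShift's actual system prompt.
SYSTEM = (
    "You are KnightShift, a chess engine. You play Black. "
    "Reply with exactly one legal move in standard algebraic notation."
)

messages = [{"role": "system", "content": SYSTEM}]

while True:
    move = input("Your move (or 'quit'): ").strip()
    if move == "quit":
        break
    messages.append({"role": "user", "content": move})
    r = requests.post(
        f"{OLLAMA}/api/chat",
        json={"model": "qwen2.5:7b", "messages": messages, "stream": False},
        timeout=300,
    )
    r.raise_for_status()
    reply = r.json()["message"]["content"].strip()
    # Keep the assistant's move in the history so the game state persists.
    messages.append({"role": "assistant", "content": reply})
    print("KnightShift plays:", reply)
```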

As you explore the world of Large Language Models (LLMs), you've likely come across popular cloud-based services like OpenAI and Gemini. While these platforms offer impressive capabilities, they also come with limitations – and a significant price tag.

With local LLMs, you can bypass the cloud entirely and unlock a more personalised, efficient, and cost-effective approach to AI development. By running models locally on your machine, you gain complete control over your data and workflows, ensuring that your projects are secure, scalable, and tailored to your specific needs.

No longer will you be tied to expensive subscription fees or limited by cloud infrastructure. With local LLMs, you can train, test, and deploy models at a fraction of the cost, while also enjoying faster processing speeds and more accurate results due to reduced latency. It's an empowering experience that will revolutionise your approach to AI-driven projects.

Whether you're a seasoned developer or just starting out with LLMs, local deployment offers a unique opportunity to take control of your data and workflows. Say goodbye to cloud-based limitations and hello to a more personalized, flexible, and cost-effective way of working with AI.

- Written by Llama3:8b in Open WebUI.

Screenshots