3 Commits 6f7d682a39 ... fb401bc2fa

Author SHA1 Message Date
  windhamdavid fb401bc2fa ralph 3 weeks ago
  windhamdavid 4d6d7d1776 RAG 1 month ago
  windhamdavid d323493246 posts cult 1 month ago

+ 13 - 0
docs/ai/ai.md

@@ -0,0 +1,13 @@
+# AI
+
+26/02/23 - I've gradually been bringing more agents into my workflow, and I really need to start making and keeping some docs on the setups as they evolve.  
+
+## Claude 
+
+## CoPilot
+
+
+## MCP
+
+
+## Ralph

+ 1 - 0
docs/index.md

@@ -15,6 +15,7 @@ I use this library of documents as a quick reference to find technical answers,
 
 
 ## Log
 
+- 26/02/23 - 👾 [AI Agents](/docs/ai/ai.md)
 - 25/11/08 - 🏛️ [Kosmo](/docs/computers/kos.md)
 - 25/04/07 - 🪩 [Stu](/docs/computers/stu.md)
 - 25/02/13 - 🦑 [Squid](/docs/computers/squid.md)

+ 1 - 0
notes/index.md

@@ -7,6 +7,7 @@ slug: /
 
 
 ## Log
 
+- 26/02/18 - 🤖 [projects/RAG](/notes/work/projects/ai-rag)
 - 26/01/02 - 🎨 [art/paint](/notes/art/paint)
 - 25/08/16 - 🚰 [house/bath](/notes/house/bath)
 - 25/04/26 - 🎹 [music/music](/notes/music/)

+ 106 - 0
notes/work/projects/ai-rag.md

@@ -0,0 +1,106 @@
+# RAG
+
+(26/02/18) Although I still primarily use the Lunr XML search feature, I occasionally use the AI helper and I want to get rid of the remote API calls. I'm adding this page to document the process. 
+
+## Overview
+Build a fully local RAG system that ingests ~313 markdown files from the Docusaurus site, stores them in ChromaDB with Ollama embeddings, and provides a web UI for semantic search and Q&A.
+
+## Stack
+- Python 3 with FastAPI for the web server
+- Ollama with nomic-embed-text for embeddings
+- ChromaDB for vector storage (persisted to disk)
+- Simple HTML/JS frontend served by FastAPI
+
+## Directory Structure
+
+```sh
+daw_til/
+└── rag/
+    ├── ingest.py          # Markdown parser + ChromaDB ingestion
+    ├── server.py          # FastAPI web server + query endpoint
+    ├── requirements.txt   # Python dependencies
+    ├── templates/
+    │   └── index.html     # Web UI (search box + results)
+    └── chroma_db/         # ChromaDB persistence (gitignored)
+```
+
+## Implementation Steps
+1. Create rag/requirements.txt
+```text
+chromadb
+fastapi
+uvicorn[standard]
+ollama
+python-frontmatter
+```
+
+2. Create ```rag/ingest.py``` - Markdown Ingestion Pipeline
+- Walk docs/, notes/, lists/, posts/ directories
+- Parse each .md file:
+  - Extract YAML frontmatter (posts have title, slug, description, tags)
+  - For docs/notes/lists: derive title from first # heading
+  - Extract date from post filenames (YYYY-MM-DD pattern)
+  - Categorize by directory: docs, notes, lists, posts
+- Chunking strategy:
+  - Split on markdown headers (##, ###) to keep semantic sections intact
+  - For sections exceeding ~1000 chars, further split on paragraphs
+  - Each chunk gets metadata: source (file path), category, section (header path), title, tags (if post)
+- Generate embeddings via Ollama (nomic-embed-text)
+- Upsert into ChromaDB collection with metadata
+- Print progress (file count, chunk count, timing)
+3. Create ```rag/server.py``` - FastAPI Query Server
+- ```GET /``` - serve the HTML UI
+- ```POST /query``` - accept a search query, embed it with Ollama, query ChromaDB for top-k results (default 8), return JSON with chunks + metadata + distances
+- ```GET /stats``` - return collection stats (doc count, chunk count)
+- CORS enabled for local dev
+4. Create ```rag/templates/index.html``` - Web UI
+- Clean, minimal search interface
+- Search box with submit button
+- Results displayed as cards showing:
+  - Matched text snippet
+  - Source file path (clickable link to Docusaurus page)
+  - Category badge (docs/notes/lists/posts)
+  - Relevance score
+- Loading spinner during search
+5. Add ```rag/chroma_db/``` to ```.gitignore```
+- Keeps the vector data out of git while the rest of ```rag/``` stays tracked
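
The chunking strategy from step 2 can be sketched roughly as follows. This is a minimal illustration, not the actual `ingest.py`: the function names `chunk_markdown` and `split_long` are made up for the example, and the handling of fenced code blocks is an added assumption so that `##` lines inside a fence are not mistaken for headers.

```python
import re

MAX_CHARS = 1000                      # the "~1000 chars" limit from the notes
HEADER_RE = re.compile(r"^(#{2,3})\s+(.+)$")
FENCE = "`" * 3                       # opens or closes a fenced code block


def split_long(body, max_chars=MAX_CHARS):
    """Pack paragraphs greedily so each piece stays under max_chars when possible."""
    if len(body) <= max_chars:
        return [body]
    pieces, buf = [], ""
    for para in re.split(r"\n\s*\n", body):
        candidate = f"{buf}\n\n{para}" if buf else para
        if buf and len(candidate) > max_chars:
            pieces.append(buf)
            buf = para
        else:
            buf = candidate
    if buf:
        pieces.append(buf)
    return pieces


def chunk_markdown(text, max_chars=MAX_CHARS):
    """Return (section_breadcrumb, chunk_text) pairs split on ## and ### headers."""
    chunks = []
    h2 = h3 = ""
    lines = []
    in_fence = False

    def flush():
        body = "\n".join(lines).strip()
        if body:
            crumb = " > ".join(part for part in (h2, h3) if part)
            chunks.extend((crumb, piece) for piece in split_long(body, max_chars))

    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADER_RE.match(line)
        if match:
            flush()                   # close the section that just ended
            lines = []
            if len(match.group(1)) == 2:
                h2, h3 = match.group(2).strip(), ""
            else:
                h3 = match.group(2).strip()
        else:
            lines.append(line)
    flush()
    return chunks
```

A single paragraph longer than the limit is kept whole here; a real pipeline might split it further on sentences.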
+
+## Metadata Schema (ChromaDB)
+Each chunk stored with:
+```python
+{
+    "source": "docs/lang/JavaScript.md",  # relative file path
+    "category": "docs",                    # docs|notes|lists|posts
+    "title": "JavaScript",                 # doc/post title
+    "section": "Arrays > Map and Filter",  # header breadcrumb
+    "tags": "tech",                        # comma-separated (posts only)
+    "date": "2025-03-03",                  # posts only
+    "url": "/til/docs/lang/JavaScript"     # reconstructed Docusaurus URL
+}
+```
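
One way the schema above could be filled in per chunk, as a hedged sketch. The helper name `build_metadata` is hypothetical, the `/til` base URL is assumed from the example value in the schema, and reusing the file path for post URLs is a simplification (posts with a frontmatter `slug` would need that instead).

```python
import re
from pathlib import PurePosixPath

DATE_RE = re.compile(r"(\d{4}-\d{2}-\d{2})")         # YYYY-MM-DD in post filenames
TITLE_RE = re.compile(r"^#\s+(.+)$", re.MULTILINE)   # first "# heading" line
BASE_URL = "/til"  # assumption: taken from the example URL in the schema


def build_metadata(rel_path, text, section="", tags=None):
    """Build the per-chunk metadata dict for a file given its relative path."""
    path = PurePosixPath(rel_path)
    category = path.parts[0]                          # docs | notes | lists | posts
    title_match = TITLE_RE.search(text)
    meta = {
        "source": rel_path,
        "category": category,
        # fall back to the filename when no "# heading" is present
        "title": title_match.group(1).strip() if title_match else path.stem,
        "section": section,
        "url": f"{BASE_URL}/{path.with_suffix('')}",
    }
    if category == "posts":
        date_match = DATE_RE.search(path.name)
        if date_match:
            meta["date"] = date_match.group(1)
        if tags:
            meta["tags"] = ",".join(tags)             # comma-separated, as in the schema
    return meta
```

Called with `"docs/lang/JavaScript.md"` and a section of `"Arrays > Map and Filter"`, this produces the same `source`, `category`, `title`, `section`, and `url` values as the example above.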
+
+## Prerequisites
+- Ollama installed and running with nomic-embed-text model pulled
+- Python 3.10+
+
+## Usage
+```sh
+cd rag
+pip install -r requirements.txt
+ollama pull nomic-embed-text
+
+# Ingest all markdown files (run once, re-run to rebuild)
+python ingest.py
+
+# Start the web UI
+python server.py
+# Open http://localhost:8808
+```
+
+## Verification
+1. Run ```python ingest.py``` - should report 300+ files processed and 1000+ chunks created
+2. Run ```python server.py``` - should start on port 8808
+3. Open browser to ```http://localhost:8808```
+4. Search for "PostgreSQL" - should return relevant chunks from ```docs/db/PostgreSQL.md```
+5. Search for "Docker" - should return chunks from ```docs/server/Docker.md```
+6. Search for a conceptual query like "how to set up SSL certificates" - should return relevant Let's Encrypt / Apache / Nginx docs

+ 1 - 3
posts/2026/2026-02-10-posts.md

@@ -12,12 +12,10 @@ image: https://davidawindham.com/wp-content/themes/daw/img/opengraph_image.jpg
 hide_table_of_contents: true
 ---
 
-I've used the expression that **I've learned much more from maintaining than building projects** for quite some time now. It's definitely more true than ever but I wanted to make a note to myself regarding the use of artificial intelligence when building sites or doing pretty much anything else. 
+Over the last couple of days, I've been in a bit of a fog. It could be the abnormal weather patterns sweeping through the region with a couple of inches of ice last weekend and a half foot of snow predicted today [^1]. Needless to say, I've had some extra time on my hands indoors and I've also been trying to make sense of our political theatre with some daily doom-scrolling. 
 
 
 <!-- truncate -->
 
-Over the last couple of days, I've been in a bit of a fog. It could be the abnormal weather patterns sweeping through the region with a couple of inches of ice last weekend and a half foot of snow predicted today [^1]. Needless to say, I've had some extra time on my hands indoors and I've also been trying to make sense of our political theatre with some daily doom-scrolling. 
-
 ### Never Tell a Lie
 
 Let me start by saying that the seminal moment in this essay happened when I was very young. Although I've retold this anecdote more times than I can recall, I can't specify exactly how old I was at the time. I think I was eight or nine years old and I was riding in the car on the way to school listening to my dad on the radio. Dad did a morning radio show for the majority of my life - mostly music and comedy. This particular morning, just as we were pulling into the drop-off line at our school, he said "my family and I had a great dinner last night at wherever" ( I don't remember the name of the restaurant ). The thing is - we didn't have dinner last night at 'wherever'. I turned to look at my mom with what I'm sure was an obvious reaction on my face. She just said "ask your dad when you get home". 

+ 58 - 0
posts/2026/2026-02-24-posts.md

@@ -0,0 +1,58 @@
+---
+title: Model Context Protocol
+slug: mcp
+description: Today I learned how to configure a local MCP server.
+<!--- authors:
+  - name: David Windham
+    title: Something Else
+    url: https://davidawindham.com
+    image_url: https://davidawindham.com/wp-content/themes/daw/img/opengraph_image.jpg -->
+tags: [code, AI]
+image: https://davidawindham.com/wp-content/themes/daw/img/opengraph_image.jpg
+hide_table_of_contents: true
+---
+
+Today I learned how to effectively build, configure, and run a local Model Context Protocol server. The Model Context Protocol[^1] is an open protocol that enables seamless integration between language model AI applications and external data sources and tools. 
+
+<!-- truncate -->
+
+The better half and I have had this ongoing discussion about how foolish folks can be with artificial intelligence. She regularly gets AI crafted emails from colleagues and students so we like to read them aloud for fun. We are both fascinated and skeptical about the future of AI, but mostly lean toward the skeptical side. I've mostly been harping on the pitfalls of agentic engineering patterns as it relates to my work, but now it's becoming my work. 
+
+I first realized this when I started to get requests to fix codebases or systems that had 'gotten away' from the developer, mostly through the use of AI. Then I started to realize that my cheap assistant was doing more and more of the development work. And more recently, I've been learning about long-running agents that do all the work. 
+
+AI assistants embedded in the web, apps, editors, or terminals seem smart but are relatively blind to your work. The recent hullabaloo about Anthropic's CoWork cutting into other software-as-a-service models is a good example. The plugins for Claude are essentially hosted MCP servers which give it access to other software and data. The Model Context Protocol was released at the end of 2024 as an open standard and as usual, I'm a late adopter. 
+
+### Ralph
+
+A buddy of mine is deep down the AI rabbit hole, but he's also more experienced, a former systems engineer and pretty damn sharp. He sent me a video[^2] about the web forking over to agents and it got me thinking. I put together a local filesystem server called `ralph-fs`[^3] as an exercise, partly to understand the protocol from the inside and partly because I wanted Claude Code to have read-write access to a couple of local directories without me copy-pasting paths all day. The server is TypeScript, uses the official `@modelcontextprotocol/sdk`, and exposes seven tools: `read_file`, `write_file`, `list_directory`, `create_directory`, `file_info`, `delete_file`, and `search_files`. It's registered via a `.mcp.json` file in the project root and enabled through the terminal and desktop apps.
+
+My skepticism had me thinking carefully about path safety and wanting a deeper understanding of it. The server needs to enforce that no tool call escapes the allowed directory tree, because you're essentially handing the model a shell that can write files. I ended up with a simple allowlist resolved at call time: every path gets canonicalized and checked against the list before anything happens. It's not complicated, but it's the kind of thing you have to think through before you just wire up the filesystem and hand it to an agent.
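
The server itself is TypeScript, but the allowlist idea is small enough to sketch in Python. This is an illustration of the technique rather than the code in `ralph-fs`; the function name `resolve_allowed` and its error type are made up for the example.

```python
from pathlib import Path


def resolve_allowed(requested, allowed_roots):
    """Canonicalize a requested path and return it only if it sits under an allowed root.

    Path.resolve() collapses ".." segments and follows symlinks, so the check
    runs against the real location rather than the string the caller sent.
    """
    candidate = Path(requested).expanduser().resolve()
    for root in allowed_roots:
        root_path = Path(root).expanduser().resolve()
        # is_relative_to (Python 3.9+) is also true when candidate equals the root
        if candidate.is_relative_to(root_path):
            return candidate
    raise PermissionError(f"path outside allowed directories: {requested}")
```

Every tool handler would call this first and operate only on the returned path, so a request like `allowed/../secret.txt` is rejected before any read or write happens.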
+
+![](/img/ralph-loop.jpg)
+
+It's named Ralph[^4] because of a recent reference to an almost completely autonomous style of building software. As I get it up to speed, I'll add in more tools, utilities, and skills. Eventually I'll have a setup that's customized to the type of work I do most often and familiar with my servers and projects.  
+
+### Doing Things
+
+What MCP really represents is a step toward AI that can act rather than just advise. The agentic framing has been floating around for a couple of years now but it's mostly felt speculative. A clean, open protocol for tool use starts to make it concrete. You can write an MCP server for anything (your local files, a database, a REST API, a browser, a calendar) and the model can compose those tools to accomplish real tasks. The composition is the thing. A model that can read a file, reason about it, write a modified version back, and then call a build tool has meaningfully different capabilities than one that can only narrate what it would do if it could.
+
+And it's coming for everything. Everything. Most folks will just be granting Gemini access to their email and calendars or installing various plugins. I'm a local-first, privacy-minded fella, so it'll all be custom for me. Because it's published as an open spec rather than a proprietary feature, Microsoft, Google, and a handful of others have already adopted it[^5]. It has the shape of something that could standardize the way models interact with software environments the same way LSP[^6] standardized the way editors interact with language tooling.
+
+For now, I'm limiting Ralph to a very small subset of my work. And while the market for it will likely be virtual assistants for most, I'd like to keep mine focused. I manage a lot of websites and a handful of servers. Eventually I'll be connecting my local agent to those machines and projects. Even the WordPress websites will be able to use the protocol[^7] and working with a lot of software will be powered by autonomous agents. Some of it will be framed within the agent interface[^8] as embedded extension apps[^9].
+
+<div><br/><br/></div>
+---
+
+[^1]: _Model Context Protocol_ - https://modelcontextprotocol.io
+[^2]: _The agent web is being built this month_ - https://natesnewsletter.substack.com/p/coinbase-stripe-and-cloudflare-all
+[^3]: ralph - https://github.com/windhamdavid/ralph / https://code.davidawindham.com/david/ralph
+[^4]: _everything is a ralph loop_ - https://ghuntley.com/loop/ & https://github.com/frankbria/ralph-claude-code
+[^5]: _MCP Partners_ - https://modelcontextprotocol.io/partners
+[^6]: Language Server Protocol - https://microsoft.github.io/language-server-protocol/
+[^7]: _WordPress MCP Adapter_ - https://developer.wordpress.org/news/2026/02/from-abilities-to-ai-agents-introducing-the-wordpress-mcp-adapter/
+[^8]: _Bringing UI Capabilities To MCP Clients_ - https://blog.modelcontextprotocol.io/posts/2026-01-26-mcp-apps/
+[^9]: MCP Extension Apps - https://github.com/modelcontextprotocol/ext-apps
+
+
+
+

+ 12 - 0
sidebars.js

@@ -4,6 +4,18 @@ module.exports = {
      type:'doc',
      id: 'index',
    },
+    {
+      type: 'category',
+      label: 'AI',
+      collapsible: true,
+      link: {
+        type:'doc',
+        id:'ai/ai',
+      },
+      items: [
+        'ai/ai',
+      ],
+    },
     {
       type: 'category',
       label: 'Computers',

+ 1 - 0
sidebarsnotes.js

@@ -164,6 +164,7 @@ module.exports = {
           },
           items: [
             'work/projects/ai',
+            'work/projects/ai-rag',
             'work/projects/game',
             'work/projects/gzet',
             'work/projects/ham',

+ 1 - 0
src/pages/index.md

@@ -6,6 +6,7 @@ description: A place to keep notes and documentation
 # Today I Learned
 
 - **2026**
+  - 26/02/24 - [Model Context Protocol](/posts/mcp)
   - 26/02/10 - [Everything is a Cult](/posts/everything-cult)
   - 26/02/04 - [Maintenance](/posts/maintenance)
   - 26/01/21 - [Storyboard](/posts/storyboard)