A demo public LLM context repository
A demonstration of how to structure a public context data repository for seeding LLM-based assistants with personal information.
I've been writing a lot about personal context data for AI — the idea that you can dramatically improve AI assistant quality by proactively feeding them structured information about yourself, your preferences, and your circumstances. But theory is easier to talk about than practice, and one of the questions I kept getting was: what does a context repository actually look like? Not in the abstract, but concretely — what are the files, how are they structured, what level of detail is appropriate? Rather than keep explaining it verbally, I decided to create a working example. My LLM Context Repo (Public) is a curated subset of my actual personal context data, published on GitHub as a demonstration of the format and approach.
A context repo for experimenting with LLMs. This is a public version, so some information is naturally withheld.
What's inside and how it's organised
Each markdown file in the repository contains a discrete, standalone piece of contextual information — things like my professional background, my technical preferences, my communication style, where I'm based and why that matters for my work. The key design decision is atomicity: each file should be independently useful when retrieved by a RAG pipeline, without requiring the reader (human or AI) to have read the other files first.

This means some information is mildly redundant across files, which is intentional. When a vector search retrieves one snippet about my work preferences, that snippet should contain enough context to be actionable without needing to also retrieve my bio and my location data. This is a different design philosophy from, say, a personal website where you'd link between pages — context snippets need to be self-contained because you can't control which ones the retrieval system will select.
How the context was gathered
Some of the context was gathered through an experiment I found genuinely enjoyable: having an LLM conduct a context-setting interview with me. The AI asked questions — structured, purposeful questions about my work, my preferences, my background — and then organised my responses into themed context snippets. It's a surprisingly pleasant process, like a very efficient onboarding conversation with a new colleague who happens to have perfect memory. I also built several Hugging Face assistants to support the workflow: one that helps identify what context data would be most valuable to capture next, one that isolates contextual information from surrounding conversational text, one that generates formatted snippets from raw dictated speech, and one that can conduct the full interview-to-snippet pipeline end to end.
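The final step of that pipeline — turning themed answers into standalone files — can be sketched as follows. This is a minimal illustration, not the actual assistants' code: the `make_snippet` helper, the topic slugs, and the sample interview notes are all hypothetical.

```python
import tempfile
from pathlib import Path

def make_snippet(topic: str, body: str, out_dir: Path) -> Path:
    """Write one standalone markdown snippet for a single theme.

    The heading restates the topic so the file stays intelligible even
    when a retrieval system surfaces it without its sibling files.
    """
    slug = topic.lower().replace(" ", "-")
    path = out_dir / f"{slug}.md"
    path.write_text(f"# {topic}\n\n{body.strip()}\n", encoding="utf-8")
    return path

# Hypothetical interview output: the LLM has already grouped the raw
# answers under themed topics.
interview_notes = {
    "Communication style": "Prefers concise, written, asynchronous updates.",
    "Technical preferences": "Works mainly in Python; favours plain-text tooling.",
}

with tempfile.TemporaryDirectory() as tmp:
    for topic, body in interview_notes.items():
        print(make_snippet(topic, body, Path(tmp)).name)
# → communication-style.md
# → technical-preferences.md
```

One file per theme keeps each snippet atomic, which is what makes the repository usable by a retrieval pipeline that selects files independently.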
The point of making it public is transparency. Context snippets are just small textual files containing pieces of information about a subject, structured for vector storage ingestion. By publishing a real example rather than a hypothetical one, I hope to lower the barrier for others who want to build their own context repositories but aren't sure where to start. The full demo is on GitHub.
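Because the snippets are just small markdown files, ingestion for vector storage reduces to walking the repository and treating each file as one record. A minimal sketch, assuming a flat directory of `.md` snippets and leaving the choice of embedding model and vector store open (the `load_snippets` helper and the demo file are illustrative, not part of the actual repo):

```python
import tempfile
from pathlib import Path

def load_snippets(repo_dir: Path) -> list[tuple[str, str]]:
    """Collect (snippet_id, text) pairs from a context repo,
    ready to embed and insert into whichever vector store you use."""
    return [
        (path.stem, path.read_text(encoding="utf-8"))
        for path in sorted(repo_dir.glob("*.md"))
    ]

# Demo against a throwaway directory containing one hypothetical snippet.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "bio.md").write_text("# Bio\n\nA short professional background.\n")
    print(load_snippets(repo))
# → [('bio', '# Bio\n\nA short professional background.\n')]
```

No parsing or chunking logic is needed precisely because the files are already atomic — the repository's structure does the chunking for you.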