A Private AI Assistant For Your Own Files

ask your own
library anything.

Alembic Index turns your notes, books, and papers into something you can question in plain English, and every answer points straight back to the source it came from. It runs entirely on your own computer.

Nothing is uploaded, ever.

  • 100% local
  • Cited answers
  • Notes · eBooks · PDFs
Runs on a MacBook Pro · Models via Ollama · Nothing leaves the machine

The Atlas

Watch your library think.

Every passage in your library becomes a point on a map, placed by meaning, so related ideas settle into topic clusters. Here you're at the top, speaking a question to the model on the terminal below. Watch it reach out into the clusters all around, pull in the passages that fit, and answer you on screen. It's the whole idea of retrieval, made visible.

Ask the Atlas
↗ Meet the maker · corbet.app

  1. By meaning: vector search
  2. By keyword: BM25
  3. Combined: fused ranking
  4. Reranked: cross-encoder
  5. Given to the AI: cited answer

Shown in miniature: a real index holds 24,303 passages from 397 files. The full Atlas lives inside the app, and every dot of it stays on your own machine.

[01] The problem

You've saved far more than you'll ever re-read.

Years of notes and markdown files. A drive full of eBooks and academic PDFs. The passage you half-remember is in there somewhere, but keyword search doesn't understand what you actually meant, and the cloud AI tools that could help want you to upload your private library to someone else's servers first.

[02] What it is

A smart research assistant for your private files.

Think of it as Retrieval-Augmented Generation (RAG), the technical name for this approach: your question → the passages that fit → an answer, only from your files. The AI first searches your own documents for relevant passages, then answers using only those, so replies stay grounded in your sources, not the model's memory. You ask a question in plain English; it finds the passages that actually matter, reads them, and writes you a clear answer, with citations straight back to the source. Every claim is linked to the exact passage it came from; click one to open the original text and check it for yourself. Nothing is taken on faith. It doesn't guess, and it doesn't make things up.

The AI runs locally, on your own machine through Ollama (free software, no account, no cloud): both the part that reads and writes text (the large language model, or LLM, which here only answers from the passages pulled out of your files) and the part that turns each passage into a list of numbers capturing its meaning (the embedding model), so search can match by idea, not just by keyword. Your files and the AI stay on your computer, and your reading and your questions never leave your machine.

Bigger question? See the FAQ.

In plain terms

It answers only from what it just found in your files, not from the open internet, and not from whatever it picked up during training.

[03] Features

What it does, and why that matters.

01

Answers you can check

Every claim links to the exact passage it came from. Click a citation to open the original source. No vague summaries you just have to trust.

02

Private by design

The AI runs on your own machine, and the app is local-only: it answers only to your own computer (technically, it binds to 127.0.0.1), so it's never exposed to your network or the internet. Your PDFs are read on your own machine with PyMuPDF, a tool that extracts their text locally, never uploaded anywhere, and your search index is only ever touched locally.
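To make "binds to 127.0.0.1" concrete, here is a minimal sketch using only Python's standard library. It is not the app's actual server (the real interface is a FastAPI bridge); the handler and the `make_local_server` helper are hypothetical names used for illustration.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Loopback address: only programs on this computer can connect.
# Binding to "0.0.0.0" instead would listen on every network interface.
LOOPBACK = "127.0.0.1"


class PingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the app's real request handler."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_local_server(port: int = 0) -> ThreadingHTTPServer:
    """Create a server bound to loopback only (port 0 picks any free port)."""
    return ThreadingHTTPServer((LOOPBACK, port), PingHandler)
```

Calling `make_local_server(8000).serve_forever()` would start it. If the FastAPI app is served with uvicorn (an assumption, the page doesn't say), passing `host="127.0.0.1"` plays the same role.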

03

It searches by keyword and by meaning

It searches two ways at once: by exact keyword (BM25, classic search that's great at exact words, names, and codes) and by meaning (vector search, which finds passages by idea, so it matches even when the wording differs from your question). It then merges the keyword results and the meaning results into one combined ranking with reciprocal-rank fusion, a simple, robust way to join two ranked lists, and has a reranker (a cross-encoder, a second, more careful model) re-read the top results against your question, so the passages that truly fit come first.
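Reciprocal-rank fusion is small enough to show in full. This is a sketch of the general technique, not the project's code: each list gives a passage 1 / (k + rank) points, and the points are summed. The constant `k = 60` is a commonly used default and an assumption here, as are the passage ids.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of passage ids into one ranked list."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            # A high position in any list earns more; appearing in both lists adds up.
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


keyword_hits = ["p3", "p1", "p7"]  # hypothetical BM25 order
vector_hits = ["p1", "p4", "p3"]   # hypothetical vector-search order
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['p1', 'p3', 'p4', 'p7']
```

Passages found by both searches (p1 and p3) rise above those found by only one, without having to compare raw BM25 scores against vector distances.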

04

Works with the files you already have

Notes, markdown, eBooks, and academic PDFs, even scanned, image-only ones, which OCR (optical character recognition, done on your own machine with Tesseract) turns into searchable text. Long files are split into passages (chunked) along their natural structure, so it can fetch and quote the exact part that answers you, not a whole file.
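As a rough illustration of splitting along natural structure, the sketch below cuts a markdown note at its headings. It assumes `#`-style headings and is far simpler than a real splitter (no size limits, no overlap); `Chunk` and `split_by_headings` are hypothetical names.

```python
import re
from dataclasses import dataclass

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")  # markdown headings, "#" through "######"


@dataclass
class Chunk:
    heading: str  # nearest heading above the passage ("" before the first heading)
    text: str


def split_by_headings(markdown: str) -> list[Chunk]:
    """Split a markdown document into one passage per headed section."""
    chunks: list[Chunk] = []
    heading, lines = "", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            body = "\n".join(lines).strip()
            if body:  # sections with no body text are skipped
                chunks.append(Chunk(heading, body))
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    body = "\n".join(lines).strip()
    if body:
        chunks.append(Chunk(heading, body))
    return chunks


sample = "Intro text.\n\n# Sourdough\nFeed the starter.\n\n## Timing\nProof overnight.\n"
chunks = split_by_headings(sample)
print([c.heading for c in chunks])  # ['', 'Sourdough', 'Timing']
```

Keeping the heading with each passage is what lets a citation say where in the file a quote came from.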

05

Keeps your topics separate

Your notes, your books & papers, and your recipes live in separate collections, so a recipe never sneaks into a research answer. Choose which one to search for each question.

06

Answers in real time, and follows the thread

Answers appear as they're written, so you're never left staring at a spinner. Follow-up questions understand what came before, so "and what about the second one?" just works.

07

A built-in scratchpad

Pin the passages and answers you want to keep, edit them in place, and export to a plain-text document. A research session becomes a first draft without ever leaving the page.

08

Nothing is locked in

The AI models and even the vector store (Chroma or Qdrant), the database that holds the meaning-based index and finds the closest matches to your question, are set in one simple settings file. Chroma runs locally; Qdrant is a drop-in for a server. It's built to move to a home server later with no code changes.

[04] How it works

From a pile of files to a cited answer.

Your computer does just two things: run the AI and copy your files down from the cloud. Everything else (reading the files, organizing them, and searching) happens in a sealed workspace that never opens to the internet.

STAGE 01

Your files

Corpus. Your existing library: notes and markdown files, eBooks, and academic PDFs. Nothing new to write; the engine works with files you already keep, grouped into separate collections so unrelated material can't bleed into an answer.

Notes, eBooks, and PDFs you already own.

STAGE 02

Bring files local

Materialize. Many files live in iCloud as "dataless" placeholders, just stubs on disk. This step downloads their real contents into a local, read-only staging copy, so indexing always reads from your disk, never from the cloud.

Files are downloaded out of iCloud into a local read-only staging copy.
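One way to picture the staging step is a plain copy that reads every byte and then marks the result read-only. This is a sketch under assumptions, not the project's code: it assumes that reading a placeholder is enough for macOS to fetch the real contents, and `materialize` is a hypothetical name.

```python
import shutil
import stat
from pathlib import Path

READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH  # mode 0o444


def materialize(source_root: Path, staging_root: Path) -> list[Path]:
    """Copy every file under source_root into staging_root, read-only.

    staging_root is assumed to live outside source_root.
    """
    staged: list[Path] = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        dest = staging_root / src.relative_to(source_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            # An earlier run left a read-only copy; make it writable to refresh it.
            dest.chmod(stat.S_IRUSR | stat.S_IWUSR)
        shutil.copy2(src, dest)  # copies the full contents plus timestamps
        dest.chmod(READ_ONLY)
        staged.append(dest)
    return staged
```

Indexing then points at the staging folder only, so a slow or missing cloud download can't interrupt it halfway through.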

STAGE 03

Read & split up

Ingest. Each file is parsed into clean text, split into passages along its natural structure (headings and sections), and every passage is turned into a meaning-vector by the embedding model. Image-only PDFs are OCR'd first so even scans become searchable.

Read each file, split it into passages, and turn each into an embedding with bge-m3, the open model that turns each passage into numbers capturing its meaning, so search can match by idea.

STAGE 04

Save the index

Store. The meaning-vectors (plus the original text and where each passage came from) are saved in a local vector database on disk. Built once, it's reused for every question; nothing is rebuilt or re-uploaded unless you change the embedding model.

Vectors land in a local Chroma index on disk. Chroma is the local database that stores the meaning-vectors and returns the closest passages.
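What "returns the closest passages" means can be shown with a toy in-memory store. This is not Chroma's API, just a sketch of the idea: keep each vector with its source, and rank by cosine similarity. The three-number vectors and the `TinyVectorStore` class are made up for illustration (real bge-m3 vectors have 1024 numbers).

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Toy stand-in for a vector database, held entirely in memory."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float], str]] = []

    def add(self, passage_id: str, vector: list[float], source: str) -> None:
        self._rows.append((passage_id, vector, source))

    def query(self, vector: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        """Return the top_k (passage id, similarity) pairs, closest first."""
        scored = [(pid, cosine(vector, vec)) for pid, vec, _ in self._rows]
        return sorted(scored, key=lambda item: -item[1])[:top_k]


store = TinyVectorStore()
store.add("bread", [0.9, 0.1, 0.0], "books/baking.epub")
store.add("yeast", [0.8, 0.3, 0.1], "notes/starter.md")
store.add("taxes", [0.0, 0.2, 0.9], "notes/receipts.md")
nearest = store.query([1.0, 0.2, 0.0], top_k=2)
print([pid for pid, _ in nearest])  # ['bread', 'yeast']
```

A real vector database does the same job with an index that avoids comparing the question against every stored passage.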

STAGE 05

Search

Retrieve. Your question is searched two ways at once: exact keywords (BM25) and meaning (vectors). The two rankings are merged, then a reranker re-reads the top candidates against your question and reorders them so the most relevant passages rise to the top.

Keyword and meaning search, combined, then double-checked by a reranker.
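For the keyword half, here is a compact sketch of BM25 scoring. It uses a naive whitespace tokenizer and the typical parameter values `k1 = 1.5` and `b = 0.75`, both assumptions; the project's own BM25 implementation and settings are not shown on this page.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()  # naive: lowercase and split on whitespace


def bm25_scores(query: str, passages: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every passage against the query; higher means a better keyword match."""
    docs = [tokenize(p) for p in passages]
    if not docs:
        return []
    avg_len = sum(len(d) for d in docs) / len(docs)
    doc_freq = Counter(term for d in docs for term in set(d))  # passages containing each term
    scores = []
    for d in docs:
        counts = Counter(d)
        score = 0.0
        for term in tokenize(query):
            tf = counts[term]
            if tf == 0:
                continue
            # Rare terms weigh more; the "1 +" keeps the weight non-negative.
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Repeated terms help with diminishing returns; long passages are discounted.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(score)
    return scores


passages = [
    "error code E42 in the boiler manual",
    "notes on heating systems and boilers",
    "a recipe for sourdough bread",
]
scores = bm25_scores("E42 boiler", passages)
```

Only the first passage scores above zero: "boilers" is not the exact word "boiler", which is exactly the gap that meaning-based search covers.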

STAGE 06

Write the answer

Generate. The local language model is handed only those top passages plus your question, and writes an answer grounded in them, citing each claim as it goes. It can't wander off into its training data or the open web.

qwen3:30b-a3b, the local large language model, reads the retrieved passages and drafts the cited answer.
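Handing the model "only those top passages plus your question" comes down to how the prompt is assembled. The sketch below numbers each passage so the answer can cite it. The wording of the instructions and the names `Passage` and `build_prompt` are hypothetical, not the project's actual prompt.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # where the passage came from, e.g. a file path
    text: str


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Assemble a prompt that contains only the retrieved passages and the question."""
    numbered = "\n\n".join(
        f"[{i}] ({p.source})\n{p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered passages below. "
        "Cite the passage number in square brackets after each claim. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


prompt = build_prompt(
    "How long should the dough proof?",
    [
        Passage("notes/bread.md", "Proof the dough overnight in the fridge."),
        Passage("books/baking.epub", "A cold proof develops more flavor."),
    ],
)
```

The numbers in the prompt are the same numbers that show up as citations in the answer, which is what makes each claim traceable.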

STAGE 07

Answer + citations

The reply streams back word by word so you're not staring at a spinner, and every claim is linked to the exact source passage. Click a citation to open the original chunk and check it yourself. No unverifiable summaries.

Streamed back with every claim linked to its source chunk.
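Linking a claim to its source chunk can be as simple as reading the bracketed numbers out of the answer and looking each one up. This sketch assumes citations are written as `[1]`, `[2]`, and so on, which is an assumption about the format; `resolve_citations` is a hypothetical helper.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def resolve_citations(answer: str, sources: list[str]) -> list[tuple[int, str]]:
    """Return (marker, source) pairs for each valid citation, in order of first use."""
    seen: list[int] = []
    for match in CITATION.finditer(answer):
        n = int(match.group(1))
        # Ignore numbers that don't correspond to a passage the model was given.
        if 1 <= n <= len(sources) and n not in seen:
            seen.append(n)
    return [(n, sources[n - 1]) for n in seen]


sources = ["notes/starter.md#feeding", "books/bread.epub#ch3"]
answer = "Proof the dough overnight [2]. Feed the starter daily [1][2]. Use rye flour [5]."
links = resolve_citations(answer, sources)
print(links)  # [(2, 'books/bread.epub#ch3'), (1, 'notes/starter.md#feeding')]
```

The stray `[5]` is dropped because no fifth passage was supplied, so a citation can never point at something that wasn't retrieved.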

BOUNDARY

On your machine

Every step above (parsing, embedding, storing, retrieving, generating) runs locally. The models sit on localhost, the server binds to 127.0.0.1, and nothing in the chain reaches the public internet. Privacy is the architecture, not a setting.

No step in this chain reaches the public internet.

Always private

The AI runs on your computer · nothing is exposed to the internet · your files never leave your disk.

[05] Stack

Built on boring, swappable parts.

Orchestration
LlamaIndex. The conductor that wires every step together: loading your files, splitting them into passages, running the search, and handing the results to the model. No cloud parsers, local parsing only.

Embeddings
bge-m3 via Ollama. Turns each passage into a list of numbers that captures its meaning, so search can match by idea rather than exact wording. Each passage becomes a point in "meaning space," and so does your question; search returns the nearest points. 1024-dim · swappable in config.

Generation
qwen3:30b-a3b via Ollama. The local language model that reads the retrieved passages and writes your answer, citing each source as it goes. Runs natively on Apple Silicon (Metal).

Vector store
Chroma (local), pluggable to Qdrant. The database that holds those meaning-numbers and instantly returns the passages closest to your question.

Retrieval
BM25 + vector, reciprocal-rank fusion. Searches two ways at once: exact keywords and meaning, then merges both into a single ranked list of passages.

Reranker
BGE-Reranker-v2-m3 cross-encoder. A second, more careful pass that re-reads the top passages against your question and pushes the best matches to the top.

Parsing & OCR
PyMuPDF · ocrmypdf + Tesseract. Pulls clean text out of your files, even reading scanned, image-only PDFs by recognizing the words on the page. Image-only PDFs OCR'd locally, hash-cached.

Interface
FastAPI localhost bridge + CLI. A small local web server so a browser page (or the command line) can talk to the engine on your machine.

Hardware
MacBook Pro M5 Pro, 48 GB. The machine it's built and tested on, with enough memory to run the models comfortably and no cloud GPUs. Host-direct, no Docker.

Roadmap
Self-hosted Unraid Docker + Cloudflare Tunnel. Where it's headed: the same app moved to a home server and reached privately, by changing settings, not code. No code changes required.
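The "hash-cached" note on OCR in the stack above is worth a quick sketch, since OCR is the slowest step. One plausible way to do it, assumed here rather than taken from the project, is to key the cache on a SHA-256 of the file's contents, so a renamed or re-copied scan is never processed twice. `cached_text` and the `extract` callback are hypothetical; `extract` stands in for the real OCR call.

```python
import hashlib
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes, read in 1 MiB blocks to keep memory flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def cached_text(pdf: Path, cache_dir: Path, extract: Callable[[Path], str]) -> str:
    """Return the extracted text, running the slow extract step only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / f"{file_digest(pdf)}.txt"
    if cached.exists():
        return cached.read_text(encoding="utf-8")
    text = extract(pdf)  # the expensive OCR call goes here
    cached.write_text(text, encoding="utf-8")
    return text
```

Because the key is the content rather than the file name, editing a PDF changes its hash and triggers a fresh OCR pass automatically.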

[06] Field notes

A few decisions worth defending.

[Image: watercolor still life of an open notebook with faint handwriting, a stack of cloth-bound books, a dried agave leaf, and a small copper cup on a limestone surface.]
Privacy

An invariant, not a setting.

Local parsing is enforced, not optional. The cloud document parser was ruled out from the start, the models stay on localhost, and the container exposes no ports. Privacy you can toggle off isn't privacy.

Debugging

A named volume over a bind mount.

SQLite-backed Chroma threw disk-I/O errors across the macOS↔Linux file boundary. The index now lives where the database engine is actually happy. Caught in a smoke test, not in production.

Architecture

Designed for the move before it moved.

Model endpoints and the vector store are config-driven, so the jump from a laptop to a self-hosted server changes a YAML file, not the code.
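Here is a sketch of what "config-driven" can look like in practice. The two dictionaries stand in for a parsed YAML file; every key name, the `Settings` class, and the server hostnames are hypothetical. The ports are the usual defaults for Ollama (11434) and Qdrant (6333).

```python
from dataclasses import dataclass

SUPPORTED_STORES = {"chroma", "qdrant"}


@dataclass(frozen=True)
class Settings:
    llm_model: str
    embed_model: str
    ollama_url: str
    vector_store: str           # "chroma" or "qdrant"
    vector_store_location: str  # a folder for Chroma, a URL for Qdrant


def load_settings(raw: dict) -> Settings:
    """Validate a parsed settings mapping and turn it into a typed object."""
    store = raw["vector_store"]["kind"]
    if store not in SUPPORTED_STORES:
        raise ValueError(f"unsupported vector store: {store!r}")
    return Settings(
        llm_model=raw["models"]["llm"],
        embed_model=raw["models"]["embedding"],
        ollama_url=raw["models"]["ollama_url"],
        vector_store=store,
        vector_store_location=raw["vector_store"]["location"],
    )


# Laptop: everything on localhost, index in a local folder.
LAPTOP = {
    "models": {"llm": "qwen3:30b-a3b", "embedding": "bge-m3",
               "ollama_url": "http://localhost:11434"},
    "vector_store": {"kind": "chroma", "location": "./index"},
}

# Home server: same models, different endpoints (hostnames are made up).
SERVER = {
    "models": {**LAPTOP["models"], "ollama_url": "http://ollama:11434"},
    "vector_store": {"kind": "qdrant", "location": "http://qdrant:6333"},
}
```

The rest of the code would only ever see a `Settings` object, so moving machines means editing values like these and nothing else.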

Retrieval

Precision over recall theatre.

Over-fetch, cap chunks per document, then rerank with a cross-encoder. Six sharp passages beat twenty fuzzy ones when the model has to cite what it used.
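The over-fetch, cap, then rerank sequence can be sketched in a few lines. The cap of two passages per document is an assumed value, the final count of six comes from the note above, and `overlap_scorer` is a toy word-overlap stand-in for the real cross-encoder.

```python
from collections import Counter
from typing import Callable

Candidate = tuple[str, str]  # (document id, passage text)


def select_passages(
    candidates: list[Candidate],
    score: Callable[[str], float],
    per_doc_cap: int = 2,
    final_k: int = 6,
) -> list[Candidate]:
    """Cap passages per document, then keep the final_k best by reranker score."""
    kept: list[Candidate] = []
    per_doc: Counter = Counter()
    for doc_id, text in candidates:  # candidates arrive in fused-rank order
        if per_doc[doc_id] < per_doc_cap:
            kept.append((doc_id, text))
            per_doc[doc_id] += 1
    reranked = sorted(kept, key=lambda c: score(c[1]), reverse=True)
    return reranked[:final_k]


def overlap_scorer(question: str) -> Callable[[str], float]:
    """Toy stand-in for a cross-encoder: counts words shared with the question."""
    wanted = set(question.lower().split())
    return lambda text: float(len(wanted & set(text.lower().split())))


candidates = [
    ("a", "mix the dough"),
    ("a", "proof the dough overnight"),
    ("a", "dough hydration tables"),  # third passage from document "a": capped out
    ("b", "overnight oats"),
    ("c", "tax receipts"),
]
top = select_passages(candidates, overlap_scorer("proof dough overnight"), final_k=3)
```

The cap stops one long book from filling every slot, and the rerank makes sure the slots that remain go to the passages that best match the question.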

[07] FAQ

Questions, answered.

Is my data private? Does anything get uploaded?

Nothing is uploaded, ever. The AI and your files both stay on your own computer, and the app is never exposed to the internet or your network. Your documents and your questions never leave your machine.

Do I need to be technical to use it?

Once it's running, no. You just ask questions in plain English. Getting it set up does take a little comfort with installing developer tools (a one-time install of the free Ollama app and the project itself). It's an open personal project, so setup is hands-on rather than one-click for now.

How is this different from ChatGPT or other cloud AI?

Cloud assistants answer from what they absorbed during training and run on someone else's servers. Alembic Index answers only from your files, links every claim back to the exact source, and runs entirely on your own computer. Nothing is uploaded.

What kinds of files does it work with?

Notes and markdown files, eBooks (EPUB), and PDFs, including scanned, image-only PDFs, which it reads by recognizing the text on the page. You point it at folders you already have; there's nothing new to write.

Does it cost anything?

No. It's free and open source, and it uses free, open AI models that run on your own machine. There's no subscription, no per-question fee, and no account to create.

Can it make things up?

It's built specifically to avoid that. It answers only from the passages it pulled out of your files, and cites each one so you can click to verify. If your files don't contain the answer, it tells you rather than inventing one.

What are the hardware requirements, and does it run on Windows?

It's built and tested on an Apple Silicon Mac (M-series) with enough memory to run a local model comfortably (roughly 16 GB or more). The models run through Ollama, which also supports Windows and Linux, so it can be adapted, but the current build targets macOS.

What AI models does it use, and can I swap them?

A local language model writes the answers, and a separate model powers meaning-based search; both run locally through Ollama. Every model, plus the search database, is set in one simple settings file, so you can swap in different ones without touching the code.

Where exactly is my data stored?

On your own disk. Your files are read locally, the search index is saved in a local database on your machine, and none of it is sent anywhere. Delete the index folder and it's gone completely.

Is it open source, and how do I install it?

Yes. The full source is on GitHub. In short: install Ollama and pull the models, clone the repo, point it at your folders, and run it locally. See the GitHub repo for step-by-step instructions.

Can I run it on a home server instead of my laptop?

Yes, it's designed to move from a laptop to a self-hosted server (for example, Unraid with Docker, reached privately through a Cloudflare Tunnel) by changing a settings file, not the code.

Read the source.
Run it yourself.

Alembic Index is a personal project, built in the open. Clone it, point it at your own files, and ask your library something.

Get it on GitHub

MacBook Pro M5 Pro · local-first