What The Link — How I AI-fied My WhatsApp
How I built a WhatsApp-powered bookmark manager with semantic search using Baileys, TanStack Start, Hono, Drizzle, SQLite, and Gemini embeddings.
Have you ever used WhatsApp as a bookmarking tool?
I did. I even have a group with just me where I dump thoughts and links. Blogs. Tweets. Tools. Products. Ideas.
The problem? Finding them again. So I built What The Link.
The Problem
I'm someone who always sends links — products, blogs, brainstorming notes, ideas — everything gets dumped into my WhatsApp personal group for quick access.
But over time it gets flooded. Messy. Hard to search.
There's no semantic search in WhatsApp — and that's the real problem. You have to remember the exact words you used, not what the link was actually about.
So I thought — why not give WhatsApp that ability? AI-ify it. Make it actually smart.
Finding the Right Approach
When the idea first clicked, I thought about piping chats from WhatsApp to Notion. But only paid bot integrations existed. Same story with the WhatsApp Business API — also paid. Not what I needed.
Then I got curious about how OpenClaw handles their WhatsApp integration. That's how I found Baileys — an unofficial WhatsApp socket package where you connect your WhatsApp via QR code and keep the session active. For free.
That was exactly what I wanted.
I didn't go through every file in their big monorepo to understand the codebase. What I did instead was prepend their GitHub link with DeepWiki, which had already indexed the whole repo. I just chatted with it and got the info I needed.
`deepwiki.com/[github-repo-url]`

Around the same time I got the Claude Max plan for 6 months through Anthropic's open source program. What else do you want, right? I started building, planning, and figuring out the scope.
The Stack
I picked the T3 stack builder to bootstrap the project. The full stack:
- TanStack Start — full-stack React framework, file-based routing, server functions built in
- Hono — lightweight backend API layer
- Drizzle ORM — type-safe, SQL-first. Pairs really well with SQLite
- SQLite — simple, file-based, no infra overhead for a v1
- shadcn/ui — for the UI components
The whole thing is type-safe end to end. Drizzle handles the schema and queries, Hono handles the API routes, TanStack Start ties the frontend and backend together. For a solo project it's a really clean setup — not over-engineered, but also not something you'll regret later.
Building It — V1 First
I didn't try to build everything at once.
V1 was just the CRUD and the backend. Baileys listens for incoming WhatsApp messages, extracts any links, and saves them to SQLite via Drizzle. That's it. No AI yet. Just — capture the link, store it, show it in the web UI.
Get the plumbing right first. The smart stuff comes after.
```
WhatsApp message
      ↓
Baileys socket picks it up
      ↓
Extract the link
      ↓
Save to SQLite via Drizzle
      ↓
Show in the web UI
```

Once that was working cleanly, I moved to the part I actually wanted to build.
How the WhatsApp Part Actually Works
Baileys gives you a raw WebSocket connection to WhatsApp Web. You scan a QR code once, and the session gets persisted to disk — so on restart, it just reconnects. No QR again.
But it's not plug-and-play. WhatsApp rejects outdated client versions with a 405. So the server fetches the latest WhatsApp Web version from a public repo on startup. If that fails, it falls back to a hardcoded version. Small detail, but without it — the whole thing breaks silently.
Connection state machine
```
connection === "open"  → good, listening
connection === "close" → check why
  └─ 405 → outdated version → refetch, retry in 10s
  └─ 440 → another device took over → logout, clear auth, wait
  └─ anything else → auto-reconnect after 5s
```

Message pipeline
When a message comes in, it goes through a pipeline:
- Ignore your own messages
- Filter by allowed group (optional — you can lock it to one group)
- Check if it's a command (`?help`, `?search query`)
- If not — extract links and save
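A sketch of that dispatch as a pure decision function (the types and names here are mine, not the actual codebase):

```typescript
type Action =
  | { kind: "ignore" }
  | { kind: "command"; name: string; args: string }
  | { kind: "save"; text: string };

interface IncomingMessage {
  fromMe: boolean;   // Baileys marks our own outgoing messages
  groupId: string;
  text: string;
}

// Mirror the pipeline: own messages → ignore, wrong group → ignore,
// "?..." → command, everything else → extract links and save.
function routeMessage(msg: IncomingMessage, allowedGroup?: string): Action {
  if (msg.fromMe) return { kind: "ignore" };
  if (allowedGroup && msg.groupId !== allowedGroup) return { kind: "ignore" };
  const cmd = msg.text.match(/^\?(\S+)\s*(.*)$/s);
  if (cmd) return { kind: "command", name: cmd[1], args: cmd[2].trim() };
  return { kind: "save", text: msg.text };
}
```

Keeping this pure — no socket, no database — makes each step trivially testable on its own.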
Link extraction — three-layer fallback
Layer 1: Regex — fast, free. Grabs any http/https URL from the text.
Layer 2: AI text extraction — if regex finds nothing, ask Gemini to parse the message.
Layer 3: AI vision — if the message has an image (screenshot, photo of a screen, QR code), download it and run it through Gemini's vision model.

Most messages hit Layer 1 and move on. But the vision layer is surprisingly useful — people screenshot tweets and product pages all the time. That would've been lost without it.
URLs also get cleaned up before saving. Tracking parameters like `utm_source`, `fbclid`, `gclid` — 21 of them — get stripped. Trailing slashes removed. URLs normalized. No junk in the database.
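Layer 1 plus the cleanup step could look roughly like this. The parameter list is a subset (the real code strips 21), and the function names are my own:

```typescript
// Subset of the tracking params; the real list has 21 of them.
const TRACKING_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"];

// Layer 1: grab every http/https URL in the text (naive regex;
// trailing-punctuation handling is omitted in this sketch).
function extractUrls(text: string): string[] {
  return text.match(/https?:\/\/\S+/g) ?? [];
}

// Strip tracking params, normalize via the URL parser, drop the trailing slash.
function cleanUrl(raw: string): string {
  const url = new URL(raw);
  for (const p of TRACKING_PARAMS) url.searchParams.delete(p);
  return url.toString().replace(/\/$/, "");
}
```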
Reactions
When a link is saved, the bot reacts to the WhatsApp message:
| Reaction | Meaning |
|---|---|
| 🔖 | Saved |
| ⚠️ | Duplicate |
| 📝 | Saved as a note (no URL found, just text) |
| ❌ | Nothing useful in the message |
Small touch but it makes it feel alive.
The Database — Keeping It Simple
The whole thing runs on one SQLite file. Bookmarks, tags, settings, embeddings — all in one place.
Bookmarks table
```sql
CREATE TABLE bookmarks (
  url TEXT UNIQUE,
  title TEXT,
  description TEXT,
  image TEXT,
  favicon TEXT,
  domain TEXT,
  tags TEXT,                -- JSON array, denormalized
  source TEXT,              -- 'whatsapp' | 'manual' | 'import'
  whatsapp_message_id TEXT,
  summary TEXT,             -- AI-generated
  embedding BLOB,           -- serialized Float32Array
  metadata_status TEXT,
  summary_status TEXT,
  embedding_status TEXT,
  is_archived INTEGER DEFAULT 0,
  created_at TEXT,
  updated_at TEXT
);
```

Each bookmark has three independent processing pipelines — metadata, summary, and embedding — each with its own status and retry count. They run independently, so a failed summary doesn't block the embedding, and vice versa.
Tags are stored as a JSON array directly on the bookmark. No join table. For this scale it's the right call — simpler queries, simpler code, and SQLite's json_each() handles filtering just fine.
There's also a separate tags table that tracks tag names and usage counts — basically a tag cloud index. And an app_settings key-value table for things like which WhatsApp group to listen to and digest preferences.
No foreign keys. No complex relationships. One file, easy to back up, easy to move.
Then I AI-fied It
After V1 was solid, I added OpenRouter using the OpenAI SDK — so I'm hitting any model I want through one unified API without being locked into one provider.
The AI layer is a singleton client — it only initializes if you set the OPENROUTER_API_KEY. No key? No AI. The app still works, you just get basic keyword search instead of semantic search. Graceful degradation.
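The pattern is roughly a lazy singleton. This sketch uses a stand-in client type; in the real app it would wrap the OpenAI SDK pointed at OpenRouter's base URL:

```typescript
type AiClient = { apiKey: string }; // stand-in for the real OpenAI SDK client

let client: AiClient | null | undefined; // undefined = not initialized yet

// Initialize once; no key means AI features stay off.
// The real app would read process.env here.
function getAiClient(env: Record<string, string | undefined> = {}): AiClient | null {
  if (client !== undefined) return client;
  const key = env.OPENROUTER_API_KEY;
  client = key ? { apiKey: key } : null;
  return client;
}

// Graceful degradation: semantic search only when the client exists.
function searchMode(): "semantic" | "keyword" {
  return getAiClient() ? "semantic" : "keyword";
}
```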
When a bookmark is saved, three things happen asynchronously — fire-and-forget style. The user sees the bookmark immediately. The AI stuff runs in the background:
1. Metadata fetch
Cheerio crawls the page — grabs the title, description, OG image, favicon. If the crawl gets blocked (some sites do), it falls back to AI inference from just the URL structure. Like instagram.com/p/XYZ — even without crawling, you know it's an Instagram post.
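That fallback amounts to a small heuristic table keyed on URL shape. This one is my own illustration, not the app's actual rules:

```typescript
// Guess what a page is from the URL alone, for when crawling is blocked.
function inferFromUrl(raw: string): string | null {
  const url = new URL(raw);
  const host = url.hostname.replace(/^www\./, "");
  const path = url.pathname;
  if (host === "instagram.com" && path.startsWith("/p/")) return "Instagram post";
  if ((host === "twitter.com" || host === "x.com") && /\/status\/\d+/.test(path))
    return "Tweet";
  if (host === "github.com" && path.split("/").filter(Boolean).length === 2)
    return "GitHub repository";
  return null; // unknown shape: leave metadata empty
}
```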
2. Summary + Tags
The page content gets sent to Gemini via OpenRouter. Two parallel prompts:
- Summary: "Summarize in 4-5 concise lines." Capped at 2000 characters.
- Tags: "Suggest 2-4 short lowercase tags." Returns a JSON array.
Tags only auto-generate if the user didn't include hashtags in the original message. If you send #design #tools https://some-link.com, those hashtags become the tags and the AI doesn't override them.
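That override logic is small enough to sketch (the function name is mine):

```typescript
// User hashtags win; only ask the AI for tags when there are none.
function tagsFromMessage(text: string): { tags: string[]; needsAiTags: boolean } {
  const tags = (text.match(/#\w+/g) ?? []).map((t) => t.slice(1).toLowerCase());
  return { tags, needsAiTags: tags.length === 0 };
}
```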
3. Embedding
Once the summary exists, the app builds an embedding text by concatenating:
```
Title | Tags: tag1, tag2 | Full summary | Description (truncated to 500 chars)
```

That combined text gets sent to gemini-embedding-001 via OpenRouter. The embedding comes back as a Float32Array, gets serialized to a binary blob, and stored directly in the SQLite embedding column.
No separate vector database. No Pinecone. No Qdrant. Just a BLOB column in the same SQLite file.
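The round trip between Float32Array and a BLOB is just a byte-level view — a sketch, assuming Node's Buffer:

```typescript
// Float32Array → BLOB: view the same underlying bytes, no copy.
function embeddingToBlob(vec: Float32Array): Buffer {
  return Buffer.from(vec.buffer, vec.byteOffset, vec.byteLength);
}

// BLOB → Float32Array: copy into a fresh ArrayBuffer so the view starts
// at offset 0 regardless of how Node pooled the Buffer.
function blobToEmbedding(blob: Buffer): Float32Array {
  const copy = new Uint8Array(blob);
  return new Float32Array(copy.buffer);
}
```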
The Retry Machine
AI APIs fail. Rate limits hit. Pages block crawlers. So every pipeline has a retry system.
Three cron jobs run every 5 minutes:
| Job | What it does |
|---|---|
| Summary Retry | Picks up pending or failed summaries (batch of 50), generates summary + auto-tags, chains to embedding job on success |
| Embedding Retry | Picks up pending or failed embeddings (batch of 50), generates and caches embedding |
| Metadata Retry | Retries failed page crawls |
Each bookmark gets 3 retries max per pipeline. After that it's marked failed and left alone — you can manually retry from the UI.
Rate limits get special treatment. When the app hits a 429 from OpenRouter, it pauses the entire batch for 60 seconds, then resumes. No aggressive retries hammering the API.
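A sketch of that pause-and-resume behavior. The pause duration is a parameter here so it's testable; the real value would be 60 seconds:

```typescript
const sleep = (ms: number) => new Promise<void>((res) => setTimeout(res, ms));

// Process jobs in order. On a 429, pause the whole batch once, retry that
// job, and if it still fails leave it for the next cron pass.
async function runBatch(
  jobs: Array<() => Promise<void>>,
  pauseMs: number,
): Promise<number> {
  let processed = 0;
  for (const job of jobs) {
    try {
      await job();
      processed++;
    } catch (err) {
      if ((err as { status?: number }).status === 429) {
        await sleep(pauseMs); // back off instead of hammering the API
        try {
          await job();
          processed++;
        } catch {
          // still failing: the 5-minute cron will pick it up
        }
      }
      // non-429 errors: the status column marks it failed, cron retries later
    }
  }
  return processed;
}
```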
On startup, it also runs an immediate pass — processes all pending items before the cron kicks in. So after a deploy or restart, everything catches up fast.
How Semantic Search Actually Works
This is the part that makes the whole thing worth building.
The approach: in-memory embedding cache. On startup, the server loads every bookmark's embedding into memory as Float32Array objects. When you search, it:
- Generates an embedding for your query (one API call)
- Computes cosine similarity against every cached embedding
- Filters results above a 0.55 threshold
- Returns top 50, sorted by score
Cosine similarity is just dot product divided by the product of magnitudes. Three lines of math on Float32Array. For a few thousand bookmarks, it runs in about 2ms. No need for a vector index.
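Those few lines, plus the threshold and top-50 cut, in sketch form (names are mine):

```typescript
// dot(a, b) / (|a| * |b|)
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Score every cached embedding, keep matches above 0.55, return the top 50.
function rank(query: Float32Array, cache: Map<number, Float32Array>) {
  return Array.from(cache, ([id, emb]) => ({ id, score: cosineSimilarity(query, emb) }))
    .filter((r) => r.score > 0.55)
    .sort((x, y) => y.score - x.score)
    .slice(0, 50);
}
```

A linear scan like this is O(n · d) per query, which is exactly why it stays fast at a few thousand bookmarks and why no vector index is needed.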
If AI isn't configured or the cache is empty, search falls back to SQL LIKE queries across title, description, URL, summary, and even inside the JSON tags array using json_each(). Not as smart, but it works.
Search from WhatsApp
You can search directly from WhatsApp too:
| Command | Action |
|---|---|
| `?indie hackers` | Semantic search |
| `?#design` | Tag filter |
| `?recent 10` | Last 10 bookmarks |
Results come back as WhatsApp messages with clickable links. So you don't even need to open the web UI.
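The command grammar above is small; here's a parser sketch (my own, not the app's code):

```typescript
type Command =
  | { kind: "semantic"; query: string }
  | { kind: "tag"; tag: string }
  | { kind: "recent"; count: number };

// ?#tag → tag filter, ?recent N → latest N, anything else after ? → semantic search.
function parseCommand(text: string): Command | null {
  if (!text.startsWith("?")) return null;
  const body = text.slice(1).trim();
  if (body.startsWith("#")) return { kind: "tag", tag: body.slice(1) };
  const recent = body.match(/^recent\s+(\d+)$/);
  if (recent) return { kind: "recent", count: Number(recent[1]) };
  return { kind: "semantic", query: body };
}
```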
The Web UI
In January I came across Zaid Alam's bookmarking tool. He built it for his own use case. I loved the minimal UI so I took some inspiration — and a screenshot — and prompted it straight into Claude Code.
It's not open source so I couldn't clone it. So I just fired more tokens and built the web UI myself using shadcn components.
The frontend is a TanStack Start app — SSR on first load, then SPA navigation. Two routes:
- Home — the bookmark library. Search bar, tag filters, pagination (25 or 50 per page), data grid with title, domain, date, and actions. Keyboard shortcuts for everything: `Cmd+K` to focus search, arrow keys to navigate, `Delete` to remove.
- Settings — WhatsApp QR code display, group selection dropdown, daily digest toggle with hour picker. Connect, disconnect, reconnect — all from the browser.
Auth is simple. Single password. Cookie-based session that lasts 30 days. Rate limited to 10 failed attempts per IP per 15 minutes. No OAuth, no user management. It's a personal tool.
Deployment
The whole thing ships as a Docker image. Multi-stage build — builder compiles everything, runner is Alpine with just the compiled output.
```
docker-compose up -d
```

One volume mount for /data — that's where the SQLite database and the WhatsApp auth session live. Survives restarts, easy to back up, easy to migrate.
There's also a one-liner install script for VPS setups. It checks for Docker, asks for your password, builds the image, and starts the container. I'm running mine on Oracle Cloud's Always Free ARM instance — 24GB RAM for a SQLite app is hilariously overkill but hey, it's free.
Health check pings /health every 30 seconds. Container auto-restarts unless you explicitly stop it.
Architecture at a Glance
```
WhatsApp (Baileys socket)
  QR auth ←→ session persistence ←→ message listener
      │ messages.upsert
      ▼
Link extraction pipeline
  Regex → AI text → AI vision (3-layer fallback)
      │
      ▼
SQLite via Drizzle: bookmarks │ tags │ app_settings (single .db file)
      │
      ├──▶ AI processing (async)
      │      Metadata (Cheerio), Summary (Gemini), Tags (Gemini), Embedding (Gemini)
      │      Retry cron: every 5 minutes
      │      └──▶ In-memory embedding cache (Float32Array, cosine similarity, ~2ms search)
      │
      └──▶ Hono API server
             /api/bookmarks (CRUD + search)   /api/whatsapp (QR + status)
             /api/settings (config)           /api/login (auth)
             └──▶ TanStack Start (React): SSR + SPA, shadcn/ui
                    Search │ Settings │ Auth
```

That's the story. Built out of frustration, unlocked by one good insight, and shipped with a lot of tokens.