Career Digital Twin
An AI-powered career assistant that turns a static CV into a live, queryable professional presence — answering questions about publications, research, skills, and experience with grounded, source-referenced responses.
RAG
Grounded Retrieval Architecture
∞
Multi-Turn Conversations
Live
Deployed on Hugging Face Spaces
The Problem
A static CV can't answer questions — and professionals answer the same ones repeatedly
Recruiters, collaborators, and conference organizers regularly ask the same career questions: what research has been published, which projects someone contributed to, which technical skills they hold, and which experience is most relevant to a specific role. A PDF resume can't respond, can't clarify, and can't engage. The professional either answers each query manually, again and again, or leaves questions unanswered. Neither scales. What's needed is a persistent, accurate, always-available representative that knows the career data and can answer on the professional's behalf.
The Solution
A RAG-powered agent that knows the career data and answers for you
DigitalTwin App ingests a structured resume, publications list, and project summaries, chunks and embeds them using OpenAI embeddings, and stores them in a ChromaDB vector store. When a user asks a question, the system retrieves the most semantically relevant career facts and passes them to a conversational agent built on the OpenAI SDK, which responds accurately, in context, and with grounded source references. The agent maintains multi-turn conversation history and is deployed as an interactive Gradio application on Hugging Face Spaces, publicly accessible and always available.
Key Outcome
A live, deployed career agent that turns a static CV into a queryable professional presence — answering questions about publications, research, skills, and experience with factual, grounded responses, deployed on Hugging Face Spaces and accessible to anyone with the link.
Technical Deep Dive
Architecture & Design
RAG Pipeline
Phase 1 — Ingestion
Data Sources
Career Data
CV · Publications · Project summaries · Skills
Processing
Chunking & Embedding
pandas structuring · OpenAI embeddings · ChromaDB vector store
Phase 2 — Query & Retrieval
User Input
Question via Gradio UI
Multi-turn conversation · Career-related queries
Semantic Retrieval
ChromaDB Lookup
Embeds query · Retrieves most relevant career facts
Phase 3 — Generation
Conversational Agent · OpenAI SDK
DigitalTwin Agent
Retrieved facts + conversation history → grounded, persona-matched response
Phase 4 — Deployment
Live Deployment
Hugging Face Spaces
Gradio UI · Publicly accessible · HF Secrets for API key management
Phase 1
Ingestion
Career data — CV, publications list, and project summaries — is loaded and structured using pandas, then chunked into semantically coherent passages. Each chunk is embedded using OpenAI embeddings and stored in a ChromaDB vector store, building the factual knowledge base the agent retrieves from at query time.
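The chunking step can be sketched as follows. This is an illustrative, hypothetical chunker (function name and window sizes are assumptions, not the app's actual implementation); in the real pipeline each chunk would then be sent to OpenAI's embedding endpoint and the resulting vectors stored in ChromaDB.

```python
def chunk_text(text: str, max_words: int = 120, overlap: int = 20) -> list[str]:
    """Split text into overlapping word-window chunks so that facts
    spanning a chunk boundary still appear intact in at least one chunk."""
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + max_words])
        if chunk:
            chunks.append(chunk)
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap keeps a publication title and its venue together even when the window boundary falls between them, which improves retrieval recall at a small storage cost.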
Phase 2
Query & Retrieval
When a user submits a question via the Gradio UI, the query is embedded using the same OpenAI embedding model and used to perform a semantic similarity search against the ChromaDB vector store. The most relevant career fact chunks are retrieved and passed to the agent as grounding context for its response.
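The similarity lookup can be illustrated with a toy in-memory version. In the deployed app ChromaDB performs this nearest-neighbour search over the persisted embeddings; the sketch below (hypothetical names, 2-D vectors for readability) shows the ranking it computes.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the query,
    mimicking the lookup a vector store performs at query time."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Because the query and the chunks share one embedding model, semantically related text lands close together in vector space even when no keywords overlap.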
Phase 3
Generation
The OpenAI SDK conversational agent receives the retrieved career facts alongside the full conversation history and a system prompt that defines the agent's persona. It generates a grounded, context-aware response that accurately represents the professional's background — maintaining tone and persona consistency across multi-turn conversations.
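A minimal sketch of how the pieces are assembled into one request, using the OpenAI chat message convention. The persona text, function name, and model are hypothetical placeholders, not the app's actual prompt.

```python
PERSONA_PROMPT = (
    "You are the professional's digital twin. Answer career questions "
    "using ONLY the provided context; if the context lacks the answer, say so."
)

def build_messages(question: str, history: list[dict], facts: list[str]) -> list[dict]:
    """Combine the persona prompt, retrieved career facts, prior turns,
    and the new question into a chat-completions message list."""
    context = "\n\n".join(facts)
    return (
        [{"role": "system", "content": f"{PERSONA_PROMPT}\n\nContext:\n{context}"}]
        + history
        + [{"role": "user", "content": question}]
    )

# The app would then call the OpenAI SDK with something like:
# response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
```

Placing the retrieved facts in the system message, while appending the full history verbatim, is what lets follow-up questions resolve pronouns like "that paper" across turns.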
Phase 4
Deployment
The full application is deployed on Hugging Face Spaces using the Gradio SDK. API keys are managed securely via HF Spaces Secrets — never exposed in code. The deployed app is publicly accessible via a persistent URL, enabling anyone to query the career agent directly without requiring local setup.
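The Spaces entry point can be sketched as below. The callback is stubbed and the title is a placeholder; the real app wires retrieval and the agent into `respond`, and `gr.ChatInterface` expects exactly this `(message, history)` callback signature.

```python
import os

def respond(message: str, history: list) -> str:
    """Chat callback in the shape Gradio's ChatInterface expects.
    Stubbed here; the deployed app runs retrieval plus the OpenAI agent."""
    return f"[stub reply to: {message}]"

def build_app():
    import gradio as gr  # imported lazily so the callback is testable without Gradio
    return gr.ChatInterface(fn=respond, title="Career Digital Twin")

if __name__ == "__main__":
    # On HF Spaces the API key secret is injected as an environment variable.
    assert os.environ.get("OPENAI_API_KEY"), "set OPENAI_API_KEY via Spaces Secrets"
    build_app().launch()
```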
Key Design Decisions
RAG grounds the agent in facts — not hallucinations
Without retrieval, a conversational LLM would generate plausible-sounding but potentially inaccurate career details. By grounding every response in semantically retrieved chunks from the actual career data, the agent is constrained to facts that exist in the knowledge base, making it a trustworthy representative rather than a plausible-sounding improviser.
System prompt persona customization shapes identity
The agent's system prompt defines not just what it knows but how it communicates — tone, level of formality, and the professional identity it represents. This makes the agent feel like a genuine extension of the professional rather than a generic chatbot responding to career queries.
ChromaDB enables fast local semantic search without infrastructure
ChromaDB runs as an embedded vector store — no external database server, no cloud vector infrastructure required. For a single-professional career knowledge base, this provides sufficient retrieval performance while keeping the deployment simple and the setup reproducible on any machine with a Python environment.
Tech Stack
| Technology | Purpose |
|---|---|
| OpenAI SDK | Conversational agent, embedding generation, and response synthesis |
| ChromaDB | Embedded vector store for semantic retrieval of career data chunks |
| Gradio | Interactive web UI with multi-turn conversation support |
| Hugging Face Spaces | Cloud deployment with persistent public URL |
| pandas | Career data loading, structuring, and preprocessing |
| python-dotenv / HF Secrets | Secure API key management for local and cloud deployment |
| Python | Core language and pipeline orchestration |
Results & Metrics
What the system delivers
RAG
Grounded Architecture
Every response grounded in retrieved career facts — no hallucinated credentials or invented history
∞
Multi-Turn Conversations
Full conversation history maintained across turns — context-aware follow-up questions supported
Live
Deployed on HF Spaces
Publicly accessible via persistent URL — no local setup required for users
Factually accurate career responses
RAG retrieval constrains the agent to facts present in the career knowledge base. Queries about publications, research experience, technical skills, and project history return grounded, source-referenced responses rather than plausible-sounding fabrications.
Context-aware multi-turn dialogue
The agent maintains full conversation history across turns — enabling follow-up questions, clarifications, and multi-step queries without losing context. A recruiter can ask about a publication, then ask for more detail, then ask about related skills, all in a single coherent conversation.
Publicly deployed and always available
Deployed on Hugging Face Spaces with a persistent public URL — accessible to recruiters, collaborators, and conference organizers at any time without requiring the professional to be present or respond manually.
Secure credential management
API keys are managed via python-dotenv locally and HF Spaces Secrets in production — never hardcoded or exposed in the repository. The same codebase runs securely in both local development and cloud deployment environments.
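The dual-environment pattern described above might look like this (function name hypothetical). The same code path serves both cases because python-dotenv merely populates `os.environ`, which is exactly where HF Spaces injects its secrets.

```python
import os

def load_api_key() -> str:
    """Resolve the OpenAI key from the environment.
    Locally, python-dotenv loads it from a .env file;
    on HF Spaces, the Secrets panel injects the same variable."""
    try:
        from dotenv import load_dotenv
        load_dotenv()  # no-op when no .env file is present (e.g. on Spaces)
    except ImportError:
        pass  # python-dotenv is optional in production
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY not set; add it to .env or Spaces Secrets")
    return key
```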