Atlas is an enterprise knowledge and retrieval platform designed to become the organization’s central intelligence layer for internal data discovery, retrieval-grounded assistants, and AI-native workflows. In its current stage, the platform unifies people profiles, structured skill inventories, CV documents, and engineering standards into a shared searchable corpus, exposed through APIs and retrieval services that both applications and LLM-powered assistants can consume. The system is evolving beyond a traditional search backend into an agent-ready knowledge layer: a foundation for grounded enterprise copilots that can reason over internal data, choose the right retrieval tools, and answer with traceable, source-backed evidence.

At the core of Atlas is a hybrid retrieval architecture that combines classical lexical search with semantic vector search. On the lexical side, the platform uses BM25-style keyword ranking, whose inverse document frequency (IDF) weighting prioritizes rare, high-signal terms and preserves precision for exact technology, skill, role, and policy queries. On the semantic side, OpenAI embeddings and pgvector enable similarity search over normalized chunks of structured and unstructured content, improving recall for natural-language questions and conceptually related matches. Together, the two sides give Atlas predictable relevance for exact terms and robust semantic discovery for fuzzy or conversational queries.

The platform is built for practical enterprise use cases: finding specialists by stack or capability, exploring staff expertise across CV and skills data, retrieving policy or standards guidance, matching talent against requirements, and powering retrieval-augmented generation (RAG) scenarios for internal assistants.
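The hybrid retrieval idea described above can be illustrated with a small, self-contained sketch. It is not Atlas code: the text does not say how lexical and semantic results are combined, so reciprocal rank fusion (RRF) is used here purely as an assumption. The function names, the toy documents, and the three-dimensional vectors are all hypothetical; in the real system the vectors would come from OpenAI embeddings and the similarity search would run in pgvector.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # IDF gives rare terms a larger weight than common ones.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank)."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


def hybrid_search(query_tokens, query_vec, docs, vectors):
    """Rank document indices by fusing a lexical and a semantic ranking."""
    lex = bm25_scores(query_tokens, docs)
    # Only documents with a lexical hit enter the lexical ranking.
    lexical_rank = [i for i in sorted(range(len(docs)), key=lambda i: -lex[i])
                    if lex[i] > 0]
    sem = [cosine(query_vec, v) for v in vectors]
    semantic_rank = sorted(range(len(docs)), key=lambda i: -sem[i])
    return rrf_fuse([lexical_rank, semantic_rank])


# Toy corpus (hypothetical): tokenized chunks and made-up embedding vectors.
DOCS = [
    ["senior", "python", "engineer", "fastapi"],
    ["java", "backend", "developer"],
    ["data", "scientist", "python", "numpy"],
]
VECTORS = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.7, 0.2, 0.6]]

ranked = hybrid_search(["fastapi", "engineer"], [1.0, 0.0, 0.1], DOCS, VECTORS)
```

In this toy run the first document is the only lexical match and also the closest vector, so it ranks first. The other two documents are ordered by semantic similarity alone, which is the recall benefit the semantic side is meant to provide.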
The new assistant runtime introduces agentic orchestration on top of this retrieval layer: a single assistant endpoint can dynamically choose among dataset-bound tools, fetch authoritative source content, and produce grounded responses with citations instead of relying on prompt-only reasoning. This makes Atlas not just a search system but an emerging agent platform for trusted internal knowledge operations.

Atlas is also structured for growth. Source data can be refreshed, normalized into corpora, chunked, indexed, embedded, and served through a modular FastAPI-based runtime, with observability built in through Langfuse and OpenTelemetry.

Today, the active knowledge domains are people and standards. Over time, the system is intended to expand into a broader enterprise knowledge fabric spanning additional business domains, automation surfaces, and multi-agent workflows. In that sense, Atlas is the retrieval backbone for the company’s future AI stack: hybrid search, vector-native indexing, agentic tool use, and trustworthy LLM integration at enterprise scale.
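The pattern of dataset-bound tools and cited answers can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Atlas runtime and not the PydanticAI API. In a real agent the LLM would pick the tool, whereas here a simple keyword overlap stands in for that decision. Every name, tool, and record below (`Tool`, `Evidence`, `search_people`, `search_standards`, and the toy corpora) is hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Evidence:
    source_id: str  # identifier that a citation points back to
    text: str


@dataclass(frozen=True)
class Tool:
    name: str
    dataset: str
    keywords: frozenset[str]  # stand-in for the LLM's tool-selection logic
    search: Callable[[str], list[Evidence]]


def tokens(text):
    """Lowercased alphanumeric word set of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def make_search(corpus):
    """Build a naive search function bound to one dataset."""
    def search(question):
        words = tokens(question)
        return [e for e in corpus if words & tokens(e.text)]
    return search


# Toy corpora (hypothetical data, for illustration only).
PEOPLE = [
    Evidence("people/ada", "Ada: Python, FastAPI, PostgreSQL"),
    Evidence("people/lin", "Lin: Java, Kotlin"),
]
STANDARDS = [
    Evidence("standards/code-review", "Every change needs one code review approval"),
    Evidence("standards/logging", "Services emit JSON logs"),
]

TOOLS = [
    Tool("search_people", "people",
         frozenset({"who", "skills", "engineer", "cv", "specialist"}),
         make_search(PEOPLE)),
    Tool("search_standards", "standards",
         frozenset({"policy", "standard", "standards", "guideline"}),
         make_search(STANDARDS)),
]


def choose_tool(question, tools):
    """Pick the tool whose keywords overlap the question most."""
    words = tokens(question)
    return max(tools, key=lambda t: len(words & t.keywords))


def answer(question, tools):
    """Return an answer built only from retrieved evidence, with citations."""
    tool = choose_tool(question, tools)
    evidence = tool.search(question)
    if not evidence:
        return {"tool": tool.name, "answer": "No supporting source found.",
                "citations": []}
    parts = [f"{e.text} [{i}]" for i, e in enumerate(evidence, start=1)]
    return {"tool": tool.name, "answer": " ".join(parts),
            "citations": [e.source_id for e in evidence]}


people_result = answer("Who has Python skills?", TOOLS)
policy_result = answer("Which policy covers code review?", TOOLS)
```

The design point is that the answer text is assembled only from retrieved evidence and each numbered marker maps to a source identifier, so a response with no evidence is reported as such rather than improvised.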

Python 3.12
FastAPI
Uvicorn
OpenAI SDK
PydanticAI (agents and tools for the LLM)
Langfuse
OpenTelemetry
PostgreSQL
pgvector
Psycopg
SQLAlchemy
NumPy
Tiktoken
PyPDF
Azure Data Factory
Azure Storage Blob SDK
Pydantic Settings
Loguru
uv
Docker Compose
pytest
Ruff