AI Toolbox
The tools we actually build with.
Our curated stack. The platforms, models, engines, frameworks, and methodologies we draw from when we design and deploy AI systems. Not exhaustive. Opinionated. If it is here, we have used it or read it, and we stand behind it.
AI Stack
AI Models
The underlying neural networks. Models are trained artifacts. Engines are the tools you use to interact with them.
- Claude↗ Frontier
Anthropic's flagship LLM family. Strong reasoning, long context, low hallucination rate.
- GPT (OpenAI)↗ Frontier
The GPT family. Widest ecosystem, mature tooling, highest cost at production scale.
- Gemini↗ Frontier
Google DeepMind's multimodal frontier family. Deep Google Cloud integration.
- Grok (xAI)↗ Frontier
xAI's frontier model. Fast iteration, tight X-platform integration.
- Cohere Command↗ Enterprise
Enterprise-focused frontier models with strong retrieval and multilingual capability. Self-hostable at scale.
- AI21 Jamba↗ Hybrid arch
Hybrid Mamba-Transformer architecture. Extremely long context with efficient throughput.
- Reka↗ Multimodal
Native multimodal models with image, video, and audio understanding built in.
- Llama↗ Open weight
Meta's open-weight family. Self-hostable, commercially usable, community-backed.
- Mistral↗ Open weight
European open-weight models. Strong performance-to-size ratios, commercial and Apache options.
- DeepSeek↗ Open weight
Chinese frontier-class open-weight models. Aggressive price-performance, MoE architectures.
- Qwen↗ Open weight
Alibaba's multilingual model family. Strong non-English performance, permissive licensing.
- Gemma↗ Google open
Google's open-weight model family. Built from the same research as Gemini, runnable on consumer hardware.
- Phi↗ Microsoft open
Microsoft's small, capable open models. Tuned for on-device and edge workloads.
- NVIDIA Nemotron↗ NVIDIA
NVIDIA's open model family. Strong for building agentic systems on NVIDIA infrastructure.
AI Engines: Web
Browser-based chat and search interfaces. The public on-ramp for most AI use.
- ChatGPT↗ Web
OpenAI's flagship web interface. Still the default on-ramp for most people using AI.
- Claude.ai↗ Web
Anthropic's web client for Claude. Projects, Artifacts, and file-first workflows.
- Google Gemini↗ Web
Google's consumer chat surface for Gemini. Deep integration with Workspace and Search.
- Grok↗ Web
xAI's chat interface. Real-time access to the X firehose for current-events queries.
- Perplexity↗ Search
AI-powered search with inline citations. A Google replacement for many technical queries.
- Poe↗ Multi-model
Quora's multi-model chat client. One interface, many models, useful for comparison.
AI Engines: Desktop
Native applications. Better OS integration, local-model support, MCP tool access.
- Claude Desktop↗ Mac/Win
Native Claude app. First-class MCP server support. The reference client for local tool access.
- ChatGPT Desktop↗ Mac/Win
Native ChatGPT app. Tighter OS integration for voice and screen capture workflows.
- LM Studio↗ Local models
Desktop engine for running open-weight models locally. Chat UI + server in one app.
- Msty↗ Local + cloud
Desktop chat that handles both local and cloud models in one unified interface.
AI Engines: Command Line
Terminal-native agents. The engine of choice for developers and automation workflows.
- Claude Code↗ Anthropic
Anthropic's agentic CLI. Autonomously edits code, runs commands, completes multi-step tasks.
- OpenAI Codex CLI↗ OpenAI
OpenAI's open source coding agent. Runs locally with GPT models for code generation and edits.
- Gemini CLI↗ Google
Google's open source terminal agent for Gemini. Code, reasoning, and tool use from the CLI.
- Aider↗ Open source
Git-aware pair-programming CLI. Model-agnostic, strong for diff-based edits to large codebases.
- GitHub Copilot CLI↗ GitHub
Copilot in the terminal. Suggests shell commands and explains existing ones.
- Goose↗ Block
Block's open source CLI agent. MCP-native, extensible, strong for developer automation.
AI Engines: IDE Integration
AI built into the editor. Inline completion, chat panels, and multi-file agentic edits.
- GitHub Copilot↗ VS Code + JetBrains
The original inline code assistant. Chat, edits, agents, and the widest IDE coverage.
- Cursor↗ Fork of VS Code
AI-native editor with multi-file edits, agents, and tight inline context. Default for many teams.
- Windsurf↗ Codeium
Codeium's agentic IDE. Strong Cascade agent for multi-file and cross-repo work.
- Cline↗ VS Code ext
Open source agentic coding extension. Executes commands, edits files, runs checks inline.
- Continue↗ Open source
Open source Copilot alternative. Bring your own model, full control over context and rules.
- Zed↗ Native editor
Rust-built editor with first-class AI assistant, inline prediction, and agent panel.
- Cody↗ Sourcegraph
Sourcegraph-powered AI with deep codebase context through their indexing infrastructure.
- JetBrains AI Assistant↗ JetBrains
Native AI across IntelliJ, PyCharm, WebStorm, and the rest of the JetBrains family.
AI Service Providers
Hosted inference APIs. Where you actually call the models programmatically.
- Anthropic API↗ Direct
Direct Claude access. Best-in-class reasoning for governed production workloads.
- OpenAI API↗ Direct
GPT + Whisper + embeddings + fine-tuning. The most mature LLM API on the market.
- AWS Bedrock↗ Hyperscale
Unified API for Claude, Llama, Mistral, Titan. Sits cleanly inside existing AWS governance.
- Azure OpenAI↗ Hyperscale
OpenAI models on Microsoft's enterprise compliance and identity stack.
- Google Vertex AI↗ Hyperscale
Gemini + third-party models with full GCP integration. Strong MLOps tooling.
- OpenRouter↗ Gateway
Unified API across every major provider. Swap models without changing integration code.
- Groq↗ Fast inference
Custom LPU hardware for extremely low-latency open-model inference.
- Fireworks AI↗ Fast inference
Optimized hosting for open-weight models with aggressive throughput and fine-tuning support.
- Together AI↗ Open models
Hosted open-weight models with fine-tuning. Cheap scale for non-frontier workloads.
- Replicate↗ Open models
One-click hosted inference for open models. Strong for image and audio workloads.
- DeepInfra↗ Open models
Serverless inference for open-weight models. Pay-per-token with no capacity commitment.
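Most of the providers above expose an OpenAI-style chat-completions shape, which is what makes gateways like OpenRouter possible: swap the base URL and model string, leave the integration code alone. A minimal sketch of that idea, building the request without sending it; the model names and the Together entry here are illustrative assumptions, so check each provider's docs before relying on them:

```python
# Sketch: provider-agnostic request construction. Base URLs and model names
# below are illustrative assumptions, not a verified catalog.
PROVIDERS = {
    "openrouter": {"base_url": "https://openrouter.ai/api/v1",
                   "model": "anthropic/claude-sonnet-4"},
    "together":   {"base_url": "https://api.together.xyz/v1",
                   "model": "meta-llama/Llama-3-70b-chat-hf"},
}

def build_chat_request(provider: str, user_message: str) -> dict:
    """Build an OpenAI-style chat-completions request for the given provider."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "json": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": user_message}],
        },
    }

req = build_chat_request("openrouter", "Summarize this release note.")
```

Switching to a different provider is one string change in the call, which is exactly the portability argument for standardizing on this request shape.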
Local Runtimes
Run models on your hardware. For privacy, cost, or compliance.
- Ollama↗ Local
Run open-weight LLMs locally. Fastest path from zero to a working private-model environment.
- LM Studio↗ Local
Desktop GUI for local model testing. Useful for evaluation and private workflows.
- llama.cpp↗ Engine
C++ inference engine underneath most local LLM tooling. Runs on anything that boots.
- vLLM↗ Server
High-throughput production inference server with PagedAttention. Industry default for self-hosting.
- Text Generation Inference↗ Server
Hugging Face's production serving stack. First-class support for any HF model.
Orchestration
Multi-step reasoning, chains, and stateful LLM applications.
- LangGraph↗ Graph
Stateful, graph-based orchestration. Fine-grained control over how agents reason and act.
- LangChain↗ Framework
The original LLM app framework. Chains, retrievers, tools, memory. The batteries-included path.
- LlamaIndex↗ RAG-first
Data framework for RAG and agents. Strong for document-heavy and knowledge-graph workflows.
- Vercel AI SDK↗ TypeScript
TypeScript-first SDK for building AI UX. Streaming, tools, agents, and provider abstraction.
- Semantic Kernel↗ .NET / Py
Microsoft's orchestration SDK. First-class across C#, Python, and Java.
- DSPy↗ Compiler
Declarative prompting. Compile programs, not prompts. Optimize end-to-end against evals.
- Haystack↗ RAG
Open source framework for search, RAG, and agent pipelines with strong pre-built components.
- Flowise↗ Low-code
Open source drag-and-drop builder for LLM flows. Fast prototyping, deployable API endpoints.
- Langflow↗ Visual
Visual LangChain builder. Good fit when non-engineers need to collaborate on agent flows.
- n8n↗ Workflow
General-purpose workflow automation with strong AI node support. Self-hostable.
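Underneath every framework in this section is the same pattern: small steps composed into a pipeline, with state passed between them. A toy sketch of that chain idea, with a stubbed function standing in for the actual model call (everything here is illustrative, not any framework's real API):

```python
# Minimal "chain": prompt construction -> model call -> output parsing.
def make_prompt(question: str) -> str:
    return f"Answer concisely: {question}"

def fake_llm(prompt: str) -> str:
    # Stand-in for a provider call; a real chain invokes an API here.
    return f"[model output for: {prompt}]"

def parse(raw: str) -> dict:
    return {"answer": raw.strip()}

def run_chain(question: str, steps=(make_prompt, fake_llm, parse)):
    value = question
    for step in steps:
        value = step(value)   # each step consumes the previous step's output
    return value

result = run_chain("What is RAG?")
```

The frameworks above add what this sketch lacks: retries, streaming, memory, tool routing, and observability hooks around each step.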
Agent Frameworks
Autonomous or semi-autonomous agents that plan and act.
- Claude Agent SDK↗ Anthropic
Anthropic's agent SDK. Production primitives for building and deploying Claude-powered agents.
- OpenAI Agents SDK↗ OpenAI
OpenAI's official agent framework. Handoffs, tool use, guardrails, tracing. Successor to Swarm.
- Microsoft 365 Agents SDK↗ Microsoft
Microsoft's cross-platform SDK for building agents. Successor to Bot Framework, works across Teams, Copilot, web.
- Copilot Studio Agent SDK↗ Microsoft
SDK for building agents that plug into Copilot Studio and the broader Microsoft 365 Copilot ecosystem.
- Google Agent Development Kit (ADK)↗ Google
Google's open source multi-agent framework. First-class support for the A2A protocol and Gemini.
- CrewAI↗ Multi-agent
Role-based agent teams that collaborate on complex tasks. Fast path to useful multi-agent setups.
- AutoGen↗ Microsoft Research
Microsoft Research's multi-agent framework. Strong academic and production lineage.
- Pydantic AI↗ Typed Python
Type-safe agent framework from the Pydantic team. Production-ready, FastAPI-like ergonomics.
- Mastra↗ TypeScript
TypeScript-native agent framework. Strong for building agents alongside web apps in the JS ecosystem.
- Letta↗ Memory
Formerly MemGPT. Stateful agents with explicit long-term memory and self-editing context.
- smolagents↗ Hugging Face
Minimalist agent framework from Hugging Face. Code-first agents with small dependency footprint.
- OpenHands↗ Coding agent
Formerly OpenDevin. Fully autonomous software engineering agent running in a sandboxed environment.
- Griptape↗ Python
Python framework for building agents with structured memory, tool use, and workflows.
- Agno↗ Lightweight
Formerly Phidata. Lightweight agent framework with Python-native ergonomics.
- Goose↗ CLI
Block's open source CLI agent. Extensible through MCP, strong for developer workflows.
- Strands Agents↗ AWS
AWS-backed open source agent framework. Model-driven loop, first-class Bedrock and AWS integration.
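Every framework in this list implements some version of the same loop: the model plans a tool call, the runtime executes it, and the observation feeds the next decision. A deliberately tiny sketch of that loop with the planning step stubbed out (real frameworks get the tool choice from the LLM, not a keyword match):

```python
# Toy agent step: plan -> act -> observe. Tool selection is a stub here.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def fake_plan(task: str):
    # Stand-in for the model choosing a tool and arguments.
    if "sum" in task:
        return ("add", (2, 3))
    return ("upper", (task,))

def run_agent_step(task: str) -> dict:
    tool, args = fake_plan(task)          # 1. plan
    observation = TOOLS[tool](*args)      # 2. act
    return {"tool": tool, "result": observation}  # 3. observe / respond

trace = run_agent_step("sum two numbers")
```

What distinguishes the frameworks above is everything around this loop: memory, guardrails, handoffs between agents, and tracing of each step.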
Decision Platforms
Enterprise AI platforms with ontology and action layers baked in.
- Palantir AIP↗ Decision
Ontology-anchored AI decision platform. LLMs with governed access to enterprise data, logic, and actions.
- Palantir Foundry↗ Data OS
The operational system beneath AIP. Integration, ontology, and workflow layer for the enterprise.
- Databricks Mosaic AI↗ Lakehouse AI
AI built on top of the lakehouse. Strong fit when data already lives in Databricks.
Protocols and Standards
The wire formats and contracts AI systems communicate over.
- Model Context Protocol (MCP)↗ Anthropic
Open standard for giving AI governed, contextual access to tools, data, and systems.
- OpenAI Function Calling↗ OpenAI
The original tool-use API. Still the widest-deployed spec for LLM function invocation.
- A2A Protocol↗ Google
Agent-to-Agent communication protocol. Emerging standard for agent-ecosystem interop.
- OpenAPI↗ Spec
Not AI-specific, but the lingua franca for describing APIs that agents need to call.
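MCP and function calling share the same underlying contract: the runtime declares a tool with a JSON schema, the model emits a structured call, and a dispatcher validates and executes it. A stripped-down sketch of that contract; the tool name, schema, and stub implementation are all illustrative:

```python
import json

# The tool-use contract in miniature: a declared schema the model sees,
# and a dispatcher that executes the model's structured call.
TOOL_SCHEMA = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    # Stubbed implementation; a real tool would call a weather API.
    return f"Weather for {city}: sunny"

def dispatch(tool_call_json: str) -> str:
    call = json.loads(tool_call_json)            # what the model emits
    assert call["name"] == TOOL_SCHEMA["name"]   # route to the right tool
    return get_weather(**call["arguments"])

out = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

The protocols differ in transport and governance, not in this core shape, which is why OpenAPI-described services map onto agent tools so cleanly.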
Generative Media
Image Generation
Text and image conditioning to produce still imagery.
- Midjourney↗ Image
Discord and web-native image generation. Still the aesthetic benchmark for stylized output.
- DALL-E↗ OpenAI
OpenAI's image model. Tight integration with ChatGPT, strong at prompt comprehension.
- Stable Diffusion↗ Open weight
The open-weight foundation for most self-hosted image workflows. SDXL, SD3, etc.
- Flux↗ Frontier
Black Forest Labs' model family. Currently the strongest open-weight image model, built by researchers behind the original Stable Diffusion.
- Imagen↗ Google
Google DeepMind's photorealism-focused image family, available through Vertex AI.
- Ideogram↗ Typography
Best-in-class for in-image typography and coherent text rendering.
- Adobe Firefly↗ Commercial-safe
Adobe's commercially safe generative stack. Trained on licensed content, deep Creative Cloud integration.
- Leonardo.ai↗ Workflow
Image-focused workflow platform with fine-tuning, canvas editing, and production controls.
Video Generation
Text, image, and video-to-video models for motion output.
- Runway↗ Pro video
Gen-3 and Gen-4 video models with a professional editing workflow. Used in actual commercial production.
- Google Veo↗ Google
DeepMind's text-to-video model. Veo 3 delivers high-fidelity output with native audio generation.
- Sora↗ OpenAI
OpenAI's text and image-to-video model. Strong for longer, coherent scenes with complex motion.
- Pika↗ Consumer
Fast, playful text-to-video with strong effects library. Popular for short-form and social content.
- Kling↗ Kuaishou
Chinese text-to-video model with strong realism and long clip support.
- Luma Dream Machine↗ Luma
Text and image-to-video with fast iteration. Good at camera motion and scene dynamics.
- Hailuo↗ MiniMax
MiniMax's video model. Strong character consistency, competitive on cost per clip.
Audio and Music
Voice synthesis, song generation, sound design.
- Suno↗ Music
End-to-end song generation with vocals and instruments. The most capable music model available.
- Udio↗ Music
Competitor to Suno with strong audio fidelity and genre flexibility.
- ElevenLabs↗ Voice + TTS
Industry-standard voice synthesis and cloning. Production-ready voice agents and dubbing.
- Stable Audio↗ Stability
Stability AI's audio generation. Open weights available, strong for sound design and loops.
- Google Lyria↗ Google
DeepMind's music model. Available through MusicFX and Vertex AI.
- MusicGen↗ Meta
Meta's open-weight music generation model. Self-hostable for private workflows.
- AudioGen↗ Meta
Meta's open-weight ambient and effects audio model. Environmental sound from prompts.
3D Generation
Meshes, environments, and spatial assets from prompts.
- Meshy↗ 3D models
Text and image to 3D mesh generation. Game-dev and visualization-ready outputs.
- Luma Genie↗ Luma
Luma's 3D generator. Strong for 3D assets from photos or text prompts.
- Rodin↗ Deemos
High-fidelity 3D generation focused on characters and organic models.
- Blockade Labs↗ Skyboxes
AI-generated 360° skyboxes and environments. Used widely in game and VR workflows.
- Tripo3D↗ 3D models
Fast text and image-to-3D generation with PBR materials for real-time rendering.
Creative Platforms
Workflow tools and UIs that wrap generative models.
- ComfyUI↗ Node-based
Open source node-based workflow builder for Stable Diffusion and beyond. Industry standard for serious image pipelines.
- Automatic1111↗ Open source
The original Stable Diffusion web UI. Massive extension ecosystem, still widely deployed.
- Krea↗ Creative
Real-time image generation with a canvas-native creative workflow. Strong for iteration.
- Freepik AI Suite↗ All-in-one
Unified interface across multiple gen models. Good for comparing outputs quickly.
- Higgsfield↗ Motion
Cinematic video generation with camera-motion presets. Tuned for motion-design work.
Data and Retrieval
Vector Databases
Similarity search engines for embeddings.
- Pinecone↗ Managed
Managed vector database for RAG. Low-latency similarity search at production scale.
- Weaviate↗ Open source
Vector database with hybrid search, schema-awareness, and graph capabilities.
- Qdrant↗ Open source
Rust-based vector engine. Fast, lightweight, strong for self-hosted deployments.
- Chroma↗ Open source
Dev-friendly open source vector database. Zero-config start, embeddable.
- Milvus↗ Scale
Cloud-native vector database built for billion-scale workloads.
- LanceDB↗ Embedded
Embedded serverless vector database. Columnar, versioned, strong for multimodal workloads.
- pgvector↗ Postgres
Vector search extension for Postgres. Often the right answer when Postgres is already in play.
- Redis Vector Search↗ Redis
Vector search in Redis. Useful when Redis is already the cache or session store in your stack.
- MongoDB Atlas Vector Search↗ MongoDB
Native vector search in MongoDB Atlas. Best fit when document data is already in Mongo.
- Elasticsearch↗ Elastic
Dense vector + hybrid search in Elasticsearch. Strong when combined with full-text at scale.
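Every engine above optimizes the same core operation: similarity between a query embedding and a stored corpus. A brute-force sketch with hand-made three-dimensional vectors (real embeddings have hundreds of dimensions, and real engines use ANN indexes like HNSW or IVF instead of scanning):

```python
import math

# Cosine similarity and brute-force nearest-neighbor lookup.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "index": document id -> embedding vector.
corpus = {
    "doc_pricing":  [0.9, 0.1, 0.0],
    "doc_security": [0.1, 0.9, 0.2],
}

def top_match(query_vec):
    return max(corpus, key=lambda doc: cosine(query_vec, corpus[doc]))

best = top_match([0.85, 0.15, 0.05])  # a query "near" the pricing doc
```

The product decision in this section is rarely about the math; it is about where the index lives, how it scales, and whether it sits next to data you already operate.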
Knowledge Graphs
Structured representation of entities and relationships.
- Neo4j↗ Property graph
The industry standard graph database. Strong tooling, mature ecosystem, Cypher query language.
- TigerGraph↗ Analytics
High-performance analytics graph. Strong for deep-link analysis and fraud detection.
- ArangoDB↗ Multi-model
Graph, document, and key-value in one engine. Useful when graph is part of a broader pattern.
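At their smallest, knowledge graphs are subject-predicate-object triples plus traversal. A toy sketch of that representation and a one-hop neighbor query; property-graph engines like Neo4j add types, indexes, and a query language (Cypher) on top of this idea. The entities here are invented examples:

```python
# A graph as a list of (subject, predicate, object) triples.
TRIPLES = [
    ("acme_corp", "owns", "plant_7"),
    ("plant_7", "located_in", "ohio"),
    ("acme_corp", "supplies", "beta_inc"),
]

def neighbors(entity: str):
    """All (predicate, object) pairs where entity is the subject."""
    return [(p, o) for s, p, o in TRIPLES if s == entity]

links = neighbors("acme_corp")
```

This structure is what gives an LLM relational context: "what does acme_corp own, and where is it?" becomes two cheap traversals instead of a fuzzy retrieval.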
Data Infrastructure
Storage, transformation, and movement of data at scale.
- Snowflake↗ Warehouse
Cloud data warehouse with strong data sharing, governance, and now native AI functions.
- Databricks↗ Lakehouse
Unified analytics and AI platform on top of open storage formats (Delta, Iceberg).
- dbt↗ Transform
SQL-based data transformation with version control and tests. The default for modern ELT.
- Airbyte↗ Ingestion
Open source data ingestion. Large connector catalog, self-hostable.
- Apache Kafka↗ Streaming
The streaming event backbone most production architectures eventually need.
- Apache Iceberg↗ Table format
Open table format for large analytic datasets. The neutral ground between warehouses.
Data Governance and Catalogs
Lineage, classification, access, audit.
- Collibra↗ Catalog
Enterprise data catalog and governance platform. Default in regulated industries.
- Atlan↗ Modern catalog
Modern data catalog with strong lineage, collaboration, and API-first design.
- OpenLineage↗ Open standard
Open standard for collecting data lineage from pipelines. Vendor-neutral.
- Unity Catalog↗ Databricks
Databricks-native governance layer. Fine-grained access control across data and AI assets.
Infrastructure
Cloud Providers
Where the workloads run.
- AWS↗ Hyperscale
The broadest cloud surface area. Strong for heterogeneous enterprise architectures.
- Azure↗ Hyperscale
Default cloud for Microsoft-aligned enterprises. Tight identity and M365 integration.
- Google Cloud↗ Hyperscale
Strongest data and AI primitives. Gemini, BigQuery, Vertex AI all first-class.
- Oracle Cloud Infrastructure↗ Enterprise
Fits Oracle-heavy shops. Strong database integration and competitive egress pricing.
- IBM Cloud↗ Enterprise
WatsonX AI and regulated-industry positioning. Hybrid and mainframe-adjacent workloads.
GPU Compute and Model Hosting
Raw GPU compute and specialized inference hosting. Where generative and custom model workloads actually run.
- RunPod↗ GPU cloud
On-demand and serverless GPU compute. One of the cheapest paths to run custom model workloads.
- Fal.ai↗ Fast inference
Optimized inference for generative models. Sub-second image and video generation at API scale.
- Modal↗ Serverless
Serverless Python functions with GPU support. Clean path from notebook to production.
- Baseten↗ Model deploy
Model deployment platform. Strong for serving custom and fine-tuned open-weight models.
- CoreWeave↗ GPU infra
GPU-specialized cloud infrastructure. Favored for large-scale training and inference.
- Lambda↗ GPU cloud
GPU cloud for training and inference. Competitive pricing, direct NVIDIA partnerships.
Containers, Orchestration, IaC
Packaging, deploying, and describing systems declaratively.
- Kubernetes↗ Orchestration
Container orchestration. The substrate most production AI workloads end up running on.
- OpenShift↗ Enterprise K8s
Red Hat's enterprise Kubernetes distribution. Default in regulated and hybrid-cloud shops.
- Docker↗ Containers
Container runtime and image format. Still the baseline for packaging workloads.
- Terraform↗ IaC
Infrastructure as code. The baseline for reproducible, version-controlled cloud environments.
- Ansible↗ Config
Agentless configuration management. Still the fastest way to automate existing systems.
- ArgoCD↗ GitOps
Declarative, Git-driven continuous deployment for Kubernetes.
- Helm↗ K8s
Package manager for Kubernetes. Charts are how most production apps get templated.
Application Performance Monitoring
General-purpose APM and observability. Traces, metrics, logs across applications and infrastructure.
- SigNoz↗ Open source
OpenTelemetry-native open source APM. Traces, metrics, and logs in one UI. Self-hostable alternative to Datadog.
- Datadog↗ Enterprise
Industry-standard SaaS observability. Broad coverage across infrastructure, APM, logs, and security.
- New Relic↗ Enterprise
Long-established APM and full-stack observability. Consumption-based pricing across telemetry types.
- Grafana + Prometheus↗ Open source
The de facto open source stack for metrics, dashboards, and alerting. Runs anywhere.
- Honeycomb↗ High-cardinality
Event-oriented observability for complex distributed systems. Strong for debugging production unknowns.
- Elastic APM↗ ELK
APM inside the Elastic stack. Fits naturally when logs and search already live in Elasticsearch.
- Dynatrace↗ Enterprise
AI-powered enterprise observability. Strong auto-instrumentation and root-cause analysis.
- Sentry↗ Errors + Perf
Error tracking and performance monitoring focused on application code. The default for front-end and app errors.
- OpenTelemetry↗ Open standard
The vendor-neutral telemetry standard. Instrument once, export anywhere. Backbone of modern observability.
LLM Observability and Evals
Measurement and monitoring specific to LLM workloads. Prompt tracing, evaluations, and cost tracking.
- LangSmith↗ Tracing + Evals
LangChain's observability and evaluation platform. Tight integration with their stack.
- Langfuse↗ Open source
Open source LLM engineering platform. Tracing, evals, prompt management.
- Braintrust↗ Evals
Evaluation-first platform for AI products. Strong for iteration speed on production prompts.
- Arize Phoenix↗ Open source
Open source observability for LLMs. Visual traces, evaluations, datasets.
- Helicone↗ Gateway
LLM gateway with logging, caching, and cost tracking built in. Drop-in proxy.
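The evaluation loop these platforms productionize is simple at its core: run a fixed dataset through the system under test and score outputs against expectations. A minimal sketch with a stubbed model and exact-match scoring (real evals add LLM-as-judge scoring, versioned datasets, and trend tracking):

```python
# Minimal eval harness: dataset in, pass/fail report out.
DATASET = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def system_under_test(prompt: str) -> str:
    # Stand-in for the real model or pipeline being evaluated.
    return {"2+2": "4", "capital of France": "Paris"}[prompt]

def run_evals(dataset):
    passed = sum(
        1 for case in dataset
        if system_under_test(case["input"]) == case["expected"]
    )
    return {"passed": passed, "total": len(dataset), "score": passed / len(dataset)}

report = run_evals(DATASET)
```

Wiring this loop into CI is what turns "the model seems fine" into a regression gate.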
Practices
Methodologies
The patterns we apply. Vendor-neutral thinking that outlives any specific tool.
- Ontology Design
Formal modeling of business objects, relationships, and logic. The anchor for every downstream AI system.
- Retrieval-Augmented Generation (RAG)
LLM responses grounded in retrieved context from your own data. The baseline pattern for enterprise AI.
- Agentic Orchestration
Multi-step reasoning where AI plans, calls tools, evaluates results, and iterates. Beyond single-shot prompts.
- Human-in-the-Loop (HITL)
AI proposes, humans approve, systems execute. Accountability without slowing the flywheel.
- Purpose-Based Access Control
Access scoped by role, data classification, and intent. Not just who. Also why.
- Model Evaluations
Systematic measurement of model behavior over time. The difference between production AI and a hopeful pilot.
- Semantic Data Classification
Automatic tagging of data by type, sensitivity, and meaning. Makes governance scale past manual review.
- GitOps and Infrastructure as Code
Git as the source of truth for infrastructure and deployment. Auditable, reproducible, rollback-able.
- Knowledge Graphs
Relational representation of entities and their links. Gives AI structural context beyond bag-of-words retrieval.
- MCP Tool Design
Wrapping existing business logic, APIs, and data as callable tools for AI agents through MCP servers.
- Fine-tuning vs RAG
A diagnostic framing: fine-tune for style and format, retrieve for facts. Conflating them burns budget.
- Guardrails and Policy Enforcement
Deterministic checks around non-deterministic models. Input filtering, output validation, scope enforcement.
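The guardrails pattern in the last item is worth making concrete: deterministic validation wrapped around whatever the model emits, before anything downstream executes. A sketch under invented rules (the action names and checks are illustrative, not a product's API):

```python
import json
import re

# Deterministic checks on non-deterministic output: parse, scope-check,
# and screen before the result reaches downstream systems.
ALLOWED_ACTIONS = {"create_ticket", "send_summary"}

def validate_output(raw: str) -> dict:
    """Reject anything that is not well-formed JSON with an in-scope action."""
    data = json.loads(raw)                       # must parse at all
    if data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"action out of scope: {data.get('action')}")
    if re.search(r"<script", raw, re.IGNORECASE):  # crude content screen
        raise ValueError("unsafe content")
    return data

ok = validate_output('{"action": "create_ticket", "title": "GPU quota"}')
```

The same shape carries the HITL and purpose-based access patterns above: the model proposes, deterministic code decides whether the proposal is even eligible for approval.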
Reading
Specifications, Essays, and References
Outside thinking worth your time.
- Palantir AIP platform overview↗ Platform
How Palantir frames the decision platform: ontology, tools, actions, scenarios, guardrails.
- MCP specification↗ Spec
The Anthropic-authored open standard for AI-to-system integration. Read it before you build.
- Anthropic Engineering Blog↗ Research
Practical guidance from the team building Claude. Strong on agents and evaluation patterns.
- OpenAI Cookbook↗ Recipes
Practical, runnable examples from OpenAI. Still the best source for day-one integration patterns.
Have a platform or pattern we should know?
We update this as the landscape moves. If there is something production-proven we missed, tell us.