AI Toolbox

The tools we actually build with.

Our curated stack. The platforms, models, engines, frameworks, and methodologies we draw from when we design and deploy AI systems. Not exhaustive. Opinionated. If it is here, we have used it or read it, and we stand behind it.

AI Stack

AI Models

The underlying neural networks. Models are trained artifacts. Engines are the tools you use to interact with them.

  • Claude Frontier

    Anthropic's flagship LLM family. Strong reasoning, long context, low hallucination rate.

  • GPT (OpenAI) Frontier

    The GPT family. Widest ecosystem, mature tooling, highest cost at production scale.

  • Gemini Frontier

    Google DeepMind's multimodal frontier family. Deep Google Cloud integration.

  • Grok (xAI) Frontier

    xAI's frontier model. Fast iteration, tight X-platform integration.

  • Cohere Command Enterprise

    Enterprise-focused frontier models with strong retrieval and multilingual capability. Self-hostable at scale.

  • AI21 Jamba Hybrid arch

    Hybrid Mamba-Transformer architecture. Extremely long context with efficient throughput.

  • Reka Multimodal

    Native multimodal models with image, video, and audio understanding built in.

  • Llama Open weight

    Meta's open-weight family. Self-hostable, commercially usable, community-backed.

  • Mistral Open weight

    European open-weight models. Strong performance-to-size ratios, commercial and Apache options.

  • DeepSeek Open weight

    Chinese frontier-class open-weight models. Aggressive price-performance, MoE architectures.

  • Qwen Open weight

    Alibaba's multilingual model family. Strong non-English performance, permissive licensing.

  • Gemma Google open

    Google's open-weight model family. Built from the same research as Gemini, runnable on consumer hardware.

  • Phi Microsoft open

    Microsoft's small, capable open models. Tuned for on-device and edge workloads.

  • Nemotron NVIDIA open

    NVIDIA's open model family. Strong for building agentic systems on NVIDIA infrastructure.

AI Engines: Web

Browser-based chat and search interfaces. The public on-ramp for most AI use.

  • ChatGPT Web

    OpenAI's flagship web interface. Still the default on-ramp for most people using AI.

  • Claude Web

    Anthropic's web client for Claude. Projects, Artifacts, and file-first workflows.

  • Gemini Web

    Google's consumer chat surface for Gemini. Deep integration with Workspace and Search.

  • Grok Web

    xAI's chat interface. Real-time access to the X firehose for current-events queries.

  • Perplexity Search

    AI-powered search with inline citations. A Google replacement for many technical queries.

  • Poe Multi-model

    Quora's multi-model chat client. One interface, many models, useful for comparison.

AI Engines: Desktop

Native applications. Better OS integration, local-model support, MCP tool access.

  • Claude Desktop Anthropic

    Native Claude app. First-class MCP server support. The reference client for local tool access.

  • ChatGPT Desktop OpenAI

    Native ChatGPT app. Tighter OS integration for voice and screen capture workflows.

  • LM Studio Local models

    Desktop engine for running open-weight models locally. Chat UI + server in one app.

  • Msty Local + cloud

    Desktop chat that handles both local and cloud models in one unified interface.

AI Engines: Command Line

Terminal-native agents. The engine of choice for developers and automation workflows.

  • Claude Code Anthropic

    Anthropic's agentic CLI. Autonomously edits code, runs commands, completes multi-step tasks.

  • Codex CLI OpenAI

    OpenAI's open source coding agent. Runs locally with GPT models for code generation and edits.

  • Gemini CLI Google

    Google's open source terminal agent for Gemini. Code, reasoning, and tool use from the CLI.

  • Aider Open source

    Git-aware pair-programming CLI. Model-agnostic, strong for diff-based edits to large codebases.

  • GitHub Copilot CLI GitHub

    Copilot in the terminal. Suggests shell commands and explains existing ones.

  • Goose Block

    Block's open source CLI agent. MCP-native, extensible, strong for developer automation.

AI Engines: IDE Integration

AI built into the editor. Inline completion, chat panels, and multi-file agentic edits.

  • GitHub Copilot VS Code + JetBrains

    The original inline code assistant. Chat, edits, agents, and the widest IDE coverage.

  • Cursor Fork of VS Code

    AI-native editor with multi-file edits, agents, and tight inline context. Default for many teams.

  • Windsurf Codeium

    Codeium's agentic IDE. Strong Cascade agent for multi-file and cross-repo work.

  • Cline VS Code ext

    Open source agentic coding extension. Executes commands, edits files, runs checks inline.

  • Continue Open source

    Open source Copilot alternative. Bring your own model, full control over context and rules.

  • Zed Native editor

    Rust-built editor with first-class AI assistant, inline prediction, and agent panel.

  • Cody Sourcegraph

    Sourcegraph-powered AI with deep codebase context through their indexing infrastructure.

  • JetBrains AI Assistant JetBrains

    Native AI across IntelliJ, PyCharm, WebStorm, and the rest of the JetBrains family.

AI Service Providers

Hosted inference APIs. Where you actually call the models programmatically.

  • Anthropic API Direct

    Direct Claude access. Best-in-class reasoning for governed production workloads.

  • OpenAI API Direct

    GPT + Whisper + embeddings + fine-tuning. The most mature LLM API on the market.

  • AWS Bedrock Hyperscale

    Unified API for Claude, Llama, Mistral, Titan. Sits cleanly inside existing AWS governance.

  • Azure OpenAI Hyperscale

    OpenAI models on Microsoft's enterprise compliance and identity stack.

  • Google Vertex AI Hyperscale

    Gemini + third-party models with full GCP integration. Strong MLOps tooling.

  • OpenRouter Gateway

    Unified API across every major provider. Swap models without changing integration code.

  • Groq Fast inference

    Custom LPU hardware for extremely low-latency open-model inference.

  • Fireworks AI Fast inference

    Optimized hosting for open-weight models with aggressive throughput and fine-tuning support.

  • Together AI Open models

    Hosted open-weight models with fine-tuning. Cheap scale for non-frontier workloads.

  • Replicate Open models

    One-click hosted inference for open models. Strong for image and audio workloads.

  • DeepInfra Open models

    Serverless inference for open-weight models. Pay-per-token with no capacity commitment.
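One reason gateways like OpenRouter work: most providers and self-hosted servers expose an OpenAI-compatible chat endpoint, so swapping providers is mostly swapping a base URL and key. A minimal sketch of that pattern; the URLs and model identifier below are illustrative, not guaranteed current.

```python
# Sketch: one request shape, many providers. Base URLs and model names
# here are illustrative assumptions, not a definitive provider list.

PROVIDERS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "groq": "https://api.groq.com/openai/v1",
    "local-vllm": "http://localhost:8000/v1",
}

def chat_request(provider: str, model: str, prompt: str) -> dict:
    """Build the endpoint URL and body for an OpenAI-style chat call."""
    return {
        "url": f"{PROVIDERS[provider]}/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = chat_request("openrouter", "anthropic/claude-sonnet-4", "Hello")
print(req["url"])
```

Because only the URL and model string change, integration code written against this shape moves between hosted, gateway, and local endpoints without a rewrite.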

Local Runtimes

Run models on your hardware. For privacy, cost, or compliance.

  • Ollama Local

    Run open-weight LLMs locally. Fastest path from zero to a working private-model environment.

  • LM Studio Local

    Desktop GUI for local model testing. Useful for evaluation and private workflows.

  • llama.cpp Engine

    C++ inference engine underneath most local LLM tooling. Runs on anything that boots.

  • vLLM Server

    High-throughput production inference server with PagedAttention. Industry default for self-hosting.

  • Text Generation Inference Server

    Hugging Face's production serving stack. First-class support for any HF model.
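Before picking a local runtime, the first question is whether the model fits on your hardware. A back-of-envelope rule: weight memory is roughly parameter count times bytes per weight at a given quantization, plus overhead for KV cache and activations. Rough rule of thumb, not an exact figure for any specific runtime.

```python
# Back-of-envelope sizing for local models. The formula covers weights
# only; KV cache and activations add runtime overhead on top.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate GPU/RAM needed just for the weights, in GB."""
    return params_billions * bits_per_weight / 8  # 1B params at 1 byte = 1 GB

# A 7B model at common quantization levels:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(7, bits):.1f} GB + overhead")
```

This is why 4-bit quantized 7B models run comfortably on consumer laptops while full-precision frontier-class weights do not.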

Orchestration

Multi-step reasoning, chains, and stateful LLM applications.

  • LangGraph Graph

    Stateful, graph-based orchestration. Fine-grained control over how agents reason and act.

  • LangChain Framework

    The original LLM app framework. Chains, retrievers, tools, memory. The batteries-included path.

  • LlamaIndex RAG-first

    Data framework for RAG and agents. Strong for document-heavy and knowledge-graph workflows.

  • Vercel AI SDK TypeScript

    TypeScript-first SDK for building AI UX. Streaming, tools, agents, and provider abstraction.

  • Semantic Kernel .NET / Py

    Microsoft's orchestration SDK. First-class across C#, Python, and Java.

  • DSPy Compiler

    Declarative prompting. Compile programs, not prompts. Optimize end-to-end against evals.

  • Haystack Pipelines

    Open source framework for search, RAG, and agent pipelines with strong pre-built components.

  • Flowise Low-code

    Open source drag-and-drop builder for LLM flows. Fast prototyping, deployable API endpoints.

  • Langflow Visual

    Visual LangChain builder. Good fit when non-engineers need to collaborate on agent flows.

  • n8n Workflow

    General-purpose workflow automation with strong AI node support. Self-hostable.
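The shared idea behind graph-style orchestration: nodes are functions that transform a shared state, and each node names its successor. A minimal sketch of that shape; the node names and state keys are invented for illustration, and this is not the LangGraph API.

```python
# Minimal graph orchestration sketch: nodes mutate a shared state dict
# and return the name of the next node. Illustrative only.

def retrieve(state):
    state["docs"] = ["doc about " + state["question"]]
    return "generate"

def generate(state):
    state["answer"] = f"Answer grounded in {len(state['docs'])} doc(s)"
    return "end"

NODES = {"retrieve": retrieve, "generate": generate}

def run(state, start="retrieve"):
    node = start
    while node != "end":
        node = NODES[node](state)   # each node acts, then names its successor
    return state

result = run({"question": "vector search"})
print(result["answer"])
```

Frameworks add checkpointing, branching, and streaming on top, but the control loop is this small.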

Agent Frameworks

Autonomous or semi-autonomous agents that plan and act.

  • Claude Agent SDK Anthropic

    Anthropic's agent SDK. Production primitives for building and deploying Claude-powered agents.

  • OpenAI Agents SDK OpenAI

    OpenAI's official agent framework. Handoffs, tool use, guardrails, tracing. Successor to Swarm.

  • Microsoft 365 Agents SDK Microsoft

    Microsoft's cross-platform SDK for building agents. Successor to Bot Framework, works across Teams, Copilot, web.

  • SDK for building agents that plug into Copilot Studio and the broader Microsoft 365 Copilot ecosystem.

  • Google ADK Google

    Google's open source multi-agent framework. First-class support for A2A protocol and Gemini.

  • CrewAI Multi-agent

    Role-based agent teams that collaborate on complex tasks. Fast path to useful multi-agent setups.

  • AutoGen Microsoft Research

    Microsoft Research's multi-agent framework. Strong academic and production lineage.

  • Pydantic AI Typed Python

    Type-safe agent framework from the Pydantic team. Production-ready, FastAPI-like ergonomics.

  • Mastra TypeScript

    TypeScript-native agent framework. Strong for building agents alongside web apps in the JS ecosystem.

  • Letta Memory

    Formerly MemGPT. Stateful agents with explicit long-term memory and self-editing context.

  • smolagents Hugging Face

    Minimalist agent framework from Hugging Face. Code-first agents with small dependency footprint.

  • OpenHands Coding agent

    Formerly OpenDevin. Fully autonomous software engineering agent running in a sandboxed environment.

  • Griptape Python

    Python framework for building agents with structured memory, tool use, and workflows.

  • Agno Lightweight

    Formerly Phidata. Lightweight agent framework with Python-native ergonomics.

  • Goose CLI

    Block's open source CLI agent. Extensible through MCP, strong for developer workflows.

  • Strands Agents AWS

    AWS-backed open source agent framework. Model-driven loop, first-class Bedrock and AWS integration.
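Strip away the branding and every framework above runs the same loop: the model proposes an action, the runtime executes it, the observation goes back into context, repeat until done. A sketch with a scripted stand-in for the model and one toy tool; all names are invented for illustration.

```python
# The agent loop in miniature. A scripted "model" stands in for real
# LLM output so the control flow is the point.

def calculator(expression: str) -> str:
    return str(eval(expression))  # toy tool; never eval untrusted input

TOOLS = {"calculator": calculator}

# Stand-in for model decisions: (tool, args) steps, then a final answer.
script = [("calculator", "6 * 7"), ("final", "The answer is {last}")]

def agent_loop(steps):
    last = None
    for action, payload in steps:
        if action == "final":
            return payload.format(last=last)
        last = TOOLS[action](payload)   # execute tool, feed observation back
    return last

print(agent_loop(script))
```

Real frameworks replace the script with live model calls and add planning, memory, and guardrails around this loop.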

Decision Platforms

Enterprise AI platforms with ontology and action layers baked in.

  • Palantir AIP Decision

    Ontology-anchored AI decision platform. LLMs with governed access to enterprise data, logic, and actions.

  • Palantir Foundry Platform

    The operational system beneath AIP. Integration, ontology, and workflow layer for the enterprise.

  • Databricks Mosaic AI Lakehouse

    AI built on top of the lakehouse. Strong fit when data already lives in Databricks.

Protocols and Standards

The wire formats and contracts AI systems communicate over.

  • Model Context Protocol (MCP) Open standard

    Open standard for giving AI governed, contextual access to tools, data, and systems.

  • Function Calling OpenAI

    The original tool-use API. Still the widest-deployed spec for LLM function invocation.

  • A2A Google

    Agent-to-Agent communication protocol. Emerging standard for agent-ecosystem interop.

  • OpenAPI Spec

    Not AI-specific, but the lingua franca for describing APIs that agents need to call.
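Whatever the wire format, the contract these standards share is a machine-readable tool description plus a dispatcher that validates and routes calls. A minimal JSON-Schema-flavored sketch; the `get_weather` tool and its fields are hypothetical, not from any of the specs above.

```python
# Tool contract in miniature: a schema the model can read, and a
# dispatcher that validates calls before executing. Illustrative only.

import json

TOOL_SPEC = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}  # stubbed; a real tool would call an API

def dispatch(call_json: str) -> dict:
    call = json.loads(call_json)
    if call["name"] != TOOL_SPEC["name"]:
        raise ValueError("unknown tool")
    for field in TOOL_SPEC["parameters"]["required"]:
        if field not in call["arguments"]:
            raise ValueError(f"missing argument: {field}")
    return get_weather(**call["arguments"])

print(dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
```

MCP, function calling, and A2A each formalize a version of this handshake; the validation-before-execution step is the part that does not change.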

Generative Media

Image Generation

Text and image conditioning to produce still imagery.

  • Midjourney Stylized

    Discord and web-native image generation. Still the aesthetic benchmark for stylized output.

  • DALL-E OpenAI

    OpenAI's image model. Tight integration with ChatGPT, strong at prompt comprehension.

  • Stable Diffusion Open weight

    The open-weight foundation for most self-hosted image workflows. SDXL, SD3, etc.

  • Flux Frontier

    From Black Forest Labs, founded by the original Stable Diffusion researchers. Currently the strongest open-weight image model.

  • Imagen Google

    Google DeepMind's photorealism-focused image family, available through Vertex AI.

  • Ideogram Typography

    Best-in-class for in-image typography and coherent text rendering.

  • Adobe Firefly Commercial-safe

    Adobe's commercially safe generative stack. Trained on licensed content, deep Creative Cloud integration.

  • Leonardo.ai Workflow

    Image-focused workflow platform with fine-tuning, canvas editing, and production controls.

Video Generation

Text, image, and video-to-video models for motion output.

  • Runway Pro video

    Gen-3 and Gen-4 video models with a professional editing workflow. Used in actual commercial production.

  • Veo Google

    DeepMind's text-to-video model. Veo 3 delivers high-fidelity output with native audio generation.

  • Sora OpenAI

    OpenAI's text and image-to-video model. Strong for longer, coherent scenes with complex motion.

  • Pika Consumer

    Fast, playful text-to-video with strong effects library. Popular for short-form and social content.

  • Kling Kuaishou

    Chinese text-to-video model with strong realism and long clip support.

  • Dream Machine Luma

    Text and image-to-video with fast iteration. Good at camera motion and scene dynamics.

  • Hailuo MiniMax

    MiniMax's video model. Strong character consistency, competitive on cost per clip.

Audio and Music

Voice synthesis, song generation, sound design.

  • Suno Music

    End-to-end song generation with vocals and instruments. The most capable music model available.

  • Udio Music

    Competitor to Suno with strong audio fidelity and genre flexibility.

  • ElevenLabs Voice + TTS

    Industry-standard voice synthesis and cloning. Production-ready voice agents and dubbing.

  • Stable Audio Stability

    Stability AI's audio generation. Open weights available, strong for sound design and loops.

  • Lyria Google

    DeepMind's music model. Available through MusicFX and Vertex AI.

  • MusicGen Meta open

    Meta's open-weight music generation model. Self-hostable for private workflows.

  • AudioGen Meta open

    Meta's open-weight ambient and effects audio model. Environmental sound from prompts.

3D Generation

Meshes, environments, and spatial assets from prompts.

  • Meshy 3D models

    Text and image to 3D mesh generation. Game-dev and visualization-ready outputs.

  • Genie Luma

    Luma's 3D generator. Strong for 3D assets from photos or text prompts.

  • Rodin Deemos

    High-fidelity 3D generation focused on characters and organic models.

  • Blockade Labs Skyboxes

    AI-generated 360° skyboxes and environments. Used widely in game and VR workflows.

  • Tripo3D 3D models

    Fast text and image-to-3D generation with PBR materials for real-time rendering.

Creative Platforms

Workflow tools and UIs that wrap generative models.

  • ComfyUI Node-based

    Open source node-based workflow builder for Stable Diffusion and beyond. Industry standard for serious image pipelines.

  • Automatic1111 Open source

    The original Stable Diffusion web UI. Massive extension ecosystem, still widely deployed.

  • Krea Creative

    Real-time image generation with a canvas-native creative workflow. Strong for iteration.

  • Freepik AI Suite All-in-one

    Unified interface across multiple gen models. Good for comparing outputs quickly.

  • Higgsfield Motion

    Cinematic video generation with camera-motion presets. Tuned for motion-design work.

Data & Retrieval

Vector Databases

Similarity search engines for embeddings.

  • Pinecone Managed

    Managed vector database for RAG. Low-latency similarity search at production scale.

  • Weaviate Open source

    Vector database with hybrid search, schema-awareness, and graph capabilities.

  • Qdrant Open source

    Rust-based vector engine. Fast, lightweight, strong for self-hosted deployments.

  • Chroma Open source

    Dev-friendly open source vector database. Zero-config start, embeddable.

  • Milvus Scale

    Cloud-native vector database built for billion-scale workloads.

  • LanceDB Embedded

    Embedded serverless vector database. Columnar, versioned, strong for multimodal workloads.

  • pgvector Postgres

    Vector search extension for Postgres. Often the right answer when Postgres is already in play.

  • Redis In-memory

    Vector search in Redis. Useful when Redis is already the cache or session store in your stack.

  • MongoDB Atlas Document

    Native vector search in MongoDB Atlas. Best fit when document data is already in Mongo.

  • Elasticsearch Search

    Dense vector + hybrid search in Elasticsearch. Strong when combined with full-text at scale.
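Every engine above is an optimized version of the same operation: store embeddings, rank by similarity, return the top k. A sketch with toy two-dimensional vectors standing in for real embeddings so the ranking logic stays visible.

```python
# Similarity search in miniature: cosine distance over a tiny in-memory
# store. Vectors and doc ids are invented stand-ins for real embeddings.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

STORE = {                     # doc id -> (pretend) embedding
    "pricing": (0.9, 0.1),
    "security": (0.2, 0.95),
    "onboarding": (0.7, 0.4),
}

def top_k(query_vec, k=2):
    ranked = sorted(STORE, key=lambda d: cosine(query_vec, STORE[d]), reverse=True)
    return ranked[:k]

print(top_k((1.0, 0.0)))      # a query pointing "toward" the pricing doc
```

Production engines replace the linear scan with approximate indexes (HNSW, IVF) and add filtering and hybrid scoring, but the contract stays this simple.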

Knowledge Graphs

Structured representation of entities and relationships.

  • Neo4j Property graph

    The industry standard graph database. Strong tooling, mature ecosystem, Cypher query language.

  • TigerGraph Analytics

    High-performance analytics graph. Strong for deep-link analysis and fraud detection.

  • ArangoDB Multi-model

    Graph, document, and key-value in one engine. Useful when graph is part of a broader pattern.
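What distinguishes a graph query from keyword retrieval is walking typed relationships rather than matching text. A toy sketch of that distinction; the entities and relations are invented, and this is plain Python, not Cypher or any engine's API.

```python
# Knowledge graph in miniature: typed edges between entities, queried by
# following relationship chains. All entities below are fictional.

GRAPH = {
    ("AcmeCorp", "owns"): ["AcmeCloud"],
    ("AcmeCloud", "runs_on"): ["Kubernetes"],
    ("AcmeCorp", "supplies"): ["GlobexInc"],
}

def neighbors(entity, relation):
    return GRAPH.get((entity, relation), [])

def reachable(entity, relations):
    """Follow a chain of relations, e.g. what does what we own run on?"""
    frontier = [entity]
    for rel in relations:
        frontier = [n for e in frontier for n in neighbors(e, rel)]
    return frontier

print(reachable("AcmeCorp", ["owns", "runs_on"]))
```

Multi-hop questions like this are exactly where graph context beats bag-of-words retrieval for AI systems.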

Data Infrastructure

Storage, transformation, and movement of data at scale.

  • Snowflake Warehouse

    Cloud data warehouse with strong data sharing, governance, and now native AI functions.

  • Databricks Lakehouse

    Unified analytics and AI platform on top of open storage formats (Delta, Iceberg).

  • dbt Transform

    SQL-based data transformation with version control and tests. The default for modern ELT.

  • Airbyte Ingestion

    Open source data ingestion. Large connector catalog, self-hostable.

  • Apache Kafka Streaming

    The streaming event backbone most production architectures eventually need.

  • Apache Iceberg Table format

    Open table format for large analytic datasets. The neutral ground between warehouses.

Data Governance and Catalogs

Lineage, classification, access, audit.

  • Collibra Catalog

    Enterprise data catalog and governance platform. Default in regulated industries.

  • Atlan Modern catalog

    Modern data catalog with strong lineage, collaboration, and API-first design.

  • OpenLineage Open standard

    Open standard for collecting data lineage from pipelines. Vendor-neutral.

  • Unity Catalog Databricks

    Databricks-native governance layer. Fine-grained access control across data and AI assets.

Infrastructure

Cloud Providers

Where the workloads run.

  • AWS Hyperscale

    The broadest cloud surface area. Strong for heterogeneous enterprise architectures.

  • Azure Hyperscale

    Default cloud for Microsoft-aligned enterprises. Tight identity and M365 integration.

  • Google Cloud Hyperscale

    Strongest data and AI primitives. Gemini, BigQuery, Vertex AI all first-class.

  • Oracle Cloud Enterprise

    Fits Oracle-heavy shops. Strong database integration and competitive egress pricing.

  • IBM Cloud Enterprise

    watsonx AI and regulated-industry positioning. Hybrid and mainframe-adjacent workloads.

GPU Compute and Model Hosting

Raw GPU compute and specialized inference hosting. Where generative and custom model workloads actually run.

  • RunPod GPU cloud

    On-demand and serverless GPU compute. One of the cheapest paths to run custom model workloads.

  • Fal.ai Fast inference

    Optimized inference for generative models. Sub-second image and video generation at API scale.

  • Modal Serverless

    Serverless Python functions with GPU support. Clean path from notebook to production.

  • Baseten Model deploy

    Model deployment platform. Strong for serving custom and fine-tuned open-weight models.

  • CoreWeave GPU infra

    GPU-specialized cloud infrastructure. Favored for large-scale training and inference.

  • Lambda GPU cloud

    GPU cloud for training and inference. Competitive pricing, direct NVIDIA partnerships.

Containers, Orchestration, IaC

Packaging, deploying, and describing systems declaratively.

  • Kubernetes Orchestration

    Container orchestration. The substrate most production AI workloads end up running on.

  • OpenShift Enterprise K8s

    Red Hat's enterprise Kubernetes distribution. Default in regulated and hybrid-cloud shops.

  • Docker Containers

    Container runtime and image format. Still the baseline for packaging workloads.

  • Terraform IaC

    Infrastructure as code. The baseline for reproducible, version-controlled cloud environments.

  • Ansible Config

    Agentless configuration management. Still the fastest way to automate existing systems.

  • ArgoCD GitOps

    Declarative, Git-driven continuous deployment for Kubernetes.

  • Helm K8s

    Package manager for Kubernetes. Charts are how most production apps get templated.

Application Performance Monitoring

General-purpose APM and observability. Traces, metrics, logs across applications and infrastructure.

  • SigNoz Open source

    OpenTelemetry-native open source APM. Traces, metrics, and logs in one UI. Self-hostable alternative to Datadog.

  • Datadog Enterprise

    Industry-standard SaaS observability. Broad coverage across infrastructure, APM, logs, and security.

  • New Relic Enterprise

    Long-established APM and full-stack observability. Consumption-based pricing across telemetry types.

  • Prometheus + Grafana Open source

    The de facto open source stack for metrics, dashboards, and alerting. Runs anywhere.

  • Honeycomb High-cardinality

    Event-oriented observability for complex distributed systems. Strong for debugging production unknowns.

  • Elastic APM Elastic

    APM inside the Elastic stack. Fits naturally when logs and search already live in Elasticsearch.

  • Dynatrace Enterprise

    AI-powered enterprise observability. Strong auto-instrumentation and root-cause analysis.

  • Sentry Errors + Perf

    Error tracking and performance monitoring focused on application code. The default for front-end and app errors.

  • OpenTelemetry Open standard

    The vendor-neutral telemetry standard. Instrument once, export anywhere. Backbone of modern observability.
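The core idea behind tracing, whatever the backend: wrap each unit of work in a span that records name and duration, collect spans, export them. A toy in-process collector to show the shape; this is not the OpenTelemetry API, which adds context propagation, attributes, and exporters.

```python
# Tracing in miniature: a decorator that records a span per call into an
# in-process list. Real systems export spans to a collector instead.

import functools
import time

SPANS = []

def traced(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            SPANS.append({"name": fn.__name__,
                          "duration_ms": (time.perf_counter() - start) * 1000})
    return wrapper

@traced
def handle_request():
    time.sleep(0.01)          # pretend work

handle_request()
print(SPANS[0]["name"], f"{SPANS[0]['duration_ms']:.1f} ms")
```

Instrument-once-export-anywhere is exactly this pattern with the `SPANS` list swapped for a pluggable exporter.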

LLM Observability and Evals

Measurement and monitoring specific to LLM workloads. Prompt tracing, evaluations, and cost tracking.

  • LangSmith Tracing + Evals

    LangChain's observability and evaluation platform. Tight integration with their stack.

  • Langfuse Open source

    Open source LLM engineering platform. Tracing, evals, prompt management.

  • Braintrust Evals

    Evaluation-first platform for AI products. Strong for iteration speed on production prompts.

  • Arize Phoenix Open source

    Open source observability for LLMs. Visual traces, evaluations, datasets.

  • Helicone Gateway

    LLM gateway with logging, caching, and cost tracking built in. Drop-in proxy.
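The minimum viable eval these platforms all build on: fixed cases, a grader per case, a pass rate you can track across model or prompt changes. A sketch with a stubbed model so the harness shape, not the model, is the point.

```python
# Eval harness in miniature. The "model" is a hard-coded stub standing
# in for a real LLM call; the cases and graders are the real pattern.

def model(prompt: str) -> str:
    return {"capital of France?": "Paris", "2+2?": "5"}.get(prompt, "unknown")

CASES = [
    {"prompt": "capital of France?", "check": lambda out: "Paris" in out},
    {"prompt": "2+2?", "check": lambda out: out.strip() == "4"},
]

def run_evals(cases):
    results = [bool(c["check"](model(c["prompt"]))) for c in cases]
    return sum(results) / len(results)

print(f"pass rate: {run_evals(CASES):.0%}")
```

Run this on every model or prompt change and the pass-rate trend is the difference between production AI and a hopeful pilot.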

Practices

Methodologies

The patterns we apply. Vendor-neutral thinking that outlives any specific tool.

  • Ontology Design

    Formal modeling of business objects, relationships, and logic. The anchor for every downstream AI system.

  • Retrieval-Augmented Generation (RAG)

    LLM responses grounded in retrieved context from your own data. The baseline pattern for enterprise AI.

  • Agentic Orchestration

    Multi-step reasoning where AI plans, calls tools, evaluates results, and iterates. Beyond single-shot prompts.

  • Human-in-the-Loop (HITL)

    AI proposes, humans approve, systems execute. Accountability without slowing the flywheel.

  • Purpose-Based Access Control

    Access scoped by role, data classification, and intent. Not just who. Also why.

  • Model Evaluations

    Systematic measurement of model behavior over time. The difference between production AI and a hopeful pilot.

  • Semantic Data Classification

    Automatic tagging of data by type, sensitivity, and meaning. Makes governance scale past manual review.

  • GitOps and Infrastructure as Code

    Git as the source of truth for infrastructure and deployment. Auditable, reproducible, rollback-able.

  • Knowledge Graphs

    Relational representation of entities and their links. Gives AI structural context beyond bag-of-words retrieval.

  • MCP Tool Design

    Wrapping existing business logic, APIs, and data as callable tools for AI agents through MCP servers.

  • Fine-tuning vs RAG

    A diagnostic framing: fine-tune for style and format, retrieve for facts. Conflating them burns budget.

  • Guardrails and Policy Enforcement

    Deterministic checks around non-deterministic models. Input filtering, output validation, scope enforcement.
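Guardrails in one sketch: deterministic code wraps the model call and rejects or repairs output before anything downstream sees it. The action names and fields below are invented for illustration; the pattern of validate-before-execute is the point.

```python
# Output guardrail sketch for a model asked to emit a JSON action.
# Malformed or out-of-scope output never reaches downstream systems.

import json

ALLOWED_ACTIONS = {"refund", "escalate", "close"}

def validate(raw_output: str) -> dict:
    """Return the parsed action or raise; never pass raw model text through."""
    data = json.loads(raw_output)                  # malformed JSON fails here
    if data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"action out of scope: {data.get('action')!r}")
    if not isinstance(data.get("ticket_id"), int):
        raise ValueError("ticket_id must be an integer")
    return data

print(validate('{"action": "refund", "ticket_id": 4312}'))
```

Input filtering and scope enforcement follow the same shape on the way in: deterministic checks bracketing a non-deterministic model.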

Reading

Specifications, Essays, and References

Outside thinking worth your time.

  • Palantir AIP Documentation

    How Palantir frames the decision platform: ontology, tools, actions, scenarios, guardrails.

  • Model Context Protocol Specification

    The Anthropic-authored open standard for AI-to-system integration. Read it before you build.

  • Anthropic: Building Effective Agents

    Practical guidance from the team building Claude. Strong on agents and evaluation patterns.

  • OpenAI Cookbook

    Practical, runnable examples from OpenAI. Still the best source for day-one integration patterns.

Have a platform or pattern we should know?

We update this as the landscape moves. If there is something production-proven we missed, tell us.