Palantir Deep Dive: Part 4 of 4
Industrial AI

Palantir's AIP: Function Calling at Enterprise Scale

Most "enterprise AI" products start with a chat box and bolt on retrieval. AIP is interesting for a different reason: Palantir is productizing the control plane around tool use—permissions, approvals, audit trails, and (crucially) a constrained set of actions that map to real operations. Their docs for AIP Agent Studio are explicit that agents can use application commands as tools, and that command execution is approval-gated by default.

That puts AIP closer to "function calling with guardrails" than "RAG with vibes."

LLM as orchestrator: the only architecture that survives contact with production

LLM as Orchestrator Architecture (mini map):

  • 🧠 LLM: decides what to do next
  • 🔧 Tools: execute actions safely
  • 🔗 Ontology: provides typed context
  • 🛡️ Governance: enforces constraints

If you want an AI system that changes state, you need to separate "model intent" from "system execution." OpenAI's function calling docs describe the canonical multi-step flow: model emits a tool call → your system executes it → you feed results back → model responds. OpenAI's safety guidance is blunt about human review in high-stakes domains.
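That loop is simple enough to sketch. Below is a minimal, self-contained version of the intent → tool call → result → response cycle; the `model` here is a stand-in callable (a real system would call an LLM API), and the tool and data are hypothetical.

```python
import json

def orchestrate(model, tools, user_message, max_steps=5):
    """Run the canonical multi-step flow: the model emits a tool call,
    the SYSTEM executes it, results are fed back, the model responds."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = model(messages)
        if reply.get("tool_call") is None:
            return reply["content"]          # final natural-language answer
        call = reply["tool_call"]
        # The system, not the model, executes the declared tool.
        result = tools[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "name": call["name"],
                         "content": json.dumps(result)})
    raise RuntimeError("step budget exceeded")

# --- demo with a scripted model and one tool ---
def get_shipments(region):
    return {"region": region, "delayed": ["S-101", "S-204"]}

_script = iter([
    {"tool_call": {"name": "get_shipments",
                   "arguments": {"region": "Northeast"}}},
    {"tool_call": None, "content": "2 delayed shipments found in Northeast."},
])

answer = orchestrate(lambda msgs: next(_script),
                     {"get_shipments": get_shipments},
                     "Which Northeast shipments are delayed?")
print(answer)
```

The key separation is visible in the loop: the model only *requests* a tool by name; execution stays on the system side, where it can be gated and logged.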

Palantir's product choices line up with that model:

  • Agents can use commands as tools (tool access is configured).
  • Commands run in the user's application context (they can access application state/screen).
  • The default UX asks the user to Approve/Reject before executing a command.
  • There is an explicit toggle to let an agent auto-run commands (disabled by default).
  • Agents using commands as tools have a retention window that expires after 24 hours of inactivity.

None of that is accidental. It's the scaffolding you need when "the AI" is doing more than summarizing text.
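The approve/reject default in that list is the load-bearing piece, and it is small enough to show. This is a toy sketch, not Palantir's API: state-changing commands block on a user decision unless auto-run has been explicitly enabled, and every decision is recorded.

```python
class CommandGate:
    """Approval-gated execution of state-changing commands."""
    def __init__(self, prompt_user, auto_run=False):  # auto-run off by default
        self.prompt_user = prompt_user
        self.auto_run = auto_run
        self.audit_log = []

    def execute(self, command, **params):
        approved = self.auto_run or self.prompt_user(command.__name__, params)
        self.audit_log.append({"command": command.__name__,
                               "params": params, "approved": approved})
        if not approved:
            return None                      # rejected: no state change
        return command(**params)

def reschedule(shipment_id, new_date):      # hypothetical command
    return f"{shipment_id} moved to {new_date}"

gate = CommandGate(prompt_user=lambda name, params: True)  # user clicks Approve
print(gate.execute(reschedule, shipment_id="S-101", new_date="2025-07-01"))
```

Note that the rejected path still writes to the audit log: "who declined what" is as much a compliance artifact as "who approved what."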

Tool calling beats RAG when the output is a mutation

RAG (Retrieval): Query → Retrieve → Answer

Best for:
  • Read-only information retrieval
  • Summarizing large documents
  • Q&A over knowledge bases

Limitation: can hallucinate actions that don't exist.

Tool Calling (Function): Intent → Tool Call → Mutation

Best for:
  • State-changing operations
  • Structured data operations
  • Multi-step workflows

Advantage: schema-constrained, auditable actions.

When the output is a mutation:

| Problem | RAG-first approach | Tool/function calling approach |
|---|---|---|
| Hallucinated actions | model can describe actions that don't exist | model can only call declared tools, with schema constraints |
| Permission enforcement | often implemented ad hoc in app code | centralized: tool availability + user permissions + approvals |
| Auditability | "prompt logs" are not a compliance artifact | tool calls are discrete events you can log and review |
| Context window pressure | stuffing data chunks into prompts | fetch structured state via tools, not raw text |

Palantir also pushes the idea that AIP is built "on top of the Ontology and developer toolchain." That's the right dependency direction: you don't want agents inventing business objects; you want them operating over a typed model with defined actions.

An end-to-end AIP-style workflow (in real operational terms)

Take a common planning task: "Reschedule delayed shipments in the Northeast."

| Step | Who does it | What happens |
|---|---|---|
| 1 | user | asks for rescheduling |
| 2 | model | selects relevant tools/commands rather than drafting a prose plan |
| 3 | system | runs command(s) in app context to pull the current shipment set |
| 4 | model | proposes reschedule operations, parameterized per shipment |
| 5 | user/system | approval gate fires (default); user approves or rejects |
| 6 | system | executes the command; downstream hooks/webhooks can propagate changes where configured in the operational layer |
| 7 | system/model | audit trail + outcome summary |

This is the boring, correct architecture: intent → plan → gated execution → audit.
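Steps 4 through 7 compress into a dozen lines (all names here are hypothetical): the model's proposed operations go through a per-operation approval gate, and every decision, approved or not, lands in the audit trail.

```python
from datetime import datetime, timezone

def run_workflow(proposals, approve, execute):
    """Gated execution: intent -> plan -> approval -> mutation -> audit."""
    audit = []
    for op in proposals:                 # step 4: model's parameterized ops
        decision = approve(op)           # step 5: approval gate (on by default)
        result = execute(op) if decision else "skipped"  # step 6
        audit.append({"op": op, "approved": decision, "result": result,
                      "at": datetime.now(timezone.utc).isoformat()})
    return audit                         # step 7: audit trail + summary

proposals = [{"shipment": "S-101", "new_date": "2025-07-01"},
             {"shipment": "S-204", "new_date": "2025-07-02"}]
trail = run_workflow(
    proposals,
    approve=lambda op: op["shipment"] != "S-204",  # user rejects one op
    execute=lambda op: "rescheduled")
print([(e["op"]["shipment"], e["result"]) for e in trail])
```

The gate firing per operation, rather than once per session, is what keeps a partially wrong plan from becoming a fully executed one.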

The cold-start paradox is real, and it's not "AI"

Tool calling scales with the richness of the action surface. If your organization hasn't defined operational actions—what can be changed, by whom, with what approvals—AI has nothing safe to do.

Palantir's own docs underline the dependency stack:

| Dependency | What it provides | What breaks if it's missing |
|---|---|---|
| Ontology (typed objects) | stable nouns and relationships | agents operate on messy tables/files |
| Actions/commands | safe verbs (state transitions) | agents can only "recommend," not execute |
| Security/governance | constraints and accountability | automation becomes a liability |
| Evals/observability | measurement and debugging | you can't improve reliably |

What I'd copy if I were building this elsewhere

I'd steal three ideas, because they're measurable and they work:

| Design choice | Why I like it |
|---|---|
| approval by default for state-changing tools | prevents "silent automation" disasters |
| strict, schema-driven tools | makes invalid states unrepresentable; improves reliability |
| justification checkpoints for sensitive actions | forces operators (human or agent) to leave a rationale |
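The third idea, justification checkpoints, is the cheapest to copy. A minimal sketch (purely illustrative, not Palantir's mechanism): sensitive actions refuse to run without a non-empty rationale, and the rationale is stored next to the action.

```python
JUSTIFICATIONS = []  # in practice this would be a durable audit store

def requires_justification(func):
    """Decorator: block sensitive actions that arrive without a rationale."""
    def wrapper(*args, justification=None, **kwargs):
        if not justification or not justification.strip():
            raise ValueError(f"{func.__name__} requires a justification")
        JUSTIFICATIONS.append({"action": func.__name__, "why": justification})
        return func(*args, **kwargs)
    return wrapper

@requires_justification
def cancel_shipment(shipment_id):           # hypothetical sensitive action
    return f"{shipment_id} cancelled"

print(cancel_shipment("S-204", justification="Customer withdrew the order"))
```

Because the check sits in front of the function rather than inside it, it applies identically whether the caller is a human operator or an agent.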

If your "AI platform" can't do these, it's a demo environment.