Western-market observations, platform-evolution analysis, and a practical playbook for enterprises preparing to start in 2026

When Agentforce 1.0 was announced in September 2024, the industry was skeptical. In 2025 we observed the rapid evolution of both the platform and how enterprises actually used it. Today in 2026, Agentforce has moved from "should we try it" to "how do we govern it."
This piece is not a marketing brochure. It's an 18-month observational synthesis: what changed, what didn't, and how to start now. Specifically, it draws on Salesforce documentation and release notes, public sharing from Western early adopters, Dreamforce 2025 case studies, and industry analyst reports.
If you want a book of pure customer case studies, this isn't it. If you want an observation that isn't carried by vendor marketing — synthesized into a 2026 decision framework — this might be.
According to Salesforce documentation and public sharing from Western early adopters, building an Agent in early 2025 required writing Apex, designing prompts, configuring Permission Sets, setting up grounding sources — roughly 3–4 weeks of senior architect time.
In 2026, Agent Builder (the GA-mature version) does the same in about a week. It's not a "no-brain wizard" — engineers can still write Apex per Atomic Action — but 80% of common scenarios no longer need it.
Implication for adoption strategy: First-time PoCs no longer require an architect's full attention. Business analysts can assemble demos themselves. But production deployment still requires engineer review — that line will only matter more in 2026.
Once AgentExchange went live, the industry baseline shifted. For scenarios like "customer service FAQ," "document extraction," "meeting note summarization," "quote generation" — there are now 50+ reasonable-quality templates available. Forking a template and customizing is 5–10x faster than starting from scratch.
But this brings new failure modes (covered later). Templates are not a free lunch — they reflect someone else's defaults for "general scenarios," which won't necessarily align with your permission model, data model, or compliance requirements.
The most painful issue Western early adopters publicly complained about in 2025: after changing an Agent prompt, there was no systematic way to verify "does this break existing behavior?" Public sharing showed manual testing of large case batches with human comparison — a hallmark of 2025's Agentforce engineering immaturity.
In 2026, Testing Center provides batch regression runs over YAML-defined conversation cases: expected-action assertions, response content checks, and negative tests that verify what the Agent must refuse to do.
For engineering practice, this is the key to bringing Agent development into the DevOps pipeline. An emerging industry best practice: any Agent prompt change should not ship without passing 200+ regression cases.
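To make that gate concrete, below is a minimal sketch of wiring the suite into CI so Agent changes cannot merge without a green regression run. The workflow structure is plain GitHub Actions; the test-run command, its flags, and the suite name Customer_Service_Regression are assumptions used to illustrate the shape, so verify them against your own Testing Center and Salesforce CLI setup.

```yaml
# Sketch: CI gate that blocks Agent changes failing the Testing Center suite.
# The `sf agent test run` invocation is illustrative; confirm the exact command
# and flags against current Salesforce CLI documentation.
name: agent-regression-gate
on:
  pull_request:
    paths:
      - "force-app/**"                  # Agent prompts, Topics, and Apex actions
jobs:
  regression:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Salesforce CLI
        run: npm install --global @salesforce/cli
      - name: Write JWT key from repository secret
        run: echo "${{ secrets.SF_JWT_KEY }}" > server.key
      - name: Authenticate to the Agent test sandbox
        run: >
          sf org login jwt
          --client-id ${{ secrets.SF_CLIENT_ID }}
          --jwt-key-file server.key
          --username ${{ secrets.SF_USERNAME }}
          --alias agent-sandbox
      - name: Run the 200+ case regression suite
        run: >
          sf agent test run
          --api-name Customer_Service_Regression
          --target-org agent-sandbox
          --wait 30
```

Making the suite a required status check means a failed prompt change is visible in the pull request, not discovered by a customer.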
According to Salesforce release notes, Atlas in early 2025 could only stably handle 3–5 step workflows; longer ones got lost. The 2026 Atlas reliably handles 10–15 step multimodal workflows, including cross-cloud coordination (Sales, Service, Marketing, Commerce).
The capability difference is significant: previously, Agents could only act within a single scenario; now, they can perform end-to-end customer journeys across the entire Salesforce ecosystem.
The most common pitfall in 2025 was LLM reasoning occasionally skipping a step; once a drifting model executes an irreversible action (charging a payment, sending a customer email, modifying a contract), there is no easy undo. Agentforce Script is the 2026 answer: a declarative scripting language that captures multi-step flows (step ordering, preconditions, error paths, audit fields) as a YAML script that Atlas executes deterministically instead of reasoning about.
This is not a replacement for Topic Instructions; it's the other half. Conversational scenarios still go through Topics (let the LLM decide), but compliance-sensitive flows go through Script (forced to follow the script). Section 8.4 has a full example.
In 2026, Agentforce is no longer just a text interface: it runs across channels such as web chat, Slack, email, and voice.
The side effect of channel expansion: each additional channel roughly doubles governance complexity. Our recommendation: launch one channel in year one (typically Slack or web), then expand in year two.
The three patterns below come from public Western case studies and industry conference shares — not EKel client experience — but they're equally cautionary for companies preparing to start.
In 2026 enterprises don't have just one Agent — typically 5–10. Agent A's action triggers Agent B; B's action triggers A — creating infinite loops or race conditions. A North American company reportedly had Agents send a single customer dozens of automatic follow-up emails in one afternoon, because two Agents stepped on each other's "follow up when customer has new activity" rule.
Mitigation: Build hard guardrails of "do not trigger other Agents" into Agent design, and add caller chain tracking to the Audit Log. The Salesforce platform provides governors, but won't manage logical deadlocks for you. Rewriting these flows as Agentforce Script lets the script layer enforce hard termination conditions — Topic alone can't do that.
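As a sketch of what that looks like in practice, here is a minimal follow-up-email flow in the same illustrative Script format as the full example in section 8.4. The action names, the caller_agent audit field, and the rate guard are assumptions; the point is the hard stop conditions the script layer can enforce.

```yaml
# Sketch: hard termination conditions for a follow-up-email flow, so two Agents
# can't ping-pong on the same "follow up when customer has new activity" rule.
# Action names, the caller_agent field, and guard names are illustrative.
script: followup_on_new_activity
steps:
  - id: count_recent_followups
    action: CountFollowupEmailsLast24h        # hypothetical Atomic Action
    inputs:
      customerId: ${session.customerId}
  - id: already_contacted_today
    when: ${count_recent_followups.total >= 1}
    end: true                                 # hard stop: at most one follow-up per day
  - id: send_followup
    action: SendFollowupEmail                 # hypothetical Atomic Action
    inputs:
      customerId: ${session.customerId}
audit:
  capture: [trigger.caller_agent, count_recent_followups.total]   # caller chain in the log
guards:
  - no_other_agent_invocations: true          # this flow may never trigger another Agent
  - max_steps: 5                              # hard halt on runaway step counts
```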
A common 2026 anti-pattern: to enable reuse, multiple Agents share a single Topic (e.g., "Customer Lookup"). Looks DRY, but in practice this Topic's permissions get flattened to "all Agents can use" — and a low-privilege Agent ends up accessing sensitive data accidentally. Salesforce's own governance guide has classified this as an anti-pattern.
Mitigation: give each Agent its own Topic list, even if the content overlaps; don't share. Salesforce Permission Set Groups are key here: split permissions by Agent role, not by Topic.
The most common and dangerous failure: an admin sees a "Customer Service" template on AgentExchange, clicks install, configures grounding source, ships in three days. This process skips template review — the Atomic Actions inside might call external APIs you don't know about, or use a service account that bypasses row-level security.
Mitigation: All AgentExchange templates must pass security review, specifically checking: (1) external API calls, (2) execution user settings (with sharing vs without), (3) the permission model of grounding sources.
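One lightweight way to enforce this is to keep the review as a small artifact in the repo and attach it to the install pull request. The format below is just a team convention sketch (nothing Salesforce-specific); the template name, reviewer roles, and verdict field are placeholders.

```yaml
# Sketch: template security review record kept alongside the Agent metadata.
# Purely a team convention; names and fields are placeholders.
template_review:
  template: "AgentExchange: Customer Service FAQ"     # hypothetical listing
  reviewed_by: [security-team, platform-lead]
  checks:
    external_api_calls:
      status: pass
      notes: "Single callout to the internal knowledge base; no third-party APIs."
    execution_user:
      status: fail
      notes: "Two Atomic Actions are declared 'without sharing'; rework before install."
    grounding_permissions:
      status: pass
      notes: "Grounding limited to published knowledge articles."
  verdict: blocked_until_execution_user_fixed
```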
The earlier "repetition × decision-boundary clarity" framework is no longer sufficient in 2026. We recommend a more refined two-axis grid:
| | Reversible Consequence | Irreversible Consequence |
|---|---|---|
| Agent fully automated | ✅ Customer service FAQ replies, internal knowledge query, report fetching | ❌ Sending customer email directly, modifying contracts, charging transactions |
| Agent suggests + human confirms | ⚠️ Usually over-engineering | ✅ Sales Stage changes, price adjustments, refund processing |
The key axis is "irreversible business consequence." A wrong automatic email to a customer, an accidentally-charged order, a contract auto-modified — these can't be undone with apologies. Any irreversible action requires human confirmation, and this rule will only get stricter in 2026. Flows in the "irreversible + high autonomy" cell are best written as Agentforce Script and not delegated to Topic reasoning.
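To make the "irreversible, so a human confirms" cell concrete, here is a sketch in the same illustrative Script format as section 8.4's example. The approval construct and the action names (DraftRefundRequest, ExecuteRefund) are assumptions; what matters is that the irreversible step is only reachable after an explicit human decision.

```yaml
# Sketch: refund processing with a mandatory human gate before the irreversible step.
# The approval construct and action names are illustrative.
script: refund_processing
steps:
  - id: draft_refund
    action: DraftRefundRequest                # reversible: only creates a draft record
    inputs:
      caseId: ${session.caseId}
  - id: human_approval
    approval:                                 # the irreversible boundary: a human decides
      queue: RefundApprovers
      summary: "Refund of {{draft_refund.amount}} for case {{session.caseId}}"
    on_rejection:
      goto: explain_rejection
  - id: execute_refund
    action: ExecuteRefund                     # only reachable after explicit approval
    inputs:
      refundDraftId: ${draft_refund.id}
    end: true
  - id: explain_rejection
    say: "The refund was not approved, so nothing has been charged or reversed."
    end: true
audit:
  capture: [human_approval.decision, human_approval.approver]
```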
If your enterprise is a 2026 first-wave starter, here's the 90-day cadence we recommend:
| Week | Work | Difference vs. 2025 Starters |
|---|---|---|
| 1–2 | Fork one template from AgentExchange | No longer building from scratch |
| 3–4 | Template security review + internal data integration | Newly required step |
| 5–6 | Build 200+ regression suite in Testing Center | Tool didn't exist in 2025 |
| 7–8 | Internal 20-person pilot, focus on caller chain | New monitoring item |
| 9–10 | Expand to 100 people, parallelize a second Agent project | Parallelism only viable in 2026 |
| 11–12 | Company-wide launch OR pull back | Same Go/No-go discipline |
The table below is synthesized from Salesforce public earnings, Dreamforce 2025 case studies, industry analyst reports, and various Western public case shares — not EKel client data, just an industry-signal summary:
| Scenario | Industry Adoption | ROI Timeline in Public Reports |
|---|---|---|
| Customer service first response | High (many Western companies live) | 2–3 months |
| Internal employee knowledge query | High | 1–2 months |
| Sales CRM data sync | Medium (mostly in pilot) | 4–6 months |
| Sales forecasting automation | Medium-low (still mostly PoC) | 6–9 months |
| Contract / legal automation | Low (very few successful cases) | 9+ months |
High-adoption scenarios share consistent traits: high repetition, reversible consequences, and clear data boundaries. Low-adoption scenarios share heavy legal implications and complex cross-system integration.
Agentforce in 2026 is no longer a "can it be used" question. It's a "can governance keep up" question. The platform's capabilities run fast; enterprises' permission models, data governance, and process design were designed five years ago. The real bottleneck isn't AI — it's the organization's digital foundation.
We chose "observe first, then start" precisely because in conservative-compliance markets, the first wave doesn't get a meaningful head start over the second — but the second wave can avoid most of the pitfalls already exposed in Western markets. If you're evaluating timing, let's talk.
Concepts are abstract. Below are four real artifacts you'll write when working with Agentforce. These are educational examples based on Salesforce official documentation and industry best practices — not EKel production code.
Each "action" an Agent can perform corresponds to an Apex method, exposed via @InvocableMethod. Agent Builder automatically converts it to a tool Atlas can invoke. Below is an example fetching account financial summary:
```apex
public with sharing class GetAccountFinancialSummary {
    public class Request {
        @InvocableVariable(label='Account Id' required=true)
        public Id accountId;
    }

    public class Response {
        @InvocableVariable public Decimal totalCreditLimit;
        @InvocableVariable public Decimal currentBalance;
        @InvocableVariable public String riskTier;
    }

    @InvocableMethod(
        label='Get Account Financial Summary'
        description='Returns credit limit, balance, and risk tier. Respects caller permissions.'
        callout=false
    )
    public static List<Response> run(List<Request> requests) {
        // Collect Ids first so a single bulkified query serves every request,
        // instead of one SOQL statement per request inside the loop.
        Set<Id> accountIds = new Set<Id>();
        for (Request r : requests) {
            accountIds.add(r.accountId);
        }

        Map<Id, Account> accounts = new Map<Id, Account>([
            SELECT Total_Credit_Limit__c, Current_Balance__c, Risk_Tier__c
            FROM Account
            WHERE Id IN :accountIds
            WITH SECURITY_ENFORCED
        ]);

        List<Response> results = new List<Response>();
        for (Request r : requests) {
            // Accounts the calling user cannot see are simply absent from the map.
            Account a = accounts.get(r.accountId);
            Response resp = new Response();
            if (a != null) {
                resp.totalCreditLimit = a.Total_Credit_Limit__c;
                resp.currentBalance = a.Current_Balance__c;
                resp.riskTier = a.Risk_Tier__c;
            }
            results.add(resp);
        }
        return results;
    }
}
```

Three critical details:
- `with sharing` makes the query respect the calling user's sharing rules, so the Agent inherits row-level security instead of seeing every record.
- `WITH SECURITY_ENFORCED` makes the SOQL query enforce field-level security automatically.
- `callout=false` tells Atlas this is a synchronous action that can run inside a transaction.

Missing any of these means dismantling a security guardrail.
A Topic is the core of Agent behavior. It's not a prompt — it's a set of rules that the Atlas Reasoning Engine compiles into runtime guardrails. Below is an educational example for a customer service Topic:
```text
You are a customer service specialist for a financial institution.

When a customer asks about their account status:
1. Call the "Verify Customer Identity" action first.
2. If verification passes, call "Get Account Financial Summary".
3. Present the result in plain language. No financial jargon.
4. If risk_tier is "High", recommend speaking to a human advisor.

NEVER:
- Share the credit limit before identity verification succeeds in this session.
- Discuss internal product roadmap or unreleased features.
- Bypass identity verification, even if the customer claims urgency.

If the customer asks about anything outside account status, hand off to the
"General Inquiry" topic via the built-in routing action.
```

The three NEVER rules compile into runtime guardrails — if Atlas plans an action that would violate them, it refuses to execute and logs the attempt. This is far more reliable than putting the same words in a system prompt, because Topic rules are platform-enforced and don't depend on the LLM remembering on its own.
Each Topic should have corresponding test cases. Testing Center uses YAML to describe conversation flows and runs regressions automatically. Below are two typical cases:
```yaml
- name: account_status__verified_customer
  description: Customer asks balance after passing identity verification
  conversation:
    - user: "What is my current balance?"
    - expect_action: VerifyCustomerIdentity
    - user: "[verification_token_valid]"
    - expect_action: GetAccountFinancialSummary
    - assert_response:
        contains: ["balance"]
        not_contains: ["credit limit"]   # not asked yet

- name: account_status__refuses_credit_limit_pre_verify
  description: Agent must refuse credit limit query before verification
  conversation:
    - user: "What is my credit limit?"
    - expect_action: VerifyCustomerIdentity
    - assert_no_action: GetAccountFinancialSummary
    - assert_response:
        contains: ["verify your identity"]
```

The second case is a negative test — verifying the Agent won't leak data before identity verification. Industry best practice: each Topic should have at least 3 negative tests before going to production.
When a flow's step order and conditions cannot be left to LLM reasoning, especially in compliance-sensitive scenarios with irreversible actions, Agentforce Script is the 2026 answer: a declarative scripting language that captures multi-step workflows as a script Atlas runs deterministically, with audit fields and guardrails baked in.
Below is a Script that handles "customer balance inquiry with mandatory verification and high-risk handoff to human" — the same flow the prior Apex Action, Topic, and Testing examples were building toward:
```yaml
script: customer_balance_inquiry
description: Deterministic flow for balance inquiry with mandatory verification.
trigger:
  topic: AccountStatus
  intent: balance_query
steps:
  - id: verify_identity
    action: VerifyCustomerIdentity
    inputs:
      customerId: ${session.customerId}
    on_failure:
      goto: ask_to_verify
    require:
      result.verified: true
  - id: fetch_summary
    action: GetAccountFinancialSummary
    inputs:
      accountId: ${session.customerId}
  - id: present_balance
    say: "Your current balance is {{fetch_summary.currentBalance}}."
  - id: high_risk_handoff
    when: ${fetch_summary.riskTier == "High"}
    route_to: HumanAdvisor
    reason: "High risk tier requires human review."
    end: true
  - id: ask_to_verify
    say: "I need to verify your identity before discussing account details."
    end: true
audit:
  capture: [verify_identity.result, fetch_summary.riskTier]
  redact: [fetch_summary.currentBalance]
guards:
  - no_other_agent_invocations: true   # prevents cross-agent deadlocks
  - max_steps: 8                       # hard halt if step count is exceeded
```

A few key design details:
- `require` conditions: step dependencies are declarative. `fetch_summary` only ever runs after `verify_identity` passes; this isn't a prompt nudge asking Atlas to follow the order, it's enforced at the platform layer.
- `on_failure` and `goto`: traditional LLM flows tend to "try to keep going" on error paths. Script forces explicit error paths: if `verify_identity` fails, control jumps to `ask_to_verify`, and Atlas doesn't get to decide otherwise.
- `audit.capture` / `redact`: which fields to record and which to mask, the compliance crux for finance and healthcare. The Audit Log consumes this configuration directly.
- `guards`: the two failure modes from §3 (cross-agent triggering and step-count blow-up) can be blocked at the script layer. Topic has no equivalent.

When to use Script vs Topic Instructions: conversational, reversible scenarios stay on Topic Instructions and let the LLM choose the path; compliance-sensitive flows with irreversible actions go to Script and execute deterministically.
Many Western enterprises that got burned in 2025 ended up moving high-risk flows from Topic to Script — this is the architectural decision to plan up front in 2026, not patch after the incident. In conservative compliance environments like Taiwan finance and large-enterprise sectors, "irreversible actions go to Script, reversible conversation goes to Topic" is essentially a universal design rule.
These four artifacts aren't independent — they're the four layers of an Agent:
| Layer | Description | Author |
|---|---|---|
| Apex Atomic Action | The "what it can do" implementation | Engineer |
| Topic Instructions | The "how to do it" rules for natural-language scenarios | Business + Engineer |
| Agentforce Script | The "must do it this way" deterministic flow | Engineer + Compliance |
| Testing Center suite | The "what it shouldn't do" reverse verification | QA + Engineer |
When all four layers are in place, an Agent can plausibly go to production. Missing any one — especially the Script layer guarding irreversible actions — means handing off responsibility to luck.
AI makes "just build it ourselves" look trivial: two internal engineers, Cursor plus Claude Code, a working demo by week 4. But enterprises don't need demos — they need systems that employees still want to use 18 months later, that audits clear, that don't blow up at the next compliance check. This essay walks the timeline of the DIY-with-AI path — what it looks like at week 4, month 6, month 12, and month 18 — and why the gap between expert and non-expert AI use is the 5–10× output multiplier that decides which path you actually walk.
We turned the knife on ourselves — replacing the external SaaS we had been using with our own EKel Finance Cloud, rebuilt via VIBE Coding. A traditional estimate would have been 4–6 months; we shipped Web, iOS, and Android in four weeks. This piece breaks down how humans and AI divide labor at every engineering stage, with the pitfalls we hit and a workflow you can take home.
Our financial-services delivery experience comes from Australia — our CTO led FSC implementations at two Australian Tier 1 banks and one mid-sized bank. This article maps that experience onto Taiwan's regulations, core systems, and budget structures, giving decision-makers about to kick off a project a frank, vendor-spin-free basis for judgement.
A 30-minute conversation with a CTA. Based on your situation, we will answer directly: worth doing, too early, or not our fit.