A taxonomy of agentic commerce in Europe

Abstract

Agentic commerce, defined as commercial transactions initiated, negotiated, or settled by autonomous software agents acting on behalf of human or institutional principals, is emerging as a structurally distinct category within digital trade. This paper constructs a four-layer taxonomy of agentic commerce as it operates in the European context: business-to-consumer (B2C) agentic commerce, business-to-business (B2B) agentic commerce, machine-to-machine (M2M) agentic commerce, and agent-to-agent marketplaces. Each layer is characterised by a distinct principal type, a distinct information asymmetry regime, and a distinct architecture of human oversight. The taxonomy is organised along two principal axes: the identity of the authorising principal and the degree of human intervention required at each transactional step. European regulatory instruments, including the EU AI Act, the Payment Services Directive, and the Digital Services Act, provide the normative backdrop against which each layer's oversight requirements are assessed. The paper identifies a structural gap between the theoretical maturity of mechanism-design protocols and the operational readiness of European payment rails to support fully autonomous agent transactions. It further finds that cross-layer platform operators, those whose infrastructure spans more than one taxonomy layer, accumulate systemic risk and market power at a rate that current regulatory instruments do not match. The taxonomy is offered as a practical instrument for regulators, payment service providers, and platform operators designing governance frameworks for agentic commercial systems.

Introduction

Commercial transactions have always required some mechanism by which a principal's intent is translated into a binding market action. For most of recorded commercial history, that mechanism was a human agent (a trader, broker, or employee) whose discretion was constrained by instruction, contract, and law. The digitisation of commerce compressed several of those steps, but the authorising human remained a required participant at the moment of transaction commitment. Agentic commerce removes that structural requirement. In the agentic model, a software agent receives delegated authority from a principal, interprets commercial context autonomously, and executes transactions, including price negotiation, order placement, contract formation, and payment initiation, without requiring human approval at each step.

This paper treats agentic commerce as a distinct analytical category, structurally separable from earlier forms of algorithmic or automated trade, because the agents in scope possess three properties that earlier automation did not jointly exhibit: (1) goal-directed behaviour that persists across multiple transactional steps rather than executing a single pre-specified instruction; (2) the capacity to select among counterparties or strategies without explicit human direction at decision time; and (3) the ability to generate binding legal and financial commitments on behalf of a principal. The combination of these three properties creates novel questions of liability, oversight, and market structure that existing regulatory instruments address only partially.

Europe provides the regulatory and market context for this analysis for three reasons. First, the European Union has enacted or is actively implementing one of the most comprehensive sets of digital-economy regulations of any jurisdiction, including the EU AI Act, the revised Payment Services Directive (PSD2, with PSD3 in preparation), the Digital Services Act, and the Markets in Crypto-Assets Regulation. Second, European payment infrastructure, particularly SEPA Instant Credit Transfer and the PSD2 open banking API ecosystem, provides a concrete, regulated technical substrate against which the deployability of agentic payment mechanisms can be evaluated. Third, the European single market creates cross-border principal-agent relationships that intensify the governance challenges identified in this paper.

The contribution of this paper is a systematic, four-layer taxonomy of agentic commerce organised by two primary axes: the identity of the authorising principal and the degree to which human oversight is structurally required, architecturally available, or operationally absent at each transactional step. The four layers are:

B2C agentic commerce, in which a software agent acts on behalf of an individual consumer, operating within constraints set by that consumer as the primary principal.
B2B agentic commerce, in which a software agent acts on behalf of an enterprise, executing procurement, logistics, or financial transactions within authority delegated by institutional governance structures.
Machine-to-machine (M2M) agentic commerce, in which devices or sensors transact autonomously with one another within parameters set by an institutional operator, with no human at the interaction layer.
Agent-to-agent marketplaces, in which multiple autonomous agents, potentially representing different principal types, discover counterparties, negotiate terms, and settle transactions through market mechanisms, with human principals operating exclusively at the design or parameter-setting layer.

The taxonomy is not merely descriptive. It is structured to reveal where oversight mechanisms are structurally absent, where liability is legally unassigned, and where European regulatory instruments must be extended or harmonised to govern agentic transactions at commercial scale.

The remainder of this paper proceeds as follows. Section 2 grounds the taxonomy in current economic and regulatory drivers. Section 3 positions it against prior work in agent classification and payment governance. Section 4 describes the methodology by which the four layers were defined and the oversight axes were constructed. Section 5 presents each layer in structural detail. Section 6 analyses the oversight architecture across layers. Sections 7 and 8 address limitations and future research directions. Section 9 concludes with specific implications for regulatory design.

Economic and Regulatory Drivers

The construction of a taxonomy at this moment is a response to three converging pressures: regulatory enactment, commercial emergence, and documented operational incidents.

Regulatory enactment. The EU AI Act entered into force on 1 August 2024, but its substantive obligations apply in stages: provisions prohibiting unacceptable-risk AI practices apply from February 2025, obligations governing high-risk AI systems apply from August 2026, and further provisions take effect thereafter. Practitioners assessing current compliance obligations must therefore distinguish between what the Act formally prohibits now and what it will require at later phases. The Act classifies AI systems by risk level and imposes conformity assessment obligations on high-risk systems. Its provisions address AI that makes, or substantially influences, consequential decisions, yet they are not calibrated to the specific case of autonomous agents that initiate binding commercial transactions in real time. The Act establishes human oversight as a design requirement for high-risk AI systems but does not specify the conditions under which an agent's autonomous authorisation legally substitutes for a human principal's consent under eIDAS or PSD2 Strong Customer Authentication (SCA). This gap reflects the fact that the Act was drafted with decision-support systems primarily in mind, not transaction-executing systems that generate contractual obligations at machine speed. The Payment Services Directive's SCA requirement, which mandates multi-factor authentication by the payer, was designed for human-initiated payment flows and has no express provision for agent-initiated authentication on behalf of a consumer or corporate treasury.
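
The staged application described above can be sketched as a simple date lookup. This is a minimal illustration, not a compliance tool: the function name and stage labels are hypothetical, only the two stages named in the text are modelled, and the day-of-month values (2 February 2025 and 2 August 2026) are an assumption added here, since the text gives month and year only.

```python
from datetime import date

# Two of the application stages described in the text. The day-of-month
# values are an assumption; the text states only month and year.
AI_ACT_STAGES = [
    (date(2025, 2, 2), "prohibitions on unacceptable-risk AI practices"),
    (date(2026, 8, 2), "obligations for high-risk AI systems"),
]


def obligations_in_force(on: date) -> list[str]:
    """Return the labels of the modelled stages that apply on the given date."""
    return [label for start, label in AI_ACT_STAGES if on >= start]
```

A practitioner would extend the table with the remaining stages before relying on it; the point of the sketch is only that "what applies now" is a function of the assessment date.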

Commercial emergence. Large language model (LLM)-powered agents capable of browsing product catalogues, comparing offers, and initiating checkout flows are being piloted by European e-commerce operators. Industrial IoT platforms in manufacturing and logistics are enabling devices to autonomously procure consumables or book transport capacity when inventory thresholds are crossed. Theoretical work on multi-agent reinforcement learning has demonstrated that Nash equilibria are achievable in model marketplaces for distributed energy resources [25], establishing a technical feasibility basis for such systems in European balancing markets; whether live deployments under ENTSO-E frameworks currently operate at that level of agent autonomy is not publicly documented in detail. Each of these developments represents a different principal type and a different oversight architecture, and they are advancing without a shared regulatory vocabulary for classifying the relationships and obligations involved.

Operational incidents. Documented cases in adjacent domains illustrate the magnitude of transactional harm that can precede any regulatory response. Algorithmic trading incidents in financial markets, where autonomous systems amplified price dislocations before human intervention could occur, established that machine-speed transactions create systemic risks that post-hoc review cannot remediate. In e-commerce, recommendation agents have been shown to construct and reinforce consumer preferences in ways that are not transparent to the consumer [15], raising questions about who bears responsibility when an agent's recommendation function generates purchases the consumer would not have made under full information. In B2B contexts, automated procurement agents have entered supplier agreements that human procurement teams later disputed, exposing gaps in the law of mandate and electronic contract formation under eIDAS. These incidents are not isolated failures of specific systems; they are structural symptoms of deploying autonomous transactional agents in a regulatory environment calibrated to human-initiated commerce. The taxonomy developed in this paper provides the classification apparatus necessary to assign each incident type to a layer, identify the responsible principal, and specify what oversight mechanism should have been operative.

Prior Taxonomies and Governance Models

The literature on agent-based commerce divides into three broad streams: agent capability classification, market mechanism design, and platform governance. Each stream contributes partial insight to the present problem but does not produce a unified taxonomy organised by principal type and oversight architecture.

Agent capability classification. Early foundational work established the conceptual architecture of buying and selling agents [8], describing agents that could search, compare, and negotiate on behalf of buyers and sellers in electronic marketplaces. The Kasbah system [9] instantiated this architecture as a real marketplace experiment in which agents were given time-varying price functions and negotiated bilaterally. A subsequent real-life experiment in agent marketplaces [16] extended these results to multi-agent environments with heterogeneous strategies. These contributions defined agent functional roles (buyer agent, seller agent, matchmaker) and provided the vocabulary for agent interaction protocols. What they did not construct was a taxonomy organised by the identity and legal status of the principal authorising the agent, nor did they address oversight obligations under any regulatory framework. That gap is historically understandable, as the regulatory instruments now at issue did not exist at the time of publication.

Market mechanism design. The mechanism design stream addresses how agents should negotiate and how market structures should be organised to produce efficient outcomes. Work on agent-supported negotiations in e-marketplaces [22] demonstrated that agent-mediated negotiation can improve joint surplus relative to posted-price mechanisms. Research on intelligent agents for one-to-many negotiation [18] formalised protocol structures for scenarios where one buyer agent negotiates simultaneously with multiple seller agents. Combinatorial auction mechanisms [14] were shown to support efficient allocation of multi-attribute resource bundles in distributed scheduling contexts. Agent argumentation approaches extended negotiation to include preference discovery and product search [28]. Multi-agent reinforcement learning in model energy resource marketplaces [25] demonstrated that Nash equilibria are achievable in distributed energy markets, though the path from theoretical convergence to regulatory compliance in European balancing markets remains uncharted. Collectively, this stream establishes that the mechanism-design tools for agentic commerce are theoretically mature, but it does not address the conditions under which those mechanisms operate on regulated payment rails or generate legally binding obligations.

Platform governance. The platform economics stream, covering two-sided and multi-sided market structures [5], network externalities [6], and two-sided internet platform lifecycle dynamics [13], describes how platforms intermediate between principals and create the economic conditions for concentration. Research on B2B platform ecosystems [12] identifies value co-creation patterns but does not address the specific case where platform participants are autonomous agents rather than human firms. The distributed ledger literature [19] argues that blockchain intermediation displaces rather than eliminates platform power, a finding that applies directly to agent-to-agent marketplaces that are structured as decentralised autonomous organisations [21]. Research on trust evaluation in e-marketplaces [17] addresses how buyer agents should assess seller agent credibility, but the trust architecture it proposes is bilateral and does not extend to multi-layer principal chains.

What this taxonomy adds. The present paper differs from each stream in three specific respects. First, it organises the four layers by principal type and legal status, not by agent capability or market structure, making it directly operable as a regulatory classification instrument. Second, it introduces human oversight as a first-class axis of taxonomy construction rather than treating it as an implementation detail or ethical addendum. Third, it situates the taxonomy within the specific European regulatory environment, evaluating each layer against the AI Act, PSD2, and DSA, none of which the cited prior work addresses. Prior work on cooperative multi-agent transactions [24] and mobile agent security [23] addressed operational integrity within agent systems but did not map those concerns to a regulatory liability framework. The AI capability spectrum literature [4, 2] addressed the range from mechanical to agentic AI but did not produce a commerce-specific taxonomy. The present work synthesises these contributions into a governance-oriented structure.

Taxonomy Construction and Layer Definition

The taxonomy is constructed through a structured analytic method that combines four steps: principal-type differentiation, oversight-axis specification, liability-axis specification, and regulatory instrument mapping. This section describes each of these steps in sufficient detail that the classification logic can be applied to novel deployment scenarios.

Step 1: Principal-type differentiation. A principal is defined, for purposes of this taxonomy, as the legal or natural person whose preferences the agent is authorised to advance and whose legal and financial obligations are created by the agent's actions. This definition draws on standard principal-agent theory and on the concept of mandate in EU contract law. The taxonomy admits four principal types: (a) individual natural persons acting as consumers, (b) legal entities acting as commercial counterparties, (c) automated systems or devices that are themselves operated by institutional principals but that issue instructions without human mediation, and (d) compound multi-agent systems in which the immediate counterparty is itself an agent acting for an upstream principal. Layers are distinguished by which principal type occupies the demand side of the transaction, the position from which authority and liability flow.
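
The four principal types and the rule that the demand-side principal determines the layer can be sketched as a small lookup. This is one reading of the step above, assuming a one-to-one correspondence between principal types (a) to (d) and the four layers; the enum and function names are hypothetical.

```python
from enum import Enum


class PrincipalType(Enum):
    CONSUMER = "a"           # individual natural person acting as a consumer
    LEGAL_ENTITY = "b"       # legal entity acting as a commercial counterparty
    AUTOMATED_SYSTEM = "c"   # device or system operated by an institution
    COMPOUND_AGENT = "d"     # agent acting for an upstream principal


class Layer(Enum):
    B2C = 1
    B2B = 2
    M2M = 3
    AGENT_TO_AGENT = 4


# Assumed one-to-one mapping from the demand-side principal to the layer.
LAYER_BY_DEMAND_SIDE = {
    PrincipalType.CONSUMER: Layer.B2C,
    PrincipalType.LEGAL_ENTITY: Layer.B2B,
    PrincipalType.AUTOMATED_SYSTEM: Layer.M2M,
    PrincipalType.COMPOUND_AGENT: Layer.AGENT_TO_AGENT,
}


def classify_layer(demand_side: PrincipalType) -> Layer:
    """Assign a transaction to a layer from its demand-side principal type."""
    return LAYER_BY_DEMAND_SIDE[demand_side]
```

Keying the lookup on the demand side rather than on the agent's technology mirrors the functional, relational definition of the layers.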

Step 2: Oversight axis specification. Human oversight is treated as a graded property rather than a binary one. The taxonomy measures oversight along two dimensions: (a) structural availability, that is, whether the architecture of the system provides a point at which a human can intervene before a transaction commits; and (b) operational frequency, that is, whether a human in fact exercises oversight at that point in the transaction flow. Taken together, these two dimensions distinguish four oversight states: human-in-the-loop (HITL), human-on-the-loop (HOTL), human-at-the-boundary (HATB), and human-absent (HA). HITL requires human approval of each transaction. HOTL means a human monitors agent behaviour and can halt a transaction stream but does not approve each individual transaction. HATB means a human sets the operating parameters of the agent at system configuration time but has no intervention capacity during live operation. HA means no human principal is available at any operational stage; the system runs entirely within pre-specified constraints and institutional governance structures.
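
The four oversight states can be sketched as a classifier over observable properties of a deployment. The three boolean fields below are an assumption introduced for illustration; the taxonomy itself defines the states in prose, and a real assessment would rest on an architecture audit rather than on three flags.

```python
from dataclasses import dataclass
from enum import Enum


class Oversight(Enum):
    HITL = "human-in-the-loop"
    HOTL = "human-on-the-loop"
    HATB = "human-at-the-boundary"
    HA = "human-absent"


@dataclass(frozen=True)
class OversightProfile:
    approves_each_transaction: bool   # a human signs off before every commit
    can_halt_live_stream: bool        # a human monitors and can stop the agent
    sets_parameters_at_config: bool   # a human configures limits before go-live


def classify_oversight(profile: OversightProfile) -> Oversight:
    """Return the strongest oversight state the profile supports."""
    if profile.approves_each_transaction:
        return Oversight.HITL
    if profile.can_halt_live_stream:
        return Oversight.HOTL
    if profile.sets_parameters_at_config:
        return Oversight.HATB
    return Oversight.HA
```

The checks run from the strongest form of oversight to the weakest, so a deployment that offers both per-transaction approval and a halt facility is classified as HITL.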

Step 3: Liability axis specification. Each layer is characterised by the primary liability carrier, the party who bears legal responsibility for erroneous or harmful transactions. The taxonomy identifies three liability positions: the deploying institution (operator liability), the platform through which agents transact (platform liability), and the upstream principal by whose authority the agent operates (principal liability). Where European regulatory instruments assign liability explicitly, those assignments are recorded. Where they do not, the gap is noted as a structural deficiency.

Step 4: Regulatory instrument mapping. For each layer, the taxonomy records which European regulatory instruments are applicable: the EU AI Act (risk classification, conformity assessment, human oversight requirements), PSD2/PSD3 (SCA requirements, payment institution licensing, liability for unauthorised transactions), the Digital Services Act (transparency obligations for recommender systems), the eIDAS Regulation (electronic identification and authentication for contract formation), and AMLD6 (anti-money laundering obligations in payment contexts).
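
The instrument list in this step can be held as a small lookup table. The sketch below records only the instruments and concerns named in the paragraph above; the function name and the keyword-matching approach are hypothetical conveniences, not part of the taxonomy.

```python
# Instruments and the concerns recorded for each, as listed in Step 4.
INSTRUMENTS = {
    "EU AI Act": [
        "risk classification",
        "conformity assessment",
        "human oversight requirements",
    ],
    "PSD2/PSD3": [
        "SCA requirements",
        "payment institution licensing",
        "liability for unauthorised transactions",
    ],
    "Digital Services Act": [
        "transparency obligations for recommender systems",
    ],
    "eIDAS Regulation": [
        "electronic identification and authentication for contract formation",
    ],
    "AMLD6": [
        "anti-money laundering obligations in payment contexts",
    ],
}


def instruments_for(keyword: str) -> list[str]:
    """Return instruments with at least one concern containing the keyword."""
    needle = keyword.lower()
    return [
        name
        for name, concerns in INSTRUMENTS.items()
        if any(needle in concern.lower() for concern in concerns)
    ]
```

A per-layer mapping would add a second table keyed by layer; it is left out here because the applicable set for each layer is argued in the layer descriptions rather than fixed in this step.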

Layer selection criteria. The four layers were selected on the basis that each represents a qualitatively distinct principal type, a distinct oversight architecture, and a distinct regulatory liability regime. Layers were not defined by technology (e.g., LLM vs. rule-based agent) because the same transaction function can be performed by multiple technologies, and a technology-based taxonomy would require revision with each new capability release. The layer definition is therefore functional and relational rather than technical.

Four Layers of Agentic Commerce

Layer 1: B2C Agentic Commerce

Structural characteristics. In B2C agentic commerce, an autonomous software agent acts on behalf of an individual consumer. The agent may perform product search, price comparison, preference elicitation, cart population, and payment initiation. The consumer is the primary principal; the agent operates within constraints the consumer specifies, either explicitly through preference settings or implicitly through behavioural history. The agent's authority is bounded by a delegation scope defined at enrolment time.

Principal relationships. The consumer delegates authority to the agent, which in turn interacts with merchant systems. A second principal relationship exists between the consumer and the platform or service provider that deploys the agent. This creates a three-party principal chain: consumer, agent operator, agent. The agent operator bears a contractual obligation to the consumer and, under the AI Act and DSA, transparency obligations regarding the agent's decision logic. Merchant systems, by accepting agent-initiated orders, implicitly recognise the agent's authority to bind the consumer.

Oversight pattern. B2C agentic commerce is structurally compatible with both HITL and HOTL oversight. Many current deployments require consumer confirmation at checkout, preserving HITL. Emerging systems move toward HOTL, where the agent completes transactions below a pre-approved value threshold without per-transaction confirmation. The PSD2 SCA requirement currently applies at the payment step; where a consumer has pre-authorised a payment mandate or set up a standing order, the agent can trigger payment within that mandate without fresh SCA. The legal boundary of this pre-authorisation mechanism is actively contested in European regulatory discussions.
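
The threshold-based move from HITL to HOTL can be sketched as a check the agent runs before payment initiation. The mandate fields and function name are hypothetical, and the sketch deliberately says nothing about whether acting within such a mandate satisfies SCA, which the paragraph above identifies as contested.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ConsumerMandate:
    per_transaction_limit: Decimal       # pre-approved value threshold
    allowed_merchants: frozenset[str]    # merchants covered by the mandate


def requires_confirmation(
    mandate: ConsumerMandate, merchant: str, amount: Decimal
) -> bool:
    """True when the agent must return to the consumer before paying."""
    if merchant not in mandate.allowed_merchants:
        return True
    return amount > mandate.per_transaction_limit
```

Transactions that fail either check fall back to per-transaction confirmation, which keeps the consumer in the loop for anything outside the delegated scope.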

European examples. European grocery and subscription retail platforms have deployed agents that autonomously replenish household goods when stock falls below user-defined thresholds, triggering SEPA direct debit payments within pre-authorised mandates. Personal finance management applications regulated as payment initiation service providers (PISPs) under PSD2 deploy agents that automatically move funds between accounts to optimise yield or avoid overdraft fees. In both cases, the agent operates within a mandate structure, but the mandate's scope for autonomous action is rarely specified with the precision that liability-clear operation requires.

Liability allocation. Under PSD2 Article 73, where a payer did not authorise a payment transaction and notifies the payment service provider without undue delay, liability for that unauthorised transaction rests with the payment service provider. The notification prerequisite matters for agent-initiated payments: where the agent, rather than the consumer, initiates the payment, the question of whether the transaction is authorised within the meaning of PSD2 turns on whether the consumer's pre-authorisation of the agent constitutes valid SCA. If it does not, the notification-and-refund mechanism of Article 73 is triggered, but the allocation of that refund obligation among the PSP, the agent operator, and the consumer has not been resolved by European Banking Authority guidance as of the period of analysis.

Electronic recommendation agents and preference construction. Research on electronic recommendation agents in digital marketplaces demonstrates that agent-mediated choice environments do not merely reflect pre-existing consumer preferences; they actively construct and stabilise those preferences [15]. This finding is material to liability analysis because it means that an agent acting on inferred preferences may generate purchases the consumer would dispute as unauthorised, yet the consumer's interaction history with the agent's interface may constitute the very evidence used to assert authorisation. The DSA's transparency requirements for recommender systems address disclosure obligations but do not resolve the authorisation question.

Layer 2: B2B Agentic Commerce

Structural characteristics. In B2B agentic commerce, an autonomous agent acts on behalf of a legal entity, typically within a procurement, logistics, treasury, or supply-chain management context. The agent receives authority through internal governance structures rather than individual consumer consent. Transactions are typically higher in value, governed by pre-negotiated framework contracts, and subject to enterprise risk management policies that constrain agent authority.

Principal relationships. The principal chain is more complex than in B2C. The enterprise is the ultimate principal; internal governance structures (procurement policy, treasury mandates, delegated authority matrices) define the agent's operating envelope. The enterprise typically deploys the agent through a commercial platform provider, creating a tripartite relationship: enterprise, platform operator, agent. Where the agent transacts with a supplier's automated system, a fourth party enters the transactional relationship. For purposes of this taxonomy, the supplier's automated system is classified as a Type (c) principal, an automated system operated by an institutional principal that issues instructions without human mediation, and therefore occupies the supply-side counterpart position rather than a position within the demand-side principal chain. This classification preserves the taxonomy's four principal types while acknowledging the structural complexity of the interaction. The COMPOSITION project [26] demonstrated this multi-stakeholder architecture in an industrial context, integrating agent marketplaces with enterprise resource planning systems and machine learning modules.

Oversight pattern. B2B deployments characteristically operate under HOTL or HATB oversight. Human procurement officers set policy parameters; the agent executes within those parameters without per-transaction approval. Oversight is exercised through audit logs, exception reports, and periodic human review rather than real-time intervention. This structure is operationally efficient but creates a latency between the occurrence of a policy-violating transaction and its detection. In high-velocity procurement contexts, this latency can result in a volume of committed transactions that is commercially difficult to unwind.
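
The cost of detection latency can be made concrete with a back-of-envelope estimate. The formula and every parameter name below are illustrative assumptions, not figures from any deployment.

```python
def exposure_before_detection(
    orders_per_hour: float,
    violation_rate: float,
    detection_latency_hours: float,
    average_order_value_eur: float,
) -> float:
    """Estimate the value of policy-violating orders committed before review.

    Assumes a constant order rate and a constant share of violating orders.
    """
    violating_orders = orders_per_hour * detection_latency_hours * violation_rate
    return violating_orders * average_order_value_eur
```

With hypothetical inputs of 120 orders per hour, a 5% violation rate, a 24-hour review cycle, and an average order value of EUR 2,000, the estimate is EUR 288,000 of committed orders awaiting unwinding. Halving the review cycle halves the exposure, which is the operational argument for shortening the audit interval in high-velocity contexts.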

European examples. European automotive manufacturers operate supplier integration platforms on which agent systems automatically issue purchase orders when production schedules and inventory models indicate need. European pharmaceutical distributors use agent-mediated tender platforms to match demand for regulated substances with licensed suppliers, with compliance verification automated against regulatory databases. European logistics networks employ agents to book transport capacity on spot markets, where prices are volatile and speed of commitment determines cost.

Liability allocation. Under the agency rules applied across EU member states, an agent acting within the scope of its authority binds the principal. The critical question is whether an enterprise's internal configuration of an agent's authority envelope constitutes a legally effective mandate that binds the enterprise vis-à-vis third-party suppliers. The eIDAS Regulation provides a framework for electronic identification and authentication but does not address the specific case of machine-generated contract formation without human signature at the moment of commitment. This creates a gap in which contracts formed by B2B agents may be challenged on grounds of formation defect.

Layer 3: Machine-to-Machine (M2M) Agentic Commerce

Structural characteristics. M2M agentic commerce is the interaction layer at which devices or sensors transact autonomously with other devices, triggering commercial obligations, without human mediation at the interaction layer. The IoT infrastructure enabling this layer encompasses a range of communication protocols and device management architectures [1]. Transactions at this layer include autonomous procurement of cloud compute resources, automatic replenishment of industrial consumables triggered by sensor readings, and device-to-device energy trading in smart grid contexts.

Principal relationships. The immediate transacting parties are machines, but the authorising principal is invariably an institution, the operator who deployed and configured the device network. The principal-chain structure is therefore: institutional operator, device or sensor network, individual device executing the transaction. The operator's authority over the agent is exercised entirely at design and configuration time. Once deployed, the device network operates under its pre-specified parameters without human mediation at the transactional level.

Oversight pattern. The primary oversight state at the M2M layer is HATB. The institutional operator sets the operating parameters, value thresholds, permitted counterparty classes, and transaction limits at system configuration time; no structural mechanism permits per-transaction human review during live operation. HOTL is available only as a degraded fallback, exercised through monitoring dashboards and remote shutdown facilities, rather than as a primary oversight mode. This distinction matters for regulatory classification: the AI Act's human oversight requirements, when applied to M2M systems, engage at configuration time, not at transaction time, and an architecture audit must therefore examine parameter-setting procedures and shutdown protocols rather than real-time approval workflows.
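
The HATB pattern with a HOTL fallback can be sketched as a device agent whose limits are fixed at construction and whose only live control is a shutdown flag. The class, field, and method names are hypothetical; the sketch illustrates the oversight structure and is not a device-management API.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class DeviceConfig:
    """Parameters set by the operator at configuration time (immutable)."""
    max_value: Decimal
    permitted_counterparty_classes: frozenset[str]
    max_transactions_per_day: int


class DeviceAgent:
    def __init__(self, config: DeviceConfig) -> None:
        self._config = config
        self._count_today = 0
        self._halted = False

    def remote_shutdown(self) -> None:
        """HOTL fallback: the operator stops all further commitments."""
        self._halted = True

    def try_commit(self, counterparty_class: str, value: Decimal) -> bool:
        """Commit only if the request fits the configured envelope."""
        cfg = self._config
        if self._halted:
            return False
        if counterparty_class not in cfg.permitted_counterparty_classes:
            return False
        if value > cfg.max_value:
            return False
        if self._count_today >= cfg.max_transactions_per_day:
            return False
        self._count_today += 1
        return True
```

There is no per-transaction approval hook in the class, which is the point: an audit of this architecture would examine how `DeviceConfig` is set and who can call `remote_shutdown`, not a real-time approval workflow.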

Security frameworks for mobile and distributed agent systems emphasise fault tolerance and transaction atomicity [23] rather than human intervention capacity, because the transaction velocity and geographic distribution of M2M systems make real-time human review operationally infeasible. The resulting risk profile has a long tail: low-probability, high-consequence events in which a misconfigured or compromised device network generates a large volume of erroneous transactions before the institutional operator can act through the HOTL fallback channel.

European examples. European smart grid pilots operating within ENTSO-E frameworks include device-level energy trading in which prosumer devices automatically sell surplus generation to aggregators, with settlement processed through payment infrastructure at the network boundary. European manufacturing platforms deploying industrial IoT systems enable automated procurement of raw materials and machine maintenance services triggered by condition-monitoring data. In both cases, transaction initiation occurs entirely within the device network; regulated payment rails are engaged only at the settlement layer.

Infrastructure note on SEPA Instant. The settlement layer for M2M transactions increasingly targets SEPA Instant Credit Transfer as the payment mechanism. Mandatory SEPA Instant reachability was established by Regulation (EU) 2024/886 with staggered compliance deadlines: euro-area PSPs must be able to receive instant credit transfers from January 2025 and to send them from October 2025, with later deadlines for non-euro-area PSPs. Operators designing M2M settlement architectures should therefore treat SEPA Instant as an infrastructure target that is becoming universally reachable within the European payments area on a defined schedule, not as a uniformly available substrate at present.

Liability allocation. The institutional operator bears full liability for M2M transactions because the devices have no independent legal personality. The AI Act's product safety-adjacent provisions and the proposed Product Liability Directive are the most directly applicable instruments, though neither was drafted with autonomous transactional devices in mind. AML obligations under AMLD6 apply to the payment service provider processing the settlement, not to the device initiating the commercial obligation, creating a structural gap in which the commercial commitment and the payment obligation are subject to different regulatory regimes.

Layer 4: Agent-to-Agent Marketplaces

Structural characteristics. Agent-to-agent marketplaces are environments in which multiple autonomous agents, acting for potentially different principal types, discover counterparties, negotiate terms, and settle transactions through market mechanisms including auctions, multi-issue negotiation protocols, and reinforcement learning-based bidding strategies. The marketplace itself may be operated by a platform provider or may be structured as a decentralised autonomous organisation governed by smart contract logic [21]. Economic regime identification and prediction within these marketplaces has been studied as a mechanism for improving agent decision quality [27].
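
As one concrete instance of the auction mechanisms mentioned above, a sealed-bid second-price auction can be sketched in a few lines. The choice of this mechanism is an illustration introduced here, not a mechanism the cited work is said to use, and the function name is hypothetical.

```python
def second_price_auction(bids: dict[str, float]) -> tuple[str, float]:
    """Return (winning agent, price paid) under second-price rules.

    The highest bidder wins and pays the second-highest bid. Ties are
    broken in favour of the agent that appears first in the mapping.
    """
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price
```

Even in this simple mechanism the marketplace operator fixes design choices, such as the tie-break rule, that determine outcomes for participants, which is the kind of parameter the liability discussion for this layer is concerned with.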

Principal relationships. Each agent in the marketplace represents an upstream principal, which may be a consumer, an enterprise, or an institutional device operator. The marketplace operator, whether a platform company or a DAO governance structure, is itself a principal with respect to the rules and mechanisms it enforces. The compounding of principal chains, where each agent in a multi-agent negotiation may represent a principal chain of its own, creates the deepest information asymmetry of any taxonomy layer and the most complex liability attribution problem.

Oversight pattern. Human oversight at this layer is structurally confined to the boundary-setting role. Principals set agent strategies, value functions, and participation constraints prior to marketplace entry. Once agents are operating within the marketplace, the negotiation process proceeds without human intervention. Research on multi-agent learning in energy resource marketplaces [25] demonstrates that agents can converge to Nash equilibria without human guidance, but also that convergence properties depend on marketplace mechanism design in ways that marketplace participants, including their human principals, may not fully anticipate.
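
The boundary-setting role can be illustrated with a minimal sketch. The sealed-bid second-price auction below is an illustrative choice of mechanism, not the mechanism studied in [25], and the names AgentMandate, bid, and second_price_auction, together with all figures, are hypothetical. Principals fix a valuation and a spending cap before entry, the marketplace operator fixes the reserve price, and no human acts once the auction runs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AgentMandate:
    """Parameters a principal fixes before marketplace entry (hypothetical)."""
    principal_id: str
    valuation: float  # value function reduced to a single number
    max_bid: float    # participation constraint: hard spending cap


def bid(mandate: AgentMandate) -> float:
    """The agent bids its valuation, truncated by the principal's cap."""
    return min(mandate.valuation, mandate.max_bid)


def second_price_auction(
    mandates: List[AgentMandate], reserve: float = 0.0
) -> Optional[Tuple[str, float]]:
    """Return (winning principal, price paid), or None if no bid meets the reserve.

    The winner pays the second-highest qualifying bid, or the reserve if it
    is the only qualifying bidder. Ties are broken by principal_id order.
    """
    qualifying = sorted(
        ((bid(m), m.principal_id) for m in mandates if bid(m) >= reserve),
        key=lambda pair: (-pair[0], pair[1]),
    )
    if not qualifying:
        return None
    winner = qualifying[0][1]
    price = qualifying[1][0] if len(qualifying) > 1 else reserve
    return winner, price


# Hypothetical mandates set by three principals before entry.
mandates = [
    AgentMandate("P1", valuation=100.0, max_bid=80.0),
    AgentMandate("P2", valuation=90.0, max_bid=90.0),
    AgentMandate("P3", valuation=70.0, max_bid=100.0),
]
print(second_price_auction(mandates))                # ('P2', 80.0)
print(second_price_auction(mandates, reserve=85.0))  # ('P2', 85.0)
```

In this sketch P1 holds the highest valuation but loses because of its own cap, and the operator's reserve alone raises P2's price from 80 to 85. Both outcomes follow from parameters fixed before the auction ran, which is the sense in which oversight at this layer is confined to boundary-setting.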

European examples. Combinatorial auction mechanisms have been applied to distributed resource scheduling in industrial contexts [14], and the COMPOSITION project demonstrated agent marketplace integration in enterprise settings [26]. European energy balancing markets represent the most regulated live context for multi-agent trading, though the degree to which current deployments rely on full agent autonomy versus human-approved strategy execution varies and is not publicly documented at granular resolution. Decentralised autonomous organisations operating on Ethereum-based infrastructure have been analysed in terms of governance platform structure [21]; European DAO deployments in commercial contexts exist but remain early-stage.

Liability allocation. Agent-to-agent marketplaces present the most acute liability gap. Where two agents, each representing a different principal, negotiate and commit to a transaction, the question of which principal is bound and under what conditions the commitment can be disputed has no settled answer in EU law. The platform or DAO operator's liability for marketplace design choices, including mechanism parameters that systematically disadvantage certain participant classes, is not addressed by current EU regulatory instruments.

Oversight Architecture and Principal-Agent Alignment

The four-layer taxonomy reveals a structural gradient: as the principal type becomes more distal from the transacting agent, human oversight recedes from a structural requirement to a design-time parameter. This gradient is a logical consequence of the principal-chain architecture at each layer, independent of specific implementation choices. The discussion that follows identifies the mechanisms that produce this gradient, explains why certain oversight models fail at scale, and maps the cross-layer governance gaps that European regulation must address.

The information asymmetry gradient. Principal-agent theory establishes that information asymmetry between principal and agent increases when the agent's action space is large, when the principal cannot observe the agent's actions in real time, and when the mapping from agent action to principal outcome is complex. All three conditions worsen as the taxonomy moves from B2C to agent-to-agent layers. At the B2C layer, the consumer can in principle observe the agent's output at the transaction confirmation screen; the action space is bounded by a specific product category or service; and the outcome, a purchase, is directly legible. At the M2M layer, the institutional operator cannot observe individual device transactions in real time across a large device network; the action space spans multiple product categories and counterparties; and the mapping from device decision to financial exposure is mediated by several technical and contractual layers. The information asymmetry at the agent-to-agent layer is compounded by the fact that each agent is itself operating within its own incomplete information environment, as demonstrated by research on trustworthiness evaluation in e-marketplaces [17], making the aggregate behaviour of the marketplace genuinely opaque to any individual principal.

Why HITL oversight fails at scale. Human-in-the-loop oversight is operationally viable only when transaction frequency is low enough that a human can review each transaction within the time window required for the commercial opportunity to remain valid. At the B2C layer, per-transaction confirmation is feasible for large purchases but creates unacceptable friction for high-frequency, low-value replenishment transactions. At the B2B layer, the volume of procurement transactions in a large enterprise exceeds any plausible human review capacity. At the M2M and agent-to-agent layers, transaction frequency and the latency requirements of the underlying markets render HITL oversight technically incompatible with market participation. The consequence is that human oversight, the control mechanism that the AI Act invokes most prominently, is precisely the instrument that degrades fastest as agentic commerce scales.
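
The capacity argument reduces to simple arithmetic, sketched below. The function names and every figure in the scenarios are hypothetical assumptions chosen for illustration; none is drawn from the paper's sources, and the calculation ignores queueing effects.

```python
def reviewers_needed(tx_per_hour: float, review_seconds: float) -> float:
    """Reviewer-hours of work arriving per hour under per-transaction approval.

    Plain utilisation arithmetic: queueing delay, breaks, and shift patterns
    are ignored, so the result is a lower bound on staffing.
    """
    if tx_per_hour < 0 or review_seconds <= 0:
        raise ValueError("tx_per_hour must be >= 0 and review_seconds > 0")
    return tx_per_hour * review_seconds / 3600.0


def hitl_viable(tx_per_hour: float, review_seconds: float,
                reviewers: int, validity_window_seconds: float) -> bool:
    """HITL is viable only if staff keep pace and one review fits the window."""
    keeps_pace = reviewers_needed(tx_per_hour, review_seconds) <= reviewers
    fits_window = review_seconds <= validity_window_seconds
    return keeps_pace and fits_window


# Hypothetical scenarios: (tx/hour, seconds per review, reviewers, window in s)
scenarios = {
    "B2C large purchase": (2, 60, 1, 3600),
    "B2B procurement": (50_000, 30, 10, 3600),
    "M2M or agent-to-agent": (10, 30, 5, 1),
}
for name, params in scenarios.items():
    print(name, hitl_viable(*params))
```

The third scenario fails even though staffing is ample: a thirty-second review cannot fit inside a one-second validity window, so the constraint there is latency and not volume.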

Why HOTL oversight creates detection latency risk. The HOTL model, in which a human monitors agent behaviour and can halt a transaction stream, is structurally dependent on two conditions: the monitoring system's ability to detect anomalous patterns in near real time, and the human operator's ability to act on that detection before the damage is irreversible. In payment contexts, the settlement leg of a SEPA Instant Credit Transfer is completed within seconds and is not reversible through the standard payment channel. The SEPA Instant Credit Transfer scheme rulebook provides a Recall procedure (one of the scheme's R-transactions) through which a payer's PSP may request return of funds after settlement, but recall is a request rather than a guarantee of recovery: the beneficiary's PSP may decline to return the funds in certain circumstances, and the mechanism does not automatically restore the status quo ante. An anomaly detected thirty seconds after settlement therefore initiates a recovery process with uncertain outcome, not an automatic reversal. In B2B procurement, purchase orders accepted by automated supplier systems may trigger production runs before a human operator detects the anomaly in a monitoring dashboard. HOTL oversight is therefore a necessary but not sufficient control mechanism; it requires pairing with reversibility provisions, either contractual (cancellation windows) or technical (settlement holds), that current European payment infrastructure supports only partially and under defined procedural conditions.
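
A technical reversibility provision of the kind described above could take the form of a pre-settlement hold queue. The sketch below is a hypothetical design, not a feature of SEPA Instant or of any existing payment rail: the class names, the 60-second hold, and the amounts are all assumptions. Time is passed in explicitly so that the behaviour is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeldPayment:
    payment_id: str
    amount_eur: float
    submitted_at: float  # seconds on the caller's clock


@dataclass
class SettlementHold:
    """Hypothetical hold queue that gives a HOTL operator time to intervene."""
    hold_seconds: float
    pending: Dict[str, HeldPayment] = field(default_factory=dict)
    halted: bool = False

    def submit(self, payment: HeldPayment) -> None:
        """Agent-initiated payment enters the hold instead of settling."""
        self.pending[payment.payment_id] = payment

    def halt(self) -> List[HeldPayment]:
        """Operator stops the stream; everything still held is cancelled."""
        self.halted = True
        cancelled = list(self.pending.values())
        self.pending.clear()
        return cancelled

    def release_due(self, now: float) -> List[HeldPayment]:
        """Release payments whose hold has elapsed; after this they are final."""
        if self.halted:
            return []
        due = [p for p in self.pending.values()
               if now - p.submitted_at >= self.hold_seconds]
        for p in due:
            del self.pending[p.payment_id]
        return due


hold = SettlementHold(hold_seconds=60.0)
hold.submit(HeldPayment("p1", 120.0, submitted_at=0.0))
hold.submit(HeldPayment("p2", 9500.0, submitted_at=45.0))
released = hold.release_due(now=60.0)  # p1 has aged 60 s and is released
cancelled = hold.halt()                # operator intervenes; p2 never settles
print([p.payment_id for p in released], [p.payment_id for p in cancelled])
```

The design choice is a trade-off: the hold window converts detection latency into a recoverable delay, but any payment released before the anomaly is detected is as irreversible as before, and the hold itself forfeits the immediacy that motivates instant settlement.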

Platform intermediation and concentration risk. The abstract's finding that cross-layer platform operators accumulate systemic risk and market power at a rate that current regulatory instruments do not match warrants detailed elaboration here. Two-sided markets generate network effects that produce concentration dynamics in which a leading platform captures an increasing share of transaction flow [5, 6]. In agentic commerce, this dynamic intensifies when platform operators serve multiple taxonomy layers simultaneously, providing B2C agent services to consumers while also operating the B2B marketplace through which enterprise agents transact. Such operators accumulate two distinct data advantages: consumer preference signals from the B2C layer and enterprise procurement signals from the B2B layer. The combination of these two signals allows a cross-layer operator to anticipate demand at the enterprise level using consumer-side data, and to construct consumer-side offers informed by enterprise-side supply commitments, an informational position that single-layer operators structurally cannot replicate. Research on distributed ledger intermediation [19] confirms that the promise of disintermediation through decentralised mechanisms does not eliminate concentration; it relocates concentration to the protocol design and governance layer. Decentralised autonomous organisations, as demonstrated in comparative analysis of Ethereum-based governance platforms [21], replicate concentration within their token-weighted voting structures. The concentration risk identified in this paper therefore requires monitoring across both conventional platform architectures and DAO-governed marketplace architectures.

Winner-take-most dynamics and the cross-layer amplification mechanism. The term winner-take-most, as developed in the platform economics literature [5, 6], refers to the tendency of network effects and data advantages to concentrate transaction flow in markets where switching costs are non-trivial and multi-homing is constrained. In agentic commerce, the cross-layer amplification mechanism operates as follows: a platform that achieves scale at the B2C layer accumulates a consumer preference dataset large enough to train superior agent recommendation and execution models; those models are then deployed at the B2B layer, where the platform's agents achieve better price and term outcomes than competitors relying on smaller training sets; superior B2B outcomes attract additional enterprise clients, whose procurement data further refines the consumer-facing models. This feedback loop accelerates concentration without any single transaction or design choice that competition authorities can identify as an exclusionary act. Market-share metrics applied within a single taxonomy layer do not detect this cross-layer feedback. The specific monitoring instrument required is settlement-flow concentration analysis by agent-platform operator identity, measured across taxonomy layers simultaneously.
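
One possible operationalisation of settlement-flow concentration analysis is sketched below. The Herfindahl-Hirschman index is used here as a familiar concentration measure; it is an assumption of this sketch, not a metric the paper prescribes, and the operator labels, flows, and the 25 per cent threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Flow = Tuple[str, str, float]  # (operator, taxonomy layer, settled EUR)


def layer_shares(flows: Iterable[Flow]) -> Dict[str, Dict[str, float]]:
    """Each operator's share of settled value within each layer.

    Assumes every layer present in the data has positive total volume.
    """
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for operator, layer, eur in flows:
        totals[layer][operator] += eur
    return {
        layer: {op: v / sum(ops.values()) for op, v in ops.items()}
        for layer, ops in totals.items()
    }


def hhi(shares: Dict[str, float]) -> float:
    """Herfindahl-Hirschman index on shares expressed as fractions of 1."""
    return sum(s * s for s in shares.values())


def multi_layer_operators(shares_by_layer: Dict[str, Dict[str, float]],
                          min_share: float) -> List[str]:
    """Operators holding at least min_share in more than one layer."""
    counts: Dict[str, int] = defaultdict(int)
    for shares in shares_by_layer.values():
        for op, share in shares.items():
            if share >= min_share:
                counts[op] += 1
    return sorted(op for op, n in counts.items() if n > 1)


# Hypothetical settlement flows.
flows = [
    ("A", "B2C", 40.0), ("B", "B2C", 30.0), ("C", "B2C", 30.0),
    ("A", "B2B", 40.0), ("D", "B2B", 60.0),
]
shares = layer_shares(flows)
for layer, s in shares.items():
    print(layer, round(hhi(s), 2))
print(multi_layer_operators(shares, min_share=0.25))
```

Within each layer, operator A appears as one competitor among several and is not even the largest participant at the B2B layer. Only the cross-layer view identifies A as the sole operator with a substantial position in both layers, which is the position the feedback loop exploits.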

Cross-layer governance gaps. Three specific regulatory gaps are identified across the taxonomy layers. First, no EU instrument specifies the conditions under which an autonomous agent's transaction initiation constitutes legally valid authorisation under PSD2 SCA for any principal type. This gap affects all four layers but is most acute at the M2M and agent-to-agent layers, where no human is available to satisfy SCA at the moment of transaction. Second, the AI Act's high-risk AI classification does not explicitly include autonomous commercial agents that generate binding contracts, creating uncertainty about whether conformity assessment obligations apply to agent-to-agent marketplace operators. The phased application timeline of the Act, with high-risk provisions operative from August 2026, provides a defined window within which this classification question must be resolved if the conformity assessment pathway is to be functional at the point of legal obligation. Third, the DSA's transparency requirements for recommender systems address consumer-facing systems but do not extend to B2B agent systems or to the mechanism design of agent-to-agent marketplaces, leaving the largest information asymmetry regimes outside the transparency regime.

Agent argumentation and preference manipulation. Research on agent-to-agent argumentation in product search [28] and on negotiation agent usability [22] establishes that agents engaged in multi-issue negotiation do not merely execute pre-specified strategies; they adapt to counterparty behaviour in ways that their principals may not have anticipated. When this adaptive behaviour occurs within a marketplace that itself has mechanism design properties favouring certain participant classes, as multi-agent learning research in model energy markets demonstrates [25], the alignment between agent behaviour and principal intent degrades progressively. The regulatory implication is that principal authorisation given at system configuration time does not remain a valid proxy for principal approval of specific transactions that result from adaptive agent behaviour during live operation. No current EU instrument addresses this temporal decay of principal alignment.

The social commerce foundation and its agentic extension. The evolution of social commerce demonstrates that digital commerce platforms successfully integrate social signals into purchase decisions [10], and that AI-driven marketing strategies can be calibrated across a spectrum of mechanical, analytical, and empathetic AI capabilities [4]. The agentic extension of these models, in which the agent both recommends and executes, compounds the preference-construction effect identified in recommendation agent research [15] with the commitment-generation capacity of transactional agency. The result is a system in which the consumer's or enterprise's preferences at the moment of transaction may differ materially from those at the moment of agent authorisation, and in which the agent's adaptive behaviour, rather than a human decision, determines which preferences govern.

Conclusion: Toward Layered Governance of Agentic Commerce

This paper has constructed a four-layer taxonomy of agentic commerce, organised by principal type and human oversight architecture, and evaluated each layer against the European regulatory landscape. The taxonomy identifies structural properties of each layer that carry direct implications for how rules should be framed and enforced.

The central finding is that European regulation as currently constituted addresses the symptom of autonomous AI action, by requiring human oversight as an abstract principle, without specifying the structural conditions under which oversight is operationally meaningful at each taxonomy layer. The AI Act requires human oversight for high-risk AI systems but does not define what constitutes meaningful oversight when transaction velocity renders per-transaction review impossible, and its high-risk provisions do not apply until August 2026, leaving a defined interim period during which agentic commerce deployments operate against an incomplete regulatory standard. PSD2 requires SCA but does not specify how SCA obligations migrate when the payment-initiating party is a software agent. The DSA imposes recommender system transparency but does not extend that transparency obligation to the negotiation mechanisms of agent-to-agent marketplaces.

Regulatory bodies should use this taxonomy in three specific ways. First, the taxonomy provides a classification instrument for determining which regulatory instruments apply to a given agentic commerce deployment. An operator seeking to determine whether its system falls under PSD2, the AI Act, or both can use the layer classification, principal type, and oversight state to identify the relevant obligations. Second, the taxonomy identifies the specific points at which new regulatory instruments are required: a legal standard for agent-mediated payment authorisation under PSD2/PSD3; a conformity assessment pathway for autonomous commercial agents under the AI Act, ideally specified before the August 2026 application date for high-risk provisions; and a transparency obligation for marketplace mechanism design in agent-to-agent settings. Third, the taxonomy provides a basis for calibrated audit requirements, recognising that HITL oversight generates a natural audit trail in the form of human approval records, while HATB and HA oversight require technical audit mechanisms, specifically immutable transaction logs with principal-attribution metadata, to perform the equivalent function.
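
A transaction log with principal-attribution metadata can be approximated with a hash chain, as in the minimal sketch below. This is an assumption-laden illustration: the field names, identifiers, and the oversight_state label are hypothetical, and a hash chain makes a log tamper-evident, not strictly immutable, since it reveals alteration without preventing it.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def _digest(body: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()


def append_entry(log: List[Dict[str, Any]], principal_id: str, agent_id: str,
                 oversight_state: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Append a transaction record linked to the previous entry's hash."""
    body = {
        "principal_id": principal_id,        # who authorised the agent
        "agent_id": agent_id,                # which agent acted
        "oversight_state": oversight_state,  # label from the paper's taxonomy
        "payload": payload,                  # JSON-serialisable transaction data
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    entry = dict(body, entry_hash=_digest(body))
    log.append(entry)
    return entry


def verify_chain(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


log: List[Dict[str, Any]] = []
append_entry(log, "enterprise-17", "agent-a", "HA", {"amount_eur": 250.0})
append_entry(log, "enterprise-17", "agent-a", "HA", {"amount_eur": 90.0})
print(verify_chain(log))
```

Each entry records which principal's authorisation the agent acted under, so an auditor can trace a transaction back to the configuration-time decision without a per-transaction human approval record.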

The concentration risk identified across layers requires regulatory instruments calibrated to the cross-layer amplification mechanism described in Section 6. Competition authorities monitoring European digital markets should extend their data collection instruments to cover cross-layer platform operators, specifically those whose infrastructure spans more than one taxonomy layer. The cross-layer feedback loop, in which consumer-layer data advantages are converted into enterprise-layer competitive advantages and back, is not captured by market-share metrics applied within a single layer. Settlement-flow data from PSD2 API providers and, as it becomes universally reachable under Regulation (EU) 2024/886, from SEPA Instant, aggregated by agent-platform operator identity across taxonomy layers, would provide the empirical basis for detecting whether winner-take-most dynamics, as characterised in the platform economics literature [5, 6], are already concentrating transaction flow in European agentic commerce verticals.

The scalability of transparency is a specific technical and legal challenge that the taxonomy resolves into concrete sub-problems. At high transaction volumes, individual transaction disclosure is neither technically feasible nor informative for principal oversight. Audit mechanisms adequate to the M2M and agent-to-agent layers must therefore operate at the level of strategy and mechanism disclosure: the parameter sets within which agents operate, the value functions that govern their bidding behaviour, the counterparty selection criteria encoded at configuration time, and the mechanism design choices made by marketplace operators. Each of these disclosure objects corresponds to a specific point in the principal chain at which a human made an authorising decision, and each therefore represents an auditable record of principal intent against which agent behaviour can be evaluated. Developing the specific data formats, retention periods, and regulatory access rights for such mechanism-level audit logs requires coordination between the European Banking Authority, the AI Office established under the AI Act, national competition authorities, and payment infrastructure operators. The taxonomy provides the structural map for that coordination by identifying, at each layer, which principal type owns the relevant disclosure obligation and which regulatory instrument provides the enforcement basis.

Limitations

This taxonomy is subject to the following specific limitations, each of which is grounded in an identifiable evidentiary gap.

International transaction flows are excluded. The taxonomy is constructed within a European regulatory perimeter. Agentic commerce transactions that cross between European and non-European regulatory regimes, including transactions processed through non-EU payment rails or involving agents deployed in third-country jurisdictions, are not classified by the taxonomy. The specific evidentiary gap is the absence of a cross-border principal-chain analysis for agentic systems; this omission is material because many European enterprise agent deployments interact with supplier agents in non-EU jurisdictions.

Hybrid agent architectures are not fully addressed. Several commercially deployed systems exhibit characteristics of more than one taxonomy layer simultaneously. A consumer-facing agent that also participates in a B2B procurement marketplace on the consumer's behalf, for example, occupies the B2C and agent-to-agent layers concurrently. The taxonomy does not currently provide a combinatorial classification for such hybrid architectures, and the liability allocation logic at layer boundaries requires further development.

Threshold effects between oversight states are not specified. The taxonomy identifies four oversight states (HITL, HOTL, HATB, HA) but does not specify the transaction frequency or value thresholds at which each state becomes operationally untenable. The evidentiary gap is the absence of empirical data on the operational capacity limits of human oversight in live European agentic commerce deployments.

Real-time enforcement capacity is not evaluated. The taxonomy identifies where regulatory instruments apply but does not assess whether European supervisory authorities have the technical infrastructure to monitor and enforce those instruments in real time against agentic commerce systems operating at machine speed. National competent authorities' current supervisory tooling was built for human-initiated transaction flows, and the evidentiary gap is the absence of public documentation on supervisory technology capacity.

The corpus does not include live European transaction data. The analysis draws on academic literature and regulatory text. No live transaction-volume data from European agentic commerce deployments is available in the corpus, which means that assertions about commercial scale and market structure are structural inferences rather than empirically verified measurements.

Future Research Directions

Several concrete research directions would refine the taxonomy and improve its operational utility for regulators and practitioners.

Empirical transaction-volume mapping. Settlement-flow data from SEPA Instant and PSD2 open banking API providers, disaggregated by initiating party type (human vs. agent-mediated), would allow the taxonomy's layer boundaries to be grounded in observed transaction patterns rather than structural inference. Access to such data requires a specific data-sharing instrument, potentially modelled on the EBA's existing supervisory reporting frameworks, extended to capture agent-initiated payment flows as a distinct category.

SCA compatibility analysis. A legal and technical analysis of whether existing PSD2 SCA mechanisms, particularly payment mandates and pre-authorised standing orders, can be interpreted to cover autonomous agent-initiated transactions at each taxonomy layer would directly address the most acute regulatory gap identified in this paper. This analysis requires cooperation between the European Banking Authority, national competent authorities, and payment infrastructure operators.

Cross-layer concentration monitoring. Longitudinal market-share analysis of European agent-platform operators, specifically measuring settlement-flow concentration across taxonomy layers by operator identity, would test the concentration hypothesis advanced in this paper. The specific instrument required is a regulatory reporting obligation for agent-platform operators spanning multiple taxonomy layers, submitted to national competition authorities and the European Commission.

Mechanism design to payment rail integration. Pilot integration of combinatorial auction or multi-agent reinforcement learning mechanisms with a PSD2-compliant API or SEPA Instant rail, conducted within an EU regulatory sandbox, would test whether the theoretical maturity of agent negotiation mechanisms [14, 18, 25] translates to operationally deployable and regulatorily compliant systems. Results from such a pilot would directly address the mechanism-design-to-payment-rail gap identified throughout this paper.

References

[1] Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., & Ayyash, M. (2015). Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications. Institute of Electrical and Electronics Engineers.

[2] Dwivedi, Y. K., Hughes, L., Ismagilova, E., Aarts, G., Coombs, C., & Crick, T. (2019). Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. Elsevier BV.

[3] Dwivedi, Y. K., Ismagilova, E., Hughes, D. L., Carlson, J., Filieri, R., & Jacobson, J. (2020). Setting the future of digital and social media marketing research: Perspectives and research propositions. Elsevier BV.

[4] Huang, M.-H., & Rust, R. T. (2020). A strategic framework for artificial intelligence in marketing. Springer Science+Business Media.

[5] Rysman, M. (2009). The Economics of Two-Sided Markets. American Economic Association.

[6] Liebowitz, S. J., & Margolis, S. E. (1994). Network Externality: An Uncommon Tragedy. American Economic Association.

[8] Maes, P., Guttman, R., & Moukas, A. (1999). Agents that buy and sell. Association for Computing Machinery.

[9] Chavez, A., & Maes, P. (1997). Kasbah: An Agent Marketplace for Buying and Selling Goods.

[10] Wang, C., & Zhang, P. (2012). The Evolution of Social Commerce: The People, Management, Technology, and Information Dimensions. Association for Information Systems.

[12] Hein, A., Weking, J., Schreieck, M., Wiesche, M., Böhm, M., & Krcmar, H. (2019). Value co-creation practices in business-to-business platform ecosystems. Springer Science+Business Media.

[13] Muzellec, L., Ronteau, S., & Lambkin, M. (2015). Two-sided Internet platforms: A business model lifecycle perspective. Elsevier BV.

[14] Kutanoglu, E., & Wu, S. D. (1999). On combinatorial auction and Lagrangean relaxation for distributed resource scheduling. Taylor & Francis.

[15] Häubl, G., & Murray, K. B. (2003). Preference Construction and Persistence in Digital Marketplaces: The Role of Electronic Recommendation Agents. Elsevier BV.

[16] Chavez, A., Dreilinger, D., Guttman, R., & Maes, P. (1997). A real-life experiment in creating an agent marketplace. Springer Science+Business Media.

[17] Zhang, J., & Cohen, R. (2008). Evaluating the trustworthiness of advice about seller agents in e-marketplaces: A personalized approach. Elsevier BV.

[18] Rahwan, I., Kowalczyk, R., & Pham, H. H. (2002). Intelligent agents for automated one-to-many e-commerce negotiation.

[19] Zamani, E. D., & Giaglis, G. M. (2018). With a little help from the miners: distributed ledger technology and market disintermediation. Emerald Publishing Limited.

[21] Faqir-Rhazoui, Y., Arroyo, J., & Hassan, S. (2021). A comparative analysis of the platforms for decentralized autonomous organizations in the Ethereum blockchain. Springer Science+Business Media.

[22] Chen, E. E., Vahidov, R., & Kersten, G. E. (2005). Agent-supported negotiations in the e-marketplace. Inderscience Publishers.

[23] Vogler, H., Kunkelmann, T., & Moschgath, M.-L. (1997). An approach for mobile agent security and fault tolerance using distributed transactions.

[24] Chen, Q., & Dayal, U. (2000). Multi-agent Cooperative Transactions for E-Commerce. Springer Science+Business Media.

[25] Narasimha, D., Lee, K., Kalathil, D., & Shakkottai, S. (2022). Multi-Agent Learning via Markov Potential Games in Marketplaces for Distributed Energy Resources. IEEE Conference on Decision and Control.

[26] Bonino, D., & Vergori, P. (2017). Agent Marketplaces and Deep Learning in Enterprises: The COMPOSITION Project. Annual International Computer Software and Applications Conference.

[27] Gini, M. L., & Ketter, W. (2007). Identification and prediction of economic regimes to guide decision making in multi-agent marketplaces.

[28] Huang, S.-L., & Lin, C.-Y. (2010). The search for potentially interesting products in an e-marketplace: An agent-to-agent argumentation approach. Expert Systems with Applications.