The European agentic commerce stack

Abstract

Agentic commerce systems, in which software agents autonomously discover, negotiate, and settle transactions, require continuous coordination across at least six infrastructure layers: foundation models, agent frameworks, identity infrastructure, payment rails, data access mechanisms, and the application layer. The degree of European control over each layer determines both the competitive capability of European operators and the practical reach of European regulatory authority. This paper develops a 0-to-4 sovereignty scoring methodology applied to each layer, operationalizing control through four criteria: authority over technical standards, vendor lock-in risk, availability of regulatory veto points, and access to source code or training data. Scoring is grounded in publicly documented architectural dependencies, regulatory instrument coverage, and market structure evidence drawn from the agentic commerce and digital sovereignty literature. The central finding is that European autonomy is real at the payment rails layer (score: 3) and the identity infrastructure layer (score: 3), nominal at the data access layer (score: 2) and the application layer (score: 2), and structurally absent at the foundation model layer (score: 1) and the agent framework layer (score: 1). The paper interprets this pattern, identifies the mechanisms that produce nominal autonomy, and specifies the investment and governance commitments required to advance scores across the full range, from structurally absent to nominal, from nominal to real, and from real to full sovereignty. Implications for European policymakers, compliance practitioners, and infrastructure investors are drawn in the conclusion.

Framing European Autonomy in Agentic Commerce

The emergence of agentic commerce, defined here as the class of systems in which large language model-based agents execute multi-step commercial workflows autonomously on behalf of human or organizational principals, introduces a structural challenge that prior European digital policy did not anticipate. Earlier regulatory instruments, including the General Data Protection Regulation, the Payment Services Directive 2 (PSD2), and the Digital Services Act, were designed around a model of discrete, human-initiated interactions in which a person makes a request, a service responds, and liability attaches to identifiable actors at clearly demarcated points. Agentic systems dissolve each of those assumptions: the agent discovers services without human prompting, negotiates terms without human review at each step, and settles transactions programmatically against delegated credentials [11]. The regulatory architecture governing those actions is, at best, ambiguous.

The question of regulatory authority in agentic commerce is inseparable from the question of technical control. A European supervisory authority that lacks access to the model weights generating agent decisions, the framework orchestrating agent behavior, or the settlement infrastructure finalizing agent transactions cannot, as a practical matter, audit, interrupt, or hold accountable those systems. Formal jurisdiction and operational sovereignty are distinct conditions, and the gap between them is the subject of this paper.

The concept of digital sovereignty has structured European Commission strategy documents since at least the 2020 European Data Strategy, but prior analyses have treated it primarily as a matter of data localization and cloud infrastructure residency [9]. That framing is insufficient for agentic commerce, where the relevant dependencies extend from the probabilistic behavior of a foundation model through the orchestration logic of an agent framework, the cryptographic provenance of identity credentials, the settlement finality of a payment rail, the governance structure of a data access layer, and the competitive dynamics of an application market. Each layer presents a distinct control surface, and the European position on each differs substantially.

This paper addresses that gap by constructing and applying a layer-by-layer sovereignty scoring framework to the European agentic commerce stack. The framework assigns each layer a score from 0 to 4, where 0 denotes complete foreign dependency and 4 denotes full European control. Scores are assigned against four criteria: authority over the technical standards that govern the layer, vendor lock-in risk, the existence of enforceable regulatory veto points, and access to source code or training data. The methodology is described in full in the Methodology section; the scores and supporting evidence are presented in the Results section.

The paper makes three contributions. First, it provides a reproducible, criterion-anchored scoring methodology for assessing digital sovereignty at the infrastructure layer, applicable beyond the European context. Second, it documents the specific architectural features that produce nominal rather than real autonomy at two of the six layers under analysis, namely data access and the application layer. Third, it draws a direct line from the scored pattern of dependencies to the concrete governance and investment commitments that would be required to alter them, rather than treating sovereignty as an aspirational goal without operational content.

The remainder of the paper proceeds as follows. The Motivation section connects the analysis to current regulatory pressures and recent documented incidents of stack-level dependency. The Related Work section positions this contribution against prior scholarship on digital sovereignty, AI governance, and payment infrastructure. The Methodology section defines the scoring criteria and decision rules. The Results section presents layer-by-layer scores with supporting evidence. The Discussion section interprets the pattern of scores, identifies the mechanisms producing nominal autonomy, and traces the configuration traps that European operators face. The Conclusion specifies the governance commitments required. Limitations and Future Work sections close the analytical portion, followed by references.

Why Autonomy Matters Now

Three converging pressures make the sovereignty question urgent in 2024-2025 rather than merely prospective.

Regulatory instruments are reaching their design limits. The EU AI Act entered into force on 1 August 2024, but its obligations apply in staggered phases: provisions on prohibited practices became applicable from February 2025, rules governing general-purpose AI models from August 2025, and the full high-risk system regime from August 2026. Even as those phases activate, the Act was constructed around the premise that AI systems produce discrete outputs that human actors review before consequential action is taken. The agentic transition breaks this premise at the architectural level. When an agent browses a marketplace, selects a vendor, commits funds from a delegated payment credential, and confirms delivery against a proof-of-task-execution mechanism, no human review step separates any of these actions [11]. The European AI Office, established under the Act to govern general-purpose AI models, holds supervisory authority over model providers, but model providers are predominantly non-European actors whose compute infrastructure, training data, and weight update processes sit outside European jurisdiction [10]. The formal veto point exists; the operational capacity to exercise it is constrained by exactly the layer dependencies this paper scores.

Geopolitical AI competition has accelerated the divergence between capability and regulatory reach. The frontier of foundation model capability has advanced at a rate that European public and private investment has not matched at scale. European model initiatives, including open-weight releases from Mistral AI and prior large-scale research efforts, represent genuine contributions but have not closed the capability gap at the very top of the performance distribution where agentic commerce systems are increasingly deployed. Because agentic commerce agents are sensitive to reasoning capability, tool-use reliability, and instruction-following consistency, operators in competitive markets face economic pressure to deploy the highest-capability model available, irrespective of its provenance [2]. The consequence is that the market selection mechanism works against European stack preferences at precisely the layer where European control is weakest.

Platformization of financial transactions creates cross-layer dependency. The transformation of money into transactional data, documented in the literature on payment platform economics [4], means that the payment rail is no longer a terminal settlement event but a data-generating node embedded in a broader platform architecture. When a non-European platform controls the distribution channel through which an agentic commerce application reaches end users, the payment event occurring within that application generates data that flows to platform infrastructure outside European jurisdiction, regardless of whether the payment instrument itself is European-controlled. This is the two-sided market mechanism applied to agentic commerce: the platform that controls cross-side externalities also captures the informational surplus of every transaction [1]. The European Payment Initiative (EPI), whose Wero product had launched in Germany, France, Belgium, and the Netherlands as of 2024-2025 but had not yet achieved pan-European rollout, and the digital euro project represent direct responses to this dynamic, but both face maturity constraints that limit their near-term capacity to anchor sovereign agentic settlement [6], [12].

The stake for European operators is concrete. An operator that deploys an agentic commerce system on non-European foundation models, orchestrates it with a non-European agent framework, and distributes it through a non-European application marketplace makes contractual and regulatory commitments to European data subjects that rest on infrastructure it does not control and cannot independently audit. That is the operational meaning of nominal sovereignty, and it is the condition this paper is designed to measure.

Layered Sovereignty in Digital Infrastructure

The concept of digital sovereignty has migrated from political science into technical governance discourse over the past decade, accumulating a variety of definitions that differ in what they take as the relevant unit of analysis: the state, the firm, the individual, or the infrastructure layer. Pohle, Nanni, and Santaniello [9] provide the most rigorous critical mapping of this migration, tracing how sovereignty claims in digital policy have shifted across geopolitical, commercial, and individual registers without settling on a stable operational meaning. Their analysis establishes that digital sovereignty is contested precisely because it is instrumentalized differently by state actors, platform firms, and civil society. What this paper takes from their work is the importance of specifying the unit of analysis and the criteria for control before making any sovereignty claim. Where their analysis stops, however, is at the conceptual level: they do not produce a scored assessment of specific infrastructure dependencies, and they do not address the agentic layer.

The data governance literature provides a complementary foundation. Micheli, Ponti, Craglia, and Berti Suman [3] identify four emerging archetypes for non-corporate data governance: data sharing pools, data cooperatives, public data trusts, and personal data sovereignty models. Their typology is directly relevant to the data access layer scored in this paper. They frame these models as emerging rather than operationally dominant, a characterization that informs the score assigned to the European data access layer in the Results section. Otto, ten Hompel, and Wrobel [5], in their work on designing data spaces, develop the technical architecture of federated, consent-governed data sharing, as implemented in the GAIA-X and International Data Spaces initiatives. The present analysis treats that architectural work as evidence of European capability at the data access layer, while acknowledging the gap between architectural specification and operational adoption.

On the economics of platform control, Rysman's [1] foundational treatment of two-sided markets establishes the mechanism by which dominant platforms extract surplus from both sides of a transaction network. That mechanism is central to this paper's interpretation of nominal sovereignty: a European payment instrument embedded in a non-European platform distribution channel inherits the platform's cross-side externality capture regardless of the payment instrument's regulatory provenance. Westermeier [4] extends this analysis to the specific case of financial transactions, arguing that platformization converts payment events into data assets, shifting competitive advantage from settlement efficiency to data accumulation. These two bodies of work together establish the structural reason why payment rail sovereignty, even where scored high, is insufficient to guarantee autonomy at the application layer.

On monetary infrastructure and CBDCs, Bindseil, Panetta, and Terol [6] and Dionysopoulos, Marra, and Urquhart [7] provide the primary analytical treatments of CBDC design constraints. Bindseil et al. address functional scope and control mechanisms; Dionysopoulos et al. provide the critical review that identifies scalability and privacy concerns. Peneder [8] situates digital currency within the longer history of money as a social technology, providing the conceptual grounding for why settlement finality and programmability are sovereignty-relevant properties, not merely technical choices.

On agentic trust infrastructure and the design space of digital payments, this paper draws on the architectural analysis of verify-then-pay mechanisms and the systematization of CBDC payment system designs available in the working paper literature to assess what is absent from current European payment rail and identity infrastructure. These sources are identified in the References section with the caveat noted below.

On AI governance, analysis of the transition from the EU AI Act to potential European AI Agency structures identifies the current European AI Office as structurally constrained in its capacity to govern general-purpose AI at the pace of frontier model development [10]. This paper uses that analysis to assess the regulatory veto point criterion at the foundation model and agent framework layers. Mohamed, Png, and Isaac [2] contribute the decolonial framing, which this paper uses to observe that the same structural dependencies that constitute technological colonialism in one analytical register constitute sovereignty deficits in another, without conflating the two arguments.

No prior work scores European infrastructure control on a unified 0-to-4 scale across all six layers of an agentic commerce stack, and no prior work draws out the consequences of that pattern for nominal versus real autonomy in agentic transaction systems. This paper fills that gap.

Note on sources [10], [11], [12], and [13]: These references carry arXiv identifiers and 2026 publication dates that cannot be verified against currently retrievable records. They are retained in the reference list as submitted by the study authors for disclosure purposes, but readers should treat citations to these sources as provisional pending independent verification. Where this paper relies on the structural arguments attributed to those sources, it does so at the level of architectural mechanism rather than on the authority of the specific documents.

Sovereignty Scoring Methodology

Scale Definition

The sovereignty score for each layer is an integer on the interval [0, 4], assigned according to the following anchors:

0 (Complete Dependency): No European entity controls any material dimension of the layer. Standards, code, data, and operational decisions are set by non-European actors. European regulatory instruments have no enforceable reach into the layer's core operations.
1 (Weak Autonomy): European entities have marginal influence, limited to downstream configuration or application. One or more formal regulatory instruments nominally apply, but enforcement requires cooperation from non-European actors who can withdraw or constrain it. Substitution of non-European components is not operationally feasible within a policy-relevant timeframe (twelve to thirty-six months).
2 (Nominal Autonomy): European entities control the interface standards or distribution rules but not the underlying logic. Regulatory veto points exist and are exercisable, but the veto operates on outputs rather than on the generative mechanism. Competitive European alternatives exist at sub-frontier capability.
3 (Real Autonomy with Residual Risk): European entities control the primary standards, can substitute the dominant non-European component with a European alternative without material capability loss, and hold enforceable veto points over core operations. Residual dependency exists but does not constitute structural lock-in under current conditions.
4 (Full Sovereignty): European entities control all material dimensions, including standards authorship, primary implementation, training data or source code, and operational infrastructure. No structural dependency on non-European actors remains.

Six Layers Under Assessment

Layer 1: Foundation Models. The probabilistic reasoning substrate from which agent behavior emerges. Assessed on: which entities set benchmark standards, where training compute resides, whether European operators can inspect or retrain deployed weights, and whether regulatory instruments reach model training decisions.

Layer 2: Agent Frameworks. The orchestration software that decomposes goals into tool-use sequences, manages memory, routes to APIs, and governs multi-agent delegation. Assessed on: open-source availability, primary maintainer jurisdiction, European capacity to fork and maintain independently, and the enforceability of transparency requirements.

Layer 3: Identity Infrastructure. The mechanisms that authenticate agents, delegate credentials, and establish trust provenance for autonomous actions [11]. Assessed on: standards authorship (W3C Decentralized Identifiers, eIDAS2), national or European public key infrastructure, and the practical recognizability of European identity assertions in non-European settlement systems.

Layer 4: Payment Rails. The settlement infrastructure that finalizes autonomous transactions, including card schemes, instant payment systems, CBDC infrastructure, and programmable settlement primitives [12]. Assessed on: European ownership of clearing and settlement, SEPA governance, European instant payment mandate coverage, and the availability of programmable settlement suitable for agentic commerce.

Layer 5: Data Access. The governance mechanisms controlling training data, transactional data, and behavioral data generated by agent interactions [3]. Assessed on: GDPR enforcement capacity, European data space adoption, and the degree to which European data governance models can restrict non-European data accumulation from European agent transactions.

Layer 6: Application Layer. The distribution channels, marketplaces, and consumer-facing interfaces through which agentic commerce applications reach end users. Assessed on: DMA-mandated interoperability, European market share of dominant distribution platforms, and the extent to which European operators can reach users without passing through non-European gatekeepers.

Decision Rules

Each layer receives a binary score of 0 or 1 on each of the four sub-criteria (standards authority, lock-in risk, regulatory veto, code/data access); no half-point values are used. A value of 1 indicates the condition favorable to European control, so the lock-in risk criterion scores 1 where lock-in risk is low and 0 where it is high. Where a criterion is met for a dominant sub-component but not for the full layer, the conservative value of 0 is applied unless the sub-component is sufficiently central that awarding 1 does not overstate control at the layer level. The layer score is the sum of the four binary sub-scores, producing an integer from 0 to 4. Where evidence is absent or contested, the lower score is applied, so that evidence gaps lead to underestimating autonomy rather than overstating it. This conservative default is identified explicitly in the Results section wherever it applies.
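
The decision rule reduces to a small piece of arithmetic. The following Python sketch is one possible encoding and not the tooling used for the study: the function name layer_score, the dictionary keys, and the use of None to mark absent or contested evidence are illustrative assumptions. The example input uses the Layer 1 sub-scores reported in the Results section.

```python
from typing import Optional

# Hypothetical key names for the four sub-criteria defined in the methodology.
CRITERIA = (
    "standards_authority",
    "lock_in_risk",
    "regulatory_veto",
    "code_data_access",
)

# Scale anchors as given in the Scale Definition subsection.
LABELS = {
    0: "Complete Dependency",
    1: "Weak Autonomy",
    2: "Nominal Autonomy",
    3: "Real Autonomy with Residual Risk",
    4: "Full Sovereignty",
}


def layer_score(sub_scores: dict[str, Optional[int]]) -> int:
    """Sum the four binary sub-scores.

    A missing key or a None value stands for absent or contested evidence
    (an assumption of this sketch) and is counted as 0, the conservative default.
    """
    total = 0
    for criterion in CRITERIA:
        value = sub_scores.get(criterion)
        if value is None:
            value = 0  # conservative default: evidence gaps lower the score
        if value not in (0, 1):
            raise ValueError(f"{criterion} must be 0 or 1, got {value!r}")
        total += value
    return total


# Layer 1 (Foundation Models) sub-scores as reported in the Results section.
foundation_models = {
    "standards_authority": 0,
    "lock_in_risk": 0,
    "regulatory_veto": 1,
    "code_data_access": 0,
}

score = layer_score(foundation_models)
print(score, LABELS[score])  # 1 Weak Autonomy
```

Rejecting any value other than 0 or 1 enforces the rule that no half-point values are used, and treating missing evidence as 0 keeps the sum from overstating autonomy.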

Layer-by-Layer Sovereignty Scores and Evidence

Layer 1: Foundation Models (Score: 1, Weak Autonomy)

The four sub-criteria yield: standards authority 0, lock-in risk 0, regulatory veto 1, code/data access 0.

No European entity sets the benchmark standards by which frontier foundation models are evaluated and ranked. The dominant evaluation frameworks, infrastructure benchmarks, and model release conventions originate from US laboratories and open-source communities governed predominantly by US-based foundations. European model initiatives, including open-weight releases from Mistral AI and prior collaborative research programs, demonstrate that European actors can produce competitive models at mid-tier capability levels; they have not established benchmark-setting authority at the frontier tier relevant for agentic commerce deployment.

Lock-in risk at this layer is high. The economic and operational cost of switching from a frontier non-European model to a European alternative involves retraining agent pipelines, accepting capability regression on complex reasoning tasks, and re-validating all downstream workflows. Under current market conditions, that substitution is not feasible within the twelve-to-thirty-six-month policy-relevant timeframe defined in the methodology.

The single point awarded for regulatory veto reflects the EU AI Act's general-purpose AI model provisions, which impose transparency, capability evaluation, and systemic risk obligations on providers of models above defined compute thresholds. Those provisions, which became applicable from August 2025 under the Act's phased timeline, create a formal veto point: a model provider that cannot satisfy EU AI Office requirements cannot legally deploy within the EU. The practical exercise of that veto, however, requires technical capacity within the European AI Office to evaluate model behavior independently, and the current European AI Office is structurally constrained in this capacity [10]. The veto point exists in statute; its operational credibility depends on institutional build-out that remains in progress.

Code and data access score 0. European operators deploying closed frontier models have no access to weights, training data, or fine-tuning pipelines beyond what the provider offers through a commercial API. The decolonial AI literature [2] frames this access asymmetry as a structural condition: the entities that accumulated the compute and data required to train frontier models retain positional advantage until a European actor matches that accumulation, and current European investment levels do not close that gap within a policy-relevant horizon.

Layer 2: Agent Frameworks (Score: 1, Weak Autonomy)

The four sub-criteria yield: standards authority 0, lock-in risk 0, regulatory veto 0, code/data access 1.

The dominant agent orchestration frameworks, including LangChain, AutoGen, and related tool-use scaffolding systems, are maintained primarily by US-based organizations and venture-backed entities. While most carry open-source licenses, the primary development roadmaps, plugin ecosystems, and model integration priorities are set by non-European actors. European entities can fork these frameworks under open-source terms, but sustaining a fork at parity with a rapidly advancing upstream requires continuous engineering investment that no European public or private institution has yet committed at the required scale.

Standards authority scores 0 for the same reason as Layer 1: no European body authors the conventions, APIs, or interoperability specifications that the dominant frameworks implement. Lock-in risk scores 0 because switching to an independently maintained European fork involves accepting divergence from the upstream plugin ecosystem and the model-integration libraries that most commercial deployments depend on.

The regulatory veto criterion scores 0 at the framework level. The EU AI Act's deployer obligations apply to organizations that build agentic applications on top of these frameworks, creating indirect leverage on application behavior, but the framework itself is not a regulated entity under current instrument design [10]. Imposing transparency or audit obligations on an open-source framework without a designated legal entity responsible for it is not operationally feasible under existing regulatory mechanisms. This is a genuine zero rather than a conservative rounding: no enforceable instrument currently reaches the framework core.

Code/data access scores 1 because the open-source licensing of the dominant frameworks gives European operators full access to the source code, including the right to inspect, fork, and deploy modified versions. This is a real, if currently underutilized, capability. The evidentiary gap is whether European operators are actively maintaining forks with independent security patching and feature development; absent that evidence, the 1 reflects legal access rather than operational independence, and the conservative default identified in the methodology applies to how this score is interpreted.

Layer 3: Identity Infrastructure (Score: 3, Real Autonomy with Residual Risk)

The four sub-criteria yield: standards authority 1, lock-in risk 1, regulatory veto 1, code/data access 0.

European identity infrastructure scores positively on three criteria. The eIDAS2 regulation establishes the European Digital Identity Wallet framework and gives European institutions primary standards authority over the digital identity layer for EU residents and legal entities. It is necessary to record, however, that as of the 2024-2025 scoring horizon, the eIDAS2 implementing acts and the reference wallet architecture remain under finalization, with the mandatory wallet availability deadline set under Article 5a for 2026. The score of 1 for standards authority reflects eIDAS2 as an enacted regulatory instrument that sets binding standards, while acknowledging that operational wallet deployment is a forward-looking condition rather than a currently observable fact. Scores drawn from this layer should be read with that caveat: European capability here is real as a legal and architectural commitment, but not yet verified at population scale.

W3C Decentralized Identifier standards, while global in origin, incorporate European contributions and are designed for jurisdictional neutrality, meaning that eIDAS2-compliant credentials can be issued and verified on infrastructure under European control. Zero-knowledge proof authentication mechanisms compatible with privacy-preserving European identity architectures are under development in the digital payment infrastructure research community [13].

Lock-in risk is low relative to other layers because identity infrastructure is federated by design: an eIDAS2-compliant wallet, once deployed, is interoperable across member states and does not require a single non-European intermediary for verification. Regulatory veto is strong: issuance, revocation, and recognition of European digital identity credentials are governed by EU law and enforced by national supervisory authorities. The residual risk that prevents the score from reaching 4 concerns agentic identity delegation. This is the cryptographic mechanism by which an agent proves it is authorized to act on behalf of a human principal in a specific commercial context, and it is not yet standardized within the eIDAS2 framework [11]. Proof-of-task-execution and delegation provenance are not covered by current European identity standards, and no implementing act or technical specification addresses them as of the scoring horizon.

Layer 4: Payment Rails (Score: 3, Real Autonomy with Residual Risk)

The four sub-criteria yield: standards authority 1, lock-in risk 1, regulatory veto 1, code/data access 0.

SEPA governs euro-denominated payment standards across the European Economic Area, and its governance structure is European. The SEPA Instant Credit Transfer scheme, now subject to a mandatory adoption timeline under the 2024 Instant Payments Regulation, provides a European-controlled low-latency payment channel. The European Payment Initiative, whose Wero wallet and payment system had achieved live deployment in Germany, France, Belgium, and the Netherlands as of 2024-2025 but had not yet extended to all EU member states, represents a European-originated consumer-facing payment product operating on European rails. Its current geographic footprint limits its role as a pan-European anchor, and this constraint informs the residual risk assessment at this layer. The digital euro project, if completed, would provide a European-controlled programmable settlement layer directly relevant to agentic commerce [6], [7].

Lock-in risk is low in the sense that European operators can process euro-denominated transactions end-to-end on European infrastructure without structural dependence on Visa, Mastercard, or US-domiciled bank networks, for domestic and intra-European flows. Regulatory veto is the strongest at this layer: the European Central Bank, national central banks, and the European Banking Authority collectively govern clearing, settlement finality, and payment service provider licensing within the EU.

The code/data access criterion scores 0 because the programmability gap is material. Current SEPA and European card infrastructure does not natively support the proof-of-task-execution, escrow, and conditional settlement primitives that agentic commerce requires [11], [12]. The digital euro's programmability architecture remains in design phases, and the maturity of privacy-enhancing settlement technologies is insufficient at current scale [6]. European payment rails are sovereign in the settlement sense; they are not yet sovereign in the programmability sense required for agentic commerce.
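
To make concrete what a conditional settlement primitive would have to do, the following Python sketch models a toy verify-then-pay escrow. It is an illustration under strong assumptions and does not describe any existing SEPA, card scheme, or digital euro interface: the Escrow class, its field names, and the use of a SHA-256 digest as a stand-in for a proof of task execution are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class EscrowState(Enum):
    HELD = "held"
    RELEASED = "released"
    REFUNDED = "refunded"


@dataclass
class Escrow:
    """Toy escrow: funds are held until a delivery check passes or fails."""

    payer: str
    payee: str
    amount_cents: int
    expected_digest: str  # SHA-256 hex digest agreed when the escrow is opened
    state: EscrowState = EscrowState.HELD

    def settle(self, delivered: bytes) -> EscrowState:
        """Release funds only if the delivered artifact matches the commitment.

        A digest comparison is a placeholder (assumption) for whatever
        proof-of-task-execution mechanism a real rail would verify.
        """
        if self.state is not EscrowState.HELD:
            raise RuntimeError(f"escrow already {self.state.value}")
        digest = hashlib.sha256(delivered).hexdigest()
        if digest == self.expected_digest:
            self.state = EscrowState.RELEASED
        else:
            self.state = EscrowState.REFUNDED
        return self.state


deliverable = b"order-123:delivered"
commitment = hashlib.sha256(deliverable).hexdigest()

ok = Escrow("buyer-agent", "seller-agent", 1999, commitment)
print(ok.settle(deliverable).value)  # released

bad = Escrow("buyer-agent", "seller-agent", 1999, commitment)
print(bad.settle(b"something else").value)  # refunded
```

The point of the sketch is the ordering: verification precedes the transfer of funds, and the settlement outcome is final once reached. A rail that can only execute unconditional credit transfers leaves that logic to an intermediary outside the rail.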

Layer 5: Data Access (Score: 2, Nominal Autonomy)

The four sub-criteria yield: standards authority 1, lock-in risk 0, regulatory veto 1, code/data access 0.

GDPR gives European institutions primary standards authority over personal data processing, and the Data Act extends this to non-personal data generated by connected devices and services. European data space initiatives, following the architectural framework developed in the GAIA-X and International Data Spaces programs [5], specify federated governance models that assert European control over data sharing terms. The regulatory veto is exercisable: GDPR enforcement authorities have imposed substantial penalties on non-European operators, and the Data Act creates mandatory data sharing obligations that apply to non-European providers operating in the EU market.

Lock-in risk, however, scores 0 because the training data and behavioral data generated by agentic commerce transactions accumulate in the infrastructure of the model and framework providers, which are predominantly non-European [2], [3]. Even where GDPR restricts certain processing, the data generated by European agent interactions flows into model improvement pipelines, API usage logs, and platform behavioral databases that are not subject to European operational control. The four alternative data governance models identified by Micheli et al. [3] (cooperatives, trusts, sharing pools, personal sovereignty) exist as governance designs but do not yet constitute operationally dominant alternatives to hyperscaler data accumulation.

Layer 6: Application Layer (Score: 2, Nominal Autonomy)

The four sub-criteria yield: standards authority 0, lock-in risk 0, regulatory veto 1, code/data access 1.

The Digital Markets Act designates large online platforms as gatekeepers and imposes interoperability, data portability, and non-discrimination obligations. This creates a regulatory veto point at the application distribution layer: European authorities can require gatekeepers to provide access to APIs and data on non-discriminatory terms. European operators can build and distribute applications, and open-source tooling provides code access for the application tier. The regulatory veto and code/data access criteria therefore each score 1.

Standards authority scores 0 because the dominant application distribution platforms, including major mobile operating systems and their associated marketplaces, set the technical and commercial standards that govern application behavior, and those standards are authored by non-European actors. Lock-in risk scores 0 because European agentic commerce applications that require distribution through non-European app stores or integration with non-European API ecosystems inherit the platform's dependency structure regardless of the application's own technical provenance. The two-sided market logic documented by Rysman [1] and the platformization analysis of Westermeier [4] identify this as a structural condition: the platform captures cross-side externalities from every transaction that flows through its distribution channel, converting nominal application-layer autonomy into effective platform dependency.
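
The arithmetic behind the two nominal-autonomy scores can be written out as a short sketch. The sub-criterion values are those reported for Layers 5 and 6 above, and the labels for scores 1 to 4 follow the terms used in the abstract. The type and function names are illustrative assumptions, and no label is supplied for score 0 because this part of the paper does not name one.

```python
from typing import NamedTuple


class SubCriteria(NamedTuple):
    # 1 = criterion resolved in Europe's favor, 0 = not
    standards_authority: int
    lock_in_risk: int
    regulatory_veto: int
    code_data_access: int


# Labels taken from the abstract's wording (illustrative mapping).
LABELS = {
    1: "structurally absent",
    2: "nominal",
    3: "real",
    4: "full sovereignty",
}


def layer_score(criteria: SubCriteria) -> int:
    """Sum four binary sub-criteria into a 0-to-4 layer score."""
    if any(value not in (0, 1) for value in criteria):
        raise ValueError("each sub-criterion must be 0 or 1")
    return sum(criteria)


LAYERS = {
    "data access": SubCriteria(1, 0, 1, 0),
    "application": SubCriteria(0, 0, 1, 1),
}

for name, criteria in LAYERS.items():
    score = layer_score(criteria)
    print(f"{name}: {score} ({LABELS[score]})")
```

The two layers reach the same total through different sub-criteria, which is why the integer score alone does not indicate where the dependency sits.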

Reading the Pattern: Real Autonomy vs. Nominal Autonomy

Why Payment and Identity Score High

The payment rails and identity layers score 3 for a common structural reason: both were constituted by deliberate, decades-long European public investment in standard-setting authority. SEPA is the product of public institution-building, not a market-emergent outcome. The eIDAS framework, and now eIDAS2, was authored by European institutions with explicit strategic intent to prevent identity infrastructure from becoming a privately controlled monopoly. The result is that European operators at these two layers build on infrastructure governed from within Europe: the clearing rules, the licensing conditions, the technical specifications, and the supervisory authority are all within the European institutional perimeter.

This is the mechanism that produces real autonomy: prior investment in the productive capacity of the layer, combined with prior investment in the regulatory apparatus surrounding it. The distinction matters because regulation without productive capacity is a veto without an alternative, and a veto without an alternative is structurally weaker than it appears. At the payment and identity layers, European institutions have built the alternative, which is why the regulatory veto is credible rather than nominal.

The residual risk that prevents both layers from reaching 4 is concentrated at the agentic-specific requirements that existing infrastructure does not yet cover. Standard eIDAS2 credentials do not include the delegation semantics required for an agent to prove, cryptographically, that its principal authorized a specific commercial action at a specific moment with specific constraints [11]. Standard SEPA infrastructure does not include the conditional escrow and proof-of-execution settlement required for multi-step agentic transactions [12]. These are peripheral deficiencies only in appearance; they are in fact the features that determine whether European payment and identity infrastructure can serve as the trust anchor for European agentic commerce, or whether that trust anchor migrates to whoever first builds the agentic-native layer on top of SEPA and eIDAS2. The risk is that non-European actors build that agentic layer, creating a new dependency at the application-adjacent tier that progressively hollows out the practical value of the sovereign infrastructure beneath it.

Why Foundation Models and Agent Frameworks Score Low

The score of 1 at both the foundation model and agent framework layers reflects a single underlying condition: European institutions did not invest at scale in the productive capacity of these layers before the current generation of systems was built. The result is that European regulatory authority must operate on outputs rather than on the generative mechanism.

The EU AI Act's general-purpose AI model provisions create transparency and evaluation obligations, but these obligations are satisfied by the model provider submitting documentation and test results to the European AI Office [10]. The European AI Office does not hold the weights, does not control the training process, and cannot independently retrain or modify a model that fails its requirements. Its enforcement options are market access denial and financial penalties, both of which are exercisable but neither of which constitutes operational control over the model. This is precisely the pattern the scoring rubric identifies as weak autonomy: formal regulatory instruments nominally apply, but enforcement requires cooperation from non-European actors who can, in principle, withdraw or restructure the relationship.

The decolonial AI framing [2] adds a further dimension to this analysis. The positional advantage of frontier model providers is a function of current compute investment and also of accumulated training data that represents the documented knowledge, creative output, and interaction patterns of populations who had no voice in the governance of that accumulation. European data protection law constrains new data collection; it does not reverse historical accumulation. The foundation model sovereignty deficit is, therefore, partially irreversible under any policy instrument currently in European hands.

What the Intermediate Scores Reveal: Liability Without Control

The scores of 2 at the data access and application layers reveal a structural condition that is more operationally deceptive than the scores of 1 at the foundation model and agent framework layers, because nominal compliance indicators mask structural dependency. At the bottom of the scale, European operators know they are dependent and can plan accordingly. At the nominal autonomy level, European operators may believe they are compliant and protected when the actual control structure does not support that belief.

A European operator deploying an agentic commerce application under GDPR, subject to DMA gatekeeper obligations, and using a European payment instrument is, by every observable compliance criterion, a European-controlled operation. However, the agent's reasoning is produced by a non-European model, the agent's orchestration runs on a non-European framework, the agent's behavioral data is accumulated in non-European infrastructure, and the agent's distribution channel is a non-European platform. The regulatory obligations the operator has accepted, in particular its commitments to data subject rights under GDPR, rest on infrastructure it cannot audit, interrupt, or substitute without material operational disruption.

This is liability without control: the operator has accepted European legal obligations for outcomes that are causally determined by infrastructure outside its governance perimeter. The two-sided market mechanism [1] and the platformization of financial transactions [4] are the specific structural features that produce this condition. Every transaction that flows through a non-European distribution channel generates data that is captured as platform surplus. The operator's European regulatory compliance does not prevent this capture; it merely governs what the operator itself does with the data it retains.

Configuration Traps

Certain stack configurations amplify the liability-without-control condition into what this paper terms a configuration trap: an architectural choice that is rational, commercially viable, and compliant for each operator taken individually, but that collectively produces a situation from which no European operator can unilaterally exit without competitive disadvantage.

The canonical configuration trap for European agentic commerce is as follows: a European operator deploys an agentic commerce system using a frontier non-European foundation model (Layer 1, score 1) on a non-European agent framework (Layer 2, score 1), distributed through a non-European application marketplace (Layer 6, score 2 with platform dependency), using European payment rails (Layer 4, score 3) and European identity credentials (Layer 3, score 3). The two high-scoring layers provide genuine European control over settlement and authentication. However, the settlement event is determined by a non-European reasoning process, the delegation chain is not cryptographically verified under European standards, the behavioral data generated by the transaction flows to non-European infrastructure, and the distribution channel captures cross-side surplus. The European payment and identity infrastructure provides a sovereign foundation for a non-sovereign superstructure.

This configuration is not hypothetical; it describes the current deployment posture of most European agentic commerce operators who have not made deliberate, costly choices to replace non-European components with European alternatives. The scoring pattern makes this trap legible: high scores at the payment and identity layers create the appearance of structural autonomy that the low and nominal scores at the reasoning and distribution layers do not support.

Reframing the Literature

The findings require a refinement of how the digital sovereignty literature treats European regulatory instruments. Pohle et al. [9] correctly identify the contested and multi-registered nature of digital sovereignty claims, but their analysis does not distinguish between layers of a unified stack. This paper's results show that the heterogeneity of European sovereignty within a single operational system is as significant as the heterogeneity across national claims in the broader political debate. A European operator can simultaneously hold real autonomy at two layers and structural dependency at two others within a single transaction event.

The implication for governance is that layer-by-layer assessment must become the standard unit of analysis for European digital sovereignty policy, replacing the current tendency to assess sovereignty at the level of the application or the firm. An application can be European-operated, European-regulated, and European-facing while remaining substantively non-European at the layers that determine its actual behavior and the disposition of the data it generates.

Sovereignty as a Layered Commitment

The central finding of this analysis is that European autonomy in the agentic commerce stack is real at two layers, nominal at two layers, and structurally absent at two layers. This distribution is not incidental to the current moment; it is the product of deliberate prior investment where European public institutions chose to build productive capacity (payment rails, identity infrastructure) and the absence of such investment where they did not (foundation models, agent frameworks). Regulatory instruments do not substitute for productive capacity. They can impose obligations on non-European actors, but the enforceability of those obligations degrades as the distance between the regulatory veto point and the operational mechanism increases.

Converting nominal autonomy to real autonomy at the data access layer requires three specific actions. First, European data space initiatives must advance from architectural specification to operational adoption, which means public procurement mandates requiring GAIA-X or International Data Spaces compliance for government-adjacent agentic systems. Second, the Data Act's mandatory data sharing provisions must be extended, through implementing regulation, to cover the behavioral data generated by agentic interactions running on commercial model APIs. Third, enforcement authorities must develop the technical capacity to audit data flows from European agent transactions into non-European model improvement pipelines; such audits currently depend on cooperation from the model provider, which the enforcement authority cannot compel.

Converting nominal autonomy to real autonomy at the application layer requires engagement with the DMA's interoperability provisions as an active instrument rather than a passive obligation. The European Commission must use gatekeeper designation proceedings to require that non-European application distribution platforms expose APIs sufficient for European agentic commerce applications to operate with full functionality outside the platform's native discovery and distribution infrastructure. Without this, the regulatory veto at the application layer remains theoretical: operators comply with DMA terms, but the platform's cross-side data capture persists.

Moving the foundation model and agent framework layers from score 1 toward score 2 requires sustained public investment in European model development at a scale commensurate with the computational demands of frontier-tier agentic systems, combined with mandatory transparency requirements that give the European AI Office independent technical evaluation capacity, rather than reliance on provider-submitted documentation. Advancing these layers to score 2 would require, at a minimum, that European operators can substitute a European model or framework without material capability regression, and that the European AI Office holds the technical instrumentation to audit weight behavior rather than documentation about it. Reaching score 3 would additionally require European training data commons, governed under the data governance archetypes identified by Micheli et al. [3], that provide European model initiatives with training resources not subject to non-European accumulation dynamics, and sustained commitment to maintaining European framework forks at parity with the upstream development pace.

The payment and identity layers require a specific and bounded extension. Advancing them from score 3 toward score 4 requires demonstrating that no structural dependency on non-European actors remains, as the methodology's full-sovereignty definition specifies. Two concrete gaps stand between the current score 3 and that condition. First, eIDAS2 must be extended, through implementing acts or technical specifications under Article 5a, to cover agentic delegation semantics: the cryptographic proof that a specific agent instance was authorized by a specific human principal for a specific commercial action within defined constraints, at a provable moment in time. Second, the digital euro's programmability architecture must reach the conditional escrow and proof-of-execution settlement layer that multi-step agentic transactions require [11], [12]. Beyond these two gaps, the score-4 definition also requires that operational infrastructure, including the clearing and settlement nodes processing agentic transactions, sits entirely within European jurisdiction and is governed by European public authority. The SEPA governance structure satisfies this condition for euro-denominated flows, but the digital euro's full deployment and the eIDAS2 wallet's population-scale rollout remain forward-looking conditions on the 2026-horizon timeline. Advancing to score 4 is therefore contingent not only on the engineering and legal work to close the delegation and programmability gaps, but on completing the infrastructure buildout that the regulatory instruments already mandate.
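
The delegation semantics described above can be illustrated with a minimal sketch of a signed mandate that binds a principal, an agent instance, an action, a spending constraint, and a validity window. This is not the eIDAS2 credential format. The `Mandate` fields and function names are hypothetical, and HMAC with a shared key stands in for the asymmetric, wallet-bound signature that a real scheme would need for third-party verification.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Mandate:
    principal_id: str
    agent_id: str
    action: str
    max_amount_cents: int
    valid_from: int   # Unix seconds
    valid_until: int  # Unix seconds


def _canonical(mandate: Mandate) -> bytes:
    """Deterministic byte encoding so the tag covers every field."""
    return json.dumps(asdict(mandate), sort_keys=True,
                      separators=(",", ":")).encode()


def sign(mandate: Mandate, key: bytes) -> str:
    # Assumption: HMAC-SHA256 as a placeholder for a real signature.
    return hmac.new(key, _canonical(mandate), hashlib.sha256).hexdigest()


def authorizes(mandate: Mandate, tag: str, key: bytes, *, agent_id: str,
               action: str, amount_cents: int, now: int) -> bool:
    """Check integrity first, then the delegated constraints."""
    if not hmac.compare_digest(sign(mandate, key), tag):
        return False
    return (
        mandate.agent_id == agent_id
        and mandate.action == action
        and amount_cents <= mandate.max_amount_cents
        and mandate.valid_from <= now <= mandate.valid_until
    )
```

A verifier that accepts a payment instruction only when `authorizes` returns true has evidence of which principal delegated what, to which agent, and within which limits. That evidence is the element that standard credentials do not currently carry.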

European policymakers should adopt the posture that sovereignty is a layered commitment requiring separate investment decisions at each layer, rather than a condition achieved by regulatory coverage of the stack as a whole. The scoring framework developed in this paper provides an instrument for tracking progress against that commitment with sufficient granularity to identify where investment is producing returns and where it is not.

Limitations of the Sovereignty Audit

The 0-4 scale collapses meaningful gradations within scores. A layer scoring 2 may sit close to the boundary with score 1 or close to the boundary with score 3, and the integer representation does not communicate this. The binary sub-criterion scoring (0 or 1 on each of four criteria) means that boundary cases are resolved by judgment rather than by a finer-grained instrument. The precision caveat here is not that publicly observable evidence is absent; the Results section draws on SEPA governance documents, GDPR enforcement records, and DMA gatekeeper proceedings as evidence for specific sub-criterion assignments. The limitation is rather that these sources do not yield continuous, cardinal measures of control intensity. For example, knowing that GDPR enforcement authorities have issued penalties against non-European operators establishes that the regulatory veto sub-criterion is satisfied, but does not quantify what fraction of non-compliant data flows the enforcement record actually reaches. Future iterations of this framework require quantitative proxies for each sub-criterion (for example, the percentage of model compute under European jurisdiction, or the number of active eIDAS2 wallet implementations at population scale) to replace judgment-based binary scoring with empirically anchored cardinal values.

The analysis does not account for M&A-driven score changes. Acquisitions of European model developers, identity infrastructure providers, or data space operators by non-European entities would immediately alter the relevant sovereignty scores without any change in the underlying technology. The framework is synchronic: it reflects the ownership and governance structure at the time of scoring. The specific evidentiary gap is the absence of a monitoring mechanism linked to competition authority filings, foreign direct investment screening decisions, and corporate registry changes that would trigger score reassessment.

Vendor roadmaps advance faster than policy cycles. A model provider that commits to European data residency, weight transparency, and third-party audit in its current terms of service may alter those terms within a policy-relevant timeframe. The regulatory veto point criterion assumes the veto is exercisable at the moment of assessment; it does not capture the dynamic in which the regulated entity restructures its offering to make the veto point inaccessible without formally withdrawing from the market. The specific evidentiary gap is contractual: terms of service and API access agreements are not public instruments subject to regulatory stabilization.

The analysis is synchronic, not longitudinal. Score trajectories over time, whether improving or degrading, are not captured. A layer that scored 1 five years ago and now scores 2 represents a different policy situation than a layer that was 3 and has regressed to 2. The specific evidentiary gap is the absence of historical sovereignty score time-series data, which this paper, being the first application of this methodology, cannot produce. Longitudinal analysis requires repeated application of the methodology at defined intervals.

Directions for Continuous Auditing

Four concrete directions would advance the scoring framework from a point-in-time audit to a living measurement instrument.

First, operationalize the sub-criteria into quantitative proxies. Each of the four sub-criteria (standards authority, lock-in risk, regulatory veto, code/data access) requires a measurable indicator that can be updated without repeating the full qualitative analysis. For standards authority, the relevant data source is standards body membership and voting records at ETSI, ISO, and W3C, combined with the European affiliation of working group chairs. For lock-in risk, the relevant instrument is a switching cost model applied to each layer, calibrated by vendor concentration data from European Commission market investigations and Digital Markets Act gatekeeper proceedings.
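
Two such indicators can be sketched briefly. The first is the share of working group chairs with a European affiliation, which follows the standards-authority proxy named above. The second is a Herfindahl-Hirschman concentration index over vendor market shares, which is an assumption introduced here as one conventional way to summarize vendor concentration data and is not a measure the paper itself prescribes. Function names and any input values are hypothetical.

```python
from typing import Sequence


def european_chair_share(chairs_are_european: Sequence[bool]) -> float:
    """Fraction of working group chairs with a European affiliation."""
    if not chairs_are_european:
        raise ValueError("no chairs recorded")
    return sum(chairs_are_european) / len(chairs_are_european)


def hhi(market_shares: Sequence[float]) -> float:
    """Herfindahl-Hirschman index on shares expressed as fractions.

    Returns a value in (0, 1]; 1.0 means a single vendor holds the
    whole market, and lower values mean less concentration.
    """
    if any(share < 0 for share in market_shares):
        raise ValueError("shares must be non-negative")
    if abs(sum(market_shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * share for share in market_shares)
```

Either indicator yields a cardinal value that can be updated as new records are published. Any threshold that maps it back onto the binary sub-criterion would itself be a judgment that needs to be documented.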

Second, expand the framework to include compliance chains. An agentic commerce transaction involves a chain of service providers, each of which introduces its own sovereignty profile. The current framework scores layers independently; a compliance chain analysis would score the aggregate sovereignty of a specific deployment configuration by composing the individual layer scores according to the weakest-link logic identified in the Discussion section's configuration trap analysis.
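
A minimal sketch of the weakest-link composition follows, using the six layer scores reported in this paper. The function names are illustrative, and taking the minimum is one simple reading of weakest-link logic; a full compliance chain analysis might weight layers or providers differently.

```python
# Layer scores as reported in this paper.
LAYER_SCORES = {
    "foundation model": 1,
    "agent framework": 1,
    "identity": 3,
    "payment rails": 3,
    "data access": 2,
    "application": 2,
}


def chain_score(layers_used: list[str]) -> int:
    """Aggregate sovereignty of a deployment: its lowest layer score."""
    if not layers_used:
        raise ValueError("a deployment must use at least one layer")
    return min(LAYER_SCORES[layer] for layer in layers_used)


def weakest_layers(layers_used: list[str]) -> list[str]:
    """Layers that determine the aggregate score."""
    floor = chain_score(layers_used)
    return [layer for layer in layers_used if LAYER_SCORES[layer] == floor]


full_stack = list(LAYER_SCORES)
print(chain_score(full_stack), weakest_layers(full_stack))
```

Under this composition, the configuration described in the configuration trap analysis scores 1 regardless of the two layers that score 3.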

Third, track score volatility following M&A events. In coordination with European foreign direct investment screening authorities and national competition authorities, the framework should be applied immediately following any acquisition that changes the effective jurisdiction of a material component at any layer. The method is event-triggered reassessment using the existing sub-criteria, producing a time-series of scores that captures the M&A-driven sovereignty trajectory of the European agentic stack.
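
Event-triggered reassessment can be sketched as an append-only history per layer, from which score changes between consecutive assessments are derived. The class and field names are hypothetical, and the sketch records only the outcome of each reassessment, not the qualitative analysis that produces it.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Assessment:
    on: date
    score: int
    trigger: str  # e.g. "baseline" or a reference to a screening decision


@dataclass
class LayerHistory:
    layer: str
    entries: list[Assessment] = field(default_factory=list)

    def reassess(self, on: date, score: int, trigger: str) -> None:
        """Append a new assessment; history must stay chronological."""
        if not 0 <= score <= 4:
            raise ValueError("score must be between 0 and 4")
        if self.entries and on < self.entries[-1].on:
            raise ValueError("assessments must be recorded in date order")
        self.entries.append(Assessment(on, score, trigger))

    def deltas(self) -> list[int]:
        """Score change between each pair of consecutive assessments."""
        scores = [entry.score for entry in self.entries]
        return [later - earlier for earlier, later in zip(scores, scores[1:])]
```

A negative entry in `deltas()` that coincides with an acquisition trigger is the M&A-driven volatility this direction aims to make visible.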

Fourth, integrate with regulatory enforcement data. European AI Office supervisory decisions, GDPR enforcement actions, and DMA gatekeeper compliance assessments generate public records of where regulatory veto points were exercised, with what effect, and against which actors. Systematic integration of these records into the regulatory veto sub-criterion would replace the current judgment-based assessment with an evidence-grounded measure of veto effectiveness over time.

References

[1] Rysman, M. (2009). The Economics of Two-Sided Markets. Journal of Economic Perspectives, 23(3), 125-143.

[2] Mohamed, S., Png, M.-T., & Isaac, W. (2020). Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence. Philosophy & Technology, 33(4), 659-684.

[3] Micheli, M., Ponti, M., Craglia, M., & Berti Suman, A. (2020). Emerging models of data governance in the age of datafication. Big Data & Society, 7(2).

[4] Westermeier, C. (2020). Money is data - the platformization of financial transactions. Information, Communication & Society, 23(14), 2047-2063.

[5] Otto, B., ten Hompel, M., & Wrobel, S. (2022). Designing Data Spaces. Springer.

[6] Bindseil, U., Panetta, F., & Terol, I. (2021). Central Bank Digital Currency: Functional Scope, Pricing and Controls. SSRN Working Paper.

[7] Dionysopoulos, L., Marra, M., & Urquhart, A. (2023). Central bank digital currencies: A critical review. International Review of Financial Analysis, 91, 103031.

[8] Peneder, M. (2021). Digitization and the evolution of money as a social technology of account. Journal of Evolutionary Economics, 32(1), 175-203.

[9] Pohle, J., Nanni, R., & Santaniello, M. (2024). Unthinking Digital Sovereignty: A Critical Reflection on Origins, Objectives, and Practices. Policy & Internet, 16(3).

[10] Pavlidis, G. (2026). From the AI Act to a European AI Agency: Completing the Union's Regulatory Architecture. arXiv preprint. [Note: This reference carries a 2026 date and an arXiv identifier that could not be verified against currently retrievable records as of the drafting of this paper. The citation is retained as submitted; readers should treat it as provisional pending independent verification.]

[11] Goenka, M., Pathak, T., & Asthana, S. (2026). TessPay: Verify-then-Pay Infrastructure for Trusted Agentic Commerce. arXiv preprint. [Note: This reference carries a 2026 date and an arXiv identifier that could not be verified against currently retrievable records as of the drafting of this paper. The citation is retained as submitted; readers should treat it as provisional pending independent verification.]

[12] Senn, J., Judmayer, A., Stifter, N., & Böhme, R. (2026). Systematization of Knowledge: The Design Space of Digital Payment Systems with Potential for CBDC. arXiv preprint. [Note: This reference carries a 2026 date and an arXiv identifier that could not be verified against currently retrievable records as of the drafting of this paper. The citation is retained as submitted; readers should treat it as provisional pending independent verification.]

[13] Mondal, S., & Chithralekha, T. (2026). Zero-Knowledge Proof (ZKP) Authentication for Offline CBDC Payment System Using IoT Devices. arXiv preprint. [Note: This reference carries a 2026 date and an arXiv identifier that could not be verified against currently retrievable records as of the drafting of this paper. The citation is retained as submitted; readers should treat it as provisional pending independent verification.]