Chapter 2: Neosis Axioms and Formal Model

2.1 Primitive Ingredients and State

Neosis is defined on top of a small collection of primitive ingredients that will be reused throughout the chapter. In this section, the goal is not to describe the full internal structure of a Neo, but to fix the basic objects and types—time, binary state, energy, and continuous parameters—that later sections will assemble into a complete formal model.

2.1.1 Time

All dynamics unfold in discrete time. We index ticks by $t \in \mathbb{N} = \{0,1,2,\dots\}$, with $t = 0$ denoting the initial configuration of the system. Each application of the update rules (perception, internal computation, reward, and mutation) advances the system from tick $t$ to tick $t+1$. Throughout the chapter, we will describe the behavior of Neos and the NeoVerse by specifying how relevant quantities change as a function of this tick index.

2.1.2 Binary State Substrate

The underlying state substrate of Neosis is binary. We write $\mathbb{B} = \{0,1\}$ for individual bits, and $\mathbb{B}^n$ for length-$n$ bit vectors. At any tick $t$, the internal memory of a Neo, its perceptual input, and its output will all be represented as elements of $\mathbb{B}^n$ for some finite $n$. The dimensionality $n$ is not fixed once and for all: it may change over time as the Neo gains or loses nodes through structural mutation. This choice keeps the local state space simple, while still allowing the overall system to grow in representational capacity.

2.1.3 Energy (Nex)

Each Neo maintains an energy budget, called Nex, which constrains its computation and evolution. At tick $t$, the energy of a given Neo is denoted by

$$N_t \in \mathbb{R}_{\ge 0}.$$

Running computations and performing structural mutations both consume energy, while successful prediction of the NeoVerse yields energy in the form of rewards (Sparks). All such costs and rewards are measured in the same units as Nex, so that energy evolves by simple additive updates of the form

$$N_{t+1} = N_t + \text{(reward at } t\text{)} - \text{(cost at } t\text{)}.$$

Once $N_t$ reaches zero, the Neo becomes inert: it can no longer perform internal computation or apply mutations, and its trajectory effectively terminates.
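
To make the bookkeeping concrete, the sketch below implements the additive Nex update in Python. It is illustrative only: the class name and fields are assumptions, not part of the formal model.

```python
from dataclasses import dataclass

@dataclass
class EnergyAccount:
    """Minimal Nex bookkeeping for a single Neo (illustrative)."""
    nex: float  # N_t, the current energy

    def step(self, reward: float, cost: float) -> bool:
        """Additive update N_{t+1} = N_t + reward - cost.

        Returns False once energy is exhausted, i.e. the Neo is inert.
        """
        self.nex = self.nex + reward - cost
        return self.nex > 0
```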

2.1.4 Continuous Parameters and Discrete Structure

A central modeling choice in Neosis is to separate structure from parameters. The structure of a Neo—its set of nodes, edges, and connectivity pattern—will be discrete and graph-like. However, each structural unit carries continuous parameters. For node $i$, we write

$$\theta_i \in \mathbb{R}^{k_i},$$

where $k_i$ is the parameter dimensionality associated with that node. These parameters control the local computation performed at the node (for example, weights, thresholds, or other coefficients), and they may change over time through learning mechanisms or mutation.

This separation between discrete structure (which nodes exist and how they are connected) and continuous parameters $\theta_i$ (how each node computes) is deliberate. It allows Neos to evolve by changing their topology in a combinatorial way, while still supporting rich, smooth families of local computations at each node. Later sections will make this distinction explicit when we define the internal graph of a Neo and the node-local update functions.

2.1.5 Global State at Tick $t$

At each tick $t$, we conceptually distinguish between the internal state of a Neo and the state of the surrounding world. We write

$$\text{World}_t$$

for the (possibly high-dimensional) state of the NeoVerse at tick $t$, and

$$\text{Neo}_t$$

for the complete internal state of a single Neo at the same tick, including its binary memory, graph structure, parameters, and energy. In this chapter we will focus on formalizing $\text{Neo}_t$; the NeoVerse state $\text{World}_t$ will be treated abstractly and will be accessed only through a projection function introduced in Section 2.2.

2.2 The NeoVerse and Perception

Neos do not exist in isolation. They operate inside an external world, called the NeoVerse, whose dynamics generate the signals that Neos attempt to predict. In this section we keep the NeoVerse deliberately abstract. The aim is not to model the entire environment in detail, but to specify how it interfaces with a Neo through perception.

At each tick $t$, the NeoVerse has a state

$$\text{World}_t,$$

which may be arbitrarily complex and high-dimensional. We do not constrain how $\text{World}_t$ evolves over time; it may follow a deterministic or stochastic rule, and it may or may not depend on the past behavior of Neos. For the purposes of this chapter, it is sufficient to regard $\{\text{World}_t\}_{t \ge 0}$ as an exogenous process that generates the raw conditions under which Neos must operate.

A Neo does not have direct access to $\text{World}_t$. Instead, it perceives only a projection of the NeoVerse through a perceptual interface. Formally, we introduce a projection function

$$\Phi_t : \text{World}_t \longrightarrow \mathbb{B}^{m_t},$$

and define the perceptual input at tick $t$ as

$$\mathbf{U}_t = \Phi_t(\text{World}_t) \in \mathbb{B}^{m_t}.$$

The dimensionality $m_t$ represents the number of binary channels the Neo can currently observe. This dimensionality is not fixed: as the Neo gains or loses input nodes through structural mutation, its perceptual capacity can change, and the corresponding projection $\Phi_t$ can be updated to match.

In the simplest cases, $\mathbf{U}_t$ may consist of a single bit, expressing a minimal signal about the NeoVerse. More generally, $\mathbf{U}_t$ can be a vector of bits encoding multiple aspects of $\text{World}_t$. The exact semantics of each bit are not specified at this level; they depend on the particular environment and experimental setup. What matters for the formal model is that all percepts are binary vectors, and that perception is always mediated by some projection $\Phi_t$ from $\text{World}_t$ into the Neo's current input space.

This view makes the Neo's situation explicitly partially observable. The Neo must form internal representations and predictions on the basis of $\mathbf{U}_t$ rather than on the full underlying state $\text{World}_t$. In later sections, we will define the output $\mathbf{Y}_t$ of a Neo as a prediction about future percepts $\mathbf{U}_{t+1}$, and we will use the accuracy of these predictions to determine the Neo's energy gain or loss.
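
As a concrete illustration of such a projection, the following sketch builds a $\Phi_t$ that exposes a handful of binary channels of an otherwise hidden world state. All names here (`make_projection`, the thresholding rule, the chosen indices) are illustrative assumptions, not part of the formal model.

```python
import numpy as np

def make_projection(bit_indices):
    """Return a projection Phi_t that exposes selected binary channels.

    Everything outside `bit_indices` stays hidden, which is what makes
    the Neo's situation partially observable.
    """
    def phi(world_state: np.ndarray) -> np.ndarray:
        # U_t = Phi_t(World_t) in B^{m_t}: here, sign-threshold the
        # selected world coordinates into bits.
        return (world_state[bit_indices] > 0).astype(np.uint8)
    return phi

# Toy usage: a 100-dimensional world of which the Neo sees m_t = 3 bits.
rng = np.random.default_rng(0)
world_t = rng.normal(size=100)        # stand-in for World_t
phi_t = make_projection([0, 17, 42])
u_t = phi_t(world_t)                  # percept U_t, shape (3,)
```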

2.3 The Neo: Internal Structure

We now turn from the external NeoVerse to the internal organization of a Neo. At a high level, the state of a single Neo at tick $t$ consists of two coupled subsystems:

  • Lio, the Learner, which carries the Neo’s computational graph, internal memory, parameters, and input–output interface.

  • Evo, the Evolver, which controls how the structure and parameters of Lio change over time.

For the purposes of this chapter, we focus on specifying the static structure of these subsystems at a given tick $t$. The dynamics that update them from $t$ to $t+1$ will be introduced in later sections.

We write the overall internal state of a Neo at tick $t$ as

$$\text{Neo}_t = (\text{Lio}_t,\ \text{Evo}_t,\ N_t),$$

where $N_t$ is the energy (Nex) introduced in Section 2.1.

2.3.1 Lio as an Evolving Binary Graph

Lio contains all components directly involved in perception, internal computation, and prediction. At tick $t$, we represent it as

$$\text{Lio}_t = (\mathbf{V}_t,\ G_t,\ \Theta_t,\ \mathbf{U}_t,\ O_t).$$

The vector $\mathbf{V}_t$ is the internal binary state (or memory) of the Neo:

$$\mathbf{V}_t \in \mathbb{B}^{n_t},$$

where $n_t$ is the number of internal nodes at tick $t$. Each coordinate $\mathbf{V}_t[i]$ corresponds to the state of a single node. We will use indices

$$i \in \{1,\dots,n_t\}$$

to refer to nodes, so there is no separate symbol for the node set.

The graph structure is captured by

$$G_t = E_t,$$

where $E_t \subseteq \{1,\dots,n_t\} \times \{1,\dots,n_t\}$ is the set of directed edges between nodes. We do not impose any topological restriction: $E_t$ may describe a feedforward, recurrent, or cyclic graph. This flexibility allows the Neo to evolve arbitrary computational motifs, including those that resemble neural networks, finite-state machines, or more complex dynamical systems.

Each node $i \in \{1,\dots,n_t\}$ is associated with a continuous parameter vector

$$\theta_i \in \mathbb{R}^{k_i},$$

which determines how that node processes its inputs. We collect all node parameters at tick $t$ into

$$\Theta_t = \{\theta_i : i = 1,\dots,n_t\}.$$

These parameters will be used in Section 2.4 to define the node-local update rules that map incoming binary signals to new node states.

The interface between Lio and the NeoVerse is given by the input vector

$$\mathbf{U}_t \in \mathbb{B}^{m_t}.$$

The input $\mathbf{U}_t$ is the percept at tick $t$ defined by the projection in Section 2.2. The output $\mathbf{Y}_t$ is defined as a direct readout of the internal state at the output indices:

$$\mathbf{Y}_t = \mathbf{V}_t[O_t] \in \mathbb{B}^{p_t},$$

where $O_t$ is the set of output node indices stored in $\text{Lio}_t$. Since $\mathbf{Y}_t$ is always a direct readout of $\mathbf{V}_t$ at indices $O_t$, it carries no independent state beyond what is already encoded in $\mathbf{V}_t$ and $O_t$. The dimensions $m_t$ and $p_t$ may change over time as the Neo gains or loses input and output nodes through structural mutation.

In summary, Lio at tick $t$ is an evolving binary graph with continuous parameters, equipped with a binary input interface. The pair $(E_t,\ \Theta_t)$ specifies what computational structure exists, while $(\mathbf{V}_t,\ \mathbf{U}_t)$ specifies the current binary activity flowing through that structure. The output $\mathbf{Y}_t$ is derived from $\mathbf{V}_t$ via the output index set $O_t$.
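
The tuple structure of Lio translates directly into a simple container. The following is a minimal sketch under assumed representations (a NumPy bit vector for $\mathbf{V}_t$, a Python set for $E_t$, a dict for $\Theta_t$); it carries no dynamics.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Lio:
    """Container mirroring Lio_t = (V_t, E_t, Theta_t, U_t, O_t)."""
    V: np.ndarray                          # internal bits, shape (n_t,)
    E: set                                 # directed edges {(j, k), ...}
    Theta: dict                            # node index -> theta_i
    U: np.ndarray                          # percept, shape (m_t,)
    O: list = field(default_factory=list)  # output node indices

    def output(self) -> np.ndarray:
        # Y_t = V_t[O_t]: a direct readout, no independent state.
        return self.V[self.O]
```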

2.3.2 Evo as a Meta-Level Mutation Controller

Evo operates at a meta level: it does not directly process percepts from the NeoVerse, but instead governs how Lio's structure and parameters change over time. At tick $t$, we keep Evo abstract and write

$$\text{Evo}_t = (\Psi_t,\ \Xi_t),$$

where $\Psi_t$ denotes any internal variables Evo maintains (for example, mutation rates or exploration preferences), and $\Xi_t$ denotes a mutation policy.

Conceptually, the mutation policy $\Xi_t$ is a rule that can inspect the current state of the Neo and propose structural or parametric changes to Lio. In later sections, these changes will be formalized as mutation primitives (adding or removing nodes and edges, or perturbing parameters) with associated energy costs. For the present chapter, it is enough to note that Evo:

  • has access to $\text{Lio}_t$ and $N_t$,

  • can decide which mutations to attempt at each tick, and

  • must respect the available energy when doing so.

This separation between Lio (which computes and predicts) and Evo (which decides how Lio itself should change) is central to Neosis. It mirrors the distinction, in biological systems, between fast neural dynamics and slower evolutionary or developmental processes that shape the underlying circuitry.

2.4 Node-Local Update Rules

2.4.1 Node Inputs

At tick $t$, the internal state of the Neo is

$$\mathbf{V}_t \in \mathbb{B}^{n_t},$$

and the current percept is

$$\mathbf{U}_t \in \mathbb{B}^{m_t}.$$

The directed edge set $E_t \subseteq \{1,\dots,n_t\} \times \{1,\dots,n_t\}$ specifies how internal nodes read from one another. To allow nodes to also depend on perceptual inputs, we conceptually extend the set of possible inputs by treating components of $\mathbf{U}_t$ as additional sources.

For each node index $i \in \{1,\dots,n_t\}$, we define a finite index set

$$\mathcal{I}_t(i) \subseteq \{1,\dots,n_t\} \cup \{\text{input indices}\},$$

which lists the internal and input coordinates that feed into node $i$ at tick $t$. From this set we form an input vector

$$\mathbf{z}_i(t) \in \mathbb{B}^{k_i},$$

by collecting the corresponding bits from $\mathbf{V}_t$ and $\mathbf{U}_t$ in a fixed order, where $k_i = |\mathcal{I}_t(i)|$. As before, $k_i$ may change over time as edges or inputs are added or removed.

In addition to these deterministic inputs, each node also receives a stochastic binary input

$$\eta_i(t) \sim \text{Bernoulli}(p_i),$$

where $p_i \in (0,1)$ is a per-node noise bias parameter stored in the node's parameter vector. This random bit allows local computations to be intrinsically stochastic even when $\mathbf{V}_t$ and $\mathbf{U}_t$ are fixed, and tuning $p_i$ lets nodes evolve different levels of intrinsic randomness.

Each node $i$ also receives a binary gate input $g_i(t) \in \{0,1\}$, which may be drawn from $\mathbf{U}_t$ (as a perceptual input) or from $\mathbf{V}_t$ (as feedback from another internal node). The gate bit controls whether the node updates from its other inputs or holds its previous state, as described in the update rule below.

2.4.2 Parametric Local Update Rule

Each node $i$ carries a continuous parameter vector

$$\theta_i \in \mathbb{R}^{k_i + 3},$$

which we interpret as a concatenation of weights, noise parameters, and a bias:

$$\theta_i = (w_i, \alpha_i, p_i, b_i),$$

where

$$w_i \in \mathbb{R}^{k_i}, \qquad \alpha_i \in \mathbb{R}, \qquad p_i \in (0,1), \qquad b_i \in \mathbb{R}.$$

The parameter $p_i$ controls the bias of the stochastic input $\eta_i(t) \sim \text{Bernoulli}(p_i)$, allowing each node to have adjustable stochasticity. When $p_i = 0.5$, the noise is unbiased; values closer to 0 or 1 produce more deterministic behavior.

Given the binary input vector $\mathbf{z}_i(t) \in \mathbb{B}^{k_i}$, the stochastic bit $\eta_i(t) \sim \text{Bernoulli}(p_i)$, and the gate bit $g_i(t) \in \{0,1\}$, the node update rule is:

  • If $g_i(t) = 0$, the node holds its previous state:

    $$\mathbf{V}_{t+1}[i] = \mathbf{V}_t[i].$$

  • If $g_i(t) = 1$, the node updates from its other inputs. It first computes a real-valued activation

    $$a_i(t) = w_i^\top \mathbf{z}_i(t) + \alpha_i\, \eta_i(t) + b_i,$$

    and then applies a threshold to obtain the new binary state:

    $$\mathbf{V}_{t+1}[i] = \text{Lex}_i\big(\mathbf{z}_i(t), \eta_i(t), g_i(t), \theta_i\big) = H\big(a_i(t)\big),$$

    with the Heaviside step function

    $$H(x) = \begin{cases} 1, & x \ge 0,\\ 0, & x < 0. \end{cases}$$

This gate mechanism allows nodes to explicitly freeze their state when $g_i(t) = 0$, while enabling normal computation when $g_i(t) = 1$. The gate bit $g_i(t)$ is treated as part of the node's input set, so it may be wired from perceptual inputs $\mathbf{U}_t$ or from other internal nodes via the edge set $E_t$.

This definition preserves the properties we want:

  • Locality: each update depends only on $\mathbf{z}_i(t)$, $\eta_i(t)$, $g_i(t)$, and $\theta_i$.

  • Binary state: outputs stay in $\mathbb{B}$.

  • Adjustable stochasticity: the per-node parameter $p_i$ controls the bias of $\eta_i(t) \sim \text{Bernoulli}(p_i)$, allowing evolution to tune the level of intrinsic randomness at each node.

  • Explicit freeze control: the gate bit $g_i(t)$ provides direct control over whether a node updates or maintains its previous state, enabling richer temporal dynamics.

  • Structural robustness: when $k_i$ changes, we only resize $w_i$ and the construction of $\mathbf{z}_i(t)$; $\alpha_i$, $p_i$, and $b_i$ remain single scalars.

Snapshot semantics apply: all nodes read $\mathbf{V}_t$, $\mathbf{U}_t$, and their own $\eta_i(t)$ at the beginning of tick $t$, then update in parallel to produce $\mathbf{V}_{t+1}$.
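
A direct transcription of this rule, together with the snapshot semantics, might look as follows. This is a sketch, not a reference implementation: the `gather` helper, which assembles $(\mathbf{z}_i, g_i, \theta_i)$ for node $i$ from the frozen snapshot, is an assumed hook.

```python
import numpy as np

def lex_update(v_prev, z, g, w, alpha, p, b, rng):
    """Node-local rule Lex_i: freeze when g = 0, stochastic threshold when g = 1."""
    if g == 0:
        return v_prev                    # V_{t+1}[i] = V_t[i]
    eta = int(rng.random() < p)          # eta_i(t) ~ Bernoulli(p_i)
    a = w @ z + alpha * eta + b          # activation a_i(t)
    return int(a >= 0)                   # Heaviside H(a_i(t))

def step(V, gather, rng):
    """Snapshot-semantics update: every node reads the old V_t; all new
    states are written into a fresh vector, so updates are in parallel."""
    V_next = np.empty_like(V)
    for i in range(len(V)):
        z, g, w, alpha, p, b = gather(i, V)   # assumed helper
        V_next[i] = lex_update(V[i], z, g, w, alpha, p, b, rng)
    return V_next
```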

2.5 Mutation Primitives and Structural Updates

A defining property of a Neo is that its internal structure is not fixed. Both the topology of its computational graph and the interpretation of its outputs may change over time through discrete mutation events. These mutations are proposed by Evo's mutation policy $\Xi_t$ and applied during the Mutation Phase of each Cycle, subject to available energy.

We introduce a unified set of mutation primitives:

$$\mathcal{A}_{\text{mut}} = \{ \texttt{node},\; \texttt{edge},\; \texttt{param}^f,\; \texttt{output} \},$$

where each primitive includes multiple subtypes (addition, removal, or reassignment) defined below. Each mutation type $a \in \mathcal{A}_{\text{mut}}$ has an associated energy cost $C_{\text{mut}}(a) \ge 0$.

All mutations operate locally on the tuple

$$(\mathbf{V}_t, E_t, \Theta_t, \mathbf{U}_t, O_t),$$

and produce an updated structure consistent with the rules of the Neo's internal graph.

2.5.1 Node Mutation

Node mutations modify the number of internal nodes. A node mutation consists of either adding a new node or removing an existing one.

node$^+$

A node-addition mutation introduces a new internal node and increases the dimensionality of the state vector from $n_t$ to $n_t+1$. Formally,

$$\mathbf{V}'_t \in \mathbb{B}^{n_t+1}, \qquad \Theta'_t = \Theta_t \cup \{\theta_{n_t+1}\},$$

where $\mathbf{V}'_t[n_t+1]$ is initialized to 0 and the new parameter vector $\theta_{n_t+1}$ is drawn from an initialization distribution over $\mathbb{R}^{k_{n_t+1}+3}$.

Optionally, Evo may introduce new edges involving the new node:

$$E'_t = E_t \cup E_{\text{new}}.$$

All index sets and parameter vectors are resized accordingly.

node$^-$

A node-removal mutation selects an index $i \in \{1,\dots,n_t\}$ and deletes it. The updated dimensionality becomes $n'_t = n_t - 1$. All edges incident to $i$ are removed:

$$E'_t = \{ (j,k) \in E_t : j \neq i,\; k \neq i \}.$$

The corresponding state coordinate and parameter vector are removed, and remaining node indices are re-labeled to maintain a contiguous index set. If $i \in O_t$, it is also removed from the output set.

Node removal may disconnect the graph; the result is still considered valid.
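
Node removal is the bookkeeping-heaviest primitive because of the re-labeling step. The sketch below shows one way to implement it, using 0-based indices and assumed container types (NumPy state vector, edge set, parameter dict); energy accounting is omitted.

```python
import numpy as np

def remove_node(i, V, E, Theta, O):
    """node^- : delete node i, drop incident edges, relabel indices."""
    relabel = lambda j: j - 1 if j > i else j
    V_new = np.delete(V, i)                         # drop state coordinate
    E_new = {(relabel(j), relabel(k))               # keep only edges not
             for (j, k) in E if j != i and k != i}  # touching node i
    Theta_new = {relabel(j): th for j, th in Theta.items() if j != i}
    O_new = [relabel(j) for j in O if j != i]       # prune output set too
    return V_new, E_new, Theta_new, O_new
```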

2.5.2 Edge Mutation

Edge mutations change information flow by adding or removing directed edges.

edge$^+$

Select a pair $(j,k)$ with $j \neq k$. The edge is added:

$$E'_t = E_t \cup \{(j,k)\}.$$

This increases the input dimensionality of node $k$ by one, requiring expansion of its weight vector $w_k$ by appending a new weight drawn from an initialization distribution.

edge$^-$

Select an existing edge $(j,k) \in E_t$ and delete it:

$$E'_t = E_t \setminus \{(j,k)\}.$$

The corresponding coordinate is removed from $w_k$, decreasing its input dimensionality.
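
Because each edge contributes one coordinate of $\mathbf{z}_k$, edge mutations must keep $w_k$ in lockstep with the edge set. A minimal sketch, assuming $w_k$ is stored under `Theta[k]["w"]` and that new inputs are appended at the end of $\mathbf{z}_k$'s fixed ordering:

```python
import numpy as np

def add_edge(j, k, E, Theta, rng, init_scale=0.1):
    """edge^+ : add (j, k) and append a fresh weight to w_k.

    init_scale is an illustrative choice of initialization distribution.
    """
    E.add((j, k))
    Theta[k]["w"] = np.append(Theta[k]["w"], rng.normal(scale=init_scale))

def remove_edge(j, k, E, Theta, slot):
    """edge^- : delete (j, k) and drop the matching coordinate of w_k.

    `slot` is the position of (j, k) in z_k's ordering, which the
    caller is assumed to track.
    """
    E.discard((j, k))
    Theta[k]["w"] = np.delete(Theta[k]["w"], slot)
```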

2.5.3 Parameter Perturbation

A parameter-perturbation mutation updates the continuous parameters of a single node without altering the graph structure. For a selected node $i$:

$$\theta_i \leftarrow \theta_i + \Delta_i,$$

where $\Delta_i$ is drawn from a zero-mean perturbation distribution on $\mathbb{R}^{k_i+3}$. All other nodes and edges remain unchanged.

This primitive enables exploration of local computational behaviors.

2.5.4 Output Mutation

Output mutations allow the Neo to change which internal nodes contribute to its prediction vector $\mathbf{Y}_t = \mathbf{V}_t[O_t]$. The output index set at tick $t$ is

$$O_t = \{o_1, \dots, o_{p_t}\} \subseteq \{1,\dots,n_t\}.$$

We introduce two subtypes.

output$^+$

Select a node index $i \in \{1,\dots,n_t\}$ with $i \notin O_t$ and add it to the output set:

$$O_{t+1} = O_t \cup \{i\}.$$

This increases the output dimensionality $p_t \to p_{t+1} = p_t + 1$.

output$^-$

Select $i \in O_t$ and remove it:

$$O_{t+1} = O_t \setminus \{i\},$$

reducing the output dimensionality $p_t \to p_{t+1} = p_t - 1$.

Output mutations allow the Neo to evolve its prediction interface, enabling specialization, pruning, and reallocation of computational resources.


Together, the unified mutation set provides a minimal but expressive basis for evolving both the topology and the computation of Lio. By associating each primitive with an energy cost and constraining mutations to be affordable at tick $t$, Evo must balance exploration against the Neo's available energy, embedding evolutionary pressure directly into the organism's survival dynamics.

2.6 The Cycle: Operational Semantics

We now describe how a Neo evolves from tick $t$ to tick $t+1$. The Cycle specifies the order in which perception, internal computation, reward, energy update, and mutation occur. All quantities are understood to be conditioned on the current internal state

$$\text{Neo}_t = (\text{Lio}_t,\ \text{Evo}_t,\ N_t)$$

and the external world state $\text{World}_t$.

For readability, we keep the description at a single-Neo level; in later chapters, populations of Neos will be handled by applying the same rules to each individual.

2.6.1 Perception

At the beginning of tick $t$, the Neo perceives the NeoVerse through the projection function introduced in Section 2.2. The world is in state $\text{World}_t$, and the percept is

$$\mathbf{U}_t = \Phi_t(\text{World}_t) \in \mathbb{B}^{m_t}.$$

This value is written into Lio's input component, so that

$$\text{Lio}_t = (\mathbf{V}_t,\ E_t,\ \Theta_t,\ \mathbf{U}_t,\ O_t),$$

with $\mathbf{U}_t$ matching the current projection of the NeoVerse.

2.6.2 Internal Computation and Output

Given $\mathbf{V}_t$, $\mathbf{U}_t$, the edge set $E_t$, and parameters $\Theta_t$, Lio updates its internal state and produces an output.

For each node index $i = 1,\dots,n_t$:

  1. Construct the input index set $\mathcal{I}_t(i)$ and the corresponding binary vector

    $$\mathbf{z}_i(t) \in \mathbb{B}^{k_i}$$

    by reading from $\mathbf{V}_t$ and $\mathbf{U}_t$.

  2. Read the gate bit $g_i(t) \in \{0,1\}$ from the node's inputs (which may come from $\mathbf{V}_t$ or $\mathbf{U}_t$ via the edge set $E_t$).

  3. Sample a stochastic bit using the node's noise bias parameter:

    $$\eta_i(t) \sim \text{Bernoulli}(p_i),$$

    where $p_i \in (0,1)$ is stored in $\theta_i = (w_i, \alpha_i, p_i, b_i)$.

  4. Update the node's binary state using the local rule:

    • If $g_i(t) = 0$, set $\mathbf{V}_{t+1}[i] = \mathbf{V}_t[i]$ (freeze).

    • If $g_i(t) = 1$, compute the activation using the node's parameters:

      $$a_i(t) = w_i^\top \mathbf{z}_i(t) + \alpha_i\, \eta_i(t) + b_i,$$

      and set

      $$\mathbf{V}_{t+1}[i] = \text{Lex}_i\big(\mathbf{z}_i(t), \eta_i(t), g_i(t), \theta_i\big) = H\big(a_i(t)\big),$$

      where $H(\cdot)$ is the Heaviside step function.

We adopt snapshot semantics: all nodes read $\mathbf{V}_t$, $\mathbf{U}_t$, and their own $\eta_i(t)$ at the start of tick $t$, and all updates to $\mathbf{V}_{t+1}$ are conceptually applied in parallel.

The output vector $\mathbf{Y}_t \in \mathbb{B}^{p_t}$ is defined as a direct readout of the internal state at the output indices:

$$\mathbf{Y}_t = \mathbf{V}_t[O_t].$$

Thus at tick $t$, the Neo produces a prediction $\mathbf{Y}_t$ based on its internal state and the current percept, while its internal memory is updated to $\mathbf{V}_{t+1}$ for use at the next tick.

2.6.3 Running Cost and Energy Deduction

Executing the internal computation incurs a running cost that depends on the size of the Neo's active structure. We introduce a cost function

$$C_{\text{run}} : \mathbb{N}^2 \to \mathbb{R}_{\ge 0},$$

which may, for example, depend on the number of internal nodes and input bits. A simple choice is

$$C_{\text{run}}(n_t, m_t) = c_{\text{node}}\, n_t + c_{\text{in}}\, m_t,$$

with non-negative constants $c_{\text{node}}, c_{\text{in}}$. The running cost is deducted from the Neo's energy:

$$N_t' = N_t - C_{\text{run}}(n_t, m_t).$$

If $N_t' \le 0$, the Neo has exhausted its energy and becomes inert; its trajectory terminates, and no further computation or mutation occurs.

2.6.4 Reward (Spark) and Energy Update

After Lio has produced $\mathbf{Y}_t$ and updated its internal state, the NeoVerse advances to the next tick. The world transitions to $\text{World}_{t+1}$ according to its own dynamics, and the Neo receives a new percept

$$\mathbf{U}_{t+1} = \Phi_{t+1}(\text{World}_{t+1}).$$

The quality of the Neo's prediction is assessed by a reward function

$$R : \mathbb{B}^{p_t} \times \mathbb{B}^{m_{t+1}} \to \mathbb{R},$$

which compares $\mathbf{Y}_t$ to $\mathbf{U}_{t+1}$. We write the resulting reward (Spark) as

$$S_t = R(\mathbf{Y}_t,\ \mathbf{U}_{t+1}).$$

The Neo's energy is then updated to

$$N_t'' = N_t' + S_t.$$

The specific form of $R$ can vary with the environment; in many examples it will reward accurate prediction of selected components of $\mathbf{U}_{t+1}$ and penalize systematic errors. For the formal model, it is enough to assume that $R$ is well-defined and can be evaluated from $\mathbf{Y}_t$ and $\mathbf{U}_{t+1}$.
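
As one illustrative (not canonical) choice of $R$, a shifted Hamming similarity pays a fixed gain per correctly predicted bit and charges the same amount per error; the pairing of output bits to percept channels is environment-specific and assumed here to be positional.

```python
import numpy as np

def hamming_reward(y_t: np.ndarray, u_next: np.ndarray,
                   gain: float = 1.0) -> float:
    """Illustrative Spark: +gain per correct bit, -gain per wrong bit.

    Assumes y_t predicts the first len(y_t) channels of u_next.
    """
    target = u_next[: len(y_t)]
    correct = int(np.sum(y_t == target))
    wrong = len(y_t) - correct
    return gain * float(correct - wrong)
```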

2.6.5 Mutation Phase

If $N_t'' > 0$, Evo may attempt to modify Lio's structure or parameters. At tick $t$, Evo's policy $\Xi_t$ can inspect the current internal state and energy

$$(\text{Lio}_t,\ N_t'')$$

and select a (possibly empty) finite sequence of mutation primitives

$$(a_{t,1}, a_{t,2}, \dots, a_{t,K_t}), \qquad a_{t,k} \in \mathcal{A}_{\text{mut}},$$

to be applied to $(\mathbf{V}_{t+1}, E_t, \Theta_t, \mathbf{U}_{t+1}, O_t)$.

Each mutation type $a \in \mathcal{A}_{\text{mut}}$ has an associated non-negative energy cost

$$C_{\text{mut}}(a) \in \mathbb{R}_{\ge 0}.$$

Let

$$C_{\text{mut,total}}(t) = \sum_{k=1}^{K_t} C_{\text{mut}}(a_{t,k})$$

be the total cost of the proposed mutations. Evo can only apply mutations up to the available energy. Formally, the sequence is truncated, if necessary, to the longest prefix, of length $k^\ast \le K_t$, that satisfies

$$N_t'' - \sum_{k=1}^{k^\ast} C_{\text{mut}}(a_{t,k}) > 0.$$

The truncated prefix is then applied in order, yielding updated structural and parametric components (which we still denote by $E_{t+1}$, $\Theta_{t+1}$, and $O_{t+1}$ for simplicity). All bookkeeping on $\mathbf{V}_{t+1}$ and $\mathbf{U}_{t+1}$ required to maintain consistency with the new structure is treated as part of the mutation operation.

The final energy after mutation is

$$N_{t+1} = N_t'' - \sum_{k=1}^{k^\ast} C_{\text{mut}}(a_{t,k}),$$

and the Neo's internal state at tick $t+1$ is

$$\text{Lio}_{t+1} = (\mathbf{V}_{t+1}, E_{t+1}, \Theta_{t+1}, \mathbf{U}_{t+1}, O_{t+1}),$$

where $\mathbf{Y}_{t+1}$ will be derived from $\mathbf{V}_{t+1}$ via $O_{t+1}$ at the next computation step.

2.6.6 Summary of One Cycle

Putting the pieces together, one full Cycle from tick $t$ to tick $t+1$ consists of:

  1. Perception: observe $\mathbf{U}_t = \Phi_t(\text{World}_t)$.

  2. Computation: update $\mathbf{V}_{t+1}$ and produce $\mathbf{Y}_t$ via local node rules.

  3. Running cost: deduct $C_{\text{run}}(n_t, m_t)$ to obtain $N_t'$.

  4. World update and reward: compute $\mathbf{U}_{t+1}$ and reward $S_t = R(\mathbf{Y}_t,\ \mathbf{U}_{t+1})$, yielding energy $N_t''$.

  5. Mutation (optional): Evo selects and applies affordable mutations from $\mathcal{A}_{\text{mut}}$, updating $(E_t, \Theta_t, O_t)$ to $(E_{t+1}, \Theta_{t+1}, O_{t+1})$ and reducing energy to $N_{t+1}$.

  6. Termination check: if $N_{t+1} \le 0$, the Neo becomes inert; otherwise, the Cycle repeats. A sketch of the full loop follows below.
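
The sketch below strings these six steps together. It is illustrative: `neo` and `world` are assumed objects exposing the hooks named in the comments, and `neo.evo.propose` is an assumed interface to the mutation policy $\Xi_t$.

```python
def run_cycle(neo, world, rng):
    """One Cycle from tick t to t+1; returns False once the Neo is inert."""
    # 1. Perception: U_t = Phi_t(World_t)
    u_t = world.project()
    # 2. Computation: read out Y_t = V_t[O_t], then update V_t -> V_{t+1}
    y_t = neo.output()
    neo.step_nodes(u_t, rng)
    # 3. Running cost: N_t' = N_t - C_run(n_t, m_t)
    neo.nex -= neo.running_cost()
    if neo.nex <= 0:
        return False
    # 4. World update and reward: S_t = R(Y_t, U_{t+1})
    world.advance()
    neo.nex += neo.reward(y_t, world.project())
    # 5. Mutation: apply the affordable prefix of Evo's proposals
    for mutation, cost in neo.evo.propose(neo):
        if neo.nex - cost <= 0:
            break                        # truncate at affordability
        mutation(neo)
        neo.nex -= cost
    # 6. Termination check
    return neo.nex > 0
```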

This operational definition provides a complete, minimal description of how a single Neo interacts with the NeoVerse, computes, earns or loses energy, and modifies its own structure over time. In the next section, we introduce a performance measure that summarizes how efficiently a Neo converts structure and energy into predictive success.

2.7 Performance Measures

The formal model of Neosis defines a complete energy trajectory

$$N_0, N_1, N_2, \dots$$

for each Neo interacting with a given NeoVerse. This trajectory already combines prediction rewards and structural costs, so we do not introduce an additional ratio of "reward over cost." Instead, we summarize performance with a primary long-term measure—Survivability—that captures the probability of sustained survival, along with two transient measures that describe short-term behavior.

2.7.1 Survivability

The Survivability of a Neo is defined as the probability that it maintains positive energy indefinitely in a given NeoVerse. Formally, for a Neo with initial energy $N_0 > 0$ and energy trajectory $\{N_t\}_{t \ge 0}$, we define

$$\text{Survivability} = \mathbb{P}\left(\liminf_{t \to \infty} N_t > 0\right).$$

This measure captures the long-term viability of a Neo's structure and parameters in its environment. A Neo with high survivability has evolved a configuration that, on average, maintains a positive energy drift over time, allowing it to persist indefinitely despite stochastic fluctuations. In contrast, a Neo with low survivability is doomed to eventual extinction, even if it may survive for extended periods due to favorable short-term fluctuations.

Survivability depends on the interplay between the Neo's predictive accuracy, its structural costs, and the statistics of the NeoVerse. In later chapters, we will show how survivability can be analyzed through the energy drift and variance, revealing critical phase transitions between certain extinction and possible long-term survival.
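
Survivability involves an infinite-time liminf and so cannot be computed exactly by simulation; a common finite-sample stand-in is the fraction of independent runs still alive after a long horizon. Below is a minimal Monte Carlo sketch, reusing the `run_cycle` sketch from Section 2.6.6 and assumed factories `make_neo`/`make_world` for fresh instances.

```python
def estimate_survivability(make_neo, make_world, rng,
                           trials: int = 1000, horizon: int = 10_000):
    """Finite-horizon Monte Carlo proxy for survivability."""
    survived = 0
    for _ in range(trials):
        neo, world = make_neo(), make_world()
        alive = True
        for _ in range(horizon):
            alive = run_cycle(neo, world, rng)
            if not alive:
                break                 # trajectory terminated early
        survived += int(alive)
    return survived / trials
```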

2.7.2 Transient Measures: Lifetime and Vitality

While survivability captures long-term prospects, two transient measures provide insight into short-term performance:

Lifetime $\tau$: A Neo is considered alive at tick $t$ if its energy is strictly positive, $N_t > 0$. Once its energy reaches zero, it becomes inert and can no longer compute or mutate. We define the lifetime as

$$\tau = \max\{t \ge 0 : N_t > 0\},$$

the last tick at which the Neo is still alive. Lifetime measures how long a Neo persists in a single run, but it is a transient quantity: even a Neo with zero survivability may achieve a long lifetime in a particular trajectory due to favorable noise realizations.

Vitality: We define the Vitality of a Neo as the maximum energy it attains over its lifetime:

$$\text{Vitality} = \max_{0 \le t \le \tau} N_t.$$

Vitality quantifies how energetically "alive" a Neo becomes during its existence, reflecting its ability to accumulate energy reserves. Like lifetime, vitality is transient: it describes a single trajectory and does not directly predict long-term survival.

In most analyses, we will use survivability as the primary performance measure for assessing the long-term viability of Neo configurations. The transient measures $(\tau,\ \text{Vitality})$ provide complementary information about short-term behavior and can be useful for understanding individual trajectories, but they do not capture the fundamental question of whether a Neo can persist indefinitely in its environment.

2.8 Rationale for the Neo Structure

The formal model above makes a specific set of design choices: a Neo is an evolving directed graph over binary node states, with continuous local parameters, one stochastic bit per node, and an explicit separation between fast computation (Lio) and slower structural change (Evo). In this section we briefly justify these choices and relate them to both artificial neural networks and biological synapses.

2.8.1 Relation to Neurons and Synapses

At the level of a single node, the update rule

$$\mathbf{V}_{t+1}[i] = \begin{cases} \mathbf{V}_t[i], & \text{if } g_i(t) = 0,\\ H\big(w_i^\top \mathbf{z}_i(t) + \alpha_i \eta_i(t) + b_i\big), & \text{if } g_i(t) = 1, \end{cases}$$

is deliberately close to a threshold neuron: it combines a weighted sum of inputs with a bias and then applies a nonlinearity, with an explicit gate mechanism that allows nodes to freeze their state. The directed edges $E_t$ play the role of synapses, determining which nodes can influence which others, and the continuous parameters $\theta_i = (w_i, \alpha_i, p_i, b_i)$ determine the strength and sign of those influences, as well as the bias of the stochastic input.

The key differences from a standard artificial neuron are:

  • Binary internal state: node outputs live in $\mathbb{B}$, making the local state space as simple as possible while still allowing rich global dynamics through the network.

  • Evolving topology: the edge set $E_t$ is not fixed. Nodes and edges can be added or removed, unlike conventional ANNs where the graph is chosen once and trained only in weight space.

  • Explicit energy accounting: each run and mutation is charged against $N_t$, tying “synaptic complexity” and structural changes directly to survival.

This makes each node loosely analogous to a neuron with a discrete firing state and continuously tunable synaptic efficacy, while Evo provides a separate mechanism more reminiscent of developmental or evolutionary processes acting on circuitry over longer timescales.

2.8.2 Why Stochasticity at Each Node?

The inclusion of a stochastic bit $\eta_i(t) \sim \text{Bernoulli}(p_i)$ per node is intentional rather than cosmetic. Even with binary inputs $\mathbf{z}_i(t)$ fixed, the activation

$$a_i(t) = w_i^\top \mathbf{z}_i(t) + \alpha_i \eta_i(t) + b_i$$

can change from tick to tick through $\eta_i(t)$, and thus the output can fluctuate. The per-node parameter $p_i \in (0,1)$ controls the bias of this stochastic input, allowing evolution to tune the level of intrinsic randomness at each node independently.

This local randomness serves several purposes:

  • Exploration in parameter and structure space: stochastic node outputs can cause different sequences of rewards $S_t$ under the same environment, which in turn biases Evo's choices of mutations. This provides an intrinsic exploration mechanism without needing an additional external noise process at the level of Evo.

  • Symmetry breaking: in purely deterministic systems, structurally identical Neos placed in identical environments would follow identical trajectories. The per-node stochasticity allows initially identical Neos to diverge, supporting richer population-level dynamics without complicating the deterministic part of the update rule.

  • Modeling stochastic environments: many NeoVerses are inherently noisy. Allowing internal computations to incorporate randomness makes it easier for Neos to represent and approximate stochastic mappings from past percepts to future outcomes, rather than being restricted to deterministic input–output relationships.

  • Adjustable noise levels: the parameter $p_i$ enables nodes to evolve different stochasticity profiles. A node with $p_i \approx 0.5$ provides balanced exploration, while $p_i$ near 0 or 1 produces more deterministic behavior, allowing the Neo to balance exploration and exploitation at the node level.

Crucially, the stochasticity is added in the simplest possible way: a single Bernoulli bit enters linearly with weight $\alpha_i$. This keeps the local rule analytically tractable while still providing a source of randomness that can be up- or down-weighted by evolution (through changes in $\alpha_i$) and bias-tuned through the parameter $p_i$.

2.8.3 Minimality and Extensibility

The overall structure of a Neo is chosen to be minimal but extensible:

  • Minimal substrate: all observable states (internal, input, output) are binary, and all structure is encoded in a finite directed graph $E_t$ and parameter set $\Theta_t$. This keeps the state space simple and makes it easy to reason about limits such as small-Neo behavior or single-node dynamics.

  • Continuous parameters with discrete structure: separating discrete topology from continuous parameters allows us to treat structural mutations (node$^+$, node$^-$, edge$^+$, edge$^-$) and local parametric changes (param$^f$) within a single framework. Conventional ANNs appear as a special case where $E_t$ is fixed and only parameter updates are allowed.

  • Clean energy coupling: by charging both running cost and mutation cost directly to $N_t$, every aspect of the Neo's complexity—depth, width, connectivity, and rate of structural change—becomes subject to selection pressure through the performance measures of Section 2.7. There is no separate, ad hoc regularizer.

  • Straightforward generalizations: the current node rule is threshold-based, but replacing $H(\cdot)$ by another nonlinearity, or allowing continuous-valued node states, requires only local modifications to Section 2.4. The rest of the framework (structure, energy, mutation, Cycle) remains unchanged.

In summary, the chosen Neo structure sits deliberately between biological inspiration and mathematical simplicity. It is close enough to a network of stochastic threshold neurons with evolving synapses to be cognitively meaningful, yet minimal enough to support precise analysis of survival, evolution, and emergent computation in the subsequent chapters.

2.9 Neo and Other Computational Models

To situate Neo within the broader landscape of computational models, we compare it with several established frameworks: recurrent neural networks, Hopfield networks, McCulloch–Pitts threshold networks, spiking neural networks, and probabilistic finite automata. These comparisons highlight both the mathematical connections and the distinctive features that make Neo a unique computational object.

2.9.1 Neo and Recurrent Neural Networks

RNNs provide continuous-state dynamics of the form

$$\mathbf{h}_{t+1} = f(W\mathbf{h}_t + U\mathbf{x}_t + \mathbf{b}),$$

with $f$ a continuous nonlinearity such as tanh or ReLU. Neo shares with RNNs a dependence on recurrent structure and a parallel update pattern, but differs fundamentally in its discrete state space and nondifferentiable threshold nonlinearity.

Replacing $f$ with a Heaviside function and allowing intrinsic noise turns the Neo update into a discrete analogue of an RNN cell:

$$\mathbf{V}_{t+1} = H(W\mathbf{V}_t + B\mathbf{X}_t + \alpha\boldsymbol{\eta}_t + \mathbf{b}),$$

except that Neo's rule includes the freeze gate $g_i(t)$, which forces exact state persistence when $g_i(t) = 0$, something ordinary RNNs do not structurally encode except via learned sigmoid gates in LSTM/GRU architectures. The presence of a tunable $\text{Bernoulli}(p_i)$ perturbation shifts the Neo closer to a stochastic recurrent automaton than to a differentiable dynamical system.

2.9.2 Neo and Hopfield Networks

A classical Hopfield update is

$$s_i(t+1) = \text{sign}\left(\sum_j W_{ij} s_j(t)\right),$$

operating under symmetric weights to guarantee an energy-minimization principle. Neo resembles Hopfield units in its binary thresholding, yet diverges sharply through directed connectivity, stochastic activations, external inputs, and freeze gating. If one removes stochasticity ($\alpha_i = 0$), removes the freeze gate, and enforces weight symmetry, the Neo update collapses toward Hopfield-like behavior. But with $\eta_i(t) \sim \text{Bernoulli}(p_i)$ and arbitrary directed edges (Section 2.4.1), Neo is no longer confined to gradient descent on a Lyapunov energy; its transitions instead form a probabilistic threshold dynamical system unconstrained by symmetry or convergence guarantees.

2.9.3 Neo and McCulloch–Pitts Threshold Networks

The closest mathematical ancestor of Neo is the McCulloch–Pitts neuron:

$$v_i(t+1) = H\left(\sum_j w_{ij} v_j(t) - \theta_i\right).$$

Neo extends this rule in two orthogonal directions. First, the stochastic term $\alpha_i \eta_i(t)$ introduces controlled randomness into the activation, keeping the local rule analytically simple but probabilistically expressive. Second, the freeze gate turns the update into a conditional assignment:

$$\mathbf{V}_{t+1}[i] = \begin{cases} \mathbf{V}_t[i], & \text{if } g_i(t) = 0,\\ H\big(w_i^\top \mathbf{z}_i(t) + \alpha_i \eta_i(t) + b_i\big), & \text{if } g_i(t) = 1, \end{cases}$$

which makes Neo nodes capable of behaving like latches or memory elements independent of the weighted sum. Unlike fixed McCulloch–Pitts networks, Neo's connectivity evolves (Section 2.5), so the set of inputs $\mathbf{z}_i(t)$ is a dynamic quantity. This combination yields a threshold unit that is both structurally fluid and stochastically parameterized.

2.9.4 Neo and Spiking Neural Networks

A typical spiking neuron satisfies

$$v_i(t+1) = \lambda v_i(t) + \sum_j w_{ij} s_j(t) - \theta_i, \qquad s_i(t) = \mathbf{1}[v_i(t) \ge 0],$$

where $v_i(t)$ is a continuous membrane potential. Superficially, both SNNs and Neos emit binary spikes, but their internal mechanisms differ: SNNs rely on temporal integration and threshold crossing, whereas Neo updates instantaneously with no membrane accumulation. Noise in SNNs often appears as probabilistic spike generation conditioned on $v_i(t)$; Neo's stochasticity is structurally simpler, since the noise $\eta_i(t) \sim \text{Bernoulli}(p_i)$ enters linearly.

The freeze gate has no analogue in standard SNNs, which lack native state-holding operators. Thus Neo achieves a spike-like binary output through a fundamentally different, purely threshold-based local rule with explicitly programmable persistence.

2.9.5 Neo and Probabilistic Finite Automata

A probabilistic finite automaton (PFA) uses a transition kernel

$$P(s_{t+1} = s' \mid s_t, x_t),$$

defining state changes as conditional probabilities. When one considers the global Neo state vector $\mathbf{V}_t \in \{0,1\}^{n_t}$, the Neo update induces exactly such a kernel: for each node,

$$\Pr(\mathbf{V}_{t+1}[i] = 1 \mid \mathbf{V}_t, \mathbf{U}_t) = \begin{cases} \mathbf{V}_t[i], & \text{if } g_i(t) = 0,\\ \Pr\big(w_i^\top \mathbf{z}_i(t) + \alpha_i \eta_i(t) + b_i \ge 0\big), & \text{if } g_i(t) = 1, \end{cases}$$

where the freeze case $g_i(t) = 0$ deterministically preserves the previous state. Because $\eta_i(t) \sim \text{Bernoulli}(p_i)$, when $g_i(t) = 1$ these probabilities take a closed analytic form (written here for $\alpha_i \ge 0$):

$$\Pr(\mathbf{V}_{t+1}[i] = 1 \mid g_i(t) = 1) = \begin{cases} 1, & \text{if } w_i^\top \mathbf{z}_i(t) + b_i \ge 0,\\ p_i, & \text{if } -\alpha_i \le w_i^\top \mathbf{z}_i(t) + b_i < 0,\\ 0, & \text{if } w_i^\top \mathbf{z}_i(t) + b_i < -\alpha_i. \end{cases}$$

The Neo therefore behaves precisely as a parametric probabilistic finite-state machine, where weights define implicit transition probabilities and freeze introduces deterministic self-loops. Unlike classical PFA transitions, these probabilities are parameterized directly by the weight vector and bias through simple threshold conditions, giving Neo both the interpretability of automata and the expressiveness of parameterized nonlinear models.
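
The closed-form probabilities are easy to check numerically. The sketch below compares the analytic three-case expression (valid for $\alpha_i \ge 0$) against a Monte Carlo estimate over the Bernoulli noise; all parameter values are arbitrary illustrations.

```python
import numpy as np

def closed_form_fire_prob(w, z, b, alpha, p):
    """Pr(V_{t+1}[i] = 1 | g_i(t) = 1), three-case form for alpha >= 0."""
    s = w @ z + b
    if s >= 0:
        return 1.0        # fires regardless of eta
    if s >= -alpha:
        return p          # fires only when eta = 1
    return 0.0            # never fires

def empirical_fire_prob(w, z, b, alpha, p, rng, samples=100_000):
    """Monte Carlo estimate of the same probability."""
    eta = rng.random(samples) < p
    return float(np.mean(w @ z + alpha * eta + b >= 0))

rng = np.random.default_rng(1)
w, z = np.array([0.4, -0.7]), np.array([1, 1])
b, alpha, p = 0.1, 0.5, 0.3          # middle case: -0.5 <= -0.2 < 0
print(closed_form_fire_prob(w, z, b, alpha, p))     # exactly 0.3
print(empirical_fire_prob(w, z, b, alpha, p, rng))  # approx 0.3
```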

2.9.6 Synthesis

Across these comparisons, Neo emerges not as a variant of any single established computational model but as a synthesis of their core mathematical motifs. The update rule retains the threshold simplicity of McCulloch–Pitts neurons while admitting the stochastic richness of probabilistic automata. It mirrors RNN recurrence but without continuous states or differentiability, and it resembles spiking networks in its discreteness without adopting their temporal membrane dynamics. The freeze gate, in particular, introduces a structural control mechanism absent from all these systems, giving Neo a formal ability to preserve state independent of ongoing computation.

This combination of binary substrate, stochastic thresholding, dynamic graph structure, and programmable persistence distinguishes Neo as a computational object whose nearest relatives lie in the intersection of threshold networks and probabilistic automata.
