ChessCoin: Proof-of-Training Chess Chain

A RandomX Proof-of-Work blockchain for deterministic, verifiable chess model training

Whitepaper v0.1 · Concept draft · Joris Hartog

Abstract

We propose ChessCoin, a blockchain protocol in which chain consensus is provided by RandomX Proof-of-Work while a separate proof-of-training layer directs useful computation toward the improvement of a shared chess model. The protocol distinguishes Sybil resistance from training correctness: RandomX determines block eligibility and chain weight, while deterministic training traces, commitment chaining, random sampling, and slashing conditions constrain model updates. The resulting system is not a cryptographic proof of machine-learning progress; rather, it is an economic and probabilistic construction intended to make invalid training submissions costly while preserving a public sequence of reproducible model transitions.

Introduction
System Model
Consensus Layer
Global Model State
Training Work
Training Trace Commitments
Probabilistic Verification
Slashing Conditions
Model Evaluation
Security Discussion
Limitations
Conclusion

1. Introduction

Proof-of-Work networks intentionally consume computational resources in order to make Sybil attacks expensive. In conventional designs, the useful output of this computation is primarily the security of the ledger itself. ChessCoin investigates whether part of the surrounding mining process can be structured so that it also produces a reproducible artifact: incremental training of a chess-playing neural network.

The central design decision is separation of concerns. RandomX Proof-of-Work is used for consensus and chain selection. Training work is treated as a protocol-level payload whose correctness is verified through deterministic execution rules and sampled recomputation. This keeps Sybil resistance independent of the training problem, while still allowing model updates to be economically tied to block production.

2. System Model

The protocol contains three logically distinct layers. The consensus layer orders blocks. The training layer specifies reproducible model transitions. The verification layer determines whether a submitted transition is eligible for reward and whether the submitter is subject to penalty.

Layer	Function	Primary Assumption
Consensus	Block eligibility, chain ordering, and Sybil resistance.	RandomX work is costly to produce and cheap to verify.
Training	Deterministic transition from model state M_t to M_{t+1}.	All nodes can reproduce sampled transitions under fixed software and numeric rules.
Verification	Sampling, fraud detection, and slashing.	Invalid traces are likely to be detected with sufficient sampling probability.

3. Consensus Layer

ChessCoin uses RandomX Proof-of-Work for chain consensus. Miners search for valid block headers under a network difficulty target. Chain selection follows accumulated work, and difficulty adjustment is defined independently of training success.
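
As an illustration of accumulated-work chain selection, the following Python sketch sums a per-block work estimate over each candidate chain and picks the heaviest one. The work formula 2^256 / (target + 1) is borrowed from Bitcoin-style chains and is an assumption here, since this draft does not fix a formula. The function names are hypothetical.

```python
def block_work(target: int) -> int:
    # Assumed convention: expected number of hashes needed to meet the target.
    return (1 << 256) // (target + 1)


def accumulated_work(targets: list[int]) -> int:
    # Total work of a chain, given the difficulty target of each block.
    return sum(block_work(t) for t in targets)


def select_chain(chains: dict[str, list[int]]) -> str:
    # Pick the chain with the most accumulated work, not the most blocks.
    # Ties resolve to the first maximal entry; the draft specifies no tie-break.
    return max(chains, key=lambda name: accumulated_work(chains[name]))


chains = {
    "longer_but_easier": [(1 << 240) - 1] * 3,   # three blocks, 2^16 work each
    "shorter_but_harder": [(1 << 238) - 1] * 2,  # two blocks, 2^18 work each
}
best = select_chain(chains)
```

In this example the shorter chain is selected because its two blocks carry more total work than the three easier ones. Nothing in the selection rule refers to training results, which is consistent with difficulty and chain weight being independent of training success.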

RandomX is selected because its CPU-oriented, memory-hard design favors broad participation on commodity hardware over specialized ASIC mining. In this design, RandomX does not prove that training was performed correctly. It only proves that the block producer expended consensus work.

Design invariant: failure of a training submission must not compromise the ability of the consensus layer to order blocks.

4. Global Model State

The network maintains a canonical model checkpoint M_t as part of protocol state. A valid training submission proposes a transition to M_{t+1}, accompanied by metadata describing the training inputs, deterministic seed schedule, software version, optimizer parameters, and trace commitment root.

M_t -> M_{t+1}

For a model transition to be admissible, the transition must be reproducible under the protocol's deterministic execution environment. Floating-point behavior, batching, random seeds, data ordering, and optimizer precision are therefore consensus-relevant parameters.
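
One way to treat these parameters as consensus-relevant is to bind them into a single canonical digest, so that any disagreement about seeds, versions, or optimizer settings changes the committed value. The sketch below is illustrative only: the field names, the JSON encoding, and the use of SHA-256 are assumptions, not part of the protocol as drafted.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TransitionMetadata:
    # All field names are hypothetical; the draft only lists the categories.
    parent_checkpoint_hash: str  # digest identifying M_t
    software_version: str
    seed_schedule: tuple         # deterministic seeds, in order of use
    optimizer_params: tuple      # (name, value) pairs, values kept as strings
    trace_root: str              # commitment root of the execution trace

    def canonical_bytes(self) -> bytes:
        # Sorted keys and fixed separators yield one byte encoding per value.
        return json.dumps(
            asdict(self), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")

    def digest(self) -> str:
        return hashlib.sha256(self.canonical_bytes()).hexdigest()


meta = TransitionMetadata(
    parent_checkpoint_hash="aa",
    software_version="v0.1",
    seed_schedule=(1, 2, 3),
    optimizer_params=(("lr", "0.001"),),
    trace_root="bb",
)
meta_digest = meta.digest()
```

Optimizer values are stored as strings in this sketch to avoid depending on how floating-point numbers are printed, which is one of the places where two otherwise identical nodes could diverge.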

5. Training Work

A miner that wishes to claim training reward performs the following procedure after producing or referencing an eligible RandomX block candidate:

Fetch the canonical model checkpoint M_t and the protocol-defined training input set.
Execute the deterministic training procedure for a fixed compute budget.
Emit the candidate checkpoint M_{t+1}.
Publish a commitment-chained execution trace.
Bond collateral that may be slashed if sampled trace verification fails.

The training layer can be configured to use self-play games, curated positions, or a deterministic dataset schedule. Regardless of source, the generated training sequence must be reproducible from public inputs.
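
To illustrate what "reproducible from public inputs" can mean for data ordering, the sketch below derives a batch order from a public seed by sorting batch indices on a hash of the seed and the index. This is a minimal example under stated assumptions: the function name, the SHA-256 choice, and the 8-byte index encoding are hypothetical and not taken from the protocol.

```python
import hashlib


def batch_order(public_seed: bytes, num_batches: int) -> list[int]:
    # Sort indices by sha256(seed || index). The result depends only on the
    # seed and the batch count, not on any platform-specific RNG behavior.
    def sort_key(index: int) -> bytes:
        return hashlib.sha256(public_seed + index.to_bytes(8, "big")).digest()

    return sorted(range(num_batches), key=sort_key)


order = batch_order(b"block-hash-placeholder", 16)
```

A hash-based ordering is used here instead of a library shuffle because the output of a general-purpose shuffle routine may change between library versions, which would make the training sequence irreproducible.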

6. Training Trace Commitments

Each execution step S_i is committed into a hash chain. The final trace root is included in the block payload or associated training transaction. Validators can then request specific trace openings without requiring the entire training transcript to be stored on-chain.

H_i = hash(H_{i-1}, S_i)

A trace step may include a compact representation of the model shard, batch identifier, optimizer state, gradient summary, and deterministic transition metadata. The exact encoding must be canonical, because non-canonical encodings would create verification ambiguity.
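
The commitment chain H_i = hash(H_{i-1}, S_i) can be sketched as follows. SHA-256, the all-zero initial link H_0, and the length-prefixed encoding of each step are assumptions made for the example. The sketch also assumes that the links H_{i-1} and H_i are available to the verifier; how they are authenticated against the final root is left open here.

```python
import hashlib

GENESIS_LINK = b"\x00" * 32  # assumed value of H_0


def chain_step(prev_link: bytes, step: bytes) -> bytes:
    # Length-prefix the step so that (prev_link, step) has one encoding only.
    payload = prev_link + len(step).to_bytes(8, "big") + step
    return hashlib.sha256(payload).digest()


def build_chain(steps: list[bytes]) -> list[bytes]:
    # links[i] is H_i; the last element plays the role of the trace root.
    links = [GENESIS_LINK]
    for step in steps:
        links.append(chain_step(links[-1], step))
    return links


def verify_opening(prev_link: bytes, step: bytes, claimed_link: bytes) -> bool:
    # An opening of step i is consistent if hash(H_{i-1}, S_i) equals H_i.
    return chain_step(prev_link, step) == claimed_link


links = build_chain([b"step-1", b"step-2", b"step-3"])
trace_root = links[-1]
```

The length prefix is a small instance of the canonical-encoding requirement: without it, two different (link, step) pairs could concatenate to the same bytes.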

7. Probabilistic Verification

If every validator fully recomputed every training submission, the network would repeat the training work many times over and most of the practical benefit would be lost. ChessCoin therefore uses probabilistic verification. Validators derive random sample indices from block entropy after the trace commitment has been fixed. For each sampled step, the submitter must reveal enough data for validators to recompute the transition and compare it against the committed trace.
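
A minimal sketch of index derivation is shown below. It hashes the block entropy together with the trace root and a counter, and reduces each digest to a step index. The function name, the SHA-256 choice, and the decision to return distinct indices are assumptions for illustration; the draft does not specify a derivation.

```python
import hashlib


def sample_indices(
    block_entropy: bytes, trace_root: bytes, num_steps: int, k: int
) -> list[int]:
    # Returns k distinct step indices in [0, num_steps).
    if not 0 < k <= num_steps:
        raise ValueError("k must satisfy 0 < k <= num_steps")
    chosen: list[int] = []
    seen: set[int] = set()
    counter = 0
    while len(chosen) < k:
        digest = hashlib.sha256(
            block_entropy + trace_root + counter.to_bytes(8, "big")
        ).digest()
        index = int.from_bytes(digest, "big") % num_steps
        counter += 1
        if index not in seen:  # skip repeats so every sample checks a new step
            seen.add(index)
            chosen.append(index)
    return chosen


indices = sample_indices(b"entropy", b"root", num_steps=1000, k=8)
```

Because the trace root is an input to the derivation and the entropy is only known after the commitment is fixed, a submitter in this sketch cannot choose which steps will be opened when building the trace.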

Let p be the probability that a single sample lands on a fraudulent trace step (under uniform sampling, the fraction of steps that are fraudulent). If k independent samples are checked, the probability that the fraud escapes detection falls geometrically with k:

P(undetected) = (1 - p)^k

This mechanism provides economic assurance, not absolute proof. Parameter selection must therefore consider training cost, verification cost, collateral size, and expected reward.
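
The formula can be inverted to choose a sample count for a target escape probability. The sketch below is a worked calculation only; the function names and the example values p = 0.05 and a 1% target are illustrative assumptions, not protocol parameters.

```python
import math


def escape_probability(p: float, k: int) -> float:
    # P(undetected) = (1 - p)^k for k independent samples.
    return (1.0 - p) ** k


def samples_needed(p: float, target: float) -> int:
    # Smallest k such that (1 - p)^k <= target.
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(target) / math.log(1.0 - p))


k_required = samples_needed(p=0.05, target=0.01)
```

With 5% of steps fraudulent, 90 samples are needed to push the escape probability below 1%. A submitter who corrupts a smaller fraction of steps is harder to catch, which is why the sample count has to be set together with collateral size rather than in isolation.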

8. Slashing Conditions

A miner or training submitter may be penalized when a verifier demonstrates one of the following conditions:

A sampled transition does not reproduce under the deterministic execution rules.
The opened trace data is inconsistent with the published commitment chain.
The final model checkpoint does not correspond to the committed trace root.
The candidate model violates protocol-defined evaluation constraints.

Slashing is intended to make dishonest training submissions negative in expectation. The collateral requirement should exceed the expected gain from submitting fabricated or low-cost traces.
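
The collateral condition can be made concrete with a simplified expected-value model. The sketch below assumes that a cheater collects the full reward when undetected, loses the full collateral when detected, and pays nothing to fabricate the trace. These simplifications and the function names are assumptions for illustration only.

```python
def cheat_expected_value(
    reward: float, collateral: float, escape_prob: float
) -> float:
    # Gain the reward with probability q, lose the collateral with 1 - q.
    return escape_prob * reward - (1.0 - escape_prob) * collateral


def break_even_collateral(reward: float, escape_prob: float) -> float:
    # Collateral at which cheating has zero expected value:
    # q * R - (1 - q) * C = 0  =>  C = q * R / (1 - q)
    if not 0.0 <= escape_prob < 1.0:
        raise ValueError("escape_prob must be in [0, 1)")
    return escape_prob * reward / (1.0 - escape_prob)


threshold = break_even_collateral(reward=100.0, escape_prob=0.25)
```

In this example, with a reward of 100 and a 25% chance of escaping detection, cheating becomes negative in expectation once collateral exceeds roughly 33.3. A fuller model would also account for the training cost a cheater avoids, which raises the required collateral.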

9. Model Evaluation

ChessCoin does not require every accepted update to strictly improve playing strength. Such a rule would be brittle, expensive, and vulnerable to benchmark overfitting. Instead, the protocol defines a bounded non-regression constraint over a fixed evaluation suite.

Eval(M_{t+1}) >= Eval(M_t) - epsilon

The evaluation oracle consists of fixed chess position suites, deterministic engine snapshots, and bounded compute budgets. The purpose is to reject clearly destructive updates while tolerating the noisy, incremental nature of model training.
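
The non-regression rule Eval(M_{t+1}) >= Eval(M_t) - epsilon reduces to a single comparison once both scores are computed. The sketch below assumes integer scores (for example, points summed over a fixed position suite) so that all nodes compare exactly the same values; the scoring scheme and the function names are hypothetical.

```python
def suite_score(per_position_scores: list[int]) -> int:
    # Assumed scoring: one integer score per position in the fixed suite.
    return sum(per_position_scores)


def passes_non_regression(eval_new: int, eval_old: int, epsilon: int) -> bool:
    # Accept the update unless it regresses by more than epsilon.
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return eval_new >= eval_old - epsilon


old_score = suite_score([2, 1, 2, 0, 2])  # 7
new_score = suite_score([2, 1, 1, 0, 2])  # 6
accepted = passes_non_regression(new_score, old_score, epsilon=1)
```

Here the candidate scores one point lower than the current checkpoint and is still accepted because the drop is within epsilon. Integer arithmetic is used so that the accept or reject decision cannot differ between nodes because of floating-point rounding.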

10. Security Discussion

ChessCoin's security model is hybrid. Chain integrity relies on RandomX Proof-of-Work and accumulated work. Training integrity relies on reproducibility, commitment binding, randomized sampling, and slashing. These mechanisms address different adversarial goals.

Attack	Mitigation
Sybil block production	RandomX difficulty and accumulated work.
Fabricated training trace	Commitment openings, sampled recomputation, and slashing.
Destructive model update	Bounded non-regression evaluation.
Verifier overload	Probabilistic sampling and bounded trace openings.

11. Limitations

The protocol does not provide an absolute cryptographic proof that all training was performed correctly. It provides a probabilistic fraud-detection mechanism whose effectiveness depends on sampling parameters, collateral sizing, and deterministic reproducibility. The design also inherits the operational complexity of distributed machine-learning systems, including hardware variance, software versioning, numerical precision, and benchmark selection.

A second limitation is incentive alignment. A model can satisfy short-term evaluation constraints while failing to produce long-term useful progress. Future work should investigate richer evaluation regimes, delayed rewards, tournament-based assessment, and mechanisms that reduce benchmark overfitting.

12. Conclusion

ChessCoin combines RandomX Proof-of-Work with a deterministic proof-of-training layer for chess AI. The consensus mechanism secures block production, while trace commitments and sampled verification create economic pressure for honest training submissions. The result is a research-oriented blockchain design in which mining remains Sybil-resistant while adjacent protocol work produces a public sequence of reproducible model updates.