Changelog¶
All notable changes to decimal-scaled are documented here.
The format is loosely based on Keep a Changelog, and this project follows Semantic Versioning.
[0.3.1]¶
Release-process patch. Library code, public API, on-wire format, and bench numbers are byte-identical to 0.3.0.
Fixed¶
- GitHub Pages docs site (
mootable.github.io/decimal-scaled) failed to refresh on the v0.3.0 tag push. The Pages environment protection rule only allows deploys frommain; the concurrent main-push run that was allowed to deploy got cancelled by the tag-push run that arrived right after. Tag-triggered deploys blocked. 0.3.1 ships as amain-branch push so the docs workflow runs to completion.
Notes¶
- Future releases should land the version-bump commit, let
mainbuild + deploy the docs, then tag — not the other way round. Will codify inscripts/deploy.ps1.
[0.3.0]¶
The half-width-tier release. The decimal ladder now goes
D9 → D18 → D38 → D56 → D76 → D114 → D153 → D230 → D307 →
D461 → D615 → D923 → D1231 — every adjacent pair has a
lossless From / widen() plus a fallible TryFrom /
narrow(). New x-wide and xx-wide Cargo umbrellas gate the
wider ranges so default builds stay lean.
The strict / fast dispatcher rule changes too: strict is now
the default plain dispatch in every build, and wins on tiebreak
when both strict and fast are enabled. The only way to
land plain sin / ln / sqrt etc. on the f64 bridge is a
deliberate three-step opt-out (default-features = false + add
fast + add std + don't re-add strict). The named
*_strict and *_fast methods stay available regardless of
feature choice.
Added — new decimal tiers¶
D56(192-bit),D114(384-bit),D230(768-bit) — half-width tiers between every existing power-of-two width. Gated behindd56/d114/d230individually or by the expandedwideumbrella (nowD56–D307together).D461(1536-bit),D615(2048-bit) — new x-wide tier, gated behindx-wide(ord461/d615individually).D923(3072-bit),D1231(4096-bit) — new xx-wide tier, gated behindxx-wide(ord923/d1231individually).- The naming rule: the number on every
D{N}type is the highest safeSCALE(MAX_SCALE) the storage can hold, i.e. the number of decimal digits you can represent without overflow. SoD1231meansMAX_SCALE = 1232(∼ 1232 decimal digits of headroom), not "1231 bits". - Comprehensive scale aliases per the new tiers: ≥ 16 per tier above the narrow range, covering 0 / common midpoints / the previous tier's MAX_SCALE as a cross-tier sentinel / the new tier's MAX_SCALE.
Changed — breaking¶
D38.widen()now returnsD56<SCALE>instead ofD76<SCALE>. Symmetrically,D76.narrow()→D56,D76.widen()→D114,D153.narrow()→D114,D153.widen()→D230,D307.narrow()→D230, andD307.widen()is new (→D461, gated behindx-wide). The legacy power-of-two-next-up semantics are gone; the comprehensive ladder is the new default. Callers that need the old jump can use.into()/.try_into()to skip rungs.strict+fastnow resolves to strict. Previouslyfastwon the tiebreak. The strict path is now fast enough (ln_strictat D38<19> is ~1.5 µs) that staying on the deterministic correctly-rounded path by default is the right call across more codepaths.(no feature)builds now dispatch plain transcendentals to*_stricttoo (previously they fell through to*_fast). Same reasoning: strict-by-default unless deliberately opted out.
Added — performance¶
- Chain-of-÷10^38 rescale for wide-tier
mulatSCALE > 38: factorsn / 10^SCALEas a sequence ofn / 10^38chunks, each riding the existing base-2^128 MG 2-by-1 magic kernel. Combined-remainder bookkeeping preserves HalfToEven correctness across chunks. Measured wins: - D307<150> mul: 786 ns → 434 ns (1.8× faster)
- D461<230> mul: 1.62 µs → 866 ns (1.9×)
- D615<308> mul: 2.20 µs → 1.36 µs (1.6×)
- D923<461> mul: 3.30 µs → 2.68 µs (1.2×)
- D1231<616>: marginal (chain length eats the per-pass win; needs Barrett or wider magic tables — tracked for 0.3.x).
d56/d114/.../d1231per-tier features can be enabled individually if you don't want a full umbrella.
Added — benches + tooling¶
- Per-width
lib_cmp_d{N}benches — 13 new bench binaries (cargo bench --bench lib_cmp_d307) replace the monolithiclibrary_comparison.rs. Each tier runs in minutes instead of hours; iterating on one tier's perf doesn't need a full matrix sweep. Shared macros + helpers live inbenches/lib_cmp_common.rs. benches/quick_div.rs— focused microbench for D307/D615/D923/D1231 div + mul. Used during the wide-tier perf tuning passes.- Per-width summary chart family — one PNG per power-of-two
storage width (
docs/figures/library_comparison/summary_{N}bit.png) showing every op (add / sub / neg / mul / div / rem / sqrt / ln / exp / sin / cos / tan / atan / sinh / cosh / tanh) on a log-y axis with one bar per library. Re-rendered automatically via thescripts/refresh_bench_artifacts.shworkflow. scripts/bench_log_to_medians.py— extracts the criterion medians from any number of bench-log files intotarget/medians.tsvso chart_gen.rs picks up the latest run.- chart_gen filter tightened: only renders multi-library line charts where ≥ 2 libraries have ≥ 2 data points, so scatter-of-dots charts get dropped automatically.
- Trig family in the bench matrix —
cos/tan/atan/sinh/cosh/tanhbenched for every peer that ships them (decimal-scaled, fastnum, g_math, rust_decimal cos/tan). examples/rounding_mode_probe.rs— diagnostic that prints the candidate renderings under each rounding mode forexp(1),sin(1),ln(2),sqrt(2). Used to verify that the §5 "1 ULP" entries onfastnum/rust_decimalwere render-mode artifacts (they carry the correct value internally), not computation errors.examples/fast_vs_strict_ulp.rs— per-tier accuracy-loss table for*_fastvs*_strict. Drives the new §2.1 indocs/benchmarks.md.
Fixed¶
- D56 work-integer overflow — D56's transcendental work
integer was Int512, which couldn't carry the squared
intermediate at
SCALE + GUARD = 86working scale. Bumped to Int1024. Caught by the in-flight bench sweep. feature = "xx-wide"-only builds — several macro arms insrc/macros/full.rsandsrc/mg_divide.rsgated only onwide/x-wideand missedxx-wide/d923/d1231, so anxx-wide-only build failed to compile. Gates extended.
Docs¶
- §5 in
docs/benchmarks.mdrewritten as "Where each crate fits" — feature-matrix-first framing, per-storage-width summary charts, an explicit note distinguishing render-mode artifacts (fastnum / rust_decimal / decimal-rs) from real precision losses (dashu-float 4 ULP exp(1), g_math 6–46 ULP). Bench-doc tone shifted from "competition" to "here's where each crate fits, here's where ours fits". docs/widths.md,docs/getting-started.md,docs/strict-mode.md,docs/features.mdall updated to enumerate the thirteen-tier ladder and describe the new strict-by-default dispatcher.- README — "Two headline guarantees" lede now promotes both ≤ 0.5 ULP correctness AND caller-chosen rounding mode at every lossy operation; tier table extended to all 13 widths.
ALGORITHMS.md— wide-int section enumerates every shipped storage type; atan halvings table extended to cover the new wide tiers; tier-listing references updated from the old four-tier list.ROADMAP.md— explicit versioning intent table (0.3.0 done / 0.4.0 signed SCALE + RNG / 1.0.0 gated on competitive wide-tier mul/div or per-row documented gap); MG magic-multiply extension + Barrett path queued for 0.3.x; out-of-tree ecosystem crates (decimal-scaled-expr / -math / -finance) with the expression-engine dual-track + whole-tree serialisation design captured.
[0.2.5]¶
Docs + benchmark accuracy patch. Library code, public API, and on-wire format are byte-identical to 0.2.4.
Fixed¶
docs/benchmarks.md- every numeric cell in the arithmetic, fast-transcendental, strict-transcendental, and wide-integer-backend tables was re-measured on a single machine in one criterion run with the default 3 s warm-up, 50-sample (D38-and-narrower) or 20-sample (wide tier) windows. The previous numbers were collected with a much shorter warm-up / fewer samples and several rows shipped unsubstituted__LOSSY_*__/__STRICT_*__template placeholders.benches/decimal_backends.rs- theD128_lossyandD256_lossyrows called the plain*dispatcher methods, which with the defaultstrictCargo feature flip to the*_strictinteger kernel. The rows therefore measured the strict path twice instead of contrasting fast vs strict. They now call*_fastexplicitly, so the fast / strict distinction shown in the docs is honest.
Changed¶
docs/benchmarks.md§2 "Fast transcendentals" - table reshaped from the unsubstituted "D9 / D18 / D38 fast" placeholders to the actually-benched "D38*_fast/ D76*_fast/rust_decimal" comparison, with a prose note that D9 / D18*_fastshare the D38 f64-bridge kernel viato_f64/from_f64and incur only a sub-ns round-trip on top of the listed D38 numbers.docs/benchmarks.mdmethodology section - warm-up / sample-size text updated to match the bench harness's actual configuration (3 s warm-up, auto-tuned measurement window, 50 or 20 samples depending on tier).docs/benchmarks.md§3 strict transcendental tables - collapsed from one column per (width, scale) to one column per width, each cell showing only the s = mid measurement (the honest series-cost scale - s = 0 hits fast-path early returns and s = max sometimes shortens via Cody-Waite range reduction, so neither is a fair comparator). The chosen mid is listed in the column header (e.g.D76 (s=35)).docs/benchmarks.mdTime units table - added picosecond row and reframed the third column as "Relative to a second" instead of "Relative tons" for consistency across the table.docs/benchmarks.mddata-cell rendering - scientific notation in data cells (e.g.1.46×10⁻³ µs) replaced with plain decimals (0.00146 µs). Scientific notation is now reserved for values smaller than 10⁻⁵ of the row's unit (none in the current tables). Time units table is unchanged (still uses10⁻³ s,10⁻⁶ s, etc. for second relationships).
Added¶
benches/library_comparison.rs- new bench that pitsdecimal-scaledagainst every viable peer on crates.io (fastnum,bigdecimal,dashu-float,decimal-rs,rust_decimal,fixed::I*F*,g_math) across all six decimal-scaled width tiers (32 / 64 / 128 / 256 / 512 / 1024 bit) at three scales per tier (s=0, s=mid, s=max).examples/ulp_report.rs- one-shot accuracy report measuring ULP error for each library'sln(2)/exp(1)/sin(1)/sqrt(2)against aD76<19>integer-only*_strictbaseline. Confirmsdecimal-scaledis 0 ULP on every transcendental tested and showsg_math's "0 ULP transcendentals" marketing claim is 6–46 ULP off at the matched precision.examples/chart_gen.rs- pure-Rust (plotters) chart generator that readstarget/medians.tsvand emits one layered line chart per (op × width) todocs/figures/library_comparison/. 60 PNGs total; the meaningful-variation subset is embedded indocs/benchmarks.md§5.docs/figures/library_comparison/*.png- 52 generated charts (every op × width combination that has measurements).docs/benchmarks.md§5 Library comparison - new chapter with one speed table per width tier (s=mid representative scale, library-by-library), an accuracy table at the 128-bit tier (ULP errors for every supported transcendental across every library), embedded charts for the meaningful-variation ops, and a "reading the comparison" buyers-guide paragraph that maps "what do you need" → "which crate".- Dev-dependencies:
fastnum,bigdecimal,dashu-float,decimal-rs,scientific,plotters- all bench/example-only, none compiled into a normal build. ROADMAP.md(repo root) - tracked list of throughput gaps surfaced by the §5 library comparison and the planned fix per item (Burnikel-Ziegler divide, Karatsuba/Toom-3 mul,*_approx(working_digits)transcendental family). Cross-linked fromdocs/benchmarks.mdRoadmap section.
[0.2.4]¶
Agent-ecosystem additions. No library code changes - the crate's public API, behaviour, and on-wire format are byte-identical to 0.2.3. The bump exists so the new agent-facing assets ship in the crates.io tarball.
Added¶
AGENTS.md(top level) - tool-agnostic usage guide consumable by Cursor, Continue, Aider, Codeium and any other agent runner that crawls a repo forAGENTS.md. Covers width / scale picking, the strict-vs-fast dual API, rounding modes,DecimalConsts, rescaling, serde format, common anti-patterns, Cargo feature cheat sheet, and quick recipes..claude/skills/decimal-scaled.md- Claude Code skill (same content asAGENTS.md) withname/descriptionfrontmatter so Claude Code can auto-discover and invoke the skill when a user prompt mentions the crate.
[0.2.3]¶
Documentation patch (and matching test additions). The 0.2.2 docs
incorrectly listed golden among the constants that don't fit
D38<38>'s storage range - the code was correct (golden ≈ 1.618 is
inside the ±1.70141 storage range, so the method returns the
correctly-rounded value), but the prose and CHANGELOG copy claimed it
panicked. Fixed.
Fixed¶
- Documentation in
consts.rs(module preamble +DecimalConststrait doc) andCHANGELOG.mdfor 0.2.2 said the larger-magnitude constants that overflowD38<38>storage were "pi, tau, e, golden".golden ≈ 1.61803actually fits the ±1.70141 storage range and the method returns the correctly-rounded value; onlypi(3.14),tau(6.28), ande(2.72) overflow. Docs corrected.
Added¶
tests/consts.rsinline mod: explicit#[should_panic]tests pinningD38<38>::pi(),D38<38>::tau(), andD38<38>::e()to the storage-overflow panic. Promotes the prior single-constant spot test to cover all three.- The
fitting_constants_at_scale_38_are_correctly_roundedtest now also assertsD38<38>::golden()is correctly rounded to 0.5 ULP (= 1 LSB) of the canonical 38-digit value.
[0.2.2]¶
DecimalConsts 0.5-ULP contract is now uniform across every supported
scale - the 0.2.0 / 0.2.1 ≈ 5 ULP "exception at D38<38>" is gone.
Changed¶
DecimalConstsreference precision - every constant on D9 / D18 / D38 is now derived from the 75-digitInt256reference (the same one the wide tier already used), rescaled down half-to-even to the caller'sSCALE. The previous code used a 37-digiti128reference and rescaled upward by 10 atD38<38>, which appended a placeholder zero and left the result ≈ 5 ULP off the canonical value. Every result on every supported scale on every width is now within 0.5 ULP of the canonical decimal expansion - the precision contract holds with no documented exceptions.D38<38>storage-range overflow - atSCALE = 38the D38 storage range is approximately ±1.7, so the four larger-magnitude constants (pi ≈ 3.14,tau ≈ 6.28,e ≈ 2.72,golden ≈ 1.62) genuinely cannot be represented. The corresponding methods previously panicked with the generic rescale messageD38::rescale: scale-up overflow; they now panic with the explicitD38 constant out of storage range: <name> cannot fit i128 at SCALE = 38.D38<38>::half_pi()andD38<38>::quarter_pi()(which fit storage) are correctly rounded to 0.5 ULP - verified by a new test asserting|result − truth| ≤ 1 LSBat the 38-digit storage scale.
Fixed¶
- The "
D38<38>≈ 5 ULP exception" mentioned in 0.2.1'sDecimalConstsmodule / trait docs is removed from both the preamble and the trait blurb; the rewritten docs state the now- uniform 0.5 ULP contract. docs/strict-mode.md"Choosing the configuration" table reflowed: the default isstrict(not the f64 bridge), and thefastfeature row no longer claims it drops the*_strictsurface (corrected in 0.2.0 elsewhere; the table row was missed).
[0.2.1]¶
Documentation patch - no API changes.
Fixed¶
- docs.rs build: the rendered crate page on https://docs.rs/decimal-scaled
was only showing the default-feature surface, so the wide-tier types
(
D76/D153/D307), thedNN!proc-macros, and theserde_helpersmodule were missing. Added a[package.metadata.docs.rs]block that enablesstd,serde,strict,macros,wide, andx-widefor the docs build. The matchingdocsrscfg is also wired so future#[cfg(docsrs)]doc-cfg badges can highlight feature-gated items. constsmodule +DecimalConststrait docs: the preamble and per- method blurbs claimed every constant was rescaled from a single 37-digiti128reference. That was true for D9/D18/D38 but ignored the per-tier raw references shipped for D76 (75 digits), D153 (153 digits), and D307 (307 digits) underconsts_wide.rs. The new docs include a per-tier reference table and an explicit statement of the precision contract: within 0.5 ULP at every supported scale on every width, with the documented exception ofD38<38>(the D38 maximum, rescaled upward by 10 from the 37-digit reference - ≈ 5 ULP bound onpi/tau/e/golden).
[0.2.0]¶
The 0.2 release rounds out the family the 0.1 line scaffolded: every
width ships the full method surface, the Decimal trait carries the
width-generic API, and the strict / fast routing is symmetric and
explicit.
Added¶
- Wide tier (D76 / D153 / D307) - the 256 / 512 / 1024-bit decimal
widths are now feature-complete. Each implements every surface D38
has: arithmetic and bitwise operators, sign methods, integer
methods, overflow variants, pow + powi + the four pow overflow
variants, cross-type
PartialEqagainst every primitive integer and float, the float bridge (from_f64,from_f64_with,to_f64,to_f32), serde round-trip, and the full strict-transcendental surface - every*_strictmethod plus a mode-aware*_strict_with(mode)sibling. Two AGM alternatesln_strict_agm/exp_strict_agm(Brent–Salamin 1976, Newton-on-AGM-ln) are exposed alongside the canonical artanh / Taylor paths. - In-tree wide-integer module (
crate::wide_int) - the wide tier is now backed by a hand-rolledInt256/Int512/Int1024/Int2048/Int4096family (plus unsigned siblings) emitted by a macro. No external big-integer dependency. Includes Karatsuba multiplication (dispatched at the 16-limb threshold), Knuth Algorithm D, and a Burnikel–Ziegler recursive divide wrapper. Decimaltrait - expanded surface - the trait now carries every uniform method every width implements: arithmetic, bitwise, and shift operators as supertrait bounds; sign (abs,signum,is_positive,is_negative); integer methods (div_euclid,rem_euclid,div_floor,div_ceil,abs_diff,midpoint,mul_add); integer-shape predicates (is_nan,is_infinite,is_finite); the full pow + checked/wrapping/saturating/overflowing pow family; the fullchecked_*/wrapping_*/saturating_*/overflowing_*ofadd/sub/mul/div/neg/rem; integer conversion (from_i32,to_int,to_int_with); the float bridge gated onstd; and default reductions (is_zero,is_one,is_normal,sum,product). PlusDebug/Display/Hashsupertraits.d9!/d18!/d76!/d153!/d307!proc-macros - matchingd38!per-width entry points, including:- per-scale wrappers (
d38s12!,d18s6!, etc.) that pre-bake the scale qualifier; - radix prefix integers (
0xFF,0o755,0b1010); - the explicit
radix Nqualifier; - fractional radix literals (
d76!(1.A3, radix 16, scale 12)); - explicit
scale Nandroundedqualifiers. *_strictand*_fastnamed methods always available - both surfaces compile in every feature configuration (subject only to dependency gates -*_fastneedsfeature = "std"). The plain*form is the only thing thestrict/fastfeatures control.widen()/narrow()hop methods - promote to the next storage tier or demote with a fallible narrowing, without the longhandFrom::from/TryFrom::try_fromsyntax.rescale_with(mode)mode-aware sibling on every width.with_scale<TARGET>()builder-style alias forrescale.*_with(mode)siblings throughout - every default-rounding operation (from_f64,to_int,rescale, etc.) now has a sibling taking an explicitRoundingMode.from_num/to_numon D38 (insrc/num_traits.rs, renamed fromfixed_compat) - saturating, never-panicking constructors and readers that thread throughnum_traits::NumCast.hypot_stricton every width - integer-only, correctly-roundedsqrt(a² + b²)via the scale-trick algorithm.
Changed¶
- Type names - public types now name themselves by safe decimal
digit capacity (
D9/D18/D38/D76/D153/D307) rather than by underlying integer bit-width. The number in the type name is also the type'sMAX_SCALE. - Strict mode is the default -
default = ["std", "serde", "strict"]. Build without default features to get the f64-bridge path. *_fast(formerly suffix-free) - the f64-bridge methods are now named*_fast(ln_fast,sin_fast, …) for symmetry with*_strict. Plain*is the feature-driven dispatcher.fastfeature contract - no longer drops the*_strictsurface; only forces plain*to resolve to*_fast.Decimaltrait supertrait bounds - extended withDefault,Debug,Display,Hash, all arithmetic /*Assignoperators, and the full bitwise / shift operator set.fixed_compat.rs→num_traits.rs- file renamed; module docs no longer reference thefixedcrate. Thefrom_num/to_nummethods themselves are unchanged.- README, docs/, and crate-level documentation rewritten to reflect the all-six-widths reality. Stale claims about D38-only-implements-trait, bnum-backed wide tier, "wide tier not yet wired", "Karatsuba is a future optimisation", and "fast drops the strict surface" are all corrected.
Removed¶
- The
bnumdependency - wide-tier storage migrated to the in-treewide_intmodule.bnumand friends remain as[dev-dependencies]for the benchmark baselines only. _lossy/_fastfloat-conversion suffixes - the float conversion methods are nowto_f64,from_f64,to_f32,from_f64_with. The historic_lossy/_fastsuffixes were redundant since there is no strict counterpart for these (they are type conversions, not transcendentals)._lossy/_intinteger-conversion suffixes dropped for the same reason -from_int/to_int/to_int_withare the only variants.- Placeholder wide-tier feature flags (
d115,d230,d462,d616,d924,d1232) - these were forward-planned widths that were never implemented. Shipping no-op feature flags would mislead users pinning to them. Will be re-added when the corresponding storage types land. - Dead code in
mg_divide- the unuseddiv_exp_fast_2wordwrapper (only the_with_remvariant has callers). - Inline test mods that ran without asserting - the runtime
if !DEFAULT_IS_HALF_TO_EVEN { return; }guard pattern was replaced with module-level#![cfg(...)]so tests never silently no-op under a non-defaultrounding-*feature.
Fixed¶
- Strict/fast routing defect - pre-0.2 the
*_strictmethods werecfg(not(feature = "fast"))and the*_fastmethods werecfg(all(feature = "std", any(not(feature = "strict"), feature = "fast"))), so in the default-strict build there was no way to call the f64-bridge methods by name, and vice versa. Both surfaces now always compile (subject tostdfor*_fast). - Module-level doc comment staleness - six modules contained D38-only narratives / "Phase N will add" / "future widths" / broken file-path references; rewritten to match the all-six-widths reality.
- Broken intra-doc links -
[Self::MIN]at module scope,[FromStr]withoutcore::str::prefix,[D38::rescale]at module scope,[num_traits::Zero]shadowed by the post-renamecrate::num_traitsmod - all fixed.cargo doc --no-deps --document-private-itemsnow reports zero warnings. - Crate-wide warning-clean build under every feature
combination -
default,--no-default-features,fast,strict,wide,x-wide, and combinations thereof. - Coverage hardening - comprehensive functional tests added for
every public surface, the wide-integer kernels,
mg_divide, the guard-digitd_w128_kernels, and every transcendental's domain panic. Tests are functional (named by behaviour, not by uncovered line) and topic-organised intests/.
Compile-time / MSRV¶
- MSRV declared as Rust 1.85 (lower bound for the 2024 edition).
Migration notes¶
- The
D128(etc.) type names are gone - they were renamed to their digit-capacity counterparts in the 0.1 line. If you pinned to a pre-rename name, update to the new spelling. - Code that called
.ln()/.sin()etc. and relied on the f64 bridge being present in the default strict build now still compiles, but the routing has been clarified - call.ln_fast()/.sin_fast()directly if you specifically want the f64 path regardless of the build's feature set. - The
_lossy/_fastsuffixes on float conversion methods (to_f64_lossy,from_f64_fast, …) have been removed across two prior renames; the methods are now justto_f64/from_f64/ etc. Update any leftover suffixed call sites. - If you depended on a placeholder wide-tier feature flag (
d115,d230,d462,d616,d924,d1232), the flag no longer exists. Usewideorx-wideto cover the implemented widths.
[0.1.1] - 2025-12¶
Bug-fix release of the initial public 0.1 line. Refer to the git
history under tag v0.1.1 for the full commit log; the changes
focused on the repo URL / documentation metadata.
[0.1.0] - 2025-12¶
Initial public release. Established the core D38<const SCALE: u32>
type, the strict-vs-fast transcendental dual API, the 256-bit
Möller-Granlund magic-number divide path for mul/div, the
correctly-rounded sqrt / cbrt via exact-integer radicand, the serde
helpers, the d128! macro, and the docs/benchmarks scaffolding.