Popcorn

Popcorn is a .Net Middleware for your RESTful API that allows your consumers to request exactly as much or as little as they need, with no effort from you.

This project is maintained by Skyward App Company

Contents
Latest Post

Incremental optimizations — three-step walk

Captured 2026-04-23. Baseline in ../README.md (commit 9168c4c, .NET 9.0.15). Each step below is an isolated generator change, tested against the full functional suite (182 passing, 13 skipped throughout), benchmarked with the same BDN configuration.

Raw logs:

The three changes

#1 — Replace LINQ .Any() / .FirstOrDefault() with for-loops

ExpanderGenerator.cs:CreateComplexObjectInnerBody. The emitted converter body used properties.Any(p => ...) twice for !all/!default flag detection and properties.FirstOrDefault(p => ...) once per property. All replaced with a single index-loop pass for flags and inline index-loops per property. Same complexity, different constant factor — and no per-call iterator state (even when JIT stack-allocates the enumerator, the method-call / virtual-dispatch overhead is real).

#2 — Hoist useAll / useDefault / naming out of the inner converter into list/dict callers

ExpanderGenerator.cs:FlagSetupCode. Added a split: every complex-object Pop now has a 4-arg wrapper that computes flags + naming delegate, and a new Pop{X}Inner overload accepting pre-computed values. List/dict converters compute the setup ONCE outside the foreach and call _Inner per item. For a 100-item list, this eliminates 99 redundant scans and 99 naming-delegate allocations. Single-member callsites (nested properties with differing propertyReference.Children) keep using the 4-arg overload.

#3 — Elide HashSet<object> allocation for cycle-safe type graphs

ExpanderGenerator.cs:IsConverterCycleSafe. A DFS through each converter’s payload type (unwrapping arrays / IEnumerable<T> / IDictionary<K,V> / Nullable<T>) flags the converter as cycle-safe when no transitive property can reach back to an ancestor of itself. Cycle-safe converters pass null instead of new HashSet<object>() at the entry point; body ops become null-conditional. Cycle-risky types (e.g. ComplexNestedModel with its Child back-reference, CircularReferenceModel, DeepNestingModel) continue to allocate and track.

Incremental results (DefaultJob, mean time ns / allocated bytes)

Benchmark Original #1 LINQ→for #2 hoist flags #3 elide HashSet Total Δ time Total Δ alloc
SimpleModel_PopcornDefault 262.8 / 592 223.6 / 592 230.5 / 592 204.0 / 416 −22% −30%
SimpleModel_PopcornAll 351.3 / 776 320.7 / 776 326.7 / 776 292.0 / 600 −17% −23%
SimpleModel_PopcornCustom 335.1 / 632 335.5 / 632 306.6 / 632 280.5 / 456 −16% −28%
SimpleModelList_PopcornDefault 16,415 / 35,808 14,381 / 35,808 14,047 / 29,472 12,491 / 29,296 −24% −18%
SimpleModelList_PopcornAll 26,861 / 54,024 24,704 / 54,016 23,581 / 47,680 21,775 / 47,504 −19% −12%
SimpleModelList_PopcornCustom 23,775 / 38,840 24,132 / 38,840 20,651 / 32,504 19,491 / 32,328 −18% −17%
ComplexModel_PopcornDefault 252.3 / 520 215.6 / 520 222.0 / 512 225.6 / 520 −11% ±0%
ComplexModel_PopcornAll 2,629 / 3,880 1,874 / 3,880 1,737 / 3,496 1,795 / 3,496 −32% −10%
ComplexModel_PopcornCustom 2,325 / 3,720 2,089 / 3,728 2,049 / 3,344 2,059 / 3,344 −11% −10%
ComplexModelList_PopcornDefault 3,884 / 6,992 3,238 / 6,992 3,170 / 5,464 3,150 / 5,456 −19% −22%
ComplexModelList_PopcornAll 36,837 / 56,640 28,885 / 56,640 28,144 / 50,992 27,824 / 51,000 −24% −10%
ComplexModelList_PopcornCustom 35,384 / 50,528 32,241 / 50,528 32,486 / 44,888 31,692 / 44,880 −10% −11%

What each change contributed

Step #1 (LINQ → for): primarily time wins, no allocation delta. In modern .NET, List<T>.Any() and .FirstOrDefault() use the concrete List<T>.Enumerator struct (often stack-allocated by the JIT). The gains come from eliminating method-call overhead, delegate dispatch to the predicate lambda, and unnecessary enumerator state management — not heap allocations. Strongest on ComplexModel_PopcornAll (−29%) where the per-property FirstOrDefault fires most often.

Step #2 (hoist flags): primarily allocation wins on list/dict scenarios (−10% to −22%) with modest time gains (−5% to −14%). The Func<string, string> naming delegate was being allocated per item inside each list iteration — 100× for a 100-item list. Hoisting it to the list converter’s body reduces that to 1. Time savings come from not re-scanning PropertyReferences 100 times.

Step #3 (elide HashSet): both time and allocation wins on single-object scalar scenarios (−15% time / −30% alloc on SimpleModel_PopcornDefault). The new HashSet<object>() at each converter entry is ~80 bytes including its internal buckets array. Cycle-safe types (SimpleModel, ScalableModel, AttributeHeavyModel, PropertyMappingModel, and any list/dict/nullable wrapper thereof) skip this entirely. Cycle-risky types (ComplexNestedModel, CircularReferenceModel, DeepNestingModel) are unchanged — which is why ComplexModel results don’t move for this step.

Ratios vs baselines — where we ended up

Scenario vs Stj_Reflection (time) vs Legacy_Default (time)
SimpleModel_PopcornAll 1.77× (was 2.03×) 0.41×
SimpleModelList_PopcornAll 1.40× (was 1.80×) 0.31×
SimpleModelList_PopcornDefault 0.81× (was 1.10×) 0.18×
ComplexModelList_PopcornAll 0.87× (was 0.97×) — now faster than STJ when emitting everything 0.22×
ComplexModelList_PopcornDefault 0.10× (unchanged — already dominant) 0.17×

The “flat simple list” case (SimpleModelList_PopcornAll) — previously the worst Popcorn vs STJ showing at 1.80× slower — is now 1.40× slower. Still not parity, but meaningfully tighter. The “complex nested list with everything” case (ComplexModelList_PopcornAll) crossed parity: 0.87× STJ while still supporting selective-fetch at runtime.

What didn’t move (and why)

  • ComplexModel scenarios barely changed in step #3. Expected — ComplexNestedModel has a Child: ComplexNestedModel? self-reference, so the cycle-safety analyzer correctly flags it as cycle-risky and keeps the HashSet.
  • Custom include lists don’t benefit as much as Default/All. The per-property include-match scan loops over the include list; larger include lists = more work per property. Orthogonal to the three changes here.

What’s still on the table (future work, not in this run)

  • Pre-encoded property names (JsonEncodedText) like STJ source-gen uses. Would save per-property UTF-16→UTF-8 encoding cost. Complicated by runtime PropertyNamingPolicy — a policy in effect means the encoded form depends on options, forcing per-options caching. Probably the biggest remaining lever; not attempted here.
  • Replace Span<char>.Equals with a pre-computed hash per include-list entry. For types with many properties and large include lists, the linear scan per property is O(n·m). A small hashtable keyed by name would be O(1). Marginal — list sizes are typically small.
  • Skip the include-match scan when useAll and no negations are present. Every property unconditionally emits under !all, so the per-property scan is pure overhead. A one-time check at the top of the body could bypass the scan entirely.