<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>CDAC on Welcome to Christophe Nasarre's Blog</title><link>https://chrisnas.github.io/tags/cdac/</link><description>Recent content in CDAC on Welcome to Christophe Nasarre's Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 17 Jun 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://chrisnas.github.io/tags/cdac/index.xml" rel="self" type="application/rss+xml"/><item><title>Reading CLR internals the cDAC way: contracts instead of the DAC</title><link>https://chrisnas.github.io/posts/2026-06-17_reading-clr-internals-the/</link><pubDate>Wed, 17 Jun 2026 09:00:00 +0000</pubDate><guid>https://chrisnas.github.io/posts/2026-06-17_reading-clr-internals-the/</guid><description>Why .NET is replacing the DAC with the cDAC data contracts, how they work locally / over a process / from a dump, and how to map the GitHub .md contracts to C# - with a from-scratch !eeheap as a working example.</description><content:encoded><![CDATA[<hr>
<p>I have already spent a lot of time accessing CLR internals from the outside: walking the GC heap, listing loader allocators, decoding thread state. If you have followed <a href="/posts/2017-02-21_clrmd-part-1-going-beyond/">my ClrMD series</a> or the <a href="/posts/2022-07-28_digging-into-the-clr/">Digging into the CLR</a> post, you know the drill. What you might not know is that .NET is quietly changing the very foundation those tools stand on: the venerable <strong>DAC</strong> is being replaced by the <strong>cDAC data contracts</strong>, which should be officially supported in .NET 11 even though you might already find some exposed in .NET 9 and .NET 10.</p>
<p>In this post, I will explain how the contracts work (locally, over a live process, and from a dump), what happens under the hood - including the GC &ldquo;extended&rdquo; descriptor - and how to map the contract <code>.md</code> files on GitHub to actual C# code. To make it concrete, I reimplemented the SOS <code>!eeheap</code> command from scratch on top of the contracts only: a small <code>RuntimeDataContract</code> solution with <strong>no ClrMD and no runtime contract assemblies</strong>, where everything is read straight from the target memory.</p>
<h2 id="the-dac-way-and-why-it-might-hurt">The DAC way, and why it might hurt</h2>
<p>Today, every tool that inspects a running CLR or a crash dump goes through the <strong>DAC</strong> (Data Access Component), shipped as <code>mscordaccore.dll</code>. The DAC is a private, compiled-in copy of the runtime&rsquo;s data-structure knowledge: it knows the layout of every internal type, it reads the target&rsquo;s memory through the <code>ICLRDataTarget</code> interface supplied by the host, and it exposes that knowledge through COM interfaces such as <code>ISOSDacInterface</code>. ClrMD sits on top of those interfaces - you can see it throughout its <code>DacImplementation</code> folder, for example in <a href="https://github.com/microsoft/clrmd/blob/main/src/Microsoft.Diagnostics.Runtime/ClrRuntime.cs">ClrRuntime.cs</a> where <code>EnumerateClrNativeHeaps()</code> calls down into the DAC to walk the loader and code heaps.</p>
<p>This works, but the established CoreCLR debugger architecture has a hard requirement baked in: the debugger must acquire and load DAC (and DBI) libraries that <strong>exactly match</strong> the runtime build being debugged. The wrong <code>mscordaccore.dll</code> and you get nothing - no heap, no threads, no analysis. The <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/datacontracts_design.md">data contracts design doc</a> spells out the concrete problems that match-exactly requirement creates:</p>
<ul>
<li><strong>Security.</strong> The DAC/DBI that matches the exact runtime may be untrusted - think custom or 3rd-party builds of the runtime - which is exactly why you hit signature-validation errors when loading them in a debugger.</li>
<li><strong>Servicing.</strong> It is hard to ship a debugger-only fix in the DAC/DBI without shipping a whole new runtime build. The debugger behavior only improves once a new runtime is built and targeted.</li>
<li><strong>Acquisition.</strong> Where do you even get the DAC/DBI that matches the exact runtime version of a dump captured on some other machine? This is the recurring headache of dump debugging.</li>
<li><strong>Cross-architecture.</strong> The host/target combination of the DAC/DBI may simply not be available - for instance analyzing a Linux-arm64 target from a Windows-x64 machine.</li>
</ul>
<p>On top of those, the struct layouts are baked into a native binary: consumers cannot see them, cannot version them, and cannot reason about what changed between two builds, while every runtime change risks silently breaking the DAC.</p>
<p>The runtime team&rsquo;s answer is drastic: instead of shipping a binary that <em>knows</em> the layout, make the runtime <strong>publish its own layout as a versioned contract</strong>, eliminating the need for an exactly-matching DAC and DBI. A single (managed today) reader can then work across builds, operating systems, and architectures. This is the <strong>cDAC</strong> (the &ldquo;c&rdquo; stands for <em>contract</em>), and the design is documented in <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/datacontracts_design.md">datacontracts_design.md</a>.</p>
<h2 id="cdac-basics-the-contract-descriptor">cDAC basics: the contract descriptor</h2>
<p>Before diving in, here is the whole mechanism in one picture. It is the mental model you need to keep in mind:</p>
<p><img alt="From the exported DotNetRuntimeContractDescriptor symbol to the struct, its JSON descriptor and the pointer_data auxiliary array" loading="lazy" src="/posts/2026-06-17_reading-clr-internals-the/DescriptorStructure.png"></p>
<p>A data contract has two halves:</p>
<ul>
<li>The <strong>data descriptor</strong> provides the set of global variables and types, with their field offsets, that the runtime publishes about itself - the <em>&ldquo;what it looks like in memory&rdquo;</em>.</li>
<li>The <strong>algorithmic contracts</strong> are versioned algorithms that walk those structures - the <em>&ldquo;how to interpret it&rdquo;</em>. They are documented in the <a href="https://github.com/dotnet/runtime/tree/main/docs/design/datacontracts">.NET runtime repository</a>. The target advertises a <code>contract -&gt; version</code> map, and a reader has to switch to the implementation of the version it finds. As of today, the runtime repository provides such <a href="https://github.com/dotnet/runtime/tree/main/src/native/managed/cdac">a reader in managed code</a>.</li>
</ul>
<p>Everything starts from a single exported symbol: <code>DotNetRuntimeContractDescriptor</code>. It points at a small, fixed-shape header whose only job is to bootstrap the rest. Here is the layout, straight from <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/contract-descriptor.md">contract-descriptor.md</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-c" data-lang="c"><span class="line"><span class="cl"><span class="k">struct</span> <span class="n">DotNetRuntimeContractDescriptor</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span>  <span class="n">magic</span><span class="p">;</span>              <span class="c1">// &#34;DNCCDAC\0&#34; in little endian and &#34;\0CADCCND&#34; in big endian
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>  <span class="n">flags</span><span class="p">;</span>              <span class="c1">// bit0 = 1, bit1 = pointer size (0 =&gt; 64-bit, 1 =&gt; 32-bit)
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>  <span class="n">descriptor_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">descriptor</span><span class="p">;</span>       <span class="c1">// UTF-8 JSON (length = descriptor_size, NOT NUL-terminated)
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>  <span class="n">pointer_data_count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>  <span class="n">pad0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uintptr_t</span> <span class="o">*</span><span class="n">pointer_data</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Two important details are hiding in this struct. First, <code>magic</code> is a <code>uint64</code> written in the <em>target&rsquo;s</em> endianness; because it is a multi-byte value, comparing its in-memory bytes against the two known orderings both validates the descriptor and reveals the target&rsquo;s endianness. Second, <code>flags</code> bit1 tells you the pointer size (8 bytes for 64-bit and 4 bytes for 32-bit), so you can interpret every address that follows.</p>
<p>The <code>descriptor</code> field points to the contract details as UTF-8 JSON text (not NUL-terminated) whose size is given by the <code>descriptor_size</code> field. Since runtime addresses cannot be encoded in text that is built at compile time, values like real pointers are stored in a side array, <code>pointer_data</code>, and each pointer is referenced from the JSON by index. Index 0 always holds the value 0. If you are interested in how these global/index mappings are generated, search for <code>CDAC_TYPE_FIELD</code> in the CLR source code.</p>
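<p>Because the struct mixes fixed-size integers with pointers, the offsets of the last three fields depend on the pointer size. Here is a minimal sketch of that arithmetic, assuming the natural alignment of the C struct shown above; <code>HeaderLayout</code> is a hypothetical helper written for this illustration, not a type from the tool or the runtime:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical helper (not part of the tool): header field offsets,
// assuming natural alignment of the C struct shown above.
static class HeaderLayout
{
    public const int Magic = 0;            // uint64_t
    public const int Flags = 8;            // uint32_t
    public const int DescriptorSize = 12;  // uint32_t
    public const int Descriptor = 16;      // const char*, pointer-sized

    // uint32_t pointer_data_count follows the descriptor pointer.
    public static int PointerDataCount(int pointerSize) =&gt; Descriptor + pointerSize;

    // pointer_data follows pointer_data_count (4 bytes) and pad0 (4 bytes).
    public static int PointerData(int pointerSize) =&gt; PointerDataCount(pointerSize) + 8;

    public static int TotalSize(int pointerSize) =&gt; PointerData(pointerSize) + pointerSize;
}

static class Program
{
    static void Main()
    {
        foreach (int pointerSize in new[] { 4, 8 })
        {
            Console.WriteLine(
                $&#34;{pointerSize * 8}-bit target: pointer_data_count at +{HeaderLayout.PointerDataCount(pointerSize)}, &#34; +
                $&#34;pointer_data at +{HeaderLayout.PointerData(pointerSize)}, &#34; +
                $&#34;header size {HeaderLayout.TotalSize(pointerSize)} bytes&#34;);
        }
    }
}
</code></pre>
<p>Under that alignment assumption, a 64-bit target has <code>pointer_data</code> at offset 32 of a 40-byte header, and a 32-bit target has it at offset 28 of a 32-byte header.</p>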
<p>In the reader, accessing and validating this header lives in <code>ContractDescriptorReader</code>. The magic field check figures out the endianness even though only little endian platforms are supported in .NET:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">Span</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;</span> <span class="n">magic</span> <span class="p">=</span> <span class="k">stackalloc</span> <span class="kt">byte</span><span class="p">[</span><span class="m">8</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="n">reader</span><span class="p">.</span><span class="n">ReadMemory</span><span class="p">(</span><span class="n">address</span><span class="p">,</span> <span class="n">magic</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">isLittleEndian</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">magic</span><span class="p">.</span><span class="n">SequenceEqual</span><span class="p">(</span><span class="n">MagicLittleEndian</span><span class="p">))</span>        <span class="c1">// 44 4E 43 43 44 41 43 00  (&#34;DNCCDAC\0&#34;)</span>
</span></span><span class="line"><span class="cl">    <span class="n">isLittleEndian</span> <span class="p">=</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">magic</span><span class="p">.</span><span class="n">SequenceEqual</span><span class="p">(</span><span class="n">MagicBigEndian</span><span class="p">))</span>      <span class="c1">// reversed</span>
</span></span><span class="line"><span class="cl">    <span class="n">isLittleEndian</span> <span class="p">=</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// flags bit1 selects pointer width, which fixes every offset below.</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="n">TryReadUInt32</span><span class="p">(</span><span class="n">reader</span><span class="p">,</span> <span class="n">address</span> <span class="p">+</span> <span class="m">8</span><span class="p">,</span> <span class="k">out</span> <span class="kt">uint</span> <span class="n">flags</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">is32Bit</span> <span class="p">=</span> <span class="p">(</span><span class="n">flags</span> <span class="p">&amp;</span> <span class="m">0x2</span><span class="p">)</span> <span class="p">!=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kt">int</span> <span class="n">pointerSize</span> <span class="p">=</span> <span class="n">is32Bit</span> <span class="p">?</span> <span class="m">4</span> <span class="p">:</span> <span class="m">8</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Once the JSON blob is read and the <code>pointer_data</code> array kept as a field, an indirect global written as <code>[index]</code> in the JSON is resolved against that array. This is why the parser distinguishes a direct value from an indirect one:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// Four JSON shapes are possible (see the runtime&#39;s GlobalDescriptorConverter):</span>
</span></span><span class="line"><span class="cl"><span class="c1">//   value              -&gt; direct value, no type name</span>
</span></span><span class="line"><span class="cl"><span class="c1">//   [index]            -&gt; indirect: pointer_data[index]</span>
</span></span><span class="line"><span class="cl"><span class="c1">//   [value, &#34;type&#34;]    -&gt; direct value, with a type name</span>
</span></span><span class="line"><span class="cl"><span class="c1">//   [[index], &#34;type&#34;]  -&gt; indirect with a type name: pointer_data[index]</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The takeaway: the runtime internals layout needed for diagnostics is reachable from one exported symbol and a chunk of JSON. No private binary required.</p>
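<p>To make the direct/indirect distinction concrete, here is a minimal sketch that classifies a global with <code>System.Text.Json</code>. It is not the tool&rsquo;s parser: <code>GlobalShapes</code>, the <code>Hypothetical</code> entry and the addresses in <code>pointerData</code> are made up for the illustration:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">#nullable enable
using System;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper: tells the JSON shapes of a global apart.
static class GlobalShapes
{
    // Numbers may be JSON numbers or strings such as &#34;0x4&#34;.
    static bool TryParseNumber(JsonElement e, out ulong value)
    {
        value = 0;
        if (e.ValueKind == JsonValueKind.Number) return e.TryGetUInt64(out value);
        if (e.ValueKind != JsonValueKind.String) return false;
        string s = e.GetString()!;
        return s.StartsWith(&#34;0x&#34;, StringComparison.OrdinalIgnoreCase)
            ? ulong.TryParse(s.AsSpan(2), NumberStyles.HexNumber, CultureInfo.InvariantCulture, out value)
            : ulong.TryParse(s, NumberStyles.None, CultureInfo.InvariantCulture, out value);
    }

    public static string Describe(JsonElement global, ulong[] pointerData)
    {
        JsonElement value = global;
        string? type = null;

        // Two-element array: [value, &#34;type&#34;] or [[index], &#34;type&#34;].
        if (global.ValueKind == JsonValueKind.Array &amp;&amp; global.GetArrayLength() == 2)
        {
            value = global[0];
            type = global[1].GetString();
        }
        string suffix = type is null ? &#34;&#34; : $&#34; ({type})&#34;;

        // One-element array: an index into pointer_data (indirect).
        if (value.ValueKind == JsonValueKind.Array &amp;&amp; value.GetArrayLength() == 1)
        {
            int index = value[0].GetInt32();
            if (index &lt; 0 || index &gt;= pointerData.Length) return &#34;invalid index&#34;;
            return $&#34;indirect 0x{pointerData[index]:x}{suffix}&#34;;
        }

        // Anything else is a direct value: a number or a string.
        return TryParseNumber(value, out ulong number)
            ? $&#34;direct {number}{suffix}&#34;
            : $&#34;direct \&#34;{value}\&#34;{suffix}&#34;;
    }
}

static class Program
{
    static void Main()
    {
        // Made-up addresses; index 0 always holds 0.
        ulong[] pointerData = { 0, 0x7ff6_1000_2000, 0x7ff6_1000_3000 };
        string json = @&#34;{ &#34;&#34;GCInfoVersion&#34;&#34;: &#34;&#34;0x4&#34;&#34;, &#34;&#34;AppDomain&#34;&#34;: [[2], &#34;&#34;pointer&#34;&#34;],
                         &#34;&#34;OperatingSystem&#34;&#34;: [&#34;&#34;Windows&#34;&#34;, &#34;&#34;string&#34;&#34;], &#34;&#34;Hypothetical&#34;&#34;: [1] }&#34;;

        using JsonDocument doc = JsonDocument.Parse(json);
        foreach (JsonProperty global in doc.RootElement.EnumerateObject())
            Console.WriteLine($&#34;{global.Name}: {GlobalShapes.Describe(global.Value, pointerData)}&#34;);
    }
}
</code></pre>
<p>The order of the checks matters: the optional type name is peeled off first, so that the remaining value is either a scalar (direct) or a one-element array (indirect).</p>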
<p>If you want to see it for yourself, the CLI can dump the parsed descriptor and exit:</p>
<pre tabindex="0"><code>Cdac.Cli --desc
</code></pre><p>This also writes the raw JSON next to the executable and prints a summary of how many contracts, types, and globals were parsed (the <code>DumpDescriptor</code> helper in <code>Program.cs</code>).
<img alt="Cdac.Cli &ndash;desc output: a summary of the parsed contracts, types, and globals" loading="lazy" src="/posts/2026-06-17_reading-clr-internals-the/descOutput.png"></p>
<h3 id="anatomy-of-the-json-descriptor">Anatomy of the JSON descriptor</h3>
<p>That JSON blob is the in-memory <strong>data descriptor</strong>, and its shape is specified in <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/data_descriptor.md">data_descriptor.md</a>. A representative one looks like this (trimmed from the spec):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-jsonc" data-lang="jsonc"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;version&#34;</span><span class="p">:</span> <span class="s2">&#34;1&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;baseline&#34;</span><span class="p">:</span> <span class="s2">&#34;empty&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;types&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;Thread&#34;</span><span class="p">:</span>      <span class="p">{</span> <span class="nt">&#34;Id&#34;</span><span class="p">:</span> <span class="mi">16</span><span class="p">,</span> <span class="nt">&#34;OSId&#34;</span><span class="p">:</span> <span class="mi">224</span><span class="p">,</span> <span class="nt">&#34;State&#34;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span><span class="err">...</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="err">...</span>
</span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;globals&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;GCInfoVersion&#34;</span><span class="p">:</span> <span class="s2">&#34;0x4&#34;</span><span class="p">,</span>                   <span class="c1">// number value
</span></span></span><span class="line"><span class="cl">    <span class="nt">&#34;AppDomain&#34;</span><span class="p">:</span> <span class="p">[[</span><span class="mi">2</span><span class="p">],</span> <span class="s2">&#34;pointer&#34;</span><span class="p">],</span>            <span class="c1">// indirect: pointer_data[2]
</span></span></span><span class="line"><span class="cl">    <span class="nt">&#34;OperatingSystem&#34;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&#34;Windows&#34;</span><span class="p">,</span> <span class="s2">&#34;string&#34;</span><span class="p">]</span>  <span class="c1">// string value
</span></span></span><span class="line"><span class="cl">  <span class="p">},</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;sub-descriptors&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;GC&#34;</span><span class="p">:</span> <span class="p">[[</span> <span class="mi">45</span><span class="p">],</span> <span class="s2">&#34;pointer&#34;</span><span class="p">]</span> <span class="p">},</span> <span class="c1">// indirect: pointer_data[45]
</span></span></span><span class="line"><span class="cl">  <span class="nt">&#34;contracts&#34;</span><span class="p">:</span> <span class="p">{</span> <span class="nt">&#34;AuxiliarySymbols&#34;</span><span class="p">:</span> <span class="s2">&#34;c1&#34;</span><span class="p">,</span> <span class="err">...,</span> <span class="nt">&#34;StressLog&#34;</span><span class="p">:</span><span class="s2">&#34;c2&#34;</span><span class="p">,</span> <span class="nt">&#34;SyncBlock&#34;</span><span class="p">:</span> <span class="s2">&#34;c1&#34;</span><span class="p">,</span> <span class="nt">&#34;Thread&#34;</span><span class="p">:</span> <span class="s2">&#34;c1&#34;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each top-level key has a precise job:</p>
<ul>
<li><strong><code>version</code></strong> - the physical descriptor format version (currently <code>1</code> in .NET 11). It is <em>not</em> a runtime version.</li>
<li><strong><code>baseline</code></strong> - an optional identifier of a well-known descriptor checked into the repo under <a href="https://github.com/dotnet/runtime/tree/main/docs/design/datacontracts/data"><code>docs/design/datacontracts/data/</code></a>, currently set to <code>empty</code>. In the early days of cDAC, the plan was to publish baseline contracts and store only the deltas from that baseline in the CLR binary. The delta compression was meant to save space in the runtime binary and in the dump files. In practice, it added complexity and it will probably not be used (thanks <a href="https://github.com/noahfalk">Noah Falk</a> for the historical point of view).</li>
<li><strong><code>types</code></strong> - the type/field layout. Each entry is a struct name with the offset of each of its fields. In addition, the special &ldquo;!&rdquo; entry provides the size of the structure, which is useful when dealing with arrays of such a structure. Note that in the .NET 9 implementation, it is possible to see typed fields in addition to the offset, such as <code>&quot;Thread&quot;: {..., &quot;GCHandle&quot;: [312, &quot;GCHandle&quot;], ...}</code>.</li>
<li><strong><code>globals</code></strong> - named values: integer constants, strings, or pointers stored as indexes into <code>pointer_data</code>.</li>
<li><strong><code>sub-descriptors</code></strong> - pointers to <em>other</em> descriptors that get merged in (more on the GC one later).</li>
<li><strong><code>contracts</code></strong> - the <code>contract -&gt; version</code> map. <code>&quot;GC&quot;: &quot;c1&quot;</code> says &ldquo;this runtime satisfies version <code>c1</code> of the GC contract&rdquo;.</li>
</ul>
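<p>As an illustration of the <code>types</code> section, here is a minimal sketch that turns one compact entry into field offsets plus the optional &ldquo;!&rdquo; size, accepting both the plain offset and the <code>[offset, &quot;type&quot;]</code> form. <code>TypeLayout</code> and <code>CompactTypes</code> are hypothetical names, and the size of 320 is invented for the example:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model of one entry of the &#34;types&#34; section.
sealed record TypeLayout(uint? Size, IReadOnlyDictionary&lt;string, int&gt; FieldOffsets);

static class CompactTypes
{
    public static TypeLayout Parse(JsonElement type)
    {
        uint? size = null;
        var offsets = new Dictionary&lt;string, int&gt;();
        foreach (JsonProperty field in type.EnumerateObject())
        {
            // A field is either an offset or [offset, &#34;typeName&#34;].
            JsonElement offset = field.Value.ValueKind == JsonValueKind.Array
                ? field.Value[0]
                : field.Value;

            if (field.Name == &#34;!&#34;)
                size = offset.GetUInt32();      // size of the whole structure
            else
                offsets[field.Name] = offset.GetInt32();
        }
        return new TypeLayout(size, offsets);
    }

    // The &#34;!&#34; size is the stride when the structure is stored in an array.
    public static ulong ElementAddress(ulong arrayBase, uint index, TypeLayout layout)
        =&gt; arrayBase + (ulong)index * (layout.Size
            ?? throw new InvalidOperationException(&#34;size (\&#34;!\&#34;) not published&#34;));
}

static class Program
{
    static void Main()
    {
        // The size 320 is made up for this example.
        string json = @&#34;{ &#34;&#34;Id&#34;&#34;: 16, &#34;&#34;OSId&#34;&#34;: 224, &#34;&#34;State&#34;&#34;: 0, &#34;&#34;GCHandle&#34;&#34;: [312, &#34;&#34;GCHandle&#34;&#34;], &#34;&#34;!&#34;&#34;: 320 }&#34;;
        using JsonDocument doc = JsonDocument.Parse(json);
        TypeLayout thread = CompactTypes.Parse(doc.RootElement);

        Console.WriteLine($&#34;OSId at +{thread.FieldOffsets[&#34;OSId&#34;]}, GCHandle at +{thread.FieldOffsets[&#34;GCHandle&#34;]}&#34;);
        Console.WriteLine($&#34;element 2 of an array at 0x1000 starts at 0x{CompactTypes.ElementAddress(0x1000, 2, thread):x}&#34;);
    }
}
</code></pre>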
<p>There are two physical encodings defined in <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/data_descriptor.md">data_descriptor.md</a>: a verbose <strong>regular</strong> format (arrays of <code>{ &quot;name&quot;, &quot;type&quot;, &quot;offset&quot; }</code> dictionaries) used for the hand-written baselines in the repo, and the <strong>compact</strong> format above used for the in-memory blob. The tool&rsquo;s <code>ContractDescriptorParser</code> handles both, which is why you will see two code paths (<code>ParseCompactType</code> / <code>ParseRegularType</code>) even though the regular format should never be seen for the in-memory case.</p>
<h3 id="where-the-algorithms-actually-live">Where the algorithms actually live</h3>
<p>Here is a point that confused me at first: <strong>the descriptor does not contain any algorithms.</strong> It only publishes <em>data</em> (layouts + globals) and a <em>list of contract versions the runtime claims to satisfy</em>. The algorithms themselves - &ldquo;how do I walk the heap segments?&rdquo;, &ldquo;how do I enumerate modules?&rdquo; - live in two places, both in the <code>dotnet/runtime</code> repository, and neither is published by the target at runtime:</p>
<ul>
<li>The <strong>specifications</strong> are prose + C#-like pseudocode in <code>docs/design/datacontracts/&lt;contract&gt;.md</code> (one file per contract, every version in the same file).</li>
<li>The <strong>reference implementation</strong> in real C# under <code>src/native/managed/cdac</code>.</li>
</ul>
<p>So a reader needs to match the <code>contracts</code> list from the target against algorithms it knows. The descriptor says <em>&ldquo;I am a <code>GC</code> version <code>c1</code>&rdquo;</em>; the reader uses the <code>c1</code> algorithm and points it at the in-memory layout the descriptor provides via the <code>pointer_data</code>.</p>
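<p>That matching step boils down to a lookup keyed by contract name and version. Here is a minimal sketch of the idea; <code>IGCContract</code>, <code>GC_c1</code> and <code>ContractRegistry</code> are hypothetical names chosen for this example, not the types used by the runtime&rsquo;s reader:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical names throughout: they only illustrate the dispatch.
interface IGCContract { string Name { get; } }
sealed class GC_c1 : IGCContract { public string Name =&gt; &#34;GC c1&#34;; }

static class ContractRegistry
{
    // Implementations this reader knows, keyed by (contract, version).
    static readonly Dictionary&lt;(string Contract, string Version), Func&lt;object&gt;&gt; s_known = new()
    {
        [(&#34;GC&#34;, &#34;c1&#34;)] = () =&gt; new GC_c1(),
    };

    public static T? TryGet&lt;T&gt;(IReadOnlyDictionary&lt;string, string&gt; advertised, string contract)
        where T : class
    {
        // 1. Does the target advertise the contract at all?
        if (!advertised.TryGetValue(contract, out string? version)) return null;
        // 2. Does the reader implement that exact version?
        if (!s_known.TryGetValue((contract, version), out Func&lt;object&gt;? factory)) return null;
        return factory() as T;
    }
}

static class Program
{
    static void Main()
    {
        // The &#34;contracts&#34; section of a descriptor, reduced to two entries.
        var advertised = new Dictionary&lt;string, string&gt; { [&#34;GC&#34;] = &#34;c1&#34;, [&#34;Thread&#34;] = &#34;c1&#34; };

        Console.WriteLine(ContractRegistry.TryGet&lt;IGCContract&gt;(advertised, &#34;GC&#34;)?.Name ?? &#34;GC: unsupported&#34;);
        Console.WriteLine(ContractRegistry.TryGet&lt;IGCContract&gt;(advertised, &#34;Loader&#34;)?.Name ?? &#34;Loader: not advertised&#34;);
    }
}
</code></pre>
<p>A contract that the target advertises in a version the reader does not know is treated like a missing one: the reader cannot guess the algorithm, so it has to report the feature as unsupported.</p>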
<p>Each algorithm is built from the same small set of primitives. The design doc calls it the <code>Target</code> API, and it is deliberately minimal: read a primitive type at an address, read a pointer, look up a global, and look up a type&rsquo;s field offset/size. That is the entire vocabulary. In the tool, the <code>Target</code> type exposes exactly these primitives (<code>Read&lt;T&gt;</code>, <code>ReadPointer</code>, <code>ReadGlobalPointer</code>, <code>GetFieldOffset</code>, &hellip;), and every contract is &ldquo;just&rdquo; composing them. Nothing more is needed, even to express how to walk the GC internal structures.</p>
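<p>To show how little is needed, here is a toy version of those primitives over a <code>byte[]</code> snapshot. It is a sketch under stated assumptions: <code>MiniTarget</code>, the <code>Node</code> type and the <code>Head</code> global are invented for the example, only little endian is handled, and the tool&rsquo;s real <code>Target</code> reads from a process or a dump instead:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Buffers.Binary;
using System.Collections.Generic;

// Hypothetical miniature of the Target idea over a byte[] snapshot
// (little endian only); the tool&#39;s real Target type is richer.
sealed class MiniTarget
{
    readonly byte[] _memory;
    readonly ulong _baseAddress;
    readonly int _pointerSize;
    readonly Dictionary&lt;string, ulong&gt; _globals = new();
    readonly Dictionary&lt;(string Type, string Field), int&gt; _offsets = new();

    public MiniTarget(byte[] memory, ulong baseAddress, int pointerSize)
        =&gt; (_memory, _baseAddress, _pointerSize) = (memory, baseAddress, pointerSize);

    public void AddGlobal(string name, ulong value) =&gt; _globals[name] = value;
    public void AddField(string type, string field, int offset) =&gt; _offsets[(type, field)] = offset;

    // Maps a target address to a slice of the snapshot.
    ReadOnlySpan&lt;byte&gt; Slice(ulong address, int length)
        =&gt; _memory.AsSpan(checked((int)(address - _baseAddress)), length);

    public uint ReadUInt32(ulong address) =&gt; BinaryPrimitives.ReadUInt32LittleEndian(Slice(address, 4));
    public ulong ReadPointer(ulong address) =&gt; _pointerSize == 8
        ? BinaryPrimitives.ReadUInt64LittleEndian(Slice(address, 8))
        : BinaryPrimitives.ReadUInt32LittleEndian(Slice(address, 4));
    public ulong ReadGlobalPointer(string name) =&gt; _globals[name];
    public int GetFieldOffset(string type, string field) =&gt; _offsets[(type, field)];
    public ulong ReadFieldPointer(ulong address, string type, string field)
        =&gt; ReadPointer(address + (ulong)GetFieldOffset(type, field));
}

static class Program
{
    static void Main()
    {
        // Two fake 16-byte nodes at 0x1000 and 0x1010: { uint Value; pad; Node* Next; }
        byte[] memory = new byte[32];
        BinaryPrimitives.WriteUInt32LittleEndian(memory.AsSpan(0), 11);
        BinaryPrimitives.WriteUInt64LittleEndian(memory.AsSpan(8), 0x1010);
        BinaryPrimitives.WriteUInt32LittleEndian(memory.AsSpan(16), 22);
        // The second node&#39;s Next stays 0, which ends the list.

        var target = new MiniTarget(memory, baseAddress: 0x1000, pointerSize: 8);
        target.AddGlobal(&#34;Head&#34;, 0x1000);
        target.AddField(&#34;Node&#34;, &#34;Value&#34;, 0);
        target.AddField(&#34;Node&#34;, &#34;Next&#34;, 8);

        // A &#34;contract&#34; is then nothing more than a composition of the primitives.
        for (ulong node = target.ReadGlobalPointer(&#34;Head&#34;); node != 0;
             node = target.ReadFieldPointer(node, &#34;Node&#34;, &#34;Next&#34;))
        {
            uint value = target.ReadUInt32(node + (ulong)target.GetFieldOffset(&#34;Node&#34;, &#34;Value&#34;));
            Console.WriteLine($&#34;0x{node:x}: {value}&#34;);
        }
    }
}
</code></pre>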
<h3 id="the-tricky-bits-arrays-and-linked-lists">The tricky bits: arrays and linked lists</h3>
<p>Most contract algorithms end up walking two kinds of containers.</p>
<p><strong>Linked lists</strong> are the easy ones: read a head pointer from a global or a field, then follow a <code>Next</code> field until it is null. The loader-heap walk is a good example - <code>LoaderHeap.FirstBlock</code>, then follow <code>LoaderHeapBlock.Next</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">ulong</span> <span class="n">block</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">loaderHeap</span><span class="p">.</span><span class="n">Value</span><span class="p">,</span> <span class="s">&#34;LoaderHeap&#34;</span><span class="p">,</span> <span class="s">&#34;FirstBlock&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">while</span> <span class="p">(</span><span class="n">block</span> <span class="p">!=</span> <span class="m">0</span> <span class="p">&amp;&amp;</span> <span class="n">guard</span><span class="p">++</span> <span class="p">&lt;</span> <span class="n">MaxListIterations</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">address</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">,</span> <span class="s">&#34;VirtualAddress&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">size</span>    <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">,</span> <span class="s">&#34;VirtualSize&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">next</span>    <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">,</span> <span class="s">&#34;Next&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ... yield (address, size) ...</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">next</span> <span class="p">==</span> <span class="n">block</span><span class="p">)</span> <span class="k">break</span><span class="p">;</span>   <span class="c1">// self-referential = corrupt; stop</span>
</span></span><span class="line"><span class="cl">    <span class="n">block</span> <span class="p">=</span> <span class="n">next</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>(Note the <code>guard</code> counter and the self-reference check: on a corrupt dump a &ldquo;linked list&rdquo; can loop forever.)</p>
<p><strong>Arrays / array-lists</strong> are more interesting because you cannot walk them without the <em>sizes and offsets</em> published by the descriptor. The CLR&rsquo;s <code>ArrayListBase</code> (used, for example, for the <code>AppDomain</code>&rsquo;s list of assemblies) is a linked list of <em>blocks</em>, where each block holds an inline array of pointers right after its header. To index into that inline array you need to know the pointer size (for the stride) and the field offset of the array start:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">uint</span> <span class="n">count</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadField</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">arrayListBase</span><span class="p">,</span> <span class="s">&#34;ArrayListBase&#34;</span><span class="p">,</span> <span class="s">&#34;Count&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kt">ulong</span> <span class="n">block</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">FieldAddress</span><span class="p">(</span><span class="n">arrayListBase</span><span class="p">,</span> <span class="s">&#34;ArrayListBase&#34;</span><span class="p">,</span> <span class="s">&#34;FirstBlock&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
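</span></span><span class="line"><span class="cl"><span class="kt">uint</span> <span class="n">found</span> <span class="p">=</span> <span class="m">0</span><span class="p">,</span> <span class="n">guard</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>   <span class="c1">// guard caps the walk if the list is corrupted</span>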
</span></span><span class="line"><span class="cl"><span class="k">while</span> <span class="p">(</span><span class="n">block</span> <span class="p">!=</span> <span class="m">0</span> <span class="p">&amp;&amp;</span> <span class="n">found</span> <span class="p">&lt;</span> <span class="n">count</span> <span class="p">&amp;&amp;</span> <span class="n">guard</span><span class="p">++</span> <span class="p">&lt;</span> <span class="n">MaxListIterations</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint</span>  <span class="n">size</span>       <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadField</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;ArrayListBlock&#34;</span><span class="p">,</span> <span class="s">&#34;Size&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">arrayStart</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">FieldAddress</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;ArrayListBlock&#34;</span><span class="p">,</span> <span class="s">&#34;ArrayStart&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">uint</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">size</span> <span class="p">&amp;&amp;</span> <span class="n">found</span> <span class="p">&lt;</span> <span class="n">count</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">ulong</span> <span class="n">element</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadPointer</span><span class="p">(</span><span class="n">arrayStart</span> <span class="p">+</span> <span class="n">i</span> <span class="p">*</span> <span class="p">(</span><span class="kt">ulong</span><span class="p">)</span><span class="n">_target</span><span class="p">.</span><span class="n">PointerSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">found</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">element</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">yield</span> <span class="k">return</span> <span class="n">element</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">block</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;ArrayListBlock&#34;</span><span class="p">,</span> <span class="s">&#34;Next&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that <code>ArrayStart</code> is an <em>inline</em> array, so I take its <strong>field address</strong> (<code>FieldAddress</code>) rather than dereferencing a pointer - the elements live right there inside the block. Then, to step from one element to the next (each one is a pointer), I use the <code>PointerSize</code> given by the descriptor as the stride, so the same code is correct for both 32-bit and 64-bit targets. This is precisely the &ldquo;types with a determinate size may be used for pointer arithmetic&rdquo; rule from the spec, made concrete.</p>
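<p>To see the stride rule in isolation, here is a minimal, self-contained sketch that is not part of the tool. The <code>InlineArrayDemo</code> class and its <code>ReadPointerAt</code> helper are hypothetical names, a plain byte array stands in for target memory, and a little-endian target is assumed:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Buffers.Binary;

static class InlineArrayDemo
{
    // Hypothetical helper: reads element `index` of an inline array of pointers
    // that starts at `arrayStart` inside `memory` (little-endian target assumed).
    public static ulong ReadPointerAt(ReadOnlySpan&lt;byte&gt; memory, int arrayStart, int index, int pointerSize)
    {
        // The stride is the pointer size published by the descriptor.
        ReadOnlySpan&lt;byte&gt; slot = memory.Slice(arrayStart + index * pointerSize, pointerSize);
        return pointerSize == 8
            ? BinaryPrimitives.ReadUInt64LittleEndian(slot)
            : BinaryPrimitives.ReadUInt32LittleEndian(slot);
    }

    public static void Main()
    {
        // The same three &#34;pointers&#34; laid out for a 32-bit and for a 64-bit target.
        byte[] mem32 = new byte[3 * 4];
        byte[] mem64 = new byte[3 * 8];
        for (int i = 0; i &lt; 3; i++)
        {
            BinaryPrimitives.WriteUInt32LittleEndian(mem32.AsSpan(i * 4), (uint)(0x1000 + i));
            BinaryPrimitives.WriteUInt64LittleEndian(mem64.AsSpan(i * 8), (ulong)(0x1000 + i));
        }

        // Same call, different stride: both lines print 0x1002.
        Console.WriteLine($&#34;0x{ReadPointerAt(mem32, 0, 2, 4):X}&#34;);
        Console.WriteLine($&#34;0x{ReadPointerAt(mem64, 0, 2, 8):X}&#34;);
    }
}
</code></pre>
<p>Hard-coding a stride of 8 would read garbage from the 32-bit layout, which is exactly the mistake the descriptor-provided size avoids.</p>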
<h2 id="one-reader-three-targets-local-remote-and-dump">One reader, three targets: local, remote, and dump</h2>
<p>Here is where the design really pays off compared to the native DAC. An algorithmic contract never needs more than <em>&ldquo;read N bytes at address X&rdquo;</em>. The tool captures exactly that with a tiny abstraction: one read method, plus the pointer size of the target:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">interface</span> <span class="nc">IMemoryReader</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">PointerSize</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">ReadMemory</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">,</span> <span class="n">Span</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;</span> <span class="n">buffer</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Three implementations cover the three supported scenarios:</p>
<ul>
<li><strong>Self</strong> - <code>SelfMemoryReader</code> reads directly from the current process.</li>
<li><strong>Remote</strong> - <code>LiveProcessMemoryReader</code> reads from another live process (via <code>ReadProcessMemory</code>).</li>
<li><strong>Dump</strong> - <code>MinidumpMemoryReader</code> reads from a <code>.dmp</code> memory dump.</li>
</ul>
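<p>Because the abstraction is this small, adding a fourth source is cheap. As an illustration only (this class is hypothetical and not part of the tool), here is a sketch of a reader backed by a byte array, which can be handy for unit-testing contract code without any process or dump. The interface is repeated so that the sketch compiles on its own:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Same shape as the interface shown above.
public interface IMemoryReader
{
    int PointerSize { get; }
    bool ReadMemory(ulong address, Span&lt;byte&gt; buffer);
}

// Hypothetical reader: serves reads from a byte array that pretends
// to be mapped at `baseAddress` in the target.
public sealed class BufferMemoryReader : IMemoryReader
{
    private readonly ulong _baseAddress;
    private readonly byte[] _bytes;

    public BufferMemoryReader(ulong baseAddress, byte[] bytes, int pointerSize)
    {
        _baseAddress = baseAddress;
        _bytes = bytes;
        PointerSize = pointerSize;
    }

    public int PointerSize { get; }

    public bool ReadMemory(ulong address, Span&lt;byte&gt; buffer)
    {
        // Reject any range that is not entirely inside the mapped bytes.
        if (address &lt; _baseAddress) return false;
        ulong offset = address - _baseAddress;
        if (offset &gt; (ulong)_bytes.Length) return false;
        if ((ulong)buffer.Length &gt; (ulong)_bytes.Length - offset) return false;

        _bytes.AsSpan((int)offset, buffer.Length).CopyTo(buffer);
        return true;
    }
}
</code></pre>
<p>Like the real readers, it reports a bad read by returning <code>false</code> instead of throwing, so the code built on top of it does not need to know which source it is talking to.</p>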
<p>The self reader has one subtlety worth mentioning: walking arbitrary CLR structures (loader-heap block lists, for instance) can follow a pointer into an unmapped or guard page. On modern .NET, a raw copy from such an address raises an access violation, which is a <em>non-catchable corrupted-state exception</em> that kills the process. To honor the &ldquo;return false on a bad read&rdquo; contract, the self reader probes the range with <code>VirtualQuery</code> before copying:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">bool</span> <span class="n">ReadMemory</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">,</span> <span class="n">Span</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;</span> <span class="n">buffer</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">address</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span> <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">buffer</span><span class="p">.</span><span class="n">IsEmpty</span><span class="p">)</span> <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">IsRangeReadable</span><span class="p">(</span><span class="n">address</span><span class="p">,</span> <span class="p">(</span><span class="n">nuint</span><span class="p">)</span><span class="n">buffer</span><span class="p">.</span><span class="n">Length</span><span class="p">))</span>   <span class="c1">// VirtualQuery: committed + readable?</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">source</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ReadOnlySpan</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;((</span><span class="k">void</span><span class="p">*)(</span><span class="n">nint</span><span class="p">)</span><span class="n">address</span><span class="p">,</span> <span class="n">buffer</span><span class="p">.</span><span class="n">Length</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">CopyTo</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>What really differs between the three targets is how to <em>find</em> the descriptor, not how to read memory. <code>DescriptorLocator</code> uses the OS loader for self (a simple <code>NativeLibrary.TryGetExport</code> against <code>coreclr</code>), but for a remote process or a dump it has to parse the CoreCLR module&rsquo;s PE export table directly from target memory (<code>PeImage.FindExport</code>).</p>
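<p>For the self case, the lookup can be sketched in a few lines. This is not the tool&rsquo;s <code>DescriptorLocator</code>: the <code>SelfDescriptorProbe</code> class is a hypothetical name, and the export name <code>DotNetRuntimeContractDescriptor</code> is what I understand the contract descriptor spec to require, so treat it as an assumption to verify against the runtime you target:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class SelfDescriptorProbe
{
    // Assumption: export name taken from the contract descriptor spec.
    private const string DescriptorExport = &#34;DotNetRuntimeContractDescriptor&#34;;

    // Returns the descriptor address in the current process, or 0 when it
    // cannot be found (older runtime, no separate coreclr module, ...).
    public static ulong TryLocate()
    {
        foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
        {
            string name = module.ModuleName ?? string.Empty;
            if (!name.Contains(&#34;coreclr&#34;, StringComparison.OrdinalIgnoreCase))
                continue;

            // The module is already loaded, so TryLoad only returns its handle.
            if (module.FileName is string path &amp;&amp;
                NativeLibrary.TryLoad(path, out IntPtr handle) &amp;&amp;
                NativeLibrary.TryGetExport(handle, DescriptorExport, out IntPtr address))
            {
                return unchecked((ulong)address.ToInt64());
            }
        }
        return 0;
    }
}
</code></pre>
<p>A return value of 0 simply means &ldquo;no descriptor here&rdquo;, which the caller has to handle in the same way as a remote target whose export table has no such entry.</p>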
<p>After that, the open path is identical regardless of the source:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">reader</span> <span class="p">=</span> <span class="cm">/* SelfMemoryReader | LiveProcessMemoryReader | MinidumpMemoryReader */</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">descriptorAddress</span> <span class="p">=</span> <span class="cm">/* DescriptorLocator.Locate() | LocateIn(...) */</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">Target</span> <span class="n">target</span> <span class="p">=</span> <span class="n">Target</span><span class="p">.</span><span class="n">Create</span><span class="p">(</span><span class="n">reader</span><span class="p">,</span> <span class="n">descriptorAddress</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>With the DAC, you needed a matching native binary per runtime build and per platform. Here, the <em>same managed code</em> works for the current process, another live process, and a dump.</p>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">flowchart TD
  self[&#34;SelfMemoryReader&#34;] --&gt; ir[&#34;IMemoryReader&#34;]
  proc[&#34;LiveProcessMemoryReader&#34;] --&gt; ir
  dump[&#34;MinidumpMemoryReader&#34;] --&gt; ir
  ir --&gt; tgt[&#34;Target (globals + type layouts + reads)&#34;]
  tgt --&gt; c[&#34;Contracts: GC / Loader / ExecutionManager / RuntimeInfo&#34;]
</code></pre><h2 id="under-the-hood-the-logical-descriptor-and-the-gc-sub-descriptor">Under the hood: the logical descriptor and the GC sub-descriptor</h2>
<p>Every contract requires a <code>Target</code> instance to operate. Its API is deliberately small: primitive reads, plus accessors for globals and type/field offsets:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">T</span> <span class="n">Read</span><span class="p">&lt;</span><span class="n">T</span><span class="p">&gt;(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">)</span> <span class="k">where</span> <span class="n">T</span> <span class="p">:</span> <span class="n">unmanaged</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">ulong</span> <span class="n">ReadPointer</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">bool</span> <span class="n">HasGlobal</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">ulong</span> <span class="n">ReadGlobalPointer</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">bool</span> <span class="n">HasType</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">bool</span> <span class="n">HasField</span><span class="p">(</span><span class="kt">string</span> <span class="n">typeName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">fieldName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">int</span>  <span class="n">GetFieldOffset</span><span class="p">(</span><span class="kt">string</span> <span class="n">typeName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">fieldName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// convenience over the above</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">ulong</span> <span class="n">FieldAddress</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">baseAddress</span><span class="p">,</span> <span class="kt">string</span> <span class="n">typeName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">fieldName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">T</span>     <span class="n">ReadField</span><span class="p">&lt;</span><span class="n">T</span><span class="p">&gt;(</span><span class="kt">ulong</span> <span class="n">baseAddress</span><span class="p">,</span> <span class="kt">string</span> <span class="n">typeName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">fieldName</span><span class="p">)</span> <span class="k">where</span> <span class="n">T</span> <span class="p">:</span> <span class="n">unmanaged</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">ulong</span> <span class="n">ReadFieldPointer</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">baseAddress</span><span class="p">,</span> <span class="kt">string</span> <span class="n">typeName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">fieldName</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Those <code>Has*</code> guards are not decoration - they are <em>the</em> mechanism that lets one reader survive layout drift across builds. A contract asks &ldquo;does this build expose that field?&rdquo; before reading it, and degrades gracefully when the answer is no.</p>
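<p>The guard pattern itself is tiny. The sketch below uses a hypothetical <code>LayoutTable</code> class as a stand-in for the layout half of <code>Target</code>, and the type and field names a caller would pass are up to the contract being ported:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical stand-in for the type/field part of Target,
// just enough to show the guard pattern.
sealed class LayoutTable
{
    private readonly Dictionary&lt;(string Type, string Field), int&gt; _offsets = new();

    public void Add(string type, string field, int offset) =&gt; _offsets[(type, field)] = offset;
    public bool HasField(string type, string field) =&gt; _offsets.ContainsKey((type, field));
    public int GetFieldOffset(string type, string field) =&gt; _offsets[(type, field)];
}

static class GuardDemo
{
    // Returns the offset when this build publishes the field,
    // or null so that the caller can skip it instead of failing.
    public static int? TryGetOffset(LayoutTable layout, string type, string field)
        =&gt; layout.HasField(type, field) ? (int?)layout.GetFieldOffset(type, field) : null;
}
</code></pre>
<p>The caller then decides what &ldquo;degrades gracefully&rdquo; means for a missing field: omit a column, print a placeholder, or fall back to another field.</p>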
<p>Now the non-obvious part. As I already mentioned, the <code>subDescriptors</code> section of a descriptor can contain a list of sub-descriptors. As of today (in .NET 11 Preview 5), only the GC contract ends up in this section:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="s2">&#34;subDescriptors&#34;</span><span class="err">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&#34;GC&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">      <span class="p">[</span>
</span></span><span class="line"><span class="cl">         <span class="mi">45</span>
</span></span><span class="line"><span class="cl">      <span class="p">],</span>
</span></span><span class="line"><span class="cl">      <span class="s2">&#34;pointer&#34;</span>
</span></span><span class="line"><span class="cl">   <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span><span class="err">,</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The corresponding in-memory descriptor is located through the address stored in slot 45 (the index given in the JSON) of the root descriptor&rsquo;s <code>pointer_data</code> array. The globals of a parsed sub-descriptor are then resolved against that sub-descriptor&rsquo;s own <code>pointer_data</code> array to compute their values:</p>
<p><img alt="A GC sub-descriptor and how its globals are resolved against its own pointer_data array" loading="lazy" src="/posts/2026-06-17_reading-clr-internals-the/GCSubdescriptorGlobals.png"></p>
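<p>In code, the resolution step shown in the picture can be sketched as follows. This is a simplified illustration, not the tool&rsquo;s parser: <code>GlobalResolver</code> is a hypothetical name, and the accepted shapes (<code>VALUE</code>, <code>[INDEX]</code>, <code>[VALUE, "type"]</code>, <code>[[INDEX], "type"]</code>) reflect my reading of the descriptor format:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Text.Json;

static class GlobalResolver
{
    // Resolves one global entry to its numeric value, following an [INDEX]
    // indirection through the pointer_data array of the same descriptor.
    public static ulong Resolve(JsonElement entry, ReadOnlySpan&lt;ulong&gt; pointerData)
    {
        if (entry.ValueKind == JsonValueKind.Array)
        {
            // [value, &#34;type&#34;]: drop the type name and resolve the value part.
            if (entry.GetArrayLength() == 2 &amp;&amp; entry[1].ValueKind == JsonValueKind.String)
                return Resolve(entry[0], pointerData);

            // [index]: the value lives in pointer_data at that index.
            if (entry.GetArrayLength() == 1)
                return pointerData[entry[0].GetInt32()];

            throw new FormatException(&#34;Unexpected global shape.&#34;);
        }

        // Direct value: a JSON number, or a string holding a (possibly hex) number.
        if (entry.ValueKind == JsonValueKind.Number)
            return entry.GetUInt64();

        string text = entry.GetString() ?? throw new FormatException(&#34;Expected a number.&#34;);
        return text.StartsWith(&#34;0x&#34;, StringComparison.OrdinalIgnoreCase)
            ? Convert.ToUInt64(text, 16)
            : ulong.Parse(text);
    }
}
</code></pre>
<p>The important detail is the second parameter: a sub-descriptor&rsquo;s indices only make sense against its own <code>pointer_data</code>, never against the root&rsquo;s.</p>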
<p>Once resolved in the sub-descriptor, the corresponding <code>GlobalValue</code> instances are simply added to the root&rsquo;s <code>Globals</code> dictionary. This is part of the recursive process done by the <code>LogicalDescriptor</code> class. It merges with cycle protection, so a sub-descriptor that points back to an already-merged one is skipped:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">bool</span> <span class="n">Merge</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">isRoot</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">address</span> <span class="p">==</span> <span class="m">0</span> <span class="p">||</span> <span class="n">_visited</span><span class="p">.</span><span class="n">Contains</span><span class="p">(</span><span class="n">address</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ... read + parse this descriptor ...</span>
</span></span><span class="line"><span class="cl">    <span class="n">_visited</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">address</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="n">TypeInfo</span> <span class="n">type</span> <span class="k">in</span> <span class="n">parsed</span><span class="p">.</span><span class="n">Types</span><span class="p">.</span><span class="n">Values</span><span class="p">)</span>        <span class="n">MergeType</span><span class="p">(</span><span class="n">type</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">global</span> <span class="k">in</span> <span class="n">parsed</span><span class="p">.</span><span class="n">Globals</span><span class="p">)</span>                 <span class="n">Globals</span><span class="p">[</span><span class="n">global</span><span class="p">.</span><span class="n">Key</span><span class="p">]</span> <span class="p">=</span> <span class="n">global</span><span class="p">.</span><span class="n">Value</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">contract</span> <span class="k">in</span> <span class="n">parsed</span><span class="p">.</span><span class="n">Contracts</span><span class="p">)</span>             <span class="n">Contracts</span><span class="p">[</span><span class="n">contract</span><span class="p">.</span><span class="n">Key</span><span class="p">]</span> <span class="p">=</span> <span class="n">contract</span><span class="p">.</span><span class="n">Value</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">_pendingSlots</span><span class="p">.</span><span class="n">AddRange</span><span class="p">(</span><span class="n">parsed</span><span class="p">.</span><span class="n">SubDescriptorSlots</span><span class="p">);</span>     <span class="c1">// resolve these later</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>GC is the perfect example</strong> of why this matters. The GC publishes its layout in a <em>separate</em> sub-descriptor, and the pointer slot for it can be <strong>null until the GC has initialized</strong>. So, even if it is a very unlikely situation, a freshly-attached live target may not have the GC contract &ldquo;ready&rdquo; yet. <code>LogicalDescriptor</code> tracks those still-null slots as &ldquo;pending&rdquo; and exposes a <code>Refresh()</code> that re-scans them:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// On a live target a sub-descriptor slot (e.g. the GC&#39;s) can transition</span>
</span></span><span class="line"><span class="cl"><span class="c1">// null -&gt; real-address after we first read the descriptor, so re-scan pending slots.</span>
</span></span><span class="line"><span class="cl"><span class="c1">// Dumps are a fixed snapshot, so Refresh() is a no-op there.</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">target</span><span class="p">.</span><span class="n">PendingSubDescriptorCount</span> <span class="p">&gt;</span> <span class="m">0</span> <span class="p">&amp;&amp;</span> <span class="n">isLiveTarget</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">target</span><span class="p">.</span><span class="n">Refresh</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This is also why the tool can print a friendly &ldquo;this target does not expose the GC contract&rdquo; message instead of crashing: the contract simply is not there yet. In a memory dump, what you captured is what you get; on a live process, a <code>Refresh()</code> after the GC warms up makes the GC types and globals available.</p>
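<p>If a tool wants to wait for the GC contract rather than give up, a bounded retry loop is enough. The helper below is a hypothetical sketch (the <code>RefreshPolicy</code> name, the attempt count and the delay are my own choices); it takes delegates so that it does not depend on the exact shape of <code>Target</code>:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Threading;

static class RefreshPolicy
{
    // Calls `refresh` until `pendingCount` reports 0 or the attempts run out.
    // Returns true when nothing is pending anymore.
    public static bool WaitUntilResolved(Func&lt;int&gt; pendingCount, Action refresh, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 0; pendingCount() &gt; 0 &amp;&amp; attempt &lt; maxAttempts; attempt++)
        {
            if (attempt &gt; 0)
                Thread.Sleep(delay);   // give the runtime time to finish initializing
            refresh();
        }
        return pendingCount() == 0;
    }
}
</code></pre>
<p>With the members shown above, a call could look like <code>RefreshPolicy.WaitUntilResolved(() =&gt; target.PendingSubDescriptorCount, target.Refresh, 10, TimeSpan.FromMilliseconds(100))</code>, used only for live targets since a dump will never change.</p>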
<p>For completeness, parsing turns the JSON into a <code>ParsedDescriptor</code> (types, globals, contracts, and the list of sub-descriptor slots), and <code>LogicalDescriptor</code> merges those into the final dictionaries the <code>Target</code> will read from.</p>
<h2 id="mapping-the-github-md-contracts-to-c">Mapping the GitHub <code>.md</code> contracts to C#</h2>
<p>This is the part I find most empowering: once you understand the descriptor, you can implement <em>any</em> contract yourself by reading its specification. Here is where everything lives in <a href="https://github.com/dotnet/runtime">dotnet/runtime</a>:</p>
<ul>
<li><strong>The specs</strong> (your source of truth) are in <code>docs/design/datacontracts/*.md</code> - for example <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/GC.md">GC.md</a>, <code>Loader.md</code>, and <code>ExecutionManager.md</code>. Each one lists the globals and data structures the algorithm relies on, then describes the algorithm in prose.</li>
<li><strong>Microsoft&rsquo;s own managed reader</strong> is under <code>src/native/managed/cdac</code> (the <code>Microsoft.Diagnostics.DataContractReader.Contracts</code> project). I use it only to confirm exact field and enum names.</li>
</ul>
<p>My porting recipe is the following:</p>
<ol>
<li>The contract&rsquo;s &ldquo;globals / data structures&rdquo; tables become <code>HasGlobal</code>/<code>ReadGlobalPointer</code> and <code>GetTypeInfo</code>/<code>GetFieldOffset</code> calls.</li>
<li>Each documented algorithm method becomes one C# method on a contract class that takes a <code>Target</code> in its constructor.</li>
<li>Every optional global or field is wrapped in a <code>Has*</code> guard so the port degrades nicely across versions.</li>
</ol>
<p>Let&rsquo;s take the GC contract as an example. The spec says the heap count is 1 for workstation and comes from a <code>NumHeaps</code> global for server; that maps almost literally:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">uint</span> <span class="n">GetGCHeapCount</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">string</span><span class="p">[]</span> <span class="n">ids</span> <span class="p">=</span> <span class="n">GetGCIdentifiers</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">ids</span><span class="p">.</span><span class="n">Contains</span><span class="p">(</span><span class="s">&#34;workstation&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="m">1</span><span class="p">;</span>                                                  <span class="c1">// WRK_HEAP_COUNT</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">ids</span><span class="p">.</span><span class="n">Contains</span><span class="p">(</span><span class="s">&#34;server&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="p">(</span><span class="kt">uint</span><span class="p">)</span><span class="n">_target</span><span class="p">.</span><span class="n">Read</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">&gt;(</span><span class="n">_target</span><span class="p">.</span><span class="n">ReadGlobalPointer</span><span class="p">(</span><span class="s">&#34;NumHeaps&#34;</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="k">throw</span> <span class="k">new</span> <span class="n">NotSupportedException</span><span class="p">(</span><span class="s">&#34;Unknown GC heap type.&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since the <code>Heaps</code> table contains a pointer to each heap, enumerating the server heaps boils down to &ldquo;read the address of the <code>Heaps</code> table, then step through it one pointer at a time&rdquo;:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IReadOnlyList</span><span class="p">&lt;</span><span class="n">TargetPointer</span><span class="p">&gt;</span> <span class="n">GetGCHeaps</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">IsServer</span><span class="p">)</span> <span class="k">return</span> <span class="p">[];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">uint</span> <span class="n">count</span> <span class="p">=</span> <span class="n">GetGCHeapCount</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">heapTable</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadPointer</span><span class="p">(</span><span class="n">_target</span><span class="p">.</span><span class="n">ReadGlobalPointer</span><span class="p">(</span><span class="s">&#34;Heaps&#34;</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">heaps</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">TargetPointer</span><span class="p">&gt;((</span><span class="kt">int</span><span class="p">)</span><span class="n">count</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">uint</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">count</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="n">heaps</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="k">new</span> <span class="n">TargetPointer</span><span class="p">(</span><span class="n">_target</span><span class="p">.</span><span class="n">ReadPointer</span><span class="p">(</span><span class="n">heapTable</span> <span class="p">+</span> <span class="n">i</span> <span class="p">*</span> <span class="p">(</span><span class="kt">ulong</span><span class="p">)</span><span class="n">_target</span><span class="p">.</span><span class="n">PointerSize</span><span class="p">)));</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">heaps</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And reading a heap segment is a field-by-field copy, with the newer fields guarded so older runtimes still work:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">return</span> <span class="k">new</span> <span class="n">GCHeapSegmentData</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">Address</span><span class="p">:</span>   <span class="n">segmentAddress</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">Allocated</span><span class="p">:</span> <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Allocated&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">Committed</span><span class="p">:</span> <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Committed&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">Reserved</span><span class="p">:</span>  <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Reserved&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">Used</span><span class="p">:</span>      <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Used&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">Mem</span><span class="p">:</span>       <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Mem&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">Flags</span><span class="p">:</span>     <span class="k">new</span> <span class="n">TargetNUInt</span><span class="p">(</span><span class="n">_target</span><span class="p">.</span><span class="n">ReadPointer</span><span class="p">(</span><span class="n">seg</span> <span class="p">+</span> <span class="p">(</span><span class="kt">ulong</span><span class="p">)</span><span class="n">_target</span><span class="p">.</span><span class="n">GetFieldOffset</span><span class="p">(</span><span class="s">&#34;HeapSegment&#34;</span><span class="p">,</span> <span class="s">&#34;Flags&#34;</span><span class="p">))),</span>
</span></span><span class="line"><span class="cl">    <span class="n">Next</span><span class="p">:</span>      <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Next&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="n">BackgroundAllocated</span><span class="p">:</span> <span class="n">_target</span><span class="p">.</span><span class="n">HasField</span><span class="p">(</span><span class="s">&#34;HeapSegment&#34;</span><span class="p">,</span> <span class="s">&#34;BackgroundAllocated&#34;</span><span class="p">)</span> <span class="p">?</span> <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;BackgroundAllocated&#34;</span><span class="p">)</span> <span class="p">:</span> <span class="n">TargetPointer</span><span class="p">.</span><span class="n">Null</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">Heap</span><span class="p">:</span>                <span class="n">_target</span><span class="p">.</span><span class="n">HasField</span><span class="p">(</span><span class="s">&#34;HeapSegment&#34;</span><span class="p">,</span> <span class="s">&#34;Heap&#34;</span><span class="p">)</span>                <span class="p">?</span> <span class="n">ReadSegmentPointer</span><span class="p">(</span><span class="n">seg</span><span class="p">,</span> <span class="s">&#34;Heap&#34;</span><span class="p">)</span>                <span class="p">:</span> <span class="n">TargetPointer</span><span class="p">.</span><span class="n">Null</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For each contract, you define two types: an interface (<code>IGC</code>) plus a contract implementation (<code>GCContract(Target target)</code>). That mirrors how the real <strong>cdac</strong> project models a versioned contract, which makes it easy to add a &ldquo;version 2&rdquo; implementation later without touching callers.</p>
<h3 id="dealing-with-versions">Dealing with versions</h3>
<p>This is where the &ldquo;versioned&rdquo; in &ldquo;versioned contract&rdquo; finally pays off, and it works at two distinct levels.</p>
<p><strong>Contract-level versioning.</strong> Remember the <code>contracts</code> map in the descriptor: <code>&quot;GC&quot;: &quot;c1&quot;</code> (and <code>&quot;GC&quot;: 1</code> in .NET 9). That value is the contract version, and a couple of rules from the <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/datacontracts_design.md">design doc</a> are worth detailing. A &ldquo;higher&rdquo; identifier is <strong>not</strong> &ldquo;more recent&rdquo; - the versions are just <em>different</em> implementations of the same API surface, and a runtime advertises exactly one version per contract. A reader&rsquo;s job is therefore: read the version value, then dispatch to the matching algorithm. The Microsoft reference implementation does this with one class per version (<code>PrecodeStubs_1</code>, <code>PrecodeStubs_2</code>, &hellip;) behind a shared <code>IPrecodeStubs</code> interface (in the <code>cdac\Microsoft.Diagnostics.DataContractReader.Abstractions\Contracts</code> folder), selected from the version string. For <code>IGC</code>, my tool only needs <code>c1</code> today, so it keeps a single <code>GCContract</code>; the moment a <code>c2</code> appears with different algorithms, I will probably add a second class and pick between them from <code>target.Contracts[&quot;GC&quot;]</code> - callers never change:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">IGC</span> <span class="n">gc</span> <span class="p">=</span> <span class="n">target</span><span class="p">.</span><span class="n">Contracts</span><span class="p">[</span><span class="s">&#34;GC&#34;</span><span class="p">]</span> <span class="k">switch</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="s">&#34;c1&#34;</span> <span class="p">=&gt;</span> <span class="k">new</span> <span class="n">GCContract</span><span class="p">(</span><span class="n">target</span><span class="p">),</span>     <span class="c1">// today</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// &#34;c2&#34; =&gt; new GCContract_2(target), // a future, different algorithm</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">v</span> <span class="p">=&gt;</span> <span class="k">throw</span> <span class="k">new</span> <span class="n">NotSupportedException</span><span class="p">(</span><span class="s">$&#34;GC contract version {v} is not supported.&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>A nice consequence of &ldquo;same API surface across versions&rdquo; is that an operation a newer runtime no longer supports is simply defined to <code>throw new NotSupportedException()</code> in that version - the interface stays stable, callers stay simple. The main drawback is that, as time passes, you need to keep an implementation for every version you want to support. Since a newer version is not supposed to call into the implementation of a previous one, this can mean a lot of code for large contracts such as <code>IGC</code>.</p>
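<p>As a minimal sketch of that rule, here are two versions of a made-up contract behind one interface. Every name in it (<code>IHeapInfo</code>, <code>HeapInfo_1</code>, <code>HeapInfo_2</code>, <code>GetLegacySegmentCount</code>, <code>HeapInfoFactory</code>) is hypothetical and exists neither in the cDAC nor in my tool; the returned values are placeholders:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical contract, used only to illustrate the versioning rule.
public interface IHeapInfo
{
    uint GetHeapCount();
    uint GetLegacySegmentCount();   // imagine that version 2 dropped this concept
}

// Version 1 supports every operation of the interface.
public sealed class HeapInfo_1 : IHeapInfo
{
    public uint GetHeapCount() =&gt; 1;
    public uint GetLegacySegmentCount() =&gt; 4;
}

// Version 2 is a standalone implementation: it does not reuse version 1,
// and the operation it no longer supports simply throws.
public sealed class HeapInfo_2 : IHeapInfo
{
    public uint GetHeapCount() =&gt; 1;
    public uint GetLegacySegmentCount() =&gt;
        throw new NotSupportedException(&#34;Segments are not exposed by version 2.&#34;);
}

public static class HeapInfoFactory
{
    // Dispatch on the version string advertised by the runtime.
    public static IHeapInfo Create(string version) =&gt; version switch
    {
        &#34;c1&#34; =&gt; new HeapInfo_1(),
        &#34;c2&#34; =&gt; new HeapInfo_2(),
        _ =&gt; throw new NotSupportedException($&#34;Version {version} is not supported.&#34;),
    };
}
</code></pre></div>
<p>Note that <code>HeapInfo_2</code> does not derive from <code>HeapInfo_1</code>: each version stands alone, and that is where the extra code volume comes from.</p>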
<p><strong>Field-level versioning.</strong> Even within a single contract version, the <em>data layout</em> varies build to build: a field may be added, renamed, or dropped. The spec&rsquo;s guidance is that algorithms are written against the <strong>union</strong> of all field shapes they might encounter, and accessing a field that the current runtime does not define is an error. That is precisely what the <code>Has*</code> guards provide: I check <code>HasField</code> / <code>HasGlobal</code> before reading, and degrade gracefully when something is absent. You already saw it in <code>GetHeapSegmentData</code>, where <code>BackgroundAllocated</code> and <code>Heap</code> are only read when present. The same pattern lets a single <code>LoaderContract</code> serve a runtime that has the new <code>StaticsHeap</code>/<code>DynamicHelpersStubHeap</code> and one that does not - it just asks first.</p>
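<p>The &ldquo;ask first&rdquo; pattern can be sketched in isolation. This is not the code of my tool: <code>TypeLayouts</code> is a hypothetical stand-in for the <code>types</code> section of the descriptor, <code>ReadOptionalPointer</code> is an assumed helper name, and the target memory is reduced to a <code>Func&lt;ulong, ulong&gt;</code> delegate:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical stand-in for the descriptor: type name -&gt; (field name -&gt; offset).
public sealed class TypeLayouts
{
    private readonly Dictionary&lt;string, Dictionary&lt;string, int&gt;&gt; _types;

    public TypeLayouts(Dictionary&lt;string, Dictionary&lt;string, int&gt;&gt; types) =&gt; _types = types;

    public bool HasField(string type, string field) =&gt;
        _types.TryGetValue(type, out var fields) &amp;&amp; fields.ContainsKey(field);

    public int GetFieldOffset(string type, string field) =&gt; _types[type][field];
}

public static class OptionalFields
{
    // Returns null when the descriptor of this runtime does not define the field,
    // instead of failing on a missing offset.
    public static ulong? ReadOptionalPointer(
        TypeLayouts layouts, Func&lt;ulong, ulong&gt; readPointer,
        ulong address, string type, string field)
    {
        if (!layouts.HasField(type, field))
            return null;

        return readPointer(address + (ulong)layouts.GetFieldOffset(type, field));
    }
}
</code></pre></div>
<p>The sketch returns a nullable value so that &ldquo;field absent&rdquo; can be told apart from &ldquo;field present but null&rdquo;; the <code>GetHeapSegmentData</code> code shown earlier makes the simpler choice of falling back to <code>TargetPointer.Null</code>.</p>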
<p>So versioning is handled on two fronts: pick the right <em>algorithm</em> from the contract version, and guard every <em>optional field</em> so one algorithm spans many builds. Between the two, a single managed reader genuinely keeps working as the runtime evolves - which was the whole point of replacing the build-time, version-locked DAC.</p>
<h2 id="proof-a-from-scratch-eeheap-based-on-contracts-only">Proof: a from-scratch <code>!eeheap</code> based on contracts only</h2>
<p>As an example, the equivalent of SOS&rsquo;s <code>!eeheap</code> (and ClrMD&rsquo;s <code>EnumerateClrNativeHeaps()</code>) lists every native heap the CLR allocates: JIT code heaps, loader-allocator heaps, and GC regions. This is rebuilt with <strong>zero ClrMD</strong> in <code>NativeHeapEnumerator.cs</code>, using only the contracts described above.</p>
<p><strong>Loader heaps.</strong> A <code>LoaderAllocator</code> exposes a handful of <code>LoaderHeap</code> pointers (low/high frequency, statics, stub, executable, precode heaps, &hellip;). Each <code>LoaderHeap</code> is a linked list of <code>LoaderHeapBlock { VirtualAddress, VirtualSize, Next }</code> that we just walk:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IEnumerable</span><span class="p">&lt;</span><span class="n">ClrNativeHeapInfo</span><span class="p">&gt;</span> <span class="n">WalkLoaderHeap</span><span class="p">(</span><span class="n">TargetPointer</span> <span class="n">loaderHeap</span><span class="p">,</span> <span class="n">NativeHeapKind</span> <span class="n">kind</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">loaderHeap</span><span class="p">.</span><span class="n">IsNull</span> <span class="p">||</span> <span class="p">!</span><span class="n">_target</span><span class="p">.</span><span class="n">HasType</span><span class="p">(</span><span class="s">&#34;LoaderHeap&#34;</span><span class="p">)</span> <span class="p">||</span> <span class="p">!</span><span class="n">_target</span><span class="p">.</span><span class="n">HasType</span><span class="p">(</span><span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">yield</span> <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">block</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">loaderHeap</span><span class="p">.</span><span class="n">Value</span><span class="p">,</span> <span class="s">&#34;LoaderHeap&#34;</span><span class="p">,</span> <span class="s">&#34;FirstBlock&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">guard</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="n">block</span> <span class="p">!=</span> <span class="m">0</span> <span class="p">&amp;&amp;</span> <span class="n">guard</span><span class="p">++</span> <span class="p">&lt;</span> <span class="n">MaxListIterations</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">ulong</span> <span class="n">address</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">,</span> <span class="s">&#34;VirtualAddress&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="kt">ulong</span> <span class="n">size</span>    <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">,</span> <span class="s">&#34;VirtualSize&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="kt">ulong</span> <span class="n">next</span>    <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s">&#34;LoaderHeapBlock&#34;</span><span class="p">,</span> <span class="s">&#34;Next&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">address</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">yield</span> <span class="k">return</span> <span class="k">new</span> <span class="n">ClrNativeHeapInfo</span><span class="p">(</span><span class="k">new</span> <span class="n">TargetPointer</span><span class="p">(</span><span class="n">address</span><span class="p">),</span> <span class="n">size</span><span class="p">,</span> <span class="n">kind</span><span class="p">,</span> <span class="n">NativeHeapState</span><span class="p">.</span><span class="n">Active</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">next</span> <span class="p">==</span> <span class="n">block</span><span class="p">)</span> <span class="k">break</span><span class="p">;</span>       <span class="c1">// self-referential = corrupt; stop</span>
</span></span><span class="line"><span class="cl">        <span class="n">block</span> <span class="p">=</span> <span class="n">next</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><strong>JIT code heaps.</strong> Starting from the <code>EEJitManagerAddress</code> global, the <code>AllCodeHeaps</code> linked list of <code>CodeHeapListNode { Heap, Next }</code> is walked:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IEnumerable</span><span class="p">&lt;</span><span class="n">ClrNativeHeapInfo</span><span class="p">&gt;</span> <span class="n">EnumerateCodeHeaps</span><span class="p">(</span><span class="n">ILoaderHeaps</span> <span class="n">loader</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">IsSupported</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">yield</span> <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">eeJitManager</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadPointer</span><span class="p">(</span><span class="n">_target</span><span class="p">.</span><span class="n">ReadGlobalPointer</span><span class="p">(</span><span class="s">&#34;EEJitManagerAddress&#34;</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eeJitManager</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">yield</span> <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">node</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">eeJitManager</span><span class="p">,</span> <span class="s">&#34;EEJitManager&#34;</span><span class="p">,</span> <span class="s">&#34;AllCodeHeaps&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">guard</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="n">node</span> <span class="p">!=</span> <span class="m">0</span> <span class="p">&amp;&amp;</span> <span class="n">guard</span><span class="p">++</span> <span class="p">&lt;</span> <span class="n">MaxListIterations</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">ulong</span> <span class="n">heap</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="s">&#34;CodeHeapListNode&#34;</span><span class="p">,</span> <span class="s">&#34;Heap&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="kt">ulong</span> <span class="n">next</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="s">&#34;CodeHeapListNode&#34;</span><span class="p">,</span> <span class="s">&#34;Next&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">heap</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">foreach</span> <span class="p">(</span><span class="n">ClrNativeHeapInfo</span> <span class="n">info</span> <span class="k">in</span> <span class="n">DescribeCodeHeap</span><span class="p">(</span><span class="n">heap</span><span class="p">,</span> <span class="n">loader</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">                <span class="k">yield</span> <span class="k">return</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">next</span> <span class="p">==</span> <span class="n">node</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">node</span> <span class="p">=</span> <span class="n">next</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><code>DescribeCodeHeap()</code> then branches on <code>CodeHeap.HeapType</code>.</p>
<p>Here is the gotcha that cost me an hour: for a <code>LoaderCodeHeap</code>, the <code>LoaderHeap</code> field is an <code>ExplicitControlLoaderHeap</code> embedded <strong>inline</strong> in the struct - its address is the <em>field address</em>, not a pointer to dereference. Dereferencing it produced petabyte-sized garbage regions:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">case</span> <span class="n">LoaderCodeHeap</span> <span class="n">when</span> <span class="n">_target</span><span class="p">.</span><span class="n">HasType</span><span class="p">(</span><span class="s">&#34;LoaderCodeHeap&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// inline struct: take the field address, do NOT ReadFieldPointer here</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">loaderHeap</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">FieldAddress</span><span class="p">(</span><span class="n">codeHeap</span><span class="p">,</span> <span class="s">&#34;LoaderCodeHeap&#34;</span><span class="p">,</span> <span class="s">&#34;LoaderHeap&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">info</span> <span class="k">in</span> <span class="n">loader</span><span class="p">.</span><span class="n">WalkLoaderHeap</span><span class="p">(</span><span class="k">new</span> <span class="n">TargetPointer</span><span class="p">(</span><span class="n">loaderHeap</span><span class="p">),</span> <span class="n">NativeHeapKind</span><span class="p">.</span><span class="n">LoaderCodeHeap</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">yield</span> <span class="k">return</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="n">HostCodeHeap</span> <span class="n">when</span> <span class="n">_target</span><span class="p">.</span><span class="n">HasType</span><span class="p">(</span><span class="s">&#34;HostCodeHeap&#34;</span><span class="p">):</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">baseAddress</span> <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">codeHeap</span><span class="p">,</span> <span class="s">&#34;HostCodeHeap&#34;</span><span class="p">,</span> <span class="s">&#34;BaseAddress&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">current</span>     <span class="p">=</span> <span class="n">_target</span><span class="p">.</span><span class="n">ReadFieldPointer</span><span class="p">(</span><span class="n">codeHeap</span><span class="p">,</span> <span class="s">&#34;HostCodeHeap&#34;</span><span class="p">,</span> <span class="s">&#34;CurrentAddress&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">ulong</span> <span class="n">length</span>      <span class="p">=</span> <span class="n">current</span> <span class="p">&gt;=</span> <span class="n">baseAddress</span> <span class="p">?</span> <span class="n">current</span> <span class="p">-</span> <span class="n">baseAddress</span> <span class="p">:</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">yield</span> <span class="k">return</span> <span class="k">new</span> <span class="n">ClrNativeHeapInfo</span><span class="p">(</span><span class="k">new</span> <span class="n">TargetPointer</span><span class="p">(</span><span class="n">baseAddress</span><span class="p">),</span> <span class="n">length</span><span class="p">,</span> <span class="n">NativeHeapKind</span><span class="p">.</span><span class="n">HostCodeHeap</span><span class="p">,</span> <span class="n">NativeHeapState</span><span class="p">.</span><span class="n">Active</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Enumerating the code heaps only needs this linked-list walk; the nibble-map / RangeSectionMap machinery is required for instruction-pointer lookups, not for enumeration, so I did not add it to the code.</p>
<p><strong>GC native regions.</strong> Free regions, handle-table segments, and bookkeeping regions all come from the GC contract - and every one of them is <code>Has*</code>-guarded so an older descriptor simply yields nothing instead of throwing an exception or crashing.</p>
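<p>To make the guard idea concrete, here is a minimal sketch rather than the code from the repository. The dictionary stands in for the globals a descriptor publishes, and <code>ExampleFreeRegions</code> and <code>EnumerateFreeRegions</code> are made-up names used only for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the globals published by a contract descriptor.
// "ExampleFreeRegions" is a made-up name, not a real GC contract global.
var newerDescriptor = new Dictionary&lt;string, ulong&gt; { ["ExampleFreeRegions"] = 0x7ff0_0000UL };
var olderDescriptor = new Dictionary&lt;string, ulong&gt;();

// Guard first: a missing global means "nothing to report", not an error.
static IEnumerable&lt;ulong&gt; EnumerateFreeRegions(IReadOnlyDictionary&lt;string, ulong&gt; globals)
{
    if (!globals.TryGetValue("ExampleFreeRegions", out ulong address))
        yield break;

    yield return address;
}

int newerCount = EnumerateFreeRegions(newerDescriptor).Count();
int olderCount = EnumerateFreeRegions(olderDescriptor).Count();
Console.WriteLine($"newer: {newerCount}, older: {olderCount}");  // prints "newer: 1, older: 0"
</code></pre></div>
<p>The point of the pattern is that the caller never has to know which runtime version produced the descriptor: an empty sequence is a perfectly valid answer.</p>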
<p>Finally, the <code>NativeHeapEnumerator</code> class stitches the three sources together in the same order ClrMD uses, and deduplicates loader allocators (the SystemDomain&rsquo;s allocator is shared, so without a <code>visited</code> set you would count it many times):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IEnumerable</span><span class="p">&lt;</span><span class="n">ClrNativeHeapInfo</span><span class="p">&gt;</span> <span class="n">EnumerateAll</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">info</span> <span class="k">in</span> <span class="n">EnumerateCodeHeaps</span><span class="p">())</span>  <span class="k">yield</span> <span class="k">return</span> <span class="n">info</span><span class="p">;</span>   <span class="c1">// 1. JIT code heaps</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">info</span> <span class="k">in</span> <span class="n">EnumerateLoaderHeaps</span><span class="p">())</span> <span class="k">yield</span> <span class="k">return</span> <span class="n">info</span><span class="p">;</span>  <span class="c1">// 2. loader heaps (deduped)</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">info</span> <span class="k">in</span> <span class="n">EnumerateGCRegions</span><span class="p">())</span>  <span class="k">yield</span> <span class="k">return</span> <span class="n">info</span><span class="p">;</span>   <span class="c1">// 3. GC regions</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">visited</span> <span class="p">=</span> <span class="k">new</span> <span class="n">HashSet</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">TargetPointer</span> <span class="n">global</span> <span class="p">=</span> <span class="n">SafeGet</span><span class="p">(</span><span class="n">_loader</span><span class="p">.</span><span class="n">GetGlobalLoaderAllocator</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="n">global</span><span class="p">.</span><span class="n">IsNull</span> <span class="p">&amp;&amp;</span> <span class="n">visited</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">global</span><span class="p">.</span><span class="n">Value</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">SafeCollect</span><span class="p">(</span><span class="n">results</span><span class="p">,</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_loader</span><span class="p">.</span><span class="n">EnumerateLoaderAllocatorHeaps</span><span class="p">(</span><span class="n">global</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">foreach</span> <span class="p">(</span><span class="n">TargetPointer</span> <span class="n">module</span> <span class="k">in</span> <span class="n">SafeList</span><span class="p">(</span><span class="n">_loader</span><span class="p">.</span><span class="n">EnumerateModules</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">TargetPointer</span> <span class="n">la</span> <span class="p">=</span> <span class="n">SafeGet</span><span class="p">(()</span> <span class="p">=&gt;</span> <span class="n">_loader</span><span class="p">.</span><span class="n">GetModuleLoaderAllocator</span><span class="p">(</span><span class="n">module</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">la</span><span class="p">.</span><span class="n">IsNull</span> <span class="p">&amp;&amp;</span> <span class="n">visited</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">la</span><span class="p">.</span><span class="n">Value</span><span class="p">))</span>                          <span class="c1">// dedup shared allocators</span>
</span></span><span class="line"><span class="cl">        <span class="n">SafeCollect</span><span class="p">(</span><span class="n">results</span><span class="p">,</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_loader</span><span class="p">.</span><span class="n">EnumerateLoaderAllocatorHeaps</span><span class="p">(</span><span class="n">la</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">TargetPointer</span> <span class="n">thunkHeap</span> <span class="p">=</span> <span class="n">SafeGet</span><span class="p">(()</span> <span class="p">=&gt;</span> <span class="n">_loader</span><span class="p">.</span><span class="n">GetModuleThunkHeap</span><span class="p">(</span><span class="n">module</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">thunkHeap</span><span class="p">.</span><span class="n">IsNull</span> <span class="p">&amp;&amp;</span> <span class="n">visited</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">thunkHeap</span><span class="p">.</span><span class="n">Value</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="n">SafeCollect</span><span class="p">(</span><span class="n">results</span><span class="p">,</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_loader</span><span class="p">.</span><span class="n">WalkLoaderHeap</span><span class="p">(</span><span class="n">thunkHeap</span><span class="p">,</span> <span class="n">NativeHeapKind</span><span class="p">.</span><span class="n">ThunkHeap</span><span class="p">));</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each heap walk is wrapped in a try/catch (<code>SafeCollect</code>/<code>SafeGet</code>), so a single corrupted structure in a dump degrades to &ldquo;skip this region&rdquo; instead of aborting the whole listing.</p>
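<p>For readers who want to picture those wrappers, the following is a minimal sketch of how such helpers could look. The signatures are my assumption for this illustration (generic, with an explicit fallback value) and are not copied from the repository; <code>BrokenWalk</code> is a hypothetical source that fails halfway through:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Assumed shape: returns the value produced by 'read', or 'fallback' when the read throws.
static T SafeGet&lt;T&gt;(Func&lt;T&gt; read, T fallback)
{
    try { return read(); }
    catch (Exception) { return fallback; }
}

// Assumed shape: appends every item produced by 'walk' to 'results'.
// Items yielded before a failure are kept; the failure itself is swallowed.
static void SafeCollect&lt;T&gt;(List&lt;T&gt; results, Func&lt;IEnumerable&lt;T&gt;&gt; walk)
{
    try
    {
        foreach (T item in walk())
            results.Add(item);
    }
    catch (Exception)
    {
        // Corrupted structure: skip the rest of this source and move on.
    }
}

// Hypothetical source that yields one region and then hits unreadable memory.
static IEnumerable&lt;ulong&gt; BrokenWalk()
{
    yield return 0x1000;
    throw new InvalidOperationException("simulated unreadable memory");
}

var results = new List&lt;ulong&gt;();
SafeCollect(results, () =&gt; BrokenWalk());               // keeps 0x1000, swallows the failure
SafeCollect(results, () =&gt; new ulong[] { 0x2000 });     // the next source still runs
ulong pointer = SafeGet&lt;ulong&gt;(() =&gt; throw new InvalidOperationException("bad read"), 0);

Console.WriteLine($"{results.Count} regions, pointer = {pointer}");  // prints "2 regions, pointer = 0"
</code></pre></div>
<p>Catching every exception is a deliberate trade-off for a diagnostic tool: a partial listing from a damaged dump is usually more useful than no listing at all.</p>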
<p>The CLI exposes all of this behind an <code>H. !eeheap</code> menu entry that runs against whatever target you opened - self, <code>--pid</code>, or <code>--file</code>:</p>
<p><img alt="!eeheap reimplemented on cDAC contracts: code heaps, loader heaps and GC regions" loading="lazy" src="/posts/2026-06-17_reading-clr-internals-the/AllNativeHeaps.png"></p>
<p>Run it against the current process and against a captured dump, and you get the same heap kinds and memory ranges out of both - with no DAC anywhere in sight.</p>
<h2 id="wrapping-up">Wrapping up</h2>
<p>A few honest caveats, because I would rather you know them up front. A <code>LoaderHeapBlock</code> only exposes its reserved <code>VirtualSize</code>, so I report reserved memory and mark blocks <code>Active</code>; SOS reads the allocation pointers for finer committed-vs-reserved state. Thunk heaps are only emitted when the descriptor&rsquo;s <code>Module</code> type happens to carry a <code>ThunkHeap</code> field, since no contract operation surfaces them. And all of the GC-region output depends on the descriptor publishing the newer GC globals and types - on older descriptors, those sources gracefully degrade to empty.</p>
<p>None of that changes the big picture: this is the direction SOS and ClrMD themselves are moving. As the contracts mature, the matching-DAC pain - the wrong-binary-means-no-analysis problem that has haunted dump debugging for years - simply goes away, replaced by a single managed reader that asks the runtime to describe itself. I find that genuinely exciting, and building a <code>!eeheap</code> clone on top of it (with the help of Cursor and Opus 4.8) was the most fun I have had reading CLR internals in a long time.</p>
<p>The burden that used to fall on build-time DAC binaries and ClrMD managed helpers now rests fully on the Microsoft teams: every contract and type definition, and every reader implementation of those contracts, will have to be validated on each and every build of the CLR.</p>
<p>The full <code>RuntimeDataContract</code> source (the <code>Cdac.Core</code>, <code>Cdac.Contracts</code>, and <code>Cdac.Cli</code> projects) is available on my <a href="https://github.com/chrisnas/RuntimeDataContract">GitHub repository</a>.</p>
<p>Happy coding!</p>
<h2 id="references">References</h2>
<ul>
<li>cDAC design overview: <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/datacontracts_design.md">datacontracts_design.md</a></li>
<li>The contract descriptor format: <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/contract-descriptor.md">contract-descriptor.md</a></li>
<li>The data descriptor format: <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/data_descriptor.md">data_descriptor.md</a></li>
<li>The GC contract spec: <a href="https://github.com/dotnet/runtime/blob/main/docs/design/datacontracts/GC.md">GC.md</a></li>
<li>Microsoft&rsquo;s managed cDAC reader: <code>src/native/managed/cdac</code> in <a href="https://github.com/dotnet/runtime/tree/main/src/native/managed/cdac">dotnet/runtime</a></li>
<li>For contrast, my earlier deep dives: <a href="/posts/2017-02-21_clrmd-part-1-going-beyond/">ClrMD part 1</a> and <a href="/posts/2022-07-28_digging-into-the-clr/">Digging into the CLR</a></li>
</ul>
]]></content:encoded></item></channel></rss>