<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Performance on Welcome to Christophe Nasarre's Blog</title><link>https://chrisnas.github.io/tags/performance/</link><description>Recent content in Performance on Welcome to Christophe Nasarre's Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 11 Feb 2026 09:16:11 +0000</lastBuildDate><atom:link href="https://chrisnas.github.io/tags/performance/index.xml" rel="self" type="application/rss+xml"/><item><title>How to support .NET Framework PDB format and source line with ISymUnmanagedReader</title><link>https://chrisnas.github.io/posts/2026-02-11_how-to-support-net/</link><pubDate>Wed, 11 Feb 2026 09:16:11 +0000</pubDate><guid>https://chrisnas.github.io/posts/2026-02-11_how-to-support-net/</guid><description>After DIA and DbgHelp, time to dig into ISymUnmanagedReader to get its line in source code.</description><content:encoded><![CDATA[<hr>
<p>In my previous posts, I explained how to use <a href="/posts/2025-12-08_how-to-dump-function/">DIA</a> and <a href="/posts/2026-01-16_but-where-is-my/">DbgHelp</a> to map a method to its line in source code. What I forgot to mention is that this works for .NET Core but not for the “old” Windows PDB format used by .NET Framework. Instead of encoding the method token in the symbol name, these symbol files store the names of the methods. So, how do you do the mapping for .NET Framework assemblies? You will find the answer (plus some tricks) in this article.</p>
<p>When I started to work on the support of the old Windows PDB format, I looked at what existed to parse the <a href="https://github.com/microsoft/microsoft-pdb/tree/master">raw format</a> and… I decided to try DbgHelp instead. With this <a href="/posts/2026-01-16_but-where-is-my/">first implementation</a>, I realized that some tokens were missing and that source code information was not retrieved for most of the methods.</p>
<p><img loading="lazy" src="/posts/2026-02-11_how-to-support-net/1_ppZFw6o-Ygx7BzpX3jyWIw.png"></p>
<p>So, I looked for another API to use and I found <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedreader-interface?WT.mc_id=DT-MVP-5003325">ISymUnmanagedReader</a>. The usage philosophy is totally different from DIA or DbgHelp.</p>
<h2 id="a-little-bit-ofmagic">A little bit of magic</h2>
<p>This interface is implemented in diasymreader.dll, which comes with every .NET Framework installation. But you need to do some COM magic to get it. After calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/objbase/nf-objbase-coinitialize?WT.mc_id=DT-MVP-5003325"><strong>CoInitialize</strong></a> to set up COM, you ask for an instance of <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedbinder-interface?WT.mc_id=DT-MVP-5003325"><strong>ISymUnmanagedBinder</strong></a> from <strong>CLSID_CorSymBinder_SxS</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">ISymUnmanagedBinder</span><span class="o">&gt;</span> <span class="n">pBinder</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">CoCreateInstance</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">CLSID_CorSymBinder_SxS</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nb">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">CLSCTX_INPROC_SERVER</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">IID_ISymUnmanagedBinder</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pBinder</span>
</span></span><span class="line"><span class="cl"><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>From the binder, you can get the <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedreader-interface?WT.mc_id=DT-MVP-5003325"><strong>ISymUnmanagedReader</strong> interface</a> corresponding to the assembly you are interested in with <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedbinder-getreaderforfile-method?WT.mc_id=DT-MVP-5003325"><strong>GetReaderForFile</strong></a>. However, there are two tiny details to consider.</p>
<p>First, one parameter expects the path to the assembly, not to the .pdb file. That symbol file has to be stored in the same folder but note that the documentation states that you could have more flexible search with <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedbinder2-getreaderforfile2-method?WT.mc_id=DT-MVP-5003325"><strong>ISymUnmanagedBinder2::GetReaderForFile2</strong></a> but I did not test it.</p>
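<p>Since the reader is given the assembly path, it can help to check up front that a symbol file is where it is expected. Here is a minimal sketch, assuming the .pdb sits next to the assembly and shares its base name (the usual build output layout); <strong>ExpectedPdbPath</strong> is a hypothetical helper, not part of any API:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: derive the path of the .pdb expected next to an assembly.
// Assumption: the symbol file has the same base name as the assembly.
inline std::wstring ExpectedPdbPath(const std::wstring& assemblyPath)
{
    assert(!assemblyPath.empty());

    // Only treat a '.' as an extension separator if it comes after the last path separator.
    const std::size_t lastSep = assemblyPath.find_last_of(L"\\/");
    const std::size_t lastDot = assemblyPath.find_last_of(L'.');
    const bool hasExtension =
        (lastDot != std::wstring::npos) &&
        (lastSep == std::wstring::npos || lastDot > lastSep);

    const std::wstring stem = hasExtension ? assemblyPath.substr(0, lastDot) : assemblyPath;
    return stem + L".pdb";
}
```

<p>The returned path can then be tested for existence before asking the binder for a reader, which makes a missing symbol file easier to diagnose than a failed HRESULT.</p>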
<p>The second detail is the first parameter: an instance of <strong>IMetaDataImport</strong> for the same assembly. The steps to get it are… complicated.</p>
<h2 id="hosting-theclr">Hosting the CLR</h2>
<p>The idea is to host the .NET Framework and get the corresponding <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/hosting/iclrmetahost-interface?WT.mc_id=DT-MVP-5003325">ICLRMetaHost</a> interface:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">ICLRMetaHost</span><span class="o">&gt;</span> <span class="n">pMetaHost</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">CLRCreateInstance</span><span class="p">(</span><span class="n">CLSID_CLRMetaHost</span><span class="p">,</span> <span class="n">IID_ICLRMetaHost</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pMetaHost</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Calling the <strong>CLRCreateInstance</strong> API gives you an instance of <strong>ICLRMetaHost</strong> from which you can <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/hosting/iclrmetahost-enumerateinstalledruntimes-method?WT.mc_id=DT-MVP-5003325">enumerate the installed versions</a> of .NET Framework. In my case, I know which version I want:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Get the installed .NET Framework runtime (v4.0+)
</span></span></span><span class="line"><span class="cl"><span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">ICLRRuntimeInfo</span><span class="o">&gt;</span> <span class="n">pRuntimeInfo</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">pMetaHost</span><span class="o">-&gt;</span><span class="n">GetRuntime</span><span class="p">(</span><span class="sa">L</span><span class="s">&#34;v4.0.30319&#34;</span><span class="p">,</span> <span class="n">IID_ICLRRuntimeInfo</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pRuntimeInfo</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/hosting/iclrruntimeinfo-interface?WT.mc_id=DT-MVP-5003325">ICLRRuntimeInfo interface</a> allows you to get access to runtime services via <strong>GetInterface</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IMetaDataDispenser</span><span class="o">&gt;</span> <span class="n">pDispenser</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">pRuntimeInfo</span><span class="o">-&gt;</span><span class="n">GetInterface</span><span class="p">(</span><span class="n">CLSID_CorMetaDataDispenser</span><span class="p">,</span> <span class="n">IID_IMetaDataDispenser</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pDispenser</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The service I’m interested in is the <a href="https://learn.microsoft.com/en-us/windows/win32/api/rometadataapi/nn-rometadataapi-imetadatadispenser?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataDispenser</strong> interface</a> that allows you to “open a scope” on the assembly you want to inspect:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">pDispenser</span><span class="o">-&gt;</span><span class="n">OpenScope</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">wModulePath</span><span class="p">.</span><span class="n">c_str</span><span class="p">(),</span>
</span></span><span class="line"><span class="cl">    <span class="n">ofRead</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">IID_IMetaDataImport</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">IUnknown</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">_pMetaDataImport</span>
</span></span><span class="line"><span class="cl"><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that the first parameter is the path to the assembly, not to the .pdb file. The scope is abstracted by an <strong>IMetaDataImport</strong> interface that <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">I have already described</a> and that is needed to call <strong>GetReaderForFile</strong> and get the <strong>ISymUnmanagedReader</strong>:</p>
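<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl">hr = pBinder-&gt;GetReaderForFile(_pMetaDataImport, wModulePath.c_str(), nullptr, &amp;_pReader);
</span></span></code></pre></td></tr></table>
</div>
</div>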
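<p>Every call in this chain returns an HRESULT, and each step only makes sense if the previous one succeeded. The following is a portable sketch of that "stop at the first failure" logic, not the code used in the article: <strong>HResult</strong>, <strong>Succeeded</strong>, <strong>Failed</strong> and <strong>RunSteps</strong> are hypothetical stand-ins that mirror the sign-bit rule behind the Windows <strong>SUCCEEDED</strong>/<strong>FAILED</strong> macros.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Portable stand-in for the Windows HRESULT type (a signed 32-bit value).
using HResult = std::int32_t;

// Same rule as SUCCEEDED/FAILED: a negative value (severity bit set) means failure.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }
constexpr bool Failed(HResult hr) { return hr < 0; }

constexpr HResult kOk = 0;                                     // value of S_OK
constexpr HResult kFail = static_cast<HResult>(0x80004005u);   // value of E_FAIL

// Hypothetical helper: run each step in order and return the first failing HRESULT.
// Later steps are skipped because they depend on the interfaces obtained earlier.
inline HResult RunSteps(const std::vector<std::function<HResult()>>& steps)
{
    for (const auto& step : steps)
    {
        assert(static_cast<bool>(step)); // each step must be callable
        const HResult hr = step();
        if (Failed(hr))
        {
            return hr;
        }
    }
    return kOk;
}
```

<p>In real code, each step would wrap one of the calls shown above (getting the runtime, the dispenser, the metadata scope, then the reader), so a failure early in the chain is reported instead of being silently carried into the next call.</p>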
<h2 id="the-road-to-get-symbol-details-for-amethod">The road to get symbol details for a method</h2>
<p>The <strong>ISymUnmanagedReader</strong> interface implements <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedreader-getmethod-method?WT.mc_id=DT-MVP-5003325"><strong>GetMethod</strong></a> to get details about a given method token via an <strong>ISymUnmanagedMethod</strong> interface. So, the next question is how to get these tokens. If you remember <a href="/posts/2026-01-16_but-where-is-my/">the previous article</a>, these tokens come from the MethodDef table (table 06) in the assembly metadata, starting at <strong>06000001</strong> and running up to the last row.</p>
<p>This means that you could write a simple loop from 1 up to a hardcoded maximum value and call <strong>TokenFromRid(index, mdtMethodDef)</strong> to get the corresponding token. However, since you are a professional developer, you would rather ask for the exact number of rows from <a href="https://learn.microsoft.com/en-us/dotnet/core/unmanaged-api/metadata/interfaces/imetadatatables-interface?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataTables</strong></a>, obtained from <strong>IMetaDataImport</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">cRows</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// Get IMetaDataTables interface to query the MethodDef table
</span></span></span><span class="line"><span class="cl"><span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IMetaDataTables</span><span class="o">&gt;</span> <span class="n">pTables</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pMetaDataImport</span><span class="o">-&gt;</span><span class="n">QueryInterface</span><span class="p">(</span><span class="n">IID_IMetaDataTables</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pTables</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="o">||</span> <span class="n">pTables</span> <span class="o">==</span> <span class="k">nullptr</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">cRows</span> <span class="o">=</span> <span class="n">LAST_METHODDEF_TOKEN</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">else</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// Get the number of rows in the MethodDef table (table index 0x06 = Method)
</span></span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">pTables</span><span class="o">-&gt;</span><span class="n">GetTableInfo</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="mh">0x06</span><span class="p">,</span>           <span class="c1">// MethodDef table
</span></span></span><span class="line"><span class="cl">        <span class="nb">NULL</span><span class="p">,</span>           <span class="c1">// cbRow (not needed)
</span></span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">cRows</span><span class="p">,</span>         <span class="c1">// pcRows (number of methods)
</span></span></span><span class="line"><span class="cl">        <span class="nb">NULL</span><span class="p">,</span>           <span class="c1">// pcCols (not needed)
</span></span></span><span class="line"><span class="cl">        <span class="nb">NULL</span><span class="p">,</span>           <span class="c1">// piKey (not needed)
</span></span></span><span class="line"><span class="cl">        <span class="nb">NULL</span>            <span class="c1">// ppName (not needed)
</span></span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">cRows</span> <span class="o">=</span> <span class="n">LAST_METHODDEF_TOKEN</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now that you have the number of rows (i.e. number of methods defined in the metadata), it is easy and safe to get method information from symbols:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="kt">uint32_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;=</span> <span class="n">cRows</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">mdMethodDef</span> <span class="n">token</span> <span class="o">=</span> <span class="n">TokenFromRid</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">mdtMethodDef</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">ISymUnmanagedMethod</span><span class="o">&gt;</span> <span class="n">pMethod</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">_pReader</span><span class="o">-&gt;</span><span class="n">GetMethod</span><span class="p">(</span><span class="n">token</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pMethod</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="n">pMethod</span> <span class="o">!=</span> <span class="k">nullptr</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">MethodInfo</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">GetMethodInfoFromSymbol</span><span class="p">(</span><span class="n">pMethod</span><span class="p">,</span> <span class="n">info</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">_methods</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">info</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that <strong>GetMethod</strong> might fail (returning <strong>E_FAIL</strong>) for P/Invoke functions, abstract methods, or methods decorated with the <strong>DebuggerHidden</strong> attribute.</p>
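<p>The token arithmetic behind <strong>TokenFromRid</strong> in the loop above is simple: the top byte holds the table type and the lower 24 bits hold the row id. Here is a portable sketch that mirrors the macros from corhdr.h; the names <strong>MethodDefTokenFromRid</strong>, <strong>RidOf</strong> and <strong>TypeOf</strong> are hypothetical, only the bit layout comes from the metadata format:</p>

```cpp
#include <cassert>
#include <cstdint>

using Token = std::uint32_t;

// Same value as mdtMethodDef in corhdr.h: table 0x06 stored in the top byte.
constexpr Token kMethodDefType = 0x06000000u;

// Equivalent of TokenFromRid(rid, mdtMethodDef): combine the table type and the row id.
inline Token MethodDefTokenFromRid(std::uint32_t rid)
{
    assert(rid != 0 && rid <= 0x00FFFFFFu); // row ids are 1-based and fit in 24 bits
    return kMethodDefType | rid;
}

// Equivalent of RidFromToken: keep the lower 24 bits.
inline std::uint32_t RidOf(Token token) { return token & 0x00FFFFFFu; }

// Equivalent of TypeFromToken: keep the top byte.
inline Token TypeOf(Token token) { return token & 0xFF000000u; }
```

<p>With this layout, row 1 of the MethodDef table maps to the token 06000001, which is the first value the loop passes to <strong>GetMethod</strong>.</p>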
<h2 id="give-me-line-and-sourcecode">Give me line and source code!</h2>
<p>For the methods that do have symbol information, you can get the token back via the <strong>GetToken</strong> method. The <strong>ISymUnmanagedMethod</strong> interface also gives low-level access to line/column mapping that is beyond the scope of this article. At a high level, positions in the source file are named <em>sequence points</em>. Call <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedmethod-getsequencepointcount-method?WT.mc_id=DT-MVP-5003325"><strong>GetSequencePointCount</strong></a> to get… the number of sequence points for a given method.</p>
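<p>To make the notion of sequence point more concrete, here is a small portable sketch of the data each one carries and of one way to pick a start line. It is an illustration rather than the approach used in this article: the <strong>SequencePoint</strong> struct and <strong>FirstVisiblePoint</strong> are hypothetical, and the only external fact relied on is that Windows PDBs mark hidden sequence points (compiler-generated code with no source location) with the line number 0xFEEFEE.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical plain-data view of one sequence point: an IL offset mapped to a source range.
struct SequencePoint
{
    std::uint32_t ilOffset;
    std::uint32_t line;
    std::uint32_t column;
    std::uint32_t endLine;
    std::uint32_t endColumn;
};

// Line number used in Windows PDBs for hidden sequence points.
constexpr std::uint32_t kHiddenLine = 0x00FEEFEEu;

// Return the first sequence point that maps to a real source line,
// or nullptr if the method has none.
inline const SequencePoint* FirstVisiblePoint(const std::vector<SequencePoint>& points)
{
    for (const SequencePoint& point : points)
    {
        if (point.line != kHiddenLine)
        {
            return &point;
        }
    }
    return nullptr;
}
```

<p>Skipping hidden points matters only when a method starts with compiler-generated code; for most methods the very first sequence point already gives the start line.</p>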
<p>The next step is to call <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanagedmethod-getsequencepoints-method?WT.mc_id=DT-MVP-5003325"><strong>GetSequencePoints</strong></a> with the number of points you want and the corresponding arrays of offsets, documents (<strong>ISymUnmanagedDocument</strong> pointers), lines, columns, end lines and end columns. In my case, I’m only interested in where the method starts so the first sequence point is good enough:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Get sequence points (source line information)
</span></span></span><span class="line"><span class="cl"><span class="n">ULONG32</span> <span class="n">cPoints</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">pMethod</span><span class="o">-&gt;</span><span class="n">GetSequencePointCount</span><span class="p">(</span><span class="o">&amp;</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="p">(</span><span class="n">cPoints</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">cPoints</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="c1">// We only need the first sequence point for start line
</span></span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ULONG32</span><span class="o">&gt;</span> <span class="n">offsets</span><span class="p">(</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ULONG32</span><span class="o">&gt;</span> <span class="n">lines</span><span class="p">(</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ULONG32</span><span class="o">&gt;</span> <span class="n">columns</span><span class="p">(</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ULONG32</span><span class="o">&gt;</span> <span class="n">endLines</span><span class="p">(</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ULONG32</span><span class="o">&gt;</span> <span class="n">endColumns</span><span class="p">(</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ISymUnmanagedDocument</span><span class="o">*&gt;</span> <span class="n">documents</span><span class="p">(</span><span class="n">cPoints</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ULONG32</span> <span class="n">actualCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">pMethod</span><span class="o">-&gt;</span><span class="n">GetSequencePoints</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">cPoints</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">actualCount</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">offsets</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">documents</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">lines</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">columns</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">endLines</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="o">&amp;</span><span class="n">endColumns</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="p">(</span><span class="n">actualCount</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The source file is described by <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanageddocument-interface?WT.mc_id=DT-MVP-5003325">ISymUnmanagedDocument</a> that provides its name when <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/diagnostics/isymunmanageddocument-geturl-method?WT.mc_id=DT-MVP-5003325"><strong>GetURL</strong></a> is called:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Get the first sequence point&#39;s document and line
</span></span></span><span class="line"><span class="cl">        <span class="n">ISymUnmanagedDocument</span><span class="o">*</span> <span class="n">pDoc</span> <span class="o">=</span> <span class="n">documents</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">pDoc</span> <span class="o">!=</span> <span class="k">nullptr</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// Get document URL (file path)
</span></span></span><span class="line"><span class="cl">            <span class="n">ULONG32</span> <span class="n">urlLen</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">hr</span> <span class="o">=</span> <span class="n">pDoc</span><span class="o">-&gt;</span><span class="n">GetURL</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">urlLen</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="p">(</span><span class="n">urlLen</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">WCHAR</span><span class="o">&gt;</span> <span class="n">url</span><span class="p">(</span><span class="n">urlLen</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">                <span class="n">hr</span> <span class="o">=</span> <span class="n">pDoc</span><span class="o">-&gt;</span><span class="n">GetURL</span><span class="p">(</span><span class="n">urlLen</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">urlLen</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">url</span><span class="p">[</span><span class="mi">0</span><span class="p">]);</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="c1">// Convert wide string to narrow string
</span></span></span><span class="line"><span class="cl">                    <span class="kt">int</span> <span class="n">len</span> <span class="o">=</span> <span class="n">WideCharToMultiByte</span><span class="p">(</span><span class="n">CP_UTF8</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">url</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">urlLen</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">                    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">narrowUrl</span><span class="p">(</span><span class="n">len</span><span class="p">,</span> <span class="sc">&#39;\0&#39;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">                    <span class="n">WideCharToMultiByte</span><span class="p">(</span><span class="n">CP_UTF8</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">url</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">urlLen</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">narrowUrl</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">len</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">                    <span class="n">info</span><span class="p">.</span><span class="n">sourceFile</span> <span class="o">=</span> <span class="n">narrowUrl</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="c1">// NOTE: 0xFEEFEE is a special value indicating hidden lines
</span></span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">lineNumber</span> <span class="o">=</span> <span class="n">lines</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="n">pDoc</span><span class="o">-&gt;</span><span class="n">Release</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The final interesting trick is that the line number might have the special <strong>0xFEEFEE</strong> value. It means that the line is hidden. I have seen it for methods generated by the C# compiler such as <strong>MoveNext</strong> for async state machines or anonymous methods:</p>
<p><img loading="lazy" src="/posts/2026-02-11_how-to-support-net/1_ZBXLm1oWWKp4pacMcX3g5Q.png"></p>
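<p>If you would rather report a real source line for such methods, one option is to skip the hidden sequence points. Here is a minimal sketch, assuming the line numbers have been copied into a vector of 32-bit values; <strong>FirstVisibleLine</strong> and <strong>kHiddenLine</strong> are hypothetical names that are not part of the tool:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;optional&gt;
#include &lt;vector&gt;

// Special line value marking a hidden sequence point (value taken from the article).
constexpr uint32_t kHiddenLine = 0xFEEFEE;

// Hypothetical helper: returns the first line that is not hidden, or
// std::nullopt when the list is empty or every sequence point is hidden.
std::optional&lt;uint32_t&gt; FirstVisibleLine(const std::vector&lt;uint32_t&gt;&amp; lines)
{
    for (uint32_t line : lines)
    {
        if (line != kHiddenLine)
        {
            return line;
        }
    }
    return std::nullopt;
}
</code></pre></div>
<p>With such a helper, <strong>FirstVisibleLine(lines).value_or(0)</strong> could be stored instead of <strong>lines[0]</strong>, with 0 meaning “no visible line” by convention.</p>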
<p>The source code is available from <a href="https://github.com/chrisnas/DumpManagedMethodInfoFromSymbols">my Github repository</a>.</p>
<p>Happy coding!</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://github.com/microsoft/microsoft-pdb/tree/master">Archived Microsoft-pdb repository</a></li>
<li><a href="/posts/2025-12-08_how-to-dump-function/">DIA implementation article</a></li>
<li><a href="/posts/2026-01-16_but-where-is-my/">DbgHelp implementation article</a></li>
</ul>
]]></content:encoded></item><item><title>But where is my method code? DbgHelp comes to the rescue</title><link>https://chrisnas.github.io/posts/2026-01-16_but-where-is-my/</link><pubDate>Fri, 16 Jan 2026 08:21:30 +0000</pubDate><guid>https://chrisnas.github.io/posts/2026-01-16_but-where-is-my/</guid><description>This post shows how DbgHelp could help you figure out the line and source code of each managed method from a .pdb file</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>In our Datadog continuous .NET profiler implementation, we collect the call stack of a thread when something interesting happens, such as an exception being thrown. In addition to the method name, we would like to figure out at which line of which source code file this method is implemented.</p>
<p>This information is usually stored in the <em>program database</em> (.pdb) file that is generated by the compiler when the assembly is generated from the source code. The type and the name of the method are stored in the metadata of the assembly itself but <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">I already told this story before</a>. The .NET compilers support two formats of .pdb: the Portable format for .NET Core and the Windows format for .NET Framework.</p>
<p>I’ve explained <a href="/posts/2025-12-08_how-to-dump-function/">how to use the DIA API</a> and it is now time to show how to leverage the <strong>DbgHelp</strong> API that is available on all Windows machines (even though <a href="https://learn.microsoft.com/en-us/windows/win32/debug/dbghelp-versions?WT.mc_id=DT-MVP-5003325">it is recommended</a> to always install the latest release via the Debugging Tools for Windows).</p>
<p>This time, my goal is to extract from a Windows .pdb file the source code and line information for a given managed method. You can find the corresponding source code of this DumpLine tool in <a href="https://github.com/chrisnas/DumpManagedMethodInfoFromSymbols">my Github repository</a>.</p>
<h2 id="starting-withdbghelp">Starting with DbgHelp</h2>
<p>There are two major ways to get access to symbols with DbgHelp: either from a running process (i.e. to map the currently loaded .dll to their associated .pdb files) or from a tool that would explicitly load a .dll or a .pdb file.</p>
<p>Before anything, you tell DbgHelp which options you want by calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symsetoptions?WT.mc_id=DT-MVP-5003325">SymSetOptions</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">DWORD</span> <span class="n">options</span> <span class="o">=</span> <span class="n">SymGetOptions</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">|=</span> <span class="n">SYMOPT_DEBUG</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">|=</span> <span class="n">SYMOPT_LOAD_LINES</span><span class="p">;</span>           <span class="c1">// Load line number information
</span></span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">|=</span> <span class="n">SYMOPT_UNDNAME</span><span class="p">;</span>              <span class="c1">// Undecorate symbol names
</span></span></span><span class="line"><span class="cl">    <span class="c1">//options |= SYMOPT_DEFERRED_LOADS;       // Defer symbol loading
</span></span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">|=</span> <span class="n">SYMOPT_EXACT_SYMBOLS</span><span class="p">;</span>        <span class="c1">// Require exact symbol match
</span></span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">|=</span> <span class="n">SYMOPT_FAIL_CRITICAL_ERRORS</span><span class="p">;</span> <span class="c1">// Don&#39;t show error dialogs
</span></span></span><span class="line"><span class="cl">    <span class="n">SymSetOptions</span><span class="p">(</span><span class="n">options</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This is where I ask DbgHelp to load line number information, thanks to <strong>SYMOPT_LOAD_LINES</strong>.</p>
<p>The next step is to call <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-syminitialize?WT.mc_id=DT-MVP-5003325">SymInitialize</a> to set up the DbgHelp environment. The first parameter expects a process handle (returned by <a href="https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getcurrentprocess?WT.mc_id=DT-MVP-5003325">GetCurrentProcess</a> in my case). As the second parameter, you could pass the path where DbgHelp should look for the .pdb files of your dlls. In my case, since I will provide a .pdb file path, I don’t need it, and NULL will be passed. It means that, if needed, DbgHelp will use the current folder and the paths set in the _NT_SYMBOL_PATH and _NT_ALTERNATE_SYMBOL_PATH environment variables.</p>
<p>The last boolean parameter tells DbgHelp whether <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symloadmodule?WT.mc_id=DT-MVP-5003325">SymLoadModule64</a> should be called for each and every .dll loaded in the given process. Definitely not what I want, so I’m passing FALSE.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">_hProcess</span> <span class="o">=</span> <span class="n">GetCurrentProcess</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">SymInitialize</span><span class="p">(</span><span class="n">_hProcess</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">FALSE</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_hProcess</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>At that point, I’m ready to load a .pdb file.</p>
<h2 id="loading-apdbfile">Loading a .pdb file</h2>
<p>The API is straightforward: just call <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symloadmoduleex?WT.mc_id=DT-MVP-5003325">SymLoadModuleEx</a> :</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">_baseAddress</span> <span class="o">=</span> <span class="n">SymLoadModuleEx</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">_hProcess</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nb">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">pdbFilePath</span><span class="p">.</span><span class="n">c_str</span><span class="p">(),</span>
</span></span><span class="line"><span class="cl">        <span class="nb">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="mh">0x10000000</span><span class="p">,</span> <span class="c1">// arbitrary base address
</span></span></span><span class="line"><span class="cl">        <span class="mi">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nb">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="mi">0</span>
</span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_baseAddress</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The important parameters are the process handle (same as for <strong>SymInitialize</strong>) and the path of the .pdb file. I lost some time trying to understand why my code was not working because of a weird behavior of this function. In theory, you know that it succeeded when the returned address is not 0. Well… This is not 100% correct. If the path you provide does not exist, you won’t get 0 but the base address that you passed in. Even worse, when you call the functions I’ll detail later on, no error will be reported but nothing will work as expected. So, I simply check that the file exists:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// BUG? : dbghelp does not fail if the .pdb file does not exist...
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">GetFileAttributesA</span><span class="p">(</span><span class="n">pdbFilePath</span><span class="p">.</span><span class="n">c_str</span><span class="p">())</span> <span class="o">==</span> <span class="n">INVALID_FILE_ATTRIBUTES</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that it is possible to unload the symbols of a given loaded module by calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symunloadmodule64?WT.mc_id=DT-MVP-5003325">SymUnloadModule64</a> with the same process handle and its base address: this will reduce the memory consumption if you don’t need the symbols anymore.</p>
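<p>Coming back to the missing-file guard: if you prefer not to depend on the Win32 API for that check, a portable variant could look like the sketch below. It assumes a C++17 compiler, and <strong>PdbFileExists</strong> is a hypothetical helper name that is not part of the tool:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;filesystem&gt;
#include &lt;string&gt;
#include &lt;system_error&gt;

// Hypothetical helper: returns true only if the path points to an existing
// regular file. The error_code overload reports failures (missing file,
// invalid path...) as false instead of throwing an exception.
bool PdbFileExists(const std::string&amp; pdbFilePath)
{
    std::error_code ec;
    return std::filesystem::is_regular_file(pdbFilePath, ec);
}
</code></pre></div>
<p>Unlike a plain attribute check, this version also rejects a path that points to a folder, which is probably what you want before handing it over to DbgHelp.</p>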
<p>When symbols are loaded in deferred mode, you need to call <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symgetmoduleinfo64?WT.mc_id=DT-MVP-5003325">SymGetModuleInfo64</a> before trying to access the symbols:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">IMAGEHLP_MODULE64</span> <span class="n">moduleInfo</span> <span class="o">=</span> <span class="p">{</span> <span class="mi">0</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="n">moduleInfo</span><span class="p">.</span><span class="n">SizeOfStruct</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">IMAGEHLP_MODULE64</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">SymGetModuleInfo64</span><span class="p">(</span><span class="n">_hProcess</span><span class="p">,</span> <span class="n">_baseAddress</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">moduleInfo</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In addition, this will fill up an <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/ns-dbghelp-imagehlp_module?WT.mc_id=DT-MVP-5003325">IMAGEHLP_MODULE64 structure</a> with possibly interesting details:</p>
<p><img loading="lazy" src="/posts/2026-01-16_but-where-is-my/1_SUaczO2zsWrnmCvQ5Y30OA.png"></p>
<p>The .pdb signature and age could be useful to build urls to communicate with symbol servers; but this is another story:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">_age</span> <span class="o">=</span> <span class="n">moduleInfo</span><span class="p">.</span><span class="n">PdbAge</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">GUID</span> <span class="n">guid</span> <span class="o">=</span> <span class="n">moduleInfo</span><span class="p">.</span><span class="n">PdbSig70</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">char</span> <span class="n">strGUID</span><span class="p">[</span><span class="mi">80</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">    <span class="n">sprintf_s</span><span class="p">(</span><span class="n">strGUID</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="s">&#34;%08x%04x%04x%02x%02x%02x%02x%02x%02x%02x%02x&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">guid</span><span class="p">.</span><span class="n">Data1</span><span class="p">,</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data2</span><span class="p">,</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data3</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">3</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">        <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">4</span><span class="p">],</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">5</span><span class="p">],</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">6</span><span class="p">],</span> <span class="n">guid</span><span class="p">.</span><span class="n">Data4</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">_guid</span> <span class="o">=</span> <span class="n">strGUID</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You also know if line numbers are available or not thanks to the <strong>LineNumbers</strong> field.</p>
<p>Note that if you asked for the deferred symbols option, you won’t get any interesting details:</p>
<p><img loading="lazy" src="/posts/2026-01-16_but-where-is-my/1_X4Y-pHdVa77r58SrppUR3Q.png"></p>
<p>Only the module name and path are provided but nothing else.</p>
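<p>To illustrate how the .pdb signature and age extracted earlier could be combined, here is a sketch that builds the <strong>&lt;name&gt;/&lt;GUID&gt;&lt;age&gt;/&lt;name&gt;</strong> key commonly found in symbol server URLs. It assumes that layout, lowercase hex digits like in the code above, and an age written in hex without padding; check what your symbol server expects. <strong>PdbGuid</strong> is a portable stand-in for the Windows GUID structure and <strong>BuildSymbolServerKey</strong> is a hypothetical helper:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;cstdio&gt;
#include &lt;string&gt;

// Portable stand-in for the Windows GUID layout (hypothetical name).
struct PdbGuid
{
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
};

// Builds &#34;&lt;pdb name&gt;/&lt;32 hex digits of the GUID&gt;&lt;age in hex&gt;/&lt;pdb name&gt;&#34;.
// The layout and the casing are assumptions, not something DbgHelp provides.
std::string BuildSymbolServerKey(const std::string&amp; pdbName, const PdbGuid&amp; guid, uint32_t age)
{
    char id[48]; // 32 (GUID) + up to 8 (age) + terminating null fits easily
    std::snprintf(id, sizeof(id),
        &#34;%08x%04x%04x%02x%02x%02x%02x%02x%02x%02x%02x%x&#34;,
        static_cast&lt;unsigned&gt;(guid.Data1),
        static_cast&lt;unsigned&gt;(guid.Data2),
        static_cast&lt;unsigned&gt;(guid.Data3),
        static_cast&lt;unsigned&gt;(guid.Data4[0]), static_cast&lt;unsigned&gt;(guid.Data4[1]),
        static_cast&lt;unsigned&gt;(guid.Data4[2]), static_cast&lt;unsigned&gt;(guid.Data4[3]),
        static_cast&lt;unsigned&gt;(guid.Data4[4]), static_cast&lt;unsigned&gt;(guid.Data4[5]),
        static_cast&lt;unsigned&gt;(guid.Data4[6]), static_cast&lt;unsigned&gt;(guid.Data4[7]),
        static_cast&lt;unsigned&gt;(age));

    return pdbName + &#34;/&#34; + id + &#34;/&#34; + pdbName;
}
</code></pre></div>
<p>The resulting key would then be appended to the base URL of the symbol server to download the matching .pdb file.</p>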
<h2 id="enumerating-themethods">Enumerating the methods</h2>
<p>It is now time to iterate on the symbols in the loaded .pdb thanks to <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symenumsymbols?WT.mc_id=DT-MVP-5003325">SymEnumSymbols</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">SymEnumSymbols</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="n">_hProcess</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">_baseAddress</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s">&#34;*!*&#34;</span><span class="p">,</span>  <span class="c1">// Mask (all symbols)
</span></span></span><span class="line"><span class="cl">            <span class="n">EnumMethodSymbolsCallback</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="k">this</span>    <span class="c1">// User context to store the methods in _methods instance field
</span></span></span><span class="line"><span class="cl">    <span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In addition to the obvious parameters, this function expects a callback function that will be called for each symbol in the module specified by the process handle and the base address. Note that you can pass any context as the last parameter. In my case, the instance of my <strong>DbgHelpParser</strong> class is passed to be able to store the methods in a dedicated <strong>_methods</strong> field:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MethodInfo</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">sourceFile</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">lineNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">MethodInfo</span><span class="o">&gt;</span> <span class="n">_methods</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The “*!*” mask tells DbgHelp to look for symbols in all modules. This might sound counterintuitive, but the syntax is similar to what you find in WinDbg or Visual Studio: <strong>&lt;module&gt;!&lt;symbol&gt;</strong>. This could be useful if you load more than one .pdb.</p>
<p>The job of the callback function is to detect the symbols you are interested in from the <a href="https://learn.microsoft.com/en-us/windows/win32/api/DbgHelp/ns-dbghelp-symbol_info?WT.mc_id=DT-MVP-5003325">SYMBOL_INFO structure</a> passed for each matching symbol:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">BOOL</span> <span class="n">CALLBACK</span> <span class="n">DbgHelpParser</span><span class="o">::</span><span class="n">EnumMethodSymbolsCallback</span><span class="p">(</span><span class="n">PSYMBOL_INFO</span> <span class="n">pSymInfo</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">SymbolSize</span><span class="p">,</span> <span class="n">PVOID</span> <span class="n">UserContext</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">DbgHelpParser</span><span class="o">*</span> <span class="n">parser</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">DbgHelpParser</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">UserContext</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">pSymInfo</span><span class="o">-&gt;</span><span class="n">Tag</span> <span class="o">==</span> <span class="n">SymTagFunction</span><span class="p">)</span> <span class="o">&amp;&amp;</span>
</span></span><span class="line"><span class="cl">        <span class="p">((</span><span class="n">pSymInfo</span><span class="o">-&gt;</span><span class="n">Flags</span> <span class="o">&amp;</span> <span class="p">(</span><span class="n">SYMFLAG_CLR_TOKEN</span> <span class="o">|</span> <span class="n">SYMFLAG_METADATA</span><span class="p">))</span> <span class="o">==</span> <span class="p">(</span><span class="n">SYMFLAG_CLR_TOKEN</span> <span class="o">|</span> <span class="n">SYMFLAG_METADATA</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>Tag</strong> field contains a value from <a href="https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/bkedss5f(v=vs.100)?WT.mc_id=DT-MVP-5003325">SymTagEnum</a> but, for a managed .pdb file, you will only get <strong>SymTagFunction</strong>. Also, the <strong>Flags</strong> field should contain SYMFLAG_CLR_TOKEN and SYMFLAG_METADATA because we are only interested in managed methods.</p>
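<p>As a side note, comparing against the full mask matters: testing only that the masked value is non-zero would also accept a symbol that carries just one of the two flags. Here is a minimal sketch of that check as a reusable helper. <strong>HasAllFlags</strong> and the two placeholder bit values are hypothetical names used for illustration and are not part of DbgHelp; real code would pass the SYMFLAG_* constants from dbghelp.h:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;

// Returns true only when every bit of 'mask' is also set in 'flags'.
constexpr bool HasAllFlags(std::uint32_t flags, std::uint32_t mask)
{
    return (flags &amp; mask) == mask;
}

// Placeholder bit values for illustration only (assumption):
// they are NOT the real SYMFLAG_* values defined in dbghelp.h.
constexpr std::uint32_t kFlagA = 0x1;
constexpr std::uint32_t kFlagB = 0x2;

// Both flags present (plus an unrelated bit): accepted.
static_assert(HasAllFlags(kFlagA | kFlagB | 0x8, kFlagA | kFlagB));
// Only one of the two flags present: rejected.
static_assert(!HasAllFlags(kFlagA, kFlagA | kFlagB));
</code></pre></div>
<p>With such a helper, the condition in the callback could read <strong>HasAllFlags(pSymInfo-&gt;Flags, SYMFLAG_CLR_TOKEN | SYMFLAG_METADATA)</strong>.</p>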
<p>Next, you get the name, address and size from other fields before looking for the source file and line details by calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symgetlinefromaddr64?WT.mc_id=DT-MVP-5003325">SymGetLineFromAddr64</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">MethodInfo</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">info</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">pSymInfo</span><span class="o">-&gt;</span><span class="n">Name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">info</span><span class="p">.</span><span class="n">address</span> <span class="o">=</span> <span class="n">pSymInfo</span><span class="o">-&gt;</span><span class="n">Address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">info</span><span class="p">.</span><span class="n">size</span> <span class="o">=</span> <span class="n">pSymInfo</span><span class="o">-&gt;</span><span class="n">Size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// Try to get source file and line information
</span></span></span><span class="line"><span class="cl">        <span class="n">IMAGEHLP_LINE64</span> <span class="n">line</span> <span class="o">=</span> <span class="p">{</span> <span class="mi">0</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">        <span class="n">line</span><span class="p">.</span><span class="n">SizeOfStruct</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">IMAGEHLP_LINE64</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">DWORD</span> <span class="n">displacement</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">SymGetLineFromAddr64</span><span class="p">(</span><span class="n">parser</span><span class="o">-&gt;</span><span class="n">_hProcess</span><span class="p">,</span> <span class="n">pSymInfo</span><span class="o">-&gt;</span><span class="n">Address</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">displacement</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">line</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">sourceFile</span> <span class="o">=</span> <span class="n">line</span><span class="p">.</span><span class="n">FileName</span> <span class="o">?</span> <span class="n">line</span><span class="p">.</span><span class="nl">FileName</span> <span class="p">:</span> <span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">lineNumber</span> <span class="o">=</span> <span class="n">line</span><span class="p">.</span><span class="n">LineNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">sourceFile</span> <span class="o">=</span> <span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">lineNumber</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">parser</span><span class="o">-&gt;</span><span class="n">_methods</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">info</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">TRUE</span><span class="p">;</span> <span class="c1">// Continue enumeration
</span></span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The callback returns TRUE to continue the enumeration. You could return FALSE to stop early if you were looking for a specific symbol and wanted to speed up the processing.</p>
<h2 id="the-managed-side-of-thestory">The managed side of the story</h2>
<p>When the tool is run on a managed assembly, you get the following kind of output:</p>
<p><img loading="lazy" src="/posts/2026-01-16_but-where-is-my/1_KW21HTdZwwDwwKI69wlvUg.png"></p>
<p>The names of the methods do not match the names in my test assembly at all! Why do all methods have this generic <strong>Method.#&lt;number&gt;</strong> format?</p>
<p>Well… the number corresponds to the RID of the corresponding method in the metadata of the assembly. Let’s have a look at what the <strong>MethodDef</strong> metadata table of this assembly looks like in ILSpy:</p>
<p><img loading="lazy" src="/posts/2026-01-16_but-where-is-my/1_S_vIz4k8maK3smcE2MNdVw.png"></p>
<p>The RID column corresponds to the number in the name in the tool output. So, <strong>Method.#3</strong> is the <strong>get_Records</strong> property getter at PDBFormatReaderTest.cs:28. And this is exactly what I can see in the test source code:</p>
<p><img loading="lazy" src="/posts/2026-01-16_but-where-is-my/1_WZWHUMS4jfWwzomHlrN9pw.png"></p>
<p>You can also check that the lines look correct compared to what is listed by the tool:</p>
<ul>
<li><strong>FilePath</strong> getter at line 19</li>
<li><strong>FilePath</strong> setter at line 20</li>
<li><strong>Records</strong> getter at line 28</li>
<li><strong>Records</strong> setter at line 29</li>
</ul>
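<p>If you need to go from such a name back to the metadata, the mapping can be sketched in a few lines. A MethodDef token combines the table index 0x06 in its high byte with the RID in its low 24 bits. <strong>TryParseMethodRid</strong> and <strong>MakeMethodDefToken</strong> are hypothetical helpers that are not part of the tool, and the sketch assumes that the number after the # is printed in decimal, which you should check against your own output:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;charconv&gt;
#include &lt;cstdint&gt;
#include &lt;optional&gt;
#include &lt;string_view&gt;
#include &lt;system_error&gt;

// Hypothetical helper: extracts the RID from a "Method.#&lt;number&gt;" symbol name.
// Assumption: the number is written in decimal.
std::optional&lt;std::uint32_t&gt; TryParseMethodRid(std::string_view symbolName)
{
    constexpr std::string_view prefix = "Method.#";
    if (symbolName.substr(0, prefix.size()) != prefix)
        return std::nullopt;

    const std::string_view digits = symbolName.substr(prefix.size());
    const char* end = digits.data() + digits.size();

    std::uint32_t rid = 0;
    auto [ptr, ec] = std::from_chars(digits.data(), end, rid);

    // Reject empty numbers, trailing garbage and RID 0 (RIDs start at 1).
    if (ec != std::errc() || ptr != end || rid == 0)
        return std::nullopt;

    return rid;
}

// Builds a MethodDef metadata token: table 0x06 in the high byte, RID in the low 24 bits.
constexpr std::uint32_t MakeMethodDefToken(std::uint32_t rid)
{
    return 0x06000000u | (rid &amp; 0x00FFFFFFu);
}
</code></pre></div>
<p>For example, <strong>Method.#3</strong> would give the RID 3 and the token 0x06000003, which is the value you would look up in the MethodDef table.</p>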
<p>Feel free to look at the source code from <a href="https://github.com/chrisnas/DumpManagedMethodInfoFromSymbols">my GitHub repository</a>.</p>
<p>DbgHelp provides many more services to look into symbols but that’s all for today!</p>
]]></content:encoded></item><item><title>Vibe coding a .pdb dumper or how I became a Product Manager</title><link>https://chrisnas.github.io/posts/2025-12-08_vibe-coding-pdb-dumper/</link><pubDate>Mon, 08 Dec 2025 10:12:16 +0000</pubDate><guid>https://chrisnas.github.io/posts/2025-12-08_vibe-coding-pdb-dumper/</guid><description>Follow me Vibe Coding in Cursor during my Datadog R&amp;amp;D Week to list function symbols with their signature and more</description><content:encoded><![CDATA[<hr>
<p>During this R&amp;D week at Datadog, I wanted to implement a tool that accepts a .pdb file and generates a .sym file listing function symbols with their address, size, name with signature, and whether they are public or private. This post digs into the implementation details of using the <a href="https://learn.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/getting-started-debug-interface-access-sdk?WT.mc_id=DT-MVP-5003325">Microsoft Debug Interface Access (DIA) COM API</a> to achieve these objectives. If you want to see what my vibe coding experience in Cursor was like, read <a href="/posts/2025-12-08_vibe-coding-pdb-dumper/">this other post</a> instead.</p>
<h2 id="one-self-contained-toolplease">One self-contained tool please!</h2>
<p>I would like the tool to be self-contained but, since DIA is based on a COM server, it would require registering msdia140.dll on the machine. Not a good idea. If the dll is in the same folder as the tool, one could “emulate” the magic done by <strong>CoCreateInstance</strong> to get an instance of <strong>IDiaDataSource</strong> (more on this interface soon) in four steps:</p>
<ul>
<li>Call <strong>LoadLibrary</strong> to load the dll in memory</li>
<li>Call <strong>GetProcAddress</strong> to get the <strong>DllGetClassObject</strong> implementation</li>
<li>Call this function to get the <strong>IClassFactory</strong> implementation</li>
<li>Call its <strong>CreateInstance</strong> method to get an object implementing <strong>IDiaDataSource</strong></li>
</ul>
<p>Here is the corresponding code (without error checking for readability):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Create DIA data source without registration using DLL loading
</span></span></span><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="n">PdbSymbolExtractor</span><span class="o">::</span><span class="n">NoRegCoCreate</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">wstring</span><span class="o">&amp;</span> <span class="n">dllPath</span><span class="p">,</span> <span class="n">REFCLSID</span> <span class="n">rclsid</span><span class="p">,</span> <span class="n">REFIID</span> <span class="n">riid</span><span class="p">,</span> <span class="kt">void</span><span class="o">**</span> <span class="n">ppv</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">HMODULE</span> <span class="n">hDll</span> <span class="o">=</span> <span class="n">LoadLibraryW</span><span class="p">(</span><span class="n">dllPath</span><span class="p">.</span><span class="n">c_str</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">typedef</span> <span class="nf">HRESULT</span><span class="p">(</span><span class="kr">__stdcall</span><span class="o">*</span> <span class="n">DllGetClassObjectFunc</span><span class="p">)(</span><span class="n">REFCLSID</span><span class="p">,</span> <span class="n">REFIID</span><span class="p">,</span> <span class="n">LPVOID</span><span class="o">*</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">DllGetClassObjectFunc</span> <span class="n">pDllGetClassObject</span> <span class="o">=</span> <span class="p">(</span><span class="n">DllGetClassObjectFunc</span><span class="p">)</span><span class="n">GetProcAddress</span><span class="p">(</span><span class="n">hDll</span><span class="p">,</span> <span class="s">&#34;DllGetClassObject&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IClassFactory</span><span class="o">&gt;</span> <span class="n">pClassFactory</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">pDllGetClassObject</span><span class="p">(</span><span class="n">rclsid</span><span class="p">,</span> <span class="n">IID_IClassFactory</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pClassFactory</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">pClassFactory</span><span class="o">-&gt;</span><span class="n">CreateInstance</span><span class="p">(</span><span class="nb">NULL</span><span class="p">,</span> <span class="n">riid</span><span class="p">,</span> <span class="n">ppv</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Note: We intentionally don&#39;t call FreeLibrary here because the DLL needs to stay loaded
</span></span></span><span class="line"><span class="cl">    <span class="c1">// The COM object references will keep it alive
</span></span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">hr</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This function is called with the path of msdia140.dll, the CLSID of the DIA data source, and the IID of the expected <strong>IDiaDataSource</strong> interface:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-objectivec" data-lang="objectivec"><span class="line"><span class="cl"><span class="c1">// Create DIA data source without registration
</span></span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">NoRegCoCreate</span><span class="p">(</span><span class="n">dllPath</span><span class="p">,</span> <span class="n">CLSID_DiaSource</span><span class="p">,</span> <span class="n">__uuidof</span><span class="p">(</span><span class="n">IDiaDataSource</span><span class="p">),</span> <span class="p">(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">_pDiaDataSource</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>But still, I don’t want to have two binaries!</p>
<p>The trick is to embed msdia140.dll inside the tool as a Windows resource. In the .rc file, add an RCDATA entry that points to the dll:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">IDR_MSDIA_DLL</span>      <span class="n">RCDATA</span>      <span class="s">&#34;x64</span><span class="se">\\</span><span class="s">Release</span><span class="se">\\</span><span class="s">msdia140.dll&#34;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You should see it in the Resource View in Visual Studio:</p>
<p><img loading="lazy" src="1_-GYj2ovADruJxXqRMF930A.png"></p>
<p>The code that extracts it to a file on disk is straightforward (error checking has been removed for readability):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-php" data-lang="php"><span class="line"><span class="cl"><span class="c1">// Extract embedded msdia140.dll from resources
</span></span></span><span class="line"><span class="cl"><span class="nx">bool</span> <span class="nx">PdbSymbolExtractor</span><span class="o">::</span><span class="na">ExtractEmbeddedDll</span><span class="p">(</span><span class="k">const</span> <span class="no">std</span><span class="o">::</span><span class="na">wstring</span><span class="o">&amp;</span> <span class="nx">outputPath</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// Find the resource
</span></span></span><span class="line"><span class="cl">    <span class="nx">HMODULE</span> <span class="nx">hModule</span> <span class="o">=</span> <span class="nx">GetModuleHandle</span><span class="p">(</span><span class="k">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="nx">HRSRC</span> <span class="nx">hResource</span> <span class="o">=</span> <span class="nx">FindResource</span><span class="p">(</span><span class="nx">hModule</span><span class="p">,</span> <span class="nx">MAKEINTRESOURCE</span><span class="p">(</span><span class="nx">IDR_MSDIA_DLL</span><span class="p">),</span> <span class="nx">RT_RCDATA</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Load the resource
</span></span></span><span class="line"><span class="cl">    <span class="nx">HGLOBAL</span> <span class="nx">hLoadedResource</span> <span class="o">=</span> <span class="nx">LoadResource</span><span class="p">(</span><span class="nx">hModule</span><span class="p">,</span> <span class="nx">hResource</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Lock the resource to get a pointer to the data
</span></span></span><span class="line"><span class="cl">    <span class="nx">LPVOID</span> <span class="nx">pResourceData</span> <span class="o">=</span> <span class="nx">LockResource</span><span class="p">(</span><span class="nx">hLoadedResource</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Get the size of the resource
</span></span></span><span class="line"><span class="cl">    <span class="nx">DWORD</span> <span class="nx">resourceSize</span> <span class="o">=</span> <span class="nx">SizeofResource</span><span class="p">(</span><span class="nx">hModule</span><span class="p">,</span> <span class="nx">hResource</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Write the DLL to disk
</span></span></span><span class="line"><span class="cl">    <span class="nx">std</span><span class="o">::</span><span class="na">ofstream</span> <span class="nx">outFile</span><span class="p">(</span><span class="nx">outputPath</span><span class="p">,</span> <span class="nx">std</span><span class="o">::</span><span class="na">ios</span><span class="o">::</span><span class="na">binary</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="nx">outFile</span><span class="o">.</span><span class="nx">write</span><span class="p">(</span><span class="nx">static_cast</span><span class="o">&lt;</span><span class="k">const</span> <span class="no">char</span><span class="o">*&gt;</span><span class="p">(</span><span class="nx">pResourceData</span><span class="p">),</span> <span class="nx">resourceSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="nx">outFile</span><span class="o">.</span><span class="nx">close</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="k">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="where-are-my-functionsymbols">Where are my function symbols?</h2>
<p>After calling <strong>NoRegCoCreate()</strong>, <strong>_pDiaDataSource</strong> stores a reference to the entry point into the DIA APIs. Here are the steps to follow before being able to list the symbols:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="n">PdbSymbolExtractor</span><span class="o">::</span><span class="n">ExtractSymbolsFromPdb</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">wstring</span><span class="o">&amp;</span> <span class="n">pdbPath</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">FunctionSymbol</span><span class="o">&gt;&amp;</span> <span class="n">symbols</span><span class="p">)</span> 
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// Load the PDB file
</span></span></span><span class="line"><span class="cl">    <span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">_pDiaDataSource</span><span class="o">-&gt;</span><span class="n">loadDataFromPdb</span><span class="p">(</span><span class="n">pdbPath</span><span class="p">.</span><span class="n">c_str</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Open a session
</span></span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaSession</span><span class="o">&gt;</span> <span class="n">pSession</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">_pDiaDataSource</span><span class="o">-&gt;</span><span class="n">openSession</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pSession</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Get the global scope
</span></span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaSymbol</span><span class="o">&gt;</span> <span class="n">pGlobal</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">pSession</span><span class="o">-&gt;</span><span class="n">get_globalScope</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pGlobal</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now that you have the global scope of the symbols, you can ask for an enumerator over the type of symbols you are interested in, <strong>SymTagFunction</strong> in my case:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Enumerate all function symbols
</span></span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaEnumSymbols</span><span class="o">&gt;</span> <span class="n">pEnumSymbols</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">pGlobal</span><span class="o">-&gt;</span><span class="n">findChildren</span><span class="p">(</span><span class="n">SymTagFunction</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">nsNone</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pEnumSymbols</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">LONG</span> <span class="n">count</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">pEnumSymbols</span><span class="o">-&gt;</span><span class="n">get_Count</span><span class="p">(</span><span class="o">&amp;</span><span class="n">count</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">wcout</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;Found &#34;</span> <span class="o">&lt;&lt;</span> <span class="n">count</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34; function symbols&#34;</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>pEnumSymbols</strong> enumerator allows you to loop over each <strong>SymTagFunction</strong> symbol and get its name:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">while</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">pEnumSymbols</span><span class="o">-&gt;</span><span class="n">Next</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pSymbol</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">celt</span><span class="p">))</span> <span class="o">&amp;&amp;</span> <span class="n">celt</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">FunctionSymbol</span> <span class="n">func</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// Get function name
</span></span></span><span class="line"><span class="cl">        <span class="n">BSTR</span> <span class="n">bstrName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">pSymbol</span><span class="o">-&gt;</span><span class="n">get_name</span><span class="p">(</span><span class="o">&amp;</span><span class="n">bstrName</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">func</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">bstrName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">SysFreeString</span><span class="p">(</span><span class="n">bstrName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that the details of each symbol are stored in a <strong>FunctionSymbol</strong> instance:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">FunctionSymbol</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">signature</span><span class="p">;</span> <span class="c1">// Function signature (parameters only, no return type)
</span></span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">rva</span><span class="p">;</span>              <span class="c1">// Relative Virtual Address
</span></span></span><span class="line"><span class="cl">    <span class="n">ULONGLONG</span> <span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">isPublic</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The rest of the <strong>while()</strong> loop fills in the remaining fields:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Get function signature (parameters only, no return type)
</span></span></span><span class="line"><span class="cl">    <span class="n">func</span><span class="p">.</span><span class="n">signature</span> <span class="o">=</span> <span class="n">ExtractFunctionSignature</span><span class="p">(</span><span class="n">pSymbol</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Get relative virtual address
</span></span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">rva</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pSymbol</span><span class="o">-&gt;</span><span class="n">get_relativeVirtualAddress</span><span class="p">(</span><span class="o">&amp;</span><span class="n">rva</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">func</span><span class="p">.</span><span class="n">rva</span> <span class="o">=</span> <span class="n">rva</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Get function length
</span></span></span><span class="line"><span class="cl">    <span class="n">ULONGLONG</span> <span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pSymbol</span><span class="o">-&gt;</span><span class="n">get_length</span><span class="p">(</span><span class="o">&amp;</span><span class="n">length</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">func</span><span class="p">.</span><span class="n">length</span> <span class="o">=</span> <span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Determine if function is public or private
</span></span></span><span class="line"><span class="cl">    <span class="c1">// Check access level - default to private
</span></span></span><span class="line"><span class="cl">    <span class="n">func</span><span class="p">.</span><span class="n">isPublic</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">access</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pSymbol</span><span class="o">-&gt;</span><span class="n">get_access</span><span class="p">(</span><span class="o">&amp;</span><span class="n">access</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">func</span><span class="p">.</span><span class="n">isPublic</span> <span class="o">=</span> <span class="p">(</span><span class="n">access</span> <span class="o">==</span> <span class="n">CV_public</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">pSymbol</span><span class="p">.</span><span class="n">Release</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I did not have time to experiment further with the private/public state. A next step would be to enumerate the <strong>SymTagPublicSymbol</strong> or <strong>SymTagExport</strong> symbols, which could be considered public.</p>
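<p>One way to try that idea is sketched below. It assumes the RVAs of the <strong>SymTagPublicSymbol</strong> (or <strong>SymTagExport</strong>) entries have already been collected into a set; that enumeration is not shown. <strong>MarkPublicByRva</strong> and the trimmed-down <strong>FunctionSymbol</strong> are hypothetical names used for illustration, not code from the repository, and whether an RVA match really corresponds to "public" is an untested assumption:</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Trimmed-down version of the article's FunctionSymbol (illustration only).
struct FunctionSymbol {
    std::wstring name;
    std::uint32_t rva = 0;   // DWORD in the original code
    bool isPublic = false;
};

// Hypothetical helper: flag a function as public when its RVA also appears
// among the RVAs gathered from SymTagPublicSymbol (or SymTagExport) entries.
// Returns the number of functions flagged as public.
std::size_t MarkPublicByRva(std::vector<FunctionSymbol>& functions,
                            const std::unordered_set<std::uint32_t>& publicRvas) {
    std::size_t count = 0;
    for (auto& func : functions) {
        func.isPublic = publicRvas.count(func.rva) != 0;
        if (func.isPublic) {
            ++count;
        }
    }
    return count;
}
```

<p>Matching on the RVA avoids comparing names, which may be decorated differently in the public symbol table.</p>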
<h2 id="better-with-a-signature">Better with a signature</h2>
<p>The final step is to figure out the signature of each function. This is where the generic design of DIA can be confusing, because so many things are represented by <strong>IDiaSymbol</strong>: a symbol, a function, the type of a function, or the type of a parameter…</p>
<p>So, the type of the function is retrieved as an <strong>IDiaSymbol</strong> by calling <strong>get_type()</strong> on the function symbol. From that <strong>IDiaSymbol</strong>, <strong>findChildren()</strong> lets you iterate over the parameters:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Extract function signature (parameters only, no return type)
</span></span></span><span class="line"><span class="cl"><span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">PdbSymbolExtractor</span><span class="o">::</span><span class="n">ExtractFunctionSignature</span><span class="p">(</span><span class="n">IDiaSymbol</span><span class="o">*</span> <span class="n">pSymbol</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">pSymbol</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;()&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Get function type
</span></span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaSymbol</span><span class="o">&gt;</span> <span class="n">pFunctionType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pSymbol</span><span class="o">-&gt;</span><span class="n">get_type</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pFunctionType</span><span class="p">)</span> <span class="o">!=</span> <span class="n">S_OK</span> <span class="o">||</span> <span class="o">!</span><span class="n">pFunctionType</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;()&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Enumerate function arguments
</span></span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaEnumSymbols</span><span class="o">&gt;</span> <span class="n">pEnumArgs</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">pFunctionType</span><span class="o">-&gt;</span><span class="n">findChildren</span><span class="p">(</span><span class="n">SymTagFunctionArgType</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">nsNone</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pEnumArgs</span><span class="p">)))</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;()&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">LONG</span> <span class="n">argCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">pEnumArgs</span><span class="o">-&gt;</span><span class="n">get_Count</span><span class="p">(</span><span class="o">&amp;</span><span class="n">argCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">argCount</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;()&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now, the same <strong>Next()</strong> method is called on the enumerator to iterate over the parameters:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Build signature string
</span></span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">signature</span> <span class="o">=</span> <span class="sa">L</span><span class="s">&#34;(&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaSymbol</span><span class="o">&gt;</span> <span class="n">pArg</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">ULONG</span> <span class="n">argCelt</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">first</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">pEnumArgs</span><span class="o">-&gt;</span><span class="n">Next</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pArg</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">argCelt</span><span class="p">))</span> <span class="o">&amp;&amp;</span> <span class="n">argCelt</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">first</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">signature</span> <span class="o">+=</span> <span class="sa">L</span><span class="s">&#34;, &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">first</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// Get the argument type
</span></span></span><span class="line"><span class="cl">        <span class="n">CComPtr</span><span class="o">&lt;</span><span class="n">IDiaSymbol</span><span class="o">&gt;</span> <span class="n">pArgType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">pArg</span><span class="o">-&gt;</span><span class="n">get_type</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pArgType</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span> <span class="o">&amp;&amp;</span> <span class="n">pArgType</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">signature</span> <span class="o">+=</span> <span class="n">GetTypeName</span><span class="p">(</span><span class="n">pArgType</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">signature</span> <span class="o">+=</span> <span class="sa">L</span><span class="s">&#34;?&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">pArg</span><span class="p">.</span><span class="n">Release</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">signature</span> <span class="o">+=</span> <span class="sa">L</span><span class="s">&#34;)&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">signature</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>What remains is to get the name of the type from the <strong>IDiaSymbol</strong> returned by <strong>get_type()</strong>. If it is a custom type, call <strong>get_name()</strong> like for any other symbol. Otherwise, for basic types, call <strong>get_baseType()</strong> and <strong>get_length()</strong>, as shown in the code below:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span><span class="lnt">46
</span><span class="lnt">47
</span><span class="lnt">48
</span><span class="lnt">49
</span><span class="lnt">50
</span><span class="lnt">51
</span><span class="lnt">52
</span><span class="lnt">53
</span><span class="lnt">54
</span><span class="lnt">55
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">PdbSymbolExtractor</span><span class="o">::</span><span class="n">GetTypeName</span><span class="p">(</span><span class="n">IDiaSymbol</span><span class="o">*</span> <span class="n">pType</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">pType</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;?&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Try to get type name directly
</span></span></span><span class="line"><span class="cl">    <span class="n">BSTR</span> <span class="n">bstrTypeName</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pType</span><span class="o">-&gt;</span><span class="n">get_name</span><span class="p">(</span><span class="o">&amp;</span><span class="n">bstrTypeName</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span> <span class="o">&amp;&amp;</span> <span class="n">bstrTypeName</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">typeName</span> <span class="o">=</span> <span class="n">bstrTypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">SysFreeString</span><span class="p">(</span><span class="n">bstrTypeName</span><span class="p">);</span> <span class="c1">// free even when the name is empty
</span></span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">typeName</span><span class="p">.</span><span class="n">empty</span><span class="p">())</span> <span class="k">return</span> <span class="n">typeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// For basic types or unnamed types, try getting basic type info
</span></span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">baseType</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">ULONGLONG</span> <span class="n">length</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pType</span><span class="o">-&gt;</span><span class="n">get_baseType</span><span class="p">(</span><span class="o">&amp;</span><span class="n">baseType</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">pType</span><span class="o">-&gt;</span><span class="n">get_length</span><span class="p">(</span><span class="o">&amp;</span><span class="n">length</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// Map basic types to names
</span></span></span><span class="line"><span class="cl">        <span class="k">switch</span> <span class="p">(</span><span class="n">baseType</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btVoid</span><span class="p">:</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;void&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btChar</span><span class="p">:</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;char&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btWChar</span><span class="p">:</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;wchar_t&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btBool</span><span class="p">:</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;bool&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btInt</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btLong</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;char&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">2</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;short&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">4</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;int&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">8</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;__int64&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;int&#34;</span> <span class="o">+</span> <span class="n">std</span><span class="o">::</span><span class="n">to_wstring</span><span class="p">(</span><span class="n">length</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btUInt</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btULong</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;unsigned char&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">2</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;unsigned short&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">4</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;unsigned int&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">8</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;unsigned __int64&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;uint&#34;</span> <span class="o">+</span> <span class="n">std</span><span class="o">::</span><span class="n">to_wstring</span><span class="p">(</span><span class="n">length</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">case</span> <span class="nl">btFloat</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">4</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;float&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">length</span> <span class="o">==</span> <span class="mi">8</span><span class="p">)</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;double&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span> <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;float&#34;</span> <span class="o">+</span> <span class="n">std</span><span class="o">::</span><span class="n">to_wstring</span><span class="p">(</span><span class="n">length</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">default</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;?&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="sa">L</span><span class="s">&#34;?&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This is a “simple” implementation that does not take pointers, addresses, arrays, and more into account. For a more complete solution, I would recommend looking at the <strong>PrintType()</strong> implementation in the DIA2Dump code sample that is installed with Visual Studio.</p>
<p>I hope this will get your foot in the door of symbol parsing and make you want to dig further into DIA.</p>
<h2 id="references">References</h2>
<ul>
<li>The corresponding source code is available in <a href="https://github.com/chrisnas/VibeCoding">my GitHub repository</a>.</li>
<li>Archived Microsoft <a href="https://github.com/microsoft/microsoft-pdb/tree/master">documentation/implementation of the .pdb format</a>, including the code of a <a href="https://github.com/microsoft/microsoft-pdb/tree/master/cvdump">symbol dumper</a>.</li>
<li><a href="https://learn.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/dia2dump-sample">DIA2Dump</a> Visual Studio code sample.</li>
</ul>
]]></content:encoded></item><item><title>How to monitor .NET applications startup</title><link>https://chrisnas.github.io/posts/2025-03-13_how-to-monitor-net/</link><pubDate>Thu, 13 Mar 2025 10:50:55 +0000</pubDate><guid>https://chrisnas.github.io/posts/2025-03-13_how-to-monitor-net/</guid><description>This episode explains how to monitor the startup of a .NET application and get insights about its lock and wait contentions duration</description><content:encoded><![CDATA[<hr>
<p>In <a href="/posts/2025-01-13_measuring-the-impact-of/">the previous article</a>, I presented what is needed (i.e. listen to <strong>WaitHandleWait</strong> events) to compute lock/wait durations and call stacks for <strong>Mutex</strong>, <strong>Semaphore</strong>, <strong>SemaphoreSlim</strong>, <strong>Manual</strong>/<strong>AutoResetEvent</strong>, <strong>ManualResetEventSlim</strong>, <strong>ReaderWriterLockSlim</strong> .NET synchronization constructs for a running process.</p>
<p>However, since the application is already running, some JIT-related events are missing, and some frames of the call stacks cannot be symbolized. Also, it would be great to monitor an application’s startup to see if it could be faster.</p>
<p>This post details how to monitor a .NET application from the very beginning of its life and the issues you might face along the way.</p>
<h2 id="preparing-a-newnet-process-to-be-monitored">Preparing a new .NET process to be monitored</h2>
<p>From .NET 5, the <strong>dotnet-trace</strong> CLI tool allows you to <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/dotnet-trace-instructions.md#using-dotnet-trace-to-launch-a-child-process-and-trace-it-from-startup">pass a command line to execute and trace it from startup</a>. In a <a href="https://medium.com/@ocoanet/tracing-allocations-with-eventpipe-part-3-tracing-without-dotnet-trace-7244bdb86e03">very interesting article</a>, Olivier Coanet presented the gory details about how to tell the .NET runtime to start an application in a pseudo-suspended mode as shown in the following diagram:</p>
<p><img loading="lazy" src="/posts/2025-03-13_how-to-monitor-net/1_9J_2x2R0n0nkuRTl6MDzZw.png"></p>
<p>The first step is to create a <strong>ReversedDiagnosticsServer</strong> instance with a specific port (i.e. <strong>dotnet-wait_1234</strong> in the diagram). Next, the process to monitor is spawned with the <strong>DOTNET_DiagnosticPorts</strong> environment variable set to the same port (i.e. <strong>dotnet-wait_1234</strong>). See the <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md#diagnostic-ports">Diagnostic Ports documentation</a> for more details about the <strong>DOTNET_DiagnosticPorts</strong> environment variable. The .NET runtime in the new process will listen to this port and… wait.</p>
<p>When the tool is ready, it sends a resume command via a <strong>DiagnosticsClient</strong>: from that point in time, the CLR executes the normal flow of actions to run the application and… you will receive all events without missing one!</p>
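<p>As a rough sketch of these steps, and assuming the public <strong>DiagnosticsClientConnector</strong> helper of the <strong>Microsoft.Diagnostics.NETCore.Client</strong> nuget is used to listen on the port (this is not necessarily how <strong>dotnet-wait</strong> does it), the flow could look like the following. The class name, the method name, the port name, the <strong>,listen</strong> suffix and the 30-second timeout are assumptions made for the illustration:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">// Sketch only: requires the Microsoft.Diagnostics.NETCore.Client nuget package.
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Diagnostics.NETCore.Client;

public static class StartupMonitor  // hypothetical helper
{
    public static async Task&lt;Process&gt; SpawnAndResumeAsync(string pathName, string arguments)
    {
        // hypothetical port name: any name unique on the machine should work
        string port = $&#34;dotnet-wait_{Environment.ProcessId}&#34;;

        // the runtime of the child process reads the port name from its environment
        var psi = new ProcessStartInfo(pathName);
        if (!string.IsNullOrEmpty(arguments))
        {
            psi.Arguments = arguments;
        }
        psi.EnvironmentVariables[&#34;DOTNET_DiagnosticPorts&#34;] = port;
        psi.UseShellExecute = false;

        // give up if the child runtime does not show up in time (arbitrary value)
        using var cancellation = new CancellationTokenSource(TimeSpan.FromSeconds(30));

        // assumption: the &#34;,listen&#34; suffix asks the tool side to act as the server
        Task&lt;DiagnosticsClientConnector&gt; pending =
            DiagnosticsClientConnector.FromDiagnosticPort($&#34;{port},listen&#34;, cancellation.Token);

        Process process = Process.Start(psi);

        // completes once the suspended runtime has connected to the port
        await using DiagnosticsClientConnector connector = await pending;

        // an EventPipe session would be started here, before resuming,
        // so that no startup event is missed

        // let the runtime continue its normal startup
        connector.Instance.ResumeRuntime();
        return process;
    }
}
</code></pre><p>Note that the connector is disposed when the method returns, so a real tool would keep it alive for as long as the session is needed.</p>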
<h2 id="get-my-command-lineplease">Get my command line please</h2>
<p>Following the <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/dotnet-trace-instructions.md#using-dotnet-trace-to-launch-a-child-process-and-trace-it-from-startup"><strong>dotnet-trace</strong> example</a>, my <a href="https://www.nuget.org/packages/dotnet-wait">dotnet-wait</a> tool accepts the command line of the child process as the final arguments that follow the <strong>--</strong> separator. For example, <strong>dotnet-wait -- dotnet foo.dll</strong> will start the program in foo.dll by using dotnet.exe. I’m reusing <a href="https://github.com/dotnet/diagnostics/blob/main/src/Tools/Common/ReversedServerHelpers/ReversedServerHelpers.cs#L37">the code in ReversedServerHelpers.cs</a> to deal with arguments containing spaces:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">current</span> <span class="p">==</span> <span class="s">&#34;--&#34;</span><span class="p">)</span>  <span class="c1">// this is supposed to be the last one</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">i</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="p">&lt;</span> <span class="n">args</span><span class="p">.</span><span class="n">Length</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">parameters</span><span class="p">.</span><span class="n">pathName</span> <span class="p">=</span> <span class="n">args</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// use the remaining arguments as the arguments for the child app to spawn</span>
</span></span><span class="line"><span class="cl">        <span class="n">i</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="p">&lt;</span> <span class="n">args</span><span class="p">.</span><span class="n">Length</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">parameters</span><span class="p">.</span><span class="n">arguments</span> <span class="p">=</span> <span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="p">=</span> <span class="n">i</span><span class="p">;</span> <span class="n">j</span> <span class="p">&lt;</span> <span class="n">args</span><span class="p">.</span><span class="n">Length</span><span class="p">;</span> <span class="n">j</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(</span><span class="n">args</span><span class="p">[</span><span class="n">j</span><span class="p">].</span><span class="n">Contains</span><span class="p">(</span><span class="sc">&#39; &#39;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="n">parameters</span><span class="p">.</span><span class="n">arguments</span> <span class="p">+=</span> <span class="s">$&#34;\&#34;</span><span class="p">{</span><span class="n">args</span><span class="p">[</span><span class="n">j</span><span class="p">].</span><span class="n">Replace</span><span class="p">(</span><span class="s">&#34;\&#34;&#34;</span><span class="p">,</span> <span class="s">&#34;\\\&#34;&#34;</span><span class="p">)}</span><span class="err">\</span><span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">                <span class="k">else</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="n">parameters</span><span class="p">.</span><span class="n">arguments</span> <span class="p">+=</span> <span class="n">args</span><span class="p">[</span><span class="n">j</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(</span><span class="n">j</span> <span class="p">!=</span> <span class="n">args</span><span class="p">.</span><span class="n">Length</span> <span class="p">-</span> <span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="n">parameters</span><span class="p">.</span><span class="n">arguments</span> <span class="p">+=</span> <span class="s">&#34; &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// no need to look for more arguments</span>
</span></span><span class="line"><span class="cl">        <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">$&#34;Missing path name value...&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The code to spawn the child process is simple:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// start the monitored app</span>
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">psi</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ProcessStartInfo</span><span class="p">(</span><span class="n">pathName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="kt">string</span><span class="p">.</span><span class="n">IsNullOrEmpty</span><span class="p">(</span><span class="n">arguments</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">psi</span><span class="p">.</span><span class="n">Arguments</span> <span class="p">=</span> <span class="n">arguments</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="n">psi</span><span class="p">.</span><span class="n">EnvironmentVariables</span><span class="p">[</span><span class="s">&#34;DOTNET_DiagnosticPorts&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">port</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">psi</span><span class="p">.</span><span class="n">UseShellExecute</span> <span class="p">=</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">process</span> <span class="p">=</span> <span class="n">System</span><span class="p">.</span><span class="n">Diagnostics</span><span class="p">.</span><span class="n">Process</span><span class="p">.</span><span class="n">Start</span><span class="p">(</span><span class="n">psi</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Here is an example with the following command line:</p>
<pre tabindex="0"><code>-- dotnet &#34;C:\CommandLineTest.dll&#34; one two &#34;t h r e e&#34; four &#39;five six&#39;
</code></pre><p>that generates the output (the test application is just listing its arguments):</p>
<pre tabindex="0"><code>dotnet-wait v1.0.0.0 - List wait duration
by Christophe Nasarre

Press ENTER to exit...
6 arguments
   1 | one
   2 | two
   3 | t h r e e
   4 | four
   5 | &#39;five
   6 | six&#39;
</code></pre><p>This test reminded me to never use single quotes in command lines :^)</p>
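<p>To make the quoting rule concrete, here is a minimal, self-contained sketch of the same logic as a standalone helper. <strong>ChildCommandLine</strong> and <strong>BuildArguments</strong> are hypothetical names, not part of the tool. It also assumes a shell such as cmd.exe, which does not treat single quotes as delimiters, so the tool already receives <strong>&#39;five</strong> and <strong>six&#39;</strong> as two separate arguments:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Text;

public static class ChildCommandLine  // hypothetical helper
{
    // Joins args[start..] into a single command line: an argument that
    // contains a space is wrapped in double quotes and its embedded
    // double quotes are escaped with a backslash.
    public static string BuildArguments(string[] args, int start)
    {
        var builder = new StringBuilder();
        for (int j = start; j &lt; args.Length; j++)
        {
            // separate arguments with a single space (no trailing one)
            if (j &gt; start)
            {
                builder.Append(&#39; &#39;);
            }

            if (args[j].Contains(&#39; &#39;))
            {
                builder.Append(&#39;&#34;&#39;).Append(args[j].Replace(&#34;\&#34;&#34;, &#34;\\\&#34;&#34;)).Append(&#39;&#34;&#39;);
            }
            else
            {
                builder.Append(args[j]);
            }
        }

        return builder.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // what the tool receives for the example command line
        string[] args =
        {
            &#34;--&#34;, &#34;dotnet&#34;, @&#34;C:\CommandLineTest.dll&#34;,
            &#34;one&#34;, &#34;two&#34;, &#34;t h r e e&#34;, &#34;four&#34;, &#34;&#39;five&#34;, &#34;six&#39;&#34;
        };

        // args[1] is the path name, the rest goes to the child process
        Console.WriteLine(ChildCommandLine.BuildArguments(args, 2));
        // prints: C:\CommandLineTest.dll one two &#34;t h r e e&#34; four &#39;five six&#39;
    }
}
</code></pre><p>If rebuilding a single string is not required, <strong>ProcessStartInfo.ArgumentList</strong> is an alternative that lets the framework handle the escaping of each argument.</p>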
<h2 id="its-myconsole">It’s my console!</h2>
<p>Once I implemented these steps, I immediately faced a very simple problem: my <strong>dotnet-wait</strong> tool and the test application are both console applications, so they share the same console for both input and output. For example, both wait for the ENTER key: the tool to stop and the test application to start. Too bad for me, because the tool will stop as soon as the application starts…</p>
<p>Going back in time in my Windows memories, I remembered that the Win32 <a href="https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw?WT.mc_id=DT-MVP-5003325">CreateProcess</a> API accepts <a href="https://learn.microsoft.com/en-us/windows/win32/procthread/process-creation-flags?WT.mc_id=DT-MVP-5003325"><strong>CREATE_NEW_CONSOLE</strong> as creation flag</a> to automagically start the child process into its own new console. Unfortunately, it is not possible to pass this flag in .NET; maybe a limitation due to Linux support.</p>
<p>One simple solution could be to redirect the output of the tool or the application to a file: that would avoid mixing them in the console. Note that, by default, <strong>dotnet-trace</strong> discards output from the child process (by <a href="https://github.com/dotnet/diagnostics/blob/main/src/Tools/Common/ReversedServerHelpers/ReversedServerHelpers.cs#L95">redirecting</a> its standard streams with <strong>RedirectStandardOutput</strong>, <strong>RedirectStandardError</strong> and <strong>RedirectStandardInput</strong>, and by <a href="https://github.com/dotnet/diagnostics/blob/main/src/Tools/Common/ReversedServerHelpers/ReversedServerHelpers.cs#L69">ignoring the error and output streams</a>) unless you pass <strong>--show-child-io</strong> on the command line. In that case, <strong>dotnet-trace</strong> itself produces no output.</p>
<p>I decided to do the opposite for <strong>dotnet-wait</strong>: by default, you also get the child output but you can redirect the output of the tool to a file with <strong>-o &lt;output path name&gt;</strong>. Still, this does not solve the input problem when both processes wait for the same key.</p>
<p>If you remember the interactions between the tool and the monitored application, the latter is suspended until <strong>DiagnosticsClient::ResumeRuntime</strong> is called. So, why not start the tool that spawns the application in one console and, in a new console, another instance of the tool that will resume the application? This is exactly what my friend <a href="https://x.com/KooKiz">Kevin Gosse</a> imagined and how <strong>dotnet-wait</strong> works.</p>
<p><img loading="lazy" src="/posts/2025-03-13_how-to-monitor-net/1_u9iVEg9L5Z_DVpn6jtcM2w.png"></p>
<p>After the timeout that you give to <strong>diagnosticsServer.AcceptAsync(cancellation.Token)</strong> has elapsed, the runtime in the child process will display the following message:</p>
<pre tabindex="0"><code>The runtime has been configured to pause during startup and is awaiting a Diagnostics IPC ResumeStartup command from a Diagnostic Port.
DOTNET_DiagnosticPorts=&#34;dotnet-wait_34296&#34;
DOTNET_DefaultDiagnosticPortSuspend=0
</code></pre><p>Sending this resume command is exactly what the <strong>-r 34296</strong> parameter does when you pass it to the second instance of the tool!</p>
<p>You can now install <strong>dotnet-wait</strong> and monitor the lock and wait contentions of your .NET 9+ applications.</p>
]]></content:encoded></item><item><title>Measuring the impact of locks and waits on latency in your .NET apps</title><link>https://chrisnas.github.io/posts/2025-01-13_measuring-the-impact-of/</link><pubDate>Mon, 13 Jan 2025 15:31:03 +0000</pubDate><guid>https://chrisnas.github.io/posts/2025-01-13_measuring-the-impact-of/</guid><description>Monitor mutex, semaphore and event wait duration</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>In an <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">old post</a>, I detailed how to use <strong>ContentionStart</strong> and <strong>ContentionStop</strong> events to measure the duration of lock contentions in a .NET application. In a <a href="https://github.com/DataDog/dd-trace-dotnet/issues/5814">.NET 9 pull request</a>, a former Criteo colleague, <a href="https://www.linkedin.com/in/gregoire-verdier">Grégoire Verdier</a>, added new events to be notified when a wait similar to a lock contention happens for Mutex, Semaphore, Manual/AutoResetEvent. Read <a href="https://techblog.criteo.com/a-perfview-alternative-in-webassembly-f6833820b699">his post</a> for more details about what he was trying to investigate.</p>
<p>With asynchronous and multi-threaded algorithms, it is essential to detect unexpected waits/locks in our applications. This post shows you how to leverage these events to measure the duration of these waits and get the call stack when the wait started:</p>
<p><img loading="lazy" src="/posts/2025-01-13_measuring-the-impact-of/1_OTO7qWO5aYvNXprhaPvRoA.png"></p>
<h2 id="new-waithandlewait-events">New WaitHandleWait events</h2>
<p>These new events are emitted by the <strong>Microsoft-Windows-DotNETRuntime</strong> CLR provider when you enable the <strong>WaitHandle</strong> (= 0x40000000000) keyword with <strong>Verbose</strong> verbosity. Each time <strong>WaitOne</strong> is called on a waitable object and this object is already owned, a <strong>WaitHandleWaitStart</strong> event is emitted. When the object is released, a <strong>WaitHandleWaitStop</strong> event is emitted.</p>
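<p>For illustration, the provider name, keyword and verbosity above could be turned into an EventPipe provider configuration as sketched below. This assumes the <strong>Microsoft.Diagnostics.NETCore.Client</strong> nuget; the <strong>WaitHandleProviders</strong> class and its members are hypothetical names:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">// Sketch only: requires the Microsoft.Diagnostics.NETCore.Client nuget package.
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using Microsoft.Diagnostics.NETCore.Client;

public static class WaitHandleProviders  // hypothetical helper
{
    // value of the WaitHandle keyword given above (bit 42)
    public const long WaitHandleKeyword = 0x40000000000;

    public static List&lt;EventPipeProvider&gt; Create()
    {
        return new List&lt;EventPipeProvider&gt;
        {
            // CLR provider + WaitHandle keyword + Verbose verbosity
            new EventPipeProvider(
                &#34;Microsoft-Windows-DotNETRuntime&#34;,
                EventLevel.Verbose,
                WaitHandleKeyword)
        };
    }
}
</code></pre><p>The resulting list can then be passed to <strong>DiagnosticsClient.StartEventPipeSession</strong> to open the session.</p>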
<p>For example, the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">static</span> <span class="n">Mutex</span> <span class="n">mutex</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Mutex</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">static</span> <span class="k">void</span> <span class="n">Main</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">owningThread</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Thread</span><span class="p">(</span><span class="n">OwningThread</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">owningThread</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">mutexThread</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Thread</span><span class="p">(</span><span class="n">MutexThread</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">mutexThread</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">owningThread</span><span class="p">.</span><span class="n">Join</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">mutexThread</span><span class="p">.</span><span class="n">Join</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">static</span> <span class="k">void</span> <span class="n">OwningThread</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;    [{GetCurrentThreadId(), 8}] Start to hold resources&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;___________________________________________&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">mutex</span><span class="p">.</span><span class="n">WaitOne</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">Thread</span><span class="p">.</span><span class="n">Sleep</span><span class="p">(</span><span class="m">3000</span><span class="p">);</span>  <span class="c1">// the wait should last ~3 seconds</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;    Release resources&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">mutex</span><span class="p">.</span><span class="n">ReleaseMutex</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">static</span> <span class="k">void</span> <span class="n">MutexThread</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;    [{GetCurrentThreadId(), 8}] waiting for Mutex...&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">mutex</span><span class="p">.</span><span class="n">WaitOne</span><span class="p">();</span>  <span class="c1">// events are emitted in the implementation when a contention happens</span>
</span></span><span class="line"><span class="cl">    <span class="n">mutex</span><span class="p">.</span><span class="n">ReleaseMutex</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;    &lt;-- Mutex&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>generates a pair of Start and Stop events:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="m">125980</span> <span class="p">|</span> <span class="m">00000000</span><span class="p">-</span><span class="m">0000</span><span class="p">-</span><span class="m">0000</span><span class="p">-</span><span class="m">0000</span><span class="p">-</span><span class="m">000000000000</span> <span class="p">&gt;</span> <span class="k">event</span> <span class="m">301</span> <span class="n">__</span> <span class="p">[</span> <span class="m">1</span><span class="p">|</span> <span class="n">Start</span><span class="p">]</span> <span class="n">WaitHandleWait</span><span class="p">/</span><span class="n">Start</span>
</span></span><span class="line"><span class="cl"><span class="m">125980</span> <span class="p">|</span> <span class="m">00000000</span><span class="p">-</span><span class="m">0000</span><span class="p">-</span><span class="m">0000</span><span class="p">-</span><span class="m">0000</span><span class="p">-</span><span class="m">000000000000</span> <span class="p">&gt;</span> <span class="k">event</span> <span class="m">302</span> <span class="n">__</span> <span class="p">[</span> <span class="m">2</span><span class="p">|</span>  <span class="n">Stop</span><span class="p">]</span> <span class="n">WaitHandleWait</span><span class="p">/</span><span class="n">Stop</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>There is no associated activity ID, so you have to rely on the fact that the same waiter thread (125980 in the previous example) emits both events.</p>
<h2 id="listening-to-the-new-waitevents">Listening to the new Wait events</h2>
<p><a href="/posts/2024-11-13_implementing-dotnet-http-to/">As usual</a>, you should rely on the <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent/">TraceEvent nuget</a> to start an EventPipe session with an already running .NET application. The latest version already contains the definition of the keyword:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">keywords</span> <span class="p">|=</span> <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">WaitHandle</span><span class="p">;</span> <span class="c1">// .NET 9 WaitHandle kind of contention</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>and the C# events for Start and Stop:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">Clr</span><span class="p">.</span><span class="n">WaitHandleWaitStart</span> <span class="p">+=</span> <span class="n">OnWaitHandleWaitStart</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">Clr</span><span class="p">.</span><span class="n">WaitHandleWaitStop</span> <span class="p">+=</span> <span class="n">OnWaitHandleWaitStop</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The handler’s implementation is straightforward. The start of the wait is recorded for the current thread:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnWaitHandleWaitStart</span><span class="p">(</span><span class="n">WaitHandleWaitStartTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// get the contention info for the current thread</span>
</span></span><span class="line"><span class="cl">    <span class="n">ContentionInfo</span> <span class="n">info</span> <span class="p">=</span> <span class="n">_contentionStore</span><span class="p">.</span><span class="n">GetContentionInfo</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ThreadID</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">info</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// keep track of the wait start</span>
</span></span><span class="line"><span class="cl">    <span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">TimeStampRelativeMSec</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>When the wait ends, the duration is computed from the recorded wait start because the payload does not provide it, <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/ClrEtwAll.man#L1788">unlike for ContentionStop</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnWaitHandleWaitStop</span><span class="p">(</span><span class="n">WaitHandleWaitStopTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ContentionInfo</span> <span class="n">info</span> <span class="p">=</span> <span class="n">_contentionStore</span><span class="p">.</span><span class="n">GetContentionInfo</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ThreadID</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">info</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// unlucky case when we start to listen just after the WaitHandleWaitStart event</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Too bad the duration is not provided in the payload like in ContentionStop...</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">contentionDurationMSec</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">TimeStampRelativeMSec</span> <span class="p">-</span> <span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">duration</span> <span class="p">=</span> <span class="n">TimeSpan</span><span class="p">.</span><span class="n">FromMilliseconds</span><span class="p">(</span><span class="n">contentionDurationMSec</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;{data.ThreadID,7} | {duration.TotalMilliseconds} ms&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This is nice but it would be more useful if we could get the call stack of long waits.</p>
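<p>To decide which waits count as long, the start/stop pairing done by the two handlers can be isolated in a small helper. The following is only a sketch: the <strong>WaitTracker</strong> class, its members and the threshold are hypothetical names that are not part of the tool, and it assumes that events are delivered on a single thread, so no locking is done.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical helper: remembers when a wait started for each (process, thread) pair
public sealed class WaitTracker
{
    private readonly Dictionary&lt;(int Pid, int Tid), double&gt; _startsMSec = new Dictionary&lt;(int Pid, int Tid), double&gt;();
    private readonly double _thresholdMSec;

    public WaitTracker(double thresholdMSec)
    {
        _thresholdMSec = thresholdMSec;
    }

    // to be called from the WaitHandleWaitStart handler
    public void OnStart(int pid, int tid, double timestampMSec)
    {
        _startsMSec[(pid, tid)] = timestampMSec;
    }

    // to be called from the WaitHandleWaitStop handler;
    // returns a duration only for waits that reach the threshold
    public TimeSpan? OnStop(int pid, int tid, double timestampMSec)
    {
        // Stop without Start: the session began in the middle of a wait
        if (!_startsMSec.Remove((pid, tid), out var startMSec))
            return null;

        var durationMSec = timestampMSec - startMSec;
        if (durationMSec &lt; _thresholdMSec)
            return null;

        return TimeSpan.FromMilliseconds(durationMSec);
    }
}
</code></pre></div>
<p>With a threshold of 10 ms, a start recorded at 100 ms followed by a stop at 150 ms yields a 50 ms duration, while a stop that has no matching start yields nothing.</p>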
<h2 id="call-stacks-with-eventpipe">Call stacks with EventPipe</h2>
<p>In a <a href="https://techblog.criteo.com/build-your-own-net-memory-profiler-in-c-call-stacks-2-2-1-f67b440a8cc">previous post</a>, I explained that it is possible to get the call stack when an event is emitted thanks to the <a href="https://learn.microsoft.com/en-us/dotnet/framework/performance/stack-etw-event?WT.mc_id=DT-MVP-5003325"><strong>ClrStackWalk</strong> event</a> that follows the event you are interested in. Unfortunately, this is no longer the case for .NET 5+, which uses EventPipe instead of ETW.</p>
<p>As <a href="https://x.com/ocoanet">Olivier Coanet</a> presents in his <a href="https://medium.com/@ocoanet/tracing-allocations-with-eventpipe-part-2-reading-call-stacks-without-tracelog-4b0bfe4592aa">post</a>, you can get the call stack as an array of addresses from the hidden event record that is mapped by the <strong>TraceEvent</strong> parameter passed to each event handler. This <a href="https://learn.microsoft.com/en-us/windows/win32/api/evntcons/ns-evntcons-event_record?WT.mc_id=DT-MVP-5003325"><strong>EVENT_RECORD</strong></a> structure contains an <strong>ExtendedData</strong> field that is an array of <a href="https://learn.microsoft.com/en-us/windows/win32/api/evntcons/ns-evntcons-event_header_extended_data_item?WT.mc_id=DT-MVP-5003325"><strong>EVENT_HEADER_EXTENDED_DATA_ITEM</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">struct</span> <span class="nc">EVENT_HEADER_EXTENDED_DATA_ITEM</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ushort</span> <span class="n">Reserved1</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ushort</span> <span class="n">ExtType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ushort</span> <span class="n">Reserved2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ushort</span> <span class="n">DataSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">DataPtr</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>If the <strong>ExtType</strong> value is <strong>EVENT_HEADER_EXT_TYPE_STACK_TRACE64</strong> (=6) then <strong>DataPtr</strong> points to a <a href="https://learn.microsoft.com/en-us/windows/win32/api/evntcons/ns-evntcons-event_extended_item_stack_trace64?%3FWT.mc_id=DT-MVP-5003325"><strong>EVENT_EXTENDED_ITEM_STACK_TRACE64</strong></a> structure:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">struct</span> <span class="nc">EVENT_EXTENDED_ITEM_STACK_TRACE64</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">MatchId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">unsafe</span> <span class="k">fixed</span> <span class="kt">ulong</span> <span class="n">Address</span><span class="p">[</span><span class="m">1</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
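</div><p>that contains an array of 64-bit addresses. The size of this array in bytes is given by <strong>DataSize - sizeof(ulong)</strong> because <strong>DataSize</strong> also covers the 8-byte <strong>MatchId</strong> field; dividing the result by <strong>sizeof(ulong)</strong> gives the number of addresses.</p>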
<p>For 32-bit applications, you will get <strong>EVENT_HEADER_EXT_TYPE_STACK_TRACE32</strong> (=5) as <strong>ExtType</strong> value and DataPtr will point to <a href="https://learn.microsoft.com/en-us/windows/win32/api/evntcons/ns-evntcons-event_extended_item_stack_trace32?WT.mc_id=DT-MVP-5003325"><strong>EVENT_EXTENDED_ITEM_STACK_TRACE32</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">struct</span> <span class="nc">EVENT_EXTENDED_ITEM_STACK_TRACE32</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">MatchId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">unsafe</span> <span class="k">fixed</span> <span class="kt">uint</span> <span class="n">Address</span><span class="p">[</span><span class="m">1</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>that stores an array of 32-bit addresses.</p>
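<p>Both layouts start with the same 8-byte <strong>MatchId</strong>; only the size of each address differs. To make the arithmetic explicit, here is a minimal sketch that reads the addresses from a plain byte buffer instead of a raw pointer. The <strong>StackTraceBlob</strong> class and its <strong>ReadAddresses</strong> method are hypothetical names, and the sketch assumes that the buffer holds exactly the <strong>DataSize</strong> bytes pointed to by <strong>DataPtr</strong>, in little-endian order.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Buffers.Binary;

// Hypothetical helper working on a copy of the extended data item
public static class StackTraceBlob
{
    // MatchId is a 64-bit value in both the 32-bit and the 64-bit layouts
    private const int MatchIdSize = sizeof(ulong);

    public static ulong[] ReadAddresses(ReadOnlySpan&lt;byte&gt; data, bool is64Bit)
    {
        if (data.Length &lt; MatchIdSize)
            return Array.Empty&lt;ulong&gt;();

        // number of addresses = (DataSize - sizeof(MatchId)) / size of one address
        int addressSize = is64Bit ? sizeof(ulong) : sizeof(uint);
        int count = (data.Length - MatchIdSize) / addressSize;

        var addresses = new ulong[count];
        for (int i = 0; i &lt; count; i++)
        {
            var slot = data.Slice(MatchIdSize + i * addressSize, addressSize);

            // 32-bit addresses are widened to 64 bits
            addresses[i] = is64Bit
                ? BinaryPrimitives.ReadUInt64LittleEndian(slot)
                : BinaryPrimitives.ReadUInt32LittleEndian(slot);
        }

        return addresses;
    }
}
</code></pre></div>
<p>For example, a 24-byte buffer describes two 64-bit addresses, and a 16-byte buffer describes two 32-bit addresses.</p>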
<p>With that in mind, the code to get the call stack as an array of 64-bit addresses (also used for 32-bit applications, for simplicity's sake) is pretty straightforward:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span><span class="lnt">46
</span><span class="lnt">47
</span><span class="lnt">48
</span><span class="lnt">49
</span><span class="lnt">50
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="n">EventPipeUnresolvedStack</span> <span class="n">ReadStackUsingUnsafeAccessor</span><span class="p">(</span><span class="n">TraceEvent</span> <span class="n">traceEvent</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">GetFromEventRecord</span><span class="p">(</span><span class="n">traceEvent</span><span class="p">.</span><span class="n">eventRecord</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="n">EventPipeUnresolvedStack</span> <span class="n">GetFromEventRecord</span><span class="p">(</span><span class="n">TraceEventNativeMethods</span><span class="p">.</span><span class="n">EVENT_RECORD</span><span class="p">*</span> <span class="n">eventRecord</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eventRecord</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">extendedDataCount</span> <span class="p">=</span> <span class="n">eventRecord</span><span class="p">-&gt;</span><span class="n">ExtendedDataCount</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">var</span> <span class="n">dataIndex</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">dataIndex</span> <span class="p">&lt;</span> <span class="n">extendedDataCount</span><span class="p">;</span> <span class="n">dataIndex</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">extendedData</span> <span class="p">=</span> <span class="n">eventRecord</span><span class="p">-&gt;</span><span class="n">ExtendedData</span><span class="p">[</span><span class="n">dataIndex</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">extendedData</span><span class="p">.</span><span class="n">ExtType</span> <span class="p">==</span> <span class="n">TraceEventNativeMethods</span><span class="p">.</span><span class="n">EVENT_HEADER_EXT_TYPE_STACK_TRACE64</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">stackRecord</span> <span class="p">=</span> <span class="p">(</span><span class="n">TraceEventNativeMethods</span><span class="p">.</span><span class="n">EVENT_EXTENDED_ITEM_STACK_TRACE64</span><span class="p">*)</span><span class="n">extendedData</span><span class="p">.</span><span class="n">DataPtr</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">addresses</span> <span class="p">=</span> <span class="p">&amp;</span><span class="n">stackRecord</span><span class="p">-&gt;</span><span class="n">Address</span><span class="p">[</span><span class="m">0</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">addressCount</span> <span class="p">=</span> <span class="p">(</span><span class="n">extendedData</span><span class="p">.</span><span class="n">DataSize</span> <span class="p">-</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">UInt64</span><span class="p">))</span> <span class="p">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">UInt64</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">addressCount</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">callStackAddresses</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">ulong</span><span class="p">[</span><span class="n">addressCount</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="k">for</span> <span class="p">(</span><span class="kt">var</span> <span class="n">index</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">index</span> <span class="p">&lt;</span> <span class="n">addressCount</span><span class="p">;</span> <span class="n">index</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">callStackAddresses</span><span class="p">[</span><span class="n">index</span><span class="p">]</span> <span class="p">=</span> <span class="n">addresses</span><span class="p">[</span><span class="n">index</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="k">new</span> <span class="n">EventPipeUnresolvedStack</span><span class="p">(</span><span class="n">callStackAddresses</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">extendedData</span><span class="p">.</span><span class="n">ExtType</span> <span class="p">==</span> <span class="n">TraceEventNativeMethods</span><span class="p">.</span><span class="n">EVENT_HEADER_EXT_TYPE_STACK_TRACE32</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">stackRecord</span> <span class="p">=</span> <span class="p">(</span><span class="n">TraceEventNativeMethods</span><span class="p">.</span><span class="n">EVENT_EXTENDED_ITEM_STACK_TRACE32</span><span class="p">*)</span><span class="n">extendedData</span><span class="p">.</span><span class="n">DataPtr</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">addresses</span> <span class="p">=</span> <span class="p">&amp;</span><span class="n">stackRecord</span><span class="p">-&gt;</span><span class="n">Address</span><span class="p">[</span><span class="m">0</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">addressCount</span> <span class="p">=</span> <span class="p">(</span><span class="n">extendedData</span><span class="p">.</span><span class="n">DataSize</span> <span class="p">-</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">UInt64</span><span class="p">))</span> <span class="p">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">UInt32</span><span class="p">);</span>  <span class="c1">// MatchId is 64-bit even in the 32-bit layout</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">addressCount</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">callStackAddresses</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">ulong</span><span class="p">[</span><span class="n">addressCount</span><span class="p">];</span>  <span class="c1">// store the 32-bit addresses as 64-bit addresses</span>
</span></span><span class="line"><span class="cl">            <span class="k">for</span> <span class="p">(</span><span class="kt">var</span> <span class="n">index</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">index</span> <span class="p">&lt;</span> <span class="n">addressCount</span><span class="p">;</span> <span class="n">index</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">callStackAddresses</span><span class="p">[</span><span class="n">index</span><span class="p">]</span> <span class="p">=</span> <span class="n">addresses</span><span class="p">[</span><span class="n">index</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="k">new</span> <span class="n">EventPipeUnresolvedStack</span><span class="p">(</span><span class="n">callStackAddresses</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that the latest version of the TraceEvent nuget provides public access to the <strong>eventRecord</strong> field, so the <strong>UnsafeAccessor</strong> attribute used by Olivier is no longer needed.</p>
<h2 id="symbolize-the-call-stack-addresses">Symbolize the call stack addresses</h2>
<p>An address is good but the corresponding method name is better. I won’t repeat what I’ve already detailed in <a href="https://techblog.criteo.com/build-your-own-net-memory-profiler-in-c-call-stacks-2-2-2-ec9657eb17f9?source=friends_link&amp;sk=b34465f583867cb7dcf5bad6395bf151">an older post</a> that shows how to get the name of a native or managed method from an instruction pointer address. Instead, I want to pinpoint a big limitation of this solution, which listens to the CLR provider <strong>MethodLoadVerbose</strong>/<strong>MethodDCStartVerboseV2</strong> events: if the methods you are interested in are jitted BEFORE your tool attaches to the application, you will never get these events.</p>
<p>You could get the same mapping address span/method name via the other “<em>Microsoft-Windows-DotNETRuntimeRundown</em>” provider and its <strong>MethodDCEndVerbose</strong> event that contains the expected <strong>MethodStartAddress</strong>, <strong>MethodSize</strong> and <strong>MethodName</strong> in its <a href="https://github.com/dotnet/runtime/blob/d897415e02340a13dc1c5078c09937bdf7ec8a56/src/coreclr/vm/ClrEtwAll.man#L4864">payload</a>. But I need this information before the end of the application…</p>
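<p>Whichever event provides them, once the <strong>MethodStartAddress</strong>, <strong>MethodSize</strong> and <strong>MethodName</strong> values are known, mapping an address from a call stack to a method is a range lookup. Here is a minimal sketch of such a lookup based on a sorted list and a binary search. The <strong>MethodMap</strong> class and its members are hypothetical names used for illustration, and the sketch assumes that method address ranges do not overlap.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical store filled from the MethodStartAddress/MethodSize/MethodName payload fields
public sealed class MethodMap
{
    private readonly List&lt;(ulong Start, uint Size, string Name)&gt; _methods = new List&lt;(ulong Start, uint Size, string Name)&gt;();
    private bool _sorted = true;

    public void Add(ulong startAddress, uint size, string name)
    {
        _methods.Add((startAddress, size, name));
        _sorted = false;
    }

    // returns the name of the method whose [Start, Start + Size) range contains the address,
    // or null if no known method matches
    public string Find(ulong address)
    {
        if (!_sorted)
        {
            _methods.Sort((a, b) =&gt; a.Start.CompareTo(b.Start));
            _sorted = true;
        }

        // binary search for the last method that starts at or before the address
        int low = 0, high = _methods.Count - 1, candidate = -1;
        while (low &lt;= high)
        {
            int mid = low + (high - low) / 2;
            if (_methods[mid].Start &lt;= address)
            {
                candidate = mid;
                low = mid + 1;
            }
            else
            {
                high = mid - 1;
            }
        }

        if (candidate &lt; 0)
            return null;

        var method = _methods[candidate];
        return (address - method.Start &lt; method.Size) ? method.Name : null;
    }
}
</code></pre></div>
<p>Sorting lazily keeps <strong>Add</strong> cheap while events are received; the cost is paid on the first lookup that follows new insertions.</p>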
<p>Looking at <a href="https://github.com/dotnet/docs/blob/main/docs/framework/performance/clr-etw-keywords-and-levels.md#keyword-combinations-for-symbol-resolution-for-the-rundown-provider">the documentation</a>, it seems that the rundown provider accepts the <strong>StartRundownKeyword</strong> value to emit the DCStart events when the provider is enabled! <a href="https://github.com/dotnet/runtime/issues/42378">Since .NET 9</a>, it is possible to pass the rundown keywords you want when creating the EventPipe session (before that, the default value was used and it did not contain <strong>StartRundownKeyword</strong>):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">//                     V-- this is the default rundown keyword</span>
</span></span><span class="line"><span class="cl"><span class="kt">long</span> <span class="n">rundownKeywords</span> <span class="p">=</span> <span class="m">0x80020139</span> <span class="p">|</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">StartEnumeration</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">config</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EventPipeSessionConfiguration</span><span class="p">(</span><span class="n">GetProviders</span><span class="p">(),</span> <span class="m">256</span><span class="p">,</span> <span class="n">rundownKeywords</span><span class="p">,</span> <span class="kc">true</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">using</span> <span class="p">(</span><span class="kt">var</span> <span class="n">session</span> <span class="p">=</span> <span class="n">client</span><span class="p">.</span><span class="n">StartEventPipeSession</span><span class="p">(</span><span class="n">config</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">source</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EventPipeEventSource</span><span class="p">(</span><span class="n">session</span><span class="p">.</span><span class="n">EventStream</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">RegisterListeners</span><span class="p">(</span><span class="n">source</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this is a blocking call</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that you should not add the rundown provider to the list passed as parameter.</p>
<p>Unfortunately, there has been <a href="https://github.com/dotnet/runtime/issues/42378">an issue in the runtime since September 2020</a> that describes this exact problem. I even tried to create and close a session to get the DCStop events before recreating a new one, but I failed.</p>
<p>The next episode will explain how to start a .NET application and get the events from its startup… along with the problems that come up.</p>
]]></content:encoded></item><item><title>Monitor HTTP redirects to reduce unexpected latency</title><link>https://chrisnas.github.io/posts/2024-12-13_monitor-http-redirects-to/</link><pubDate>Fri, 13 Dec 2024 14:21:41 +0000</pubDate><guid>https://chrisnas.github.io/posts/2024-12-13_monitor-http-redirects-to/</guid><description>This post details how redirection handling impacts dotnet-http tool implementation but also possibly your requests latency.</description><content:encoded><![CDATA[<hr>
<p>In the <a href="/posts/2024-11-13_implementing-dotnet-http-to/">previous post</a>, I detailed how I used the undocumented events from the BCL to create the <a href="https://www.nuget.org/packages/dotnet-http">dotnet-http CLI tool</a> to monitor your outgoing HTTP requests. After testing with older versions of .NET, I realized that the code needed to be updated and I’m sharing my findings in this post.</p>
<p>The main point is that url redirections can have a major impact on request latency:</p>
<p><img loading="lazy" src="/posts/2024-12-13_monitor-http-redirects-to/1_w_2f8F9E6hGmoiGLBCMqfw.png"></p>
<h2 id="always-test-supported-versions">Always test supported versions…</h2>
<p>When I wrote the initial version of dotnet-http, I only tested it with .NET 8 and .NET 9 with limited formats of urls. Unfortunately, things went bad when I tried to monitor applications running on .NET 5 and .NET 6: no events are emitted by these versions of the BCL.</p>
<p>So, the next step was to test .NET 7 and the result was simple: crash! After investigating, I realized that some events I looked at in the .NET 8 source code did not have the same payload in .NET 7; some had no payload at all:</p>
<p><img loading="lazy" src="/posts/2024-12-13_monitor-http-redirects-to/1_2HJxVmyxQfzhCr1i7_dpWw.png"></p>
<p>Even though there is no version field in the event payload, it is easy to check the payload size, as in the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnConnectionEstablished</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">     <span class="n">DateTime</span> <span class="n">timestamp</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="kt">int</span> <span class="n">threadId</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="n">Guid</span> <span class="n">activityId</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="n">Guid</span> <span class="n">relatedActivityId</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">     <span class="kt">byte</span><span class="p">[]</span> <span class="n">eventData</span>
</span></span><span class="line"><span class="cl">     <span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="p">...</span>
</span></span><span class="line"><span class="cl">     <span class="n">EventSourcePayload</span> <span class="n">payload</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EventSourcePayload</span><span class="p">(</span><span class="n">eventData</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="n">versionMajor</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetByte</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="n">versionMinor</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetByte</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">     <span class="c1">// in .NET 7, nothing else is available</span>
</span></span><span class="line"><span class="cl">     <span class="n">Int64</span> <span class="n">connectionId</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="n">scheme</span> <span class="p">=</span> <span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="n">host</span> <span class="p">=</span> <span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="n">UInt32</span> <span class="n">port</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="n">path</span> <span class="p">=</span> <span class="s">&#34;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">     <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">Length</span> <span class="p">&gt;</span> <span class="m">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">     <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="n">connectionId</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetInt64</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">         <span class="n">scheme</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">         <span class="n">host</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">         <span class="n">port</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetUInt32</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">         <span class="n">path</span> <span class="p">=</span> <span class="n">payload</span><span class="p">.</span><span class="n">GetString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">     <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Another difference is that one event is not even emitted in .NET 7:</p>
<p><img loading="lazy" src="/posts/2024-12-13_monitor-http-redirects-to/1_Iwpe5YVHJwyBnRFvHMTiQw.png"></p>
<h2 id="explain-what-a-redirection-isplease">Explain what a redirection is please!</h2>
<p>Because I based the code on this Redirect event, I needed to find another way to support .NET 7 even though I would not have a redirected url to display. But first, let’s see what I’m talking about in terms of HTTP communication.</p>
<p>When you try to get the content / status code behind a url, the code goes through different phases, each with its related events:</p>
<ul>
<li><strong>Start</strong><br>
RequestStart</li>
<li><strong>DNS resolution</strong><br>
ResolutionStart<br>
ResolutionStop/Fail</li>
<li><strong>Socket connection</strong><br>
ConnectStart<br>
ConnectStop</li>
<li><strong>Security handshake (HTTPS only)</strong><br>
HandshakeStart<br>
HandshakeStop/Failed</li>
<li><strong>Request/response</strong><br>
RequestHeadersStart<br>
RequestHeadersStop<br>
ResponseHeadersStart<br>
ResponseHeadersStop<br>
Redirect (.NET 8+)<br>
ResponseContentStart<br>
ResponseContentStop</li>
<li><strong>Request stop</strong><br>
RequestStop/Failed</li>
</ul>
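<p>To illustrate how each Start/Stop pair in the list above can be turned into a duration, here is a minimal sketch. The <strong>PhaseTimer</strong> class and the phase names used as keys are hypothetical and are not part of dotnet-http; the sketch only assumes that each handler receives the timestamp of its event.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical helper (not part of dotnet-http): pairs each Start event
// with its Stop event and keeps the elapsed time per phase.
public sealed class PhaseTimer
{
    private readonly Dictionary&lt;string, DateTime&gt; _starts = new Dictionary&lt;string, DateTime&gt;();
    private readonly Dictionary&lt;string, double&gt; _durations = new Dictionary&lt;string, double&gt;();

    // Call from a xxxStart handler with the event timestamp.
    public void Start(string phase, DateTime timestamp)
    {
        _starts[phase] = timestamp;
    }

    // Call from the matching xxxStop/Failed handler.
    // Returns the duration in milliseconds, or null when no Start was seen
    // (for example when the session is attached in the middle of a request).
    public double? Stop(string phase, DateTime timestamp)
    {
        if (!_starts.Remove(phase, out DateTime start))
        {
            return null;
        }

        double elapsed = (timestamp - start).TotalMilliseconds;
        _durations[phase] = elapsed;
        return elapsed;
    }

    // Durations (in milliseconds) of the phases that have completed so far.
    public IReadOnlyDictionary&lt;string, double&gt; Durations
    {
        get { return _durations; }
    }
}
</code></pre></div>
<p>With such a helper, a ResolutionStart handler would call <strong>Start("DNS", timestamp)</strong> and the ResolutionStop handler would call <strong>Stop("DNS", timestamp)</strong>. One instance would be needed per request, otherwise concurrent requests would overwrite each other's start times.</p>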
<p>Based on the received url, a server can decide to answer that another url should be used instead. For example, if you call GitHub with <strong>http://</strong> instead of <strong>https://</strong>, such a redirection will happen. Without invasive tools such as Wireshark, these redirections are impossible to detect and could cause unnecessary delay.</p>
<p>From the client perspective, this can be detected in the <strong>ResponseHeadersStop</strong> event payload that provides a status code. If its value is 301 (Moved Permanently), then it is a redirection. The other effect of a redirection is that the BCL code changes the flow of events because it needs to start over with the new url, from step 2 (DNS resolution) to step 6 (request stop). As you can see, instead of paying the cost of just one request, two are actually emitted and processed.</p>
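<p>The code in this post only checks for 301, but HTTP defines other redirection status codes. As a minimal sketch, a hypothetical <strong>RedirectStatus.IsRedirect</strong> helper (not part of dotnet-http) could cover the codes that, to my understanding, HttpClient follows when automatic redirection is enabled. 304 (Not Modified) is deliberately left out because it does not point to another url.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical helper (not part of dotnet-http).
public static class RedirectStatus
{
    // Returns true when the status code tells the client to retry
    // the request against another url.
    public static bool IsRedirect(uint statusCode)
    {
        switch (statusCode)
        {
            case 300: // Multiple Choices
            case 301: // Moved Permanently
            case 302: // Found
            case 303: // See Other
            case 307: // Temporary Redirect
            case 308: // Permanent Redirect
                return true;

            default:
                return false;
        }
    }
}
</code></pre></div>
<p>A handler receiving the status code from the event payload could then test <strong>RedirectStatus.IsRedirect(statusCode)</strong> instead of comparing against 301 only.</p>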
<h2 id="impact-on-the-implementation">Impact on the implementation</h2>
<p>In addition to lacking the payload size checks, my initial implementation was not properly handling the redirection because the values (timestamps and durations) were overwritten by the events related to the second, redirected url.</p>
<p>The new implementation is splitting the request details into two classes. A base class that contains common fields to both parts of the request in case of redirection:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">class</span> <span class="nc">HttpRequestInfoBase</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">HttpRequestInfoBase</span><span class="p">(</span><span class="n">DateTime</span> <span class="n">timestamp</span><span class="p">,</span> <span class="kt">string</span> <span class="n">scheme</span><span class="p">,</span> <span class="kt">string</span> <span class="n">host</span><span class="p">,</span> <span class="kt">uint</span> <span class="n">port</span><span class="p">,</span> <span class="kt">string</span> <span class="n">path</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">       <span class="p">{</span>
</span></span><span class="line"><span class="cl">           <span class="n">StartTime</span> <span class="p">=</span> <span class="n">timestamp</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">           <span class="k">if</span> <span class="p">(</span><span class="n">scheme</span> <span class="p">==</span> <span class="kt">string</span><span class="p">.</span><span class="n">Empty</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">           <span class="p">{</span>
</span></span><span class="line"><span class="cl">               <span class="n">Url</span> <span class="p">=</span> <span class="kt">string</span><span class="p">.</span><span class="n">Empty</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">           <span class="p">}</span>
</span></span><span class="line"><span class="cl">           <span class="k">else</span>
</span></span><span class="line"><span class="cl">           <span class="p">{</span>
</span></span><span class="line"><span class="cl">               <span class="k">if</span> <span class="p">(</span><span class="n">port</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">               <span class="p">{</span>
</span></span><span class="line"><span class="cl">                   <span class="n">Url</span> <span class="p">=</span> <span class="s">$&#34;{scheme}://{host}:{port}{path}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">               <span class="p">}</span>
</span></span><span class="line"><span class="cl">               <span class="k">else</span>
</span></span><span class="line"><span class="cl">               <span class="p">{</span>
</span></span><span class="line"><span class="cl">                   <span class="n">Url</span> <span class="p">=</span> <span class="s">$&#34;{scheme}://{host}{path}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">               <span class="p">}</span>
</span></span><span class="line"><span class="cl">           <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">string</span> <span class="n">Url</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">StartTime</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">ReqRespStartTime</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">ReqRespDuration</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="c1">// DNS</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">DnsWait</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">DnsStartTime</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">DnsDuration</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="c1">// HTTPS</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">HandshakeWait</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">HandshakeStartTime</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">HandshakeDuration</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="c1">// socket connection</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">SocketConnectionStartTime</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">SocketWait</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">SocketDuration</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">QueueuingEndTime</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">double</span> <span class="n">QueueingDuration</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The second one inherits from the base class and contains additional details, including the details of the redirected url if any:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">class</span> <span class="nc">HttpRequestInfo</span> <span class="p">:</span> <span class="n">HttpRequestInfoBase</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">HttpRequestInfo</span><span class="p">(</span><span class="n">DateTime</span> <span class="n">timestamp</span><span class="p">,</span> <span class="kt">string</span> <span class="n">scheme</span><span class="p">,</span> <span class="kt">string</span> <span class="n">host</span><span class="p">,</span> <span class="kt">uint</span> <span class="n">port</span><span class="p">,</span> <span class="kt">string</span> <span class="n">path</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">           <span class="p">:</span>
</span></span><span class="line"><span class="cl">           <span class="k">base</span><span class="p">(</span><span class="n">timestamp</span><span class="p">,</span> <span class="n">scheme</span><span class="p">,</span> <span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">       <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">HttpRequestInfoBase</span> <span class="n">Redirect</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="n">UInt32</span> <span class="n">StatusCode</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="c1">// HTTPS</span>
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">string</span> <span class="n">HandshakeErrorMessage</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="kd">public</span> <span class="kt">string</span> <span class="n">Error</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>A new instance of <strong>HttpRequestInfoBase</strong> is created when a 301 status code is received in the <strong>HttpResponseHeaderStop</strong> handler:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnHttpResponseHeaderStop</span><span class="p">(</span><span class="kt">object</span> <span class="n">sender</span><span class="p">,</span> <span class="n">HttpRequestStatusEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// used to detect redirection in .NET 8+</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">StatusCode</span> <span class="p">!=</span> <span class="m">301</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// create a new request info for the redirected request</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// because .NET 7 does not emit a Redirect event, we need to create a new request info here</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// --&gt; it means that the redirect url will be empty in .NET 7</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">root</span> <span class="p">=</span> <span class="n">GetRoot</span><span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">ActivityId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_requests</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="k">out</span> <span class="n">HttpRequestInfo</span> <span class="n">info</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">Redirect</span> <span class="p">=</span> <span class="k">new</span> <span class="n">HttpRequestInfoBase</span><span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">Timestamp</span><span class="p">,</span> <span class="s">&#34;&#34;</span><span class="p">,</span> <span class="s">&#34;&#34;</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="s">&#34;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="c1">// if you really want to have the duration of both original request + redirected request,</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// then do the following:</span>
</span></span><span class="line"><span class="cl">            <span class="c1">//    info.ReqRespDuration = (e.Timestamp - info.ReqRespStartTime).TotalMilliseconds;</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// However, I prefer to show the duration of the redirected request only to more easily</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// compute the cost of the initial redirected request = total duration - other durations</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For .NET 8+, the redirected url is provided to the <strong>Redirect</strong> handler and stored in the Url field of the instance created in the previous handler:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnHttpRedirect</span><span class="p">(</span><span class="kt">object</span> <span class="n">sender</span><span class="p">,</span> <span class="n">HttpRedirectEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// since this is an Info event, the activityID is the root</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">root</span> <span class="p">=</span> <span class="n">ActivityHelpers</span><span class="p">.</span><span class="n">ActivityPathString</span><span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">ActivityId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_requests</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="k">out</span> <span class="n">HttpRequestInfo</span> <span class="n">info</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">info</span><span class="p">.</span><span class="n">Redirect</span><span class="p">.</span><span class="n">Url</span> <span class="p">=</span> <span class="n">e</span><span class="p">.</span><span class="n">RedirectUrl</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In each handler of events that could be received for both initial and redirected requests,</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnHttpResponseContentStop</span><span class="p">(</span><span class="kt">object</span> <span class="n">sender</span><span class="p">,</span> <span class="n">EventPipeBaseArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">       <span class="kt">var</span> <span class="n">root</span> <span class="p">=</span> <span class="n">GetRoot</span><span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">ActivityId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">       <span class="k">if</span> <span class="p">(!</span><span class="n">_requests</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="k">out</span> <span class="n">HttpRequestInfo</span> <span class="n">info</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">       <span class="p">{</span>
</span></span><span class="line"><span class="cl">           <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">       <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="k">if</span> <span class="p">(</span><span class="n">info</span><span class="p">.</span><span class="n">Redirect</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">       <span class="p">{</span>
</span></span><span class="line"><span class="cl">           <span class="n">info</span><span class="p">.</span><span class="n">ReqRespDuration</span> <span class="p">=</span> <span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">Timestamp</span> <span class="p">-</span> <span class="n">info</span><span class="p">.</span><span class="n">ReqRespStartTime</span><span class="p">).</span><span class="n">TotalMilliseconds</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">       <span class="p">}</span>
</span></span><span class="line"><span class="cl">       <span class="k">else</span>
</span></span><span class="line"><span class="cl">       <span class="p">{</span>
</span></span><span class="line"><span class="cl">           <span class="n">info</span><span class="p">.</span><span class="n">Redirect</span><span class="p">.</span><span class="n">ReqRespDuration</span> <span class="p">=</span> <span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">Timestamp</span> <span class="p">-</span> <span class="n">info</span><span class="p">.</span><span class="n">Redirect</span><span class="p">.</span><span class="n">ReqRespStartTime</span><span class="p">).</span><span class="n">TotalMilliseconds</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">       <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The wait times and durations are now computed as the events are received and aggregated at the end of the request by adding the values of both parts (initial and redirected, if any):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">double</span> <span class="n">dnsDuration</span> <span class="p">=</span> <span class="n">info</span><span class="p">.</span><span class="n">DnsDuration</span> <span class="p">+</span> <span class="p">((</span><span class="n">info</span><span class="p">.</span><span class="n">Redirect</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span> <span class="p">?</span> <span class="n">info</span><span class="p">.</span><span class="n">Redirect</span><span class="p">.</span><span class="n">DnsDuration</span> <span class="p">:</span> <span class="m">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">dnsDuration</span> <span class="p">&gt;</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">double</span> <span class="n">dnsWait</span> <span class="p">=</span> <span class="n">info</span><span class="p">.</span><span class="n">DnsWait</span> <span class="p">+</span> <span class="p">((</span><span class="n">info</span><span class="p">.</span><span class="n">Redirect</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span> <span class="p">?</span> <span class="n">info</span><span class="p">.</span><span class="n">Redirect</span><span class="p">.</span><span class="n">DnsWait</span> <span class="p">:</span> <span class="m">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">Write</span><span class="p">(</span><span class="s">$&#34;{dnsWait,9:F3} | {dnsDuration,9:F3} | &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">Write</span><span class="p">(</span><span class="s">$&#34;          |           | &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In conclusion, you should try to monitor these redirections by using my <a href="https://www.nuget.org/packages/dotnet-http">dotnet-http</a> CLI tool. Feel free to download it or install it with the following command line: <strong>dotnet tool install -g dotnet-http</strong> or update to the latest version with: <strong>dotnet tool update -g dotnet-http</strong></p>
<p>You could also integrate some event listening code into your framework that simply handles the <strong>ResponseHeadersStop</strong>/<strong>Redirect</strong> events.</p>
]]></content:encoded></item><item><title>Digging into the undocumented .NET events from the BCL</title><link>https://chrisnas.github.io/posts/2024-10-13_digging-into-the-undocumented/</link><pubDate>Sun, 13 Oct 2024 12:12:37 +0000</pubDate><guid>https://chrisnas.github.io/posts/2024-10-13_digging-into-the-undocumented/</guid><description>This is the first episode of a new series describing how to listen to undocumented .NET events and focusing on monitoring HTTP requests</description><content:encoded><![CDATA[<hr>
<p>I’ve presented in depth the events emitted by the CLR <a href="https://github.com/chrisnas/ClrEvents">in many posts</a> to get insightful details about how the .NET runtime is working (lock contention, GC, allocations, …). Some .NET features are not implemented at the runtime level but at the Base Class Library (a.k.a. BCL) level. For example, if you are using <a href="https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpclient?WT.mc_id=DT-MVP-5003325"><strong>HttpClient</strong></a>, you might want to measure how long it takes to get the response to your HTTP requests.</p>
<p>In this new series, I will describe how the BCL uses <a href="https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventsource?WT.mc_id=DT-MVP-5003325"><strong>EventSource</strong></a> to emit events and how you can listen to them with TraceEvent, focusing on HTTP requests.</p>
<h2 id="where-do-these-events-comefrom">Where do these events come from?</h2>
<p>When you look for these BCL events in the documentation, you end up on the <a href="https://learn.microsoft.com/en-us/dotnet/core/diagnostics/well-known-event-providers#systemnethttp-provider?WT.mc_id=DT-MVP-5003325">well-known event providers in .NET</a> page, in the <a href="https://learn.microsoft.com/en-us/dotnet/core/diagnostics/well-known-event-providers#framework-libraries?WT.mc_id=DT-MVP-5003325">Framework libraries section</a>. It lists the providers with their name and the emitted events with their keyword and verbosity. However, there is no detail about their payload! For example, for the “System.Net.Http” provider, you know that the <strong>RequestStart</strong> informational event is emitted each time “an HTTP request has started”. But you don’t know the corresponding URL… Let me tell you how to get these tiny details.</p>
<p>Within the BCL, the classes responsible for emitting events derive from <a href="https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventsource?WT.mc_id=DT-MVP-5003325"><strong>EventSource</strong></a> with the “<em>Telemetry</em>” suffix as a naming convention. They are decorated with an <strong>EventSource</strong> attribute to define their name (used as provider name by a listener) and their Guid that will be provided with each event. For example, here is <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/HttpTelemetry.cs">the declaration of the class</a> responsible for events related to sending HTTP requests:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[EventSource(Name = &#34;System.Net.Http&#34;)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">internal</span> <span class="kd">sealed</span> <span class="kd">partial</span> <span class="k">class</span> <span class="nc">HttpTelemetry</span> <span class="p">:</span> <span class="n">EventSource</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>If a Guid is not provided, one is automatically computed from the name (see the <a href="https://github.com/microsoft/perfview/blob/main/src/TraceEvent/TraceEventSession.cs#L2854">corresponding code in PerfView</a>).</p>
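<p>To make this computation more concrete, here is a minimal sketch of how such a name-based Guid can be derived. It assumes the steps I believe the runtime follows: the upper-cased name is encoded as big-endian UTF-16, hashed with SHA-1 after a fixed namespace prefix, and the first 16 bytes are kept. The <strong>GuidFromProviderName</strong> helper name is mine, and the namespace bytes are the ones I believe the runtime uses, so treat this as an illustration rather than the reference implementation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Security.Cryptography;
using System.Text;

// Print the Guid derived from a provider name
Console.WriteLine(GuidFromProviderName(&#34;System.Net.Http&#34;));

// Hypothetical helper: sketch of the name-based Guid derivation (assumed to mirror EventSource)
static Guid GuidFromProviderName(string name)
{
    // Fixed namespace prefix (assumption: same bytes as in the runtime sources)
    byte[] namespaceBytes =
    {
        0x48, 0x2C, 0x2D, 0xB2, 0xC3, 0x90, 0x47, 0xC8,
        0x87, 0xF8, 0x1A, 0x15, 0xBF, 0xC1, 0x30, 0xFB
    };

    // The name is upper-cased and encoded as big-endian UTF-16
    byte[] nameBytes = Encoding.BigEndianUnicode.GetBytes(name.ToUpperInvariant());

    // Hash the namespace prefix followed by the name
    byte[] input = new byte[namespaceBytes.Length + nameBytes.Length];
    namespaceBytes.CopyTo(input, 0);
    nameBytes.CopyTo(input, namespaceBytes.Length);
    byte[] hash = SHA1.HashData(input); // SHA-1 is used as a plain hash here, not for security

    // Keep the first 16 bytes and set the high 4 bits of byte 7 to 5 (name-based Guid)
    byte[] guidBytes = new byte[16];
    Array.Copy(hash, guidBytes, 16);
    guidBytes[7] = (byte)((guidBytes[7] &amp; 0x0F) | 0x50);

    return new Guid(guidBytes);
}
</code></pre></div>
<p>If these assumptions hold, the value printed for “System.Net.Http” should be the Guid listed for this provider in the HTTP section of this post. Note that TraceEvent also exposes <strong>TraceEventProviders.GetEventSourceGuidFromName</strong> for the same purpose.</p>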
<p>Next, public helper methods decorated with a <a href="https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.noneventattribute?view=net-8.0&amp;WT.mc_id=DT-MVP-5003325"><strong>NonEvent</strong> attribute</a> are called by the BCL code when an event needs to be emitted. These methods call private methods decorated with an <a href="https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventattribute?WT.mc_id=DT-MVP-5003325"><strong>Event</strong> attribute</a> that defines their unique ID and their verbosity level:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[NonEvent]</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">void</span> <span class="n">RequestStart</span><span class="p">(</span><span class="n">HttpRequestMessage</span> <span class="n">request</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="n">RequestStart</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">request</span><span class="p">.</span><span class="n">RequestUri</span><span class="p">.</span><span class="n">Scheme</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">request</span><span class="p">.</span><span class="n">RequestUri</span><span class="p">.</span><span class="n">IdnHost</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">request</span><span class="p">.</span><span class="n">RequestUri</span><span class="p">.</span><span class="n">Port</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">request</span><span class="p">.</span><span class="n">RequestUri</span><span class="p">.</span><span class="n">PathAndQuery</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="kt">byte</span><span class="p">)</span><span class="n">request</span><span class="p">.</span><span class="n">Version</span><span class="p">.</span><span class="n">Major</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="kt">byte</span><span class="p">)</span><span class="n">request</span><span class="p">.</span><span class="n">Version</span><span class="p">.</span><span class="n">Minor</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">request</span><span class="p">.</span><span class="n">VersionPolicy</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span> 
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">[Event(1, Level = EventLevel.Informational)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">RequestStart</span><span class="p">(</span><span class="kt">string</span> <span class="n">scheme</span><span class="p">,</span> <span class="kt">string</span> <span class="n">host</span><span class="p">,</span> <span class="kt">int</span> <span class="n">port</span><span class="p">,</span> <span class="kt">string</span> <span class="n">pathAndQuery</span><span class="p">,</span> <span class="kt">byte</span> <span class="n">versionMajor</span><span class="p">,</span> <span class="kt">byte</span> <span class="n">versionMinor</span><span class="p">,</span> <span class="n">HttpVersionPolicy</span> <span class="n">versionPolicy</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="n">WriteEvent</span><span class="p">(</span><span class="n">eventId</span><span class="p">:</span> <span class="m">1</span><span class="p">,</span> <span class="n">scheme</span><span class="p">,</span> <span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="p">,</span> <span class="n">pathAndQuery</span><span class="p">,</span> <span class="n">versionMajor</span><span class="p">,</span> <span class="n">versionMinor</span><span class="p">,</span> <span class="n">versionPolicy</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The different <strong>WriteEvent</strong> overloads are responsible for filling an array of <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Diagnostics.Tracing/ref/System.Diagnostics.Tracing.cs#L255"><strong>EventData</strong> elements</a> (each one contains a pointer to the data and its size) that is passed to <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/EventSource.cs#L1332"><strong>EventSource.WriteEventCore</strong></a>. This helper method dispatches the event to the EventPipe/ETW pipelines.</p>
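<p>To illustrate the pointer-and-size description above, here is a minimal sketch of a custom <strong>WriteEvent</strong> overload for a (string, int, string) payload. The <strong>Sample-Requests</strong> provider, the class and its event are hypothetical and not part of the BCL, and the project is assumed to be compiled with <strong>AllowUnsafeBlocks</strong> enabled:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Diagnostics.Tracing;

[EventSource(Name = &#34;Sample-Requests&#34;)] // hypothetical provider name
public sealed class SampleRequestsEventSource : EventSource
{
    public static readonly SampleRequestsEventSource Log = new SampleRequestsEventSource();

    [Event(1, Level = EventLevel.Informational)]
    public void RequestStart(string host, int port, string path)
    {
        // Avoid building the payload when nobody is listening
        if (IsEnabled())
        {
            WriteEvent(1, host, port, path);
        }
    }

    // Custom overload: each payload field is described by a pointer and a size in bytes
    [NonEvent]
    private unsafe void WriteEvent(int eventId, string arg1, int arg2, string arg3)
    {
        arg1 ??= &#34;&#34;;
        arg3 ??= &#34;&#34;;

        fixed (char* arg1Ptr = arg1, arg3Ptr = arg3)
        {
            EventData* data = stackalloc EventData[3];

            // Strings are passed as UTF-16 including the null terminator
            data[0].DataPointer = (IntPtr)arg1Ptr;
            data[0].Size = (arg1.Length + 1) * sizeof(char);

            data[1].DataPointer = (IntPtr)(&amp;arg2);
            data[1].Size = sizeof(int);

            data[2].DataPointer = (IntPtr)arg3Ptr;
            data[2].Size = (arg3.Length + 1) * sizeof(char);

            WriteEventCore(eventId, 3, data);
        }
    }
}
</code></pre></div>
<p>Calling <strong>SampleRequestsEventSource.Log.RequestStart(“example.org”, 443, “/index.html”)</strong> would then emit event 1 with three payload fields, in the order of the parameters of the method decorated with the <strong>Event</strong> attribute.</p>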
<h2 id="document-the-undocumented">Document the undocumented</h2>
<p>Unlike CLR events, the payload of these BCL events is not explicitly described in a text file: you have to look at the implementation of the different <strong>WriteEvent</strong> overloads instead. The rest of this section provides the details of the sources I’ve looked at and the corresponding event payloads.</p>
<h2 id="http">HTTP</h2>
<p>Sources: <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/HttpTelemetry.cs">https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/HttpTelemetry.cs</a>
Name: System.Net.Http
Guid: d30b5633-7ef1-5485-b4e0-94979b102068</p>
<p><img loading="lazy" src="/posts/2024-10-13_digging-into-the-undocumented/1_3M-B2oQSv9q67xj-wkb_HA.png"></p>
<h2 id="sockets">Sockets</h2>
<p>Sources: <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketsTelemetry.cs">https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketsTelemetry.cs</a></p>
<p>Name: System.Net.Sockets
Guid: d5b2e7d4-b6ec-50ae-7cde-af89427ad21f</p>
<p><img loading="lazy" src="/posts/2024-10-13_digging-into-the-undocumented/1_9NOqFY_5ZXR6toGqRE3IPg.png"></p>
<h2 id="dns">DNS</h2>
<p>Sources: <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionTelemetry.cs">https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionTelemetry.cs</a></p>
<p>Name: System.Net.NameResolution
Guid: 4b326142-bfb5-5ed3-8585-7714181d14b0</p>
<p><img loading="lazy" src="/posts/2024-10-13_digging-into-the-undocumented/1_ZcftMvnViwYebVCW_q1nXg.png"></p>
<h2 id="network-security">Network Security</h2>
<p>Sources: <a href="https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/NetSecurityTelemetry.cs">https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Security/src/System/Net/Security/NetSecurityTelemetry.cs</a></p>
<p>Name: System.Net.Security
 Guid: 7beee6b1-e3fa-5ddb-34be-1404ad0e2520</p>
<p><img loading="lazy" src="/posts/2024-10-13_digging-into-the-undocumented/1_aNAxTUuPK3kNRcCbUnwN9A.png"></p>
<p>The next episode will describe how to listen to these events and extract their payload in C#.</p>
]]></content:encoded></item><item><title>Unexpected usage of EventSource or how to test statistical results in CLR pull request</title><link>https://chrisnas.github.io/posts/2024-09-13_unexpected-usage-of-eventsourc/</link><pubDate>Fri, 13 Sep 2024 14:25:30 +0000</pubDate><guid>https://chrisnas.github.io/posts/2024-09-13_unexpected-usage-of-eventsourc/</guid><description>Unexpected usage of EventSource</description><content:encoded><![CDATA[<hr>
<h2 id="testing-the-statistical-results">Testing the statistical results</h2>
<p>In parallel with measuring the performance impact, it is important to validate the expected statistical distribution of the sampled allocations. Basically, I need to execute the same run of allocations multiple times in a row. Each run allocates the same number of instances of different types. For example, it is interesting to know if sampling instances of types with sizes proportional to a base value gives good results. The same question applies to types with totally different sizes or with finalizers.</p>
<p>I would like to pass the number of runs to execute and a given scenario to a C# runner program and listen to the emitted events in another C# listener.</p>
<p>I’m facing 3 issues here:</p>
<ul>
<li>How many instances are allocated? This is needed to validate the upscaling algorithm (sampled vs. real count).</li>
<li>Which types do I want to focus on? I don’t want to hard code them in the listener application.</li>
<li>When does each run start?</li>
</ul>
<p>It would be great if I could send the answers to these questions via events so the listener would know at runtime. Well… This is exactly what a class derived from <a href="https://learn.microsoft.com/en-us/dotnet/core/diagnostics/eventsource?WT.mc_id=DT-MVP-5003325"><strong>EventSource</strong></a> allows you to do!</p>
<p>In the runner application, I’ve defined the <strong>AllocationsRunEventSource</strong> class, decorated with the <strong>EventSource</strong> attribute to set its name. This name is used as the provider name, just like <em>Microsoft-Windows-DotNETRuntime</em> for the .NET runtime provider.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[EventSource(Name = &#34;Allocations-Run&#34;)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">AllocationsRunEventSource</span> <span class="p">:</span> <span class="n">EventSource</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="k">readonly</span> <span class="n">AllocationsRunEventSource</span> <span class="n">Log</span> <span class="p">=</span> <span class="k">new</span> <span class="n">AllocationsRunEventSource</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// The four methods below define which event ID is emitted, at which verbosity level, and with which payload</span>
</span></span><span class="line"><span class="cl"><span class="na">    [Event(600, Level = EventLevel.Informational)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">StartRun</span><span class="p">(</span><span class="kt">int</span> <span class="n">iterationsCount</span><span class="p">,</span> <span class="kt">int</span> <span class="n">allocationCount</span><span class="p">,</span> <span class="kt">string</span> <span class="n">listOfTypes</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">WriteEvent</span><span class="p">(</span><span class="n">eventId</span><span class="p">:</span> <span class="m">600</span><span class="p">,</span> <span class="n">iterationsCount</span><span class="p">,</span> <span class="n">allocationCount</span><span class="p">,</span> <span class="n">listOfTypes</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [Event(601, Level = EventLevel.Informational)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">StopRun</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">WriteEvent</span><span class="p">(</span><span class="n">eventId</span><span class="p">:</span> <span class="m">601</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [Event(602, Level = EventLevel.Informational)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">StartIteration</span><span class="p">(</span><span class="kt">int</span> <span class="n">iteration</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">WriteEvent</span><span class="p">(</span><span class="n">eventId</span><span class="p">:</span> <span class="m">602</span><span class="p">,</span> <span class="n">iteration</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [Event(603, Level = EventLevel.Informational)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">StopIteration</span><span class="p">(</span><span class="kt">int</span> <span class="n">iteration</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">WriteEvent</span><span class="p">(</span><span class="n">eventId</span><span class="p">:</span> <span class="m">603</span><span class="p">,</span> <span class="n">iteration</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>To make the payload serialization and parsing easy, the list of types that will be allocated is passed as a single string in which the type names are separated by semicolons: <em>allocatedTypes = “Object24;Object48;Object72;Object32;Object64;Object96”</em>.</p>
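<p>On the listener side, such a string is cheap to turn back into a list of types to track. Here is a minimal sketch, assuming one counter per type name; the <strong>ParseTypes</strong> helper and the dictionary of counters are hypothetical names, not taken from the actual listener:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Same format as the listOfTypes payload of the StartRun event
var counters = ParseTypes(&#34;Object24;Object48;Object72;Object32;Object64;Object96&#34;);
foreach (string typeName in counters.Keys)
{
    Console.WriteLine(typeName);
}

// Hypothetical helper: create one sampled-allocation counter per announced type
static Dictionary&lt;string, int&gt; ParseTypes(string listOfTypes)
{
    var result = new Dictionary&lt;string, int&gt;();
    foreach (string typeName in listOfTypes.Split(';', StringSplitOptions.RemoveEmptyEntries))
    {
        result[typeName] = 0;
    }

    return result;
}
</code></pre></div>
<p>Keeping the payload as a flat string avoids defining a more complex event schema, at the cost of forbidding semicolons in type names.</p>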
<p>The code of the runner calls these methods as expected at different moments of the execution:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">AllocationsRunEventSource</span><span class="p">.</span><span class="n">Log</span><span class="p">.</span><span class="n">StartRun</span><span class="p">(</span><span class="n">iterations</span><span class="p">,</span> <span class="n">allocationsCount</span><span class="p">,</span> <span class="n">allocatedTypes</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"> <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">iterations</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="n">AllocationsRunEventSource</span><span class="p">.</span><span class="n">Log</span><span class="p">.</span><span class="n">StartIteration</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">     <span class="n">allocationsRun</span><span class="p">.</span><span class="n">Allocate</span><span class="p">(</span><span class="n">allocationsCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">     <span class="n">AllocationsRunEventSource</span><span class="p">.</span><span class="n">Log</span><span class="p">.</span><span class="n">StopIteration</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl"> <span class="n">AllocationsRunEventSource</span><span class="p">.</span><span class="n">Log</span><span class="p">.</span><span class="n">StopRun</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Instead of recording the events with dotnet-trace, this time I’m using TraceEvent and Microsoft.Diagnostics.NETCore.Client to code a listener application. The code is very similar to <a href="/posts/2024-05-22_trigger-your-gcs-with/">what was presented</a> for my dotnet-fullgc CLI tool except that I’m enabling the <strong>Allocations-Run</strong> provider corresponding to the event source of the runner in addition to the .NET runtime one:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">PrintEventsLive</span><span class="p">(</span><span class="kt">int</span> <span class="n">processId</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">providers</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">EventPipeProvider</span><span class="p">&gt;()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">new</span> <span class="n">EventPipeProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">                <span class="s">&#34;Microsoft-Windows-DotNETRuntime&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="n">EventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span> <span class="c1">// verbose is required for AllocationTick</span>
</span></span><span class="line"><span class="cl">                <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="m">0x80000000001</span> <span class="c1">// new AllocationSamplingKeyword + GCKeyword</span>
</span></span><span class="line"><span class="cl">                <span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="k">new</span> <span class="n">EventPipeProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">                <span class="s">&#34;Allocations-Run&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="n">EventLevel</span><span class="p">.</span><span class="n">Informational</span>
</span></span><span class="line"><span class="cl">                <span class="p">),</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The custom events from that provider are received via the <strong>source.Dynamic.All</strong> C# event:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">client</span> <span class="p">=</span> <span class="k">new</span> <span class="n">DiagnosticsClient</span><span class="p">(</span><span class="n">processId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">using</span> <span class="p">(</span><span class="kt">var</span> <span class="n">session</span> <span class="p">=</span> <span class="n">client</span><span class="p">.</span><span class="n">StartEventPipeSession</span><span class="p">(</span><span class="n">providers</span><span class="p">,</span> <span class="kc">false</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">Task</span> <span class="n">streamTask</span> <span class="p">=</span> <span class="n">Task</span><span class="p">.</span><span class="n">Run</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">source</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EventPipeEventSource</span><span class="p">(</span><span class="n">session</span><span class="p">.</span><span class="n">EventStream</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">_source</span> <span class="p">=</span> <span class="n">source</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">ClrTraceEventParser</span> <span class="n">clrParser</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ClrTraceEventParser</span><span class="p">(</span><span class="n">source</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">clrParser</span><span class="p">.</span><span class="n">GCAllocationTick</span> <span class="p">+=</span> <span class="n">OnAllocationTick</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">source</span><span class="p">.</span><span class="n">Dynamic</span><span class="p">.</span><span class="n">All</span> <span class="p">+=</span> <span class="n">OnEvents</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Because TraceEvent is not yet aware of the new <strong>AllocationSampled</strong> event emitted by the PR code, that event is also received by the same <strong>OnEvents</strong> handler:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">OnEvents</span><span class="p">(</span><span class="n">TraceEvent</span> <span class="n">eventData</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">ID</span> <span class="p">==</span> <span class="p">(</span><span class="n">TraceEventID</span><span class="p">)</span><span class="m">303</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// AllocationSampled parsing </span>
</span></span><span class="line"><span class="cl">        <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">ID</span> <span class="p">==</span> <span class="p">(</span><span class="n">TraceEventID</span><span class="p">)</span><span class="m">600</span><span class="p">)</span>  <span class="c1">// Start run</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// keep track of the expected types and the number of allocated instances </span>
</span></span><span class="line"><span class="cl">        <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">ID</span> <span class="p">==</span> <span class="p">(</span><span class="n">TraceEventID</span><span class="p">)</span><span class="m">601</span><span class="p">)</span>  <span class="c1">// Stop run</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// show the results of the run</span>
</span></span><span class="line"><span class="cl">        <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">ID</span> <span class="p">==</span> <span class="p">(</span><span class="n">TraceEventID</span><span class="p">)</span><span class="m">602</span><span class="p">)</span>  <span class="c1">// Start an iteration in a run</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// reset for a new iteration</span>
</span></span><span class="line"><span class="cl">        <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">ID</span> <span class="p">==</span> <span class="p">(</span><span class="n">TraceEventID</span><span class="p">)</span><span class="m">603</span><span class="p">)</span>  <span class="c1">// Stop an iteration in a run</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// Show iteration results</span>
</span></span><span class="line"><span class="cl">        <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The payload of the run-related events is parsed <a href="/posts/2024-08-13_tips-and-tricks-from/">the same way as for AllocationSampled</a>, by a dedicated <strong>xxxData</strong> class:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">AllocationsRunData</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">const</span> <span class="kt">int</span> <span class="n">EndOfStringCharLength</span> <span class="p">=</span> <span class="m">2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">TraceEvent</span> <span class="n">_payload</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">AllocationsRunData</span><span class="p">(</span><span class="n">TraceEvent</span> <span class="n">payload</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_payload</span> <span class="p">=</span> <span class="n">payload</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">ComputeFields</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">Iterations</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">Count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">AllocatedTypes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">ComputeFields</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">int</span> <span class="n">offsetBeforeString</span> <span class="p">=</span> <span class="m">4</span> <span class="p">+</span> <span class="m">4</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">Span</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;</span> <span class="n">data</span> <span class="p">=</span> <span class="n">_payload</span><span class="p">.</span><span class="n">EventData</span><span class="p">().</span><span class="n">AsSpan</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">Iterations</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToInt32</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="m">4</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="n">Count</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToInt32</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="m">4</span><span class="p">,</span> <span class="m">4</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="n">AllocatedTypes</span> <span class="p">=</span> <span class="n">Encoding</span><span class="p">.</span><span class="n">Unicode</span><span class="p">.</span><span class="n">GetString</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="n">offsetBeforeString</span><span class="p">,</span> <span class="n">_payload</span><span class="p">.</span><span class="n">EventDataLength</span> <span class="p">-</span> <span class="n">offsetBeforeString</span> <span class="p">-</span> <span class="n">EndOfStringCharLength</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>By keeping track of this data, it is possible to show the results of each iteration:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">&gt; starts 100 iterations allocating 1000000 instances
</span></span><span class="line"><span class="cl">0|
</span></span><span class="line"><span class="cl">Tag  SCount  TCount          SSize          TSize   UnitSize     UpscaledSize  UpscaledCount  Name
</span></span><span class="line"><span class="cl">--------------------------------------------------------------------------------------------------
</span></span><span class="line"><span class="cl"> ST     247     384           5928           9216         24         24702711        1029279  Object24
</span></span><span class="line"><span class="cl"> ST     322     106          10304           3392         32         32205122        1006410  Object32
</span></span><span class="line"><span class="cl"> ST     435     509          20880          24432         48         43510266         906463  Object48
</span></span><span class="line"><span class="cl"> ST     587     776          37568          49664         64         58718825         917481  Object64
</span></span><span class="line"><span class="cl"> ST     747     481          53784          34632         72         74726662        1037870  Object72
</span></span><span class="line"><span class="cl"> ST     958     916          91968          87936         96         95845392         998389  Object96
</span></span></code></pre></td></tr></table>
</div>
</div><p>These results also integrate the <strong>AllocationTick</strong> numbers.</p>
<p>At the end of the run, the error distribution per type is also computed over the iterations:</p>
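<p>To read the iteration table above, it can help to compare the <strong>UpscaledCount</strong> column with the 1000000 instances allocated per type. Here is a minimal sketch, assuming the error is the signed relative difference between the upscaled and the expected counts (the <strong>SamplingError</strong> class and its method are hypothetical names, not taken from the listener code):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Globalization;

public static class SamplingError
{
    // Assumption: signed relative error (in percent) of an upscaled count
    // against the number of instances that were really allocated.
    public static double RelativeErrorPercent(long upscaledCount, long expectedCount)
        =&gt; 100.0 * (upscaledCount - expectedCount) / expectedCount;

    public static void Main()
    {
        const long expectedCount = 1_000_000; // instances allocated per type in one iteration

        // (type name, UpscaledCount) pairs copied from the iteration table
        var rows = new (string Name, long UpscaledCount)[]
        {
            ("Object24", 1_029_279),
            ("Object32", 1_006_410),
            ("Object48", 906_463),
            ("Object64", 917_481),
            ("Object72", 1_037_870),
            ("Object96", 998_389),
        };

        foreach (var row in rows)
        {
            double error = RelativeErrorPercent(row.UpscaledCount, expectedCount);
            string formatted = error.ToString("F1", CultureInfo.InvariantCulture);
            Console.WriteLine(row.Name + "  " + formatted + " %");
        }
    }
}
</code></pre></div>
<p>Collecting one such value per iteration and sorting them per type would give a distribution of the same shape as the one computed at the end of the run.</p>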
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">Object72
</span></span><span class="line"><span class="cl">-------------------------
</span></span><span class="line"><span class="cl">   1  -10.5 %
</span></span><span class="line"><span class="cl">   2   -8.9 %
</span></span><span class="line"><span class="cl">   3   -8.4 %
</span></span><span class="line"><span class="cl">   4   -7.8 %
</span></span><span class="line"><span class="cl">   5   -6.4 %
</span></span><span class="line"><span class="cl">        ...
</span></span><span class="line"><span class="cl">  49    0.2 %
</span></span><span class="line"><span class="cl">  50    0.2 %
</span></span><span class="line"><span class="cl">  51    0.2 %
</span></span><span class="line"><span class="cl">        ...
</span></span><span class="line"><span class="cl">  96    6.8 %
</span></span><span class="line"><span class="cl">  97    6.8 %
</span></span><span class="line"><span class="cl">  98    7.7 %
</span></span><span class="line"><span class="cl">  99    8.6 %
</span></span><span class="line"><span class="cl"> 100   10.0 %
</span></span></code></pre></td></tr></table>
</div>
</div><p>You could use the same mechanisms (a custom <strong>EventSource</strong> to emit additional information and an <strong>EventPipe</strong> listener to aggregate the data) for your own needs. This is a different use of <strong>EventSource</strong> than emitting events for monitoring, as <a href="https://learn.microsoft.com/en-us/dotnet/core/diagnostics/well-known-event-providers#framework-libraries">is done by the BCL</a>.</p>
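<p>As an illustration of the emitting side, here is a minimal sketch of what such a custom event source could look like. The provider name and the event IDs (600 to 603) mirror what the listener code above expects; the method signatures and parameter names are assumptions and may differ from the real runner:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Diagnostics.Tracing;

// Sketch only: the name and event IDs match the listener shown above,
// the signatures are assumed for the sake of the example.
[EventSource(Name = "Allocations-Run")]
public sealed class AllocationsRunEventSource : EventSource
{
    public static readonly AllocationsRunEventSource Log = new AllocationsRunEventSource();

    // Payload layout: two 32-bit integers followed by a UTF-16 string,
    // which is what AllocationsRunData expects to parse.
    [Event(600, Level = EventLevel.Informational)]
    public void StartRun(int iterations, int count, string allocatedTypes)
        =&gt; WriteEvent(600, iterations, count, allocatedTypes);

    [Event(601, Level = EventLevel.Informational)]
    public void StopRun() =&gt; WriteEvent(601);

    [Event(602, Level = EventLevel.Informational)]
    public void StartIteration(int iteration) =&gt; WriteEvent(602, iteration);

    [Event(603, Level = EventLevel.Informational)]
    public void StopIteration(int iteration) =&gt; WriteEvent(603, iteration);
}
</code></pre></div>
<p>Nothing is emitted until a listener enables the provider, so the calls stay cheap when no session is attached.</p>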
<h2 id="testing-the-standalone-gc">Testing the standalone GC</h2>
<p>In addition to the usual .NET GC, I also needed to validate that the changes work with the <em>standalone GC</em>. <a href="https://github.com/dotnet/runtime/blob/main/docs/design/features/standalone-gc-loading.md#identifying-candidate-shared-libraries">Long story short</a>, it is possible to replace the default .NET garbage collector with your own implementation. For the test, I needed to check that the standalone GC clrgcexp.dll produced by the .NET build emits the expected <strong>AllocationSampled</strong> events when the corresponding keyword is enabled with informational verbosity.</p>
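<p>One way to run such a test is to start the application with the <strong>DOTNET_GCName</strong> environment variable pointing to the standalone GC library, which has to sit next to the runtime. The following is only a sketch: the <strong>StandaloneGcLauncher</strong> helper and its parameters are hypothetical, and this is not necessarily how the PR was validated:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Diagnostics;

public static class StandaloneGcLauncher
{
    // Builds the start info of a child process that should load a standalone GC.
    // 'appPath' and 'gcLibrary' are hypothetical parameters for this example.
    public static ProcessStartInfo CreateStartInfo(string appPath, string gcLibrary)
    {
        // Environment variables are only applied when the shell is not used
        var startInfo = new ProcessStartInfo(appPath) { UseShellExecute = false };

        // DOTNET_GCName tells the runtime which GC library to load at startup
        startInfo.Environment["DOTNET_GCName"] = gcLibrary;
        return startInfo;
    }
}
</code></pre></div>
<p>Passing the result to <strong>Process.Start</strong> with "clrgcexp.dll" as the library name would start the application on the standalone GC, and the listener could then attach to the returned process id.</p>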
<h2 id="debugging-nativeaot-scenarios">Debugging NativeAOT scenarios</h2>
<p>The final step was to implement the feature for the NativeAOT scenario. When you build your C# application for NativeAOT, a lot happens behind the scenes, based on compilers known by Visual Studio corresponding to the official released version of the .NET runtime. In my case, I needed to use the brand-new code of my local branch and debug some simple C# applications. The steps to reach that goal are not that simple.</p>
<p>First, you follow the steps <a href="https://github.com/dotnet/runtime/blob/main/docs/workflow/building/coreclr/nativeaot.md#convenience-visual-studio-repro-project">given by the documentation</a> to build AOT CLR in debug and libs in release:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="cl"><span class="n">build</span><span class="w"> </span><span class="n">clr</span><span class="p">.</span><span class="n">aot</span><span class="o">+</span><span class="n">lib</span><span class="w"> </span><span class="o">-</span><span class="n">rc</span><span class="w"> </span><span class="n">debug</span><span class="w"> </span><span class="o">-</span><span class="n">lc</span><span class="w"> </span><span class="n">release</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">build</span><span class="w"> </span><span class="o">-</span><span class="k">c</span><span class="w"> </span><span class="n">release</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Then, open src\coreclr\tools\aot\ilc.sln.</p>
<p>The repro project contains a program.cs file where you write the C# code you want to test and debug.</p>
<p>When you build a NativeAOT application, you need to choose whether you want a runtime that emits events or not. This is done by setting <strong>EventSourceSupport</strong> to true in the .csproj:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="nt">&lt;EventSourceSupport&gt;</span>true<span class="nt">&lt;/EventSourceSupport&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>However, with the repro project, you need to change <strong>ILCompiler.csproj</strong> in a different way:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="nt">&lt;ReproResponseLines</span> <span class="na">Include=</span><span class="s">&#34;--feature:System.Diagnostics.Tracing.EventSource.IsSupported=true&#34;</span> <span class="nt">/&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Also, change the reproNative.vcxproj file to link against <strong>eventpipe-enabled.lib</strong> instead of <strong>eventpipe-disabled.lib</strong> for the platform/configuration you want to debug:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="nt">&lt;ItemDefinitionGroup</span> <span class="na">Condition=</span><span class="s">&#34;&#39;$(Configuration)|$(Platform)&#39;==&#39;Debug|x64&#39;&#34;</span><span class="nt">&gt;</span>
</span></span><span class="line"><span class="cl">    ...
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;Link&gt;</span>
</span></span><span class="line"><span class="cl">      ...
</span></span><span class="line"><span class="cl">      <span class="nt">&lt;AdditionalDependencies&gt;</span>...$(ArtifactsRoot)bin\coreclr\windows.x64.Debug\aotsdk\eventpipe-enabled.lib;...<span class="nt">&lt;/AdditionalDependencies&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;/Link&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Then, build <strong>repro</strong> in Debug x64.</p>
<p>Next, change the target for the <strong>ILCompiler</strong> project:</p>
<p><img loading="lazy" src="/posts/2024-09-13_unexpected-usage-of-eventsourc/1_GD25I4_RbCY1-6vEV7zetw.png"></p>
<p>Build and run it to generate the .obj file corresponding to the <strong>repro</strong> project.</p>
<p>Finally, open src\coreclr\tools\aot\ILCompiler\reproNative\reproNative.vcxproj, which will allow you to debug the program.cs you’ve just built!</p>
]]></content:encoded></item><item><title>Tips and tricks from validating a Pull Request in .NET CLR</title><link>https://chrisnas.github.io/posts/2024-08-13_tips-and-tricks-from/</link><pubDate>Tue, 13 Aug 2024 09:23:28 +0000</pubDate><guid>https://chrisnas.github.io/posts/2024-08-13_tips-and-tricks-from/</guid><description>Performance and validation in a .NET CLR pull request</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>During the implementation of our .NET allocation profiler, we realized that the current sampling mechanism based on a fixed threshold did not provide a good enough statistical distribution. With the help of <a href="https://x.com/noahsfalk">Noah Falk</a> from the CLR Diagnostics team, I started to implement a randomized sampling based on a <a href="https://github.com/dotnet/runtime/blob/ce40d3df8fb2d13750acfb075acc2c2adb3c8812/docs/design/features/RandomizedAllocationSampling.md#the-sampling-model">Bernoulli distribution model</a> for .NET.</p>
<p><img loading="lazy" src="/posts/2024-08-13_tips-and-tricks-from/1_x2tSxxCnXqoEO8rW9nHW0A.png"></p>
<p>With this kind of change, you need to ensure that you don’t break any existing code, that the impact on performance is limited, and that the results match the expected mathematical distribution.</p>
<p>The rest of this blog series details the different tests I wrote and the corresponding tips and tricks that could be reused when you write C# code.</p>
<h2 id="testing-thebasics">Testing the basics</h2>
<p>From a high-level view, the code change does something simple: each time an allocation context is needed to fulfill an allocation, the code checks if it should be sampled. In that case, a new <strong>AllocationSampled</strong> event is emitted with the same information as the existing <strong>AllocationTick</strong> event plus an additional field. So, the first level of testing is to validate that the events are emitted when the keyword and verbosity are enabled for the .NET runtime provider.</p>
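<p>To make the sampling decision more concrete, here is a simplified sketch of a Bernoulli-style sampler in which every allocated byte has the same probability of being picked. It is not the runtime implementation: the <strong>BernoulliByteSampler</strong> class is hypothetical, the mean distance between samples is a parameter chosen by the caller, and the distance to the next sampled byte is approximated with an exponential draw:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical sampler, NOT the CLR code: each allocated byte is treated as an
// independent trial with probability 1 / meanBytesBetweenSamples.
public sealed class BernoulliByteSampler
{
    private readonly Random _random;
    private readonly double _meanBytesBetweenSamples;
    private long _bytesUntilNextSample;

    public BernoulliByteSampler(double meanBytesBetweenSamples, int seed)
    {
        _meanBytesBetweenSamples = meanBytesBetweenSamples;
        _random = new Random(seed);
        _bytesUntilNextSample = NextDistance();
    }

    // Number of bytes (at least 1) before the next sampled byte, using the
    // exponential approximation of a geometric distribution.
    private long NextDistance()
    {
        double u = _random.NextDouble(); // in [0, 1)
        return (long)(-Math.Log(1.0 - u) * _meanBytesBetweenSamples) + 1;
    }

    // Returns true when the next sampled byte falls inside this allocation.
    public bool ShouldSample(int size)
    {
        if (size &lt; _bytesUntilNextSample)
        {
            _bytesUntilNextSample -= size;
            return false;
        }

        // The trials are memoryless, so a new distance is simply drawn again
        _bytesUntilNextSample = NextDistance();
        return true;
    }
}
</code></pre></div>
<p>Compared with a fixed threshold, the randomized distance avoids always sampling the same allocations when a program allocates in a regular pattern.</p>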
<p>The runtime already has some tests in place, under the <strong>\src\tests\tracing\eventpipe</strong> folder, to validate that some events are emitted. Here is the code of my xUnit test that mimics the existing ones such as <strong>simpleruntimeeventvalidation</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[Fact]</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="kt">int</span> <span class="n">TestEntryPoint</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// check that AllocationSampled events are generated and size + type name are correct</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">ret</span> <span class="p">=</span> <span class="n">IpcTraceTest</span><span class="p">.</span><span class="n">RunAndValidateEventCounts</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="k">new</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="n">ExpectedEventCount</span><span class="p">&gt;()</span> <span class="p">{</span> <span class="p">{</span> <span class="s">&#34;Microsoft-Windows-DotNETRuntime&#34;</span><span class="p">,</span> <span class="p">-</span><span class="m">1</span> <span class="p">}</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="n">_eventGeneratingActionForAllocations</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// AllocationSamplingKeyword (0x80000000000): 0b1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000</span>
</span></span><span class="line"><span class="cl">        <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">EventPipeProvider</span><span class="p">&gt;()</span> <span class="p">{</span> <span class="k">new</span> <span class="n">EventPipeProvider</span><span class="p">(</span><span class="s">&#34;Microsoft-Windows-DotNETRuntime&#34;</span><span class="p">,</span> <span class="n">EventLevel</span><span class="p">.</span><span class="n">Informational</span><span class="p">,</span> <span class="m">0x80000000000</span><span class="p">)</span> <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="m">1024</span><span class="p">,</span> <span class="n">_DoesTraceContainEnoughAllocationSampledEvents</span><span class="p">,</span> <span class="n">enableRundownProvider</span><span class="p">:</span> <span class="kc">false</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">ret</span> <span class="p">!=</span> <span class="m">100</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">ret</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="m">100</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>IpcTraceTest.RunAndValidateEventCounts</strong> helper method accepts:</p>
<ul>
<li>The list of providers to enable with which keyword and verbosity level.</li>
<li>How many events are expected (using -1 in my case because I can’t predict how many random events will be generated)</li>
<li>A callback with the code that will generate events (allocating a lot of instances of a custom type in my case)</li>
<li>A callback that looks at emitted events</li>
</ul>
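<p>For illustration, the event-generating callback can be as simple as the following sketch. It is not the actual <strong>_eventGeneratingActionForAllocations</strong> code from the runtime repository: the <strong>Object128</strong> layout, the <strong>AllocationGenerator</strong> helper, the callback shape and the number of allocations are all assumptions made for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical custom type: only its name matters to the validation callback.
// The properties just give each instance some payload (the exact size is not important here).
internal class Object128
{
    public long First { get; set; }
    public long Second { get; set; }
}

internal static class AllocationGenerator
{
    // Allocates 'count' instances and keeps them reachable until the method returns,
    // so that they are real allocations on the managed heap.
    public static int AllocateMany(int count)
    {
        var keepAlive = new List&lt;Object128&gt;(count);
        for (int i = 0; i &lt; count; i++)
        {
            keepAlive.Add(new Object128());
        }
        return keepAlive.Count;
    }

    // Assumed callback shape (a parameterless Action); check the helper's real signature before reusing it.
    public static readonly Action GenerateAllocations = () =&gt; AllocateMany(200_000);
}
</code></pre></div>
<p>The more instances are allocated, the more likely it is that at least one of them gets sampled, which is what the validation callback needs.</p>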
<p>The last callback code relies on TraceEvent to listen to emitted events:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="n">Func</span><span class="p">&lt;</span><span class="n">EventPipeEventSource</span><span class="p">,</span> <span class="n">Func</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">&gt;&gt;</span> <span class="n">_DoesTraceContainEnoughAllocationSampledEvents</span> <span class="p">=</span> <span class="p">(</span><span class="n">source</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">AllocationSampledEvents</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">Object128Count</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Dynamic</span><span class="p">.</span><span class="n">All</span> <span class="p">+=</span> <span class="p">(</span><span class="n">eventData</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">eventData</span><span class="p">.</span><span class="n">ID</span> <span class="p">==</span> <span class="p">(</span><span class="n">TraceEventID</span><span class="p">)</span><span class="m">303</span><span class="p">)</span>  <span class="c1">// AllocationSampled is not defined in TraceEvent yet</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">AllocationSampledEvents</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="n">AllocationSampledData</span> <span class="n">payload</span> <span class="p">=</span> <span class="k">new</span> <span class="n">AllocationSampledData</span><span class="p">(</span><span class="n">eventData</span><span class="p">,</span> <span class="n">source</span><span class="p">.</span><span class="n">PointerSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// uncomment to see the allocation events payload</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// Logger.logger.Log($&#34;{payload.HeapIndex} - {payload.AllocationKind} | ({payload.ObjectSize}) {payload.TypeName}  = 0x{payload.Address}&#34;);</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">payload</span><span class="p">.</span><span class="n">TypeName</span> <span class="p">==</span> <span class="s">&#34;Tracing.Tests.SimpleRuntimeEventValidation.Object128&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">Object128Count</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Logger</span><span class="p">.</span><span class="n">logger</span><span class="p">.</span><span class="n">Log</span><span class="p">(</span><span class="s">&#34;AllocationSampled counts validation&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Logger</span><span class="p">.</span><span class="n">logger</span><span class="p">.</span><span class="n">Log</span><span class="p">(</span><span class="s">&#34;Nb events: &#34;</span> <span class="p">+</span> <span class="n">AllocationSampledEvents</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Logger</span><span class="p">.</span><span class="n">logger</span><span class="p">.</span><span class="n">Log</span><span class="p">(</span><span class="s">&#34;Nb object128: &#34;</span> <span class="p">+</span> <span class="n">Object128Count</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="p">(</span><span class="n">AllocationSampledEvents</span> <span class="p">&gt;=</span> <span class="n">MinExpectedEvents</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">Object128Count</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span> <span class="p">?</span> <span class="m">100</span> <span class="p">:</span> <span class="p">-</span><span class="m">1</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In my case, I’m adding a new event that is emitted when a new keyword is enabled. It means that TraceEvent does not yet know its ID (hence the hardcoded <strong>303</strong> value) or how to unpack the new event payload. This is why I created the <strong>AllocationSampledData</strong> type to expose the payload as public fields:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">AllocationSampledData</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">const</span> <span class="kt">int</span> <span class="n">EndOfStringCharLength</span> <span class="p">=</span> <span class="m">2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">TraceEvent</span> <span class="n">_payload</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">int</span> <span class="n">_pointerSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">AllocationSampledData</span><span class="p">(</span><span class="n">TraceEvent</span> <span class="n">payload</span><span class="p">,</span> <span class="kt">int</span> <span class="n">pointerSize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_payload</span> <span class="p">=</span> <span class="n">payload</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_pointerSize</span> <span class="p">=</span> <span class="n">pointerSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">TypeName</span> <span class="p">=</span> <span class="s">&#34;?&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">ComputeFields</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">GCAllocationKind</span> <span class="n">AllocationKind</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">ClrInstanceID</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">UInt64</span> <span class="n">TypeID</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">TypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">HeapIndex</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">UInt64</span> <span class="n">Address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">long</span> <span class="n">ObjectSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">long</span> <span class="n">SampledByteOffset</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="c1">// The extraction of each field from the payload is done in the ComputeFields method:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// The payload of AllocationSampled is not defined in TraceEvent yet</span>
</span></span><span class="line"><span class="cl"><span class="c1">//</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;AllocationKind&#34; inType=&#34;win:UInt32&#34; map=&#34;GCAllocationKindMap&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;ClrInstanceID&#34; inType=&#34;win:UInt16&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;TypeID&#34; inType=&#34;win:Pointer&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;TypeName&#34; inType=&#34;win:UnicodeString&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;HeapIndex&#34; inType=&#34;win:UInt32&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;Address&#34; inType=&#34;win:Pointer&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;ObjectSize&#34; inType=&#34;win:UInt64&#34; outType=&#34;win:HexInt64&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//  &lt;data name=&#34;SampledByteOffset&#34; inType=&#34;win:UInt64&#34; outType=&#34;win:HexInt64&#34; /&gt;</span>
</span></span><span class="line"><span class="cl"><span class="c1">//</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">ComputeFields</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">int</span> <span class="n">offsetBeforeString</span> <span class="p">=</span> <span class="m">4</span> <span class="p">+</span> <span class="m">2</span> <span class="p">+</span> <span class="n">_pointerSize</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This <strong>offsetBeforeString</strong> value is the total size of the fields that precede the string: a <strong>UInt32</strong> (4 bytes), a <strong>UInt16</strong> (2 bytes) and a <strong>Pointer</strong> (4 bytes in 32-bit, 8 bytes in 64-bit). Next, a <strong>Span&lt;byte&gt;</strong> wraps the binary payload provided by TraceEvent:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">Span</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;</span> <span class="n">data</span> <span class="p">=</span> <span class="n">_payload</span><span class="p">.</span><span class="n">EventData</span><span class="p">().</span><span class="n">AsSpan</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since I know the size of each field from the payload definition in ClrEtwAll.man, the numeric fields are extracted thanks to the <strong>BitConverter</strong> methods:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">AllocationKind</span> <span class="p">=</span> <span class="p">(</span><span class="n">GCAllocationKind</span><span class="p">)</span><span class="n">BitConverter</span><span class="p">.</span><span class="n">ToInt32</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="m">4</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrInstanceID</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToInt16</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="m">4</span><span class="p">,</span> <span class="m">2</span><span class="p">));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Things start to be more complicated when you need to get the value of an address. Its size is 4 bytes in 32 bit and 8 bytes in 64 bit:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">_pointerSize</span> <span class="p">==</span> <span class="m">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">TypeID</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToUInt32</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="m">6</span><span class="p">,</span> <span class="n">_pointerSize</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">TypeID</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToUInt64</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="m">6</span><span class="p">,</span> <span class="n">_pointerSize</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The bitness of the monitored application is given by the <strong>EventPipeEventSource</strong>’s <strong>PointerSize</strong> property that is passed to the <strong>AllocationSampledData</strong> constructor.</p>
<p>For the string case, you need to know that it is stored as UTF-16 (so each character requires 2 bytes) with a trailing \0, and that its length is the total size of the payload minus the size of the other fields. That way, you can slice the Span to properly read the characters:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">//   \0 should not be included for GetString to work</span>
</span></span><span class="line"><span class="cl">        <span class="n">TypeName</span> <span class="p">=</span> <span class="n">Encoding</span><span class="p">.</span><span class="n">Unicode</span><span class="p">.</span><span class="n">GetString</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="n">offsetBeforeString</span><span class="p">,</span> <span class="n">_payload</span><span class="p">.</span><span class="n">EventDataLength</span> <span class="p">-</span> <span class="n">offsetBeforeString</span> <span class="p">-</span> <span class="n">EndOfStringCharLength</span> <span class="p">-</span> <span class="m">4</span> <span class="p">-</span> <span class="n">_pointerSize</span> <span class="p">-</span> <span class="m">8</span><span class="p">));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The rest of the fields are extracted with <strong>BitConverter</strong> helpers taking into account the size of the string:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">HeapIndex</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToInt32</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="n">offsetBeforeString</span> <span class="p">+</span> <span class="n">TypeName</span><span class="p">.</span><span class="n">Length</span> <span class="p">*</span> <span class="m">2</span> <span class="p">+</span> <span class="n">EndOfStringCharLength</span><span class="p">,</span> <span class="m">4</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_pointerSize</span> <span class="p">==</span> <span class="m">4</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">Address</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToUInt32</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="n">offsetBeforeString</span> <span class="p">+</span> <span class="n">TypeName</span><span class="p">.</span><span class="n">Length</span> <span class="p">*</span> <span class="m">2</span> <span class="p">+</span> <span class="n">EndOfStringCharLength</span> <span class="p">+</span> <span class="m">4</span><span class="p">,</span> <span class="n">_pointerSize</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">Address</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToUInt64</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="n">offsetBeforeString</span> <span class="p">+</span> <span class="n">TypeName</span><span class="p">.</span><span class="n">Length</span> <span class="p">*</span> <span class="m">2</span> <span class="p">+</span> <span class="n">EndOfStringCharLength</span> <span class="p">+</span> <span class="m">4</span><span class="p">,</span> <span class="n">_pointerSize</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">ObjectSize</span> <span class="p">=</span> <span class="n">BitConverter</span><span class="p">.</span><span class="n">ToInt64</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">Slice</span><span class="p">(</span><span class="n">offsetBeforeString</span> <span class="p">+</span> <span class="n">TypeName</span><span class="p">.</span><span class="n">Length</span> <span class="p">*</span> <span class="m">2</span> <span class="p">+</span> <span class="n">EndOfStringCharLength</span> <span class="p">+</span> <span class="m">4</span> <span class="p">+</span> <span class="n">_pointerSize</span><span class="p">,</span> <span class="m">8</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The name of the sampled allocated type from the parsed payload is used to ensure that the expected allocations are indeed emitted when the keyword/verbosity are enabled for the .NET provider.</p>
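<p>As a recap, the decoding steps can be condensed into a self-contained sketch. It is a variation on the approach, not the code from the test: it uses <strong>BinaryPrimitives</strong> with a running offset and looks for the terminating \0 instead of deriving the string length from the payload size. The <strong>SampledAllocation</strong> and <strong>AllocationSampledParser</strong> names are hypothetical, and the field order is assumed to be the one listed in the manifest comment shown earlier.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical container for the decoded fields (names chosen for this example).
internal readonly record struct SampledAllocation(
    uint AllocationKind, ushort ClrInstanceID, ulong TypeID, string TypeName,
    uint HeapIndex, ulong Address, ulong ObjectSize, ulong SampledByteOffset);

internal static class AllocationSampledParser
{
    // Assumed layout: UInt32, UInt16, Pointer, UTF-16 string + \0, UInt32, Pointer, UInt64, UInt64.
    // A truncated payload makes one of the Slice calls throw ArgumentOutOfRangeException.
    public static SampledAllocation Parse(ReadOnlySpan&lt;byte&gt; data, int pointerSize)
    {
        int offset = 0;

        uint kind = BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(offset, 4));
        offset += 4;
        ushort clrInstanceId = BinaryPrimitives.ReadUInt16LittleEndian(data.Slice(offset, 2));
        offset += 2;
        ulong typeId = ReadPointer(data, ref offset, pointerSize);

        // Walk the UTF-16 code units until the two zero bytes of the terminator.
        ReadOnlySpan&lt;byte&gt; rest = data.Slice(offset);
        int byteLength = 0;
        while (byteLength + 1 &lt; rest.Length &amp;&amp; (rest[byteLength] != 0 || rest[byteLength + 1] != 0))
        {
            byteLength += 2;
        }
        string typeName = Encoding.Unicode.GetString(rest.Slice(0, byteLength));
        offset += byteLength + 2; // skip the string and its terminator

        uint heapIndex = BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(offset, 4));
        offset += 4;
        ulong address = ReadPointer(data, ref offset, pointerSize);
        ulong objectSize = BinaryPrimitives.ReadUInt64LittleEndian(data.Slice(offset, 8));
        offset += 8;
        ulong sampledByteOffset = BinaryPrimitives.ReadUInt64LittleEndian(data.Slice(offset, 8));

        return new SampledAllocation(kind, clrInstanceId, typeId, typeName,
            heapIndex, address, objectSize, sampledByteOffset);
    }

    // Reads a 4-byte or 8-byte pointer depending on the bitness of the monitored process.
    private static ulong ReadPointer(ReadOnlySpan&lt;byte&gt; data, ref int offset, int pointerSize)
    {
        ulong value = pointerSize == 4
            ? BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(offset, 4))
            : BinaryPrimitives.ReadUInt64LittleEndian(data.Slice(offset, 8));
        offset += pointerSize;
        return value;
    }
}
</code></pre></div>
<p>Scanning for the terminator keeps the parsing of the string independent from the number and size of the fields that follow it, at the cost of one extra pass over the characters.</p>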
<h2 id="testing-the-performance-impact">Testing the performance impact</h2>
<p>The next step was to validate the impact of the changes on the GC performance. The baseline was the .NET 9 branch before the changes, built in Release. The GCPerfSim library from the <a href="https://github.com/dotnet/performance">performance repository</a> was used to allocate 500 GB of mixed size objects on 4 threads with a 50 MB live object size. From the output, the <strong>seconds_taken</strong> line provides the duration to allocate these objects.</p>
<p>To ensure that you run with the rebuilt branch, you need to use the following commands:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">build.cmd clr+libs -c release
</span></span><span class="line"><span class="cl">src<span class="se">\t</span>ests<span class="se">\b</span>uild.cmd generatelayoutonly Release
</span></span></code></pre></td></tr></table>
</div>
</div><p>The next step is to use <strong>&lt;repo&gt;\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\corerun.exe</strong> instead of the usual <strong>dotnet.exe</strong> like the following:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">&lt;clr repo&gt;<span class="se">\a</span>rtifacts<span class="se">\t</span>ests<span class="se">\c</span>oreclr<span class="se">\w</span>indows.x64.Release<span class="se">\T</span>ests<span class="se">\C</span>ore_Root<span class="se">\c</span>orerun &lt;performance repo&gt;<span class="se">\a</span>rtifacts<span class="se">\b</span>in<span class="se">\G</span>CPerfSim<span class="se">\R</span>elease<span class="se">\n</span>et7.0<span class="se">\G</span>CPerfSim.dll -tc <span class="m">4</span> -tagb <span class="m">500</span> -tlgb 0.05 -lohar <span class="m">0</span> -sohsi <span class="m">0</span> -lohsi <span class="m">0</span> -pohsi <span class="m">0</span> -sohpi <span class="m">0</span> -lohpi <span class="m">0</span> -sohfi <span class="m">0</span> -lohfi <span class="m">0</span> -pohfi <span class="m">0</span> -allocType reference -testKind <span class="nb">time</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I run this scenario 10 times to compute the median and the average. I’m doing the same for the PR branch. So far so good. Now, how to do the same while measuring the impact of the random sampling? Remember that the code only triggers if the .NET provider is enabled with a certain keyword and verbosity. It means that you have to use a tool such as dotnet-trace to start an event pipe session, but you would need the process id. I could have changed the code of GCPerfSim to show the process id but I would still need to wait for the session to be created before starting the <strong>seconds_taken</strong> computation. Not really easy to script 10 runs that way…</p>
<p>Don’t worry! dotnet-trace can launch the process itself when its command line is passed after <strong>--</strong>, so the session starts as the process starts. The <strong>--show-child-io true</strong> argument keeps the console output of the launched process visible, and <strong>--providers</strong> allows you to enable a provider the way you want. Here is an example of the command line used for the performance runs:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">dotnet-trace collect --show-child-io <span class="nb">true</span> --providers Microsoft-Windows-DotNETRuntime:0x80000000000:4 -- corerun &lt;performance repo&gt;<span class="se">\a</span>rtifacts<span class="se">\b</span>in<span class="se">\G</span>CPerfSim<span class="se">\R</span>elease<span class="se">\n</span>et7.0<span class="se">\G</span>CPerfSim.dll …
</span></span></code></pre></td></tr></table>
</div>
</div><p>These dotnet-trace features are very handy for any scripting scenario, even one unrelated to testing the CLR. For example, you could later use PerfView to analyze how an application behaves thanks to the emitted events stored in the generated .nettrace file!</p>
<p>The next episode will describe an unexpected usage of <strong>EventSource</strong> and the debugging of a NativeAOT scenario.</p>
]]></content:encoded></item><item><title>View your GCs statistics live with dotnet-gcstats!</title><link>https://chrisnas.github.io/posts/2024-03-01_view-your-gcs-statistics/</link><pubDate>Fri, 01 Mar 2024 18:30:13 +0000</pubDate><guid>https://chrisnas.github.io/posts/2024-03-01_view-your-gcs-statistics/</guid><description>Discover how to look at the .NET GC statistics to better understand your garbage collections</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>While working on the second edition of <a href="https://www.amazon.com/Pro-NET-Memory-Management-Performance/dp/148424026X">Pro .NET Memory Management</a>, I needed to get statistics about each garbage collection to explain the condemned generation and other decisions taken by the GC. This post explains the different internal data structures used by the GC and how to get their values for each collection. Some require debugging the CLR and others are emitted via events. For the latter, I will show how I wrote the new <strong>dotnet-gcstats</strong> CLI tool to collect them: a personal PerfView GCStats displaying live data, garbage collection after garbage collection.</p>
<h2 id="high-level-view-of-gc-internals">High level view of GC Internals</h2>
<p>With regions, the GC keeps track of managed memory allocated by your application in instances of the <strong>gc_heap</strong> class. In Workstation mode, only 1 instance exists and in Server mode, by default, 1 instance is created per core. Each <strong>gc_heap</strong> keeps track of its 5 generations (gen0, gen1, gen2, Large Object Heap and Pinned Object Heap) in an array of 5 <strong>generation</strong> instances. Each generation references its dedicated regions wrapped by instances of <strong>heap_segment</strong>. These regions are reserved from a giant part of the process address space and committed as needed.</p>
<p><img loading="lazy" src="/posts/2024-03-01_view-your-gcs-statistics/1_RVd5JiSJBJzqCqqYjUJBog.png"></p>
<p>During a garbage collection, the GC code relies on global fields per <strong>gc_heap</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">static</span> <span class="n">gc_mechanisms</span> <span class="n">settings</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">gc_history_global</span> <span class="n">gc_data_global</span><span class="p">;</span>  <span class="c1">// for non background GC including foreground GC during a background
</span></span></span><span class="line"><span class="cl"><span class="n">gc_history_global</span> <span class="n">bgc_data_global</span><span class="p">;</span> <span class="c1">// for background GC only
</span></span></span><span class="line"><span class="cl"><span class="k">static</span> <span class="n">dynamic_data</span> <span class="n">dynamic_data_table</span><span class="p">[</span><span class="n">total_generation_count</span> <span class="o">=</span> <span class="mi">5</span><span class="p">];</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>settings</strong> field contains a few interesting fields:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">gc_mechanisms</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">size_t</span> <span class="n">gc_index</span><span class="p">;</span> <span class="c1">// starts from 1 for the first GC
</span></span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">condemned_generation</span><span class="p">;</span>  <span class="c1">// generation to collect
</span></span></span><span class="line"><span class="cl">    <span class="n">BOOL</span> <span class="n">compaction</span><span class="p">;</span>  <span class="c1">// true when compaction instead of sweep
</span></span></span><span class="line"><span class="cl">    <span class="n">BOOL</span> <span class="n">loh_compaction</span><span class="p">;</span>  <span class="c1">// true when LOH needs compaction
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">concurrent</span><span class="p">;</span>  <span class="c1">// 1 = concurrent/background GC 
</span></span></span><span class="line"><span class="cl">    <span class="n">gc_reason</span> <span class="n">reason</span><span class="p">;</span>  <span class="c1">// trigger reason
</span></span></span><span class="line"><span class="cl">    <span class="n">gc_pause_mode</span> <span class="n">pause_mode</span><span class="p">;</span>  <span class="c1">// see GCSettings.LatencyMode
</span></span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The trigger reason tells you whether your code called <strong>GC.Collect</strong> (i.e. induced) or whether the collection was due to a LOH or SOH allocation, for example. If <strong>compaction</strong> is true, a compacting GC will happen (instead of a sweeping one).</p>
<p>The <strong>gc_data_global</strong> and <strong>bgc_data_global</strong> fields contain almost the same information:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">gc_history_global</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">num_heaps</span><span class="p">;</span>  <span class="c1">// number of gc_heap instances
</span></span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">condemned_generation</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">gc_reason</span> <span class="n">reason</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">pause_mode</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">mem_pressure</span><span class="p">;</span> 
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">global_mechanisms_p</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Most of the fields are available from different events:</p>
<ul>
<li><strong>GCStart</strong>: <strong>gc_index</strong> in <strong>Count</strong>, <strong>condemned_generation</strong> in <strong>Depth</strong>, <strong>reason</strong> in <strong>Reason</strong></li>
<li><strong>GCGlobalHeapHistory</strong>: <strong>pause_mode</strong> in <strong>PauseMode</strong> and some others in <strong>GlobalMechanisms</strong></li>
</ul>
<h2 id="which-generation-to-collect--condemned-generation">Which generation to collect = condemned generation</h2>
<p>The computation of the <strong>condemned_generation</strong> is complicated and relies on many factors including metrics stored for each “generation” (gen0, gen1, gen2, LOH and POH) in an array of <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/gc/gcpriv.h#L1058"><strong>dynamic_data</strong></a> called <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/gc/gcpriv.h#L3627"><strong>dynamic_data_table</strong></a>. The <strong>dynamic_data</strong> class contains a few fields used by the GC to make decisions such as when a collection should be triggered and which generation to condemn:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">dynamic_data</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">ptrdiff_t</span> <span class="n">new_allocation</span><span class="p">;</span>     <span class="c1">// remaining budget = budget - allocated
</span></span></span><span class="line"><span class="cl">    <span class="n">size_t</span>    <span class="n">desired_allocation</span><span class="p">;</span> <span class="c1">// budget to trigger a GC
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// # of bytes taken by survived objects after mark.
</span></span></span><span class="line"><span class="cl">    <span class="n">size_t</span>    <span class="n">survived_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// # of bytes taken by survived pinned plugs after mark.
</span></span></span><span class="line"><span class="cl">    <span class="n">size_t</span>    <span class="n">pinned_survived_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// total object size after a GC, ie, doesn&#39;t include fragmentation
</span></span></span><span class="line"><span class="cl">    <span class="n">size_t</span>    <span class="n">current_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">size_t</span>    <span class="n">promoted_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">size_t</span>    <span class="n">fragmentation</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Most of these fields are found in the payload of <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/ClrEtwAll.man#L1296">GCPerHeapHistory</a> or <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/ClrEtwAll.man#L962">GCHeapStat</a> events. However, the most interesting one, <strong>new_allocation</strong>, is not available. Why is it interesting? Because it would tell you which generation had its budget exceeded. It is initialized with the generation budget at the end of a GC and then, each time an allocation context gets created, its size is deducted from it. When it reaches 0 or becomes negative, it means that the budget is exceeded, and a collection should happen.</p>
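<p>To make the budget mechanism concrete, here is a minimal C++ sketch of the accounting described above. The <strong>GenerationBudget</strong> type and the <strong>contexts_until_exceeded</strong> helper are hypothetical names used for illustration only; this is a simplified model, not the CLR implementation, and it assumes that every allocation context has the same size.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of one generation's budget
// (it only mirrors two fields of the dynamic_data class shown above).
struct GenerationBudget {
    std::ptrdiff_t new_allocation;    // remaining budget
    std::size_t    desired_allocation; // full budget

    // Reset the remaining budget to the full budget, as done at the end of a GC.
    void reset() { new_allocation = static_cast<std::ptrdiff_t>(desired_allocation); }

    // Deduct the size of a newly created allocation context.
    // Returns true when the budget is exhausted (0 or negative).
    bool charge(std::size_t context_size) {
        new_allocation -= static_cast<std::ptrdiff_t>(context_size);
        return new_allocation <= 0;
    }
};

// Count how many allocation contexts of the same size can be created
// before the budget is exhausted (the last one counted is the one that exhausts it).
int contexts_until_exceeded(std::size_t budget, std::size_t context_size) {
    assert(context_size > 0); // a zero-sized context would never consume the budget
    GenerationBudget gen{0, budget};
    gen.reset();
    int count = 0;
    do {
        ++count;
    } while (!gen.charge(context_size));
    return count;
}
```

<p>With a budget of 100 bytes and allocation contexts of 40 bytes, the third context pushes the remaining budget to -20, which is the kind of negative <strong>new_allocation</strong> value visible in the log below.</p>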
<p>Since I needed to debug the CLR to better understand all these algorithms, I added a breakpoint at the beginning of <strong>gc_heap::garbage_collect</strong> with the following action:</p>
<pre tabindex="0"><code>#{settings.gc_index}[{gc_trigger_reason}]{&#34;\n&#34;,s8b} new_allocation(0) = {dynamic_data_table[0].new_allocation}{&#34;\n&#34;,s8b} desired_allocation(0) = {dynamic_data_table[0].desired_allocation}{&#34;\n&#34;,s8b} begin_data_size(0) = {dynamic_data_table[0].begin_data_size}{&#34;\n&#34;,s8b} promoted_size(0) = {dynamic_data_table[0].promoted_size}{&#34;\n&#34;,s8b}-{&#34;\n&#34;,s8b} new_allocation(1) = 
...
{dynamic_data_table[4].new_allocation}{&#34;\n&#34;,s8b} desired_allocation(4) = {dynamic_data_table[4].desired_allocation}{&#34;\n&#34;,s8b} begin_data_size(4) = {dynamic_data_table[4].begin_data_size}{&#34;\n&#34;,s8b} promoted_size(4) = {dynamic_data_table[4].promoted_size}{&#34;\n&#34;,s8b}__________{&#34;\n&#34;,s8b}
</code></pre><p>And now, each time a GC happens, I get the corresponding log in my Output pane in Visual Studio:</p>
<pre tabindex="0"><code>#2[reason_alloc_soh (0)]
 new_allocation(0) = -22728
 desired_allocation(0) = 134217728
 begin_data_size(0) = 8391376
 promoted_size(0) = 8383432
-
 new_allocation(1) = -5910416
 desired_allocation(1) = 2473016
 begin_data_size(1) = 375528
 promoted_size(1) = 353288
-
 new_allocation(2) = -91144
 desired_allocation(2) = 262144
 begin_data_size(2) = 0
 promoted_size(2) = 0
-
 new_allocation(3) = 28000088
 desired_allocation(3) = 28000088
 begin_data_size(3) = 8000024
 promoted_size(3) = 8000024
-
 new_allocation(4) = 3145728
 desired_allocation(4) = 3145728
 begin_data_size(4) = 32712
 promoted_size(4) = 32712
</code></pre><p>As you can see, gen0, gen1 and gen2 all have their budget exceeded (i.e. their <strong>new_allocation</strong> is negative), which explains why a simple gen0 collection (triggered by an allocation in SOH = gen0) becomes a gen2 collection. If you wonder how the gen1 and gen2 budgets can be exceeded when your application only allocates in gen0, you need to understand that when a GC promotes surviving objects from a younger generation to the older one, they are counted as allocations in the older generation and subtracted from its <strong>new_allocation</strong> metric.</p>
<p>The GC encodes the different steps leading to the final condemned generation in a 32-bit value stored in a <strong>gen_to_condemn_tuning</strong> field that allows you to get:</p>
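<p>The promotion accounting can be sketched in a few lines of C++. This is a hypothetical model, not the CLR code: the <strong>Budgets</strong> alias and the <strong>charge_promotion</strong> function are names invented for the illustration, and only gen0, gen1 and gen2 are represented.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model: remaining budgets (new_allocation) for gen0, gen1 and gen2.
using Budgets = std::array<std::ptrdiff_t, 3>;

// Charge the bytes promoted out of `from_gen` to the next older generation,
// exactly as if they had been allocated there.
void charge_promotion(Budgets& new_allocation, int from_gen, std::size_t promoted_bytes) {
    assert(from_gen >= 0 && from_gen < 2); // only gen0 -> gen1 and gen1 -> gen2 are modeled
    new_allocation[from_gen + 1] -= static_cast<std::ptrdiff_t>(promoted_bytes);
}
```

<p>For example, promoting 800 bytes out of gen0 when gen1 has 500 bytes of budget left brings the gen1 budget to -300, even though the application never allocated directly in gen1.</p>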
<ul>
<li>initial condemned generation,</li>
<li>final generation to condemn,</li>
<li>which generation’s budget is exceeded.</li>
</ul>
<p>The value of the last one corresponds to the highest generation for which its <strong>new_allocation</strong> was negative.</p>
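<p>As a small illustration, the following C++ sketch applies that rule to per-generation <strong>new_allocation</strong> values such as the ones in the log shown earlier. The function name is hypothetical, and restricting the search to gen0, gen1 and gen2 is an assumption made for the example; the real GC logic takes more factors into account.</p>

```cpp
#include <array>
#include <cstddef>

// Returns the highest generation (0, 1 or 2) whose remaining budget is negative,
// or -1 when no budget is exceeded.
int highest_exceeded_generation(const std::array<std::ptrdiff_t, 3>& new_allocation) {
    for (int gen = 2; gen >= 0; --gen) {
        if (new_allocation[gen] < 0) {
            return gen;
        }
    }
    return -1;
}
```

<p>Called with the values -22728, -5910416 and -91144 taken from the log, the function returns 2, which matches the gen2 collection described above.</p>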
<p>This information is available in the <strong>CondemnReasons0</strong> field of the <strong>GCPerHeapHistory</strong> event, and you need some arithmetic to get the generation you want:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">const</span> <span class="kt">int</span> <span class="n">gen_initial</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>          <span class="c1">// indicates the initial gen to condemn.</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">const</span> <span class="kt">int</span> <span class="n">gen_final_per_heap</span> <span class="p">=</span> <span class="m">1</span><span class="p">;</span>   <span class="c1">// indicates the final gen to condemn per heap.</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">const</span> <span class="kt">int</span> <span class="n">gen_alloc_budget</span> <span class="p">=</span> <span class="m">2</span><span class="p">;</span>     <span class="c1">// indicates which gen&#39;s budget is exceeded.</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">const</span> <span class="kt">int</span> <span class="n">InitialGenMask</span> <span class="p">=</span> <span class="m">0x0</span> <span class="p">+</span> <span class="m">0x1</span> <span class="p">+</span> <span class="m">0x2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">static</span> <span class="kt">int</span> <span class="n">GetGen</span><span class="p">(</span><span class="kt">int</span> <span class="n">val</span><span class="p">,</span> <span class="kt">int</span> <span class="n">reason</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">gen</span> <span class="p">=</span> <span class="p">(</span><span class="n">val</span> <span class="p">&gt;&gt;</span> <span class="m">2</span> <span class="p">*</span> <span class="n">reason</span><span class="p">)</span> <span class="p">&amp;</span> <span class="n">InitialGenMask</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">gen</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="building-your-owntool">Building your own tool</h2>
<p>Even though I could dig into the different metrics available in PerfView’s GCStats view or its export to Excel, I decided to write dotnet-gcstats. This CLI tool listens to the CLR events emitted by a .NET application thanks to Microsoft.Diagnostics.NETCore.Client (to connect to the application’s EventPipe) and TraceEvent (to receive and analyze the CLR events).</p>
<p>The code is amazingly simple:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">providers</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">EventPipeProvider</span><span class="p">&gt;()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">new</span> <span class="n">EventPipeProvider</span><span class="p">(</span><span class="s">&#34;Microsoft-Windows-DotNETRuntime&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">EventLevel</span><span class="p">.</span><span class="n">Informational</span><span class="p">,</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GC</span><span class="p">),</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">client</span> <span class="p">=</span> <span class="k">new</span> <span class="n">DiagnosticsClient</span><span class="p">(</span><span class="n">processId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">using</span> <span class="p">(</span><span class="kt">var</span> <span class="n">session</span> <span class="p">=</span> <span class="n">client</span><span class="p">.</span><span class="n">StartEventPipeSession</span><span class="p">(</span><span class="n">providers</span><span class="p">,</span> <span class="kc">false</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">Task</span> <span class="n">streamTask</span> <span class="p">=</span> <span class="n">Task</span><span class="p">.</span><span class="n">Run</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">source</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EventPipeEventSource</span><span class="p">(</span><span class="n">session</span><span class="p">.</span><span class="n">EventStream</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span> <span class="n">clrParser</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ClrTraceEventParser</span><span class="p">(</span><span class="n">source</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">clrParser</span><span class="p">.</span><span class="n">GCPerHeapHistory</span> <span class="p">+=</span> <span class="n">OnGCPerHeapHistory</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">clrParser</span><span class="p">.</span><span class="n">GCStart</span> <span class="p">+=</span> <span class="n">OnGCStart</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">clrParser</span><span class="p">.</span><span class="n">GCGlobalHeapHistory</span> <span class="p">+=</span> <span class="n">OnGCGlobalHeapHistory</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">try</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">ShowError</span><span class="p">(</span><span class="s">$&#34;Error encountered while processing events: {e.Message}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each event handler is responsible for extracting and translating the interesting fields of its event payload with a few color enhancements:</p>
<ul>
<li><strong>GCStart</strong>: collection count and reason (highlight induced collections).</li>
<li><strong>GCGlobalHeapHistory</strong>: condemned generation, pause mode and memory pressure.</li>
<li><strong>GCPerHeapHistory</strong>: starting -&gt; final condemned generation and for each heap, budget, begin size, begin obj size, final size, promoted size and fragmentation.</li>
</ul>
<p>The final step was to transform a simple console application into a .NET CLI tool that everyone will be able to install with <strong>dotnet tool install -g dotnet-gcstats</strong> and use with <strong>dotnet gcstats &lt;pid&gt;</strong>. I followed <a href="https://learn.microsoft.com/en-us/dotnet/core/tools/global-tools-how-to-create?WT.mc_id=DT-MVP-5003325">the documentation</a> by adding the following to the project file:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="nt">&lt;PropertyGroup&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackAsTool&gt;</span>true<span class="nt">&lt;/PackAsTool&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;ToolCommandName&gt;</span>dotnet-gcstats<span class="nt">&lt;/ToolCommandName&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageOutputPath&gt;</span>./nupkg<span class="nt">&lt;/PackageOutputPath&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;GeneratePackageOnBuild&gt;</span>true<span class="nt">&lt;/GeneratePackageOnBuild&gt;</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&lt;/PropertyGroup&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In addition, I provided a few additional details:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="nt">&lt;PropertyGroup&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageId&gt;</span>dotnet-gcstats<span class="nt">&lt;/PackageId&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageVersion&gt;</span>1.0.0<span class="nt">&lt;/PackageVersion&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;Title&gt;</span>dotnet-gcstats<span class="nt">&lt;/Title&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;Authors&gt;</span>christophe Nasarre<span class="nt">&lt;/Authors&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;Owners&gt;</span>chrisnas<span class="nt">&lt;/Owners&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;RepositoryUrl&gt;</span>https://github.com/chrisnas<span class="nt">&lt;/RepositoryUrl&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;RepositoryType&gt;</span>git<span class="nt">&lt;/RepositoryType&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageProjectUrl&gt;</span>https://github.com/chrisnas/GCStats<span class="nt">&lt;/PackageProjectUrl&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageLicenseFile&gt;</span>LICENSE<span class="nt">&lt;/PackageLicenseFile&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;Description&gt;</span>Global CLI tool to display live statistics during .NET garbage collections<span class="nt">&lt;/Description&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageReleaseNotes&gt;</span>Initial release<span class="nt">&lt;/PackageReleaseNotes&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;Copyright&gt;</span>Copyright Christophe Nasarre 2024-$([System.DateTime]::UtcNow.ToString(yyyy))<span class="nt">&lt;/Copyright&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;PackageTags&gt;</span>.NET TraceEvent CLR GC<span class="nt">&lt;/PackageTags&gt;</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&lt;/PropertyGroup&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Once built, I simply uploaded the generated package to nuget.org et voila!</p>
<p>Now, you should be able to better understand why some collections are triggered:</p>
<p><img loading="lazy" src="/posts/2024-03-01_view-your-gcs-statistics/1_3oHvH2Vxb3PgW46khVMSMQ.png"></p>
<p>And if that is not enough, wait for the second edition of Pro .NET Memory Management ;^)</p>
]]></content:encoded></item><item><title>Be Aligned! Or how to investigate a stack corruption</title><link>https://chrisnas.github.io/posts/2023-12-11_be-aligned-or-how/</link><pubDate>Mon, 11 Dec 2023 11:01:45 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-12-11_be-aligned-or-how/</guid><description>This post describes the different steps I followed to investigate a stack corruption with Visual Studio</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>During the Datadog R&amp;D week, my goal is to mimic the generation of a .gcdump from our .NET profiler. I’ve already written most of the code <a href="/posts/2023-08-11_net-gcdump-internals/">for a previous post</a> and after changing the required plumbing, it is time to test the workflow.</p>
<p>Unfortunately, I’m facing the dreaded stack corruption dialog:</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1_5ydnBBOQBccS0f016OUDIw.png"></p>
<p>The rest of the post explains the different steps I’m following to investigate this issue.</p>
<h2 id="trying-to-understand-theproblem">Trying to understand the problem</h2>
<p>This stack check is done by the debug version of the C Runtime library: it basically adds some special bytes on the stack before calling a function and checks that these bytes have not been tampered with when returning from the call.</p>
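<p>To make the idea concrete, here is a minimal C++ sketch that simulates this kind of check. It is only an illustration: the <strong>GuardedFrame</strong> class, the guard sizes and the 0xCC fill value are assumptions, not the code actually emitted by the compiler or the C Runtime.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical values chosen for this sketch only.
constexpr std::uint8_t kGuardByte = 0xCC;
constexpr std::size_t kGuardSize = 4;
constexpr std::size_t kDataSize = 8;

// Simulates a local variable surrounded by guard bytes:
// [guard (4 bytes)] [data (8 bytes)] [guard (4 bytes)]
class GuardedFrame
{
public:
    GuardedFrame()
    {
        _bytes.fill(kGuardByte);
        for (std::size_t i = 0; i < kDataSize; i++)
        {
            _bytes[kGuardSize + i] = 0;
        }
    }

    // Writes 'count' bytes starting at 'offset' in the data area.
    // The offset is NOT checked against kDataSize, so a caller can overrun
    // the data and reach the trailing guard; the write only stops at the
    // end of the whole frame to keep the simulation well defined.
    void Write(std::size_t offset, std::size_t count, std::uint8_t value)
    {
        for (std::size_t i = 0; i < count; i++)
        {
            const std::size_t pos = kGuardSize + offset + i;
            if (pos >= _bytes.size())
            {
                break;
            }
            _bytes[pos] = value;
        }
    }

    // The equivalent of the check done when the function returns.
    bool GuardsIntact() const
    {
        for (std::size_t i = 0; i < kGuardSize; i++)
        {
            if (_bytes[i] != kGuardByte) return false;
            if (_bytes[kGuardSize + kDataSize + i] != kGuardByte) return false;
        }
        return true;
    }

private:
    std::array<std::uint8_t, kGuardSize + kDataSize + kGuardSize> _bytes;
};
```

<p>Writing 8 bytes at offset 0 leaves the guards untouched, while writing 4 bytes at offset 6 spills 2 bytes into the trailing guard and <strong>GuardsIntact()</strong> returns false: this is the kind of situation reported by the stack corruption dialog.</p>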
<p>So, the next step is to debug the application to get more details and at least where in the code the problem happened:</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1_XEEcEc2kLKmy8xRDF2e_xA.png"></p>
<p>The failed check occurs at the end of a function that looks like the following:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">GcDumpProvider</span><span class="o">::</span><span class="n">Get</span><span class="p">(</span><span class="n">IGcDumpProvider</span><span class="o">::</span><span class="n">gcdump_t</span><span class="o">&amp;</span> <span class="n">gcDump</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// trigger the GC and get the dump
</span></span></span><span class="line"><span class="cl">    <span class="n">GcDump</span> <span class="nf">gcd</span><span class="p">(</span><span class="o">::</span><span class="n">GetCurrentProcessId</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">gcd</span><span class="p">.</span><span class="n">TriggerDump</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">auto</span> <span class="k">const</span><span class="o">&amp;</span> <span class="n">dump</span> <span class="o">=</span> <span class="n">gcd</span><span class="p">.</span><span class="n">GetGcDumpState</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">auto</span><span class="o">&amp;</span> <span class="n">types</span> <span class="o">=</span> <span class="n">dump</span><span class="p">.</span><span class="n">_types</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="k">auto</span><span class="o">&amp;</span> <span class="nl">type</span> <span class="p">:</span> <span class="n">types</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">auto</span><span class="o">&amp;</span> <span class="n">typeInfo</span> <span class="o">=</span> <span class="n">type</span><span class="p">.</span><span class="n">second</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">uint64_t</span> <span class="n">instancesCount</span> <span class="o">=</span> <span class="n">typeInfo</span><span class="p">.</span><span class="n">_instances</span><span class="p">.</span><span class="n">size</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint64_t</span> <span class="n">instancesSize</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="n">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">instancesCount</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">instancesSize</span> <span class="o">+=</span> <span class="n">typeInfo</span><span class="p">.</span><span class="n">_instances</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">gcDump</span><span class="p">.</span><span class="n">push_back</span><span class="p">({</span><span class="n">typeInfo</span><span class="p">.</span><span class="n">_name</span><span class="p">,</span> <span class="n">instancesCount</span><span class="p">,</span> <span class="n">instancesSize</span><span class="p">});</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>There is much more code behind this; especially in the <strong>TriggerDump()</strong> function. I have already tested this code many times when I dug into the .gcdump generation process without facing this stack corruption. I’m spending a few hours digging back into the code because:</p>
<ul>
<li>I’m now running inside the profiled process and not outside like in the blog post</li>
<li>I’m introducing a “slight” change because I need to exit the communication with the CLR when the GC ends.</li>
<li>I need to mention that Visual Studio refuses to step (Step Over or Step Into) and only Run to Cursor is possible due to mixed mode (managed and native) debugging. So, I’m creating a simple native console application with my updated code for easier and faster debugging.</li>
</ul>
<p>After a couple of hours, it is time to go back to the <strong>Get()</strong> implementation because I do not find anything obviously wrong.</p>
<h2 id="make-it-simpler-and-simpler-and-simpleragain">Make it simpler and simpler and simpler again</h2>
<p>In that type of situation, I recommend the “remove code and debug” strategy (from “divide and conquer” attributed to Julius Caesar). From the simplified console application, the code now looks like:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="nf">GetGcDump</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">,</span> <span class="n">IGcDumpProvider</span><span class="o">::</span><span class="n">gcdump_t</span><span class="o">&amp;</span> <span class="n">gcDump</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">GcDump</span> <span class="n">gcd</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// no more gcd.TriggerDump()
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// no more for (auto&amp; type : types)
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>No more complicated call nor iteration on the vector of results. Guess what? Same stack corruption.</p>
<p>It is time to go one level deeper: what does this <strong>GcDump</strong> class look like?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">GcDump</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">GcDump</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="o">~</span><span class="n">GcDump</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">private</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">_pid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DiagnosticsClient</span><span class="o">*</span> <span class="n">_pClient</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">EventPipeSession</span><span class="o">*</span> <span class="n">_pSession</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">HANDLE</span> <span class="n">_hListenerThread</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">GcDumpState</span> <span class="n">_gcDumpState</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The constructor sets the fields to zero/nullptr and the destructor cleans up these fields if necessary. Since <strong>TriggerDump()</strong> is no longer called, these fields never change.</p>
<p>I’m commenting out all fields until only <strong>_gcDumpState</strong> remains and it continues to crash. When it is commented out too, the crash disappears.</p>
<h2 id="use-your-debuggerluke">Use your debugger Luke!</h2>
<p>Let’s turn to the <strong>GcDumpState</strong> class that is even simpler:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">GcDumpState</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">GcDumpState</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="o">~</span><span class="n">GcDumpState</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// fields removed for brevity
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">private</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">_isStarted</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">_hasEnded</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">_collectionIndex</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The code of the destructor only sends a trace to the console (removing it completely does not fix the issue) and here is the constructor code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">GcDumpState</span><span class="o">::</span><span class="n">GcDumpState</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_isStarted</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_hasEnded</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_collectionIndex</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Again, same strategy: remove one field after the other. This time, the code stops crashing if the two Boolean fields are removed or if the 32-bit index is removed. If the index field is not set, no more corruption!</p>
<p>How could this assignment corrupt the stack? It is time to use the Visual Studio debugger to better understand what is going on.</p>
<p>First, set a breakpoint on the assignment line and click Debug | Disassembly to see the corresponding assembly code:</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1_vD7b7Lv-bcUmfOwGFOGM8Q.png"></p>
<p>The two lines of assembly code are easy to understand:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">mov</span> <span class="n">rax</span><span class="p">,</span><span class="n">qword</span> <span class="n">ptr</span> <span class="p">[</span><span class="k">this</span><span class="p">]</span>  
</span></span><span class="line"><span class="cl"><span class="n">mov</span> <span class="n">dword</span> <span class="n">ptr</span> <span class="p">[</span><span class="n">rax</span><span class="o">+</span><span class="mi">4</span><span class="p">],</span><span class="mi">0</span>
</span></span></code></pre></td></tr></table>
</div>
</div><ul>
<li>The <strong>this</strong> pointer is stored in the <strong>rax</strong> register</li>
<li>The 32 bits (<strong>mov dword</strong>) of memory starting 4 bytes after the beginning of the object pointed to by <strong>this</strong> are set to 0</li>
</ul>
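<p>The same two instructions can be mimicked in C++ to see exactly which bytes are touched. This is only a sketch: the <strong>StoreZeroDwordAtOffset4</strong> helper is a name made up for the illustration, and the hard-coded offset of 4 simply mirrors the <strong>rax+4</strong> seen in the disassembly.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper mirroring the two assembly instructions:
//   mov rax, qword ptr [this]
//   mov dword ptr [rax+4], 0
inline void StoreZeroDwordAtOffset4(void* pThis)
{
    // "rax" holds the address of the object
    auto* base = static_cast<unsigned char*>(pThis);

    // 4 bytes (a dword) are written starting 4 bytes after the object start
    const std::uint32_t zero = 0;
    std::memcpy(base + 4, &zero, sizeof(zero));
}
```

<p>Called on an 8-byte buffer filled with 0xCC, the helper leaves bytes 0 to 3 unchanged and zeroes bytes 4 to 7: nothing is written at offsets 2 and 3, right after the two bool fields.</p>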
<p>I enter “this” in a Memory panel (Debug | Windows | Memory xx)</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1_EMOVsaUfPsdbWwKx11c0eQ.png"></p>
<p>And Visual Studio gives me the corresponding address and the content of the memory there:</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1_tKK6VDl7svGm4mA8JJBM_w.png"></p>
<p>Pressing F10 twice to Step Over the two assembly instructions confirms it:</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1_sECyiTU4IBMIuqtzTr4Ecg.png"></p>
<p>Instead of storing the 32-bit 0 value just after the two bytes corresponding to the bool fields, it is stored 2 bytes further. It looks like padding is added on my behalf.</p>
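<p>This layout can also be checked from code with <strong>offsetof</strong> and <strong>sizeof</strong>. The sketch below uses a <strong>GcDumpStateLayout</strong> struct, a hypothetical public mirror of the three private fields (the real class has other members that are not shown here), and assumes a mainstream x64 compiler where a bool takes 1 byte and a uint32_t is aligned on 4 bytes.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the three private fields of GcDumpState,
// made public so that offsetof can be applied to them.
struct GcDumpStateLayout
{
    bool _isStarted;
    bool _hasEnded;
    std::uint32_t _collectionIndex;
};

// Offset of the index field from the beginning of the object.
constexpr std::size_t IndexOffset()
{
    return offsetof(GcDumpStateLayout, _collectionIndex);
}

// Number of bytes left unused between _hasEnded and _collectionIndex.
constexpr std::size_t PaddingAfterBools()
{
    return IndexOffset() - (offsetof(GcDumpStateLayout, _hasEnded) + sizeof(bool));
}
```

<p>Under these assumptions, the index sits at offset 4, the padding after the bool fields is 2 bytes and the whole structure takes 8 bytes, which matches what the Memory panel shows.</p>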
<p>I change the build settings for the GcDumpState.cpp file to enable all warnings:</p>
<p><img loading="lazy" src="/posts/2023-12-11_be-aligned-or-how/1__0XcRwnGs8Qaoj4_S99LZQ.png"></p>
<p>The compilation confirms what has been seen in the memory:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">GcDumpState</span><span class="p">.</span><span class="n">h</span><span class="p">(</span><span class="mi">41</span><span class="p">,</span><span class="mi">14</span><span class="p">)</span><span class="o">:</span> <span class="n">warning</span> <span class="nl">C4820</span><span class="p">:</span> <span class="err">&#39;</span><span class="n">GcDumpState</span><span class="err">&#39;</span><span class="o">:</span> <span class="sc">&#39;2&#39;</span> <span class="n">bytes</span> <span class="n">padding</span> <span class="n">added</span> <span class="n">after</span> <span class="n">data</span> <span class="n">member</span> <span class="err">&#39;</span><span class="n">GcDumpState</span><span class="o">::</span><span class="n">_hasEnded</span><span class="err">&#39;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="whats-next">What’s next?</h2>
<p>My understanding is that the compiler is:</p>
<ul>
<li>adding 2 bytes of padding to align the 32-bit index field</li>
<li>generating the constructor code based on that padding</li>
<li>but the stack corruption checking code does not take it into account</li>
</ul>
<p>The solution is to either add a <strong>uint16_t</strong> field after the 2 bool fields as an explicit padding or use <strong>#pragma pack(1)</strong> to decorate the class definition.</p>
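<p>Here is what these two options could look like. It is a sketch under assumptions: the struct names and the <strong>_padding</strong> field are made up for the illustration, only the three fields shown earlier are reproduced, and the push/pop form of the pragma is used instead of a bare <strong>#pragma pack(1)</strong> so that the packing does not leak to the declarations that follow.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Option 1: make the padding explicit with a 16-bit filler field.
// The compiler has no implicit padding left to insert.
struct ExplicitPaddingState
{
    bool _isStarted;
    bool _hasEnded;
    std::uint16_t _padding;          // hypothetical filler, never read
    std::uint32_t _collectionIndex;
};

// Option 2: pack the fields on 1-byte boundaries.
// The index then starts right after the two bool fields, unaligned.
#pragma pack(push, 1)
struct PackedState
{
    bool _isStarted;
    bool _hasEnded;
    std::uint32_t _collectionIndex;
};
#pragma pack(pop)
```

<p>With the explicit filler, the index stays at offset 4 and the size stays at 8 bytes. With packing, the index moves to offset 2 and the size shrinks to 6 bytes, at the cost of an unaligned 32-bit field.</p>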
<p>However, this looks really weird to me. We should have faced this issue a long time ago because we were never cautious about alignment in all the classes and structures that we allocate in our code. To validate the assumption, I’m writing a small reproduction code outside of all the .gcdump complexity. And guess what? I’m not able to reproduce the stack corruption. Another mystery of the C++ compilation optimizations probably…</p>
<p>This is the end of my debugging Friday at Datadog :^)</p>
]]></content:encoded></item><item><title>How to dig into the CLR</title><link>https://chrisnas.github.io/posts/2023-11-12_how-to-dig-into/</link><pubDate>Sun, 12 Nov 2023 09:12:28 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-11-12_how-to-dig-into/</guid><description>The goal of this post is to share the tips and tricks I used to navigate into the CLR implementation to better understand how .NET works</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>When I started to work on the second edition of <em>Pro .NET Memory Management: For Better Code, Performance, and Scalability</em> by Konrad Kokosa, I had already spent some time in the CLR code for a couple of pull requests related to the garbage collector. However, updating the book to cover 5 new versions of .NET requires looking at new APIs but also digging deep inside the hundreds of thousands of lines of code of the CLR (and especially the GC)!</p>
<p>The first step is to install Visual Studio 2022 Preview that allows you to compile and run projects targeting .NET 8. Then, go to <a href="https://github.com/dotnet/runtime">https://github.com/dotnet/runtime</a> and git clone the tag of the <a href="https://dotnet.microsoft.com/en-us/download/dotnet/8.0">.NET 8 preview version you have installed</a>.</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_m3M1yrJypN6eeed3c_6RRg.png"></p>
<p>That way, you will be able to directly run the same version that you will debug.</p>
<p>And now, what are the next steps?</p>
<p>The goal of this post is to share with you the tips and tricks I used to navigate into the CLR implementation so that you can better understand how things work.</p>
<h2 id="from-c-toc">From C# to C++</h2>
<p>As a .NET developer, I’m used to the APIs provided by the Base Class Library built on top of the CLR. Let’s take as an example the following code that uses the <a href="https://learn.microsoft.com/en-us/dotnet/api/system.gc.allocatearray?WT.mc_id=DT-MVP-5003325">GC.AllocateArray</a> method, available since .NET 5, that allows you to allocate an array pinned in memory.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">internal</span> <span class="k">class</span> <span class="nc">Program</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">static</span> <span class="k">void</span> <span class="n">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">byte</span><span class="p">[]</span> <span class="n">pinned</span> <span class="p">=</span> <span class="n">GC</span><span class="p">.</span><span class="n">AllocateArray</span><span class="p">&lt;</span><span class="kt">byte</span><span class="p">&gt;(</span><span class="m">90000</span><span class="p">,</span> <span class="kc">true</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;generation = {GC.GetGeneration(pinned)}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>When you Ctrl+click the method name (or use F12), thanks to Source Link integration, you go to its implementation where you can even set breakpoints:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_Zwaunoa3Ott_dsWqzcwqog.png"></p>
<p>If you don’t use Visual Studio, you could open the generated assembly in a decompiler such as <a href="https://github.com/icsharpcode/ILSpy/releases">ILSpy</a> or <a href="https://github.com/dnSpy/dnSpy/releases">dnSpy</a>. The latter even allows you to set breakpoints and debug the disassembled IL without any source.</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_HPBNrPaDhKaX-eOGuNqJtA.png"></p>
<p>In both cases, only the managed implementation will be available: you soon end up at an “internal call” corresponding to a native function implemented by the CLR. These managed methods are decorated with the <strong>MethodImpl</strong> attribute set to <a href="https://learn.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.methodimploptions?WT.mc_id=DT-MVP-5003325">MethodImplOptions.InternalCall</a>.</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_4kg2jOleRs06edogUKEgQA.png"></p>
<p>For the garbage collector code, you can look into the <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/System.Private.CoreLib/src/System/GC.CoreCLR.cs">GC.CoreCLR.cs</a> file where these methods are defined. Note that some methods are decorated with the <a href="https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.dllimportattribute?WT.mc_id=DT-MVP-5003325">DllImport</a> attribute to bind to native functions exported by a “QCall” library. The JIT provides an optimized P/Invoke path for these calls: they do not go through the usual LoadLibrary/GetProcAddress as you could expect. Instead, they are routed to the methods exported by coreclr.dll and defined in the <strong>s_QCall</strong> array in <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/qcallentrypoints.cpp">qcallentrypoints.cpp</a>. But where to look further for the native implementation?</p>
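<p>The principle of such a table of entry points can be sketched in a few lines of C++. This is not the CLR code: the <strong>EntryPoint</strong> struct, the <strong>s_sampleEntries</strong> array, the <strong>ResolveEntryPoint</strong> function and the two sample functions are all hypothetical names used to illustrate a lookup by name in a static array, without any LoadLibrary/GetProcAddress call.</p>

```cpp
#include <cassert>
#include <cstring>

// Hypothetical native functions standing in for runtime entry points.
inline int Sample_GetMaxGeneration() { return 2; }
inline int Sample_GetHeapCount() { return 1; }

using NativeFn = int (*)();

// One row of the table: the name used by the managed side
// and the address of the native implementation.
struct EntryPoint
{
    const char* name;
    NativeFn target;
};

// Static table built at compile time (plays the role of an s_QCall-like array).
static const EntryPoint s_sampleEntries[] = {
    { "Sample_GetMaxGeneration", &Sample_GetMaxGeneration },
    { "Sample_GetHeapCount", &Sample_GetHeapCount },
};

// Returns the native function registered under 'name' or nullptr if unknown.
inline NativeFn ResolveEntryPoint(const char* name)
{
    for (const EntryPoint& entry : s_sampleEntries)
    {
        if (std::strcmp(entry.name, name) == 0)
        {
            return entry.target;
        }
    }
    return nullptr;
}
```

<p>Resolving a known name gives back a function pointer that can be called directly, while an unknown name returns nullptr. When searching the CLR sources, the name used on the managed side is therefore a good string to look for in the table.</p>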
<p>Instead of searching among the thousands of files, focus on <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/comutilnative.h">comutilnative.h</a> that defines the signature of most exported functions. The implementation of the exported native functions is found in <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/comutilnative.cpp">comutilnative.cpp</a>. This is where you should start your journey in the native implementation of the CLR. For the list of <strong>all</strong> functions called by the libraries in the runtime, look at the <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/ecalllist.h">ecalllist.h</a> file (around <strong>gGCInterfaceFuncs</strong> and <strong>gGCSettingsFuncs</strong> specifically for the GC).</p>
<p><em>Note that you might also find some implementations under the <em><a href="https://github.com/dotnet/runtime/tree/main/src/coreclr/classlibnative"><em>classlibnative</em></a></em> folder like in the <em><a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/classlibnative/bcltype/system.cpp"><em>system.cpp</em></a></em> file for <em><a href="https://learn.microsoft.com/en-us/dotnet/api/system.runtime.gcsettings.isservergc?WT.mc_id=DT-MVP-5003325"><em>GCSettings.IsServerGC</em></a></em>.</em></p>
<h2 id="clr-source-code-debugging">CLR Source code debugging</h2>
<p>It is nice to know that the implementation of most CLR exported native functions used by the BCL is in <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/comutilnative.cpp">comutilnative.cpp</a>. For the GC, the functions are either statics from the <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/comutilnative.h#L142">GCInterface class</a> or static functions prefixed by <strong>GCInterface_</strong>; I don’t know why all are not part of <strong>GCInterface</strong>…</p>
<p>When you look at the GC-related methods implementation, a lot are calling methods from the instance returned by <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/gcheaputilities.h#L70">GCHeapUtilities::GetGCHeap()</a> that corresponds to the static <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/gcheaputilities.h#L10">g_pGCHeap</a> global variable. It is interesting to follow the threads of calls like that, but I have to admit that, after a few hops, I’m starting to get lost. So, I’m drawing boxes for types on a piece of paper and arrays from their fields to other types as boxes.</p>
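<p>The pattern of a static accessor in front of a global pointer can be summarized with the following simplified C++ sketch. All the names (<strong>ISampleHeap</strong>, <strong>SampleWorkstationHeap</strong>, <strong>g_pSampleHeap</strong>, <strong>SampleHeapUtilities</strong>) are hypothetical stand-ins; the real interface and utilities in the CLR are much richer.</p>

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for the GC heap interface.
class ISampleHeap
{
public:
    virtual ~ISampleHeap() = default;
    virtual unsigned GetMaxGeneration() const = 0;
};

// One possible implementation selected at startup.
class SampleWorkstationHeap : public ISampleHeap
{
public:
    unsigned GetMaxGeneration() const override { return 2; }
};

// Global pointer set once during initialization (plays the role of g_pGCHeap).
static ISampleHeap* g_pSampleHeap = nullptr;

class SampleHeapUtilities
{
public:
    // Called once at startup with the chosen heap implementation.
    static void Initialize(ISampleHeap* pHeap)
    {
        g_pSampleHeap = pHeap;
    }

    // Every caller goes through this accessor instead of touching the global.
    static ISampleHeap* GetHeap()
    {
        assert(g_pSampleHeap != nullptr);
        return g_pSampleHeap;
    }
};
```

<p>With this shape in mind, following a call often means finding which concrete class is stored behind the global pointer, then jumping to its implementation of the interface method.</p>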
<p>However, with a code base that big, I definitely prefer to set breakpoints and write a small C# application to call the methods I’m interested in and see what data structures are used in the different layers of implementation. Don’t be scared: WinDBG is not required to achieve this goal. As <a href="https://github.com/dotnet/runtime/blob/main/docs/workflow/debugging/coreclr/debugging-runtime.md">this page explains</a>, you need to type the following commands in a shell at the root of the repo:</p>
<p><strong>.\build.cmd -s clr -c Debug
.\build.cmd clr.nativeprereqs -a x64 -c debug
.\build.cmd -msbuild</strong></p>
<p>The last command generates a CoreCLR.sln solution file (in artifacts\obj\coreclr\windows.x64.Debug\ide) that you can open in Visual Studio 2022 Preview.</p>
<p>In VS, right-click the <strong>INSTALL</strong> project, select Properties and set up the Debugging properties:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_HmIEli065siGvrPzFrMZ1g.png"></p>
<p>Here are the details of each property:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_3n7plqDVcjYhqIKlKZaW-g.png"></p>
<p>It could be interesting to set some environment variables such as <strong>DOTNET_gcServer</strong> to 1 for a GC Server configuration instead of workstation. In that case, click the &lt;Edit..&gt; choice in the combo-box:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_fwEUBlJ1tFdw-aTgHZuAcQ.png"></p>
<p>And update the textbox at the top:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_I61yQQaObRsRCX8v_jy3hA.png"></p>
<p>The final step is to set this project as the startup project:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_cSZTuIuo2UMSfc9DpC_vdw.png"></p>
<p>You are now able to set the breakpoints you want in the native code of the CLR and press F5 (Debug) in Visual Studio to step into the code!</p>
<h2 id="and-what-about-the-assemblycode">And what about the assembly code?</h2>
<p>Some specific data structures, such as the NonGC Heap, are used by the JIT compiler when generating the assembly code from the IL compiled from your C# code. It means that you need to look at that JITted code to fully understand what is going on.</p>
<p>A first way to get it is to use <a href="https://sharplab.io/">https://sharplab.io/</a>, type your C# code and select x64 for Core or x86/x64 for Framework:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_CHCDpZ5DxdJUTdjOjMUBnw.png"></p>
<p>But as you can see from this screenshot, it is using the .NET 7 compiler. What if you would like to see the .NET 8 compilation result just in case something changed?</p>
<p>The solution I’m using is to generate a memory dump of a test application with procdump -ma &lt;pid&gt;. Before opening the dump in WinDBG, there is something you should be aware of: with the <a href="https://learn.microsoft.com/en-us/dotnet/core/runtime-config/compilation?WT.mc_id=DT-MVP-5003325#tiered-compilation">tiered compilation</a>, you will need to call a method several times before the final optimized assembly code gets JITted. Or… decorate the method you are interested in with the [MethodImpl(<a href="https://learn.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.methodimploptions?WT.mc_id=DT-MVP-5003325">MethodImplOptions.AggressiveOptimization</a>)] attribute to instruct the JIT to directly generate the most optimized tier.</p>
<p>Once the dump is loaded in WinDBG, the first step is to get the MethodDesc pointer corresponding to the method you are interested in. For that, use the <strong>name2ee</strong> SOS command:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_88o_noKnyvMx3ogJqnaq5Q.png"></p>
<p>Click the link corresponding to MethodDesc to run the <strong>dumpmd</strong> SOS command:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_RWF60HbQmrh8-NIgt_HpXg.png"></p>
<p>The last step is to click the link corresponding to CodeAddr to run the <strong>U</strong> command and see the JITted assembly code:</p>
<p><img loading="lazy" src="/posts/2023-11-12_how-to-dig-into/1_G3N6TdlQRvDhU291kSamww.png"></p>
<p>If you compare this code to get the “Hello, World!” string, with the one shown by sharplab,</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="l">Program.Hello()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">L0000</span><span class="p">:</span><span class="w"> </span><span class="l">mov rcx, 0x257f7cbc368</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">L000a</span><span class="p">:</span><span class="w"> </span><span class="l">mov rcx, [rcx]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">L000d</span><span class="p">:</span><span class="w"> </span><span class="l">jmp qword ptr [0x7ff9c9bd7f48]</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>you might notice a tiny difference: there is one less indirection in .NET 8!
 But this is another story that will be told in the second edition of the “Pro .NET Memory Management: For Better Code, Performance, and Scalability” book ;^)</p>
]]></content:encoded></item><item><title>Crap: the application is randomly crashing!</title><link>https://chrisnas.github.io/posts/2023-10-02_crap-the-application-is/</link><pubDate>Mon, 02 Oct 2023 09:02:26 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-10-02_crap-the-application-is/</guid><description>This post is listing which steps were followed to investigate a customer random crash issue I faced last week.</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>When you have a call with a customer who explains to you that his application is crashing when your profiler is enabled, it is never a great experience. This post lists the steps that were followed to investigate such an issue I faced last week, from the basics up to the final analysis of memory dumps in WinDbg.</p>
<h2 id="get-as-many-setup-details-aspossible">Get as many setup details as possible</h2>
<p>The situation was the following:</p>
<ul>
<li>A web application was running fine with our Datadog .NET profiler on some non-production servers with less traffic.</li>
<li>The same application was crashing on production servers with more traffic.</li>
</ul>
<p>We were lucky to be able to remote access both machines. A lot of time was spent checking the setup, which is based on environment variables. Basically, for our profiler to be loaded by a .NET application, a few <a href="https://learn.microsoft.com/en-us/dotnet/core/runtime-config/debugging-profiling?WT.mc_id=DT-MVP-5003325">Microsoft related environment variables</a> need to be set. Then, you enable the Datadog profiler by setting <strong>DD_PROFILING_ENABLED</strong> to 1 in order to get the profiling details available in our UI. Since the web application is running in IIS, things get more complicated because some environment variables <a href="https://docs.datadoghq.com/profiler/enabling/dotnet/?tab=internetinformationservicesiis">must be set in… the Registry</a>.</p>
<p>So, we checked the environment variables set at the machine level with the <strong>set</strong> command in a prompt and those for IIS with the Registry Editor. However, we got some inconsistencies, and we needed a way to validate which environment variables were really seen by the web application! The <a href="https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer?WT.mc_id=DT-MVP-5003325">Process Explorer</a> tool from Sysinternals was downloaded and launched. After finding the process ID of the running w3wp.exe corresponding to the web application, a simple right-click to get the Properties and selecting the <strong>Environment</strong> Tab gave us the truth:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_NvcCrY_Y9yUePtZbq1UmcA.png"></p>
<p><em>(This screenshot shows the results for one of our test applications on my development machine).</em></p>
<h2 id="getting-a-memorydump">Getting a memory dump</h2>
<p>Once the setup was checked on both machines without any too weird issues, the next step was to figure out why the application was randomly crashing. Even if the machines received different traffic loads, since applications running without our profiler enabled were not crashing, the chances were high that our C++ code was at the source of the problem. But the crashes were random… And you can’t install Visual Studio on a production server and attach to the process hoping that it will crash and start a debugging session there!</p>
<p>Windows Error Reporting generates mini dumps when applications crash, but they are usually not enough to start an investigation. Again, another Sysinternals tool, <a href="https://learn.microsoft.com/en-us/sysinternals/downloads/procdump?WT.mc_id=DT-MVP-5003325">procdump</a>, was installed as a global crash handler with <strong>procdump -i c:\dumps -ma</strong>. The next time the application crashed, a memory dump was generated in the c:\dumps folder. Don’t forget to create it manually if it does not exist.</p>
<h2 id="from-addresses-to-sourcecode">From addresses to source code</h2>
<p>To play with a memory dump, <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/">WinDbg</a> is my preferred toy. I opened the memory dump and, in the case of a crash, the stack panel automatically displayed the call stack of the faulted thread:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_ef6Cz0WBGx_CYoVbBD2vrw.png"></p>
<p>The last frame triggering the issue (i.e., before KiUserExceptionDispatch) is <strong>Datadog_Profiler_Native!DllCanUnloadNow+0x2954b</strong>. Knowing that WinDbg transforms . in file names into _ leads to Datadog.Profiler.Native.dll which is the file where our profiler is implemented. However, WinDbg was not able to find the name of the function and only looked at the exported public symbols. With the <strong>lm</strong> command, you can see how WinDbg gets the symbols for this dll:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_Zy1aXBP-s3_xo-GEKDNPQA.png"></p>
<p>With <strong>DllCanUnloadNow</strong>, you could tell that we are dealing with some COM stuff but it did not really help me for the investigation: I needed to know which function was running and which part of its code. Fortunately, for <a href="https://github.com/DataDog/dd-trace-dotnet/releases/tag/v2.36.0">each release of the .NET profiler</a> on GitHub, in addition to the .msi installer, the symbols and the source code are also provided.</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_BKpIKTzOcaCKOx7J0riLVA.png"></p>
<p>Both files were unzipped in the folder where the dumps were copied. Then, I changed the Debugging Settings in WinDbg to point to these folders:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_h65jBCPNiOwXfkIwC2cZtQ.png"></p>
<p>Let’s start with the symbols to let WinDbg match an instruction pointer to a function name. I asked WinDbg to provide details about the symbol resolution with <strong>!sym noisy</strong>. Then I forced the symbols for my module to get reloaded with <strong>.reload /f “Datadog.Profiler.Native.dll”</strong>. In the flow of errors, I found out where the .pdb file should be stored so that WinDbg would find it:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_RGatvVvuLeGvBPXlCWLytQ.png"></p>
<p>So the problem is triggered somewhere in our <strong>Windows64BitStackFramesCollector::CollectStackSampleImplementation</strong> function. By simply double-clicking this frame, WinDbg automagically found the corresponding source file and pinpointed the culprit line:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_5a47NcKwyXhk9YBAVgBMAw.png"></p>
<h2 id="a-bit-of-windbgmagic">A bit of WinDbg magic</h2>
<p>To follow me a bit further, you need to understand what this code is doing: it is walking the stack of a thread to find the instruction pointers of each called function. This line 260 is dereferencing the address contained in <strong>context.Rsp</strong>. I looked at the Locals panel to get its value:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_PQkFGP5_kGKgDRulRMHRtA.png"></p>
<p>The <strong>!address</strong> command told me which module this code was executed from:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_6xKNeOuogOmD_1PGKvUgnA.png"></p>
<p>It looked like a valid page with executable code…</p>
<p>I wanted to see why our stack walking code would break here. What if I asked WinDbg to show me this stack? To do that, I first needed to know which thread our code was trying to stack walk. I knew that <strong>Windows64BitStackFramesCollector</strong> was keeping track of the currently walked thread in a <strong>ManagedThreadInfo</strong> instance pointed to by its <strong>_pCurrentCollectionThreadInfo</strong> field:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_QIZFZokpkgINnnOJgtseFA.png"></p>
<p>This instance stores the thread ID in its <strong>_osThreadId</strong> field: now let’s ask WinDbg to switch to this thread.</p>
<p>The <strong>~</strong> command lists all threads:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_m67ck2tmPvIT1bB9Y6JrPQ.png"></p>
<p>A quick CTRL+F with “27600” stopped on the thread #72. Threads have a lot of identifiers in WinDbg and the first one allowed me to switch with <strong>~72s</strong>.</p>
<p>The Stack panel was almost empty:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_JhLeTbMaR-Hm3U0QzVmCyw.png"></p>
<p>To be sure, I used the <strong>kp</strong> command… that told me that WinDbg was not really happy either:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_B1_-hEKGWWk9leh4p6IfhA.png"></p>
<p>I was kind of stuck but my colleague <a href="https://twitter.com/kookiz">Kevin Gosse</a> mentioned that I could use <strong>r rip</strong> to see what would be the next instruction to be executed by this thread:</p>
<p><img loading="lazy" src="/posts/2023-10-02_crap-the-application-is/1_ayVTO6s3e0jSFwVeGbl3zQ.png"></p>
<p>Then, the <strong>ln</strong> command (close to the <strong>!address</strong> command I used just before) allowed me to click the <strong>Browse Module</strong> link and see that, again, some code from Sentinel One was ready to execute.</p>
<p>This agent is part of an anti-virus (and more) solution that seems to hijack the stack of threads and our code was not dealing properly with this kind of situation. The fix was to protect our dereferencing code against access violations and stop walking the stack in that case.</p>
<p>Another debugging day at Datadog :^)</p>
]]></content:encoded></item><item><title>.NET .gcdump Internals</title><link>https://chrisnas.github.io/posts/2023-08-11_net-gcdump-internals/</link><pubDate>Fri, 11 Aug 2023 16:31:43 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-08-11_net-gcdump-internals/</guid><description>Learn what the .NET CLR does behind the scene to help the tools to generate a .gcdump!</description><content:encoded><![CDATA[<hr>
<p>The .NET runtime (both .NET Framework and .NET Core) allows you to generate a lightweight dump containing the allocated type instances count and references including roots. They are usually generated into .gcdump files by tools such as <a href="https://github.com/microsoft/perfview">Perfview</a> or <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/dotnet-gcdump-instructions.md">dotnet-gcdump</a> and can also be viewed in Visual Studio. In addition to a view of the allocated types in the managed heap, these files are often used during memory leak investigations because they are much smaller than a full memory dump and they contain explicit dependency information between types up to their roots.</p>
<p>The goal of this document is to dig into their generation and see how to leverage the same mechanisms from the .NET CLR for live heap profiling and memory leak detection.</p>
<h2 id="simply-listening-to-clrevents">Simply listening to CLR events</h2>
<p>Both .NET Framework and .NET Core work the same way to allow a tool to generate a .gcdump file:</p>
<ul>
<li>It seems that the type table needs to be flushed for .NET Core by first creating and immediately closing an event session with the <em>Microsoft-DotNETCore-SampleProfiler</em> provider.</li>
<li>You create an event session (either through ETW for Framework or EventPipe for Core) where the <em>Microsoft-Windows-DotNETRuntime</em> provider is enabled with a <em>verbose</em> level for a long <a href="https://github.com/microsoft/perfview/blob/main/src/TraceEvent/Parsers/ClrTraceEventParser.cs#L208">list of keywords</a> corresponding to the <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/ClrEtwAll.man#L54">0x1980001 value</a>:</li>
</ul>
<p><img loading="lazy" src="/posts/2023-08-11_net-gcdump-internals/1_jibOcciJA026RJhDuxwMAQ.png"></p>
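<p>As a reading aid, here is a minimal C++ sketch of how the 0x1980001 value can be decomposed into individual keywords. The constant names and values follow the keywords of ClrTraceEventParser as I understand them, so treat them as assumptions to check against the linked sources; <strong>kGcDumpKeywords</strong> is a name made up for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;

// Keyword bits of the Microsoft-Windows-DotNETRuntime provider
// (assumed values; verify them against ClrTraceEventParser.Keywords).
constexpr std::uint64_t kGC                 = 0x0000001;
constexpr std::uint64_t kType               = 0x0080000;
constexpr std::uint64_t kGCHeapDump         = 0x0100000;
constexpr std::uint64_t kGCHeapCollect      = 0x0800000;
constexpr std::uint64_t kGCHeapAndTypeNames = 0x1000000;

// Hypothetical name for the combination that triggers the heap walk.
constexpr std::uint64_t kGcDumpKeywords =
    kGC | kType | kGCHeapDump | kGCHeapCollect | kGCHeapAndTypeNames;

// Fails to compile if the bits above do not add up to the documented value.
static_assert(kGcDumpKeywords == 0x1980001, "keywords must add up to 0x1980001");
</code></pre></div>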
<p>This will trigger an induced non concurrent gen2 garbage collection during which the GC will walk the remaining live objects with their size and emit tons of events, most of them not documented:</p>
<ul>
<li><strong>GCStart</strong>: wait for the first induced gen2 foreground GC (Depth = 2, Type = GCType.NonConcurrentGC and Reason = GCReason.Induced)</li>
<li><strong>GCStop</strong>: detect when the heap walk is over</li>
<li><strong>BulkType</strong>: type blocks are enqueued</li>
<li><strong>GCBulkNode</strong>: node blocks are enqueued</li>
<li><strong>GCBulkEdge</strong>: edge blocks are enqueued</li>
<li><strong>GCBulkRootEdge</strong>: enqueue non-weak reference roots ((GCRootFlag &amp; GCRootFlags.WeakRef) == 0) based on their GCRootKind</li>
<li><strong>GCBulkRootStaticVar</strong>: static variable blocks</li>
<li><strong>GCBulkRCW</strong> and <strong>GCBulkRootCCW</strong> for Runtime Callable Wrappers and COM Callable Wrappers COM-based roots</li>
<li><strong>GCBulkRootConditionalWeakTableElementEdge</strong>: ??</li>
<li><strong>GCGenerationRange</strong>: one for each managed heap segment with generations boundary (not really needed to build the dependency graph but interesting to figure out objects in generation 2)</li>
</ul>
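<p>The GCStart/GCStop bracketing could be sketched in C++ as follows. This is only an illustration: <strong>GCStartInfo</strong> and <strong>HeapWalkTracker</strong> are hypothetical names, and the numeric values of the enumerations are assumptions taken from the TraceEvent definitions.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;

// Assumed numeric values; check them against the GCType and GCReason
// enumerations of the TraceEvent library.
enum class GCType : std::uint32_t { NonConcurrentGC = 0, BackgroundGC = 1, ForegroundGC = 2 };
enum class GCReason : std::uint32_t { AllocSmall = 0, Induced = 1 };

// Hypothetical, trimmed-down view of a GCStart payload.
struct GCStartInfo {
    std::uint32_t depth;  // condemned generation
    GCType type;
    GCReason reason;
};

// Tracks whether the heap walk triggered by the session is in progress.
class HeapWalkTracker {
public:
    // Starts tracking on the first induced non concurrent gen2 collection.
    void OnGCStart(const GCStartInfo&amp; gc) {
        if (!_walking &amp;&amp; gc.depth == 2 &amp;&amp;
            gc.type == GCType::NonConcurrentGC &amp;&amp;
            gc.reason == GCReason::Induced) {
            _walking = true;
        }
    }

    // Returns true when this GCStop ends the tracked heap walk.
    bool OnGCStop() {
        const bool done = _walking;
        _walking = false;
        return done;
    }

    bool IsWalking() const { return _walking; }

private:
    bool _walking = false;
};
</code></pre></div>
<p>The bulk events would only be accumulated while <strong>IsWalking()</strong> returns true, and the graph would be built once <strong>OnGCStop()</strong> returns true.</p>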
<p>The payload of most of these events contains arrays of instances with their dependencies. For Perfview and dotnet-gcdump, the whole graph is built in the <a href="https://github.com/dotnet/diagnostics/blob/main/src/Tools/dotnet-gcdump/DotNetHeapDump/DotNetHeapDumpGraphReader.cs#L507">ConvertHeapDataToGraph</a> method after the garbage collection ends.</p>
<h2 id="deciphering-clr-eventspayload">Deciphering CLR events payload</h2>
<p>When you look at the <a href="https://github.com/dotnet/diagnostics/tree/main/src/Tools/dotnet-gcdump/DotNetHeapDump">dotnet-gcdump implementation</a>, you realize that most of the complex code to compute the .gcdump files is physically copied from the Perfview repository.</p>
<p>The <strong>GCBulkXXX</strong> events payload contains an array of elements; each element being different. The common <strong>Count</strong> field contains the number of elements. If the element contains a string such as for <strong>BulkType</strong>, it means that each one has a different size and the string must be read entirely from the payload before accessing the next element.</p>
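<p>To illustrate why variable-size elements must be read sequentially, here is a small C++ sketch. The element layout (an 8-byte id followed by a null-terminated UTF-16 string) is a deliberately simplified, hypothetical one and not the real <strong>BulkType</strong> layout; <strong>NamedElement</strong>, <strong>ReadUtf16String</strong> and <strong>ReadElements</strong> are names made up for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;cstring&gt;
#include &lt;string&gt;
#include &lt;utility&gt;
#include &lt;vector&gt;

// Hypothetical element: a fixed-size id followed by a variable-size name.
struct NamedElement {
    std::uint64_t id;
    std::u16string name;
};

// Reads a null-terminated UTF-16 string starting at 'offset' and moves
// 'offset' just after the terminator.
std::u16string ReadUtf16String(const std::vector&lt;std::uint8_t&gt;&amp; payload, std::size_t&amp; offset) {
    std::u16string text;
    while (offset + sizeof(char16_t) &lt;= payload.size()) {
        char16_t c;
        std::memcpy(&amp;c, payload.data() + offset, sizeof(c));
        offset += sizeof(c);
        if (c == u'\0') break;
        text.push_back(c);
    }
    return text;
}

// Reads 'count' elements; the start of element N+1 is only known once the
// string of element N has been read entirely.
std::vector&lt;NamedElement&gt; ReadElements(const std::vector&lt;std::uint8_t&gt;&amp; payload, std::uint32_t count) {
    std::vector&lt;NamedElement&gt; elements;
    std::size_t offset = 0;
    for (std::uint32_t i = 0; i &lt; count; ++i) {
        if (offset + sizeof(std::uint64_t) &gt; payload.size()) break;  // truncated payload
        NamedElement element{};
        std::memcpy(&amp;element.id, payload.data() + offset, sizeof(element.id));
        offset += sizeof(element.id);
        element.name = ReadUtf16String(payload, offset);
        elements.push_back(std::move(element));
    }
    return elements;
}
</code></pre></div>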
<h2 id="type-definition">Type definition</h2>
<p>To avoid sending expensive type names all the time, each type will have an identifier that will be used in the <strong>GCBulkXXX</strong>-nodes related events.</p>
<p>The <strong>BulkType</strong> events contain an array of type definition elements with the following layout:</p>
<ul>
<li><em>TypeID</em>: id of the type (i.e. pointer to the Method Table)</li>
<li><em>ModuleID</em>: id of the module where the type is defined</li>
<li><em>TypeNameID</em>: if Name is empty, use this address as a name</li>
<li><em>Flags</em>: if this bitset contains 0x8, it is an array so append “<strong>[]</strong>” to the name in that case</li>
<li><em>CorElementType</em>:?</li>
<li><em>Name</em>: Unicode string corresponding to the name of the type where <em>`xxx</em> need to be removed in case of generics</li>
<li><em>TypeParameterCount</em>: for generics but not used</li>
<li>Array of type parameter: for generics but not used</li>
</ul>
<p>Once the type mappings are known, it becomes possible to build the graph of live type instances instead of just nodes with IDs.</p>
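<p>The naming rules listed above could be sketched like this in C++. It assumes that the name has already been converted from UTF-16 to a regular string and that the generic suffix is a backtick followed by digits; <strong>BuildTypeName</strong> and <strong>kArrayFlag</strong> are hypothetical names, and this is not the code used by Perfview or dotnet-gcdump.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cctype&gt;
#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;cstdio&gt;
#include &lt;string&gt;

// Bit of the Flags field that marks an array type (0x8 as described above).
constexpr std::uint32_t kArrayFlag = 0x8;

// Builds a display name from the fields of a BulkType element.
std::string BuildTypeName(const std::string&amp; name, std::uint32_t typeNameId, std::uint32_t flags) {
    std::string result;
    if (name.empty()) {
        // No name in the payload: fall back to the TypeNameID value.
        char buffer[16];
        std::snprintf(buffer, sizeof(buffer), "0x%x", static_cast&lt;unsigned int&gt;(typeNameId));
        result = buffer;
    } else {
        // Copy the name without the `N generic arity markers.
        for (std::size_t i = 0; i &lt; name.size();) {
            if (name[i] == '`') {
                ++i;
                while (i &lt; name.size() &amp;&amp; std::isdigit(static_cast&lt;unsigned char&gt;(name[i]))) ++i;
            } else {
                result.push_back(name[i++]);
            }
        }
    }
    if ((flags &amp; kArrayFlag) != 0) result += "[]";
    return result;
}
</code></pre></div>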
<h2 id="listing-live-objects-and-references">Listing Live Objects and References</h2>
<p>The live objects are sent in the <strong>GCBulkNode</strong> events payload:</p>
<ul>
<li><em>Index</em>: incrementing index of the bulk starting from 0</li>
<li><em>Count</em>: number of objects in the array</li>
</ul>
<p>followed by an array of Values</p>
<ul>
<li><em>Address</em>: address in memory where the object is stored</li>
<li><em>Size</em>: size of the object (including for arrays)</li>
<li><em>TypeID</em>: identifier of the object’s type, to be matched with the <em>TypeID</em> received in the <strong>BulkType</strong> events</li>
<li><em>EdgeCount</em>: number of objects pointed to by this object (i.e. non null reference type fields)</li>
</ul>
<p>Each event contains an array of live objects identified by their address. The <em>Size</em> and <em>TypeID</em> fields are easy to understand but what does the <em>EdgeCount</em> field represent? This is the number of objects that are referenced by this object. At the code level, this is the count of non-null reference type fields. For example, if a class <strong>A</strong> defines one integer field and a second one as a reference to an instance of type <strong>B</strong>, the <em>EdgeCount</em> would be 1 (because an integer is not a reference type).</p>
<p>So the next question is: where do you get the instances referenced by the objects received in the <strong>GCBulkNode</strong> payload? Since these objects are in memory, they are part of the <strong>GCBulkNode</strong> events payload but where is the relationship between the <em>EdgeCount</em> value and the corresponding objects? You will have to rebuild this relationship because the referenced objects are received in the <strong>GCBulkEdge</strong> events payload:</p>
<ul>
<li><em>Index</em>: incrementing index of the bulk starting from 0</li>
<li><em>Count</em>: number of objects in the array</li>
</ul>
<p>followed by an array of Values</p>
<ul>
<li><em>Value</em>: address in memory where the object is stored</li>
<li><em>ReferencingFieldID</em>: this is not used and is always 0</li>
</ul>
<p>Again, an array of elements is received as payload and the only interesting information is the address of the object.</p>
<p>But the magic is that both nodes and edges events payload are “in sync”: when an object is read from a <strong>GCBulkNode</strong> array with let’s say 2 as <em>EdgeCount</em> value, the current 2 elements in the array of the <strong>GCBulkEdge</strong> payload will contain the addresses of the 2 objects it references. If the next object in the <strong>GCBulkNode</strong> array has 1 as <em>EdgeCount</em> value, the next element in the array of the <strong>GCBulkEdge</strong> payload will be the address of the object it references, as shown in the following figure:</p>
<p><img loading="lazy" src="/posts/2023-08-11_net-gcdump-internals/1_nMFTtE3rNI50uxIs7qMUtw.png"></p>
<p>It means that both payloads must be iterated in sync.</p>
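<p>Here is a minimal C++ sketch of this synchronized iteration. It assumes that the values of all <strong>GCBulkNode</strong> events (respectively <strong>GCBulkEdge</strong> events) have already been concatenated into one vector in the order of their <em>Index</em>; the <strong>Node</strong>, <strong>ReferenceGraph</strong> and <strong>BuildReferenceGraph</strong> names are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;unordered_map&gt;
#include &lt;vector&gt;

// One value of a GCBulkNode payload.
struct Node {
    std::uint64_t address;
    std::uint64_t size;
    std::uint64_t typeId;
    std::uint32_t edgeCount;
};

// Maps the address of an object to the addresses of the objects it references.
using ReferenceGraph = std::unordered_map&lt;std::uint64_t, std::vector&lt;std::uint64_t&gt;&gt;;

// 'edges' holds the addresses received in the GCBulkEdge payloads.
ReferenceGraph BuildReferenceGraph(const std::vector&lt;Node&gt;&amp; nodes,
                                   const std::vector&lt;std::uint64_t&gt;&amp; edges) {
    ReferenceGraph graph;
    std::size_t nextEdge = 0;  // shared cursor that keeps both arrays in sync
    for (const Node&amp; node : nodes) {
        std::vector&lt;std::uint64_t&gt;&amp; references = graph[node.address];
        // Consume exactly edgeCount entries for this node (or fewer if edges are missing).
        for (std::uint32_t i = 0; i &lt; node.edgeCount &amp;&amp; nextEdge &lt; edges.size(); ++i) {
            references.push_back(edges[nextEdge++]);
        }
    }
    return graph;
}
</code></pre></div>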
<p>Just with these two events, it is possible to get a detailed view of the objects still used in memory with their size and their type like what you get with <strong>dotnet-gcdump report</strong> or <strong>!sos.dumpheap -stat</strong>.</p>
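<p>For example, a per-type summary in the spirit of that output could be computed with a sketch like the following one, where <strong>LiveObject</strong>, <strong>TypeStats</strong> and <strong>ComputeTypeStats</strong> are hypothetical names and the objects are supposed to come from the <strong>GCBulkNode</strong> values:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;map&gt;
#include &lt;vector&gt;

// Subset of a GCBulkNode value needed for the statistics.
struct LiveObject {
    std::uint64_t address;
    std::uint64_t size;
    std::uint64_t typeId;
};

struct TypeStats {
    std::uint64_t count = 0;
    std::uint64_t totalSize = 0;
};

// Groups the live objects by TypeID with their instance count and cumulated size.
std::map&lt;std::uint64_t, TypeStats&gt; ComputeTypeStats(const std::vector&lt;LiveObject&gt;&amp; objects) {
    std::map&lt;std::uint64_t, TypeStats&gt; stats;
    for (const LiveObject&amp; object : objects) {
        TypeStats&amp; entry = stats[object.typeId];
        entry.count += 1;
        entry.totalSize += object.size;
    }
    return stats;
}
</code></pre></div>
<p>The type names would then be resolved from the <em>TypeID</em> keys thanks to the <strong>BulkType</strong> events.</p>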
<p>With the nodes (live objects) and the edges (objects referenced by each object), it is now possible to build the reference graph of live objects.</p>
<h2 id="listing-roots">Listing Roots</h2>
<p>In addition to the objects related events, the GC is also emitting events to list the roots that are referencing objects in the managed heap from the stack, statics, handles or other weird places.</p>
<p>The most interesting roots are available thanks to the following events:</p>
<p><strong>GCBulkRootEdge</strong></p>
<ul>
<li><em>Index</em>: incrementing index of the bulk starting from 0</li>
<li><em>Count</em>: number of roots in the array</li>
</ul>
<p>followed by an array of Values</p>
<ul>
<li><em>RootedNodeAddress</em>: address in memory of the root object</li>
<li><em>GCRootKind</em>: is Stack for local variables</li>
<li><em>GCRootFlag</em>: if not a local variable, could be RefCounted, Finalizer, strong/pinning handles, or other handles</li>
<li><em>GCRootID</em>: address of the handle that points to the root object</li>
</ul>
<p>The static ones are given by the following events:</p>
<p><strong>GCBulkRootStaticVar</strong></p>
<ul>
<li><em>Count</em>: number of static roots in the array</li>
<li><em>AppDomainID</em>: app domain in which the static variable is stored</li>
</ul>
<p>followed by an array of Values</p>
<ul>
<li><em>GCRootID</em>: address of the handle that points to the root object</li>
<li><em>ObjectID</em>: address of the root object</li>
<li><em>TypeID</em>: type identifier of the root object</li>
<li><em>Flags</em>: could be ThreadLocal or not</li>
<li><em>FieldName</em>: Unicode string corresponding to the name of the field in the type corresponding to the root</li>
</ul>
<p>Other rarely used roots are available from the <strong>GCBulkRootConditionalWeakTableElementEdge</strong> and COM-related ones from <strong>GCBulkRootCCW</strong>/<strong>GCBulkRCW</strong> with the ref count for example.</p>
<p>So, each of these events provides arrays of root objects addresses. These can be used in conjunction with the reference graph built from the previous node/edge events to identify the reason why objects stay in memory. Like for the <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectreferences-method?WT.mc_id=DT-MVP-5003325">ICorProfilerCallback::ObjectReferences</a> usage <a href="/posts/2023-05-08_raiders-of-the-lost/">previously described</a>, you need to rebuild the inverse reference chain from the reference graph:</p>
<p><img loading="lazy" src="/posts/2023-08-11_net-gcdump-internals/0_x9X-LUw2zA81BHs_"></p>
<p>and deal with cycles:</p>
<p><img loading="lazy" src="/posts/2023-08-11_net-gcdump-internals/0_vvaBgFGxraMKoLKF"></p>
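<p>One possible way to do it is sketched below in C++: the reference graph is inverted, then a breadth-first search goes from the object of interest up to the first root, and the set of already visited objects is what protects against cycles. This is an illustration under my own assumptions, not the algorithm used by the existing tools; <strong>FindRootChain</strong> and the type aliases are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;queue&gt;
#include &lt;unordered_map&gt;
#include &lt;unordered_set&gt;
#include &lt;vector&gt;

using Address = std::uint64_t;
// Maps the address of an object to the addresses of the objects it references.
using ReferenceGraph = std::unordered_map&lt;Address, std::vector&lt;Address&gt;&gt;;

// Returns a chain of addresses from a root down to 'target',
// or an empty vector if no root reaches it.
std::vector&lt;Address&gt; FindRootChain(const ReferenceGraph&amp; graph,
                                   const std::unordered_set&lt;Address&gt;&amp; roots,
                                   Address target) {
    // Inverse graph: referenced object -&gt; objects that reference it.
    std::unordered_map&lt;Address, std::vector&lt;Address&gt;&gt; referencedBy;
    for (const auto&amp; entry : graph) {
        for (Address child : entry.second) {
            referencedBy[child].push_back(entry.first);
        }
    }

    // For each visited object, the next object on the way down to 'target'.
    // Being in this map means "already visited", which stops cycles.
    std::unordered_map&lt;Address, Address&gt; next;
    std::queue&lt;Address&gt; pending;
    next[target] = target;
    pending.push(target);

    while (!pending.empty()) {
        const Address current = pending.front();
        pending.pop();
        if (roots.count(current) != 0) {
            std::vector&lt;Address&gt; chain;
            for (Address a = current;; a = next[a]) {
                chain.push_back(a);
                if (a == target) break;
            }
            return chain;
        }
        const auto it = referencedBy.find(current);
        if (it == referencedBy.end()) continue;
        for (Address parent : it-&gt;second) {
            if (next.emplace(parent, current).second) pending.push(parent);
        }
    }
    return {};
}
</code></pre></div>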
<p>These roots could be an expected cache or a memory leak. For the memory leak scenario, filtering on objects in the gen2 could definitely help. This is where the <strong>GCGenerationRange</strong> events could help because their payload contains the ranges of memory addresses in each segment with the corresponding generation:</p>
<p><strong>GCGenerationRange</strong></p>
<ul>
<li><em>Generation</em>: generation of the segment</li>
<li><em>RangeStart</em>: address of the start of the segment</li>
<li><em>RangeUsedLength</em>: size of the committed part of the segment</li>
<li><em>RangeReservedLength</em>: size of the reserved part of the segment</li>
</ul>
<p>When an address falls between <em>RangeStart</em> and <em>RangeStart</em> + <em>RangeUsedLength</em>, it is part of this segment. The generation of the segment could be 0, 1 or 2 for the ephemeral segments, 3 for the Large Object Heap, and 4 for the Pinned Object Heap.</p>
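<p>This lookup could be sketched in C++ as follows, assuming that the end of the used range is exclusive and that the ranges received during the collection have been stored in a vector; <strong>GenerationRange</strong> and <strong>FindGeneration</strong> are hypothetical names.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;vector&gt;

// Payload of one GCGenerationRange event.
struct GenerationRange {
    std::uint32_t generation;
    std::uint64_t rangeStart;
    std::uint64_t rangeUsedLength;
    std::uint64_t rangeReservedLength;
};

// Returns the generation of the segment that contains 'address',
// or -1 if the address is not in the used part of any range.
int FindGeneration(const std::vector&lt;GenerationRange&gt;&amp; ranges, std::uint64_t address) {
    for (const GenerationRange&amp; range : ranges) {
        if (address &gt;= range.rangeStart &amp;&amp;
            address - range.rangeStart &lt; range.rangeUsedLength) {
            return static_cast&lt;int&gt;(range.generation);
        }
    }
    return -1;
}
</code></pre></div>
<p>Filtering the live objects on a returned value of 2 keeps only the ones that survived up to gen2.</p>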
<h2 id="integration-with-anetprofiler">Integration with a .NET Profiler</h2>
<p>As a .NET Profiler, it is possible to listen to CLR events via <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback10-eventpipeeventdelivered-method?WT.mc_id=DT-MVP-5003325">ICorProfilerCallback10::EventPipeEventDelivered</a>. If the same keywords as <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/ClrEtwAll.man#L54">0x1980001</a> have been enabled thanks to <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo12-eventpipestartsession-method?WT.mc_id=DT-MVP-5003325">ICorProfilerInfo12::EventPipeStartSession</a>, the corresponding messages will be received and you have to keep track of the fact that a gcdump is in progress. This should not be a big deal because only the GC keyword (0x1) might be already used and <a href="/posts/2019-05-28_spying-on-net-garbage/">events describing the collections</a> will be processed anyway. There won’t be duplication of events in that case.</p>
<p>However, since you need to start a session with the right keywords to trigger the special garbage collection, the <strong>ICorProfilerCallback</strong> mechanism cannot be used to continuously process the corresponding specific messages. This one-time EventPipe session should be started independently by manually connecting to the EventPipe of the currently running CLR as described in detail in <a href="/posts/2023-03-10_from-metadata-to-event/">this blog series</a>.</p>
<p>You are now ready to integrate this feature of the CLR without the need to install a tool!</p>
]]></content:encoded></item><item><title>Raiders of the lost root: looking for memory leaks in .NET</title><link>https://chrisnas.github.io/posts/2023-05-08_raiders-of-the-lost/</link><pubDate>Mon, 08 May 2023 09:05:15 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-05-08_raiders-of-the-lost/</guid><description>This post explains how you could write your own memory profiler based on new .NET 7.0 profiler APIs in C++</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>It’s been almost 12 years since I wrote <a href="https://github.com/chrisnas/DebuggingExtensions/tree/master/src/LeakShell">LeakShell</a> to help me <a href="https://codenasarre.wordpress.com/2011/05/18/leakshell-or-how-to-automatically-find-managed-leaks/">automate the search of memory leaks</a> in .NET. The idea was simple: compare 2 memory dumps of a leaking .NET application to show the types with increasing instances count.</p>
<p>Today, you could use <a href="https://learn.microsoft.com/en-us/visualstudio/profiling/memory-usage-without-debugging2?view=vs-2022?WT.mc_id=DT-MVP-5003325">Visual Studio Memory Usage</a> tool to do the same but with a much better user interface! The additional killer feature is the ability to see the references chain that explains why a “leaky” object stays in memory.</p>
<p>My previous series about <a href="/posts/2020-06-19_build-your-own-net/">building your own .NET memory profiler in C#</a> is based on CLR events and does not let you get the references chain. This post explains how you could write your own memory profiler based on .NET profiler APIs in C++. Refer to <a href="/posts/2021-08-07_start-journey-into-the/">this post</a> for an introduction of how to implement <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-interface?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback</strong></a> to be loaded by the CLR in a .NET process.</p>
<h2 id="how-to-detect-memoryleaks">How to detect memory leaks</h2>
<p>From a high level view, detecting a memory leak means being able to know which objects stay alive garbage collection after garbage collection:</p>
<ul>
<li>implement <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectallocated-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback::ObjectAllocated</strong></a> to keep track of ALL objects in the heap,</li>
<li>use <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectreferences-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback4::MovedReferences2</strong></a> to fix up the addresses when the live objects are moved during compacting garbage collections,</li>
<li>since <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectreferences-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback::ObjectReferences</strong></a> is called for each surviving object, use it to clean your list of live objects.</li>
</ul>
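<p>To give an idea of what the address fixup in the second step involves, here is a minimal sketch. It assumes that the moved blocks are received as parallel arrays (old start, new start and length of each block), as <strong>MovedReferences2</strong> does. <strong>ObjectID</strong> is redefined as a plain integer to keep the sample self-contained, and the <strong>FixupAddress</strong> and <strong>FixupTrackedObjects</strong> helpers are hypothetical names used for illustration only.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the profiling API type (an address-sized integer).
using ObjectID = std::uintptr_t;

// Returns the new address of an object if it sits inside one of the moved
// blocks, or the same address if the object was not moved.
ObjectID FixupAddress(
    ObjectID address,
    std::size_t cRanges,
    const ObjectID oldRangeStart[],
    const ObjectID newRangeStart[],
    const std::size_t rangeLength[])
{
    assert(cRanges == 0 ||
           (oldRangeStart != nullptr && newRangeStart != nullptr && rangeLength != nullptr));

    for (std::size_t i = 0; i < cRanges; i++)
    {
        // the object is in block i if: oldStart <= address < oldStart + length
        if (address >= oldRangeStart[i] && (address - oldRangeStart[i]) < rangeLength[i])
        {
            // keep the same offset inside the block at its new location
            return newRangeStart[i] + (address - oldRangeStart[i]);
        }
    }

    return address;
}

// Applies the fixup to every tracked object (hypothetical storage: a simple vector).
void FixupTrackedObjects(
    std::vector<ObjectID>& tracked,
    std::size_t cRanges,
    const ObjectID oldRangeStart[],
    const ObjectID newRangeStart[],
    const std::size_t rangeLength[])
{
    for (ObjectID& address : tracked)
    {
        address = FixupAddress(address, cRanges, oldRangeStart, newRangeStart, rangeLength);
    }
}
```

<p>This only handles moved objects: the objects that did not survive the collection still have to be removed from the list separately.</p>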
<p>The first drawback of this solution is that the CLR has to disable concurrent GC to call these functions, with a probable impact on performance. However, if you can’t find a leak in production that leads to out of memory crashes, running one instance in this mode is perfectly acceptable. The second drawback is the complexity of keeping track of objects through compacting GCs.</p>
<p>This is why <a href="https://github.com/dotnet/runtime/pull/71257">I implemented in .NET 7</a> a new set of functions in <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo13-interface?WT.mc_id=DT-MVP-5003325">ICorProfilerInfo13</a> to mimic what you can do in C# with a <a href="https://learn.microsoft.com/en-us/dotnet/standard/garbage-collection/weak-references?WT.mc_id=DT-MVP-5003325"><em>weak reference</em></a>:</p>
<ul>
<li><a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo13-createhandle-method?WT.mc_id=DT-MVP-5003325"><strong>CreateHandle</strong></a>: create a weak handle to wrap an object,</li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo13-getobjectidfromhandle-method?WT.mc_id=DT-MVP-5003325"><strong>GetObjectIDFromHandle</strong></a>: get the address of the wrapped object or null if the object is no longer in the heap,</li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo13-destroyhandle-method"><strong>DestroyHandle</strong></a>: clean up the weak handle.</li>
</ul>
<p>Creating such a weak handle for allocated objects gets rid of the address fixup complexity. However, you should not create a handle for ALL allocated objects because it will slow down the garbage collections. So the next step is to listen to the <a href="https://learn.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events#gcallocationtick_v3-event?WT.mc_id=DT-MVP-5003325">AllocationTick</a> CLR event and create a weak handle for each sampled allocation. Even though the statistical distribution of this 100 KB threshold-based sampling is not perfect, leaking objects should appear.</p>
<p>After each garbage collection detected in <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback2-garbagecollectionfinished-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback2::GarbageCollectionFinished</strong></a> or via <a href="/posts/2019-05-28_spying-on-net-garbage/">specific GC events</a>, you could clean up this list of allocated objects by removing those for which <strong>GetObjectIDFromHandle</strong> returns null.</p>
<p>Feel free to look at the <a href="https://github.com/DataDog/dd-trace-dotnet/blob/master/profiler/src/ProfilerEngine/Datadog.Profiler.Native/LiveObjectsProvider.cpp">corresponding implementation</a> in Datadog .NET profiler code.</p>
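<p>The cleanup step after each garbage collection can be sketched as follows. This is not the Datadog implementation: the <strong>SampledObject</strong> structure and the <strong>PruneCollected</strong> function are hypothetical, and the two callbacks are assumed to wrap <strong>GetObjectIDFromHandle</strong> and <strong>DestroyHandle</strong> so that the sample compiles without the profiling headers.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-ins for the profiling API types (address-sized integers).
using ObjectID = std::uintptr_t;
using ObjectHandleID = std::uintptr_t;

// Hypothetical record kept for each sampled allocation.
struct SampledObject
{
    ObjectHandleID weakHandle; // created when the allocation was sampled
    ObjectID address;          // last known address, refreshed after each GC
};

// Removes the sampled objects that have been collected and returns how many were removed.
// - resolve is assumed to wrap GetObjectIDFromHandle and to return 0 for a collected object
// - destroy is assumed to wrap DestroyHandle
std::size_t PruneCollected(
    std::vector<SampledObject>& sampled,
    const std::function<ObjectID(ObjectHandleID)>& resolve,
    const std::function<void(ObjectHandleID)>& destroy)
{
    assert(resolve && destroy);

    std::vector<SampledObject> alive;
    alive.reserve(sampled.size());

    for (SampledObject& object : sampled)
    {
        ObjectID address = resolve(object.weakHandle);
        if (address == 0)
        {
            // the object is gone: the weak handle is no longer needed
            destroy(object.weakHandle);
            continue;
        }

        // the object might have moved: no fixup needed, just take the new address
        object.address = address;
        alive.push_back(object);
    }

    std::size_t removed = sampled.size() - alive.size();
    sampled.swap(alive);
    return removed;
}
```

<p>Objects that stay in the list collection after collection are the candidates to look at when hunting for a leak.</p>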
<h2 id="rebuild-references-chain-up-to-aroot">Rebuild references chain up to a root</h2>
<p>Even though it is possible to get the call stack that led to allocating a leaking object thanks to the <strong>AllocationTick</strong> event, it would be better to know why it stays in memory. So the next step is to rebuild the references chain up to the root.</p>
<p>As explained in <a href="/posts/2021-12-18_accessing-arrays-and-class/">a previous post</a>, it is possible, for a given object, to get the list of its fields and build a graph of dependencies from a parent to its children. However, you are interested in the opposite direction, which would require getting these parent/children references for ALL objects in the heap. And this is not possible with the sampled <strong>AllocationTick</strong> event…</p>
<p>This is where <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectreferences-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback::ObjectReferences</strong></a> shines:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="nf">ObjectReferences</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">   <span class="n">ObjectID</span> <span class="n">objectId</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="n">ClassID</span>  <span class="n">classId</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span>    <span class="n">cObjectRefs</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="n">ObjectID</span> <span class="n">objectRefIds</span><span class="p">[]</span> 
</span></span><span class="line"><span class="cl"><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This method is called during a garbage collection for each object still alive (i.e. the <strong>objectId</strong> first parameter) and lists the objects in the heap referenced by its fields (i.e. the <strong>objectRefIds</strong> last parameter).</p>
<p>You could store each object as an <strong>ObjectNode</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ObjectNode</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">ObjectNode</span><span class="p">(</span><span class="n">ObjectID</span> <span class="n">objectId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">ObjectID</span> <span class="n">instance</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ObjectNode</span><span class="o">*&gt;</span> <span class="n">rootRefs</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>in a vector that represents the heap.</p>
<p>For each field, look for its corresponding node in the vector and add the node of its parent (given by <strong>objectId</strong>) to its <strong>rootRefs</strong> vector of parents. That way, you are building a back reference graph:</p>
<p><img loading="lazy" src="/posts/2023-05-08_raiders-of-the-lost/1_GXNAUQtq1-moMncqOM_yHw.png"></p>
<p>The small blue arrows show the parent/children references given by <strong>ObjectReferences</strong> and the large purple ones are kept to build the reverse references graph you are interested in.</p>
<p>You know that all live objects in the heap have been listed when <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback2-garbagecollectionfinished-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback2::GarbageCollectionFinished</strong></a> is called. It is now time to build the references chain for all sampled objects still alive (i.e. those for which <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo13-getobjectidfromhandle-method?WT.mc_id=DT-MVP-5003325"><strong>GetObjectIDFromHandle</strong></a> returns a non-null address).</p>
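<p>Here is a minimal sketch of how the back references could be recorded each time <strong>ObjectReferences</strong> is called. The <strong>HeapGraph</strong> class is a hypothetical helper, and it uses a hash map instead of a vector to find a node from its address; this is a choice made for this sample, not necessarily what a real profiler would do.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Stand-in for the profiling API type (an address-sized integer).
using ObjectID = std::uintptr_t;

struct ObjectNode
{
    explicit ObjectNode(ObjectID objectId) : instance(objectId) {}

    ObjectID instance;
    std::vector<ObjectNode*> rootRefs; // nodes of the objects referencing this one
};

// Hypothetical container representing the heap as a back reference graph.
class HeapGraph
{
public:
    // To be called with the parameters received by ObjectReferences.
    void OnObjectReferences(ObjectID objectId, std::size_t cObjectRefs, const ObjectID objectRefIds[])
    {
        assert(cObjectRefs == 0 || objectRefIds != nullptr);

        ObjectNode* parent = GetOrAdd(objectId);
        for (std::size_t i = 0; i < cObjectRefs; i++)
        {
            // record the reverse link: child --> parent
            GetOrAdd(objectRefIds[i])->rootRefs.push_back(parent);
        }
    }

    // Returns nullptr if the object has not been seen.
    ObjectNode* Find(ObjectID objectId) const
    {
        auto it = _nodes.find(objectId);
        return (it == _nodes.end()) ? nullptr : it->second.get();
    }

    std::size_t Size() const { return _nodes.size(); }

private:
    // A child can be listed before ObjectReferences is called for it,
    // so nodes are created on demand.
    ObjectNode* GetOrAdd(ObjectID objectId)
    {
        std::unique_ptr<ObjectNode>& slot = _nodes[objectId];
        if (!slot)
        {
            slot = std::make_unique<ObjectNode>(objectId);
        }
        return slot.get();
    }

    std::unordered_map<ObjectID, std::unique_ptr<ObjectNode>> _nodes;
};
```

<p>A node whose <strong>rootRefs</strong> vector stays empty at the end of the garbage collection is not referenced by any other object: it is a candidate root.</p>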
<p>It is important to understand that it is a graph and not a tree because cycles exist in .NET.</p>
<p><img loading="lazy" src="/posts/2023-05-08_raiders-of-the-lost/1_t9MonO439AUATDdp4fmdoQ.png"></p>
<p>This is common in situations where objects need to keep a reference to their “parents”. It means that these cycles should be detected when looking for the list of references of a given live object to avoid infinite recursion:</p>
<p><code>bool DumpNode(ObjectNode* node, std::vector&lt;ObjectID&gt;&amp; referenceStack)</code></p>
<p>The traversing <strong>DumpNode</strong> method takes a node (i.e. an object of the heap) and a stack where the parents will be added as we dig into the graph.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span><span class="lnt">46
</span><span class="lnt">47
</span><span class="lnt">48
</span><span class="lnt">49
</span><span class="lnt">50
</span><span class="lnt">51
</span><span class="lnt">52
</span><span class="lnt">53
</span><span class="lnt">54
</span><span class="lnt">55
</span><span class="lnt">56
</span><span class="lnt">57
</span><span class="lnt">58
</span><span class="lnt">59
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="nf">DumpNode</span><span class="p">(</span><span class="n">ObjectNode</span><span class="o">*</span> <span class="n">node</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">ObjectID</span><span class="o">&gt;&amp;</span> <span class="n">referenceStack</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// end of recursion: the node is a root
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">node</span><span class="o">-&gt;</span><span class="n">rootRefs</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">//  dump the root
</span></span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">hex</span> <span class="o">&lt;&lt;</span> <span class="n">node</span><span class="o">-&gt;</span><span class="n">instance</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">dec</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">COR_PRF_GC_ROOT_KIND</span> <span class="n">kind</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">COR_PRF_GC_ROOT_FLAGS</span> <span class="n">flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">FindRoot</span><span class="p">(</span><span class="n">_roots</span><span class="p">,</span> <span class="n">node</span><span class="o">-&gt;</span><span class="n">instance</span><span class="p">,</span> <span class="n">kind</span><span class="p">,</span> <span class="n">flags</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; | &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">DumpKind</span><span class="p">(</span><span class="n">kind</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; - &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">DumpFlags</span><span class="p">(</span><span class="n">flags</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; | ?&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; = &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">DumpObjectType</span><span class="p">(</span><span class="n">node</span><span class="o">-&gt;</span><span class="n">instance</span><span class="p">,</span> <span class="n">_pCorProfilerInfo</span><span class="p">,</span> <span class="n">_pFrameStore</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// dump the references from the root
</span></span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="kt">int64_t</span> <span class="n">i</span> <span class="o">=</span> <span class="n">referenceStack</span><span class="p">.</span><span class="n">size</span><span class="p">()</span><span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="n">i</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">--</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">ObjectID</span> <span class="n">reference</span> <span class="o">=</span> <span class="n">referenceStack</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; --&gt; &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">hex</span> <span class="o">&lt;&lt;</span> <span class="n">reference</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">dec</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; = &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">DumpObjectType</span><span class="p">(</span><span class="n">reference</span><span class="p">,</span> <span class="n">_pCorProfilerInfo</span><span class="p">,</span> <span class="n">_pFrameStore</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// detect cycles
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">Find</span><span class="p">(</span><span class="n">referenceStack</span><span class="p">,</span> <span class="n">node</span><span class="o">-&gt;</span><span class="n">instance</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// go up into the reference chain
</span></span></span><span class="line"><span class="cl">    <span class="n">referenceStack</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">node</span><span class="o">-&gt;</span><span class="n">instance</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="k">auto</span><span class="o">&amp;</span> <span class="nl">parentNode</span> <span class="p">:</span> <span class="n">node</span><span class="o">-&gt;</span><span class="n">rootRefs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">DumpNode</span><span class="p">(</span><span class="n">parentNode</span><span class="p">,</span> <span class="n">referenceStack</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">referenceStack</span><span class="p">.</span><span class="n">pop_back</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>If a parent node is already in the stack, a cycle is detected and that path is not used. Once a root is reached, the stack is dumped as shown in the following output:</p>
<pre tabindex="0"><code>OnGarbageCollectionFinished: 3859 objects in the heap.

OnRootReferences2: 90/109 roots.
            stack:40
        finalizer:1
           handle:49
            other:0
------------------

21266c00020 | H - 0 = Object[]
 --&gt; 21268c092c8 = NativeRuntimeEventSource
 --&gt; 21268c3fbe8 = EventSource.EventMetadata[]
 --&gt; 21268c18010 = ParameterInfo[]
 --&gt; 21268c17ef0 = RuntimeParameterInfo
 --&gt; 21268c0ed58 = RuntimeMethodInfo
 --&gt; 21268c0e708 = RuntimeType.RuntimeTypeCache
 --&gt; 21268c0e840 = RuntimeType.RuntimeTypeCache.MemberInfoCache&lt;System.Reflection.RuntimeMethodInfo&gt;
 --&gt; 21268c14978 = RuntimeMethodInfo[]
 --&gt; 21268c10d70 = RuntimeMethodInfo
 --&gt; 21268c29720 = Signature
 --&gt; 21268c29770 = RuntimeType[]
=====================================
</code></pre><p>As shown in the output, it is possible to provide details about the kind of root that is keeping the references chain alive thanks to <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback2-rootreferences2-method?WT.mc_id=DT-MVP-5003325">ICorProfilerCallback2::RootReferences2</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="nf">RootReferences2</span><span class="p">(</span>  
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span>  <span class="n">cRootRefs</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">   <span class="n">ObjectID</span> <span class="n">rootRefIds</span><span class="p">[],</span>  
</span></span><span class="line"><span class="cl">   <span class="n">COR_PRF_GC_ROOT_KIND</span> <span class="n">rootKinds</span><span class="p">[],</span>  
</span></span><span class="line"><span class="cl">   <span class="n">COR_PRF_GC_ROOT_FLAGS</span> <span class="n">rootFlags</span><span class="p">[],</span>  
</span></span><span class="line"><span class="cl">   <span class="n">UINT_PTR</span> <span class="n">rootIds</span><span class="p">[]</span>
</span></span><span class="line"><span class="cl"><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This function is called with synchronized arrays, each <strong>cRootRefs</strong> elements long; the first three contain for each root:</p>
<ul>
<li>the address (<strong>rootRefIds</strong> as an <strong>ObjectID</strong>),</li>
<li>the kind (<strong>rootKinds</strong> for stack, finalizer, handle and other),</li>
<li>and flags (<strong>rootFlags</strong> for pinned, weak reference, interior or ref counted).</li>
</ul>
<p>These are stored in a vector of <strong>ObjectRoot</strong>:</p>
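<p>The corresponding listing is not shown here, so what follows is only a minimal sketch of what such a structure could look like. The fields of <strong>ObjectRoot</strong> and the <strong>StoreRoots</strong> helper are assumptions; <strong>FindRoot</strong> simply follows the way it is called in <strong>DumpNode</strong>. The two enumerations are simplified stand-ins for the ones defined in the profiling headers, and skipping null entries is also an assumption.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for the profiling API types to keep the sample self-contained.
using ObjectID = std::uintptr_t;

enum COR_PRF_GC_ROOT_KIND
{
    COR_PRF_GC_ROOT_OTHER = 0,
    COR_PRF_GC_ROOT_STACK = 1,
    COR_PRF_GC_ROOT_FINALIZER = 2,
    COR_PRF_GC_ROOT_HANDLE = 3
};

enum COR_PRF_GC_ROOT_FLAGS
{
    COR_PRF_GC_ROOT_PINNED = 0x1,
    COR_PRF_GC_ROOT_WEAKREF = 0x2,
    COR_PRF_GC_ROOT_INTERIOR = 0x4,
    COR_PRF_GC_ROOT_REFCOUNTED = 0x8
};

// Hypothetical layout: one entry per root received in RootReferences2.
struct ObjectRoot
{
    ObjectID instance;
    COR_PRF_GC_ROOT_KIND kind;
    COR_PRF_GC_ROOT_FLAGS flags;
};

// Copies the synchronized arrays into the vector of roots.
void StoreRoots(
    std::vector<ObjectRoot>& roots,
    std::size_t cRootRefs,
    const ObjectID rootRefIds[],
    const COR_PRF_GC_ROOT_KIND rootKinds[],
    const COR_PRF_GC_ROOT_FLAGS rootFlags[])
{
    assert(cRootRefs == 0 ||
           (rootRefIds != nullptr && rootKinds != nullptr && rootFlags != nullptr));

    for (std::size_t i = 0; i < cRootRefs; i++)
    {
        // assumption: a null entry does not reference any object so it is skipped
        if (rootRefIds[i] == 0)
        {
            continue;
        }

        roots.push_back(ObjectRoot{rootRefIds[i], rootKinds[i], rootFlags[i]});
    }
}

// Linear search of a root by address; kind and flags are only set when found.
bool FindRoot(
    const std::vector<ObjectRoot>& roots,
    ObjectID instance,
    COR_PRF_GC_ROOT_KIND& kind,
    COR_PRF_GC_ROOT_FLAGS& flags)
{
    for (const ObjectRoot& root : roots)
    {
        if (root.instance == instance)
        {
            kind = root.kind;
            flags = root.flags;
            return true;
        }
    }

    return false;
}
```

<p>The same object can be listed several times with different kinds; this sketch returns the first match only.</p>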
<h2 id="goodies-how-to-get-arrays-typename">Goodies: how to get array type names</h2>
<p>I did not mention how to get the type name of either an <strong>ObjectID</strong> or a <strong>ClassID</strong> because it is explained in <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">a previous post</a>. However, I forgot to explain how to deal with the different kinds of arrays: single dimension (ex: <strong>byte[]</strong>), multidimensional (ex: <strong>byte[,]</strong>) or jagged (ex: <strong>byte[][]</strong>).</p>
<p>When you call <strong>ICorProfilerInfo::GetClassIDInfo</strong> on a <strong>ClassID</strong> corresponding to an array,</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ModuleID</span> <span class="n">moduleId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdTypeDef</span> <span class="n">typeDefToken</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pCorProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetClassIDInfo</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">moduleId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">typeDefToken</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>it won’t fail but the module id and the metadata token will both be set to 0.</p>
<p>Instead, you have to call <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-isarrayclass-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::IsArrayClass</strong></a> to get the rank and the item class ID of the array. This is then done recursively on the item class ID until it fails:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">arrayBuilder</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">CorElementType</span> <span class="n">baseElementType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ClassID</span> <span class="n">itemClassId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">rank</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">_pCorProfilerInfo</span><span class="o">-&gt;</span><span class="n">IsArrayClass</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">baseElementType</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">itemClassId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">rank</span><span class="p">)</span> <span class="o">==</span> <span class="n">S_OK</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">classId</span> <span class="o">=</span> <span class="n">itemClassId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">isArray</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">AppendArrayRank</span><span class="p">(</span><span class="n">arrayBuilder</span><span class="p">,</span> <span class="n">rank</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// in case of matrices, it is needed to look for the last &#34;good&#34; item class ID
</span></span></span><span class="line"><span class="cl">    <span class="c1">// because all others might be array of array of ...
</span></span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="nb">true</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">_pCorProfilerInfo</span><span class="o">-&gt;</span><span class="n">IsArrayClass</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">baseElementType</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">itemClassId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">rank</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">((</span><span class="n">hr</span> <span class="o">==</span> <span class="n">S_FALSE</span><span class="p">)</span> <span class="o">||</span> <span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">itemClassId</span> <span class="o">=</span> <span class="n">classId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">AppendArrayRank</span><span class="p">(</span><span class="n">arrayBuilder</span><span class="p">,</span> <span class="n">rank</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">classId</span> <span class="o">=</span> <span class="n">itemClassId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Notice that the way to concatenate the possible <strong>[]</strong> / <strong>[,]</strong> / <strong>[][]</strong> is the opposite of how the array type is defined in C#:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">void</span> <span class="nf">AppendArrayRank</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">arrayBuilder</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">rank</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">rank</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">arrayBuilder</span> <span class="o">=</span> <span class="s">&#34;[]&#34;</span> <span class="o">+</span> <span class="n">arrayBuilder</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">stringstream</span> <span class="n">builder</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">builder</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;[&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="n">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">rank</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">builder</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;,&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">builder</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;]&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">arrayBuilder</span> <span class="o">=</span> <span class="n">builder</span><span class="p">.</span><span class="n">str</span><span class="p">()</span> <span class="o">+</span> <span class="n">arrayBuilder</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For example, a <strong>byte[][,]</strong> is defined as a rank 2 array of arrays of byte.</p>
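<p>To make the ordering concrete, here is a minimal stand-alone sketch of the same prepend logic. <strong>PrependArrayRank</strong> and <strong>BuildArraySuffix</strong> are hypothetical names used only for this illustration, and <strong>unsigned int</strong> stands in for <strong>ULONG</strong>. The ranks are given from the outermost array to the innermost one, which is the order in which the loop above discovers them.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-alone variant of AppendArrayRank (unsigned int replaces ULONG).
// The new "[...]" part is put in FRONT of what has already been built.
void PrependArrayRank(std::string& arrayBuilder, unsigned int rank)
{
    assert(rank >= 1); // a rank of 0 is not a valid array rank

    // "[" + (rank - 1) commas + "]": "[]" for rank 1, "[,]" for rank 2, ...
    std::string part = "[" + std::string(rank - 1, ',') + "]";
    arrayBuilder = part + arrayBuilder;
}

// Hypothetical helper: ranks are listed from the outermost array to the innermost one
std::string BuildArraySuffix(const std::vector<unsigned int>& ranksOuterToInner)
{
    std::string arrayBuilder;
    for (unsigned int rank : ranksOuterToInner)
    {
        PrependArrayRank(arrayBuilder, rank);
    }
    return arrayBuilder;
}
```

<p>With this sketch, a rank 2 array whose items are rank 1 arrays, i.e. <strong>BuildArraySuffix({2, 1})</strong>, yields <strong>[][,]</strong>: the outermost rank ends up last in the string.</p>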
<h2 id="references">References</h2>
<ul>
<li><a href="https://codenasarre.wordpress.com/2011/05/18/leakshell-or-how-to-automatically-find-managed-leaks/">Automate the search of memory leaks with LeakShell</a></li>
<li><a href="/posts/2020-06-19_build-your-own-net/">Building your own .NET memory profiler in C#</a></li>
<li><a href="/posts/2021-08-07_start-journey-into-the/">Introduction to .NET Profiling with ICorProfilerCallback</a></li>
<li><a href="https://github.com/dotnet/runtime/pull/71257">Pull request in .NET 7</a> for <a href="https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo13-interface?WT.mc_id=DT-MVP-5003325">ICorProfilerInfo13</a> to create weak handles</li>
<li>Datadog .NET <a href="https://github.com/DataDog/dd-trace-dotnet/blob/master/profiler/src/ProfilerEngine/Datadog.Profiler.Native/LiveObjectsProvider.cpp">Live Heap Profiler implementation</a></li>
</ul>
]]></content:encoded></item><item><title>From Metadata to Event block in nettrace format</title><link>https://chrisnas.github.io/posts/2023-03-10_from-metadata-to-event/</link><pubDate>Fri, 10 Mar 2023 16:15:40 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-03-10_from-metadata-to-event/</guid><description>The previous episodes started the parsing of the “nettrace” format. This last episode covers Metadata and Event blocks format.</description><content:encoded><![CDATA[<hr>
<p>The previous episodes explained how to <a href="/posts/2022-09-18_net-diagnostic-ipc-protocol/">contact the .NET Diagnostics IPC server</a>, <a href="/posts/2022-10-23_clr-events-go-for/">initiate the protocol to receive CLR events</a>, and start to <a href="/posts/2023-01-15_reading-object-in-memory/">parse stacks</a> in the “nettrace” format. This last episode covers the Metadata and Event blocks.</p>
<p>In terms of format, both Metadata and Event blocks share the same memory layout:</p>
<p><img loading="lazy" src="/posts/2023-03-10_from-metadata-to-event/1_8U7zPxOVCe2Bws5g5TkX_A.png"></p>
<p>The common <strong>EventBlockHeader</strong> starts the block:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="cp">#pragma pack(1)
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">EventBlockHeader</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint16_t</span> <span class="n">HeaderSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint16_t</span> <span class="n">Flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">MinTimestamp</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">MaxTimestamp</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// some optional reserved space might be following
</span></span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The timestamp fields give the time of the first and last events in the block. The <strong>HeaderSize</strong> field is important because additional information can be stored in the header. Since I have no idea what could be stored there, I simply skip it:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">EventParserBase</span><span class="o">::</span><span class="n">OnParse</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// read event block header
</span></span></span><span class="line"><span class="cl">    <span class="n">EventBlockHeader</span> <span class="n">ebHeader</span> <span class="o">=</span> <span class="p">{};</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">Read</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ebHeader</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">ebHeader</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// skip any optional content if any
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">ebHeader</span><span class="p">.</span><span class="n">HeaderSize</span> <span class="o">&gt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">EventBlockHeader</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint16_t</span> <span class="n">optionalSize</span> <span class="o">=</span> <span class="n">ebHeader</span><span class="p">.</span><span class="n">HeaderSize</span> <span class="o">-</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">EventBlockHeader</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">SkipBytes</span><span class="p">(</span><span class="n">optionalSize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The important piece of information to figure out how to unpack the rest of the block is kept in the <strong>Flags</strong> field. If the lowest bit is set, it means that the blob headers are compressed:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// the rest of the block is a list of Event blobs
</span></span></span><span class="line"><span class="cl">    <span class="c1">//
</span></span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">blobSize</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">totalBlobSize</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">remainingBlockSize</span> <span class="o">=</span> <span class="n">_blockSize</span> <span class="o">-</span> <span class="n">ebHeader</span><span class="p">.</span><span class="n">HeaderSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">isCompressed</span> <span class="o">=</span> <span class="p">((</span><span class="n">ebHeader</span><span class="p">.</span><span class="n">Flags</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The rest of the code iterates on each blob:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Note: in order to gain space, some fields of the header could be &#34;inherited&#34;
</span></span></span><span class="line"><span class="cl">    <span class="c1">// from the header of the previous blob --&gt; need to pass it from blob to blob
</span></span></span><span class="line"><span class="cl">    <span class="n">EventBlobHeader</span> <span class="n">header</span> <span class="o">=</span> <span class="p">{};</span>
</span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="n">OnParseBlob</span><span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">isCompressed</span><span class="p">,</span> <span class="n">blobSize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">totalBlobSize</span> <span class="o">+=</span> <span class="n">blobSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">blobSize</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">totalBlobSize</span> <span class="o">&gt;=</span> <span class="n">remainingBlockSize</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="c1">// try to detect last blob
</span></span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// don&#39;t forget to check the end of block tag
</span></span></span><span class="line"><span class="cl">            <span class="kt">uint8_t</span> <span class="n">tag</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadByte</span><span class="p">(</span><span class="n">tag</span><span class="p">)</span> <span class="o">||</span> <span class="p">(</span><span class="n">tag</span> <span class="o">!=</span> <span class="n">NettraceTag</span><span class="o">::</span><span class="n">EndObject</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;Missing end of block tag</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Here is the tricky part: to gain space, each blob starts with a header that could be “compressed”. The compression mechanism is simple: the first byte is a bitfield value that indicates which fields are present (i.e. their value should be read from the memory block) or skipped (i.e. their value is the same as in the previous blob header). Therefore, an <strong>EventBlobHeader</strong> is passed by reference to the <strong>OnParseBlob</strong> function. My <strong>MetadataParser</strong> and <strong>EventParser</strong> implementations of <strong>OnParseBlob</strong> both start with the same code to read the header:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">XXXParser</span><span class="o">::</span><span class="n">OnParseBlob</span><span class="p">(</span><span class="n">EventBlobHeader</span><span class="o">&amp;</span> <span class="n">header</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">isCompressed</span><span class="p">,</span> <span class="n">DWORD</span><span class="o">&amp;</span> <span class="n">blobSize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">isCompressed</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadCompressedHeader</span><span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">blobSize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadUncompressedHeader</span><span class="p">(</span><span class="n">header</span><span class="p">,</span> <span class="n">blobSize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The implementation to read the compressed and uncompressed versions of the header is a direct translation of <a href="https://github.dev/microsoft/perfview/blob/b5d1f0423ed5fb6521fae0f3c9e92c886752ac8d/src/TraceEvent/EventPipe/EventPipeEventSource.cs#L1439">the TraceEvent C# code</a> into C++.</p>
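<p>The “present or inherited” mechanism can be illustrated with a deliberately simplified sketch. This is not the real nettrace layout: <strong>MiniHeader</strong>, the two flag values and <strong>ReadLittleEndian</strong> are hypothetical and only keep two fields, read as fixed-size little-endian values. Refer to the TraceEvent code linked above for the actual bit assignments and encodings.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified header for illustration: the real EventBlobHeader has many more fields
struct MiniHeader
{
    uint32_t MetadataId;
    uint64_t ThreadId;
};

// Hypothetical bit assignments, NOT the ones used by the nettrace format
constexpr uint8_t HasMetadataId = 0x01;
constexpr uint8_t HasThreadId   = 0x02;

// Reads a fixed-size little-endian integer and moves pos forward on success
template <typename T>
bool ReadLittleEndian(const std::vector<uint8_t>& buffer, size_t& pos, T& value)
{
    assert(pos <= buffer.size()); // pos only moves forward after a successful read
    if (buffer.size() - pos < sizeof(T))
    {
        return false;
    }

    T result = 0;
    for (size_t i = 0; i < sizeof(T); i++)
    {
        result = static_cast<T>(result | (static_cast<T>(buffer[pos + i]) << (8 * i)));
    }

    value = result;
    pos += sizeof(T);
    return true;
}

// header is an in/out parameter: a field whose flag is not set keeps
// the value it had in the previous blob header
bool ReadMiniHeader(const std::vector<uint8_t>& buffer, size_t& pos, MiniHeader& header)
{
    uint8_t flags = 0;
    if (!ReadLittleEndian(buffer, pos, flags))
    {
        return false;
    }

    if (((flags & HasMetadataId) != 0) && !ReadLittleEndian(buffer, pos, header.MetadataId))
    {
        return false;
    }

    if (((flags & HasThreadId) != 0) && !ReadLittleEndian(buffer, pos, header.ThreadId))
    {
        return false;
    }

    return true;
}
```

<p>Calling <strong>ReadMiniHeader</strong> in a loop with the same <strong>MiniHeader</strong> instance is what makes the inheritance work: a blob whose flags byte is 0 costs a single byte and reuses every value from the previous blob.</p>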
<p>The <strong>EventBlobHeader</strong> contains details of events:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="cp">#pragma pack(1)
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">EventBlobHeader_V4</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">EventSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">MetadataId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">SequenceNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">ThreadId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">CaptureThreadId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">ProcessorNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">StackId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">Timestamp</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">GUID</span> <span class="n">ActivityId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">GUID</span> <span class="n">RelatedActivityId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">PayloadSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><ul>
<li>The “identity” of an event is given by the <strong>MetadataId</strong> field that refers to information defined in a Metadata “object” (for which <strong>MetadataId</strong> is 0).</li>
<li>The <strong>SequenceNumber</strong> field is incremented on a per-thread basis each time an event is emitted. This can be used to detect dropped events: for a given <strong>CaptureThreadId</strong>, two consecutive events have <strong>SequenceNumber</strong> values that differ by more than 1 (more on dropped events in the forthcoming SequencePoint “object” description). Its value is 0 for a Metadata “object”.</li>
<li>The <strong>ThreadId</strong> and <strong>CaptureThreadId</strong> fields always have the same value for an Event “object”; <strong>CaptureThreadId</strong> is 0 for a Metadata “object”.</li>
<li>In the case of an Event “object”, the <strong>StackId</strong> field refers to one of the stacks extracted from a Stack “object”. Its value is 0 for a Metadata “object”.</li>
</ul>
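<p>The dropped events check based on <strong>SequenceNumber</strong> could look like the following sketch. <strong>DroppedEventDetector</strong> is a hypothetical helper that is not part of my parser, and it assumes that the events of a given <strong>CaptureThreadId</strong> are seen in emission order.</p>

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical helper: remembers the last SequenceNumber seen per CaptureThreadId
class DroppedEventDetector
{
public:
    // Returns how many events seem to be missing just before this one (0 if none)
    uint32_t OnEvent(uint64_t captureThreadId, uint32_t sequenceNumber)
    {
        auto it = _lastSequence.find(captureThreadId);
        if (it == _lastSequence.end())
        {
            // first event seen for this thread: nothing to compare with
            _lastSequence.emplace(captureThreadId, sequenceNumber);
            return 0;
        }

        // unsigned subtraction stays correct if the counter wraps around
        uint32_t gap = sequenceNumber - it->second;
        it->second = sequenceNumber;

        return (gap > 1) ? (gap - 1) : 0;
    }

private:
    std::unordered_map<uint64_t, uint32_t> _lastSequence;
};
```

<p>For example, seeing sequence numbers 10, 11 and then 14 for the same thread reports 2 missing events on the third call.</p>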
<h2 id="the-metadataobject">The Metadata “object”</h2>
<p>As <a href="https://github.com/microsoft/perfview/blob/main/src/TraceEvent/EventPipe/EventPipeFormat.md">the documentation states</a>, <em>each MetadataBlock holds a set of metadata records. Each metadata record has an ID and it describes one type of event. Each event has a metadataId field which will indicate the ID of the metadata record which describes that event</em>.</p>
<p>The resulting mapping is stored in the <strong>EventPipeSession</strong> class:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// per metadataID event metadata description
</span></span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">unordered_map</span><span class="o">&lt;</span><span class="kt">uint32_t</span><span class="p">,</span> <span class="n">EventCacheMetadata</span><span class="o">&gt;</span> <span class="n">_metadata</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>However, the rest of the documentation is only partially right for a nettrace stream received through EventPipe: <em>Metadata includes an event name, provider name, and the layout of fields that are encoded in the event’s payload section.</em></p>
<p>First, the field layout is simply not there. In addition, for some providers (dotnet runtime, private and rundown), the event names are empty strings. So, the data structure filled from the <strong>MetadataBlock</strong> will most of the time have an empty <strong>EventName</strong> field. Note that the “Microsoft-DotNETCore-EventPipe” provider (i.e. command events for that specific provider) and EventSource-derived classes written in C# provide the event names:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">EventCacheMetadata</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>     <span class="n">MetadataId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">ProviderName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>     <span class="n">EventId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">wstring</span> <span class="n">EventName</span><span class="p">;</span> <span class="c1">// empty most of the time
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span>     <span class="n">Keywords</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>     <span class="n">Version</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span>     <span class="n">Level</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In addition to the provider’s name, serialized as a UTF-16 string (including the trailing ‘\0’ wide character), the <strong>EventId</strong> field is the key used to identify an event.</p>
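<p>As a rough illustration of that string layout, here is a minimal sketch that decodes a null-terminated UTF-16 string from an in-memory buffer and reports how many bytes it consumed, terminator included. The <strong>ReadUtf16String</strong> name, its buffer-based signature and the little-endian byte order are assumptions made for this example; the parser in this post reads from the stream with its own <strong>ReadWString</strong> helper.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not the ReadWString used in this post): decodes a
// null-terminated UTF-16LE string from `data`, starting at `offset`.
// On success, `consumed` holds the number of bytes read, including the
// 2-byte terminator. Returns false if the buffer ends before a terminator.
bool ReadUtf16String(const std::vector<uint8_t>& data, size_t offset,
                     std::u16string& value, size_t& consumed)
{
    value.clear();
    consumed = 0;

    size_t pos = offset;
    while (pos + 2 <= data.size())
    {
        // assemble one 16-bit code unit from two little-endian bytes
        char16_t ch = static_cast<char16_t>(data[pos] | (data[pos + 1] << 8));
        pos += 2;

        if (ch == u'\0')
        {
            consumed = pos - offset;
            return true;
        }
        value.push_back(ch);
    }

    return false;
}
```

<p>Returning the consumed byte count makes it easy to keep a running total of what has been read from the current block.</p>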
<p>After these details, you will find a 4-byte value corresponding to the number of fields in the event payload. As already mentioned, this value is always 0, so my code skips the rest of the metadata block payload.</p>
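<p>Skipping “the rest” of a block boils down to subtracting what has already been read from the size of the block. Here is a minimal sketch of that computation, assuming both counters are unsigned 32-bit values. <strong>RemainingBytes</strong> is a hypothetical helper, not part of the code shown in this post. It refuses to subtract when more bytes were read than the block declares, because an unsigned subtraction would otherwise wrap around to a huge value.</p>

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: number of bytes left to skip in a block once
// `readBytes` of its `payloadSize` bytes have been consumed.
// Returns std::nullopt if more was read than the block is supposed to hold,
// which usually means that a field was misparsed.
std::optional<uint32_t> RemainingBytes(uint32_t payloadSize, uint32_t readBytes)
{
    if (readBytes > payloadSize)
    {
        return std::nullopt;
    }

    return payloadSize - readBytes;
}
```

<p>A caller would skip the returned number of bytes and treat an empty result as a parsing error instead of skipping.</p>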
<h2 id="the-eventobject">The Event “object”</h2>
<p>At last, it is time to parse the Event “object” payload! The <strong>MetadataId</strong> field of the <strong>EventBlobHeader</strong> is used to find the provider’s name and event id:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">EventParser</span><span class="o">::</span><span class="n">OnParseBlob</span><span class="p">(</span><span class="n">EventBlobHeader</span><span class="o">&amp;</span> <span class="n">header</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">isCompressed</span><span class="p">,</span> <span class="n">DWORD</span><span class="o">&amp;</span> <span class="n">blobSize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">auto</span><span class="o">&amp;</span> <span class="n">metadataDef</span> <span class="o">=</span> <span class="n">_metadata</span><span class="p">[</span><span class="n">header</span><span class="p">.</span><span class="n">MetadataId</span><span class="p">];</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>So, the rest of the function reads the payload based on the expected event id:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">switch</span> <span class="p">(</span><span class="n">metadataDef</span><span class="p">.</span><span class="n">EventId</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">case</span> <span class="n">EventIDs</span><span class="o">::</span><span class="nl">AllocationTick</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">OnAllocationTick</span><span class="p">(</span><span class="n">header</span><span class="p">.</span><span class="n">PayloadSize</span><span class="p">,</span> <span class="n">metadataDef</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">case</span> <span class="p">...</span>
</span></span><span class="line"><span class="cl">            <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">case</span> <span class="n">EventIDs</span><span class="o">::</span><span class="nl">ExceptionThrown</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">OnExceptionThrown</span><span class="p">(</span><span class="n">header</span><span class="p">.</span><span class="n">PayloadSize</span><span class="p">,</span> <span class="n">metadataDef</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">default</span><span class="o">:</span>  <span class="c1">// skip events we are not interested in
</span></span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">SkipBytes</span><span class="p">(</span><span class="n">header</span><span class="p">.</span><span class="n">PayloadSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">blobSize</span> <span class="o">+=</span> <span class="n">header</span><span class="p">.</span><span class="n">PayloadSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The format of each event payload is usually given by <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-events?WT.mc_id=DT-MVP-5003325">the Microsoft documentation</a>. If not, you should look into the <a href="https://github.com/dotnet/coreclr/blob/release/3.1/src/vm/ClrEtwAll.man">ClrEtwAll.man file</a> where the payloads of ALL events are defined. For example, the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events#gcallocationtick_v3-event?WT.mc_id=DT-MVP-5003325"><em>AllocationTick</em> event payload</a> provides the name of the last allocated type to reach the 100 KB threshold (read <a href="/posts/2020-04-18_build-your-own-net/">this blog post</a> for more details about how to use this event):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">//  AllocationAmount    UInt32          The allocation size, in bytes.
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                      This value is accurate for allocations that are less than the length of a ULONG(4,294,967,295 bytes).
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                      If the allocation is greater, this field contains a truncated value.
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                      Use AllocationAmount64 for very large allocations.
</span></span></span><span class="line"><span class="cl"><span class="c1">//  AllocationKind      UInt32          0x0 - Small object allocation(allocation is in small object heap).
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                      0x1 - Large object allocation(allocation is in large object heap).
</span></span></span><span class="line"><span class="cl"><span class="c1">//  ClrInstanceID       UInt16          Unique ID for the instance of CLR or CoreCLR.
</span></span></span><span class="line"><span class="cl"><span class="c1">//  AllocationAmount64  UInt64          The allocation size, in bytes.This value is accurate for very large allocations.
</span></span></span><span class="line"><span class="cl"><span class="c1">//  TypeId              Pointer         The address of the MethodTable.When there are several types of objects that were allocated during this event,
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                      this is the address of the MethodTable that corresponds to the last object allocated (the object that caused the 100 KB threshold to be exceeded).
</span></span></span><span class="line"><span class="cl"><span class="c1">//  TypeName            UnicodeString   The name of the type that was allocated.When there are several types of objects that were allocated during this event,
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                      this is the type of the last object allocated (the object that caused the 100 KB threshold to be exceeded).
</span></span></span><span class="line"><span class="cl"><span class="c1">//  HeapIndex           UInt32          The heap where the object was allocated.This value is 0 (zero)when running with workstation garbage collection.
</span></span></span><span class="line"><span class="cl"><span class="c1">//  Address             Pointer         The address of the last allocated object.
</span></span></span><span class="line"><span class="cl"><span class="c1">//
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Based on these field definitions, the <strong>EventParser::OnAllocationTick</strong> function reads each field one after the other with the <strong>ReadWord</strong>, <strong>ReadDWord</strong>, <strong>ReadLong</strong> and <strong>ReadWString</strong> helpers:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">EventParser</span><span class="o">::</span><span class="n">OnAllocationTick</span><span class="p">(</span><span class="n">DWORD</span> <span class="n">payloadSize</span><span class="p">,</span> <span class="n">EventCacheMetadata</span><span class="o">&amp;</span> <span class="n">metadataDef</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">readBytesCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">size</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;</span><span class="se">\n</span><span class="s">Allocation Tick:</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// get common fields
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">dword</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadDWord</span><span class="p">(</span><span class="n">dword</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">dword</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;   Amount        = &#34;</span> <span class="o">&lt;&lt;</span> <span class="n">dword</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; bytes</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadDWord</span><span class="p">(</span><span class="n">dword</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">dword</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;   Kind          = &#34;</span> <span class="o">&lt;&lt;</span> <span class="p">((</span><span class="n">dword</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="o">?</span> <span class="s">&#34;LOH&#34;</span> <span class="o">:</span> <span class="s">&#34;small&#34;</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">uint16_t</span> <span class="n">word</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadWord</span><span class="p">(</span><span class="n">word</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">word</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;   CLR ID        = &#34;</span> <span class="o">&lt;&lt;</span> <span class="n">word</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">ulong</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadLong</span><span class="p">(</span><span class="n">ulong</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">ulong</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;   Amount64      = &#34;</span> <span class="o">&lt;&lt;</span> <span class="n">ulong</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; bytes</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The bitness of the monitored application is important when “pointers” need to be read from the payload: use <strong>ReadDWord</strong> for 32-bit and <strong>ReadLong</strong> for 64-bit:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// skip useless MT address
</span></span></span><span class="line"><span class="cl">    <span class="c1">// Note: handle 32/64 bit difference
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_is64Bit</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadLong</span><span class="p">(</span><span class="n">ulong</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">ulong</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadDWord</span><span class="p">(</span><span class="n">dword</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">dword</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And if you don’t need the rest of the payload, <strong>SkipBytes</strong> is your friend:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// skip the rest of the payload
</span></span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nf">SkipBytes</span><span class="p">(</span><span class="n">payloadSize</span> <span class="o">-</span> <span class="n">readBytesCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I had some issues when dealing with the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/exception-thrown-v1-etw-event?WT.mc_id=DT-MVP-5003325"><strong>ExceptionThrown</strong> event payload</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// Type             wstring     Exception type
</span></span></span><span class="line"><span class="cl"><span class="c1">// Message          wstring     Exception message
</span></span></span><span class="line"><span class="cl"><span class="c1">// EIPCodeThrow     win:Pointer Instruction pointer where exception occurred.
</span></span></span><span class="line"><span class="cl"><span class="c1">// ExceptionHR      win:UInt32  Exception HRESULT.
</span></span></span><span class="line"><span class="cl"><span class="c1">// ExceptionFlags   win:UInt16
</span></span></span><span class="line"><span class="cl"><span class="c1">//      0x01: HasInnerException (see CLR ETW Events in the Visual Basic documentation).
</span></span></span><span class="line"><span class="cl"><span class="c1">//      0x02: IsNestedException.
</span></span></span><span class="line"><span class="cl"><span class="c1">//      0x04: IsRethrownException.
</span></span></span><span class="line"><span class="cl"><span class="c1">//      0x08: IsCorruptedStateException (indicates that the process state is corrupt).
</span></span></span><span class="line"><span class="cl"><span class="c1">//      0x10: IsCLSCompliant (an exception that derives from Exception is CLS-compliant).
</span></span></span><span class="line"><span class="cl"><span class="c1">// ClrInstanceID win:UInt16 Unique ID for the instance of CLR or CoreCLR.
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>In case of an empty message, the field itself was not even there! Not even 0 for a ‘\0’ wide character… In fact, there is a bug in the serialization code that skips the field in that case. This has been <a href="https://github.com/dotnet/runtime/commit/72e2420fd227aa45c86577622cf3ed4adfbbb461">fixed in .NET 6</a> by storing “NULL” as the serialized string: I would have preferred ‘\0’ but it seems to be compatible with the ETW implementation.</p>
<p>To support .NET Core 3+ and .NET 5, my code compares the size of the remainder of the payload after reading the exception type with the expected size of the 4 fields that follow the exception message. If it is greater, then there is a string for the message. If not, I know that the message is empty:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">EventParser</span><span class="o">::</span><span class="n">OnExceptionThrown</span><span class="p">(</span><span class="n">DWORD</span> <span class="n">payloadSize</span><span class="p">,</span> <span class="n">EventCacheMetadata</span><span class="o">&amp;</span> <span class="n">metadataDef</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">readBytesCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DWORD</span> <span class="n">size</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// read exception type
</span></span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Size of the ExceptionThrown payload AFTER the Message field
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint16_t</span> <span class="n">exceptionRemainingPayloadSize</span> <span class="o">=</span> <span class="p">(</span><span class="n">_is64Bit</span> <span class="o">?</span> <span class="mi">8</span> <span class="o">:</span> <span class="mi">4</span><span class="p">)</span> <span class="o">+</span> <span class="mi">4</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">+</span> <span class="mi">2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// In case of &#34;empty&#34; message, it might not even be visible as &#34;\0&#34; before .NET 6 (and after, will be &#34;NULL&#34;)
</span></span></span><span class="line"><span class="cl">    <span class="c1">// so it is needed to check if the remaining payload contains such a string
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">((</span><span class="n">payloadSize</span> <span class="o">-</span> <span class="n">readBytesCount</span><span class="p">)</span> <span class="o">==</span> <span class="n">_exceptionRemainingPayloadSize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">wcout</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;   message = &#39;&#39;</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadWString</span><span class="p">(</span><span class="n">strBuffer</span><span class="p">,</span> <span class="n">size</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">readBytesCount</span> <span class="o">+=</span> <span class="n">size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// handle empty string case (check for &#34;NULL&#34; in case of .NET 6+)
</span></span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">strBuffer</span><span class="p">.</span><span class="n">empty</span><span class="p">()</span> <span class="o">||</span> <span class="p">(</span><span class="n">wcscmp</span><span class="p">(</span><span class="n">strBuffer</span><span class="p">.</span><span class="n">c_str</span><span class="p">(),</span> <span class="sa">L</span><span class="s">&#34;NULL&#34;</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">wcout</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;   message = &#39;&#39;</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">std</span><span class="o">::</span><span class="n">wcout</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;   message = &#34;</span> <span class="o">&lt;&lt;</span> <span class="n">strBuffer</span><span class="p">.</span><span class="n">c_str</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// skip the rest of the payload
</span></span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nf">SkipBytes</span><span class="p">(</span><span class="n">payloadSize</span> <span class="o">-</span> <span class="n">readBytesCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="the-sequencepointblock-object">The SequencePointBlock “object”</h2>
<p>The last “object” type is the sequence point block that contains the following fields:</p>
<p><img loading="lazy" src="/posts/2023-03-10_from-metadata-to-event/1_aPaYZ4KOad8jUjnBzHggpg.png"></p>
<p>In addition to these fields, it also <a href="https://github.com/microsoft/perfview/blob/main/src/TraceEvent/EventPipe/EventPipeFormat.md#sequencepointblock-object">implicitly tells you</a> that new stack “objects” will be received (with stack ids restarting from 1) to match the next Event “objects”. For example, the following trace shows how a sequence point block resets the stacks by restarting at 1:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-markdown" data-lang="markdown"><span class="line"><span class="cl">Event block (140 bytes)
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">blob header:
</span></span><span class="line"><span class="cl">   StackId           = 3
</span></span><span class="line"><span class="cl">Contention
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">blob header:
</span></span><span class="line"><span class="cl">   StackId           = 4
</span></span><span class="line"><span class="cl">Event = 81
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">blob header:
</span></span><span class="line"><span class="cl">   StackId           = 3
</span></span><span class="line"><span class="cl">Contention
</span></span><span class="line"><span class="cl">...
</span></span><span class="line"><span class="cl">------------------------------------------------ 
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gs">________________________________________________</span>
</span></span><span class="line"><span class="cl">SequencePoint block (217 bytes)
</span></span><span class="line"><span class="cl">...
</span></span><span class="line"><span class="cl">------------------------------------------------
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gs">________________________________________________</span>
</span></span><span class="line"><span class="cl">Stack block (105 bytes)
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Stack block header:
</span></span><span class="line"><span class="cl">   FirstID: 1
</span></span><span class="line"><span class="cl">   Count  : 2
</span></span><span class="line"><span class="cl">------------------------------------------------
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gs">________________________________________________</span>
</span></span><span class="line"><span class="cl">Event block (92 bytes)
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">blob header:
</span></span><span class="line"><span class="cl">   StackId           = 1
</span></span><span class="line"><span class="cl">Contention
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">blob header:
</span></span><span class="line"><span class="cl">   StackId           = 2
</span></span><span class="line"><span class="cl">Event = 81
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">blob header:
</span></span><span class="line"><span class="cl">   StackId           = 1
</span></span><span class="line"><span class="cl">Contention:
</span></span><span class="line"><span class="cl">------------------------------------------------
</span></span></code></pre></td></tr></table>
</div>
</div><p>So, the stacks you might have cached based on the already received stack “objects” should now be invalidated, as I do in <strong>SequencePointParser::OnParse</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">SequencePointParser</span><span class="o">::</span><span class="n">OnParse</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// reset stack caches
</span></span></span><span class="line"><span class="cl">    <span class="n">_stacks32</span><span class="p">.</span><span class="n">clear</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">_stacks64</span><span class="p">.</span><span class="n">clear</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You now have all the elements you need to listen to CLR events on Windows and Linux for .NET Core 3+ and .NET 5+. If you are still running applications with .NET Framework, you will need to use ETW but this is another story.</p>
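<p>To make the stack id reset concrete, here is a minimal sketch of a cache that is emptied when a sequence point is received. The <strong>StackCacheSketch</strong> class and its methods are hypothetical names used for illustration only, and a single 64-bit map stands in for the separate <strong>_stacks32</strong> and <strong>_stacks64</strong> caches:</p>

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical cache: stack id -> instruction pointers of the frames.
class StackCacheSketch
{
public:
    // Called for each stack received in a Stack block.
    void Add(uint32_t stackId, std::vector<uint64_t> frames)
    {
        _stacks[stackId] = std::move(frames);
    }

    // Called when a SequencePoint block is received: ids restart from 1,
    // so every stack cached so far is stale.
    void OnSequencePoint()
    {
        _stacks.clear();
    }

    // Returns nullptr when the id is unknown.
    const std::vector<uint64_t>* Find(uint32_t stackId) const
    {
        auto it = _stacks.find(stackId);
        return (it != _stacks.end()) ? &it->second : nullptr;
    }

private:
    std::unordered_map<uint32_t, std::vector<uint64_t>> _stacks;
};
```

<p>Without the call to <strong>OnSequencePoint</strong>, an event received after the sequence point with <strong>StackId = 1</strong> could be matched with a stack cached before it.</p>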
<h2 id="resources">Resources</h2>
<ul>
<li><a href="/posts/2022-07-28_digging-into-the-clr/">Episode 1</a> — <em>Digging into the CLR Diagnostics IPC Protocol in C#</em></li>
<li><a href="/posts/2022-09-18_net-diagnostic-ipc-protocol/">Episode 2</a> — <em>.NET Diagnostic IPC protocol: the C++ way</em></li>
<li><a href="/posts/2022-10-23_clr-events-go-for/">Episode 3</a> — <em>CLR events: go for the nettrace file format!</em></li>
<li><a href="/posts/2022-11-27_parsing-the-nettrace-stream/">Episode 4</a> — <em>Parsing the “nettrace” stream</em></li>
<li><a href="/posts/2023-01-15_reading-object-in-memory/">Episode 5</a> — <em>Reading “object” in memory — starting with stacks</em></li>
<li><a href="https://github.com/chrisnas/ClrEvents/tree/master/Events/NativeEventListener">Source code</a> for the C++ implementation of CLR events listener</li>
<li>Diagnostics IPC protocol <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md">documentation</a></li>
</ul>
]]></content:encoded></item><item><title>Reading “object” in memory — starting with stacks</title><link>https://chrisnas.github.io/posts/2023-01-15_reading-object-in-memory/</link><pubDate>Sun, 15 Jan 2023 16:38:45 +0000</pubDate><guid>https://chrisnas.github.io/posts/2023-01-15_reading-object-in-memory/</guid><description>During the parsing of the nettrace format, blocks are serialized as “object”. Let’s look at “stack” objects.</description><content:encoded><![CDATA[<hr>
<p>The previous episodes started the parsing of the “nettrace” format used when <a href="/posts/2022-09-18_net-diagnostic-ipc-protocol/">contacting the .NET Diagnostics IPC server</a> and <a href="/posts/2022-10-23_clr-events-go-for/">initiating the protocol to receive CLR events</a>. It is now time to see how to get the payload of each “object” type, especially how stacks are stored.</p>
<p>We have seen that the stream starts with a <strong>TraceObject</strong> that describes the rest of the stream, followed by a sequence of “objects”:</p>
<p><img loading="lazy" src="/posts/2023-01-15_reading-object-in-memory/1_E9Rq89JSc_OIfW9ooEfm1A.png"></p>
<p>The remainder of each “object” is a 32-bit block size followed by the payload.</p>
<p>Well… not only. One thing I missed when I started to work on the nettrace format is that all “object” payloads must be 4-byte aligned <strong>relative to the beginning of the stream</strong>!</p>
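<p>As a minimal sketch of what this alignment rule implies, the number of padding bytes depends only on the absolute position in the stream. The helper names <strong>PaddingToAlign4</strong> and <strong>AlignedPosition</strong> are hypothetical and not taken from the repository:</p>

```cpp
#include <cstdint>

// Number of padding bytes to skip so that the next read starts on a
// 4-byte boundary, counted from the very first byte of the stream.
// 'position' is the absolute offset in the stream (hypothetical helper).
constexpr uint32_t PaddingToAlign4(uint64_t position)
{
    return static_cast<uint32_t>((4 - (position % 4)) % 4);
}

// Absolute offset of the first payload byte once the padding is skipped.
constexpr uint64_t AlignedPosition(uint64_t position)
{
    return position + PaddingToAlign4(position);
}
```

<p>For example, if the block size field ends at offset 13 of the stream, 3 bytes of padding are skipped and the payload starts at offset 16, whatever the offset inside the current “object” is.</p>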
<p>This is why I’m keeping track of the current position in the <strong>EventPipeSession</strong> class:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">private</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">IIpcEndpoint</span><span class="o">*</span> <span class="n">_pEndpoint</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">_stopRequested</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// parsers
</span></span></span><span class="line"><span class="cl">    <span class="n">MetadataParser</span> <span class="n">_metadataParser</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">EventParser</span> <span class="n">_eventParser</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">StackParser</span> <span class="n">_stackParser</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">SequencePointParser</span> <span class="n">_sequencePointParser</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Keep track of the position since the beginning of the &#34;file&#34;
</span></span></span><span class="line"><span class="cl">    <span class="c1">// i.e. starting at 0 from the first character of the NettraceHeader
</span></span></span><span class="line"><span class="cl">    <span class="c1">//      Nettrace
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">_position</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each <strong>ParseXXXBlock</strong> function checks the minimum reader version in the header before reading the “object” payload as a memory block. The idea is to be able to support backward compatibility:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">EventPipeSession</span><span class="o">::</span><span class="n">ParseMetadataBlock</span><span class="p">(</span><span class="n">ObjectHeader</span><span class="o">&amp;</span> <span class="n">header</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">header</span><span class="p">.</span><span class="n">MinReaderVersion</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">)</span> <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">blockSize</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// read the block and send it to the corresponding parser
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">blockOriginInFile</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ExtractBlock</span><span class="p">(</span><span class="s">&#34;Metadata&#34;</span><span class="p">,</span> <span class="n">blockSize</span><span class="p">,</span> <span class="n">blockOriginInFile</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">_metadataParser</span><span class="p">.</span><span class="n">Parse</span><span class="p">(</span><span class="n">_pBlock</span><span class="p">,</span> <span class="n">blockSize</span><span class="p">,</span> <span class="n">blockOriginInFile</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>ExtractBlock</strong> function reads the size of the payload (and skips the padding if any) with <strong>ReadBlockSize</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">EventPipeSession</span><span class="o">::</span><span class="n">ExtractBlock</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">blockName</span><span class="p">,</span> <span class="kt">uint32_t</span><span class="o">&amp;</span> <span class="n">blockSize</span><span class="p">,</span> <span class="kt">uint64_t</span><span class="o">&amp;</span> <span class="n">blockOriginInFile</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// get the block size
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadBlockSize</span><span class="p">(</span><span class="n">blockName</span><span class="p">,</span> <span class="n">blockSize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// skip the block + final EndOfObject tag
</span></span></span><span class="line"><span class="cl">    <span class="n">blockSize</span><span class="o">++</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The block name is only used for error messages if needed.</p>
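<p>The implementation of <strong>ReadBlockSize</strong> is not shown here. As a rough sketch under stated assumptions (an in-memory <strong>StreamCursor</strong> replaces the IPC endpoint, the size is assumed to be stored as a little-endian 32-bit value, and all names are hypothetical), reading the size and skipping the padding could look like this:</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for the IPC stream; 'position' plays
// the role of the _position field of the session.
struct StreamCursor
{
    const std::vector<uint8_t>& data;
    uint64_t position;
};

// Read the 32-bit block size (assumed little-endian), then skip the padding
// so that the payload starts 4-byte aligned from the beginning of the stream.
bool ReadBlockSizeSketch(StreamCursor& cursor, uint32_t& blockSize)
{
    if (cursor.position + sizeof(uint32_t) > cursor.data.size())
        return false;

    blockSize = 0;
    for (size_t i = 0; i < sizeof(uint32_t); i++)
    {
        blockSize |= static_cast<uint32_t>(cursor.data[cursor.position + i]) << (8 * i);
    }
    cursor.position += sizeof(uint32_t);

    // skip the padding bytes (if any)
    uint64_t padding = (4 - (cursor.position % 4)) % 4;
    if (cursor.position + padding > cursor.data.size())
        return false;
    cursor.position += padding;

    return true;
}
```

<p>The important detail is that the padding is computed from the absolute position in the stream, not from the start of the current “object”.</p>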
<p>The next step is to read the payload in a memory block using these two <strong>EventPipeSession</strong> fields:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// buffer used to read each block that will be then parsed
</span></span></span><span class="line"><span class="cl">    <span class="kt">uint8_t</span><span class="o">*</span> <span class="n">_pBlock</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">_blockSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In the session constructor, <strong>_blockSize</strong> is set to 4 KB and <strong>_pBlock</strong> points to an allocated memory buffer of that size.</p>
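<p>A buffer that starts at 4 KB and grows on demand could also be sketched with <strong>std::vector</strong> instead of a raw pointer. This is not the code of the repository: the <strong>BlockBufferSketch</strong> class and its members are hypothetical, and the 100 KB cap is the maximum block size mentioned in this article:</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical RAII variant of the _pBlock/_blockSize pair.
class BlockBufferSketch
{
public:
    static constexpr uint32_t InitialSize = 4 * 1024;
    static constexpr uint32_t MaxSize = 100 * 1024;

    BlockBufferSketch() : _buffer(InitialSize, 0) {}

    // Make sure the buffer can hold blockSize bytes; refuse blocks above the cap.
    bool EnsureCapacity(uint32_t blockSize)
    {
        if (blockSize > MaxSize)
            return false;

        // grow (and zero) the buffer only when it is too small
        if (blockSize > _buffer.size())
            _buffer.assign(blockSize, 0);

        return true;
    }

    uint8_t* Data() { return _buffer.data(); }
    uint32_t Size() const { return static_cast<uint32_t>(_buffer.size()); }

private:
    std::vector<uint8_t> _buffer;
};
```

<p>The vector releases its memory automatically, so no explicit <strong>delete []</strong> is needed when the buffer grows or when the session is destroyed.</p>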
<p>The rest of <strong>ExtractBlock</strong> deals with the payload size: if the current payload to parse is larger than <strong>_blockSize</strong>, the <strong>_pBlock</strong> buffer is reallocated and <strong>_blockSize</strong> is updated, up to a maximum of 100 KB (i.e. the maximum block size sent by the CLR).</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl">    <span class="c1">// check whether the block buffer needs to be resized
</span></span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_blockSize</span> <span class="o">&lt;</span> <span class="n">blockSize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// don&#39;t expect blocks larger than 100KB
</span></span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">blockSize</span> <span class="o">&gt;</span> <span class="n">MAX_BLOCK_SIZE</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">delete</span> <span class="p">[]</span> <span class="n">_pBlock</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_pBlock</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">uint8_t</span><span class="p">[</span><span class="n">blockSize</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">        <span class="o">::</span><span class="n">ZeroMemory</span><span class="p">(</span><span class="n">_pBlock</span><span class="p">,</span> <span class="n">blockSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">_blockSize</span> <span class="o">=</span> <span class="n">blockSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// keep track of the current position in file for padding
</span></span></span><span class="line"><span class="cl">    <span class="n">blockOriginInFile</span> <span class="o">=</span> <span class="n">_position</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">Read</span><span class="p">(</span><span class="n">_pBlock</span><span class="p">,</span> <span class="n">blockSize</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Error</span> <span class="o">=</span> <span class="o">::</span><span class="n">GetLastError</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;Error while extracting &#34;</span> <span class="o">&lt;&lt;</span> <span class="n">blockName</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; block: 0x&#34;</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">hex</span> <span class="o">&lt;&lt;</span> <span class="n">Error</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">dec</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;</span><span class="se">\n</span><span class="s">&#34;</span> <span class="o">&lt;&lt;</span> <span class="n">blockName</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; block (&#34;</span> <span class="o">&lt;&lt;</span> <span class="n">blockSize</span> <span class="o">&lt;&lt;</span> <span class="s">&#34; bytes)</span><span class="se">\n</span><span class="s">&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">DumpBuffer</span><span class="p">(</span><span class="n">_pBlock</span><span class="p">,</span> <span class="n">blockSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For debugging purposes, I’m displaying each “object” payload thanks to the <strong>DumpBuffer</strong> helper:</p>
<p><img loading="lazy" src="/posts/2023-01-15_reading-object-in-memory/1_bWygt-T2kMQfIlcxIv_3VA.png"></p>
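<p>The code of <strong>DumpBuffer</strong> is not listed in this post. As an idea of what such a helper might do, here is a minimal hex dump sketch; the <strong>DumpBufferSketch</strong> name, the extra output stream parameter and the 16 bytes per line layout are assumptions, not the actual implementation:</p>

```cpp
#include <cctype>
#include <cstdint>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Write the buffer as lines of 16 bytes: hexadecimal values first,
// then the printable characters ('.' replaces the other ones).
void DumpBufferSketch(const uint8_t* pBuffer, uint32_t byteCount, std::ostream& out)
{
    const uint32_t bytesPerLine = 16;
    for (uint32_t offset = 0; offset < byteCount; offset += bytesPerLine)
    {
        std::ostringstream hex;
        std::string text;
        for (uint32_t i = 0; i < bytesPerLine; i++)
        {
            if (offset + i < byteCount)
            {
                uint8_t value = pBuffer[offset + i];
                hex << std::hex << std::uppercase << std::setw(2) << std::setfill('0')
                    << static_cast<int>(value) << ' ';
                text += std::isprint(value) ? static_cast<char>(value) : '.';
            }
            else
            {
                hex << "   "; // keep the text column aligned on the last line
            }
        }
        out << hex.str() << " " << text << "\n";
    }
}
```

<p>Passing <strong>std::cout</strong> as the last parameter prints the dump to the console; a string stream is handy to check the output in a test.</p>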
<p>To ease access to the memory block content, my <strong>BlockParser</strong> class is used as the base class of each dedicated parser:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">BlockParser</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">BlockParser</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">Parse</span><span class="p">(</span><span class="kt">uint8_t</span><span class="o">*</span> <span class="n">pBlock</span><span class="p">,</span> <span class="kt">uint32_t</span> <span class="n">bytesCount</span><span class="p">,</span> <span class="kt">uint64_t</span> <span class="n">blockOriginInFile</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">void</span> <span class="nf">SetPointerSize</span><span class="p">(</span><span class="kt">uint8_t</span> <span class="n">pointerSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint8_t</span> <span class="n">PointerSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">protected</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="k">virtual</span> <span class="kt">bool</span> <span class="n">OnParse</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Access helpers
</span></span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">Read</span><span class="p">(</span><span class="n">LPVOID</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">DWORD</span> <span class="n">bufferSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadByte</span><span class="p">(</span><span class="kt">uint8_t</span><span class="o">&amp;</span> <span class="n">byte</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadWord</span><span class="p">(</span><span class="kt">uint16_t</span><span class="o">&amp;</span> <span class="n">word</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadDWord</span><span class="p">(</span><span class="kt">uint32_t</span><span class="o">&amp;</span> <span class="n">dword</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadLong</span><span class="p">(</span><span class="kt">uint64_t</span><span class="o">&amp;</span> <span class="n">ulong</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadDouble</span><span class="p">(</span><span class="kt">double</span><span class="o">&amp;</span> <span class="n">d</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadVarUInt32</span><span class="p">(</span><span class="kt">uint32_t</span><span class="o">&amp;</span> <span class="n">val</span><span class="p">,</span> <span class="n">DWORD</span><span class="o">&amp;</span> <span class="n">size</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadVarUInt64</span><span class="p">(</span><span class="kt">uint64_t</span><span class="o">&amp;</span> <span class="n">val</span><span class="p">,</span> <span class="n">DWORD</span><span class="o">&amp;</span> <span class="n">size</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">ReadWString</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">wstring</span><span class="o">&amp;</span> <span class="n">wstring</span><span class="p">,</span> <span class="n">DWORD</span><span class="o">&amp;</span> <span class="n">bytesRead</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="nf">SkipBytes</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">byteCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// shared fields
</span></span></span><span class="line"><span class="cl"><span class="k">protected</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">_is64Bit</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">_blockSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">_pos</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">private</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint8_t</span><span class="o">*</span> <span class="n">_pBlock</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint64_t</span> <span class="n">_blockOriginInFile</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>Parse</strong> function accepts the memory buffer containing an “object” payload, its size, and its position from the beginning of the stream. A derived class has to implement the <strong>OnParse</strong> function using the <strong>ReadXXX</strong> helpers.</p>
<p>The two <strong>ReadVarUIntXXX</strong> functions differ from the other direct read helpers because they decode the simple compression scheme used to serialize 32-bit and 64-bit numbers.</p>
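<p>To give an idea of what such a compression looks like, here is a minimal standalone sketch. It assumes the usual 7-bit scheme (the low 7 bits of each byte carry data, least significant group first, and the high bit signals that another byte follows). The <strong>DecodeVarUInt32</strong> name and its signature are hypothetical and are not the actual <strong>BlockParser</strong> helper:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical standalone helper (not the BlockParser::ReadVarUInt32 member).
// Assumes a 7-bit variable-length encoding: each byte carries 7 bits of data,
// least significant group first; a set high bit means "one more byte follows".
// bytesRead receives the number of bytes consumed from data.
// Returns false if the buffer ends too early or the value needs more than 5 bytes.
bool DecodeVarUInt32(const uint8_t* data, size_t size, uint32_t& value, size_t& bytesRead)
{
    assert(data != nullptr);

    value = 0;
    bytesRead = 0;
    for (uint32_t shift = 0; shift < 35; shift += 7)
    {
        if (bytesRead >= size)
            return false;  // truncated buffer

        const uint8_t b = data[bytesRead++];
        value |= static_cast<uint32_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            return true;   // last byte of the number
    }

    return false;          // more than 5 bytes: not a valid 32-bit value
}
```

<p>With this scheme, the two bytes 0xAC 0x02 decode to 300 (0x2C + (0x02 &lt;&lt; 7)), so small numbers take one or two bytes instead of four.</p>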
<p>In the different types of “object” payloads, the strings are serialized as UTF16 strings ending with a “\0” wide character. Here is the implementation of the helper function used to read a <strong>std::wstring</strong> from a memory block:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">bool</span> <span class="n">BlockParser</span><span class="o">::</span><span class="n">ReadWString</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">wstring</span><span class="o">&amp;</span> <span class="n">wstring</span><span class="p">,</span> <span class="n">DWORD</span><span class="o">&amp;</span> <span class="n">bytesRead</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint16_t</span> <span class="n">character</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">bytesRead</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>  <span class="c1">// in case of empty string
</span></span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="nb">true</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ReadWord</span><span class="p">(</span><span class="n">character</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// protect against invalid UNICODE character (due to missing fields in ExceptionThrown event)
</span></span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">character</span> <span class="o">&gt;</span> <span class="mi">256</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// rewind the character
</span></span></span><span class="line"><span class="cl">            <span class="n">_pos</span> <span class="o">=</span> <span class="n">_pos</span> <span class="o">-</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">character</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="c1">// this is only covering a missing string
</span></span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="p">(</span><span class="n">bytesRead</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">bytesRead</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">character</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// Note that an empty string contains only that \0 character
</span></span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">character</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="c1">// \0 final character of the string
</span></span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">wstring</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">character</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note the check on the character value in the loop: it works around a serialization issue that I will discuss later, when the event “object” block is detailed.</p>
<h2 id="the-stackobject">The Stack “object”</h2>
<p>If you remember my previous post about <a href="/posts/2020-05-18_build-your-own-net/">retrieving call stacks for CLR events with TraceEvent</a>, you might be wondering why there is a specific stack object since a <strong>ClrStackWalk</strong> event should contain the frames if the <strong>Stack</strong> keyword is enabled for the .NET provider. In fact, the current TraceEvent implementation is not using the stack object sent by the CLR (maybe to have the same code between ETW and EventPipe).</p>
<p>One stack “object” received in a nettrace stream contains one or more stacks. Each stack is identified by an id (more about this soon) and contains a list of instruction pointer addresses.</p>
<p><img loading="lazy" src="/posts/2023-01-15_reading-object-in-memory/1_H_HbR0-xWzR3SWV2KEimpQ.png"></p>
<p>In the previous screenshot, the id of the first stack is 1 and the second is 2. In the next stack “object”, the <strong>FirstId</strong> field will be 3 and so on. This avoids storing the id in each call stack and saves space.</p>
<p>Note that, even if it does not seem to make sense, the list of addresses might be empty.</p>
<p><img loading="lazy" src="/posts/2023-01-15_reading-object-in-memory/1_3NZb7jI5cPwz4wtDnZzhCw.png"></p>
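<p>The implicit id scheme can be sketched as follows. This is only an illustration under assumptions: I assume that the block starts with a 32-bit <strong>FirstId</strong> followed by a 32-bit count, that each stack is prefixed by its size in bytes, that addresses are 64-bit, and that the host is little-endian. The <strong>StackSketch64</strong> and <strong>ParseStackBlockSketch</strong> names are hypothetical:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical 64-bit stack holder used only by this sketch
struct StackSketch64
{
    uint32_t Id;
    std::vector<uint64_t> Frames;
};

// Reads a 32-bit value at pos and advances pos (assumes a little-endian host).
static bool ReadU32(const uint8_t* data, size_t size, size_t& pos, uint32_t& value)
{
    if (size - pos < sizeof(value))
        return false;

    std::memcpy(&value, data + pos, sizeof(value));
    pos += sizeof(value);
    return true;
}

// Assumed layout: FirstId (32-bit), Count (32-bit), then Count stacks,
// each one prefixed by its size in bytes (32-bit).
bool ParseStackBlockSketch(const uint8_t* data, size_t size,
                           std::unordered_map<uint32_t, StackSketch64>& stacks)
{
    assert(data != nullptr);

    size_t pos = 0;
    uint32_t firstId = 0;
    uint32_t count = 0;
    if (!ReadU32(data, size, pos, firstId) || !ReadU32(data, size, pos, count))
        return false;

    for (uint32_t i = 0; i < count; i++)
    {
        uint32_t stackSize = 0;  // in bytes; 0 means an empty list of addresses
        if (!ReadU32(data, size, pos, stackSize))
            return false;
        if ((stackSize % sizeof(uint64_t)) != 0 || (size - pos) < stackSize)
            return false;

        StackSketch64 stack;
        stack.Id = firstId + i;  // the id is never stored: it is deduced from FirstId
        stack.Frames.resize(stackSize / sizeof(uint64_t));
        if (stackSize > 0)
            std::memcpy(stack.Frames.data(), data + pos, stackSize);
        pos += stackSize;

        const uint32_t id = stack.Id;
        stacks[id] = std::move(stack);
    }

    return true;
}
```

<p>A block with a <strong>FirstId</strong> of 3 and two stacks therefore fills the entries 3 and 4 of the map, even when the second stack has no address.</p>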
<p>These call stacks are stored in <strong>EventPipeSession</strong> in a per-id cache:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// per stackID stack
</span></span></span><span class="line"><span class="cl">    <span class="c1">// only one will be used depending on the bitness of the monitored application
</span></span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">unordered_map</span><span class="o">&lt;</span><span class="kt">uint32_t</span><span class="p">,</span> <span class="n">EventCacheStack32</span><span class="o">&gt;</span> <span class="n">_stacks32</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">unordered_map</span><span class="o">&lt;</span><span class="kt">uint32_t</span><span class="p">,</span> <span class="n">EventCacheStack64</span><span class="o">&gt;</span> <span class="n">_stacks64</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The frames are stored as addresses in a vector:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">EventCacheStack32</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="kt">uint32_t</span> <span class="n">Id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">uint32_t</span><span class="o">&gt;</span> <span class="n">Frames</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The CLR sends one stack per unique call stack (i.e. two stacks differ by at least one frame). As you will soon see, each event “object” contains the id of the stack corresponding to the chain of calls from which it was sent.</p>
<p>The next episode will detail the <strong>Metadata</strong> and <strong>Event</strong> blocks to end the series.</p>
<h2 id="resources">Resources</h2>
<ul>
<li><a href="/posts/2022-07-28_digging-into-the-clr/">Episode 1</a> — <em>Digging into the CLR Diagnostics IPC Protocol in C#</em></li>
<li><a href="/posts/2022-09-18_net-diagnostic-ipc-protocol/">Episode 2</a> — <em>.NET Diagnostic IPC protocol: the C++ way</em></li>
<li><a href="/posts/2022-10-23_clr-events-go-for/">Episode 3</a> — <em>CLR events: go for the nettrace file format!</em></li>
<li><a href="/posts/2022-11-27_parsing-the-nettrace-stream/">Episode 4</a> — <em>Parsing the “nettrace” stream</em></li>
<li><a href="https://github.com/chrisnas/ClrEvents/tree/master/Events/NativeEventListener">Source code</a> for the C++ implementation of CLR events listener</li>
<li>Diagnostics IPC protocol <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md">documentation</a></li>
</ul>
]]></content:encoded></item><item><title>Digging into the CLR Diagnostics IPC Protocol in C#</title><link>https://chrisnas.github.io/posts/2022-07-28_digging-into-the-clr/</link><pubDate>Thu, 28 Jul 2022 08:59:40 +0000</pubDate><guid>https://chrisnas.github.io/posts/2022-07-28_digging-into-the-clr/</guid><description>Learn how to directly connect to the .NET CLR and send diagnostics commands</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>As I explained during <a href="https://www.youtube.com/watch?v=Jpoy3O6x-wM&amp;t=1530s">a DotNext conference session</a>, the .NET CLI tools such as <strong>dotnet-trace</strong>, <strong>dotnet-counters</strong> or <strong>dotnet-dump</strong> communicate with the CLR through a named pipe on Windows and a domain socket on Linux. Within the CLR, a <a href="https://github.com/dotnet/coreclr/blob/release/3.1/src/vm/diagnosticserver.cpp#24">diagnostic server thread</a> is responsible for answering requests. A communication protocol allows a tool to send <em>commands</em> and expect <em>responses</em>. This Diagnostic IPC Protocol is <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md">pretty well documented</a> in the dotnet Diagnostics repository.</p>
<p>Before going into the protocol details, here is a list of the available commands and their effect:</p>
<p><img loading="lazy" src="/posts/2022-07-28_digging-into-the-clr/1_0LmzdTyId2oJIPkSac1EAA.png"></p>
<p>This series will detail how to communicate with a CLR using this protocol both in C# and in C++. Processing CLR events received through EventPipe will also be covered.</p>
<h2 id="make-it-simple-use-microsoftdiagnosticsnetcoreclient-nuget">Make it simple: use Microsoft.Diagnostics.NETCore.Client nuget</h2>
<p>With the <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent">TraceEvent nuget package</a>, Microsoft provided a great library to <a href="/posts/2018-07-26_grab-etw-session-providers/">easily listen to CLR events</a> in C#. If you want to easily send diagnostic IPC protocol commands to the CLR of a .NET process, the <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.NETCore.Client/">Microsoft.Diagnostics.NETCore.Client nuget package</a> is for you. Remember that EventPipe is implemented by .NET Core and .NET 5+ (so there is no .NET Framework support).</p>
<p>The Swiss knife class <strong>DiagnosticsClient</strong> gives you access to most of the commands plus a way to list .NET processes as a bonus:</p>
<p><img loading="lazy" src="/posts/2022-07-28_digging-into-the-clr/1_lbYwy45LUJHmX-WsdrGJYw.png"></p>
<p>If you want to get the pid of all supported running .NET applications, call the static <strong>GetPublishedProcesses()</strong> method. Beware that the pid of your own application will also be included.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ListProcesses</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">selfPid</span> <span class="p">=</span> <span class="n">Process</span><span class="p">.</span><span class="n">GetCurrentProcess</span><span class="p">().</span><span class="n">Id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">pid</span> <span class="k">in</span> <span class="n">DiagnosticsClient</span><span class="p">.</span><span class="n">GetPublishedProcesses</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">process</span> <span class="p">=</span> <span class="n">Process</span><span class="p">.</span><span class="n">GetProcessById</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;{pid,6}{GetSeparator(pid == selfPid)}{process.ProcessName}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Otherwise, create an instance passing the process ID of the .NET application you are interested in. With this object, call the method corresponding to the command you want to send. For example, the following code is calling <strong>GetProcessEnvironment()</strong> to list the environment variables:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ListEnvironmentVariables</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// get environment variables via existing wrapper in DiagnosticsClient</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">client</span> <span class="p">=</span> <span class="k">new</span> <span class="n">DiagnosticsClient</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">envVariables</span> <span class="p">=</span> <span class="n">client</span><span class="p">.</span><span class="n">GetProcessEnvironment</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">variable</span> <span class="k">in</span> <span class="n">envVariables</span><span class="p">.</span><span class="n">Keys</span><span class="p">.</span><span class="n">OrderBy</span><span class="p">(</span><span class="n">k</span> <span class="p">=&gt;</span> <span class="n">k</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;{variable,26} = {envVariables[variable]}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that the value “ExitCode=00000000” is associated with the empty (“”) key for a reason unknown to me…</p>
<p>Even though the undocumented command to set an environment variable is available via the <strong>SetEnvironmentVariable()</strong> method, there is no helper method wrapping the <strong>ProcessInfo</strong> command. In fact, a <strong>GetProcessInfo()</strong> method exists but it is internal! The <strong>PidIpcEndpoint</strong> type in charge of the transport and the <strong>IpcMessage</strong>, <strong>IpcResponse</strong> and <strong>IpcClient</strong> types dealing with commands are also internal. It means that the nuget will not help if you need to send the <strong>ProcessInfo</strong> command.</p>
<h2 id="still-easy-use-microsoftdiagnosticsnetcoreclient-sourcecode">Still easy: use Microsoft.Diagnostics.NETCore.Client source code</h2>
<p>The .NET team spends some extra time testing, documenting, and verifying they are happy with the APIs in NetCore.Client before making them public, so sometimes you will see types that they used in their own tools that are still internal. But wait, if the CLI tools need some of these types, how will it work? Well…</p>
<p>The C# project corresponding to the Microsoft.Diagnostics.NETCore.Client assembly is part of the dotnet Diagnostics repository where the tools are implemented. If you look at <a href="https://github.com/dotnet/diagnostics/blob/main/src/Microsoft.Diagnostics.NETCore.Client/Microsoft.Diagnostics.NETCore.Client.csproj">the .csproj file</a>, you will see <strong>InternalsVisibleTo</strong> attributes that allow the tools to access the internal types:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl">  <span class="nt">&lt;ItemGroup&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;dotnet-counters&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;dotnet-dsrouter&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;dotnet-monitor&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;dotnet-trace&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;Microsoft.Diagnostics.Monitoring&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;Microsoft.Diagnostics.Monitoring.EventPipe&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="c">&lt;!-- Temporary until Diagnostic Apis are finalized--&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;Microsoft.Diagnostics.Monitoring.WebApi&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&lt;InternalsVisibleTo</span> <span class="na">Include=</span><span class="s">&#34;Microsoft.Diagnostics.NETCore.Client.UnitTests&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&lt;/ItemGroup&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The great thing about OSS is that you can compile your own fork to make these types public. Of course you will be on your own to support these custom builds of the library and it is possible there will be changes to the API before .NET makes it public.</p>
<p>So what you could do to use these internal types in your code is the following:</p>
<ul>
<li>copy the folder from the Diagnostics repository</li>
<li>add the name of your assembly that needs to access the internal types and members into the .csproj</li>
<li>replace the reference to the nuget package by a project reference to the copied project</li>
</ul>
<p>And now <strong>GetProcessInfo</strong> and the other internal types are accessible from your code:</p>
<p><img loading="lazy" src="/posts/2022-07-28_digging-into-the-clr/1_RHI5XtwfR4iN2EI3S9o6bg.png"></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ListProcessInfo</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">client</span> <span class="p">=</span> <span class="k">new</span> <span class="n">DiagnosticsClient</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">info</span> <span class="p">=</span> <span class="n">client</span><span class="p">.</span><span class="n">GetProcessInfo</span><span class="p">();</span>  <span class="c1">// this method is internal</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;              Command Line = {info.CommandLine}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;              Architecture = {info.ProcessArchitecture}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;      Entry point assembly = {info.ManagedEntrypointAssemblyName}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;               CLR Version = {info.ClrProductVersionString}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that during my tests, I was able to get a value for the <strong>ManagedEntrypointAssemblyName</strong> or <strong>ClrProductVersionString</strong> properties only with .NET 6+: the <strong>ProcessInfo2</strong> (0x404) command does not seem to be implemented in previous versions.</p>
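<p>To make the 0x404 notation more concrete, here is a rough C++ sketch of the message header that carries a command. It assumes the layout given in the protocol documentation: a 14-byte “DOTNET_IPC_V1” magic (including the final \0), a 16-bit little-endian total size, one byte for the command set, one byte for the command id, and 16 reserved bits. With this layout, 0x404 would map to command set 0x04 and command id 0x04. The <strong>BuildIpcHeader</strong> function is a hypothetical helper, not part of any library:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: builds the 20-byte header of a Diagnostic IPC message.
// Assumed layout: magic (14 bytes) + size (2) + command set (1) + command id (1) + reserved (2).
// payloadSize is the number of bytes that will follow the header.
std::vector<uint8_t> BuildIpcHeader(uint8_t commandSet, uint8_t commandId, size_t payloadSize)
{
    static const char Magic[14] = "DOTNET_IPC_V1";  // 13 characters + final \0
    const size_t headerSize = 20;
    assert(payloadSize <= 0xFFFF - headerSize);     // the size field is only 16 bits wide

    const uint16_t totalSize = static_cast<uint16_t>(headerSize + payloadSize);
    std::vector<uint8_t> header(headerSize, 0);
    std::memcpy(header.data(), Magic, sizeof(Magic));
    header[14] = static_cast<uint8_t>(totalSize & 0xFF);  // little-endian size
    header[15] = static_cast<uint8_t>(totalSize >> 8);
    header[16] = commandSet;
    header[17] = commandId;
    // bytes 18 and 19 are reserved and stay at 0

    return header;
}
```

<p>Under these assumptions, a <strong>ProcessInfo2</strong> request without payload would start with <strong>BuildIpcHeader(0x04, 0x04, 0)</strong>.</p>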
<p>The next episode of the series will start to explain the EventPipe IPC protocol from a native C++ developer perspective.</p>
<h2 id="resources">Resources</h2>
<ul>
<li><a href="https://www.nuget.org/packages/Microsoft.Diagnostics.NETCore.Client/">Microsoft.Diagnostics.NETCore.Client nuget package</a></li>
<li><a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent">TraceEvent nuget package</a></li>
<li>Diagnostics IPC protocol <a href="https://github.com/dotnet/diagnostics/blob/main/documentation/design-docs/ipc-protocol.md">documentation</a></li>
</ul>
]]></content:encoded></item><item><title>Troubleshooting CPU and exceptions issues with Datadog toolbox</title><link>https://chrisnas.github.io/posts/2022-06-09_troubleshooting-cpu-and-except/</link><pubDate>Thu, 09 Jun 2022 16:09:27 +0000</pubDate><guid>https://chrisnas.github.io/posts/2022-06-09_troubleshooting-cpu-and-except/</guid><description>Learn how to use Datadog CPU and exceptions profiling to troubleshoot well known Tess Ferrandez BuggyBits</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>With the new 2.10 release of the Datadog .NET Tracer and Continuous Profiler available, it is time to update some investigation workflows <a href="/posts/2022-01-28_troubleshooting-net-performanc/">I already introduced</a>. New features have been added to help you diagnose performance issues in your .NET applications:</p>
<ul>
<li>Linux support!</li>
<li>Code Hotspots: allow you to automatically navigate from lengthy spans and requests to profiles</li>
<li>CPU profiling: pinpoint high CPU consuming methods</li>
<li>Exceptions profiling: identify exceptions distributions</li>
<li>Profile sequence: easily profile an application startup</li>
</ul>
<p>The goal of this post is to show you how all these features make your investigations easier. I would recommend reading <a href="/posts/2022-01-28_troubleshooting-net-performanc/">the previous post</a>; especially for the environment setup that I won’t repeat here.</p>
<h2 id="its-linux-showtime">It’s Linux showtime!</h2>
<p>The .NET Continuous Profiler is now available for Linux. The only requirement is glibc 2.18+ in the distribution; for example, CentOS 7 is not supported. Beyond that, we provide feature parity between Linux and Windows.</p>
<p>In terms of installation, download the <a href="https://github.com/DataDog/dd-trace-dotnet/releases">.NET Tracer package</a> that supports your operating system and architecture. Go to <a href="https://docs.datadoghq.com/tracing/profiler/enabling/dotnet?tab=linux">the documentation</a> for the additional configuration steps.</p>
<h2 id="from-spans-toprofiles">From spans to profiles</h2>
<p>When analysing lengthy requests, you usually start by looking at the corresponding spans in the APM Traces part of the UI. It is now possible to view the corresponding profiles by clicking the “View Profile” button in the “Code Hotspots” tab:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_uNg9te1UxHzFUzCCwJN3PQ.png"></p>
<p>Before digging into the profiling information, you can already see that more than half of the time is spent in <strong>Buffer._Memmove</strong>, which is called by the BuggyBits <strong>ProductsController.Index</strong> method:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_FjzJDKmiR40r7UYzjtV8HQ.png"></p>
<p>From the profile view, it is also possible to come back to the traces:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_6M_WOoI8WOUxUukEti7TxQ.png"></p>
<p>Let’s now see what new features are available on the profiling side.</p>
<h2 id="cpu-profiling">CPU profiling</h2>
<p>The most requested feature was the ability to analyse CPU consumption (a.k.a. CPU profiling). The idea is to identify the code that really consumes CPU and optimize it. This is particularly important in the context of cloud-based computing where what you pay is related to the consumed CPU.</p>
<p>In terms of implementation, unlike Wall Time profiling, we look at the time spent by a thread on a CPU core and not the elapsed time since the last time we checked (every ~10ms). We also collect the call stack of a thread only if it is currently running on a core. Why? Because we only want to record call stacks corresponding to code paths that are consuming CPU. For example, ThreadPool threads are usually waiting (not interesting call stack) for a work item to process (interesting call stack).</p>
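<p>The difference between the two clocks can be observed with a few lines of C++. This is only a sketch and not how the profiler is implemented: it uses <strong>std::clock</strong>, which reports the processor time of the whole process instead of a single thread, and some runtimes (the Microsoft C runtime for example) return elapsed time from this function. <strong>Timing</strong>, <strong>Measure</strong> and <strong>MeasureSleep</strong> are hypothetical names.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;chrono&gt;
#include &lt;ctime&gt;
#include &lt;thread&gt;

struct Timing
{
    double wallMs;  // elapsed (wall) time
    double cpuMs;   // processor time consumed by the process
};

// Run a piece of work and report both the elapsed time and the CPU time.
template &lt;typename Work&gt;
Timing Measure(Work work)
{
    const auto wallStart = std::chrono::steady_clock::now();
    const std::clock_t cpuStart = std::clock();

    work();

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    return {
        std::chrono::duration&lt;double, std::milli&gt;(wallEnd - wallStart).count(),
        1000.0 * static_cast&lt;double&gt;(cpuEnd - cpuStart) / CLOCKS_PER_SEC
    };
}

// A waiting thread: wall time goes by but almost no CPU is consumed.
inline Timing MeasureSleep(int milliseconds)
{
    return Measure([milliseconds] {
        std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
    });
}
</code></pre></div>
<p>On a POSIX system, <strong>MeasureSleep(100)</strong> should report at least 100 ms of wall time and a CPU time close to zero, which is why a waiting thread shows up in a Wall Time profile but not in a CPU profile.</p>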
<p>In the <a href="/posts/2022-01-28_troubleshooting-net-performanc/">last blog post</a>, the <a href="https://github.com/DataDog/dd-trace-dotnet/blob/master/profiler/src/Demos/Samples.BuggyBits/Controllers/ProductsController.cs#L124">code responsible for lengthy requests</a> is doing too many string concatenations (look for += in the following code):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IActionResult</span> <span class="n">Index</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">sw</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Stopwatch</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">sw</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">products</span> <span class="p">=</span> <span class="n">dataLayer</span><span class="p">.</span><span class="n">GetAllProducts</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">productsTable</span> <span class="p">=</span> <span class="s">&#34;&lt;table&gt;&lt;tr&gt;&lt;th&gt;Product Name&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Price&lt;/th&gt;&lt;/tr&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">product</span> <span class="k">in</span> <span class="n">products</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">productsTable</span> <span class="p">+=</span> <span class="s">$&#34;&lt;tr&gt;&lt;td&gt;{product.ProductName}&lt;/td&gt;&lt;td&gt;{product.Description}&lt;/td&gt;&lt;td&gt;{product.Price}&lt;/td&gt;&lt;/tr&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">productsTable</span> <span class="p">+=</span> <span class="s">&#34;&lt;/table&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">sw</span><span class="p">.</span><span class="n">Stop</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ViewData</span><span class="p">[</span><span class="s">&#34;ElapsedTimeInMs&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">sw</span><span class="p">.</span><span class="n">ElapsedMilliseconds</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">ViewData</span><span class="p">[</span><span class="s">&#34;ProductsTable&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">productsTable</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">View</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The Wall Time view was very explicit about the culprit, <strong>ProductsController.Index()</strong> calling <strong>String.Concat()</strong>:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_ajUXPb0-ANMBllBPfU91oQ.png"></p>
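<p>Why is += so expensive in a loop? .NET strings are immutable, so each concatenation allocates a new string and copies everything accumulated so far. The following C++ model only counts the copied characters; the function names and the figures used in the example (1,000 rows of 80 characters) are arbitrary assumptions and not measurements of BuggyBits.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;

// Characters copied when the result is rebuilt by repeated concatenation:
// step i creates a new string and copies the i rows accumulated so far.
constexpr std::uint64_t CopiedByConcat(std::uint64_t rows, std::uint64_t rowLength)
{
    return rowLength * rows * (rows + 1) / 2;
}

// Characters copied when each row is appended once into a buffer
// that is large enough from the start.
constexpr std::uint64_t CopiedByAppend(std::uint64_t rows, std::uint64_t rowLength)
{
    return rowLength * rows;
}
</code></pre></div>
<p>For 1,000 rows of 80 characters, <strong>CopiedByConcat</strong> returns 40,040,000 copied characters versus 80,000 for <strong>CopiedByAppend</strong>: the cost grows with the square of the number of rows, which is consistent with memory copies dominating the profile.</p>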
<p>A simple solution is to <a href="https://github.com/DataDog/dd-trace-dotnet/blob/master/profiler/src/Demos/Samples.BuggyBits/Controllers/ProductsController.cs#L103">use a StringBuilder to optimize the concatenations</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IActionResult</span> <span class="n">Builder</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">sw</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Stopwatch</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">sw</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">products</span> <span class="p">=</span> <span class="n">dataLayer</span><span class="p">.</span><span class="n">GetAllProducts</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">productsTable</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StringBuilder</span><span class="p">(</span><span class="m">1000</span> <span class="p">*</span> <span class="m">80</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">productsTable</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="s">&#34;&lt;table&gt;&lt;tr&gt;&lt;th&gt;Product Name&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Price&lt;/th&gt;&lt;/tr&gt;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">product</span> <span class="k">in</span> <span class="n">products</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">productsTable</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="s">$&#34;&lt;tr&gt;&lt;td&gt;{product.ProductName}&lt;/td&gt;&lt;td&gt;{product.Description}&lt;/td&gt;&lt;td&gt;{product.Price}&lt;/td&gt;&lt;/tr&gt;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">productsTable</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="s">&#34;&lt;/table&gt;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">sw</span><span class="p">.</span><span class="n">Stop</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ViewData</span><span class="p">[</span><span class="s">&#34;ElapsedTimeInMs&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">sw</span><span class="p">.</span><span class="n">ElapsedMilliseconds</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">ViewData</span><span class="p">[</span><span class="s">&#34;ProductsTable&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">productsTable</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">View</span><span class="p">(</span><span class="s">&#34;Index&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The Wall Time view of the profile with the fix does not provide anything useful…</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_Mqpq2MVAkCUpA8Z62wRtYA.png"></p>
<p>If you want to go deeper, you need to look at the CPU consumption:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_Ox4XRvFCIQozhGAeC-6kTQ.png"></p>
<p>The <strong>ProductsController.Builder</strong> method handles the request (like <strong>Index()</strong> in the String.Concat case) and calls <strong>DataLayer.GetAllProducts()</strong> where most of the CPU-related work is done.</p>
<p>Would it be interesting to continue optimizing the code? Notice that <strong>GetAllProducts()</strong> is “only” consuming 94ms and the other <strong>Number.</strong>* and <strong>String.Concat</strong> methods around 350ms. So the gain might be negligible compared to the total ~3 seconds of CPU usage.</p>
<p>Remember that you should not optimize for the sake of “optimizing”: you should have metrics that tell you when to start (request processing that takes too long) and when to stop.</p>
<h2 id="exceptions-profiling">Exceptions profiling</h2>
<p>In the .NET world, exceptions are at the center of error handling. It is now possible to get a sampled view of the exceptions that happened during an application lifetime; by type:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_uyUmkTTLlQjg1IhCBVoCcg.png"></p>
<p>and by message:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_GRyr-g-jo8KXySpTRMTLpQ.png"></p>
<p>At the implementation level, the new exceptions profiler is notified by the CLR when an exception is thrown. In special cases (such as a network issue or invalid parsed data), an application could trigger thousands of exceptions in a very short period, so they need to be sampled. Otherwise, the impact on performance would be severe; especially if the call stack needs to be rebuilt for each exception.</p>
<p>First, at least one exception per type is kept, ensuring that weird specific exceptions are not lost in the flow. Second, exceptions are sampled over time based on a fixed number of exceptions per profile and the rate of appearance. To know the exact number of exceptions, feel free to leverage the Runtime Metrics package as explained in the previous blog post.</p>
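<p>These two rules can be sketched in a few lines of C++. This is a simplified model and not the code of the profiler: the first exception of each type is always kept, then a fixed budget is spent per profile, and the rate of appearance is not taken into account. <strong>ExceptionSampler</strong> and its members are hypothetical names.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;unordered_set&gt;

class ExceptionSampler
{
public:
    explicit ExceptionSampler(std::size_t budgetPerProfile)
        : _budget(budgetPerProfile), _kept(0)
    {
    }

    // Returns true when the exception should be recorded in the current profile.
    bool ShouldKeep(const std::string&amp; typeName)
    {
        // Rule 1: the first exception of each type is always kept
        // (assumption: it does not count against the budget).
        if (_seenTypes.insert(typeName).second)
        {
            return true;
        }

        // Rule 2 (simplified): keep already seen types until the budget is spent.
        if (_kept &lt; _budget)
        {
            ++_kept;
            return true;
        }

        return false;
    }

    // To be called when a profile is uploaded: the budget becomes available again.
    void OnProfileExported()
    {
        _kept = 0;
    }

private:
    std::size_t _budget;
    std::size_t _kept;
    std::unordered_set&lt;std::string&gt; _seenTypes;
};
</code></pre></div>
<p>With a budget of 2, a burst of identical exceptions is cut after the first one plus two sampled ones, while an exception of a type never seen before still goes through.</p>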
<h2 id="profiling-the-application-bootstrap">Profiling the application bootstrap</h2>
<p>In some situations, you are interested in analysing an application bootstrap. In Datadog APM Profile Search UI, it means finding the first profile of the given service execution. However, even with the date and time column, it is not easy to find the right one:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_KapOJXMlqOX2nflnyNB1ZA.png"></p>
<p>To help you find the initial profile of a service execution, a new “profile_seq” tag has been added to the HTTP request used to upload the profiles. It contains the count of generated profiles for a given execution of a service, starting from 0.</p>
<p>So now, in the Options of the profile list, add a “profile_seq” column:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_BwnvwM5y6e0Xd20JXVL8XQ.png"></p>
<p>The first profile is then easy to spot with a 0 value:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1__8Kf93AIiGsBCjGZl_SoyA.png"></p>
<p>In the future, a more visual hint might be added to identify it without the need to add the column.</p>
<h2 id="major-implementation-refactoring">Major implementation refactoring</h2>
<p>Finally, our implementation has benefited from a large code refactoring. As the previous post explained, the generation of .pprof files and their upload was done in C#. This has been replaced by a Rust library shared amongst different profiler libraries (native, Ruby, .NET).</p>
<p>It means that you should no longer see these frames in the application call stacks:</p>
<p><img loading="lazy" src="/posts/2022-06-09_troubleshooting-cpu-and-except/1_FHcvrh7PIKIrMimv1FrW6g.png"></p>
<p>It does not mean that one third of the processing has been removed! It just means that no more C# code is running, which brings performance gains. First, the managed implementation was allocating objects managed by the garbage collector; adding pressure that might trigger more collections. Second, with the native Rust implementation, there is no need to duplicate data between the collecting native part of the continuous profiler and the managed code used to serialize it.</p>
<p>In addition, several optimizations have been done in the symbol resolution (i.e., type and method names) part of the code that also reduce memory consumption and CPU usage.</p>
<p>Happy profiling!</p>
<h2 id="references">References</h2>
<ul>
<li>Datadog Tracer &amp; Continuous Profiler <a href="https://github.com/DataDog/dd-trace-dotnet/releases/tag/v2.10.0">.msi Installer and Linux tar.gz</a></li>
<li><a href="https://docs.datadoghq.com/tracing/profiler/enabling/dotnet">Datadog Continuous Profiler documentation</a></li>
<li><a href="https://docs.datadoghq.com/tracing/setup_overview/setup/dotnet-framework/?tab=windows">Datadog Tracer documentation</a></li>
<li><a href="https://docs.datadoghq.com/tracing/runtime_metrics/dotnet/">Datadog Runtime metrics documentation</a></li>
<li><a href="https://twitter.com/TessFerrandez">Tess Ferrandez</a> repository for <a href="https://www.tessferrandez.com/blog/2008/02/04/debugging-demos-setup-instructions.html">BuggyBits labs</a></li>
</ul>
]]></content:encoded></item><item><title>Value types and exceptions in .NET profiling</title><link>https://chrisnas.github.io/posts/2022-03-14_value-types-and-exceptions/</link><pubDate>Mon, 14 Mar 2022 12:02:31 +0000</pubDate><guid>https://chrisnas.github.io/posts/2022-03-14_value-types-and-exceptions/</guid><description>This final episode describes how to get fields of a value type instance and how to deal with exceptions.</description><content:encoded><![CDATA[<hr>
<p>Here comes the end of the series about .NET profiling APIs. This final episode describes how to get fields of a value type instance and how to deal with exceptions.</p>
<h2 id="getting-fields-of-a-value-typeinstance">Getting fields of a value type instance</h2>
<p>The case of a value type is very similar to a reference type except that the address you receive points directly to the beginning of the field values instead of the type MethodTable (or <strong>ObjectID</strong> if you prefer).</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">case</span> <span class="n">ELEMENT_TYPE_VALUETYPE</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// same as reference type except that the received address points to the beginning of the value type instance fields</span>
</span></span><span class="line"><span class="cl">   <span class="kt">byte</span><span class="p">*</span> <span class="n">managedReference</span> <span class="p">=</span> <span class="p">(</span><span class="kt">byte</span><span class="p">*)</span><span class="n">address</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It means that you won’t be able to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getclassfromobject-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetClassFromObject</strong></a> to get its <strong>ClassID</strong> and start the field enumeration like for a reference type. Note that despite its name, <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getclasslayout-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetClassLayout</strong></a> is perfectly capable of providing field offsets for a value type.</p>
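<p>Once the offsets are known, reading a field is a matter of pointer arithmetic. The following sketch assumes that the offset of a field is relative to the address received for the value type instance (worth double-checking in your own profiler); <strong>ReadField</strong> is a hypothetical helper.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;cstring&gt;

// Read a field of type T stored &#34;offset&#34; bytes after &#34;base&#34;.
// memcpy is used instead of a pointer cast to avoid alignment and aliasing issues.
template &lt;typename T&gt;
T ReadField(const std::uint8_t* base, std::size_t offset)
{
    T value;
    std::memcpy(&amp;value, base + offset, sizeof(T));
    return value;
}
</code></pre></div>
<p>For example, for a value type made of two 32-bit integer fields stored at offsets 0 and 4, <strong>ReadField&lt;std::int32_t&gt;(managedReference, 4)</strong> would return the second one.</p>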
<p>Instead of <strong>GetClassFromObject</strong>, you will have to use the metadata token (extracted from the method signature) corresponding to the parameter. If the type is defined in the same assembly as the method, it is just a matter of calling <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getclassfromtoken-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetClassFromToken</strong></a> with the same moduleID as the method:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">TypeFromToken</span><span class="p">(</span><span class="n">elementTypeToken</span><span class="p">)</span> <span class="p">==</span> <span class="n">mdtTypeDef</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="p">=</span> <span class="n">_pProfilerInfo</span><span class="p">-&gt;</span><span class="n">GetClassFromToken</span><span class="p">(</span><span class="n">moduleId</span><span class="p">,</span> <span class="n">elementTypeToken</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">classID</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>If the type is defined in another assembly (i.e. <strong>TypeFromToken()</strong> returns <strong>mdtTypeRef</strong>), the metadata keeps track of the relationships:</p>
<p><img loading="lazy" src="/posts/2022-03-14_value-types-and-exceptions/1_6Pl_pl7xBqjbFgv2HsHPzw.png"></p>
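<p>The TypeDef/TypeRef distinction is encoded in the token itself: the high byte identifies the metadata table and the lower 24 bits give the row in that table. Here is a standalone sketch of what the <strong>TypeFromToken</strong> and <strong>RidFromToken</strong> macros from corhdr.h compute; the function and constant names are hypothetical, the values are those of <strong>mdtTypeRef</strong> and <strong>mdtTypeDef</strong>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;

// Table identifiers stored in the high byte of a metadata token
constexpr std::uint32_t kTypeRefTable = 0x01000000;  // mdtTypeRef
constexpr std::uint32_t kTypeDefTable = 0x02000000;  // mdtTypeDef

// Same computation as TypeFromToken: keep the high byte.
constexpr std::uint32_t TableOf(std::uint32_t token)
{
    return token &amp; 0xFF000000u;
}

// Same computation as RidFromToken: keep the 24-bit row identifier.
constexpr std::uint32_t RowOf(std::uint32_t token)
{
    return token &amp; 0x00FFFFFFu;
}

// True when the type is only referenced by this module and must be resolved
// before a ClassID can be obtained.
constexpr bool NeedsResolution(std::uint32_t token)
{
    return TableOf(token) == kTypeRefTable;
}
</code></pre></div>
<p>For example, 0x02000003 is the third row of the TypeDef table and can be passed directly to <strong>GetClassFromToken</strong>, while 0x01000012 is a TypeRef that needs the extra resolution steps.</p>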
<p>The <em>ResolutionScope</em> (i.e. assembly where the typeref is defined) is given by <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nf-rometadataapi-imetadataimport-gettyperefprops?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetTypeRefProps</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">WCHAR</span> <span class="n">szName</span><span class="p">[</span><span class="n">MAX_CLASS_NAME</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">chName</span> <span class="p">=</span> <span class="n">MAX_CLASS_NAME</span><span class="p">-</span><span class="m">1</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdToken</span> <span class="n">resolutionScope</span> <span class="p">=</span> <span class="n">mdTokenNil</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaDataImport</span><span class="p">-&gt;</span><span class="n">GetTypeRefProps</span><span class="p">(</span><span class="n">elementTypeToken</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">resolutionScope</span><span class="p">,</span> <span class="n">szName</span><span class="p">,</span> <span class="n">chName</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">chName</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Unfortunately, I did not find any direct API call (either from <strong>IMetaDataImport</strong> or <strong>ICorProfilerInfo</strong>) to find the <strong>ModuleID</strong> where a typeref is defined in another assembly. The only link is the <strong>IMetaDataImport</strong> corresponding to the module implementing the typeref that is available via the <a href="https://docs.microsoft.com/en-us/archive/blogs/davbr/metadata-tokens-run-time-ids-and-type-loading?WT.mc_id=DT-MVP-5003325">not recommended</a> <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport-resolvetyperef-method?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::ResolveTypeRef</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">IMetaDataImport</span><span class="p">*</span> <span class="n">pMetaDataImportRef</span> <span class="p">=</span> <span class="n">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdToken</span> <span class="n">referencedElementTypeToken</span> <span class="p">=</span> <span class="n">mdTokenNil</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaDataImport</span><span class="p">-&gt;</span><span class="n">ResolveTypeRef</span><span class="p">(</span><span class="n">elementTypeToken</span><span class="p">,</span> <span class="n">IID_IMetaDataImport</span><span class="p">,</span> <span class="p">(</span><span class="n">IUnknown</span><span class="p">**)&amp;</span><span class="n">pMetaDataImportRef</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">referencedElementTypeToken</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This looks like a dead end: the metadata API knows about tokens (i.e. values generated by the C# compiler) and the profiling API knows about IDs (i.e. pointers to internal data structures).</p>
<p>Remember that <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getmodulemetadata-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetModuleMetaData</strong></a> returns the <strong>IMetaDataImport</strong> corresponding to a given <strong>ModuleID</strong>. So the idea is to be able to identify a <strong>ModuleID</strong> by its <strong>IMetaDataImport</strong> counterpart: enumerate the modules loaded in the profiled process and get their “identifier” to compare with the one implementing the type we are interested in. This identifier could be the <strong>mdModule</strong> token returned by <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nf-rometadataapi-imetadataimport-getmodulefromscope?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetModuleFromScope</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">mdModule</span> <span class="n">module</span> <span class="p">=</span> <span class="n">mdModuleNil</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaDataImport</span><span class="p">-&gt;</span><span class="n">GetModuleFromScope</span><span class="p">(&amp;</span><span class="n">module</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Well… not really because I always got 0x1 in my tests. This value could identify the module within its assembly, and I only tested single-module assemblies generated by Visual Studio. Fortunately, each module is labelled by a unique “mvid” (i.e. a GUID identifying each module) returned by <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nf-rometadataapi-imetadataimport-getscopeprops?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetScopeProps</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">GUID</span> <span class="n">refMvid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaDataImport</span><span class="p">-&gt;</span><span class="n">GetScopeProps</span><span class="p">(</span><span class="n">szName</span><span class="p">,</span> <span class="n">chName</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">chName</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">refMvid</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Here is the code to enumerate profiled modules and check for the given <strong>refMvid</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ICorProfilerModuleEnum</span><span class="o">*</span> <span class="n">pEnumModule</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">EnumModules</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pEnumModule</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">ModuleID</span> <span class="n">enumeratedModuleId</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ModuleID</span> <span class="n">refModuleId</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">GUID</span> <span class="n">mvid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">IMetaDataImport</span><span class="o">*</span> <span class="n">pEnumeratedModuleMetadata</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdModule</span> <span class="n">enumeratedModuleToken</span> <span class="o">=</span> <span class="n">mdModuleNil</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">fetchedModulesCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">do</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">pEnumModule</span><span class="o">-&gt;</span><span class="n">Next</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">enumeratedModuleId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fetchedModulesCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">fetchedModulesCount</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// get the IMetadataImport corresponding to this module
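</span></span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetModuleMetaData</span><span class="p">(</span><span class="n">enumeratedModuleId</span><span class="p">,</span> <span class="n">ofRead</span><span class="p">,</span> <span class="n">IID_IMetaDataImport</span><span class="p">,</span> <span class="p">(</span><span class="n">IUnknown</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pEnumeratedModuleMetadata</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">      <span class="k">continue</span><span class="p">;</span> <span class="c1">// no metadata for this module: skip it
</span></span></span><span class="line"><span class="cl">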
</span></span><span class="line"><span class="cl">      
</span></span><span class="line"><span class="cl">   <span class="c1">// get the module token
</span></span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">pEnumeratedModuleMetadata</span><span class="o">-&gt;</span><span class="n">GetModuleFromScope</span><span class="p">(</span><span class="o">&amp;</span><span class="n">enumeratedModuleToken</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">pEnumeratedModuleMetadata</span><span class="o">-&gt;</span><span class="n">GetScopeProps</span><span class="p">(</span><span class="n">szName</span><span class="p">,</span> <span class="n">chName</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">chName</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mvid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">pEnumeratedModuleMetadata</span><span class="o">-&gt;</span><span class="n">Release</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">refMvid</span> <span class="o">==</span> <span class="n">mvid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">refModuleId</span> <span class="o">=</span> <span class="n">enumeratedModuleId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">TRUE</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">pEnumModule</span><span class="o">-&gt;</span><span class="n">Release</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// this is the one!
</span></span></span><span class="line"><span class="cl"><span class="n">moduleId</span> <span class="o">=</span> <span class="n">refModuleId</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For performance’s sake, it would be better to build a map between the loaded modules and their mvid, and to keep it up to date in your <strong>ICorProfilerCallback</strong> implementation of <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-moduleloadfinished-method?WT.mc_id=DT-MVP-5003325"><strong>ModuleLoadFinished</strong></a> and <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-moduleunloadfinished-method?WT.mc_id=DT-MVP-5003325"><strong>ModuleUnloadFinished</strong></a>. This map could then be used when a <strong>ModuleID</strong> is needed while only the metadata side is known.</p>
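<p>As a minimal sketch of such a map (not code from the actual profiler): <strong>ModuleMap</strong>, <strong>Mvid</strong> and <strong>ModuleId</strong> are hypothetical names, and the last two are stand-ins for <strong>GUID</strong> and <strong>ModuleID</strong> so that the sample compiles without the Windows and CLR headers. It also assumes that a given mvid maps to a single <strong>ModuleID</strong> and that the metadata can already be read when the load callback is received:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp">#include &lt;array&gt;
#include &lt;cstdint&gt;
#include &lt;map&gt;
#include &lt;mutex&gt;
#include &lt;optional&gt;

// Stand-ins (assumptions) for GUID and ModuleID so that the sketch is self-contained
using Mvid = std::array&lt;std::uint8_t, 16&gt;;
using ModuleId = std::uintptr_t;

// Hypothetical helper keeping the mvid -&gt; ModuleID association up to date
class ModuleMap
{
public:
    // to call from ModuleLoadFinished once the mvid has been read with GetScopeProps
    void OnModuleLoaded(const Mvid&amp; mvid, ModuleId moduleId)
    {
        std::lock_guard&lt;std::mutex&gt; lock(_lock);
        _modules[mvid] = moduleId;
    }

    // to call from ModuleUnloadFinished
    void OnModuleUnloaded(ModuleId moduleId)
    {
        std::lock_guard&lt;std::mutex&gt; lock(_lock);
        for (auto it = _modules.begin(); it != _modules.end(); ++it)
        {
            if (it-&gt;second == moduleId)
            {
                _modules.erase(it);
                break;
            }
        }
    }

    // returns the ModuleID of the loaded module with the given mvid, if any
    std::optional&lt;ModuleId&gt; Find(const Mvid&amp; mvid) const
    {
        std::lock_guard&lt;std::mutex&gt; lock(_lock);
        auto it = _modules.find(mvid);
        if (it == _modules.end())
            return std::nullopt;
        return it-&gt;second;
    }

private:
    // callbacks can be received on different threads, hence the lock
    mutable std::mutex _lock;
    std::map&lt;Mvid, ModuleId&gt; _modules;
};
</code></pre></div>
<p>With this kind of helper, the enumeration loop is replaced by a single <strong>Find</strong> call with the mvid read from the metadata side.</p>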
<h2 id="what-has-been-returned">What has been returned?</h2>
<p>The final step of our journey is to figure out what is returned by a method. The leave callback executed each time a method returns receives a <strong>FunctionID</strong> and a <strong>COR_PRF_ELT_INFO</strong> as parameters:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">PROFILER_STUB</span> <span class="nf">LeaveStub</span><span class="p">(</span><span class="n">FunctionID</span> <span class="n">functionId</span><span class="p">,</span> <span class="n">COR_PRF_ELT_INFO</span> <span class="n">eltInfo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The signature parsing shown earlier for a <strong>FunctionID</strong> tells whether the method returns <strong>void</strong> or an instance of a type identified by an element type and a metadata token.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">void</span> <span class="n">CorProfilerHelpers</span><span class="o">::</span><span class="n">DumpLeaveReturnValue</span><span class="p">(</span><span class="n">FunctionID</span> <span class="n">functionId</span><span class="p">,</span> <span class="n">FunctionSignature</span><span class="o">*</span> <span class="n">pSignature</span><span class="p">,</span> <span class="n">COR_PRF_ELT_INFO</span> <span class="n">eltInfo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">char</span> <span class="n">value</span><span class="p">[</span><span class="mi">128</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">_stricmp</span><span class="p">(</span><span class="n">pSignature</span><span class="o">-&gt;</span><span class="n">pszReturnType</span><span class="p">,</span> <span class="s">&#34;void&#34;</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="s">&#34;void&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>COR_PRF_ELT_INFO</strong> parameter is the key to getting the address of the returned instance, thanks to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo3-getfunctionleave3info-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo3::GetFunctionLeave3Info</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl">   <span class="k">else</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">ULONG</span> <span class="n">pcbArgumentInfo</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">COR_PRF_FRAME_INFO</span> <span class="n">frameInfo</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">COR_PRF_FUNCTION_ARGUMENT_RANGE</span> <span class="n">returnValueInfo</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetFunctionLeave3Info</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="n">eltInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">frameInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">returnValueInfo</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">UINT_PTR</span> <span class="n">pStartValue</span> <span class="o">=</span> <span class="n">returnValueInfo</span><span class="p">.</span><span class="n">startAddress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">ULONG</span> <span class="n">length</span> <span class="o">=</span> <span class="n">returnValueInfo</span><span class="p">.</span><span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="k">const</span> <span class="n">FunctionParameter</span><span class="o">*</span> <span class="n">pReturnParameter</span> <span class="o">=</span> <span class="n">pSignature</span><span class="o">-&gt;</span><span class="n">GetReturnParameter</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">      <span class="n">GetObjectValue</span><span class="p">(</span><span class="n">pStartValue</span><span class="p">,</span> <span class="n">length</span><span class="p">,</span> <span class="n">pReturnParameter</span><span class="o">-&gt;</span><span class="n">ElementType</span><span class="p">,</span> <span class="n">pReturnParameter</span><span class="o">-&gt;</span><span class="n">TypeToken</span><span class="p">,</span> <span class="n">pSignature</span><span class="o">-&gt;</span><span class="n">ModuleId</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">   
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>To get meaningful information from this API, the <strong>COR_PRF_ENABLE_FUNCTION_RETVAL</strong> flag must be set when <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-seteventmask-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::SetEventMask</strong></a> is called during <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-initialize-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback::Initialize</strong></a>. The returned <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/cor-prf-function-argument-range-structure?WT.mc_id=DT-MVP-5003325"><strong>COR_PRF_FUNCTION_ARGUMENT_RANGE</strong></a> contains the address of the returned instance in its <strong>startAddress</strong> field.</p>
<p>The same <strong>GetObjectValue</strong> helper function already used to read the value of parameters also works here.</p>
<h2 id="and-what-about-exceptions">And what about exceptions?</h2>
<p>I discussed how to follow the normal flow of execution by entering and exiting a method. When an exception is thrown in a method and not caught there, you won’t get notified by the Leave callback. Instead, other methods of <strong>ICorProfilerCallback</strong> are called if you pass <strong>COR_PRF_MONITOR_EXCEPTIONS</strong> to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-seteventmask-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::SetEventMask</strong></a>.</p>
<p>Let’s take the following C# example to understand which callbacks are executed and when:</p>
<p><img loading="lazy" src="/posts/2022-03-14_value-types-and-exceptions/1_lIP5rxv4DVU5zs22mVWSXA.png"></p>
<p>The blue arrows are showing the flow of execution from <strong>Throws</strong> to <strong>ThrowLevel3</strong>. When the <strong>InvalidOperationException</strong> is thrown, <strong>ExceptionSearchFunctionXXX</strong> callbacks are executed “backward” to find the first catch block that will match the exception (i.e. up to <strong>Throws</strong>). It is now time to run the <strong>finally</strong> blocks (if any) starting from where the exception was thrown (i.e. <strong>ThrowLevel3</strong>) up to the catch block in <strong>Throws</strong>.</p>
<p>The object corresponding to the exception is passed to <strong>ExceptionThrown</strong> and <strong>ExceptionCatcherEnter</strong> as an <strong>ObjectID</strong>. Feel free to use the code that has been presented earlier to get the type of the exception. However, getting interesting fields such as <strong>_message</strong> or <strong>_innerException</strong> requires figuring out the <strong>ClassID</strong> of the <strong>System.Exception</strong> base class.</p>
<p>As already mentioned, the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getclassidinfo2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetClassIDInfo2</strong></a> function returns the <strong>ClassID</strong> of the parent type. Here is the code to search a parent type in a type hierarchy:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="nf">GetExceptionBaseClass</span><span class="p">(</span><span class="n">ICorProfilerInfo8</span><span class="o">*</span> <span class="n">pInfo</span><span class="p">,</span> <span class="n">ClassID</span> <span class="n">classId</span><span class="p">,</span> <span class="n">ClassID</span><span class="o">*</span> <span class="n">baseClassId</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ModuleID</span> <span class="n">moduleId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">ClassID</span> <span class="n">parentClassId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetClassIDInfo2</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">moduleId</span><span class="p">,</span> <span class="k">nullptr</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">parentClassId</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="k">nullptr</span><span class="p">,</span> <span class="k">nullptr</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">hr</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">WCHAR</span> <span class="n">szName</span><span class="p">[</span><span class="mi">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">    <span class="n">hr</span> <span class="o">=</span> <span class="n">CorProfilerHelpers</span><span class="o">::</span><span class="n">GetTypeName</span><span class="p">(</span><span class="n">pInfo</span><span class="p">,</span> <span class="n">classId</span><span class="p">,</span> <span class="n">moduleId</span><span class="p">,</span> <span class="n">szName</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">szName</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">wcscmp</span><span class="p">(</span><span class="sa">L</span><span class="s">&#34;System.Exception&#34;</span><span class="p">,</span> <span class="n">szName</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="o">*</span><span class="n">baseClassId</span> <span class="o">=</span> <span class="n">classId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">S_OK</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">GetExceptionBaseClass</span><span class="p">(</span><span class="n">pInfo</span><span class="p">,</span> <span class="n">parentClassId</span><span class="p">,</span> <span class="n">baseClassId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>FunctionID</strong> corresponding to the method is passed as a parameter to <strong>ExceptionSearchFunctionEnter</strong>, <strong>ExceptionSearchFilterEnter</strong>, <strong>ExceptionSearchCatcherFound</strong>, <strong>ExceptionUnwindFunctionEnter</strong>, <strong>ExceptionUnwindFinallyEnter</strong>, and <strong>ExceptionCatcherEnter</strong> (i.e. not to the <strong>xxxLeave</strong> callbacks).</p>
<h2 id="conclusion">Conclusion</h2>
<p>This series of articles introduced the .NET native profiling API in the context of method enter/leave tracing. Its relationship with its metadata counterpart has also been detailed. You should now be able to implement other overrides of <strong>ICorProfilerCallback</strong> to get details about allocations, for example.</p>
<h2 id="references">References</h2>
<ul>
<li>Episode 1: <a href="/posts/2021-08-07_start-journey-into-the/">Start a journey into the .NET Profiling APIs</a></li>
<li>Episode 2: <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">Dealing with Modules, Assemblies and Types with CLR profiling API</a></li>
<li>Episode 3: <a href="/posts/2021-10-12_decyphering-method-signature-w/">Decyphering methods signature with .NET Profiling APIs</a></li>
<li>Episode 4: <a href="/posts/2021-11-16_reading-parameters-value-with/">Reading parameters value with the .NET Profiling APIs</a></li>
<li>Episode 5: <a href="/posts/2021-12-18_accessing-arrays-and-class/">Accessing arrays and class fields with .NET profiling APIs</a></li>
</ul>
]]></content:encoded></item><item><title>Troubleshooting .NET performance issues with Datadog toolbox</title><link>https://chrisnas.github.io/posts/2022-01-28_troubleshooting-net-performanc/</link><pubDate>Fri, 28 Jan 2022 12:55:40 +0000</pubDate><guid>https://chrisnas.github.io/posts/2022-01-28_troubleshooting-net-performanc/</guid><description>How to troubleshoot well known Tess Ferrandez BuggyBits application with the Datadog toolbox including the new Continuous Profiler</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>The beta of Datadog .NET Continuous Profiler is <a href="https://github.com/DataDog/dd-trace-dotnet/releases/tag/v2.1.1-profiler-beta1">available</a>!</p>
<p>This is a great opportunity to show how to use the different tools provided by Datadog to troubleshoot .NET applications facing performance issues. <a href="https://twitter.com/TessFerrandez">Tess Ferrandez</a> updated her famous BuggyBits application to .NET Core. Among the <a href="https://www.tessferrandez.com/blog/2008/02/04/debugging-demos-setup-instructions.html">different available scenarios</a>, let’s see how to investigate the <a href="https://www.tessferrandez.com/blog/2008/02/27/net-debugging-demos-lab-4-walkthrough.html">Lab 4: High CPU Hang</a> with Datadog. It will be completely different from Tess’s way: no need to analyze a memory dump anymore.</p>
<h2 id="setup-the-environment">Set up the environment</h2>
<p>First, you need to download and run the .msi from our Tracer repository: it will install both the Tracer and the Profiler. The former allows you, among other things, to see how long it takes to process ASP.NET Core requests. The latter is in Beta today and provides the wall time duration of your threads (more on this later). See the corresponding documentation for details on enabling <a href="https://docs.datadoghq.com/tracing/setup_overview/setup/dotnet-framework/?tab=windows">tracing</a> and <a href="https://docs.datadoghq.com/tracing/profiler/enabling/dotnet">profiling</a> once installed.</p>
<p>Next, ensure that <strong>.NET Runtime Metrics</strong> are installed for your organization:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_87cPlnHNwMWqRpt4veVxkw.png"></p>
<p>Add the <strong>DD_RUNTIME_METRICS_ENABLED=true</strong> environment variable for the application/service you want to monitor. Once enabled for your application, this package allows you to see the evolution of <a href="https://docs.datadoghq.com/tracing/runtime_metrics/dotnet/">important metrics</a>, including some that you won’t find anywhere else such as GC pause time, thread contention time or count of exceptions per type.</p>
<p>Ensure that <a href="https://docs.datadoghq.com/developers/dogstatsd/?tab=hostagent#setup">DogStatsD is set up</a> for the Agent:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_V1sQz72I3ZwAgzjG6G-8eA.png"></p>
<h2 id="looking-at-thesymptoms">Looking at the symptoms</h2>
<p>In my example, the buggybits application is running under the <em>datadog.demos.buggybits</em> service name. This is how I can filter the related traces in the APM/Traces part of the Datadog portal:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_IpoI4Qh-Gm0sbgPeM7u8Sg.png"></p>
<p>In this screenshot, the duration of the <strong>Products/Index</strong> requests is around 6 seconds, which is way too long!</p>
<p>When clicking such a request, the details panel provides the exact URL in the <strong>Tags</strong> tab:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_rMphP6jfVQNQ7ZD9p5dr0g.png"></p>
<p>The <strong>Metrics</strong> tab shows CPU usage and a few other metrics around the trace time:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_zL8l7705sS_x-0zNhU6v6A.png"></p>
<p>So, in addition to having slow requests, the CPU usage seems to increase.</p>
<p>It is now time to go to the <em>.NET runtime metrics</em> dashboard and look at what is going on in more detail. The first graph that shows up is the number of gen2 collections:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_RUSWt8-h068RgOP76rSQSQ.png"></p>
<p>It means that during the 10-minute test, where very few requests are processed, almost 800 gen2 GCs happen every 10 seconds (all runtime metrics are computed every 10 seconds).</p>
<p>The load test corresponding to these requests lasted 10 minutes between 4pm and 6+pm. Each time the requests were processed:</p>
<ul>
<li>the CPU usage increased</li>
<li>the number of gen2 collections increased</li>
<li>the duration of pauses due to garbage collections increased</li>
<li>the thread contention time increased</li>
</ul>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_iCzt6wCUu-yUef5btID_-g.png"></p>
<p>In addition to being slow, it seems that the code that processes <em>products/index</em> HTTP requests also has an impact on the CPU (i.e. on the overall application and machine performance).</p>
<p>It would be great if we could see the callstacks corresponding to this processing. This is exactly what the .NET Wall time Continuous Profiler is all about: looking at the duration of methods through a flamegraph representation.</p>
<h2 id="here-comes-theprofiler">Here comes the profiler</h2>
<p>Today, there is no direct way to jump from a trace to the profile containing the callstacks while the corresponding request was processed. We are currently working on this new feature called <em>Code Hotspots</em>.</p>
<p>However, it is easy to use the service name to filter the profiles and select the period of time from APM/Profile Search:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_ft1EYQ2z0aXcHpxjgRZzgg.png"></p>
<p>When you click a one-minute profile (the callstacks are gathered and sent every minute), a panel appears with the <strong>Performance</strong> tab selected. It shows a flamegraph on the left and a list on the right.</p>
<h2 id="getting-used-to-flamegraph">Getting used to flamegraph</h2>
<p>When you look at the wall time flamegraph, you see everything that happened during a single minute:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1__B5VXVtYIxx32h8wm-HNLA.png"></p>
<p>The previous screenshot highlights the groups of callstacks corresponding to the different threads of execution (from left to right):</p>
<ul>
<li>the <strong>Main()</strong> entry point of the application</li>
<li>the code in the CLR responsible for sending counters</li>
<li>the code in the Profiler in charge of generating and sending (the very thin spike) the profile every minute</li>
<li>the Tracer code</li>
<li>the code that listens to the CLR events to generate the runtime metrics</li>
<li>…and the application code that processes the requests!</li>
</ul>
<p>In the flamegraph, the width of each frame on a row represents the relative time during which the frame was found on a callstack. For example, in our tests, we have 4 threads simply calling <strong>Thread.Sleep</strong>; one for 10 seconds, one for 20 seconds, one for 30 seconds and a last one for 40 seconds. This is the expected result in a flamegraph (i.e. the widths are consistent with the 1/2/3/4 ratio):</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_mVT5PCAbJhkA2lrq3UbeCw.png"></p>
<p>This also applies to CPU-bound threads. For example, if 3 threads are computing the sum of numbers in a tight loop, this is the expected result (i.e. all <strong>OnCPUxxx</strong> have the same width):</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_byJpeSeYTLlJNgyLOBH73A.png"></p>
<p>These explanations should stop the fear that started to crawl inside your head about the “visible cost” of the Datadog Tracer and Profiler based on the previous screenshot. The large width of the Datadog threads frames is all about wall time, not CPU time: we are mostly sleeping or waiting but we don’t stop :^)</p>
<h2 id="investigate-the-performance-issue">Investigate the performance issue</h2>
<p>The next step is to focus on the stack frames corresponding to the request processing to better understand what is going on.</p>
<p>Basically, you would like to either remove a branch or keep only a branch. You simply have to move the mouse over a frame (i.e. <strong>ThreadPoolWorkQueue</strong> in the previous screenshot) and click the three dots that just appeared. Next, select <strong>Show From</strong> to keep only that branch in the flamegraph:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_fq0_NeVQlii7qQS0gvBrRw.png"></p>
<p>Now, scroll down in the flamegraph and the flow of execution corresponding to processing the <em>Products/Index</em> request becomes more visible:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_TZHJO6vuO6VWar4FjVhlQQ.png"></p>
<p>It seems that the <strong>Index()</strong> method of the <strong>ProductsController</strong> is spending most of its time calling <strong>String.Concat()</strong>.</p>
<p>Let’s have a look at the <a href="https://github.com/TessFerrandez/BuggyBits/blob/main/src/BuggyBits/Controllers/ProductsController.cs#L18">source code</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// GET: Products</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">IActionResult</span> <span class="n">Index</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">products</span> <span class="p">=</span> <span class="n">dataLayer</span><span class="p">.</span><span class="n">GetAllProducts</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">productsTable</span> <span class="p">=</span> <span class="s">&#34;&lt;table&gt;&lt;tr&gt;&lt;th&gt;Product Name&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Price&lt;/th&gt;&lt;/tr&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">product</span> <span class="k">in</span> <span class="n">products</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">productsTable</span> <span class="p">+=</span> <span class="s">$&#34;&lt;tr&gt;&lt;td&gt;{product.ProductName}&lt;/td&gt;&lt;td&gt;{product.Description}&lt;/td&gt;&lt;td&gt;{product.Price}&lt;/td&gt;&lt;/tr&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">productsTable</span> <span class="p">+=</span> <span class="s">&#34;&lt;/table&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ViewData</span><span class="p">[</span><span class="s">&#34;ProductsTable&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">productsTable</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">View</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>But still no sign of <strong>String.Concat()</strong>… Well, this is because the C# compiler is hiding it from you with the <strong>+=</strong> syntactic sugar. Let’s have a look at the decompiled code as shown by ILSpy (without the <strong>string.Concat</strong> transformation):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">Microsoft</span><span class="p">.</span><span class="n">AspNetCore</span><span class="p">.</span><span class="n">Mvc</span><span class="p">.</span><span class="n">IActionResult</span> <span class="n">Index</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">products</span> <span class="p">=</span> <span class="n">dataLayer</span><span class="p">.</span><span class="n">GetAllProducts</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">string</span> <span class="n">productsTable</span> <span class="p">=</span> <span class="s">&#34;&lt;table&gt;&lt;tr&gt;&lt;th&gt;Product Name&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Price&lt;/th&gt;&lt;/tr&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">enumerator</span> <span class="p">=</span> <span class="n">products</span><span class="p">.</span><span class="n">GetEnumerator</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">while</span> <span class="p">(</span><span class="n">enumerator</span><span class="p">.</span><span class="n">MoveNext</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">BuggyBits</span><span class="p">.</span><span class="n">Models</span><span class="p">.</span><span class="n">Product</span> <span class="n">product</span> <span class="p">=</span> <span class="n">enumerator</span><span class="p">.</span><span class="n">Current</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="kt">string</span><span class="p">[]</span> <span class="n">array</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">string</span><span class="p">[</span><span class="m">8</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">0</span><span class="p">]</span> <span class="p">=</span> <span class="n">productsTable</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">1</span><span class="p">]</span> <span class="p">=</span> <span class="s">&#34;&lt;tr&gt;&lt;td&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">2</span><span class="p">]</span> <span class="p">=</span> <span class="n">product</span><span class="p">.</span><span class="n">ProductName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">3</span><span class="p">]</span> <span class="p">=</span> <span class="s">&#34;&lt;/td&gt;&lt;td&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">4</span><span class="p">]</span> <span class="p">=</span> <span class="n">product</span><span class="p">.</span><span class="n">Description</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">5</span><span class="p">]</span> <span class="p">=</span> <span class="s">&#34;&lt;/td&gt;&lt;td&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">6</span><span class="p">]</span> <span class="p">=</span> <span class="n">product</span><span class="p">.</span><span class="n">Price</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">array</span><span class="p">[</span><span class="m">7</span><span class="p">]</span> <span class="p">=</span> <span class="s">&#34;&lt;/td&gt;&lt;/tr&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">productsTable</span> <span class="p">=</span> <span class="kt">string</span><span class="p">.</span><span class="n">Concat</span><span class="p">(</span><span class="n">array</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">finally</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="p">((</span><span class="n">System</span><span class="p">.</span><span class="n">IDisposable</span><span class="p">)</span><span class="n">enumerator</span><span class="p">).</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">productsTable</span> <span class="p">=</span> <span class="kt">string</span><span class="p">.</span><span class="n">Concat</span><span class="p">(</span><span class="n">productsTable</span><span class="p">,</span> <span class="s">&#34;&lt;/table&gt;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">base</span><span class="p">.</span><span class="n">ViewData</span><span class="p">[</span><span class="s">&#34;ProductsTable&#34;</span><span class="p">]</span> <span class="p">=</span> <span class="n">productsTable</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">View</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>So now we can see the call to <strong>string.Concat()</strong> at the end of the <strong>while</strong> loop iteration.</p>
<p>Behind the scenes, since a string object is immutable, <strong>string.Concat()</strong> creates a new string each time it is called; the previous string referenced by <strong>productsTable</strong> is no longer rooted and puts more pressure on the GC. If I’m telling you that <strong>dataLayer.GetAllProducts()</strong> returns 10,000 products, it means that <strong>string.Concat</strong> gets called 10,000 times.</p>
<p>As the string grows, it will reach the 85,000-byte threshold and start to be allocated in the Large Object Heap (LOH), adding more pressure on the GC, which will trigger gen2 collections; hence the high number of gen2 collections seen in the runtime metrics dashboard.</p>
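<p>To get a feel for the numbers, here is a back-of-the-envelope sketch written in C++. It is not taken from BuggyBits: the <strong>EstimateConcatCost</strong> helper, the 100 characters appended per product and the 2 bytes per character are assumptions for illustration, and the object header overhead is ignored. It estimates at which iteration the result reaches the LOH threshold and how many characters get copied overall:</p>
<pre tabindex="0"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;

// Assumptions for illustration only (not measured on BuggyBits):
// UTF-16 characters and no object header overhead.
constexpr std::size_t kBytesPerChar = 2;
constexpr std::size_t kLohThresholdBytes = 85000;

struct ConcatCost
{
    std::size_t firstLohIteration;       // 1-based iteration whose result first reaches the threshold (0 if never)
    unsigned long long totalCharsCopied; // characters copied across all concatenations
};

// Hypothetical helper: simulates "table += row" repeated `rows` times.
ConcatCost EstimateConcatCost(std::size_t headerChars, std::size_t rowChars, std::size_t rows)
{
    ConcatCost cost{0, 0};
    std::size_t length = headerChars;
    for (std::size_t i = 1; i &lt;= rows; ++i)
    {
        length += rowChars;              // each += builds a brand new string of this length
        cost.totalCharsCopied += length; // and every character is copied into it
        if (cost.firstLohIteration == 0 &amp;&amp; length * kBytesPerChar &gt;= kLohThresholdBytes)
        {
            cost.firstLohIteration = i;
        }
    }
    return cost;
}
</code></pre>
<p>With these assumed numbers, <strong>EstimateConcatCost(0, 100, 10000)</strong> reports that the threshold is reached at the 425th iteration, so every remaining iteration allocates a new string in the LOH, and about 5 billion characters are copied in total.</p>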
<p>Note that if the native frames were visible in the flamegraph (by the way, let me know if this is a feature that would make sense to add), you would see the methods of the CLR responsible for the GC.</p>
<p>Look at <a href="https://www.tessferrandez.com/blog/2008/02/27/net-debugging-demos-lab-4-walkthrough.html">Tess Ferrandez’s post</a> for a possible solution to this expensive code pattern (i.e. calling <strong>string.Concat</strong> in a large tight loop).</p>
<h2 id="different-types-offilters">Different types of filters</h2>
<p>Before leaving, I would like to quickly talk about the list shown on the right-hand side of the UI.</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_S3wDYybXPZIbdB3fFdVong.png"></p>
<p>It allows you to see a wall time summary per method name, namespace (i.e. sum of methods from types in the same namespace), thread ID (i.e. internal Datadog unique ID), thread name or AppDomain name.</p>
<p>First, the only methods listed here are “leaf” methods: they appear at the top of at least one callstack. If you would like to visually see some specific frames, you should use the filter box:</p>
<p><img loading="lazy" src="/posts/2022-01-28_troubleshooting-net-performanc/1_GTUUF548uj1JVUV3kLkUcA.png"></p>
<p>All other frames are faded out.</p>
<p>Second, the list is sorted with the largest wall time at the top: this could be an easy way to spot methods that are “expensive” in terms of CPU (i.e. they are frequently running, so they appear at the top of the stack). You simply need to skip the wait- and sleep-related methods, as shown in the previous screenshot: <strong>String.Concat</strong> and <strong>Buffer._Memmove</strong> (used by <strong>string.Concat</strong>) were just in front of your eyes!</p>
<p>When you select an element of the list, the flamegraph is updated accordingly: only the callstacks containing this element will be visible (it could speed up the filtering process).</p>
<h2 id="references">References</h2>
<ul>
<li>Datadog Tracer &amp; Continuous Profiler <a href="https://github.com/DataDog/dd-trace-dotnet/releases/tag/v2.1.1-profiler-beta1">.msi Installer</a></li>
<li><a href="https://docs.datadoghq.com/tracing/profiler/enabling/dotnet">Datadog Continuous Profiler documentation</a></li>
<li><a href="https://docs.datadoghq.com/tracing/setup_overview/setup/dotnet-framework/?tab=windows">Datadog Tracer documentation</a></li>
<li><a href="https://docs.datadoghq.com/tracing/runtime_metrics/dotnet/">Datadog Runtime metrics documentation</a></li>
<li><a href="https://twitter.com/TessFerrandez">Tess Ferrandez</a> repository for <a href="https://www.tessferrandez.com/blog/2008/02/04/debugging-demos-setup-instructions.html">BuggyBits labs</a></li>
</ul>
]]></content:encoded></item><item><title>Accessing arrays and class fields with .NET profiling APIs</title><link>https://chrisnas.github.io/posts/2021-12-18_accessing-arrays-and-class/</link><pubDate>Sat, 18 Dec 2021 16:31:37 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-12-18_accessing-arrays-and-class/</guid><description>This post describes how to access arrays and fields of reference type instances with .NET Profiler APIs.</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>After getting basic and string parameters, it is time to look at arrays and reference types.</p>
<h2 id="accessing-managedarrays">Accessing managed arrays</h2>
<p>You check for a null array parameter the same way as for a string:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">case</span> <span class="n">ELEMENT_TYPE_SZARRAY</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// look at the reference stored at the given address</span>
</span></span><span class="line"><span class="cl">   <span class="n">unsigned</span> <span class="n">__int64</span><span class="p">*</span> <span class="n">pAddress</span> <span class="p">=</span> <span class="p">(</span><span class="n">unsigned</span> <span class="n">__int64</span><span class="p">*)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="kt">byte</span><span class="p">*</span> <span class="n">managedReference</span> <span class="p">=</span> <span class="p">(</span><span class="kt">byte</span><span class="p">*)(*</span><span class="n">pAddress</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">managedReference</span> <span class="p">==</span> <span class="n">NULL</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="k">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;null array&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>ELEMENT_TYPE_SZARRAY</strong> applies to single dimension arrays including jagged arrays. <strong>ELEMENT_TYPE_ARRAY</strong> is used for matrices:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">case</span> <span class="n">ELEMENT_TYPE_ARRAY</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// look at the reference stored at the given address</span>
</span></span><span class="line"><span class="cl">   <span class="n">unsigned</span> <span class="n">__int64</span><span class="p">*</span> <span class="n">pAddress</span> <span class="p">=</span> <span class="p">(</span><span class="n">unsigned</span> <span class="n">__int64</span><span class="p">*)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="kt">byte</span><span class="p">*</span> <span class="n">managedReference</span> <span class="p">=</span> <span class="p">(</span><span class="kt">byte</span><span class="p">*)(*</span><span class="n">pAddress</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">managedReference</span> <span class="p">==</span> <span class="n">NULL</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="k">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;null matrix&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since arrays are reference types, we know that the managed reference points to the address of the Method Table but we need more insights to get the elements. Again, Sergey Tepliakov explains in great detail how single dimension arrays are laid out in memory:</p>
<p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/0_3fZ9bQv70K96FXD2.png"></p>
<p>The length is stored in front of the elements, as you can see in Visual Studio for the following 10-element integer array:</p>
<pre tabindex="0"><code>var ints = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
</code></pre><p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/1_ZGsf7_dqCStZX_0JsDTLfw.png"></p>
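<p>The layout above can be sketched in a few lines of C++. This is only an illustration, assuming the 64-bit layout (an 8-byte Method Table pointer, the 4-byte length at offset 8 and the first element at offset 16 from the managed reference). These offsets are implementation details of the runtime, and the helper names (<strong>ReadArrayLength</strong>, <strong>ReadInt32Element</strong>, <strong>MakeFakeIntArray</strong>) are hypothetical. The code works on a fake buffer, not on a real managed object:</p>
<pre tabindex="0"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;cstring&gt;
#include &lt;vector&gt;

// Assumed 64-bit layout, relative to the managed reference (i.e. the Method Table pointer)
constexpr std::size_t kLengthOffset = 8;        // 4-byte length right after the Method Table pointer
constexpr std::size_t kFirstElementOffset = 16; // elements start after the length and 4 bytes of padding

// Reads the element count stored in front of the elements
std::uint32_t ReadArrayLength(const unsigned char* managedReference)
{
    std::uint32_t length = 0;
    std::memcpy(&amp;length, managedReference + kLengthOffset, sizeof(length));
    return length;
}

// Reads the int element stored at the given index
std::int32_t ReadInt32Element(const unsigned char* managedReference, std::size_t index)
{
    std::int32_t value = 0;
    std::memcpy(&amp;value, managedReference + kFirstElementOffset + index * sizeof(value), sizeof(value));
    return value;
}

// Builds a fake in-memory image of an int[] (Method Table pointer left as zeros) for demonstration
std::vector&lt;unsigned char&gt; MakeFakeIntArray(const std::vector&lt;std::int32_t&gt;&amp; values)
{
    std::vector&lt;unsigned char&gt; image(kFirstElementOffset + values.size() * sizeof(std::int32_t), 0);
    const std::uint32_t length = static_cast&lt;std::uint32_t&gt;(values.size());
    std::memcpy(image.data() + kLengthOffset, &amp;length, sizeof(length));
    if (!values.empty())
    {
        std::memcpy(image.data() + kFirstElementOffset, values.data(), values.size() * sizeof(std::int32_t));
    }
    return image;
}
</code></pre>
<p>In a real profiler, it is safer to let the profiling API provide the length and the address of the elements instead of hardcoding such offsets.</p>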
<p>Note that “jagged” arrays (i.e. array of array such as int[][]) are stored the same way: each element of the first array contains a reference to another array:</p>
<p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/1_RkKLazyuqobQVMkqwjFgww.png"></p>
<p>The layout is a little bit different for matrices (i.e. multi-dimensional arrays) such as this 2 x 4 integer array:</p>
<pre tabindex="0"><code>var matrix = new int[,] { { 1, 1, 1, 1 }, { 2, 2, 2, 2 } };
</code></pre><p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/1_enx0CIYive7JfoskgeX6vA.png"></p>
<p>In that case, the total element count appears before each dimension length. The elements are stored row after row.</p>
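<p>“Row after row” means that the element at [row, col] is found at the flat index row * columnCount + col from the beginning of the elements. Here is a minimal C++ sketch of that computation, assuming lower bounds of 0. The <strong>RowMajorIndex</strong> and <strong>GetMatrixElement</strong> helpers are hypothetical, and a plain vector stands in for the element area of the 2 x 4 matrix:</p>
<pre tabindex="0"><code class="language-cpp" data-lang="cpp">#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Hypothetical helper: flat index of element [row, col] in a row-major matrix
// whose second dimension contains columnCount elements (lower bounds assumed to be 0)
std::size_t RowMajorIndex(std::size_t row, std::size_t col, std::size_t columnCount)
{
    return row * columnCount + col;
}

// Stand-in for the element area of: new int[,] { { 1, 1, 1, 1 }, { 2, 2, 2, 2 } }
const std::vector&lt;int&gt; kMatrixElements = { 1, 1, 1, 1, 2, 2, 2, 2 };

// Returns the element at [row, col] of the 2 x 4 matrix
int GetMatrixElement(std::size_t row, std::size_t col)
{
    const std::size_t columnCount = 4;
    return kMatrixElements[RowMajorIndex(row, col, columnCount)];
}
</code></pre>
<p>For example, the first element of the second row, [1, 0], sits at flat index 4, right after the four elements of the first row.</p>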
<p>The profiling API <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getarrayobjectinfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetArrayObjectInfo</strong></a> gives us all the implementation details we need:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl">   <span class="c1">// single dimension array so the following arrays only need 1 element to receive size and lower bound</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG32</span><span class="p">*</span> <span class="n">pDimensionSizes</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ULONG32</span><span class="p">[</span><span class="m">1</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="kt">int</span><span class="p">*</span> <span class="n">pDimensionLowerBounds</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="m">1</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="kt">byte</span><span class="p">*</span> <span class="n">pElements</span><span class="p">;</span> <span class="c1">// will point to the beginning of the array elements</span>
</span></span><span class="line"><span class="cl">   <span class="n">HRESULT</span> <span class="n">hr</span> <span class="p">=</span> <span class="n">_pProfilerInfo</span><span class="p">-&gt;</span><span class="n">GetArrayObjectInfo</span><span class="p">((</span><span class="n">ObjectID</span><span class="p">)</span><span class="n">managedReference</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="n">pDimensionSizes</span><span class="p">,</span> <span class="n">pDimensionLowerBounds</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">pElements</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Here is the description of each parameter:</p>
<ul>
<li>an <strong>ObjectID</strong> (i.e. a reference to an object in the managed heap) corresponding to an array.</li>
<li>the number of dimensions (a.k.a. rank) so 1 for <strong>ELEMENT_TYPE_SZARRAY</strong> array. I will show in a moment how to get it for matrices.</li>
<li>an allocated array to receive the size of each dimension</li>
<li>an allocated array to receive the lower bound of each dimension; should be 0 for C#</li>
<li>the address of the beginning of the elements</li>
</ul>
<p>So it is easy to detect an empty array: it means that its length is 0:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl">   <span class="n">ULONG32</span> <span class="n">arrayLength</span> <span class="p">=</span> <span class="n">pDimensionSizes</span><span class="p">[</span><span class="m">0</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="n">delete</span><span class="p">[]</span> <span class="n">pDimensionSizes</span><span class="p">;</span> <span class="c1">// allocated with new[] so released with delete[]</span>
</span></span><span class="line"><span class="cl">   <span class="n">delete</span><span class="p">[]</span> <span class="n">pDimensionLowerBounds</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">arrayLength</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="k">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;empty single dimension array&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The next step is to get the value of each array element. It is easy to get the ClassID of a given object by calling <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getclassfromobject-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetClassFromObject</strong></a> and then <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-isarrayclass-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::IsArrayClass</strong></a> will provide the array rank and its elements <strong>CorElementType</strong> and <strong>ClassID</strong>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// get element type and array rank </span>
</span></span><span class="line"><span class="cl"><span class="c1">// (could be used before calling GetArrayObjectInfo to allocate the size + bounds arrays)</span>
</span></span><span class="line"><span class="cl"><span class="n">ClassID</span> <span class="n">classId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">rank</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">CorElementType</span> <span class="n">baseElementType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">_pProfilerInfo</span><span class="p">-&gt;</span><span class="n">GetClassFromObject</span><span class="p">((</span><span class="n">ObjectID</span><span class="p">)</span><span class="n">managedReference</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">classId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">_pProfilerInfo</span><span class="p">-&gt;</span><span class="n">IsArrayClass</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">baseElementType</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">rank</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>With these details, iterating over each element to get its value is not that complicated:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">char</span> <span class="n">elementValue</span><span class="p">[</span><span class="mi">128</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="n">ULONG</span> <span class="n">current</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">current</span> <span class="o">&lt;</span> <span class="n">arrayLength</span><span class="p">;</span> <span class="n">current</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">GetElementValue</span><span class="p">(</span><span class="n">pElements</span><span class="p">,</span> <span class="n">baseElementType</span><span class="p">,</span> <span class="n">elementValue</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">elementValue</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">strcat_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="n">elementValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">FAILED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span> <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">current</span> <span class="o">&lt;</span> <span class="n">arrayLength</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcat_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;, &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="n">strcat_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;)&#34;</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><strong>GetElementValue</strong> is where you use the element type to compute the value and to know how many bytes to move forward to reach the next element:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span><span class="lnt">46
</span><span class="lnt">47
</span><span class="lnt">48
</span><span class="lnt">49
</span><span class="lnt">50
</span><span class="lnt">51
</span><span class="lnt">52
</span><span class="lnt">53
</span><span class="lnt">54
</span><span class="lnt">55
</span><span class="lnt">56
</span><span class="lnt">57
</span><span class="lnt">58
</span><span class="lnt">59
</span><span class="lnt">60
</span><span class="lnt">61
</span><span class="lnt">62
</span><span class="lnt">63
</span><span class="lnt">64
</span><span class="lnt">65
</span><span class="lnt">66
</span><span class="lnt">67
</span><span class="lnt">68
</span><span class="lnt">69
</span><span class="lnt">70
</span><span class="lnt">71
</span><span class="lnt">72
</span><span class="lnt">73
</span><span class="lnt">74
</span><span class="lnt">75
</span><span class="lnt">76
</span><span class="lnt">77
</span><span class="lnt">78
</span><span class="lnt">79
</span><span class="lnt">80
</span><span class="lnt">81
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="n">CorProfilerHelpers</span><span class="o">::</span><span class="n">GetElementValue</span><span class="p">(</span><span class="n">byte</span><span class="o">*&amp;</span> <span class="n">pElement</span><span class="p">,</span> <span class="n">CorElementType</span> <span class="n">elementType</span> <span class="p">,</span> <span class="n">mdToken</span> <span class="n">elementToken</span><span class="p">,</span> <span class="n">ModuleID</span> <span class="n">moduleId</span><span class="p">,</span> <span class="kt">char</span><span class="o">*</span> <span class="n">value</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">charCount</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">GetObjectValue</span><span class="p">((</span><span class="n">UINT_PTR</span><span class="p">)</span><span class="n">pElement</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">),</span> <span class="n">elementType</span><span class="p">,</span> <span class="n">elementToken</span><span class="p">,</span> <span class="n">moduleId</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">switch</span> <span class="p">(</span><span class="n">elementType</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_BOOLEAN</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">bool</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_CHAR</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">WCHAR</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_I1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">char</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_U1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_I2</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">short</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_U2</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">short</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_I4</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_U4</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_I8</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">long</span> <span class="kt">long</span><span class="p">);</span> <span class="c1">// 8 bytes; long is only 4 bytes on Windows</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_U8</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">long</span> <span class="kt">long</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_R4</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">float</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_R8</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_STRING</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_CLASS</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="c1">// NOTE: can&#39;t call GetObjectValue recursively because won&#39;t fit on one line
</span></span></span><span class="line"><span class="cl">      <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;0x%p&#34;</span><span class="p">,</span> <span class="o">*</span><span class="p">(</span><span class="n">UINT_PTR</span><span class="o">*</span><span class="p">)</span><span class="n">pElement</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_SZARRAY</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="c1">// arrays are reference types so skip the size of an address
</span></span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">case</span> <span class="nl">ELEMENT_TYPE_OBJECT</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;obj&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">pElement</span> <span class="o">+=</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">default</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;?&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span> <span class="n">E_FAIL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">return</span> <span class="n">S_OK</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
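<p>As a side note, the size computation can be isolated from the formatting. The following is only a minimal sketch, not the code used by the profiler: <strong>ElemKind</strong>, <strong>ElementSize</strong> and <strong>Advance</strong> are hypothetical names standing in for <strong>CorElementType</strong> and the <strong>pElement += ...</strong> lines above, so that it compiles without the CLR headers. It relies on fixed-width types to avoid depending on the platform-specific size of <strong>long</strong>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the CorElementType values handled above,
// so that this sketch compiles without the CLR headers.
enum class ElemKind { Boolean, Char, I1, U1, I2, U2, I4, U4, I8, U8, R4, R8, Reference };

// Size in bytes of one array element of the given kind.
// Returns 0 for kinds this sketch does not handle.
inline std::size_t ElementSize(ElemKind kind)
{
    switch (kind)
    {
    case ElemKind::Boolean:
    case ElemKind::I1:
    case ElemKind::U1:        return sizeof(std::uint8_t);
    case ElemKind::Char:      // UTF-16 code unit
    case ElemKind::I2:
    case ElemKind::U2:        return sizeof(std::uint16_t);
    case ElemKind::I4:
    case ElemKind::U4:
    case ElemKind::R4:        return sizeof(std::uint32_t);
    case ElemKind::I8:
    case ElemKind::U8:
    case ElemKind::R8:        return sizeof(std::uint64_t);
    case ElemKind::Reference: return sizeof(void*); // strings, objects, arrays
    }
    return 0;
}

// Moves the cursor past one element; returns false for an unhandled kind.
inline bool Advance(const unsigned char*& cursor, ElemKind kind)
{
    assert(cursor != nullptr);
    const std::size_t size = ElementSize(kind);
    if (size == 0) return false;
    cursor += size;
    return true;
}
```

<p>With such a helper, the cursor moves by 4 bytes for an <strong>int</strong> element and by 8 bytes for a <strong>long</strong> element, whatever the C++ compiler thinks of <strong>long</strong>.</p>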
</div><p>For matrices, you need to know the rank ahead of time to allocate the <strong>GetArrayObjectInfo</strong> out parameters:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_ARRAY</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// same code to check null matrix
</span></span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// get element type and array rank 
</span></span></span><span class="line"><span class="cl">   <span class="c1">// --&gt; used before calling GetArrayObjectInfo to allocate the size + bounds arrays
</span></span></span><span class="line"><span class="cl">   <span class="n">ClassID</span> <span class="n">classId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span> <span class="n">rank</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">CorElementType</span> <span class="n">baseElementType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetClassFromObject</span><span class="p">((</span><span class="n">ObjectID</span><span class="p">)</span><span class="n">managedReference</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">classId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">IsArrayClass</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">baseElementType</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">rank</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">ULONG32</span><span class="o">*</span> <span class="n">pDimensionSizes</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ULONG32</span><span class="p">[</span><span class="n">rank</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="kt">int</span><span class="o">*</span> <span class="n">pDimensionLowerBounds</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">rank</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="n">byte</span><span class="o">*</span> <span class="n">pElements</span><span class="p">;</span> <span class="c1">// will point to the beginning of the array elements
</span></span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetArrayObjectInfo</span><span class="p">((</span><span class="n">ObjectID</span><span class="p">)</span><span class="n">managedReference</span><span class="p">,</span> <span class="n">rank</span><span class="p">,</span> <span class="n">pDimensionSizes</span><span class="p">,</span> <span class="n">pDimensionLowerBounds</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pElements</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The following code shows how to compute each dimension length:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// show dimensions
</span></span></span><span class="line"><span class="cl"><span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;[&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kt">char</span> <span class="n">buffer</span><span class="p">[</span><span class="mi">16</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="n">ULONG32</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">rank</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">buffer</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="s">&#34;%u&#34;</span><span class="p">,</span> <span class="n">pDimensionSizes</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
</span></span><span class="line"><span class="cl">   <span class="n">strcat_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="n">buffer</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">&lt;</span> <span class="n">rank</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcat_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;, &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="n">strcat_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;]&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">delete</span><span class="p">[]</span> <span class="n">pDimensionSizes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">delete</span><span class="p">[]</span> <span class="n">pDimensionLowerBounds</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
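<p>The same formatting can be sketched with standard containers. This is an illustration under assumptions, not the profiler code: <strong>FormatDimensions</strong> and <strong>TotalElementCount</strong> are hypothetical helpers, and the element count assumes that the elements of a matrix are stored contiguously, so the total is the product of the dimension sizes.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Formats the dimension sizes as "[d0, d1, ...]".
inline std::string FormatDimensions(const std::vector<std::uint32_t>& sizes)
{
    std::string text = "[";
    for (std::size_t i = 0; i < sizes.size(); i++)
    {
        if (i > 0) text += ", ";
        text += std::to_string(sizes[i]);
    }
    text += "]";
    return text;
}

// Total number of elements: product of the dimension sizes (0 when rank is 0).
inline std::uint64_t TotalElementCount(const std::vector<std::uint32_t>& sizes)
{
    if (sizes.empty()) return 0;
    std::uint64_t total = 1;
    for (std::uint32_t size : sizes) total *= size;
    return total;
}
```

<p>A <strong>std::vector</strong> sized with the rank releases its storage automatically, and its <strong>data()</strong> pointer could presumably be used where a raw buffer is expected.</p>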
</div><h2 id="getting-fields-of-a-reference-typeinstance">Getting fields of a reference type instance</h2>
<p>Since most “basic” types have been covered, it is now time to discuss the case of reference type parameters. Let’s take the following simple class as an example:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">ClassType</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">ClassType</span><span class="p">(</span><span class="kt">int</span> <span class="n">val</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">intField</span> <span class="p">=</span> <span class="n">val</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">stringField</span> <span class="p">=</span> <span class="p">(</span><span class="n">val</span> <span class="p">+</span> <span class="m">1</span><span class="p">).</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">IntProperty</span> <span class="p">=</span> <span class="n">val</span> <span class="p">*</span> <span class="m">2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">IntProperty</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">intField</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">stringField</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>On purpose, one property and two fields are defined. The C# compiler translates the automatic property syntax into a backing field that stores the value:</p>
<p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/1_7DnYG2yxKx0CH15sxP2bOA.png"></p>
<p>And the corresponding get/set accessors pair:</p>
<p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/1_-UF5kEZXd0LFs7gANNGOrw.png"></p>
<p>So when an instance of this class is passed as a parameter to the <strong>ClassParamReturnClass(ClassType obj)</strong> method, you should be able to list these three fields and access their value to build the following output:</p>
<pre tabindex="0"><code>--&gt; ClassType ClassParamReturnClass (ClassType obj)
|  int32 &lt;IntProperty&gt;k__BackingField = 84
|  int32 intField = 42
|  String stringField = 43
ClassType obj = 0x00000276D0A98E78
</code></pre><p>When you read Sergey Tepliakov's <a href="https://devblogs.microsoft.com/premier-developer/managed-object-internals-part-4-fields-layout?WT.mc_id=DT-MVP-5003325">Managed object internals, Part 4. Fields layout</a>, it sounds quite hard to achieve because of the complicated padding rules dictating where each field is stored in memory. Fortunately, the profiling API helps a lot with a three-step process:</p>
<ul>
<li>Get the offset of each field value</li>
<li>Get the name of each field</li>
<li>Get the type of each field and then compute the value using the offset</li>
</ul>
<p>First, you need the <strong>ClassID</strong> corresponding to the type of the reference you want to dump, and we have seen that <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getclassfromobject-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetClassFromObject</strong></a> is perfect for that. Then, pass it to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getclasslayout-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetClassLayout</strong></a> to get the number of fields and their offsets within an instance. This API expects you to call it once to get the number of fields, and a second time to get the offsets, which are stored in a buffer you allocate after the first call.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">fieldCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> 
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetClassLayout</span><span class="p">(</span><span class="n">classID</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fieldCount</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// no field to dump
</span></span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">fieldCount</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">COR_FIELD_OFFSET</span><span class="o">*</span> <span class="n">pFieldOffsets</span> <span class="o">=</span> <span class="k">new</span> <span class="n">COR_FIELD_OFFSET</span><span class="p">[</span><span class="n">fieldCount</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetClassLayout</span><span class="p">(</span><span class="n">classID</span><span class="p">,</span> <span class="n">pFieldOffsets</span><span class="p">,</span> <span class="n">fieldCount</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fieldCount</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/cor-field-offset-structure?WT.mc_id=DT-MVP-5003325"><strong>COR_FIELD_OFFSET</strong></a> structure has a confusing name because it contains more than just the offset:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-c" data-lang="c"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="k">struct</span> <span class="n">COR_FIELD_OFFSET</span> <span class="p">{</span>  
</span></span><span class="line"><span class="cl">    <span class="n">mdFieldDef</span>  <span class="n">ridOfField</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl">    <span class="n">ULONG</span>       <span class="n">ulOffset</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl"><span class="p">}</span> <span class="n">COR_FIELD_OFFSET</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>ridOfField</strong> part gives you the metadata token corresponding to a field as shown in ILSpy:</p>
<p><img loading="lazy" src="/posts/2021-12-18_accessing-arrays-and-class/1_gVi3qmL-iMZpQR2a89SkHQ.png"></p>
<p>It will allow you to get its name and the usual binary signature for its type via <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nf-rometadataapi-imetadataimport-getfieldprops?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetFieldProps</strong></a>.</p>
<p>So you first need to get the <strong>IMetaDataImport</strong> implementation for the class module:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">IMetaDataImport</span><span class="o">*</span> <span class="n">pMetaDataImport</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl"><span class="n">ModuleID</span> <span class="n">moduleID</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdTypeDef</span> <span class="n">typeDefToken</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">WCHAR</span> <span class="n">name</span><span class="p">[</span><span class="mi">256</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetClassIDInfo</span><span class="p">(</span><span class="n">classID</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">moduleID</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">typeDefToken</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetModuleMetaData</span><span class="p">(</span><span class="n">moduleID</span><span class="p">,</span> <span class="n">ofRead</span><span class="p">,</span> <span class="n">IID_IMetaDataImport</span><span class="p">,</span> <span class="p">(</span><span class="n">IUnknown</span><span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">pMetaDataImport</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It is now time to iterate over each field to get its name, type and value for the given object:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">char</span> <span class="n">value</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="kt">char</span> <span class="n">buffer</span><span class="p">[</span><span class="mi">2</span> <span class="o">*</span> <span class="mi">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="n">ULONG</span> <span class="n">fieldIndex</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">fieldIndex</span> <span class="o">&lt;</span> <span class="n">fieldCount</span><span class="p">;</span> <span class="n">fieldIndex</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">PCCOR_SIGNATURE</span> <span class="n">pSigBlob</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span> <span class="n">sigBlobSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">name</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sa">L</span><span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">pMetaDataImport</span><span class="o">-&gt;</span><span class="n">GetFieldProps</span><span class="p">(</span><span class="n">pFieldOffsets</span><span class="p">[</span><span class="n">fieldIndex</span><span class="p">].</span><span class="n">ridOfField</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="n">name</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">name</span><span class="p">),</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pSigBlob</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">sigBlobSize</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">      <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// skip the &#34;calling convention&#34; that should correspond to a &#39;field&#39;
</span></span></span><span class="line"><span class="cl">      <span class="n">ULONG</span> <span class="n">callingConvention</span> <span class="o">=</span> <span class="o">*</span><span class="n">pSigBlob</span><span class="o">++</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">assert</span><span class="p">(</span><span class="n">callingConvention</span> <span class="o">==</span> <span class="n">IMAGE_CEE_CS_CALLCONV_FIELD</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">buffer</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">pSigBlob</span> <span class="o">=</span> <span class="n">ParseElementType</span><span class="p">(</span><span class="n">pMetaDataImport</span><span class="p">,</span> <span class="n">pSigBlob</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">elementType</span><span class="p">,</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">buffer</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="c1">// get its value from pFieldOffsets[fieldIndex].ulOffset
</span></span></span><span class="line"><span class="cl">      <span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">GetObjectValue</span><span class="p">((</span><span class="n">UINT_PTR</span><span class="p">)(</span><span class="n">managedReference</span> <span class="o">+</span> <span class="n">pFieldOffsets</span><span class="p">[</span><span class="n">fieldIndex</span><span class="p">].</span><span class="n">ulOffset</span><span class="p">),</span> <span class="n">length</span><span class="p">,</span> <span class="n">elementType</span><span class="p">,</span> <span class="n">elementToken</span><span class="p">,</span> <span class="n">moduleId</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">delete</span> <span class="p">[]</span> <span class="n">pFieldOffsets</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">pMetaDataImport</span><span class="o">-&gt;</span><span class="n">Release</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The only thing that differs from the blob signature parsing you have seen earlier is that it starts with a “calling convention” (yes, I know that we are talking about fields and not methods!) equal to <strong>IMAGE_CEE_CS_CALLCONV_FIELD</strong>.</p>
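<p>To make the layout of a field signature concrete, here is a minimal sketch that only handles the simplest case: one calling convention byte followed by one element type byte. The constants mirror the values defined in corhdr.h and are redefined so that the snippet builds on its own. <strong>GetSimpleFieldElementType</strong> is a hypothetical helper, not part of the profiler shown in this series, and it ignores custom modifiers, class tokens and generic instantiations that a real parser such as <strong>ParseElementType</strong> has to deal with.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values copied from corhdr.h so that the sketch compiles without the CLR headers.
constexpr uint8_t kCallConvField     = 0x06;  // IMAGE_CEE_CS_CALLCONV_FIELD
constexpr uint8_t kElementTypeI4     = 0x08;  // ELEMENT_TYPE_I4
constexpr uint8_t kElementTypeString = 0x0E;  // ELEMENT_TYPE_STRING

// Hypothetical helper: returns the byte that follows the FIELD calling convention,
// or 0 (ELEMENT_TYPE_END) if the blob is too short or does not describe a field.
// Assumption: no custom modifier sits between the two bytes.
uint8_t GetSimpleFieldElementType(const uint8_t* pSigBlob, size_t sigBlobSize)
{
    if (pSigBlob == nullptr || sigBlobSize < 2)
        return 0;
    if (pSigBlob[0] != kCallConvField)
        return 0;
    return pSigBlob[1];
}
```

<p>With these assumptions, the two bytes 0x06 0x08 describe an <strong>int</strong> field such as <strong>intField</strong>, and 0x06 0x0E describe a <strong>string</strong> field such as <strong>stringField</strong>.</p>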
<p>The field value is stored in memory at <strong>ulOffset</strong> bytes after the address pointed to by the given managed reference.</p>
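<p>The address arithmetic can be tried outside of a profiler with a fake object. The following sketch is only an illustration: <strong>ReadFieldAt</strong> and <strong>FakeObjectRoundTrips</strong> are hypothetical names, and the offsets 8 and 12 are arbitrary values chosen for the example, not what <strong>GetClassLayout</strong> would return for <strong>ClassType</strong>.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies sizeof(T) bytes located ulOffset bytes after the
// address the object reference points to. memcpy is used instead of a cast so
// that the read does not depend on alignment.
template <typename T>
T ReadFieldAt(const uint8_t* pObject, uint32_t ulOffset)
{
    T fieldValue;
    std::memcpy(&fieldValue, pObject + ulOffset, sizeof(T));
    return fieldValue;
}

// Builds a fake 16-byte "object" with two int32 fields at offsets 8 and 12
// (arbitrary offsets for this example) and reads them back.
bool FakeObjectRoundTrips()
{
    uint8_t fakeObject[16] = {};
    const int32_t backingField = 84;
    const int32_t intField = 42;
    std::memcpy(fakeObject + 8, &backingField, sizeof(backingField));
    std::memcpy(fakeObject + 12, &intField, sizeof(intField));

    return ReadFieldAt<int32_t>(fakeObject, 8) == 84
        && ReadFieldAt<int32_t>(fakeObject, 12) == 42;
}
```

<p>For a reference type field such as <strong>stringField</strong>, the bytes found at the offset are not the characters but a reference to another managed object, which must be decoded in turn.</p>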
<p>The next episode will describe how to dump value type instances, the return values and exceptions handling: a good way to start 2022!</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://mattwarren.org/2017/05/08/Arrays-and-the-CLR-a-Very-Special-Relationship/">Arrays and the CLR — a Very Special Relationship</a> by <a href="https://twitter.com/matthewwarren">Matt Warren</a></li>
<li><a href="https://devblogs.microsoft.com/premier-developer/managed-object-internals-part-3-the-layout-of-a-managed-array-3?WT.mc_id=DT-MVP-5003325">Managed object internals, Part 3. The layout of a managed array</a> by <a href="https://devblogs.microsoft.com/premier-developer/author/seteplia/">Sergey Tepliakov</a></li>
<li><a href="https://devblogs.microsoft.com/premier-developer/managed-object-internals-part-4-fields-layout?WT.mc_id=DT-MVP-5003325">Managed object internals, Part 4. Fields layout</a> by <a href="https://devblogs.microsoft.com/premier-developer/author/seteplia/">Sergey Tepliakov</a></li>
<li>Episode 1: <a href="/posts/2021-08-07_start-journey-into-the/">Start a journey into the .NET Profiling APIs</a></li>
<li>Episode 2: <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">Dealing with Modules, Assemblies and Types with CLR profiling API</a></li>
<li>Episode 3: <a href="/posts/2021-10-12_decyphering-method-signature-w/">Decyphering methods signature with .NET Profiling APIs</a></li>
<li>Episode 4: <a href="/posts/2021-11-16_reading-parameters-value-with/">Reading parameters value with the .NET Profiling APIs</a></li>
</ul>
]]></content:encoded></item><item><title>Reading parameters value with the .NET Profiling APIs</title><link>https://chrisnas.github.io/posts/2021-11-16_reading-parameters-value-with/</link><pubDate>Tue, 16 Nov 2021 09:02:44 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-11-16_reading-parameters-value-with/</guid><description>This post describes how to access method call parameters and get the value of numbers and strings.</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>From <a href="/posts/2021-10-12_decyphering-method-signature-w/">the list of arguments with their type</a>, it becomes possible to figure out their value when a method gets called. The rest of this post describes how to access method call parameters and get the value of numbers and strings.</p>
<h2 id="where-are-my-parameters">Where are my parameters?</h2>
<p>When you pass <strong>COR_PRF_ENABLE_FUNCTION_ARGS</strong> to <strong>ICorProfilerInfo::SetEventMask</strong>, the runtime prepares a <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/cor-prf-function-argument-info-structure?WT.mc_id=DT-MVP-5003325"><strong>COR_PRF_FUNCTION_ARGUMENT_INFO</strong></a> structure before your enter callback is called:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-c" data-lang="c"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="k">struct</span> <span class="n">_COR_PRF_FUNCTION_ARGUMENT_INFO</span> <span class="p">{</span>  
</span></span><span class="line"><span class="cl">    <span class="n">ULONG</span> <span class="n">numRanges</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl">    <span class="n">ULONG</span> <span class="n">totalArgumentSize</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl">    <span class="n">COR_PRF_FUNCTION_ARGUMENT_RANGE</span> <span class="n">ranges</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>  
</span></span><span class="line"><span class="cl"><span class="p">}</span> <span class="n">COR_PRF_FUNCTION_ARGUMENT_INFO</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I have to admit that the Microsoft Docs did not really help me figure out the meaning of each field of this structure, because the word “range” is very confusing here…</p>
<p>Based on my experiments, <strong>numRanges</strong> gives you the number of parameters, including the implicit <em>this</em> parameter in the case of a non-static method. It is different from the signature that we have already parsed from the metadata, where <em>this</em> is not mentioned. The <strong>ranges</strong> field is an array of <strong>COR_PRF_FUNCTION_ARGUMENT_RANGE</strong>, one per parameter:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-c" data-lang="c"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="k">struct</span> <span class="n">_COR_PRF_FUNCTION_ARGUMENT_RANGE</span> <span class="p">{</span>  
</span></span><span class="line"><span class="cl">    <span class="n">UINT_PTR</span> <span class="n">startAddress</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl">    <span class="n">ULONG</span> <span class="n">length</span><span class="p">;</span>  
</span></span><span class="line"><span class="cl"><span class="p">}</span> <span class="n">COR_PRF_FUNCTION_ARGUMENT_RANGE</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>startAddress</strong> points to where the parameter value is stored in memory.</p>
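<p>Even though <strong>ranges</strong> is declared with a single element, it behaves as a trailing array of <strong>numRanges</strong> entries, so the structure has to live in a buffer larger than its declared size. The sketch below shows the idiom with mock copies of the two structures, so that it builds without corprof.h. All the names (<strong>MockArgumentInfo</strong>, <strong>ArgumentInfoSize</strong>, <strong>SumRangeLengths</strong>, <strong>DemoTwoArguments</strong>) are hypothetical, and the buffer size formula is my assumption about how the layout works, not something stated in the documentation. Indexing past the declared bound is the usual C idiom for this kind of structure, even if standard C++ does not formally bless it.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mock copies of the profiling structures (same field order as the originals).
struct MockArgumentRange
{
    uintptr_t startAddress;
    uint32_t  length;
};

struct MockArgumentInfo
{
    uint32_t          numRanges;
    uint32_t          totalArgumentSize;
    MockArgumentRange ranges[1];  // in practice, numRanges entries follow
};

// Bytes needed to store a MockArgumentInfo followed by rangeCount ranges.
size_t ArgumentInfoSize(uint32_t rangeCount)
{
    return offsetof(MockArgumentInfo, ranges) + rangeCount * sizeof(MockArgumentRange);
}

// Walks the trailing array and adds up the length of every range.
uint32_t SumRangeLengths(const MockArgumentInfo* pInfo)
{
    const MockArgumentRange* pRanges = pInfo->ranges;
    uint32_t total = 0;
    for (uint32_t i = 0; i < pInfo->numRanges; i++)
        total += pRanges[i].length;
    return total;
}

// Fills a buffer the way the runtime would for two arguments (int32 and int64).
uint32_t DemoTwoArguments()
{
    int32_t firstArg = 42;
    int64_t secondArg = 43;

    std::vector<unsigned char> buffer(ArgumentInfoSize(2));
    MockArgumentInfo* pInfo = reinterpret_cast<MockArgumentInfo*>(buffer.data());
    MockArgumentRange* pRanges = pInfo->ranges;

    pInfo->numRanges = 2;
    pInfo->totalArgumentSize = 0;  // not used by this sketch
    pRanges[0] = { reinterpret_cast<uintptr_t>(&firstArg), sizeof(firstArg) };
    pRanges[1] = { reinterpret_cast<uintptr_t>(&secondArg), sizeof(secondArg) };

    return SumRangeLengths(pInfo);
}
```

<p>This is also why the real structure cannot simply be declared on the stack: the buffer has to be sized at runtime from the value the API reports.</p>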
<p>However, in addition to the <strong>FunctionID</strong>, you only receive a <strong>COR_PRF_ELT_INFO</strong> in your enter callback. You need to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo3-getfunctionenter3info-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo3::GetFunctionEnter3Info</strong></a> to get the corresponding <strong>COR_PRF_FUNCTION_ARGUMENT_INFO</strong> you are interested in. As often with COM, you need to call it a first time to get the size of the buffer to allocate, and a second time to fill it up:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">argumentInfoSize</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">COR_PRF_FRAME_INFO</span> <span class="n">frameInfo</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">_pInfo</span><span class="o">-&gt;</span><span class="n">GetFunctionEnter3Info</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="n">eltInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">frameInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">argumentInfoSize</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">byte</span><span class="o">*</span> <span class="n">pBuffer</span> <span class="o">=</span> <span class="k">new</span> <span class="n">byte</span><span class="p">[</span><span class="n">argumentInfoSize</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">_pInfo</span><span class="o">-&gt;</span><span class="n">GetFunctionEnter3Info</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="n">eltInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">frameInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">argumentInfoSize</span><span class="p">,</span> <span class="p">(</span><span class="n">COR_PRF_FUNCTION_ARGUMENT_INFO</span><span class="o">*</span><span class="p">)</span><span class="n">pBuffer</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">COR_PRF_FUNCTION_ARGUMENT_INFO</span><span class="o">*</span> <span class="n">pArgumentInfo</span> <span class="o">=</span> <span class="p">(</span><span class="n">COR_PRF_FUNCTION_ARGUMENT_INFO</span><span class="o">*</span><span class="p">)</span><span class="n">pBuffer</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Before iterating over the parameters, you need to deal with non-static methods and their implicit <em>this</em> parameter stored in <strong>pArgumentInfo-&gt;ranges[0]</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">hiddenThisParameterIndexOffset</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">pSignature</span><span class="o">-&gt;</span><span class="n">IsStatic</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">hiddenThisParameterIndexOffset</span><span class="o">++</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// deal with the &#34;this&#34; hidden parameter for non static method
</span></span></span><span class="line"><span class="cl">   <span class="c1">// ex: show its address (i.e. pArgumentInfo-&gt;ranges[0].startAddress)
</span></span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Next, write a loop to iterate on each parameter based on the parameter count obtained previously from the metadata:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">char</span> <span class="n">value</span><span class="p">[</span><span class="mi">128</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="n">ULONG</span> <span class="n">currentParameterInSignature</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">currentParameterInSignature</span> <span class="o">&lt;</span> <span class="n">parameterCount</span><span class="p">;</span> <span class="n">currentParameterInSignature</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// Note: pParameter contains detail extracted from the metadata signature
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">UINT_PTR</span> <span class="n">pStartValue</span> <span class="o">=</span> <span class="n">pArgumentInfo</span><span class="o">-&gt;</span><span class="n">ranges</span><span class="p">[</span><span class="n">currentParameterInSignature</span> <span class="o">+</span> <span class="n">hiddenThisParameterIndexOffset</span><span class="p">].</span><span class="n">startAddress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span> <span class="n">length</span> <span class="o">=</span> <span class="n">pArgumentInfo</span><span class="o">-&gt;</span><span class="n">ranges</span><span class="p">[</span><span class="n">currentParameterInSignature</span> <span class="o">+</span> <span class="n">hiddenThisParameterIndexOffset</span><span class="p">].</span><span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">IsPdOut</span><span class="p">(</span><span class="n">pParameter</span><span class="o">-&gt;</span><span class="n">Attributes</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="c1">// if [out] parameter, nothing to get from it
</span></span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="k">else</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="c1">// call a helper function to extract the value of the parameter
</span></span></span><span class="line"><span class="cl">      <span class="c1">// a string from its address and type 
</span></span></span><span class="line"><span class="cl">      <span class="n">pHelpers</span><span class="o">-&gt;</span><span class="n">GetObjectValue</span><span class="p">(</span><span class="n">pStartValue</span><span class="p">,</span> <span class="n">length</span><span class="p">,</span> <span class="n">pParameter</span><span class="o">-&gt;</span><span class="n">ElementType</span> <span class="p">,</span> <span class="n">pParameter</span><span class="o">-&gt;</span><span class="n">TypeToken</span><span class="p">,</span> <span class="n">pSignature</span><span class="o">-&gt;</span><span class="n">ModuleId</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="simple-type-parameters-case">Simple type parameters case</h2>
<p>The <strong>GetObjectValue()</strong> helper function looks like the following:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">void</span> <span class="n">CorProfilerHelpers</span><span class="o">::</span><span class="n">GetObjectValue</span><span class="p">(</span><span class="n">UINT_PTR</span> <span class="n">address</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">length</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">elementType</span><span class="p">,</span> <span class="n">mdToken</span> <span class="n">elementTypeToken</span><span class="p">,</span> <span class="n">ModuleID</span> <span class="n">moduleId</span><span class="p">,</span> <span class="kt">char</span><span class="o">*</span> <span class="n">value</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">charCount</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span> <span class="n">numberValue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;???&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">switch</span><span class="p">(</span><span class="n">elementType</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="k">default</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">         <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;unknown type 0x%x&#34;</span><span class="p">,</span> <span class="n">elementType</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The way to get the value of a parameter really depends on its type. A length is provided by the <strong>COR_PRF_FUNCTION_ARGUMENT_INFO</strong> structure, but I did not use it except as a sanity check.</p>
<p>The values of simple types are easy to compute because they are stored directly at the given address:</p>
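<p>As an illustration of what such a sanity check could look like, here is a minimal sketch. The <strong>ET_*</strong> constants are local copies of the <strong>CorElementType</strong> values from corhdr.h, and <strong>ExpectedSize()</strong>, <strong>IsLengthPlausible()</strong> and <strong>AssertLengthPlausible()</strong> are hypothetical helpers, not part of the profiler shown in this post. It also assumes that the length reported by the CLR is at least the size of the type, which I have not verified on every platform:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local copies of a few CorElementType values (as defined in corhdr.h),
// so that this sketch compiles without the CLR headers.
enum ElementType : uint32_t
{
   ET_BOOLEAN = 0x02, ET_CHAR = 0x03,
   ET_I1 = 0x04, ET_U1 = 0x05, ET_I2 = 0x06, ET_U2 = 0x07,
   ET_I4 = 0x08, ET_U4 = 0x09, ET_I8 = 0x0a, ET_U8 = 0x0b,
   ET_R4 = 0x0c, ET_R8 = 0x0d
};

// Hypothetical helper: size in bytes of a simple type, 0 for any other type
size_t ExpectedSize(uint32_t elementType)
{
   switch (elementType)
   {
      case ET_BOOLEAN: case ET_I1: case ET_U1: return 1;
      case ET_CHAR: case ET_I2: case ET_U2: return 2;
      case ET_I4: case ET_U4: case ET_R4: return 4;
      case ET_I8: case ET_U8: case ET_R8: return 8;
      default: return 0;
   }
}

// Hypothetical helper: true when the reported length can hold the type
// (assumption: the reported length is never smaller than the type size)
bool IsLengthPlausible(uint32_t elementType, uint32_t length)
{
   size_t expected = ExpectedSize(elementType);
   return (expected == 0) || (length >= expected);
}

// Debug-only guard that could be called before decoding a value
void AssertLengthPlausible(uint32_t elementType, uint32_t length)
{
   assert(IsLengthPlausible(elementType, length));
}
```

<p>Types that are not listed (strings, arrays, classes) return 0 and are always accepted, because their size cannot be deduced from the element type alone.</p>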
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">  1
</span><span class="lnt">  2
</span><span class="lnt">  3
</span><span class="lnt">  4
</span><span class="lnt">  5
</span><span class="lnt">  6
</span><span class="lnt">  7
</span><span class="lnt">  8
</span><span class="lnt">  9
</span><span class="lnt"> 10
</span><span class="lnt"> 11
</span><span class="lnt"> 12
</span><span class="lnt"> 13
</span><span class="lnt"> 14
</span><span class="lnt"> 15
</span><span class="lnt"> 16
</span><span class="lnt"> 17
</span><span class="lnt"> 18
</span><span class="lnt"> 19
</span><span class="lnt"> 20
</span><span class="lnt"> 21
</span><span class="lnt"> 22
</span><span class="lnt"> 23
</span><span class="lnt"> 24
</span><span class="lnt"> 25
</span><span class="lnt"> 26
</span><span class="lnt"> 27
</span><span class="lnt"> 28
</span><span class="lnt"> 29
</span><span class="lnt"> 30
</span><span class="lnt"> 31
</span><span class="lnt"> 32
</span><span class="lnt"> 33
</span><span class="lnt"> 34
</span><span class="lnt"> 35
</span><span class="lnt"> 36
</span><span class="lnt"> 37
</span><span class="lnt"> 38
</span><span class="lnt"> 39
</span><span class="lnt"> 40
</span><span class="lnt"> 41
</span><span class="lnt"> 42
</span><span class="lnt"> 43
</span><span class="lnt"> 44
</span><span class="lnt"> 45
</span><span class="lnt"> 46
</span><span class="lnt"> 47
</span><span class="lnt"> 48
</span><span class="lnt"> 49
</span><span class="lnt"> 50
</span><span class="lnt"> 51
</span><span class="lnt"> 52
</span><span class="lnt"> 53
</span><span class="lnt"> 54
</span><span class="lnt"> 55
</span><span class="lnt"> 56
</span><span class="lnt"> 57
</span><span class="lnt"> 58
</span><span class="lnt"> 59
</span><span class="lnt"> 60
</span><span class="lnt"> 61
</span><span class="lnt"> 62
</span><span class="lnt"> 63
</span><span class="lnt"> 64
</span><span class="lnt"> 65
</span><span class="lnt"> 66
</span><span class="lnt"> 67
</span><span class="lnt"> 68
</span><span class="lnt"> 69
</span><span class="lnt"> 70
</span><span class="lnt"> 71
</span><span class="lnt"> 72
</span><span class="lnt"> 73
</span><span class="lnt"> 74
</span><span class="lnt"> 75
</span><span class="lnt"> 76
</span><span class="lnt"> 77
</span><span class="lnt"> 78
</span><span class="lnt"> 79
</span><span class="lnt"> 80
</span><span class="lnt"> 81
</span><span class="lnt"> 82
</span><span class="lnt"> 83
</span><span class="lnt"> 84
</span><span class="lnt"> 85
</span><span class="lnt"> 86
</span><span class="lnt"> 87
</span><span class="lnt"> 88
</span><span class="lnt"> 89
</span><span class="lnt"> 90
</span><span class="lnt"> 91
</span><span class="lnt"> 92
</span><span class="lnt"> 93
</span><span class="lnt"> 94
</span><span class="lnt"> 95
</span><span class="lnt"> 96
</span><span class="lnt"> 97
</span><span class="lnt"> 98
</span><span class="lnt"> 99
</span><span class="lnt">100
</span><span class="lnt">101
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_BOOLEAN</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">bool</span><span class="o">*</span> <span class="n">pBool</span> <span class="o">=</span> <span class="p">(</span><span class="kt">bool</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="o">*</span><span class="n">pBool</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;true&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">else</span>
</span></span><span class="line"><span class="cl">      <span class="nf">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;false&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_CHAR</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">WCHAR</span><span class="o">*</span> <span class="n">pChar</span> <span class="o">=</span> <span class="p">(</span><span class="n">WCHAR</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%C&#34;</span><span class="p">,</span> <span class="o">*</span><span class="n">pChar</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_I1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// int8
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">char</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">numberValue</span> <span class="o">=</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%d&#34;</span><span class="p">,</span> <span class="n">numberValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_U1</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// unsigned int8
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">unsigned</span> <span class="kt">char</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">numberValue</span> <span class="o">=</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%d&#34;</span><span class="p">,</span> <span class="n">numberValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_I2</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// int16
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">short</span> <span class="kt">int</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kt">short</span> <span class="kt">int</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">numberValue</span> <span class="o">=</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%d&#34;</span><span class="p">,</span> <span class="n">numberValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_U2</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// unsigned int16
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">short</span> <span class="kt">unsigned</span> <span class="kt">int</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kt">short</span> <span class="kt">unsigned</span> <span class="kt">int</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">numberValue</span> <span class="o">=</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%d&#34;</span><span class="p">,</span> <span class="n">numberValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_I4</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// int32
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kr">__int32</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kr">__int32</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">numberValue</span> <span class="o">=</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%d&#34;</span><span class="p">,</span> <span class="n">numberValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_U4</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// unsigned int32
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">unsigned</span> <span class="kr">__int32</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kt">unsigned</span> <span class="kr">__int32</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">numberValue</span> <span class="o">=</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%u&#34;</span><span class="p">,</span> <span class="n">numberValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// NOTE: %lld might not work on linux
</span></span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_I8</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="c1">// int64
</span></span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kr">__int64</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kr">__int64</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%lld&#34;</span><span class="p">,</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_U8</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="c1">// unsigned int64
</span></span></span><span class="line"><span class="cl">   <span class="kt">unsigned</span> <span class="kr">__int64</span><span class="o">*</span> <span class="n">pNumber</span> <span class="o">=</span> <span class="p">(</span><span class="kt">unsigned</span> <span class="kr">__int64</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%llu&#34;</span><span class="p">,</span> <span class="o">*</span><span class="n">pNumber</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="c1">// And guess what? It is the same for float and double because the CLR stores them the same way as C++ does:</span>
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_R4</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">float</span><span class="o">*</span> <span class="n">pFloat</span> <span class="o">=</span> <span class="p">(</span><span class="kt">float</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%f&#34;</span><span class="p">,</span> <span class="o">*</span><span class="n">pFloat</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_R8</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kt">double</span><span class="o">*</span> <span class="n">pDouble</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">sprintf_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;%g&#34;</span><span class="p">,</span> <span class="o">*</span><span class="n">pDouble</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">break</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The other types require more knowledge about how the CLR stores their value.</p>
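<p>A side note on the 64-bit cases above and the <strong>%lld</strong> remark: one portable option is to use the <strong>PRId64</strong>/<strong>PRIu64</strong> macros from <strong>&lt;cinttypes&gt;</strong> together with fixed-width types. The following sketch is not the code of the profiler; <strong>ReadAt()</strong>, <strong>FormatInt64()</strong> and <strong>FormatUInt64()</strong> are hypothetical helpers that only illustrate the idea, and <strong>memcpy</strong> is used instead of a pointer cast to stay safe with unaligned addresses:</p>

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: copy a value of type T from a raw address.
// memcpy avoids the alignment and aliasing pitfalls of a pointer cast.
template <typename T>
T ReadAt(uintptr_t address)
{
   assert(address != 0);
   T result;
   std::memcpy(&result, reinterpret_cast<const void*>(address), sizeof(T));
   return result;
}

// Format a signed 64-bit value with a format specifier
// that matches int64_t on both Windows and Linux
std::string FormatInt64(uintptr_t address)
{
   char buffer[32];
   std::snprintf(buffer, sizeof(buffer), "%" PRId64, ReadAt<int64_t>(address));
   return buffer;
}

// Same for an unsigned 64-bit value
std::string FormatUInt64(uintptr_t address)
{
   char buffer[32];
   std::snprintf(buffer, sizeof(buffer), "%" PRIu64, ReadAt<uint64_t>(address));
   return buffer;
}
```

<p>The same <strong>ReadAt()</strong> template could be reused for the smaller integer types; only the format string changes.</p>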
<h2 id="the-stringcase">The string case</h2>
<p>This is the first reference type we meet and, as for all reference types, the given address points to the memory where the reference (i.e. the address of the object in the managed heap) is stored. This lets you check for a null parameter before looking at the “real” managed reference:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">case</span> <span class="nl">ELEMENT_TYPE_STRING</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// look at the reference stored at the given address
</span></span></span><span class="line"><span class="cl">   <span class="kt">unsigned</span> <span class="kr">__int64</span><span class="o">*</span> <span class="n">pAddress</span> <span class="o">=</span> <span class="p">(</span><span class="kt">unsigned</span> <span class="kr">__int64</span><span class="o">*</span><span class="p">)</span><span class="n">address</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">byte</span><span class="o">*</span> <span class="n">managedReference</span> <span class="o">=</span> <span class="p">(</span><span class="n">byte</span><span class="o">*</span><span class="p">)(</span><span class="o">*</span><span class="n">pAddress</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// easily check for null string
</span></span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">managedReference</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;null string&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>At that point, you need to know how an instance of a reference type is stored by the CLR in the managed heap. Fortunately, Sergey Tepliakov, a software engineer at Microsoft, has provided <a href="https://devblogs.microsoft.com/premier-developer/managed-object-internals-part-1-layout?WT.mc_id=DT-MVP-5003325">a lot of details about that</a>, especially where the address stored in a managed reference points to:</p>
<p><img loading="lazy" src="/posts/2021-11-16_reading-parameters-value-with/1_mcau_z9bFdhqDG6ZLRlutw.png"></p>
<p>It means that you have to skip the Method Table pointer located at the address you have. This applies to any instance of a reference type!</p>
<p>But for our current <strong>string</strong> case, you still need to know how a <strong>string</strong> is stored (i.e. its length followed by the buffer of UTF16 characters). I recommend that you read <a href="https://mattwarren.org/2016/05/31/Strings-and-the-CLR-a-Special-Relationship/">the post from Matt Warren</a> about the subject because it also covers a lot of interesting details related to the string implementation. However, you should simply rely on the implementation details provided by the CLR via <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo3-getstringlayout2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo3::GetStringLayout2</strong></a><strong>:</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">_stringLengthOffset</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">_stringBufferOffset</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">_pProfilerInfo</span><span class="o">-&gt;</span><span class="n">GetStringLayout2</span><span class="p">(</span><span class="o">&amp;</span><span class="n">_stringLengthOffset</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">_stringBufferOffset</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>These two variables give you the offsets to use to access both the string size and the beginning of the array of WCHAR storing each character.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="c1">// -----------------------------------------------------------------
</span></span></span><span class="line"><span class="cl"><span class="c1">//    | MethodTable address | string length | buffer 
</span></span></span><span class="line"><span class="cl"><span class="c1">// -----------------------------------------------------------------
</span></span></span><span class="line"><span class="cl"><span class="c1">// 64       8 bytes              4 bytes      (length+1) x WCHAR
</span></span></span><span class="line"><span class="cl"><span class="c1">// off                           8            12
</span></span></span><span class="line"><span class="cl"><span class="c1">// -----------------------------------------------------------------
</span></span></span><span class="line"><span class="cl"><span class="c1">// 32       4 bytes              4 bytes      (length+1) x WCHAR
</span></span></span><span class="line"><span class="cl"><span class="c1">// off                           4            8
</span></span></span><span class="line"><span class="cl"><span class="c1">// -----------------------------------------------------------------
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>As shown in this table, you need to skip 8/4 bytes to read the length. This confirms that you need to jump over the address of the Method Table, stored as a 64/32-bit value (i.e. an address in x64/x86). The length itself is stored as a 32-bit number (4 bytes) in both x64 and x86, so the array containing the consecutive UTF16 characters follows immediately (i.e. its offset from the reference address is 12/8 bytes). For example, here is what you get in the Visual Studio Memory panel with the reference to the 3-character “CLR” string in a 64-bit application:</p>
<p><img loading="lazy" src="/posts/2021-11-16_reading-parameters-value-with/1_H5QzJWaMyhx13cA8bFbV2Q.png"></p>
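<p>To make this layout concrete outside of a profiler, the following sketch builds a fake “string object” in a byte buffer and reads it back. Everything here is an assumption made for illustration: the offsets are hard-coded to the 64-bit values from the table above (a real profiler should use the values returned by <strong>GetStringLayout2</strong>), the Method Table slot is left zeroed, and <strong>BuildFakeString()</strong>, <strong>ReadLength()</strong> and <strong>ReadCharacters()</strong> are hypothetical helpers:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Offsets assumed for a 64-bit process (see the table above)
const size_t kLengthOffset = 8;
const size_t kBufferOffset = 12;

// Hypothetical helper: lay out a string the way the table describes it,
// i.e. (zeroed) Method Table slot | 32-bit length | (length+1) x UTF16 chars
std::vector<uint8_t> BuildFakeString(const std::u16string& text)
{
   std::vector<uint8_t> object(kBufferOffset + (text.size() + 1) * sizeof(char16_t), 0);
   uint32_t length = static_cast<uint32_t>(text.size());
   std::memcpy(object.data() + kLengthOffset, &length, sizeof(length));
   std::memcpy(object.data() + kBufferOffset, text.data(), text.size() * sizeof(char16_t));
   return object;
}

// Read the full 32-bit length stored after the Method Table slot
uint32_t ReadLength(const uint8_t* managedReference)
{
   assert(managedReference != nullptr);
   uint32_t length = 0;
   std::memcpy(&length, managedReference + kLengthOffset, sizeof(length));
   return length;
}

// Copy the UTF16 characters that follow the length
std::u16string ReadCharacters(const uint8_t* managedReference)
{
   uint32_t length = ReadLength(managedReference);
   std::u16string text(length, u'\0');
   std::memcpy(&text[0], managedReference + kBufferOffset, length * sizeof(char16_t));
   return text;
}
```

<p>Note that the length is read as a whole 32-bit value: reading a single byte at that offset would silently truncate the length of any string longer than 255 characters.</p>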
<p>With this information in hand, it is easy to detect empty strings or copy the UNICODE string into a simple <strong>char</strong>* buffer:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">byte</span><span class="o">*</span> <span class="n">pLength</span> <span class="o">=</span> <span class="n">GetPointerAfterNBytes</span><span class="p">(</span><span class="n">managedReference</span><span class="p">,</span> <span class="n">_stringLengthOffset</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">stringLength</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ULONG</span><span class="o">*</span><span class="p">)</span><span class="n">pLength</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">stringLength</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">strcpy_s</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="s">&#34;empty string&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">break</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">byte</span><span class="o">*</span> <span class="n">pBuffer</span> <span class="o">=</span> <span class="n">GetPointerAfterNBytes</span><span class="p">(</span><span class="n">managedReference</span><span class="p">,</span> <span class="n">_stringBufferOffset</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="c1">//                                                               V-- to copy the trailing \0
</span></span></span><span class="line"><span class="cl"><span class="o">::</span><span class="n">WideCharToMultiByte</span><span class="p">(</span><span class="n">CP_ACP</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="p">(</span><span class="n">WCHAR</span><span class="o">*</span><span class="p">)</span><span class="n">pBuffer</span><span class="p">,</span> <span class="n">stringLength</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">charCount</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>GetPointerAfterNBytes</strong> function simply helps me deal with pointer arithmetic in C++:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">byte</span><span class="o">*</span> <span class="nf">GetPointerAfterNBytes</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span> <span class="n">pAddress</span><span class="p">,</span> <span class="n">ULONG</span> <span class="n">byteCount</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">return</span> <span class="p">((</span><span class="n">byte</span><span class="o">*</span><span class="p">)</span><span class="n">pAddress</span><span class="p">)</span> <span class="o">+</span> <span class="n">byteCount</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The next post will describe how to get the value of array parameters and the basics of extracting field values from a reference type instance.</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://docs.microsoft.com/en-us/archive/blogs/davbr/sample-a-signature-blob-parser-for-your-profiler?WT.mc_id=DT-MVP-5003325">Sample: A Signature Blob Parser for your Profiler</a></li>
<li><a href="https://mattwarren.org/2016/05/31/Strings-and-the-CLR-a-Special-Relationship/">Strings and the CLR — a Special Relationship</a> by Matt Warren</li>
<li>Episode 1: <a href="/posts/2021-08-07_start-journey-into-the/">Start a journey into the .NET Profiling APIs</a></li>
<li>Episode 2: <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">Dealing with Modules, Assemblies and Types with CLR profiling API</a></li>
<li>Episode 3: <a href="/posts/2021-10-12_decyphering-method-signature-w/">Decyphering methods signature with .NET Profiling APIs</a></li>
</ul>
]]></content:encoded></item><item><title>Decyphering methods signature with .NET profiling APIs</title><link>https://chrisnas.github.io/posts/2021-10-12_decyphering-method-signature-w/</link><pubDate>Tue, 12 Oct 2021 13:23:15 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-10-12_decyphering-method-signature-w/</guid><description>The question answered by this post is how to build the signature of the method given a FunctionID.</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>After <a href="/posts/2021-08-07_start-journey-into-the/">introducing</a> the CLR profiling API by tracing managed method calls, then <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">dealing with assemblies and types</a>, it is time to look at method signatures. Remember that the starting point is the <strong>FunctionID</strong> received by the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/functionenter2-function?WT.mc_id=DT-MVP-5003325"><strong>Enter</strong></a> callback each time a method is executed.</p>
<p>The question answered by this post is how to build the signature of the method given a <strong>FunctionID</strong>.</p>
<p>A method signature is built from its return type (or void), its name and a list of parameters. All these details are stored in the module metadata generated by the C# compiler. So the first step is to get the metadata token corresponding to a <strong>FunctionID</strong> thanks to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getfunctioninfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetFunctionInfo</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">mdToken</span> <span class="n">token</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="n">hr</span> <span class="p">=</span> <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetFunctionInfo</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">classId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">moduleId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">token</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Next, use the <strong>IMetaDataImport</strong> corresponding to the module to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport-getmethodprops-method?WT.mc_id=DT-MVP-5003325"><strong>GetMethodProps</strong></a> and pass the function metadata token:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">mdTypeDef</span> <span class="n">type</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">WCHAR</span> <span class="n">name</span><span class="p">[</span><span class="m">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">attributes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">PCCOR_SIGNATURE</span> <span class="n">pSig</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">blobSize</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">codeRva</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">DWORD</span> <span class="n">flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaDataImport</span><span class="p">-&gt;</span><span class="n">GetMethodProps</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">token</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">type</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="p">-</span> <span class="m">1</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">size</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">attributes</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">pSig</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">blobSize</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">codeRva</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">flags</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="c1">// In addition to the function name, you can check the attributes to figure out whether the function is static:</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">((</span><span class="n">attributes</span> <span class="p">&amp;</span> <span class="n">mdStatic</span><span class="p">)</span> <span class="p">==</span> <span class="n">mdStatic</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="p">&lt;&lt;</span> <span class="s">&#34; static &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The return type and parameter types of the function are encoded in a binary format defined in the <a href="https://www.ecma-international.org/publications-and-standards/standards/ecma-335/">ECMA-335</a> specification. This binary blob is pointed to by the <strong>pSig</strong> parameter. Fortunately, you don’t have to implement a signature blob parser yourself. This has been done by <a href="https://docs.microsoft.com/en-us/archive/blogs/davbr/sample-a-signature-blob-parser-for-your-profiler?WT.mc_id=DT-MVP-5003325">Rico Mariani</a> or <a href="https://github.com/microsoftarchive/clrprofiler/blob/master/CLRProfiler/profilerOBJ/ProfilerInfo.cpp#L1838">Peter Sollich</a> and it relies on low-level helpers from cor.h such as <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/cor.h#L1944">CorSigUncompressData</a>.</p>
<p>Here is an example of a signature blob for a non-static method returning void and accepting a float and a double as parameters:</p>
<p><img loading="lazy" src="/posts/2021-10-12_decyphering-method-signature-w/1_6au1tjTSFLO-ljunAxv93w.png"></p>
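<p>As a rough illustration of how such a blob can be read, here is a minimal standalone sketch limited to primitive types. It assumes the ECMA-335 layout (calling convention, parameter count, return type, then one element type per parameter) and one-byte counts. <strong>PrimitiveName</strong>, <strong>DescribeSimpleSignature</strong> and <strong>kCallConvHasThis</strong> are hypothetical names; the numeric values are the ones defined in corhdr.h. It is not a replacement for the parsers mentioned above.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Same value as IMAGE_CEE_CS_CALLCONV_HASTHIS in corhdr.h,
// renamed so that the sketch compiles without the CLR headers.
constexpr std::uint8_t kCallConvHasThis = 0x20;

// Maps a few primitive ELEMENT_TYPE_xxx values to a readable name.
const char* PrimitiveName(std::uint8_t elementType)
{
    switch (elementType)
    {
        case 0x01: return "Void";
        case 0x02: return "Boolean";
        case 0x08: return "Int32";
        case 0x0C: return "Single";
        case 0x0D: return "Double";
        case 0x0E: return "String";
        case 0x1C: return "Object";
        default:   return "?";
    }
}

// Describes a non-generic signature that only uses primitive types.
std::string DescribeSimpleSignature(const std::uint8_t* sig, std::size_t size)
{
    assert(sig != nullptr && size >= 3);
    std::size_t pos = 0;
    const std::uint8_t callConv = sig[pos++];
    const std::uint8_t paramCount = sig[pos++];  // assumes a count below 128 (1 byte)
    assert(size >= 3u + paramCount);

    std::string text = ((callConv & kCallConvHasThis) != 0) ? "instance " : "static ";
    text += PrimitiveName(sig[pos++]);           // return type
    text += " (";
    for (std::uint8_t i = 0; i < paramCount; i++)
    {
        if (i > 0) text += ", ";
        text += PrimitiveName(sig[pos++]);       // one element type per parameter
    }
    text += ")";
    return text;
}
```

<p>For the 5 bytes 0x20 0x02 0x01 0x0C 0x0D, this sketch produces "instance Void (Single, Double)".</p>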
<p>The well-known types are encoded and available as <strong>ELEMENT_TYPE_xxx</strong> constants from corhdr.h.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-smalltalk" data-lang="smalltalk"><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_VOID</span><span class="nf">,</span>    <span class="c">&#34;Void&#34;</span><span class="err">,</span>          
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_BOOLEAN</span><span class="nf">,</span> <span class="c">&#34;Boolean&#34;</span><span class="err">,</span>       
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_CHAR</span><span class="nf">,</span>    <span class="c">&#34;Char&#34;</span><span class="err">,</span>          
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_I1</span><span class="nf">,</span>      <span class="c">&#34;SByte&#34;</span><span class="err">,</span>         
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_U1</span><span class="nf">,</span>      <span class="c">&#34;Byte&#34;</span><span class="err">,</span>          
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_I2</span><span class="nf">,</span>      <span class="c">&#34;Int16&#34;</span><span class="err">,</span>         
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_U2</span><span class="nf">,</span>      <span class="c">&#34;UInt16&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_I4</span><span class="nf">,</span>      <span class="c">&#34;Int32&#34;</span><span class="err">,</span>         
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_U4</span><span class="nf">,</span>      <span class="c">&#34;UInt32&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_I8</span><span class="nf">,</span>      <span class="c">&#34;Int64&#34;</span><span class="err">,</span>         
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_U8</span><span class="nf">,</span>      <span class="c">&#34;UInt64&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_R4</span><span class="nf">,</span>      <span class="c">&#34;Single&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_R8</span><span class="nf">,</span>      <span class="c">&#34;Double&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_STRING</span><span class="nf">,</span>  <span class="c">&#34;String&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_I</span><span class="nf">,</span>       <span class="c">&#34;IntPtr&#34;</span><span class="err">,</span>        
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_U</span><span class="nf">,</span>       <span class="c">&#34;UIntPtr&#34;</span><span class="err">,</span>       
</span></span><span class="line"><span class="cl"><span class="nc">ELEMENT_TYPE_OBJECT</span><span class="nf">,</span>  <span class="c">&#34;Object&#34;</span><span class="err">,</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For custom types identified as <strong>ELEMENT_TYPE_CLASS</strong> for reference types or <strong>ELEMENT_TYPE_VALUETYPE</strong> for value types, the metadata token of the type is “compressed” as part of the signature (see <a href="https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/cor.h#L1944"><strong>CorSigUncompressToken</strong></a> in cor.h for implementation details). If the type is defined in the same assembly as the method, you get a <em>TypeDef</em> token (starting with 02) used to call <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nf-rometadataapi-imetadataimport-gettypedefprops?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetTypeDefProps</strong></a>. If not, it will be a <em>TypeRef</em> token (starting with 01) used to call <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nf-rometadataapi-imetadataimport-gettyperefprops?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetTypeRefProps</strong></a>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">case</span> <span class="n">ELEMENT_TYPE_CLASS</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="k">case</span> <span class="n">ELEMENT_TYPE_VALUETYPE</span><span class="p">:</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">mdToken</span> <span class="n">token</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="kt">char</span> <span class="n">classname</span><span class="p">[</span><span class="m">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">classname</span><span class="p">[</span><span class="m">0</span><span class="p">]</span> <span class="p">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">signature</span> <span class="p">+=</span> <span class="n">CorSigUncompressToken</span><span class="p">(</span><span class="n">signature</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">token</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">typeToken</span> <span class="p">!=</span> <span class="n">NULL</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="p">*</span><span class="n">typeToken</span> <span class="p">=</span> <span class="n">token</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">HRESULT</span> <span class="n">hr</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">WCHAR</span> <span class="n">zName</span><span class="p">[</span><span class="m">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">TypeFromToken</span><span class="p">(</span><span class="n">token</span><span class="p">)</span> <span class="p">==</span> <span class="n">mdtTypeRef</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">mdToken</span> <span class="n">resScope</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">ULONG</span> <span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">hr</span> <span class="p">=</span> <span class="n">pMDImport</span><span class="p">-&gt;</span><span class="n">GetTypeRefProps</span><span class="p">(</span><span class="n">token</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="p">&amp;</span><span class="n">resScope</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="n">zName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="m">260</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="p">&amp;</span><span class="n">length</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="k">else</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">hr</span> <span class="p">=</span> <span class="n">pMDImport</span><span class="p">-&gt;</span><span class="n">GetTypeDefProps</span><span class="p">(</span><span class="n">token</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="n">zName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="m">260</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="n">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="n">NULL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">         <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="err">…</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>More on typeRef and typeDef later.</p>
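<p>To give an idea of what the token "compression" looks like, here is a simplified standalone sketch of the decoding, assuming the ECMA-335 rules: a compressed unsigned integer takes 1, 2 or 4 bytes, its two lowest bits select the table and the remaining bits are the row id. <strong>UncompressData</strong> and <strong>UncompressToken</strong> are hypothetical names for this sketch; in a profiler, you should use the helpers from cor.h.</p>

```cpp
#include <cassert>
#include <cstdint>

// Decodes one compressed unsigned integer and returns the number of
// bytes consumed (0 if the first byte is not a valid prefix).
std::uint32_t UncompressData(const std::uint8_t* p, std::uint32_t* value)
{
    assert(p != nullptr && value != nullptr);
    if ((p[0] & 0x80) == 0x00)   // 0xxxxxxx: 1 byte
    {
        *value = p[0];
        return 1;
    }
    if ((p[0] & 0xC0) == 0x80)   // 10xxxxxx: 2 bytes
    {
        *value = (std::uint32_t(p[0] & 0x3F) << 8) | p[1];
        return 2;
    }
    if ((p[0] & 0xE0) == 0xC0)   // 110xxxxx: 4 bytes
    {
        *value = (std::uint32_t(p[0] & 0x1F) << 24) | (std::uint32_t(p[1]) << 16)
               | (std::uint32_t(p[2]) << 8) | p[3];
        return 4;
    }
    return 0;
}

// Rebuilds a metadata token: the 2 lowest bits give the table
// and the other bits give the row id.
std::uint32_t UncompressToken(const std::uint8_t* p, std::uint32_t* token)
{
    assert(token != nullptr);
    static const std::uint32_t tables[4] =
    {
        0x02000000,  // TypeDef
        0x01000000,  // TypeRef
        0x1B000000,  // TypeSpec
        0x72000000   // BaseType
    };
    std::uint32_t data = 0;
    const std::uint32_t consumed = UncompressData(p, &data);
    *token = tables[data & 0x3] | (data >> 2);
    return consumed;
}
```

<p>For example, the single byte 0x49 has its two lowest bits set to 01 (TypeRef) and a row id of 0x12, so it is decoded as the token 0x01000012.</p>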
<p>This is nice, but since parameter names are not encoded in the signature blob, you need an extra step to get them. First, call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport-enumparams-method?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::EnumParams</strong></a> to get the <strong>mdParamDef</strong> metadata token of each parameter:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">HCORENUM</span> <span class="n">hEnum</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">paramCount</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdParamDef</span><span class="p">*</span> <span class="n">paramDefs</span> <span class="p">=</span> <span class="k">new</span> <span class="n">mdParamDef</span><span class="p">[</span><span class="n">argCount</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">pMetaData</span><span class="p">-&gt;</span><span class="n">EnumParams</span><span class="p">(&amp;</span><span class="n">hEnum</span><span class="p">,</span> <span class="n">token</span><span class="p">,</span> <span class="n">paramDefs</span><span class="p">,</span> <span class="n">argCount</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">paramCount</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">pMetaData</span><span class="p">-&gt;</span><span class="n">CloseEnum</span><span class="p">(</span><span class="n">hEnum</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="c1">// The next step is to iterate over the array of parameter definitions and call IMetaDataImport::GetParamProps to get each name and whether it is a value type; the ParseElementType helper extracts the type from the signature blob:</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">pos</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">WCHAR</span> <span class="n">name</span><span class="p">[</span><span class="m">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">length</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">attributes</span><span class="p">;</span>  <span class="c1">// values from CorParamAttr in CorHdr.h</span>
</span></span><span class="line"><span class="cl"><span class="n">DWORD</span> <span class="n">bIsValueType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="n">ULONG</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">	<span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">pSigBlob</span> <span class="p">!=</span> <span class="n">NULL</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">i</span> <span class="p">&lt;</span> <span class="p">(</span><span class="n">argCount</span><span class="p">)));</span>
</span></span><span class="line"><span class="cl">	<span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">	<span class="c1">// get the parameter name</span>
</span></span><span class="line"><span class="cl">	<span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaData</span><span class="p">-&gt;</span><span class="n">GetParamProps</span><span class="p">(</span><span class="n">paramDefs</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">NULL</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">pos</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">name</span><span class="p">)-</span><span class="m">1</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">length</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">attributes</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">bIsValueType</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">	<span class="c1">// note that we need to convert from WCHAR* to char* for the name</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">	<span class="c1">// get the parameter type</span>
</span></span><span class="line"><span class="cl">	<span class="n">buffer</span><span class="p">[</span><span class="m">0</span><span class="p">]</span> <span class="p">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">	<span class="n">pSigBlob</span> <span class="p">=</span> <span class="n">ParseElementType</span><span class="p">(</span><span class="n">pMetaData</span><span class="p">,</span> <span class="n">pSigBlob</span><span class="p">,</span> <span class="n">classTypeArgs</span><span class="p">,</span> <span class="n">methodTypeArgs</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">elementType</span><span class="p">,</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">buffer</span><span class="p">)</span> <span class="p">-</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="generic-methods-have-more-complicated-signatures-tocompute">Generic methods have more complicated signatures to compute</h2>
<p>The following figure shows how to handle generic methods:</p>
<p><img loading="lazy" src="/posts/2021-10-12_decyphering-method-signature-w/1_ZjGy-P8JWGn9aw_5khX2Pw.png"></p>
<p>The main change for such a generic method is that, in the signature blob, the first “calling convention” byte will have the <strong>IMAGE_CEE_CS_CALLCONV_GENERIC</strong> flag set and the number of generic arguments is stored right after it, just before the total parameter count. The other difference is how to deal with generic parameters in the blob: they all share the same <strong>ELEMENT_TYPE_MVAR</strong> value followed by a <em>position</em> (starting from 0). This is the position in the array of <strong>ClassID</strong>s returned by <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getfunctioninfo2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetFunctionInfo2</strong></a>.</p>
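<p>To make the layout of the blob header more concrete, here is a minimal, self-contained sketch that reads the calling convention, the optional generic parameter count and the parameter count. It assumes the layout described in ECMA-335 (II.23.2.1 for method signatures and II.23.2 for compressed integers). <strong>MethodSigHeader</strong>, <strong>ReadCompressedUInt</strong> and <strong>ReadMethodSigHeader</strong> are hypothetical names used for illustration only, and the constant is redefined locally instead of including corhdr.h.</p>

```cpp
#include <cstddef>
#include <cstdint>

// Same value as IMAGE_CEE_CS_CALLCONV_GENERIC in corhdr.h,
// redefined here so that the sketch compiles on its own.
constexpr uint8_t kCallConvGeneric = 0x10;

// Hypothetical helper type: what is stored before the return type in the blob.
struct MethodSigHeader
{
    uint8_t  callConv = 0;
    uint32_t genericParamCount = 0;  // stays 0 for non generic methods
    uint32_t paramCount = 0;
    size_t   headerSize = 0;         // bytes consumed before the return type
    bool     valid = false;
};

// Decode one ECMA-335 compressed unsigned integer (1, 2 or 4 bytes).
// Returns the number of bytes read, or 0 if the buffer is too short or malformed.
inline size_t ReadCompressedUInt(const uint8_t* p, size_t len, uint32_t& value)
{
    if (len == 0) return 0;
    if ((p[0] & 0x80) == 0)            // 0xxxxxxx: 1 byte
    {
        value = p[0];
        return 1;
    }
    if ((p[0] & 0xC0) == 0x80)         // 10xxxxxx: 2 bytes
    {
        if (len < 2) return 0;
        value = (static_cast<uint32_t>(p[0] & 0x3F) << 8) | p[1];
        return 2;
    }
    if ((p[0] & 0xE0) == 0xC0)         // 110xxxxx: 4 bytes
    {
        if (len < 4) return 0;
        value = (static_cast<uint32_t>(p[0] & 0x1F) << 24) |
                (static_cast<uint32_t>(p[1]) << 16) |
                (static_cast<uint32_t>(p[2]) << 8) |
                p[3];
        return 4;
    }
    return 0;
}

// Read the calling convention, the generic parameter count (only present
// when the generic flag is set) and the parameter count.
inline MethodSigHeader ReadMethodSigHeader(const uint8_t* sig, size_t len)
{
    MethodSigHeader header;
    if (sig == nullptr || len == 0) return header;

    size_t pos = 0;
    header.callConv = sig[pos];
    pos++;

    if ((header.callConv & kCallConvGeneric) != 0)
    {
        size_t read = ReadCompressedUInt(sig + pos, len - pos, header.genericParamCount);
        if (read == 0) return header;
        pos += read;
    }

    size_t read = ReadCompressedUInt(sig + pos, len - pos, header.paramCount);
    if (read == 0) return header;
    pos += read;

    header.headerSize = pos;
    header.valid = true;
    return header;
}
```

<p>With a blob starting with the bytes 0x10 0x01 0x01 (a static generic method with one generic parameter and one parameter), this sketch reports 1 generic parameter, 1 parameter and a header size of 3 bytes. The return type and the parameter types follow.</p>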
<p>The final code should look like the following:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">DWORD</span> <span class="n">bIsValueType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">currentGenericParam</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="n">ULONG</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">pSigBlob</span> <span class="p">!=</span> <span class="n">NULL</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">i</span> <span class="p">&lt;</span> <span class="p">(</span><span class="n">argCount</span><span class="p">)));</span>
</span></span><span class="line"><span class="cl">   <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// get the parameter name</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="p">=</span> <span class="n">pMetaData</span><span class="p">-&gt;</span><span class="n">GetParamProps</span><span class="p">(</span><span class="n">paramDefs</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">NULL</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">pos</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">name</span><span class="p">)-</span><span class="m">1</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">length</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">attributes</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">bIsValueType</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// note that we need to convert from WCHAR* to char* for the name</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// get the parameter type</span>
</span></span><span class="line"><span class="cl">   <span class="n">buffer</span><span class="p">[</span><span class="m">0</span><span class="p">]</span> <span class="p">=</span> <span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// in case of generic function, get the type details from the runtime and not from the metadata</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// !! we don&#39;t know in advance which parameter is a generic parameter and this is given by elementType == ELEMENT_TYPE_MVAR</span>
</span></span><span class="line"><span class="cl">   <span class="n">pSigBlob</span> <span class="p">=</span> <span class="n">ParseElementType</span><span class="p">(</span><span class="n">pMetaData</span><span class="p">,</span> <span class="n">pSigBlob</span><span class="p">,</span> <span class="n">classTypeArgs</span><span class="p">,</span> <span class="n">methodTypeArgs</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">elementType</span><span class="p">,</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">buffer</span><span class="p">)</span> <span class="p">-</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">((</span><span class="n">methodTypeArgs</span> <span class="p">!=</span> <span class="n">NULL</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">elementType</span> <span class="p">==</span> <span class="n">ELEMENT_TYPE_MVAR</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">ModuleID</span> <span class="n">moduleId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">mdTypeDef</span> <span class="n">mdType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">hr</span> <span class="p">=</span> <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetClassIDInfo2</span><span class="p">(</span><span class="n">methodTypeArgs</span><span class="p">[</span><span class="n">currentGenericParam</span><span class="p">],</span> <span class="p">&amp;</span><span class="n">moduleId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">mdType</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="n">WCHAR</span> <span class="n">paramTypeName</span><span class="p">[</span><span class="m">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">         <span class="n">IMetaDataImport2</span><span class="p">*</span> <span class="n">pImport2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">         <span class="n">hr</span> <span class="p">=</span> <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetModuleMetaData</span><span class="p">(</span><span class="n">moduleId</span><span class="p">,</span> <span class="n">ofRead</span><span class="p">,</span> <span class="n">IID_IMetaDataImport2</span><span class="p">,</span> <span class="n">reinterpret_cast</span><span class="p">&lt;</span><span class="n">IUnknown</span><span class="p">**&gt;(&amp;</span><span class="n">pImport2</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">         <span class="n">ULONG</span> <span class="n">sigBlobLen</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">         <span class="c1">// NOTE: get elementType from type name because the metadata can&#39;t give us the instantiated generic in the blob signature</span>
</span></span><span class="line"><span class="cl">         <span class="n">hr</span> <span class="p">=</span> <span class="n">pImport2</span><span class="p">-&gt;</span><span class="n">GetTypeDefProps</span><span class="p">(</span><span class="n">mdType</span><span class="p">,</span> <span class="n">paramTypeName</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">paramTypeName</span><span class="p">)-</span><span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">         <span class="n">pImport2</span><span class="p">-&gt;</span><span class="n">Release</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="n">currentGenericParam</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>One last detail before looking at the parameter values in the next post: for ALL reference types, the same implementation is generated by the JIT compiler. If you think about it, there is no need to implement a <strong>List&lt;string&gt;</strong> in a different way than a <strong>List</strong> of any other reference type: they all deal with references (i.e. addresses). The name picked by the CLR team to identify this “generic reference type” is <strong>System.__Canon</strong>, which stands for “canonical”. So expect to receive that type name a lot!</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://docs.microsoft.com/en-us/archive/blogs/davbr/sample-a-signature-blob-parser-for-your-profiler?WT.mc_id=DT-MVP-5003325">Sample: A Signature Blob Parser for your Profiler</a></li>
<li>Episode 1: <a href="/posts/2021-08-07_start-journey-into-the/">Start a journey into the .NET Profiling APIs</a></li>
<li>Episode 2: <a href="/posts/2021-09-06_dealing-with-modules-assemblie/">Dealing with Modules, Assemblies and Types with CLR Profiling APIs</a></li>
</ul>
]]></content:encoded></item><item><title>Dealing with Modules, Assemblies and Types with CLR Profiling APIs</title><link>https://chrisnas.github.io/posts/2021-09-06_dealing-with-modules-assemblie/</link><pubDate>Mon, 06 Sep 2021 06:01:13 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-09-06_dealing-with-modules-assemblie/</guid><description>This episode explains how to get the type name and generic parameters hidden behind a ClassID with the .NET Profiling API</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>In <a href="/posts/2021-08-07_start-journey-into-the/">the first post</a> of this series dedicated to the CLR Profiling API, you have seen how to get a <strong>FunctionID</strong> each time a managed method is executed in a .NET application. As David Broman (source of most of the profiling implementation details at Microsoft) explains, a <strong>FunctionID</strong> is a pointer to an internal data structure of the CLR called a <strong>MethodDesc</strong>. For us, it is just an opaque value that is usable in different CLR APIs. So what if you would like to know the name of the method behind this <strong>FunctionID</strong>?</p>
<p>Contrary to what you might think, this first question is not an easy one, especially if you would like to get the complete signature of the method such as what you get in the Visual Studio Call Stack panel:</p>
<blockquote>
<p>ProfilingTest.dll!PublicClass.ClassParamReturnClass(ClassType obj)</p>
</blockquote>
<p>You will have to get the module name (i.e. the assembly where the method type is defined), the type name, the method name and the type and name of each of its parameters.</p>
<p>This post deals with the notions of module, assembly and type in addition to introducing the .NET Metadata API.</p>
<h2 id="identifying-the-module-andassembly">Identifying the module and assembly</h2>
<p>I’m sure that most of you know what an assembly is: this is what gets generated when you compile a Class Library in Visual Studio. Easy answer. However, .NET (unlike Visual Studio) supports <a href="https://docs.microsoft.com/en-us/dotnet/framework/app-domains/build-multifile-assembly?WT.mc_id=DT-MVP-5003325">multi-module assemblies</a>, where a single assembly is made of several “<a href="https://docs.microsoft.com/en-us/dotnet/standard/assembly/?WT.mc_id=DT-MVP-5003325">modules</a>”. Each module can contain types and resources, and the assembly contains the manifest listing all the modules it is made of.</p>
<p>This is why the profiling API allows you to get both assembly and module. Let’s use <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getfunctioninfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetFunctionInfo</strong></a> to find out which module and assembly is implementing the type of a given <strong>FunctionID</strong>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">ClassID</span> <span class="n">classId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ModuleID</span> <span class="n">moduleId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdToken</span> <span class="n">mdtokenFunction</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetFunctionInfo</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">classId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">moduleId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">mdtokenFunction</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now that you have a <strong>ModuleID</strong>, you can call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getmoduleinfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetModuleInfo</strong></a> to get its name, load address and assembly. The usage pattern of this API is common in COM: first you call it to get the size of the buffer to copy the name and then you call it a second time with the newly allocated buffer:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">LPCBYTE</span> <span class="n">loadAddress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">nameLen</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">AssemblyID</span> <span class="n">assemblyId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetModuleInfo</span><span class="p">(</span><span class="n">moduleId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">loadAddress</span><span class="p">,</span> <span class="n">nameLen</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">nameLen</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">assemblyId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">WCHAR</span><span class="p">*</span> <span class="n">pszName</span> <span class="p">=</span> <span class="k">new</span> <span class="n">WCHAR</span><span class="p">[</span><span class="n">nameLen</span><span class="p">];</span>  <span class="c1">// count the trailing \0</span>
</span></span><span class="line"><span class="cl">   <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetModuleInfo</span><span class="p">(</span><span class="n">moduleId</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">loadAddress</span><span class="p">,</span> <span class="n">nameLen</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">nameLen</span><span class="p">,</span> <span class="n">pszName</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">                        <span class="p">&amp;</span><span class="n">assemblyId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="p">&lt;&lt;</span> <span class="n">L</span><span class="s">&#34;(&#34;</span> <span class="p">&lt;&lt;</span> <span class="n">pszName</span> <span class="p">&lt;&lt;</span> <span class="n">L</span><span class="s">&#34;)&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">delete</span> <span class="p">[]</span> <span class="n">pszName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">else</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="p">&lt;&lt;</span> <span class="n">L</span><span class="s">&#34;(UNKNOWN)&#34;</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that the module name is the full path name of the module file.</p>
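<p>The same “ask for the size, then fill the buffer” pattern can be wrapped so that the buffer is released automatically. The following sketch is not based on the profiling API: <strong>QueryName</strong> is a hypothetical stand-in that only mimics the calling convention of such functions (capacity, returned length, buffer) and the path it returns is made up. The point is just to show a <strong>std::vector</strong> used instead of a manual new/delete pair.</p>

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical stand-in for an API following the two-call pattern:
// - *needed always receives the required size in characters (including the trailing \0)
// - buffer is filled only when it is not null and capacity is greater than 0
inline bool QueryName(size_t capacity, size_t* needed, wchar_t* buffer)
{
    static const wchar_t kName[] = L"C:\\apps\\ProfilingTest.dll";  // made-up value
    const size_t required = sizeof(kName) / sizeof(kName[0]);       // counts the trailing \0

    if (needed != nullptr) *needed = required;

    if (buffer != nullptr && capacity > 0)
    {
        const size_t count = (capacity < required) ? capacity : required;
        std::wmemcpy(buffer, kName, count);
        buffer[count - 1] = L'\0';  // always terminate, even when truncated
    }
    return true;
}

// First call to get the size, second call to get the content.
// The vector owns the buffer: no delete [] to forget on an early return.
inline std::wstring GetName()
{
    size_t needed = 0;
    if (!QueryName(0, &needed, nullptr) || needed == 0)
        return L"(UNKNOWN)";

    std::vector<wchar_t> buffer(needed);
    if (!QueryName(buffer.size(), &needed, buffer.data()))
        return L"(UNKNOWN)";

    return std::wstring(buffer.data());
}
```

<p>With a real profiling API call, the HRESULT returned by each call would be checked with SUCCEEDED instead of the bool used by this stand-in.</p>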
<p>Here is the code that calls <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getassemblyinfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetAssemblyInfo</strong></a> to get the assembly name now that you have the <strong>AssemblyID</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">hr</span> <span class="p">=</span> <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetAssemblyInfo</span><span class="p">(</span><span class="n">assemblyId</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">nameLen</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SUCCEEDED</span><span class="p">(</span><span class="n">hr</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">WCHAR</span><span class="p">*</span> <span class="n">pszName</span> <span class="p">=</span> <span class="k">new</span> <span class="n">WCHAR</span><span class="p">[</span><span class="n">nameLen</span><span class="p">];</span>  <span class="c1">// count the trailing \0</span>
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="p">=</span> <span class="n">pInfo</span><span class="p">-&gt;</span><span class="n">GetAssemblyInfo</span><span class="p">(</span><span class="n">assemblyId</span><span class="p">,</span> <span class="n">nameLen</span><span class="p">,</span> <span class="p">&amp;</span><span class="n">nameLen</span><span class="p">,</span> <span class="n">pszName</span><span class="p">,</span> <span class="n">NULL</span><span class="p">,</span> <span class="n">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="p">&lt;&lt;</span> <span class="n">pszName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">delete</span> <span class="p">[]</span> <span class="n">pszName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">else</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="p">&lt;&lt;</span> <span class="n">L</span><span class="s">&#34;&lt;UNKNOWN&gt;&#34;</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The assembly name does not contain the file extension such as .dll or .so.</p>
<h2 id="id-or-token-it-depends-on-which-profiling-api-touse">ID or Token: it depends on which profiling API to use</h2>
<p>It is important to discuss what kind of information you get from the different profiling APIs. Like <strong>FunctionID</strong>, <strong>ClassID</strong> and <strong>ModuleID</strong> are opaque pointers to CLR internal data structures. They are used by the runtime to map into memory the metadata generated by the compiler. The metadata identifiers are usually referred to as “tokens” and the <strong>mdToken</strong> type simply stands for “metadata token”. Unlike the different <strong>xxxID</strong> types, whose values are different each time the code runs, the metadata tokens stay the same because they come from the compiled assembly. While debugging, it is good to be able to compare the tokens you get against their corresponding values in an assembly. As an example, here is what you get with ILSpy while browsing the metadata:</p>
<p><img loading="lazy" src="/posts/2021-09-06_dealing-with-modules-assemblie/1_EBaPkQAwCdKlDEH790bb8g.png"></p>
<p>The kind of metadata is encoded into the first 2 hexadecimal digits of a token (i.e. its high byte) so it is easy to see what you are manipulating. The 06 prefix tells you that you are dealing with a method:</p>
<p><img loading="lazy" src="/posts/2021-09-06_dealing-with-modules-assemblie/1_V6ZnzD7VLwBbzBnsXTirIw.png"></p>
<p>Instead of <strong>ICorProfilerInfo</strong>, you need to use <a href="https://docs.microsoft.com/en-us/windows/win32/api/rometadataapi/nn-rometadataapi-imetadataimport?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport</strong></a> to access information behind the metadata tokens. Since the metadata is bound to a given module, you have to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getmodulemetadata-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetModuleMetaData</strong></a> to get the implementation corresponding to a given <strong>ModuleID</strong>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">IMetaDataImport</span><span class="o">*</span> <span class="n">pMetaDataImport</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="n">hr</span> <span class="o">=</span> <span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetModuleMetaData</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">   <span class="n">moduleId</span><span class="p">,</span> <span class="n">ofRead</span><span class="p">,</span> <span class="n">IID_IMetaDataImport</span><span class="p">,</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">IUnknown</span><span class="o">**&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pMetaDataImport</span><span class="p">));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For the rest of the series, I will do my best to present which profiling/metadata API to use for what purpose. And in some cases, you will need both.</p>
<h2 id="identifying-thetype">Identifying the type</h2>
<p>After the module details, let’s see what we can get for the type that implements a given <strong>FunctionID</strong>. For one of my tests, I defined the following C# generic type:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">GenericPublicClass</span><span class="p">&lt;</span><span class="n">K</span><span class="p">,</span> <span class="n">V</span><span class="p">&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span> 
</span></span><span class="line"><span class="cl"><span class="na">   [MethodImpl(MethodImplOptions.NoInlining)]</span>
</span></span><span class="line"><span class="cl">   <span class="kd">public</span> <span class="kt">string</span> <span class="n">Store</span><span class="p">(</span><span class="n">K</span> <span class="n">key</span><span class="p">,</span> <span class="n">IEnumerable</span><span class="p">&lt;</span><span class="n">V</span><span class="p">&gt;</span> <span class="n">val</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span> <span class="s">$&#34;{key} = {val.Count()} items&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It can be used like the following:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">g1</span> <span class="p">=</span> <span class="k">new</span> <span class="n">GenericPublicClass</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">int</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl"><span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">g1</span><span class="p">.</span><span class="n">Store</span><span class="p">(</span><span class="s">&#34;secret&#34;</span><span class="p">,</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[]</span> <span class="p">{</span> <span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">,</span> <span class="m">4</span><span class="p">,</span> <span class="m">5</span> <span class="p">}));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Why am I starting with a generic type? Because it shows that the support of generics was added after the initial profiling API shipped and is not that well integrated. Basically, the first iteration of <strong>ICorProfilerInfo</strong> did not deal with generics but the second one, <strong>ICorProfilerInfo2</strong>, does.</p>
<p>But first, let’s summarize <a href="https://docs.microsoft.com/en-us/dotnet/standard/generics/?WT.mc_id=DT-MVP-5003325/">a few basics about generics</a>. When you define a generic type such as my <strong>GenericPublicClass</strong>, the C# compiler generates the metadata for the <em>generic</em> <em>type definition</em> that acts as a template. The <em>generic type parameters</em> (K and V in my case) are placeholders that will be replaced by <em>generic type arguments</em> to get a final <em>generic</em> <em>constructed type</em>.</p>
<p>The important part to understand for our purpose is that the metadata only contains generic type definitions:</p>
<p><img loading="lazy" src="/posts/2021-09-06_dealing-with-modules-assemblie/1_zbCei-dGvTiRWmemzVBkXA.png"></p>
<p>The name stored in the metadata ends with the ` character followed by the number of generic type parameters. This is what you get when you call <strong>GetType().Name</strong> on a generic instance in C#.</p>
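<p>To make this naming convention concrete, here is a minimal standalone C++ sketch that extracts the number of generic type parameters from such a metadata name. <strong>GetGenericArity</strong> is a hypothetical helper, not part of the profiling or metadata APIs, and it assumes that the name is a null-terminated wide string:</p>

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical helper (not part of the profiling or metadata APIs):
// returns the number of generic type parameters encoded after the
// backtick in a metadata type name, or 0 when there is no such suffix.
unsigned long GetGenericArity(const wchar_t* metadataName)
{
   assert(metadataName != nullptr);

   // the arity suffix sits at the end of the name: look for the last backtick
   const wchar_t* backtick = std::wcsrchr(metadataName, L'`');
   if (backtick == nullptr || backtick[1] == L'\0')
      return 0;

   unsigned long arity = 0;
   for (const wchar_t* p = backtick + 1; *p != L'\0'; p++)
   {
      if (*p < L'0' || *p > L'9')
         return 0; // not a well-formed arity suffix

      arity = arity * 10 + static_cast<unsigned long>(*p - L'0');
   }

   return arity;
}
```

<p>With this helper, <strong>GenericPublicClass`2</strong> gives 2 while a non-generic name such as <strong>PublicClass</strong> gives 0.</p>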
<p>As shown earlier, <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-getfunctioninfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo::GetFunctionInfo</strong></a> is used to get the <strong>ClassID</strong> of the type implementing the given <strong>FunctionID</strong>. Unfortunately, in case of a generic type, it returns <strong>S_OK</strong> but the <strong>ClassID</strong> you get is 0. In that case, you know you have to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getfunctioninfo2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetFunctionInfo2</strong></a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">HRESULT</span> <span class="nf">GetFunctionInfo2</span><span class="p">(</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">in</span><span class="p">]</span>  <span class="n">FunctionID</span> <span class="n">funcId</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">in</span><span class="p">]</span>  <span class="n">COR_PRF_FRAME_INFO</span> <span class="n">frameInfo</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">out</span><span class="p">]</span> <span class="n">ClassID</span> <span class="o">*</span><span class="n">pClassId</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">out</span><span class="p">]</span> <span class="n">ModuleID</span> <span class="o">*</span><span class="n">pModuleId</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">out</span><span class="p">]</span> <span class="n">mdToken</span> <span class="o">*</span><span class="n">pToken</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">in</span><span class="p">]</span>  <span class="n">ULONG32</span> <span class="n">cTypeArgs</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">out</span><span class="p">]</span> <span class="n">ULONG32</span> <span class="o">*</span><span class="n">pcTypeArgs</span><span class="p">,</span>  
</span></span><span class="line"><span class="cl">    <span class="p">[</span><span class="n">out</span><span class="p">]</span> <span class="n">ClassID</span> <span class="n">typeArgs</span><span class="p">[]</span>
</span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You have the <strong>FunctionID</strong> but not the <strong>COR_PRF_FRAME_INFO</strong>… You need to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo3-getfunctionenter3info-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo3::GetFunctionEnter3Info</strong></a> to get it from the <strong>COR_PRF_ELT_INFO</strong> given by the enter stub. Here is the final code to get a <strong>ClassID</strong> for a generic type:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">classId</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// Call GetFunctionEnter3Info to get the COR_PRF_FRAME_INFO* needed by GetFunctionInfo2 
</span></span></span><span class="line"><span class="cl">   <span class="c1">// as a second parameter and get the instantiated generic argument types. 
</span></span></span><span class="line"><span class="cl">   <span class="c1">// Otherwise, you would get &lt;K, V&gt; instead of &lt;string, int&gt; for example
</span></span></span><span class="line"><span class="cl">   <span class="n">COR_PRF_FRAME_INFO</span> <span class="n">frameInfo</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span> <span class="n">nbArgumentInfo</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// NOTE: &amp;nbArgumentInfo must be passed or the method returns an invalid argument error (E_INVALIDARG)
</span></span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetFunctionEnter3Info</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="n">eltInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">frameInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">nbArgumentInfo</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// NOTE: hr will fail will insuffisant buffer size in case of generic but the frameInfo will be correct
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">hr</span> <span class="o">=</span> <span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetFunctionInfo2</span><span class="p">(</span><span class="n">functionId</span><span class="p">,</span> <span class="n">frameInfo</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">classId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">moduleId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mdtokenFunction</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="c1">// from here, we are sure to have a valid ClassID
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Here is a summary of the relationships between the different IDs with the corresponding APIs to call:</p>
<p><img loading="lazy" src="/posts/2021-09-06_dealing-with-modules-assemblie/1_U7o7D7K4u2OztCqK-xYrIQ.png"></p>
<h2 id="from-a-classid-to-a-classname">From a ClassID to a class name</h2>
<p>It is time to enter a complicated part of the story: how to get the “name” of the type that hides behind a <strong>ClassID</strong>. As you might guess, the first step is to figure out if it is a generic type and what are the corresponding type arguments. You have to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getclassidinfo2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetClassIDInfo2</strong></a> with the <strong>ClassID</strong> to get the metadata token of the type, the number of type arguments and the <strong>ClassID</strong> of these types if any. As usual with this kind of API, a first call is needed to get the number of type arguments so you can allocate the right sized array of <strong>ClassID</strong>. The second call will fill up the newly allocated array:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">mdTypeDef</span> <span class="n">mdType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ClassID</span> <span class="n">parentClassId</span><span class="p">;</span> <span class="c1">// not needed in our scenario 
</span></span></span><span class="line"><span class="cl"><span class="n">ULONG32</span> <span class="n">numGenericTypeArgs</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">ClassID</span><span class="o">*</span> <span class="n">genericTypeArgs</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetClassIDInfo2</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mdType</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">parentClassId</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">numGenericTypeArgs</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">numGenericTypeArgs</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">genericTypeArgs</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ClassID</span><span class="p">[</span><span class="n">numGenericTypeArgs</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetClassIDInfo2</span><span class="p">(</span><span class="n">classId</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mdType</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">parentClassId</span><span class="p">,</span> <span class="n">numGenericTypeArgs</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">numGenericTypeArgs</span><span class="p">,</span> <span class="n">genericTypeArgs</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since you obtained a metadata token, you will need the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport-interface?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport</strong></a> of the module where the type is defined to get details such as… its name. The <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport2-interface?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport2</strong></a> is required to enumerate the parameter types:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">IMetaDataImport2</span><span class="o">*</span> <span class="n">pMetaDataImport</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetModuleMetaData</span><span class="p">(</span><span class="n">moduleId</span><span class="p">,</span> <span class="n">ofRead</span><span class="p">,</span> <span class="n">IID_IMetaDataImport2</span><span class="p">,</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">IUnknown</span><span class="o">**&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pMetaDataImport</span><span class="p">));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Getting the type “name” is done by a call to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport-gettypedefprops-method?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetTypeDefProps</strong></a>, passing the metadata token corresponding to the <strong>ClassID</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">ULONG</span> <span class="n">length</span> <span class="o">=</span> <span class="n">bufferLen</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">DWORD</span> <span class="n">flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">mdTypeDef</span> <span class="n">mdBaseType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">std</span><span class="o">::</span><span class="n">wostringstream</span> <span class="n">oss</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">pszName</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="sa">L</span><span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">hr</span> <span class="o">=</span> <span class="n">pMetaDataImport</span><span class="o">-&gt;</span><span class="n">GetTypeDefProps</span><span class="p">(</span><span class="n">mdType</span><span class="p">,</span> <span class="n">pszName</span><span class="p">,</span> <span class="n">length</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">length</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">flags</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mdBaseType</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>But before jumping into the name, you need to take care of the case where you are dealing with a nested type (i.e. a type defined in another type). Checking the <strong>flags</strong> parameter is exactly what you need:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">IsTdNested</span><span class="p">(</span><span class="n">flags</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">mdToken</span> <span class="n">mdEnclosingClass</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">pMetaDataImport</span><span class="o">-&gt;</span><span class="n">GetNestedClassProps</span><span class="p">(</span><span class="n">mdType</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">mdEnclosingClass</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// create a new buffer to get the enclosing type name
</span></span></span><span class="line"><span class="cl">   <span class="n">WCHAR</span><span class="o">*</span> <span class="n">pszEnclosingTypeName</span> <span class="o">=</span> <span class="k">new</span> <span class="n">WCHAR</span><span class="p">[</span><span class="n">bufferLen</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">   <span class="n">GetTypeName</span><span class="p">(</span><span class="n">pInfo</span><span class="p">,</span> <span class="n">pMetaDataImport</span><span class="p">,</span> <span class="n">mdEnclosingClass</span><span class="p">,</span> <span class="n">numGenericTypeArgs</span><span class="p">,</span> <span class="n">genericTypeArgs</span><span class="p">,</span> <span class="n">pszEnclosingTypeName</span><span class="p">,</span> <span class="n">bufferLen</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="o">&lt;&lt;</span> <span class="n">pszEnclosingTypeName</span> <span class="o">&lt;&lt;</span> <span class="s">&#34;+&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="k">delete</span><span class="p">[]</span> <span class="n">pszEnclosingTypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>A call to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/metadata/imetadataimport-getnestedclassprops-method?WT.mc_id=DT-MVP-5003325"><strong>IMetaDataImport::GetNestedClassProps</strong></a> returns the metadata token of the enclosing type. The <strong>GetTypeName</strong> method that we are implementing is then called recursively to handle multiple levels of nesting.</p>
<p>If this is not a generic type, we are done. However, as already mentioned, the name of a generic type ends with the ` character followed by the number of type parameters. The following helper function gets rid of this suffix by truncating the name at the ` character:</p>
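<p>The recursion on enclosing types can be sketched outside of a profiler with a fake metadata table. <strong>FakeTypeDef</strong>, <strong>FakeMetadata</strong> and <strong>BuildNestedTypeName</strong> are hypothetical names used for illustration only: real code would query <strong>IMetaDataImport</strong> instead of a map, and the sketch assumes that the chain of enclosing types has no cycle:</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the metadata of a module: each type token maps
// to its simple name and to the token of its enclosing type (0 when the
// type is not nested). A real profiler would query IMetaDataImport instead.
struct FakeTypeDef
{
   std::wstring name;
   unsigned int enclosingToken;
};

using FakeMetadata = std::map<unsigned int, FakeTypeDef>;

// Builds "Outer+Middle+Inner" by resolving the enclosing type first,
// which mirrors the recursive GetTypeName call for nested types.
std::wstring BuildNestedTypeName(const FakeMetadata& metadata, unsigned int token)
{
   assert(token != 0 && "token 0 is reserved for 'not nested'");

   const auto it = metadata.find(token);
   if (it == metadata.end())
      return L"?"; // unknown token

   const FakeTypeDef& typeDef = it->second;
   if (typeDef.enclosingToken == 0)
      return typeDef.name; // top-level type: end of the recursion

   return BuildNestedTypeName(metadata, typeDef.enclosingToken) + L"+" + typeDef.name;
}
```

<p>For a type <strong>Inner</strong> nested in <strong>Middle</strong>, itself nested in <strong>Outer</strong>, this sketch returns <strong>Outer+Middle+Inner</strong>, using the same “+” separator as the code above.</p>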
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="kt">void</span> <span class="nf">FixGenericSyntax</span><span class="p">(</span><span class="n">WCHAR</span><span class="o">*</span> <span class="n">name</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">ULONG</span> <span class="n">currentCharPos</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="k">while</span> <span class="p">(</span><span class="n">name</span><span class="p">[</span><span class="n">currentCharPos</span><span class="p">]</span> <span class="o">!=</span> <span class="sa">L</span><span class="sc">&#39;\0&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="k">if</span> <span class="p">(</span><span class="n">name</span><span class="p">[</span><span class="n">currentCharPos</span><span class="p">]</span> <span class="o">==</span> <span class="sa">L</span><span class="sc">&#39;`&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="c1">// skip `xx 
</span></span></span><span class="line"><span class="cl">         <span class="n">name</span><span class="p">[</span><span class="n">currentCharPos</span><span class="p">]</span> <span class="o">=</span> <span class="sa">L</span><span class="sc">&#39;\0&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">         <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="n">currentCharPos</span><span class="o">++</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The next step is to rebuild the list of generic argument types using the array of <strong>ClassID</strong> returned by <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getclassidinfo2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetClassIDInfo2</strong></a>. The most complicated part of the loop is to avoid adding a “,” after the last argument type:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">numGenericTypeArgs</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// replace &#34;`xx&#34; by &#34;&lt;&#34;
</span></span></span><span class="line"><span class="cl">   <span class="n">FixGenericSyntax</span><span class="p">(</span><span class="n">pszName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="o">&lt;&lt;</span> <span class="n">pszName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;&lt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="k">for</span> <span class="p">(</span><span class="n">size_t</span> <span class="n">currentGenericArg</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">currentGenericArg</span> <span class="o">&lt;</span> <span class="n">numGenericTypeArgs</span><span class="p">;</span> <span class="n">currentGenericArg</span><span class="o">++</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">ClassID</span> <span class="n">argClassId</span> <span class="o">=</span> <span class="n">genericTypeArgs</span><span class="p">[</span><span class="n">currentGenericArg</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">      <span class="n">ModuleID</span> <span class="n">argModuleId</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="n">pInfo</span><span class="o">-&gt;</span><span class="n">GetClassIDInfo2</span><span class="p">(</span><span class="n">argClassId</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">argModuleId</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">WCHAR</span> <span class="n">argTypeName</span><span class="p">[</span><span class="mi">260</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">      <span class="n">GetTypeName</span><span class="p">(</span><span class="n">pInfo</span><span class="p">,</span> <span class="n">argClassId</span><span class="p">,</span> <span class="n">argModuleId</span><span class="p">,</span> <span class="n">argTypeName</span><span class="p">,</span> <span class="n">ARRAY_LEN</span><span class="p">(</span><span class="n">argTypeName</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">oss</span> <span class="o">&lt;&lt;</span> <span class="n">argTypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="k">if</span> <span class="p">(</span><span class="n">currentGenericArg</span> <span class="o">&lt;</span> <span class="n">numGenericTypeArgs</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">         <span class="n">oss</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;, &#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">oss</span> <span class="o">&lt;&lt;</span> <span class="sa">L</span><span class="s">&#34;&gt;&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo2-getclassidinfo2-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo2::GetClassIDInfo2</strong></a> on the <strong>ClassID</strong> of each type argument to obtain the <strong>ModuleID</strong> where the type is defined and call our <strong>GetTypeName</strong> helper method.</p>
<p>The next episode will analyze method signatures.</p>
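<p>The string manipulation itself can be tried without a profiler. The following standalone sketch combines the truncation at the ` character with the separator handling. <strong>FormatGenericTypeName</strong> is a hypothetical helper, and it assumes that the names of the type arguments have already been resolved:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds a C#-like name such as "Name<Arg1, Arg2>" from
// a metadata name ("Name`2") and the already resolved type argument names.
std::wstring FormatGenericTypeName(const std::wstring& metadataName, const std::vector<std::wstring>& typeArgNames)
{
   assert(!metadataName.empty());

   if (typeArgNames.empty())
      return metadataName; // not a generic type: keep the name as is

   std::wostringstream oss;

   // drop the "`xx" suffix (the whole name is kept if there is no backtick)
   oss << metadataName.substr(0, metadataName.find(L'`'));
   oss << L"<";

   for (std::size_t i = 0; i < typeArgNames.size(); i++)
   {
      // the separator goes before every argument except the first one
      if (i > 0)
         oss << L", ";

      oss << typeArgNames[i];
   }

   oss << L">";
   return oss.str();
}
```

<p>Writing the separator before each argument except the first one is an alternative to checking for the last argument: the loop no longer depends on the total number of arguments.</p>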
<h2 id="references">References</h2>
<ul>
<li><a href="https://docs.microsoft.com/en-us/dotnet/standard/generics/?WT.mc_id=DT-MVP-5003325/">Basics about generic types</a></li>
<li>Episode 1: <a href="/posts/2021-08-07_start-journey-into-the/">Start a journey into the .NET Profiling APIs</a></li>
</ul>
]]></content:encoded></item><item><title>Start a journey into the .NET Profiling APIs</title><link>https://chrisnas.github.io/posts/2021-08-07_start-journey-into-the/</link><pubDate>Sat, 07 Aug 2021 12:21:39 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-08-07_start-journey-into-the/</guid><description>This is the first episode of a series digging into the .NET Profiling API to trace each method call parameters and return value.</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>When I want to dig into a new API, I implement a real world scenario. This is exactly what I did for the .NET native profiling API. I want to know how to get the parameters and the return value of any method call during the lifetime of any .NET application. The expected result would be something like:</p>
<pre tabindex="0"><code>Enter PublicClass::ClassParamReturnClass
this = 0x6f97e190 (8)
ClassType obj = 0x6f97e488 (8)
| int32 &lt;IntProperty&gt;k__BackingField = 84
| int32 intField = 42
| String stringField = 43
ClassType obj = 0x0000023A475BBAD8
</code></pre><pre tabindex="0"><code>Leave PublicClass::ClassParamReturnClass
| int32 &lt;IntProperty&gt;k__BackingField = 170
| int32 intField = 85
| String stringField = 86
returns 0x0000023A475BBBB0
</code></pre><p>when the following method is executed:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">ClassType</span> <span class="n">ClassParamReturnClass</span><span class="p">(</span><span class="n">ClassType</span> <span class="n">obj</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">return</span> <span class="k">new</span> <span class="n">ClassType</span><span class="p">(</span><span class="n">obj</span><span class="p">.</span><span class="n">IntProperty</span> <span class="p">+</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You will have to write native C/C++ code to leverage the .NET Profiling API, as <a href="https://twitter.com/JohnWintellect">John Robbins</a> explained in his 2003 <a href="https://www.amazon.com/Debugging-Applications-Microsoft-Developer-Reference/dp/0735615365">Debugging Applications book</a>. Even though Microsoft provides a <a href="https://github.com/Microsoft/clr-samples/tree/master/ProfilingAPI/ELTProfiler">code sample</a> for that, it only shows how to get notified when a function is called or exited; it says nothing about the function’s type, its name, its parameter values or its return value. This series will detail both the .NET profiling API and the metadata API (i.e. the native reflection API).</p>
<p>Let’s start with the basics of .NET Profiling. As shown in the following figure from the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/profiling-overview?WT.mc_id=DT-MVP-5003325">Microsoft Profiling documentation</a>, with the right <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/setting-up-a-profiling-environment?WT.mc_id=DT-MVP-5003325">environment configuration</a>, the CLR will load a COM-like object implementing the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-interface?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback</strong></a> interface and notify it of almost everything happening in a .NET application from startup to shutdown.</p>
<p><img loading="lazy" src="/posts/2021-08-07_start-journey-into-the/1_Qw5R9VvFCPePeH_g-6IgKg.png"></p>
<p>Since .NET was launched, more and more notifications have been added by versioning the interface up to <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback9-interface?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback9</strong></a>. Welcome to the usual COM world! I recommend that you watch <a href="https://twitter.com/zodiacon">Pavel Yosifovich</a>’s session about <a href="https://youtu.be/TqS4OEWn6hQ?t=30">writing a CLR Profiler in an hour</a> to get an overview. I will use <a href="https://github.com/zodiacon/DotNextMoscow2019">his sample solution</a> as a starting point.</p>
<p>For the sake of performance, you tell the runtime which events you are interested in (i.e. which <strong>ICorProfilerCallback</strong> functions will be called), and you simply have to return <strong>S_OK</strong> from all the other functions. This setup is done in your implementation of <strong>ICorProfilerCallback::Initialize</strong>. This is the first function called by the CLR, and it provides a parameter from which you need to <strong>QueryInterface</strong> a version of the <strong>ICorProfilerInfo</strong> interface (the current one is <strong>ICorProfilerInfo10</strong>). This interface provides functions to query information about the parameters passed to your <strong>ICorProfilerCallback</strong> functions (such as AppDomain, assembly, type, function, thread and so on).</p>
<p>The first <strong>ICorProfilerInfo</strong> function you will use is <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo-seteventmask-method?WT.mc_id=DT-MVP-5003325"><strong>SetEventMask</strong></a> to filter the <strong>ICorProfilerCallback</strong> functions that will be called by the runtime. It accepts a flag combination of values from the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/cor-prf-monitor-enumeration?WT.mc_id=DT-MVP-5003325"><strong>COR_PRF_MONITOR</strong></a> enumeration.</p>
<h2 id="assembly-code-is-needed-for-enterleavetailcall">Assembly code is needed for Enter/Leave/TailCall</h2>
<p>To get notified when a managed method is called or exits, you should pass:</p>
<pre tabindex="0"><code>COR_PRF_MONITOR_ENTERLEAVE | 
COR_PRF_ENABLE_FUNCTION_ARGS | 
COR_PRF_ENABLE_FUNCTION_RETVAL | 
COR_PRF_ENABLE_FRAME_INFO
</code></pre><p>to <strong>ICorProfilerInfo</strong>::<strong>SetEventMask</strong>. The first flag tells the runtime to call <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/profiling-global-static-functions?WT.mc_id=DT-MVP-5003325">static callbacks</a> (i.e. not exposed as functions of <strong>ICorProfilerCallback</strong>) when a managed method gets executed or returns. The other three flags ensure that these callbacks will receive enough information to extract method arguments and return value.</p>
<p>Unlike the other notifications that end up calling your <strong>ICorProfilerCallback</strong> functions, you need to register three special callbacks with the runtime via <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilerinfo3-setenterleavefunctionhooks3withinfo-method?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerInfo3</strong>::<strong>SetEnterLeaveFunctionHooks3WithInfo</strong></a>. For performance reasons, the .NET team asks you to write the prolog and epilog (i.e. saving/restoring CPU registers on/from the stack) yourself in assembly code instead of relying on the well-defined <a href="https://docs.microsoft.com/en-us/cpp/cpp/calling-conventions?WT.mc_id=DT-MVP-5003325">calling conventions</a> supported by the C/C++ compiler.</p>
<p>This is why you have to call <strong>ICorProfilerInfo3::SetEnterLeaveFunctionHooks3WithInfo</strong> and pass pointers to these “naked” functions. The Microsoft <a href="https://github.com/Microsoft/clr-samples/tree/master/ProfilingAPI/ELTProfiler"><em>ELTProfiler</em></a> sample implements the stubs both for x86 (as inline assembly embedded in a C++ file) and x64 (defined in an .asm file). For x64, you need to update your project file to add the following:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-xml" data-lang="xml"><span class="line"><span class="cl"><span class="nt">&lt;ImportGroup</span> <span class="na">Label=</span><span class="s">&#34;ExtensionSettings&#34;</span><span class="nt">&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&lt;Import</span> <span class="na">Project=</span><span class="s">&#34;$(VCTargetsPath)\BuildCustomizations\masm.props&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl"><span class="nt">&lt;/ImportGroup&gt;</span>
</span></span><span class="line"><span class="cl"><span class="nt">&lt;ItemGroup&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&lt;MASM</span> <span class="na">Include=</span><span class="s">&#34;../DotNext.Profiler.Shared/asm/windows/nakedcallbacks.asm&#34;</span>
</span></span><span class="line"><span class="cl">         <span class="na">Condition=</span><span class="s">&#34;&#39;$(Platform)&#39; == &#39;x64&#39;&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl"><span class="nt">&lt;/ItemGroup&gt;</span>
</span></span><span class="line"><span class="cl"><span class="nt">&lt;ImportGroup</span> <span class="na">Label=</span><span class="s">&#34;ExtensionTargets&#34;</span><span class="nt">&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="nt">&lt;Import</span> <span class="na">Project=</span><span class="s">&#34;$(VCTargetsPath)\BuildCustomizations\masm.targets&#34;</span> <span class="nt">/&gt;</span>
</span></span><span class="line"><span class="cl"><span class="nt">&lt;/ImportGroup&gt;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The nakedcallbacks.asm file contains the naked functions: each one wraps a call to a C++ stub function with the expected prolog and epilog written in assembly code. Here are the signatures of the stub functions from which you will be able to start working in C++:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">PROFILER_STUB</span> <span class="n">EnterStub</span><span class="p">(</span><span class="n">FunctionIDOrClientID</span> <span class="n">functionId</span><span class="p">,</span> <span class="n">COR_PRF_ELT_INFO</span> <span class="n">eltInfo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">PROFILER_STUB</span> <span class="n">LeaveStub</span><span class="p">(</span><span class="n">FunctionID</span> <span class="n">functionId</span><span class="p">,</span> <span class="n">COR_PRF_ELT_INFO</span> <span class="n">eltInfo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">PROFILER_STUB</span> <span class="n">TailcallStub</span><span class="p">(</span><span class="n">FunctionID</span> <span class="n">functionId</span><span class="p">,</span> <span class="n">COR_PRF_ELT_INFO</span> <span class="n">eltInfo</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>A function is identified by a <strong>FunctionID</strong> and this is where you start your adventure (I will come back later to the <strong>COR_PRF_ELT_INFO</strong> parameter). Note that for the <strong>EnterStub</strong> function, you need to get the <strong>FunctionID</strong> from the <strong>functionID</strong> field of the <strong>FunctionIDOrClientID</strong> union.</p>
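<p>To tie the pieces together, here is a minimal sketch of what the setup could look like when called from your <strong>ICorProfilerCallback::Initialize</strong> implementation. It assumes that the <em>cor.h</em> and <em>corprof.h</em> headers of the runtime are available and that you build with the Microsoft compiler. The <strong>SetupEnterLeave</strong> helper and the <strong>EnterNaked</strong>, <strong>LeaveNaked</strong> and <strong>TailcallNaked</strong> names are hypothetical placeholders for the naked functions of your .asm file; they are not taken from the sample:</p>
<pre tabindex="0"><code class="language-cpp" data-lang="cpp">#include &lt;cor.h&gt;
#include &lt;corprof.h&gt;

// Hypothetical names: naked functions implemented in assembly that
// save the registers, call the C++ stubs and restore the registers.
extern &#34;C&#34; void STDMETHODCALLTYPE EnterNaked(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo);
extern &#34;C&#34; void STDMETHODCALLTYPE LeaveNaked(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo);
extern &#34;C&#34; void STDMETHODCALLTYPE TailcallNaked(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo);

// Hypothetical helper to call from ICorProfilerCallback::Initialize
// with the IUnknown* received from the CLR.
HRESULT SetupEnterLeave(IUnknown* pICorProfilerInfoUnk, ICorProfilerInfo3** ppInfo)
{
   // get the interface that exposes SetEnterLeaveFunctionHooks3WithInfo
   HRESULT hr = pICorProfilerInfoUnk-&gt;QueryInterface(
      __uuidof(ICorProfilerInfo3), reinterpret_cast&lt;void**&gt;(ppInfo));
   if (FAILED(hr)) return hr;

   // ask for enter/leave notifications with arguments and return value
   hr = (*ppInfo)-&gt;SetEventMask(
      COR_PRF_MONITOR_ENTERLEAVE |
      COR_PRF_ENABLE_FUNCTION_ARGS |
      COR_PRF_ENABLE_FUNCTION_RETVAL |
      COR_PRF_ENABLE_FRAME_INFO);
   if (FAILED(hr)) return hr;

   // register the three naked functions as enter/leave/tailcall hooks
   return (*ppInfo)-&gt;SetEnterLeaveFunctionHooks3WithInfo(EnterNaked, LeaveNaked, TailcallNaked);
}
</code></pre><p>Returning the failed <strong>HRESULT</strong> to the caller lets <strong>Initialize</strong> decide what to do when the expected interface is not supported by the runtime.</p>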
<p>Once your hook callbacks have been registered via <strong>ICorProfilerInfo3::SetEnterLeaveFunctionHooks3WithInfo</strong>, it is still possible to decide whether or not a managed method call should trigger them. For that, you need to register a “mapper” function, called once per managed method, with one of these functions:</p>
<ul>
<li><code>HRESULT ICorProfilerInfo::SetFunctionIDMapper([in] FunctionIDMapper* pFunc); </code>
 with <code>UINT_PTR __stdcall Mapper(FunctionID functionId, BOOL* pHookFunction)</code></li>
<li><code>HRESULT ICorProfilerInfo3::SetFunctionIDMapper2([in] FunctionIDMapper2* pFunc, [in] void* clientData);</code>
 with <code>UINT_PTR __stdcall Mapper2(FunctionID functionId, void* clientData, BOOL* pHookFunction)</code></li>
</ul>
<p>The latter allows you to pass some “client data” to the mapper function such as a helper class to manipulate the received <strong>FunctionID</strong> or your profiler state.</p>
<p>If <strong>pHookFunction</strong> is set to <strong>TRUE</strong>, your enter/leave functions will be called and the returned <strong>UINT_PTR</strong> will be passed as the <strong>FunctionID</strong> parameter. This allows you to handle function name or signature computation in one single place, outside of the real profiling work done each time a method is called. If <strong>pHookFunction</strong> is set to <strong>FALSE</strong>, the enter/leave functions will never be called for that <strong>FunctionID</strong>. The mapper callback is called only once per <strong>FunctionID</strong>: this could be a good way to limit the performance impact if you just want to profile a small subset of methods.</p>
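<p>As an illustration, here is a minimal sketch of a mapper that only hooks a known subset of functions. The <strong>FunctionID</strong>, <strong>UINT_PTR</strong> and <strong>BOOL</strong> aliases are stand-ins so that the sketch compiles on its own (in a real profiler, they come from the Windows and <em>corprof.h</em> headers), and the <strong>ProfilerState</strong> structure is a hypothetical example of “client data”. How the set of functions gets filled (for example from the function names) is left out:</p>
<pre tabindex="0"><code class="language-cpp" data-lang="cpp">#include &lt;cstdint&gt;
#include &lt;unordered_set&gt;

// Stand-ins so that the sketch compiles on its own:
// in a real profiler, these types come from the Windows and corprof.h headers.
using FunctionID = std::uintptr_t;
using UINT_PTR = std::uintptr_t;
using BOOL = int;

// Hypothetical profiler state passed as &#34;client data&#34;
struct ProfilerState
{
   std::unordered_set&lt;FunctionID&gt; functionsToHook;
};

// Same shape as the FunctionIDMapper2 callback (the __stdcall calling
// convention is omitted here because it only matters on x86 Windows)
UINT_PTR Mapper2(FunctionID functionId, void* clientData, BOOL* pHookFunction)
{
   auto* state = static_cast&lt;ProfilerState*&gt;(clientData);

   // 1 (TRUE): the enter/leave hooks will be called for this function
   // 0 (FALSE): the hooks will never be called for this function
   *pHookFunction = (state-&gt;functionsToHook.count(functionId) != 0) ? 1 : 0;

   // the returned value is what the hooks will receive: keep the FunctionID as is
   return functionId;
}
</code></pre><p>In a real profiler, the address of such a state object would be passed as the <strong>clientData</strong> parameter of <strong>SetFunctionIDMapper2</strong>.</p>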
<h2 id="how-to-debug-yourprofiler">How to debug your profiler</h2>
<p>Before going any further, you need to know how to debug your C++ profiler code with Visual Studio. The first step is to write a simple .NET application to execute the method calls you want to intercept. The second natural step would be to set up the environment variables needed to inject your profiler:</p>
<p><img loading="lazy" src="/posts/2021-08-07_start-journey-into-the/1_v9kUxczdLpxQd6RWAUh8gg.png"></p>
<p>and also check the <em>Enable native code debugging</em> option:</p>
<p><img loading="lazy" src="/posts/2021-08-07_start-journey-into-the/1_wD4vkFJAaBBxlxMh3MqK1w.png"></p>
<p>If you start a debug session, the Visual Studio debugger will use both managed and native debugging APIs. Unfortunately, the managed debugging API does not allow breakpoints to be set in <strong>ICorProfilerCallback</strong> functions.</p>
<p>Instead, you need to start the C# test application with <strong>Debug | Start Without Debugging</strong> (<strong>CTRL+F5</strong>), not <strong>Debug | Start Debugging</strong> (<strong>F5</strong>), using the same environment variables, and then attach the debugger via <strong>Debug | Attach to Process</strong>. Select the process and click the <strong>Select</strong> button in the <strong>Attach to</strong> section:</p>
<p><img loading="lazy" src="/posts/2021-08-07_start-journey-into-the/1_C9gYCSz2ytVj2V8EvTHAhA.png"></p>
<p>To make attaching simple, add a <strong>Console.ReadLine</strong> call to your console application before the method calls you want to test.</p>
<p>The next post will show you how to extract information from a <strong>FunctionID</strong>.</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://www.youtube.com/watch?v=TqS4OEWn6hQ&amp;t=25s">Writing a .NET Core cross platform profiler in an hour</a> video and the corresponding <a href="https://github.com/zodiacon/DotNextMoscow2019">source code</a> from <a href="https://twitter.com/zodiacon">Pavel Yosifovich</a></li>
<li><a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/profiling-overview?WT.mc_id=DT-MVP-5003325">Microsoft Profiling documentation</a></li>
<li>Microsoft Enter/Leave <a href="https://github.com/Microsoft/clr-samples/tree/master/ProfilingAPI/ELTProfiler">code sample</a></li>
</ul>
]]></content:encoded></item><item><title>Profile memory allocations with Perfview</title><link>https://chrisnas.github.io/posts/2021-07-20_profile-memory-allocations-wit/</link><pubDate>Tue, 20 Jul 2021 07:14:59 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-07-20_profile-memory-allocations-wit/</guid><description>How to connect AllocationTick events to the corresponding callstacks thanks to the Microsoft Perfview free tool.</description><content:encoded><![CDATA[<hr>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_WDSmxes74rkDmJ-7qErfsg.jpeg"></p>
<p>I have already explained how to <a href="/posts/2020-04-18_build-your-own-net/">write your own allocation monitoring tool</a>. Each time a cumulative 100 KB has been allocated, the CLR emits an <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events#gcallocationtick_v2-event?WT.mc_id=DT-MVP-5003325">AllocationTick event</a> with the name of the last type allocated before the 100 KB threshold was reached and whether or not it was allocated in the LOH. This post shows you how to get these events with the corresponding callstacks thanks to the Microsoft <a href="https://github.com/microsoft/perfview/releases/">Perfview free tool</a>.</p>
<p>On Linux, things are a little bit more complicated because there is no Kernel provider to emit callstack events. Microsoft provides the <a href="https://github.com/microsoft/perfview/tree/main/src/perfcollect">perfcollect script</a> to get a zip file containing both the CLR events (collected by LTTng) and the callstacks (collected via perf). If you want to rely on EventPipe instead of LTTng, like <a href="https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace?WT.mc_id=DT-MVP-5003325">dotnet-trace</a> does, you could use the <a href="https://github.com/criteo-forks/perfview">Criteo fork of the perfcollect script</a> and the corresponding updated version of Perfview to open the generated .trace.zip file. Note that our <a href="https://github.com/microsoft/perfview/pull/1291">Pull Request</a> to the Microsoft Perfview repository is still pending…</p>
<p>If you are on Windows, you could use Perfview (menu <strong>Collect</strong> | <strong>Collect</strong> or <strong>Alt+C</strong> shortcut)</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_DIwnG2xjbodRUlAeI4Plew.png"></p>
<p>and click the <strong>Start Collection</strong> button. When the workflow you want to analyze is finished, click the same button (with a <strong>Stop Collection</strong> text this time) and the corresponding file should open up as a tree:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_kyVKLpKup_IT5s4uIDzDNA.png"></p>
<p>Note that with a Linux collection, the nodes might be different:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_8W8hiRCtebpuUHY8DnXfDA.png"></p>
<p>In both cases, you are interested in the events visible when you double-click the <strong>Events</strong> node. You could use the <strong>Filter</strong> textbox to easily find the AllocationTick line in the left panel. Then, right-click and select <strong>Open Any Stacks</strong>:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_dt9nQ5JKRcy_cBmWzXdXUw.png"></p>
<p>This action opens a new window that differs between a Windows and a Linux collection:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_PHbvwq4obudoKwcgb_nM9g.png"></p>
<p>The reason is simple: on Windows, events from ALL processes have been collected while only one process is targeted on Linux. This is why you have to double-click the process you want on Windows while it is already selected on Linux. You could also keep only the events from one process by entering its ID in the <strong>IncPats</strong> combo-box:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_9yRcGmwSS0jdRhfjlRuljQ.png"></p>
<p>Instead of staying on the <strong>CallTree</strong> tab, I recommend selecting the <strong>By Name</strong> tab and double-clicking the AllocationTick line:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_jzuA84byszCMZhvUlfUcdw.png"></p>
<p>This action switches to the <strong>Callers</strong> tab with AllocationTick selected:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_feYGSeRsDVBPA_QZfOaTTw.png"></p>
<p>Each line under the AllocationTick node starts with <strong>EventData TypeName</strong> followed by the allocation type name. <strong>EventData</strong> is the name of the event payload used by Perfview and <strong>TypeName</strong> is the property name in the payload.</p>
<p>The <strong>Inc</strong> column gives a hint about the distribution of the different allocations. Remember that the AllocationTick events provide a sampling of the allocations, not an exact picture, but it should be enough. In the previous screenshot, the majority of allocations are byte arrays (Byte[]).</p>
<p>Click the corresponding checkbox to open the Byte[] node:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_OVZZw9uGcq3kOJWcqgTTDA.png"></p>
<p>As you can guess, each line represents a different value of the <strong>Size</strong> property in the <strong>EventData</strong> payload. You don’t really care about the different values, so type <strong>EventData Size</strong> in the <strong>FoldPats</strong> combo-box to make them disappear:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_-CH9Ri3SMKpZMzJScf4X4g.png"></p>
<p>Most of the allocations are not in the LOH (i.e. less than 85,000 bytes with the default LOH threshold). When you click the <strong>Small</strong> checkbox, the different callstacks leading to these allocations appear in the tree, such as the <strong>Run</strong> method of the <strong>RandomAllocationAction</strong> class in the following screenshot:</p>
<p><img loading="lazy" src="/posts/2021-07-20_profile-memory-allocations-wit/1_WKCF7ncuLPe9JuRB3fyksA.png"></p>
<p>Don’t be scared by the raw state of the stack frame: read <a href="/posts/2021-03-02_how-to-ease-async/">this previous post</a> to see how to better understand async/await callstacks and make them more readable.</p>
<p>Happy memory profiling!</p>
]]></content:encoded></item><item><title>Memory Anti-Patterns in C#</title><link>https://chrisnas.github.io/posts/2021-07-01_memory-anti-patterns-in/</link><pubDate>Thu, 01 Jul 2021 13:56:00 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-07-01_memory-anti-patterns-in/</guid><description>In the context of aiming for a clean code base at Criteo, Christophe gathered C# anti-patterns we’d like to share with you!</description><content:encoded><![CDATA[<hr>
<p><img loading="lazy" src="/posts/2021-07-01_memory-anti-patterns-in/1_IC8UKl9GdNwj-CzbTLaoXg.jpeg"></p>
<p>In the context of helping the teams at Criteo to clean up our code base, I gathered and documented a few C# anti-patterns similar to <a href="https://twitter.com/KooKiz">Kevin</a>’s publication about <a href="https://kevingosse.medium.com/performance-best-practices-in-c-b85a47bdd93a">performance code smells</a>. Here is an extract related to good/bad memory patterns.</p>
<p>Even though the garbage collector does its work outside of the developers’ control, the fewer allocations are made, the less the GC will impact an application. So the main goal is to avoid writing code that allocates unnecessary objects or references them for too long.</p>
<h2 id="finalizer-and-idisposable-usage">Finalizer and IDisposable usage</h2>
<p>Let’s start with a hidden way of referencing an object: implementing a “finalizer”. In C#, you write a method whose name is the name of the class prefixed by <strong>~</strong>. The compiler generates an override for the virtual <a href="https://docs.microsoft.com/en-us/dotnet/api/system.object.finalize?WT.mc_id=DT-MVP-5003325"><strong>Object.Finalize</strong></a> method. An instance of such a type is treated in a particular way by the Garbage Collector:</p>
<ul>
<li>after it is allocated, a reference is kept in a <strong>Finalization</strong> internal queue</li>
<li>after a collection, if it is no longer referenced, this reference is moved into another <strong>fReachable</strong> internal queue and treated as a root until a dedicated thread calls its finalizer code</li>
</ul>
<p>As Konrad Kokosa details in <a href="https://twitter.com/konradkokosa">one of his free GC Internals videos</a>, instances of a type implementing a finalizer stay in memory much longer than needed, waiting for the next collection of the generation in which the previous collection left them (i.e. gen1 if it was in gen0 or, even worse, gen2 if it was in gen1).</p>
<p>So the first question people usually ask is: do you really need to implement a finalizer? Most of the time, the answer should be no. The code of a finalizer is responsible for cleaning up <strong>ONLY</strong> resources that are <strong>NOT</strong> managed. It usually means “stuff” received from COM interop or P/Invoke calls to native functions such as handles, native memory or memory allocated via <a href="https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal?WT.mc_id=DT-MVP-5003325">Marshal</a> helpers. If your class has <strong>IntPtr</strong> fields, it is a good sign that their lifetime ends in a finalizer via <strong>Marshal</strong> helpers or P/Invoke cleanup calls. Look for a <a href="https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.safehandle?WT.mc_id=DT-MVP-5003325"><strong>SafeHandle</strong></a>-derived class if you need to manipulate kernel object handles: it replaces the raw <strong>IntPtr</strong> and lets you avoid writing a finalizer. So in 99.9% of cases, you don’t need a finalizer.</p>
<p>The second question is: how does implementing a finalizer relate to implementing <a href="https://docs.microsoft.com/en-us/dotnet/api/system.idisposable?WT.mc_id=DT-MVP-5003325"><strong>IDisposable</strong></a>? Unlike a finalizer, implementing the single <strong>Dispose()</strong> method of the <strong>IDisposable</strong> interface in a class means nothing for the Garbage Collector. So there is no side effect that extends the lifetime of its instances. This is only a way to allow the users of this class to explicitly clean up an instance at a certain point in time instead of waiting for a garbage collection to be triggered.</p>
<p>Let’s take an example: when you want to write to a file, behind the scenes, .NET will call native APIs that operate on real files (via kernel object handles on Windows) with limited concurrent access (i.e. two processes can’t corrupt a file by writing different things at the same time; this is a very high-level view of the situation but valid enough for this discussion). Another class would allow access to databases via a limited number of connections that should be released as soon as possible. In all these scenarios, as a user of these classes, you want to be able to “release” the resources used behind the scenes as quickly as possible when you don’t need to access them anymore. This translates into the well-known <strong>using</strong> pattern in C#:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">using</span> <span class="p">(</span><span class="kt">var</span> <span class="n">disposableInstance</span> <span class="p">=</span> <span class="k">new</span> <span class="n">MyDisposable</span><span class="p">())</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">DoSomething</span><span class="p">(</span><span class="n">disposableInstance</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span> <span class="c1">// the instance will be cleaned up and its resources released</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>that is transformed by the C# compiler into:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">disposableInstance</span> <span class="p">=</span> <span class="k">new</span> <span class="n">MyDisposable</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="k">try</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">DoSomething</span><span class="p">(</span><span class="n">disposableInstance</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="k">finally</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">disposableInstance</span><span class="p">?.</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>So when should you implement <strong>IDisposable</strong>? My answer is simple: when the class owns fields whose classes implement <strong>IDisposable</strong> <strong>and</strong> when it implements a finalizer (for the good reasons already explained). Don’t use <strong>IDisposable.Dispose</strong> for other reasons such as logging (like what we used to do in C++ destructors): prefer to implement another explicit interface dedicated to that purpose.</p>
<p>In terms of implementation, I have to say that I never understood why Microsoft decided to provide such a confusing implementation in its <a href="https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/implementing-dispose?WT.mc_id=DT-MVP-5003325">documentation</a>. You have to implement the following method to <em>“free” unmanaged and managed resources</em>. It should be called by both the finalizer and <strong>IDisposable.Dispose()</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">protected</span> <span class="k">virtual</span> <span class="k">void</span> <span class="n">Dispose</span><span class="p">(</span><span class="kt">bool</span> <span class="n">disposing</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// free unmanaged and managed resources</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You also need to have a <strong>_disposed</strong> field to allow <strong>IDisposable.Dispose()</strong> to be called more than once without problem. In all methods and properties of the class, don’t forget to throw an <strong>ObjectDisposedException</strong> if <strong>_disposed</strong> is true to catch usage of already disposed objects.</p>
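<p>As a minimal sketch of that guard, here is a hypothetical <strong>TemperatureSensor</strong> class (the type, its <strong>Read</strong> method and the <strong>ThrowIfDisposed</strong> helper are names made up for this illustration). It only shows the <strong>_disposed</strong> checks and deliberately leaves out the finalizer and the resources:</p>

```csharp
using System;

// Hypothetical type used only to illustrate the _disposed guard.
public sealed class TemperatureSensor : IDisposable
{
    private bool _disposed;
    private readonly double _lastValue = 21.5;

    public double Read()
    {
        // Every public member starts by checking the object state.
        ThrowIfDisposed();
        return _lastValue;
    }

    public void Dispose()
    {
        // Safe to call several times: the second call is a no-op.
        if (_disposed)
            return;

        _disposed = true;
    }

    private void ThrowIfDisposed()
    {
        if (_disposed)
            throw new ObjectDisposedException(nameof(TemperatureSensor));
    }
}
```

<p>With such a guard, calling <strong>Read()</strong> after <strong>Dispose()</strong> fails immediately with an <strong>ObjectDisposedException</strong> instead of silently working on released resources.</p>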
<p>Ask a group of developers when <strong>disposing</strong> should be true or false: half will say when called from the finalizer and the other half from <strong>Dispose</strong> (and I’m not counting those who are not sure). Why give the same name to the method that already exists in <strong>IDisposable</strong>? Why pick “disposing” as parameter name? I don’t think it could have been possible to find a more confusing solution: too many “<em>dispose</em>” kills the pattern…</p>
<p>Here is my own implementation that does exactly the same thing but with much less confusion:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">DisposableMe</span> <span class="p">:</span> <span class="n">IDisposable</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">bool</span> <span class="n">_disposed</span> <span class="p">=</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// 1. field that implements IDisposable</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// 2. field that stores &#34;native resource&#34; (ex: IntPtr)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="p">~</span><span class="n">DisposableMe</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Cleanup</span><span class="p">(</span><span class="s">&#34;called from GC&#34;</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>           <span class="c1">// = true</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">Dispose</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Cleanup</span><span class="p">(</span><span class="s">&#34;not from GC&#34;</span> <span class="p">==</span> <span class="kc">null</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>           <span class="c1">// = false</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I also rename <strong>Dispose(bool disposing)</strong> into <strong>Cleanup(bool fromGC)</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"> <span class="kd">private</span> <span class="k">void</span> <span class="n">Cleanup</span><span class="p">(</span><span class="kt">bool</span> <span class="n">fromGC</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="k">if</span> <span class="p">(</span><span class="n">_disposed</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">         <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">     <span class="k">try</span>
</span></span><span class="line"><span class="cl">     <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="c1">// always clean up the NATIVE resources</span>
</span></span><span class="line"><span class="cl">         <span class="k">if</span> <span class="p">(</span><span class="n">fromGC</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">             <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">         <span class="c1">// clean up managed resources ONLY if not called from GC</span>
</span></span><span class="line"><span class="cl">     <span class="p">}</span>
</span></span><span class="line"><span class="cl">     <span class="k">finally</span>
</span></span><span class="line"><span class="cl">     <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="n">_disposed</span> <span class="p">=</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">         <span class="k">if</span> <span class="p">(!</span><span class="n">fromGC</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">             <span class="n">GC</span><span class="p">.</span><span class="n">SuppressFinalize</span><span class="p">(</span><span class="k">this</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">     <span class="p">}</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The rules you have to keep in mind are simple:</p>
<ul>
<li>native resources (i.e. <strong>IntPtr</strong> fields) must always be cleaned up</li>
<li>managed resources (i.e. <strong>IDisposable</strong> fields) should be disposed when called from <strong>Dispose</strong> (not from GC)</li>
</ul>
<p>The <strong>_disposed</strong> boolean field is used to clean up resources only once. In this implementation, it is set to true even if an exception happens because I’m assuming that if it just happened, it will also happen if called another time.</p>
<p>Last but not least, the call to <code>GC.SuppressFinalize(this)</code> simply tells the GC to remove the disposed object from the <strong>Finalization</strong> internal queue:</p>
<ul>
<li>it is only meaningful when called from <strong>Dispose</strong> (not from GC) to avoid extending the lifetime of the object.</li>
<li>it means that the finalizer will never be called. If it were, it would have called <strong>Cleanup</strong> that would have returned immediately because <strong>_disposed</strong> is true.</li>
</ul>
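<p>Putting these rules together, here is a minimal sketch of the complete class. It assumes a <strong>MemoryStream</strong> as the managed resource and a 64-byte block allocated with <strong>Marshal.AllocHGlobal</strong> as the native one; both are picked only for illustration, and the <strong>IsDisposed</strong> property is added just to observe the state:</p>

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public class DisposableMe : IDisposable
{
    private bool _disposed;

    // 1. managed resource: a field that implements IDisposable (assumption for this sketch)
    private readonly MemoryStream _stream = new MemoryStream();

    // 2. native resource: unmanaged memory referenced by an IntPtr (assumption for this sketch)
    private IntPtr _buffer = Marshal.AllocHGlobal(64);

    public bool IsDisposed => _disposed;

    ~DisposableMe()
    {
        Cleanup(fromGC: true);
    }

    public void Dispose()
    {
        Cleanup(fromGC: false);
    }

    private void Cleanup(bool fromGC)
    {
        if (_disposed)
            return;

        try
        {
            // always clean up the NATIVE resources
            if (_buffer != IntPtr.Zero)
            {
                Marshal.FreeHGlobal(_buffer);
                _buffer = IntPtr.Zero;
            }

            if (fromGC)
                return;

            // clean up managed resources ONLY if not called from GC
            _stream.Dispose();
        }
        finally
        {
            _disposed = true;

            // the finalizer is no longer needed once Dispose has done the work
            if (!fromGC)
                GC.SuppressFinalize(this);
        }
    }
}
```

<p>Named arguments (<code>fromGC: true</code>) are used here as one possible way to keep the call sites readable; calling <strong>Dispose()</strong> a second time simply returns because <strong>_disposed</strong> is already true.</p>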
<p>The rest of the post describes typical anti-patterns. However, as usual with performance-related topics, remember that the impact might not be noticeable if the code does not run in a hot path. Always balance readability, ease of maintenance and understanding against the performance gain.</p>
<h2 id="provide-list-capacity-whenpossible">Provide list capacity when possible</h2>
<p>It is recommended to provide a capacity when creating a <strong>List</strong> or a collection instance. The .NET implementation of such classes usually stores the values in an array that needs to be resized when it is full and new elements are added. Each resize means that:</p>
<ul>
<li>A new array is allocated</li>
<li>The former values are copied to the new array</li>
<li>The former array is no longer referenced and becomes garbage for the GC to collect</li>
</ul>
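<p>These resizes can be observed through the <strong>Capacity</strong> property. The following sketch (the <strong>ListCapacityDemo</strong> class and its <strong>CountResizes</strong> method are hypothetical names) counts how many times the internal array is replaced while elements are added. The exact growth policy is an implementation detail, so the code only counts capacity changes without assuming specific sizes:</p>

```csharp
using System;
using System.Collections.Generic;

public static class ListCapacityDemo
{
    // Counts how many times the capacity changes while adding 'count' items,
    // i.e. how many times a new internal array had to be allocated.
    public static int CountResizes(List<int> list, int count)
    {
        int resizes = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }

        return resizes;
    }
}
```

<p>Calling <code>ListCapacityDemo.CountResizes(new List&lt;int&gt;(1000), 1000)</code> reports no resize because the array is large enough from the start, while the same call with <code>new List&lt;int&gt;()</code> reports several of them.</p>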
<p>In the following example, <strong>resultList</strong> should be created with a capacity of <strong>otherList.Count</strong> because the final number of elements is known in advance:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">resultList</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;...&gt;();</span>
</span></span><span class="line"><span class="cl"><span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">item</span> <span class="k">in</span> <span class="n">otherList</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span> 
</span></span><span class="line"><span class="cl">   <span class="n">resultList</span><span class="p">.</span><span class="n">Add</span><span class="p">(...);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="prefer-stringbuilder-to--for-string-concatenation">Prefer StringBuilder to +/+= for string concatenation</h2>
<p>Creating temporary objects will increase the number of garbage collections and impact performance. Since the string class is immutable, each time you need to get an updated version of a string of characters, the .NET framework ends up creating a new string.</p>
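<p>To make that cost concrete, here is a small sketch that builds the same text in two ways. The <strong>ProductIdFormatter</strong> class and its method names are hypothetical; both methods return the same result, but the first one allocates a new string at each iteration:</p>

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ProductIdFormatter
{
    // Each += allocates a brand new string that copies the previous content.
    public static string JoinWithConcat(IEnumerable<string> ids)
    {
        var result = string.Empty;
        foreach (var id in ids)
            result += id + "\n";
        return result;
    }

    // The StringBuilder appends into its internal buffers;
    // the final string is only created by ToString().
    public static string JoinWithBuilder(IEnumerable<string> ids)
    {
        var builder = new StringBuilder();
        foreach (var id in ids)
            builder.Append(id).Append('\n');
        return builder.ToString();
    }
}
```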
<p>For string concatenation, avoid using <strong>Concat</strong>, <strong>+</strong> or <strong>+=</strong>. This is especially important in loops or in methods called very often. For example, in the following code, a <strong>StringBuilder</strong> should be used:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">productIds</span> <span class="p">=</span> <span class="kt">string</span><span class="p">.</span><span class="n">Empty</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">while</span> <span class="p">(</span><span class="n">match</span><span class="p">.</span><span class="n">Success</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">productIds</span> <span class="p">+=</span> <span class="n">match</span><span class="p">.</span><span class="n">Groups</span><span class="p">[</span><span class="m">2</span><span class="p">].</span><span class="n">Value</span> <span class="p">+</span> <span class="s">&#34;\n&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">match</span> <span class="p">=</span> <span class="n">match</span><span class="p">.</span><span class="n">NextMatch</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Again in loops, avoid creating temporary strings such as in the following code where <strong>SearchValue.ToUpper()</strong> does not change in the loop:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SelectedColumn</span> <span class="p">==</span> <span class="n">Resources</span><span class="p">.</span><span class="n">Journaux</span><span class="p">.</span><span class="n">All</span> <span class="p">&amp;&amp;</span> <span class="p">!</span><span class="n">String</span><span class="p">.</span><span class="n">IsNullOrEmpty</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span> <span class="p">=</span> <span class="n">model</span><span class="p">.</span><span class="n">DataSource</span><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">x</span> <span class="p">=&gt;</span> <span class="n">x</span><span class="p">.</span><span class="n">ItemId</span><span class="p">.</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">||</span> <span class="n">x</span><span class="p">.</span><span class="n">ItemName</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">        <span class="p">||</span> <span class="n">x</span><span class="p">.</span><span class="n">ItemGroupName</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">        <span class="p">||</span> <span class="n">x</span><span class="p">.</span><span class="n">CountingGroupName</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">()));</span>
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SelectedColumn</span> <span class="p">==</span> <span class="n">Resources</span><span class="p">.</span><span class="n">Journaux</span><span class="p">.</span><span class="n">ItemNumber</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span> <span class="p">=</span> <span class="n">model</span><span class="p">.</span><span class="n">DataSource</span><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">x</span> <span class="p">=&gt;</span> <span class="n">x</span><span class="p">.</span><span class="n">ItemId</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">()));</span>
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SelectedColumn</span> <span class="p">==</span> <span class="n">Resources</span><span class="p">.</span><span class="n">Journaux</span><span class="p">.</span><span class="n">ItemName</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span> <span class="p">=</span> <span class="n">model</span><span class="p">.</span><span class="n">DataSource</span><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">x</span> <span class="p">=&gt;</span> <span class="n">x</span><span class="p">.</span><span class="n">ItemName</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">()));</span>
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SelectedColumn</span> <span class="p">==</span> <span class="n">Resources</span><span class="p">.</span><span class="n">Journaux</span><span class="p">.</span><span class="n">ItemGroup</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span> <span class="p">=</span> <span class="n">model</span><span class="p">.</span><span class="n">DataSource</span><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">x</span> <span class="p">=&gt;</span> <span class="n">x</span><span class="p">.</span><span class="n">ItemGroupName</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">()));</span>
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">SelectedColumn</span> <span class="p">==</span> <span class="n">Resources</span><span class="p">.</span><span class="n">Journaux</span><span class="p">.</span><span class="n">CountingGroup</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span> <span class="p">=</span> <span class="n">model</span><span class="p">.</span><span class="n">DataSource</span><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">x</span> <span class="p">=&gt;</span> <span class="n">x</span><span class="p">.</span><span class="n">CountingGroupName</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">().</span><span class="n">Contains</span><span class="p">(</span><span class="n">SearchValue</span><span class="p">.</span><span class="n">ToUpper</span><span class="p">()));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The effect is even worse due to the <strong>Where()</strong> clause that creates a new temporary upper-case string for each element of the sequence!</p>
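<p>Two possible ways to reduce these allocations are sketched below, assuming a hypothetical <strong>Item</strong> type that stands in for the elements of <strong>model.DataSource</strong>. The first one computes the upper-case search value once, outside of the lambda. The second one relies on <strong>IndexOf</strong> with <strong>StringComparison.OrdinalIgnoreCase</strong> so that no temporary string is created at all; note that an ordinal comparison is not strictly equivalent to the culture-sensitive <strong>ToUpper()</strong> for every culture:</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type standing in for the elements of model.DataSource.
public sealed class Item
{
    public string ItemId { get; set; } = "";
    public string ItemName { get; set; } = "";
}

public static class ItemSearch
{
    // Option 1: compute the upper-case search value once, outside the lambda.
    public static IEnumerable<Item> ByNameHoisted(IEnumerable<Item> source, string searchValue)
    {
        var upperSearchValue = searchValue.ToUpper();
        return source.Where(x => x.ItemName.ToUpper().Contains(upperSearchValue));
    }

    // Option 2: case-insensitive search without any temporary string.
    public static IEnumerable<Item> ByNameIgnoreCase(IEnumerable<Item> source, string searchValue)
    {
        return source.Where(
            x => x.ItemName.IndexOf(searchValue, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

<p>The first option still allocates one upper-case copy of <strong>ItemName</strong> per element; only the second one removes all temporary strings.</p>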
<p>This recommendation also applies to types that provide string-like direct access to their characters, such as in the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="n">uriBuilder</span><span class="p">.</span><span class="n">ToString</span><span class="p">().</span><span class="n">EndsWith</span><span class="p">(</span><span class="s">&#34;.&#34;</span><span class="p">,</span> <span class="kc">true</span><span class="p">,</span> <span class="n">invCulture</span><span class="p">))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>where <strong>ToString()</strong> is not needed because it is possible to directly access the last character:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">uriBuilder</span><span class="p">[</span><span class="n">uriBuilder</span><span class="p">.</span><span class="n">Length</span> <span class="p">-</span> <span class="m">1</span><span class="p">]</span> <span class="p">!=</span> <span class="sc">&#39;.&#39;</span><span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="caching-strings-and-interning">Caching strings and interning</h2>
<p>Prefer a static cache of read-only objects over recreating them on each call, such as in the following example:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">allCampaignStatuses</span> <span class="p">=</span> 
</span></span><span class="line"><span class="cl">   <span class="p">((</span><span class="n">CampaignActivityStatus</span><span class="p">[])</span><span class="n">Enum</span><span class="p">.</span><span class="n">GetValues</span><span class="p">(</span><span class="k">typeof</span><span class="p">(</span><span class="n">CampaignActivityStatus</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">   <span class="p">.</span><span class="n">ToList</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><em>(Replace with a static list since the enumeration elements won’t change)</em></p>
<p>Last but not least, when string keys (with only a few different values) are used, you could “intern” them (i.e. ask the CLR to cache a value and always return the same reference). Read the corresponding <a href="https://docs.microsoft.com/en-us/dotnet/api/system.string.intern">Microsoft Docs</a> for more details.</p>
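<p>As a minimal sketch of what interning changes, the hypothetical <strong>InternDemo</strong> class below builds two equal keys at run time: they are two distinct objects until <strong>string.Intern</strong> maps them to the same cached instance. Keep in mind that interned strings are likely to stay in memory until the process ends, which is why this only fits a small set of distinct values:</p>

```csharp
using System;
using System.Text;

public static class InternDemo
{
    // Builds a new string instance at run time (not a compile-time literal).
    public static string BuildKey(string prefix, int id)
    {
        return new StringBuilder().Append(prefix).Append(id).ToString();
    }

    // Returns true when both keys are the very same object once interned.
    public static bool SameReferenceAfterIntern(string a, string b)
    {
        return ReferenceEquals(string.Intern(a), string.Intern(b));
    }
}
```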
<h2 id="dont-recreate-objects">Don’t (re)create objects</h2>
<p>The first pattern to use is static classes with static methods, which avoid creating temporary objects just to call methods that do not use any field. It is also recommended to pre-compute a read-only list instead of re-creating it each time a method gets called, like in the following example:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">allCampaignStatuses</span> <span class="p">=</span> 
</span></span><span class="line"><span class="cl">   <span class="p">((</span><span class="n">CampaignActivityStatus</span><span class="p">[])</span><span class="n">Enum</span><span class="p">.</span><span class="n">GetValues</span><span class="p">(</span><span class="k">typeof</span><span class="p">(</span><span class="n">CampaignActivityStatus</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">   <span class="p">.</span><span class="n">ToList</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// use allCampaignStatuses in the rest of the method</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This list could have been computed once as a static field of the class because the enumeration will not change during the application lifetime.</p>
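<p>One possible shape for such a static field is sketched below. The members of <strong>CampaignActivityStatus</strong> are not shown in this post, so the three values used here are placeholders, and <strong>CampaignStatuses</strong> is a hypothetical holder class:</p>

```csharp
using System;
using System.Collections.Generic;

// Placeholder values: the real CampaignActivityStatus members are not shown in the post.
public enum CampaignActivityStatus { Draft, Running, Completed }

public static class CampaignStatuses
{
    // Computed once, when the type is first used.
    private static readonly CampaignActivityStatus[] s_all =
        (CampaignActivityStatus[])Enum.GetValues(typeof(CampaignActivityStatus));

    // Read-only wrapper so that callers cannot modify the shared instance.
    public static IReadOnlyList<CampaignActivityStatus> All { get; } =
        Array.AsReadOnly(s_all);
}
```

<p>Exposing the values as <strong>IReadOnlyList&lt;T&gt;</strong> is a design choice made for this sketch: since the same instance is shared by all callers, it should not be possible to add or remove elements.</p>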
<p>Avoid repeated calls and keep values in local variables when used in a loop; this is particularly easy to forget when dealing with string <strong>ToLower()</strong> and <strong>ToUpper()</strong>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">found</span> <span class="p">=</span> <span class="n">elements</span><span class="p">.</span><span class="n">Any</span><span class="p">(</span>
</span></span><span class="line"><span class="cl"><span class="c1">// ToLower() is called in each test</span>
</span></span><span class="line"><span class="cl"><span class="n">k</span> <span class="p">=&gt;</span> <span class="kt">string</span><span class="p">.</span><span class="n">Compare</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">       <span class="n">k</span><span class="p">.</span><span class="n">ToLower</span><span class="p">(),</span> 
</span></span><span class="line"><span class="cl">       <span class="n">key</span><span class="p">.</span><span class="n">ToLower</span><span class="p">(),</span> 
</span></span><span class="line"><span class="cl">       <span class="n">StringComparison</span><span class="p">.</span><span class="n">OrdinalIgnoreCase</span>
</span></span><span class="line"><span class="cl">       <span class="p">)</span> <span class="p">==</span> <span class="m">0</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><em>(a new temporary string will be created by key.ToLower() for each test)</em></p>
<p>Prefer <strong>String.Compare(…, StringComparison.OrdinalIgnoreCase)</strong> to avoid calling <strong>ToLower()/ToUpper()</strong> just for string comparison such as in the following example:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">transactionIdAsString</span> <span class="p">!=</span> <span class="kc">null</span> <span class="p">&amp;&amp;</span> <span class="n">transactionIdAsString</span><span class="p">.</span><span class="n">ToLowerInvariant</span><span class="p">()</span> <span class="p">==</span> <span class="s">&#34;undefined&#34;</span><span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>becomes:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">transactionIdAsString</span> <span class="p">!=</span> <span class="kc">null</span> <span class="p">&amp;&amp;</span> <span class="kt">string</span><span class="p">.</span><span class="n">Compare</span><span class="p">(</span><span class="n">transactionIdAsString</span><span class="p">,</span> <span class="s">&#34;undefined&#34;</span><span class="p">,</span> <span class="n">StringComparison</span><span class="p">.</span><span class="n">OrdinalIgnoreCase</span><span class="p">)</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="best-practices-withlinq">Best practices with LINQ</h2>
<p>The LINQ syntax is used extensively all over the source code. However, several patterns are found very often and might impact overall performance.</p>
<h2 id="prefer-ienumerable-toilist">Prefer IEnumerable&lt;T&gt; to IList&lt;T&gt;</h2>
<p>Most methods iterate over sequences represented by <strong>IEnumerable&lt;T&gt;</strong> either via <strong>foreach()</strong> or thanks to <strong>System.Linq.Enumerable</strong> extension methods. <strong>IList&lt;T&gt;</strong> should be used only when sequence modification is required:</p>
<p><img loading="lazy" src="/posts/2021-07-01_memory-anti-patterns-in/1_AEBYXkD87app7BdyPEK8qw.png"></p>
<p>It is also recommended to use <strong>IEnumerable&lt;T&gt;</strong> instead of <strong>IList&lt;T&gt;</strong> as method parameters if there is no need to add/remove elements to the sequence. That way, the client code doesn’t have to use <strong>ToList()</strong> before calling the method. The same comment applies to return types that should be <strong>IEnumerable&lt;T&gt;</strong> rather than <strong>IList&lt;T&gt;</strong> because most of the time, the sequence will simply be iterated via a <strong>foreach</strong> statement.</p>
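<p>The following sketch illustrates both sides of that recommendation with a hypothetical <strong>Totals</strong> class: <strong>Sum</strong> accepts any sequence as <strong>IEnumerable&lt;int&gt;</strong>, and <strong>FirstSquares</strong> returns a lazily generated sequence without allocating a list:</p>

```csharp
using System;
using System.Collections.Generic;

public static class Totals
{
    // Accepts any sequence: arrays, lists, LINQ queries, iterators...
    public static int Sum(IEnumerable<int> values)
    {
        int total = 0;
        foreach (var value in values)
            total += value;
        return total;
    }

    // Lazily generated sequence: no list is allocated to hold the values.
    public static IEnumerable<int> FirstSquares(int count)
    {
        for (int i = 1; i <= count; i++)
            yield return i * i;
    }
}
```

<p>Callers can pass an array, a <strong>List&lt;int&gt;</strong> or the result of <strong>FirstSquares</strong> directly to <strong>Sum</strong>, without any intermediate <strong>ToList()</strong> call.</p>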
<h2 id="firstordefault-and-any-are-your-friends-but-might-not-beneeded">FirstOrDefault and Any are your friends… but might not be needed</h2>
<p>First, there is no need to call <strong>Any</strong> (or even worse <strong>ToList().Count &gt; 0</strong>) before <strong>foreach</strong> such as in the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">sequence</span> <span class="p">!=</span> <span class="kc">null</span> <span class="p">&amp;&amp;</span> <span class="n">sequence</span><span class="p">.</span><span class="n">Any</span><span class="p">())</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">item</span> <span class="k">in</span> <span class="n">sequence</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="avoid-unnecessary-tolisttoarray-calls">Avoid unnecessary ToList()/ToArray() calls</h2>
<p>LINQ queries defer their execution until the corresponding sequence is iterated, for example by a <strong>foreach</strong> statement. Calling <strong>ToList()</strong> or <strong>ToArray()</strong> on such a query also iterates it, so the query is executed immediately and its results are copied into a new collection:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">resourceNames</span> <span class="p">=</span> <span class="n">resourceAssembly</span>
</span></span><span class="line"><span class="cl"><span class="p">.</span><span class="n">GetManifestResourceNames</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">r</span> <span class="p">=&gt;</span> <span class="n">r</span><span class="p">.</span><span class="n">StartsWith</span><span class="p">(</span><span class="s">$&#34;{resourcePath}.i18n&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">.</span><span class="n">ToArray</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">resourceName</span> <span class="k">in</span> <span class="n">resourceNames</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>ToList()</strong> method builds a <strong>List&lt;&gt;</strong> instance that contains all elements of the given sequence. It should be used carefully because creating a list from a large sequence of objects can be expensive in terms of both memory and performance, due to the way elements are added to a <strong>List&lt;&gt;</strong>.</p>
<p>The only recommended usages are:</p>
<ul>
<li>as an optimization, to avoid executing the underlying query several times when it is expensive</li>
<li>adding elements to or removing elements from a sequence</li>
<li>storing the result of a query execution in a class field</li>
</ul>
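<p>The first usage in this list can be illustrated with a minimal sketch. It is written with top-level statements (C# 9 or later), and the <code>evaluations</code> counter and the variable names are only there for illustration:</p>

```csharp
using System;
using System.Linq;

var evaluations = 0;

// Deferred query: the predicate runs again on every enumeration.
var evens = Enumerable.Range(1, 5).Where(n => { evaluations++; return n % 2 == 0; });

var sum = evens.Sum();   // first enumeration: 5 predicate calls
var max = evens.Max();   // second enumeration: 5 more predicate calls
var deferredEvaluations = evaluations;

evaluations = 0;

// Materialized once: the predicate only runs while the array is built.
var cached = evens.ToArray();
var cachedSum = cached.Sum();
var cachedMax = cached.Max();
var cachedEvaluations = evaluations;

Console.WriteLine($"deferred: {deferredEvaluations} predicate calls, cached: {cachedEvaluations}");
```

<p>Materializing pays off here only because the query is enumerated more than once; with a single enumeration, the extra array would be pure overhead.</p>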
<p>However, most of the time, you don’t need to call <strong>ToList()</strong> to iterate on an <strong>IEnumerable&lt;T&gt;</strong>. If you do so, you hurt the runtime execution in terms of both memory consumption (because of the unneeded, temporary <strong>List&lt;T&gt;</strong>) and performance (because the sequence gets iterated twice).</p>
<p>The base of LINQ to Objects is the <strong>IEnumerable</strong> interface used to iterate on a sequence of objects. All LINQ extension methods take <strong>IEnumerable</strong> instances as parameters, and <strong>foreach</strong> constructs accept them as well. There is therefore no need to call <strong>ToList()</strong> when an <strong>IEnumerable</strong> is expected (this is a good reason to prefer <strong>IEnumerable</strong> to <strong>IList</strong>/<strong>List</strong>/<strong>[]</strong> in method signatures).</p>
<p>Some methods are calling <strong>ToList()</strong> before <strong>Where</strong> clauses are applied to an <strong>IEnumerable</strong> sequence: it is more efficient to stack the <strong>Where</strong> clauses and call <strong>ToList()</strong> at the end.</p>
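<p>As a small sketch of that reordering (the data and variable names are made up for the example, and top-level statements require C# 9 or later), both queries below return the same elements, but only the first one copies the whole source into a temporary list:</p>

```csharp
using System;
using System.Linq;

var numbers = Enumerable.Range(1, 1000);

// ToList() first: a temporary 1000-element list is built before any filtering.
var early = numbers
    .ToList()
    .Where(n => n % 2 == 0)
    .Where(n => n % 3 == 0)
    .ToList();

// Filters first: only the matching elements are ever stored.
var late = numbers
    .Where(n => n % 2 == 0)
    .Where(n => n % 3 == 0)
    .ToList();

Console.WriteLine($"{early.Count} / {late.Count} elements, same content: {early.SequenceEqual(late)}");
```

<p>Both lists contain the 166 multiples of 6 between 1 and 1000; the second form simply never allocates the intermediate list.</p>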
<p>Last but not least, it is not needed to call <strong>ToList()</strong> to get the number of elements in a sequence such as in the following code sample:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">productInfos</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Select</span><span class="p">(</span><span class="n">p</span> <span class="p">=&gt;</span> <span class="n">p</span><span class="p">.</span><span class="n">Split</span><span class="p">(</span><span class="n">DisplayProductInfoSeparator</span><span class="p">)[</span><span class="m">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Distinct</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">ToList</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Count</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>becomes:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">productInfos</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Select</span><span class="p">(</span><span class="n">p</span> <span class="p">=&gt;</span> <span class="n">p</span><span class="p">.</span><span class="n">Split</span><span class="p">(</span><span class="n">DisplayProductInfoSeparator</span><span class="p">)[</span><span class="m">0</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Distinct</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Count</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h3 id="prefer-ienumerableany-to-listexists">Prefer IEnumerable&lt;&gt;.Any to List&lt;&gt;.Exists</h3>
<p>When manipulating <strong>IEnumerable</strong>, it is recommended to use <strong>Any</strong> instead of <strong>ToList().Exists()</strong> such as in the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">sequence</span><span class="p">.</span><span class="n">ToList</span><span class="p">().</span><span class="n">Exists</span><span class="p">(</span><span class="err">…</span><span class="p">))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>becomes:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">sequence</span><span class="p">.</span><span class="n">Any</span><span class="p">(...))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h3 id="prefer-any-to-count-when-checking-for-emptiness">Prefer Any to Count when checking for emptiness</h3>
<p>The <strong>Any</strong> extension method should be preferred to computing a count on an <strong>IEnumerable</strong> because the iteration on the sequence stops as soon as the condition (if any) is fulfilled, without allocating any temporary list:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">nonArchivedCampaigns</span> <span class="p">=</span> 
</span></span><span class="line"><span class="cl">   <span class="n">campaigns</span>
</span></span><span class="line"><span class="cl">   <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">c</span> <span class="p">=&gt;</span> <span class="n">c</span><span class="p">.</span><span class="n">Status</span> <span class="p">!=</span> <span class="n">CampaignActivityStatus</span><span class="p">.</span><span class="n">Archived</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">   <span class="p">.</span><span class="n">ToList</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">nonArchivedCampaigns</span><span class="p">.</span><span class="n">Count</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>becomes:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="n">campaigns</span><span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">c</span> <span class="p">=&gt;</span> <span class="n">c</span><span class="p">.</span><span class="n">Status</span> <span class="p">!=</span> <span class="n">CampaignActivityStatus</span><span class="p">.</span><span class="n">Archived</span><span class="p">).</span><span class="n">Any</span><span class="p">())</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that it is also valid to pass the filter directly to <strong>Any</strong>: <code>if (!campaigns.Any(filter))</code>.</p>
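<p>The early exit can be observed by counting how many times the filter runs. This is only a sketch: plain strings stand in for the campaign statuses, and the <code>evaluations</code> counter is added purely for the demonstration (top-level statements, C# 9 or later):</p>

```csharp
using System;
using System.Linq;

var statuses = new[] { "Archived", "Running", "Archived", "Paused", "Archived" };
var evaluations = 0;

// Count-based check: ToList() forces the filter to run on every element.
var nonArchived = statuses
    .Where(s => { evaluations++; return s != "Archived"; })
    .ToList();
var isEmptyViaCount = nonArchived.Count == 0;
var countEvaluations = evaluations;

evaluations = 0;

// Any-based check: the iteration stops at the first non-archived element.
var isEmptyViaAny = !statuses.Any(s => { evaluations++; return s != "Archived"; });
var anyEvaluations = evaluations;

Console.WriteLine($"Count: {countEvaluations} filter calls, Any: {anyEvaluations} filter calls");
```

<p>With this input, the count-based version calls the filter 5 times and allocates a list, while <strong>Any</strong> stops after 2 calls; both checks give the same answer.</p>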
<h2 id="order-in-extension-methods-mightmatter">Order in extension methods might matter</h2>
<p>When operators are applied to sequences (i.e. <strong>IEnumerable</strong>), their order might have an impact on the performance of the resulting code. One important rule is to always filter first, so that the following operators have fewer and fewer elements to iterate. This is why it is recommended to start a LINQ query with <strong>Where</strong> filters.</p>
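<p>A minimal sketch of the “filter first” rule, assuming a hypothetical <code>Square</code> helper that stands in for an expensive projection and counts its own calls (top-level statements, C# 9 or later):</p>

```csharp
using System;
using System.Linq;

var projections = 0;

// Hypothetical expensive projection; the counter records how often it runs.
int Square(int n) { projections++; return n * n; }

var numbers = Enumerable.Range(1, 10);

// Projection first: Square runs for all 10 elements, then half are discarded.
var projectFirst = numbers
    .Select(n => new { Source = n, Value = Square(n) })
    .Where(x => x.Source % 2 == 0)
    .Select(x => x.Value)
    .ToArray();
var projectFirstCalls = projections;

projections = 0;

// Filter first: Square only runs for the 5 even elements.
var filterFirst = numbers
    .Where(n => n % 2 == 0)
    .Select(n => Square(n))
    .ToArray();
var filterFirstCalls = projections;

Console.WriteLine($"projection first: {projectFirstCalls} calls, filter first: {filterFirstCalls} calls");
```

<p>Both queries produce the same five squares, but the second one does half the projection work and needs no intermediate anonymous objects.</p>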
<p>With LINQ, the code you write to define a query might be misleading in terms of execution. For example, what is the difference between:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">filteredElements</span> <span class="p">=</span> <span class="n">sequence</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">first</span> <span class="n">filter</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">second</span> <span class="n">filter</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>and:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">filteredElements</span> <span class="p">=</span> <span class="n">sequence</span>
</span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">first</span> <span class="n">filter</span> <span class="p">&amp;&amp;</span> <span class="n">second</span> <span class="n">filter</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It depends on the query executor. For LINQ to Objects, there seems to be no difference in terms of filter execution: the first and second filters are executed the same number of times, as shown by the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"> <span class="kt">var</span> <span class="n">integers</span> <span class="p">=</span> <span class="n">Enumerable</span><span class="p">.</span><span class="n">Range</span><span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">6</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"> <span class="kt">var</span> <span class="n">set1</span> <span class="p">=</span> <span class="n">integers</span>
</span></span><span class="line"><span class="cl"> <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">i</span> <span class="p">=&gt;</span> <span class="n">IsEven</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"> <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">i</span> <span class="p">=&gt;</span> <span class="n">IsMultipleOf3</span><span class="p">(</span><span class="n">i</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">current</span> <span class="k">in</span> <span class="n">set1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;--&gt; {current}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;--------------------------------&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="kt">var</span> <span class="n">set2</span> <span class="p">=</span> <span class="n">integers</span>
</span></span><span class="line"><span class="cl"> <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">i</span> <span class="p">=&gt;</span> <span class="n">IsEven</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="n">IsMultipleOf3</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"> <span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">current</span> <span class="k">in</span> <span class="n">set2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;--&gt; {current}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>When you run it, you get the exact same lines in the console:</p>
<pre tabindex="0"><code>IsEven(1)
IsEven(2)
   IsMultipleOf3(2)
IsEven(3)
IsEven(4)
   IsMultipleOf3(4)
IsEven(5)
IsEven(6)
   IsMultipleOf3(6)
--&gt; 6
--------------------------------
IsEven(1)
IsEven(2)
   IsMultipleOf3(2)
IsEven(3)
IsEven(4)
   IsMultipleOf3(4)
IsEven(5)
IsEven(6)
   IsMultipleOf3(6)
--&gt; 6
</code></pre><p>However, when you run it under BenchmarkDotNet,</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"> <span class="kd">private</span> <span class="kt">int</span><span class="p">[]</span> <span class="n">_myArray</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na"> [Params(10, 1000, 10000)]</span>
</span></span><span class="line"><span class="cl"> <span class="kd">public</span> <span class="kt">int</span> <span class="n">Size</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na"> [GlobalSetup]</span>
</span></span><span class="line"><span class="cl"> <span class="kd">public</span> <span class="k">void</span> <span class="n">Setup</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="n">_myArray</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">Size</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">     <span class="k">for</span> <span class="p">(</span><span class="kt">var</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">Size</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">         <span class="n">_myArray</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="p">=</span> <span class="n">i</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na"> [Benchmark(Baseline = true)]</span>
</span></span><span class="line"><span class="cl"> <span class="kd">public</span> <span class="k">void</span> <span class="n">Original</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="k">set</span> <span class="p">=</span> <span class="n">_myArray</span>
</span></span><span class="line"><span class="cl">         <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">i</span> <span class="p">=&gt;</span> <span class="n">IsEven</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">         <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">i</span> <span class="p">=&gt;</span> <span class="n">IsMultipleOf3</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">         <span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">     <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">current</span> <span class="k">in</span> <span class="k">set</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">     <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="n">i</span> <span class="p">=</span> <span class="n">current</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="p">}</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na"> [Benchmark]</span>
</span></span><span class="line"><span class="cl"> <span class="kd">public</span> <span class="k">void</span> <span class="n">Merged</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">     <span class="kt">var</span> <span class="k">set</span> <span class="p">=</span> <span class="n">_myArray</span>
</span></span><span class="line"><span class="cl">         <span class="p">.</span><span class="n">Where</span><span class="p">(</span><span class="n">i</span> <span class="p">=&gt;</span> <span class="n">IsEven</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="n">IsMultipleOf3</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">         <span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">     <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">current</span> <span class="k">in</span> <span class="k">set</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">     <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="n">i</span> <span class="p">=</span> <span class="n">current</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">     <span class="p">}</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>the results are significantly better for the single “merged” <strong>Where</strong> clause:</p>
<p><img loading="lazy" src="/posts/2021-07-01_memory-anti-patterns-in/1_Q5xXZItYl0OuewsVW2O1gA.png"></p>
<p>After looking at the implementation in the <a href="https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,44b8532e11187695">.NET Framework</a> with my colleague <a href="https://twitter.com/durot_jp">Jean-Philippe</a>, the additional cost seems to be related to the underlying <strong>IEnumerator</strong> corresponding to the first <strong>Where</strong>.</p>
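<p>If you want to double-check on your own runtime that both query shapes really call the filters the same number of times, the comparison can be reduced to counters instead of console traces. This is a sketch under assumptions: the original <strong>IsEven</strong> and <strong>IsMultipleOf3</strong> helpers are not shown in this post, so the versions below are guesses that count calls instead of printing them (top-level statements, C# 9 or later):</p>

```csharp
using System;
using System.Linq;

var isEvenCalls = 0;
var isMultipleOf3Calls = 0;

// Assumed helpers: the post only shows the traces printed by the original ones.
bool IsEven(int i) { isEvenCalls++; return i % 2 == 0; }
bool IsMultipleOf3(int i) { isMultipleOf3Calls++; return i % 3 == 0; }

var integers = Enumerable.Range(1, 6);

// Two chained Where clauses.
var chained = integers.Where(i => IsEven(i)).Where(i => IsMultipleOf3(i)).ToArray();
var chainedCalls = (isEvenCalls, isMultipleOf3Calls);

isEvenCalls = 0;
isMultipleOf3Calls = 0;

// One merged Where clause.
var merged = integers.Where(i => IsEven(i) && IsMultipleOf3(i)).ToArray();
var mergedCalls = (isEvenCalls, isMultipleOf3Calls);

Console.WriteLine($"chained: {chainedCalls}, merged: {mergedCalls}");
```

<p>Both tuples come out as (6, 3), matching the traces shown earlier, so any difference reported by a benchmark has to come from how the filters are invoked rather than from how often.</p>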
<p>Remember to never assume and always measure.</p>
]]></content:encoded></item><item><title>How to ease async callstacks analysis in Perfview</title><link>https://chrisnas.github.io/posts/2021-03-02_how-to-ease-async/</link><pubDate>Tue, 02 Mar 2021 09:35:37 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-03-02_how-to-ease-async/</guid><description>Our new post describes how to easily profile with Perfview and more interestingly, how to get human readable async callstacks!</description><content:encoded><![CDATA[<hr>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_ATK-AQ6S7ImVxHgw5iF_kQ.jpeg"></p>
<h2 id="introduction">Introduction</h2>
<p>In the <a href="/posts/2021-01-19_understanding-reversed-callsta/">previous post</a>, I described why you might get weird reversed callstacks in Visual Studio when analyzing or debugging async/await code. And if you are using Perfview to profile the same application, you should also get the same reverse continuation flow:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_uI53gXuqEuy7tDlp7xw1mA.png"></p>
<p>The rest of the post describes how to easily profile with Perfview and more interestingly, how to leverage grouping/folding features to get much more readable asynchronous callstacks.</p>
<h2 id="perfview-101">Perfview 101</h2>
<p>Here are the steps to get the previous tree-like representation of the results of a profiling session.</p>
<p>First, start a data collection by clicking the <strong>Start Collection</strong> button from the <strong>Collect | Collect</strong> dialog box and check <strong>Kernel Base</strong>, <strong>CPU Samples</strong>, and <strong>.NET</strong> boxes:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_guqs5Mn098S85vqwRzZw6A.png"></p>
<p>Stop the collection when the application ends and double-click the <strong>CPU Stacks</strong> node:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_Z703G2KzHaYJBgAc4cTG_A.png"></p>
<p>After selecting the application in the <strong>Select Process Window</strong></p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_Ex2VB19zQfmciFRYkE9mFQ.png"></p>
<p>click the <strong>CallTree</strong> tab:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_Dh99MrN5biZGS9O2lZDtKA.png"></p>
<p>Before entering the dreaded yellow/white CPU Stacks window, let’s spend some time detailing its vast toolbar in the following figure:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_BOJ23oWq4qxmFQmjQhAzcw.png"></p>
<p>The most powerful elements are the *<strong>Pats</strong> combo-boxes. Each of them supports a “simple” matching pattern syntax for different purposes (don’t worry, you will see how to use them in many more examples later):</p>
<ul>
<li><strong>GroupPats</strong>: merge sibling matching frames into one.</li>
<li><strong>FoldPats</strong>: matching frames are folded into parent frame.</li>
<li><strong>IncPats</strong>: non-matching frames are removed (used for process filtering for example).</li>
<li><strong>ExcPats</strong>: matching frames are excluded.</li>
</ul>
<p>Let’s see what we get in the <strong>CallTree</strong> tab when all the combo-boxes are left empty:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_IJx5TrYT-uxKeE5sn_s4XA.png"></p>
<p>For server applications, we are usually not interested in making any difference between threads so it would be nice to group all threads under a single <strong>AllThreads</strong> node. This is exactly what the first choice of the <strong>GroupPats</strong> combo-box provides:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_VZDZWpBz8l3zHEow_ZUjoQ.png"></p>
<p>The effect is simple: all lines at the same level containing “Thread” in their text are merged into a new line with “AllThreads” as its text.</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_-fds9I0vwxpOWjOw460wxA.png"></p>
<p>You can now get the same kind of tree-based representation as in Visual Studio: the difference is that you need to open each node by clicking a checkbox (or right-click + <strong>Expand All</strong> to see the whole tree).</p>
<p>The meaning of most columns is self-explanatory, except maybe the <strong>When</strong> column that provides the CPU usage graph over time in a “textual” way:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_CyrGe04ny6Xx53BzoOqRRw.png"></p>
<p>The <strong>CallTree</strong> representation obviously displays the frames sorted by <strong>Inc%</strong> column.</p>
<h2 id="going-further-withperfview">Going further with Perfview</h2>
<p>When expanding the calltree, you usually get lost in the async/await implementation details. The following screenshot shows the signal/noise ratio for my simple test application where we don’t really care about the blue lines!</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_6CEg_Izw7pigwsMDCtAbiQ.png"></p>
<p>This is where the different Perfview combo-boxes come to the rescue. Some frames and their children are clearly not interesting, such as the last two <strong>coreclr!ThePreStub</strong>. In that case, select the frame and copy the text from the status bar (yes: this is possible and so handy!)</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_U_-kLqVwQzl-yqlcTjj5-Q.png"></p>
<p>and paste it into the <strong>ExcPats</strong> combo-box</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_mIVQ1JJGFIcH9UxsifNF9g.png"></p>
<p>to make the corresponding frames disappear.</p>
<p>Unfortunately, you can’t do the same for the other Task-related frames: frames matching <strong>ExcPats</strong> are completely removed together with their children, and this is where your async method calls appear.</p>
<h2 id="folding-patterns-are-yourfriends">Folding patterns are your friends</h2>
<p>This time, the <strong>FoldPats</strong> combo-box will be your friend: each frame that matches one of its semicolon-separated substrings will disappear and its occurrence count will be added to its parent frame. Since these Task-related frames do not appear often, the impact on the Inc/Exc columns of the parent frames should be minimal. After I used the following substrings:</p>
<blockquote>
<p>Tasks.Task+DelayPromise.CompleteTimedOut(;Tasks.Task.FinishContinuations(;Tasks.Task.RunOrQueueCompletionAction;Tasks.Task.RunContinuations(;Tasks.Task+DelayPromise+;Tasks.AwaitTaskContinuation.RunOrScheduleAction;Runtime.CompilerServices.AsyncTaskMethodBuilder`;CompilerServices.AsyncTaskMethodBuilder.Start(;CompilerServices.AsyncMethodBuilderCore.Start(;ExecutionContext.RunInternal(;CompilerServices.AsyncTaskMethodBuilder.SetResult(;Tasks.VoidTaskResult].TrySetResult;.TrySetResult(System.Threading.Tasks.;Tasks.Task.TrySetResult(</p>
</blockquote>
<p>the callstack was much more readable:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_i5zr41nODbhMumppGDyHRw.png"></p>
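<p>To make the folding idea concrete, here is a minimal C# sketch of what happens to a single callstack. It is only an approximation written for this explanation: the <code>Fold</code> helper and the frame names are hypothetical, it does plain substring matching, and it is not how Perfview implements its pattern engine.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;
using System.Linq;

public static class FoldSketch
{
    // The stack is ordered from root to leaf. Every frame containing one of the
    // ';'-separated substrings is dropped, so the samples it carried end up
    // attributed to the closest remaining caller.
    public static List&lt;string&gt; Fold(IEnumerable&lt;string&gt; stack, string foldPats)
    {
        string[] patterns = foldPats.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        return stack
            .Where(frame =&gt; !patterns.Any(p =&gt; frame.IndexOf(p, StringComparison.Ordinal) &gt;= 0))
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical frames, shortened for readability.
        var stack = new[]
        {
            "Program.Compute3",
            "System.Threading.ExecutionContext.RunInternal(...)",
            "System.Threading.Tasks.Task.RunContinuations(...)",
            "Program.Compute2",
        };

        var folded = Fold(stack, "Tasks.Task.RunContinuations(;ExecutionContext.RunInternal(");
        Console.WriteLine(string.Join(" | ", folded));
    }
}
</code></pre></div>
<p>Only the two <code>Program</code> frames survive the folding, which is the effect visible in the screenshot above.</p>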
<h2 id="morph-theframes">Morph the frames</h2>
<p>The final step is to transform the frames text into something more meaningful thanks to the <strong>GroupPats</strong> combo-box. At the beginning of this post, I picked the predefined <strong>[fold threads] Thread -&gt; AllThreads</strong> grouping pattern. The starting text between <strong>[]</strong> is used as a title by Perfview to allow the user to more easily figure out what its role is. The rest of the string defines how parts of each frame should match and be grouped. The corresponding contextual help has already been shown earlier when the toolbar was detailed.</p>
<p>Here, I don’t want to group all sibling frames into a group but rather morph the text into something more readable. The part before <strong>-&gt;</strong> or <strong>=&gt;</strong> is used as matching pattern to be replaced by the part after the sign. It is also possible to “extract” elements between <strong>{}</strong> from the matching pattern to be used to build the replacement string. Each matched element is identified as <strong>$1</strong>, <strong>$2</strong>,… based on its position in the pattern.</p>
<p>In my example, I would like to apply the following transformation:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_7vPuOGJTB9pO9xboyvhfYw.png"></p>
<p>To write the pattern, you should focus on the separators (<strong>!</strong>, <strong>+&lt;</strong> and <strong>&gt;d__</strong> in this case):</p>
<p>{%}<strong>!</strong>{%}<strong>+&lt;</strong>{%}<strong>&gt;d__</strong>*</p>
<p>And the items to extract are what is left in between; identified as <strong>{%}</strong>.</p>
<p>The building of the replacement string is simply counting the matching item position (starting at 1):</p>
<p>(<strong>$1</strong>) <strong>$2</strong> ~~~async back to~~~ <strong>$3</strong>()</p>
<p>And here is the corresponding final result:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_YxhSEYtt_O9wCCQyCRpZMA.png"></p>
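<p>If you are more comfortable with regular expressions, the same transformation can be approximated in C#. This sketch assumes that <strong>%</strong> behaves roughly like <code>[\w.]*</code> and <strong>*</strong> like <code>.*</code>; the <code>Morph</code> helper and the sample frame name are hypothetical and only illustrate how the captured groups feed the replacement string.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Text.RegularExpressions;

public static class MorphSketch
{
    // Approximate regex translation of the pattern {%}!{%}+&lt;{%}&gt;d__*
    // Each {%} becomes a capture group, the separators are matched literally.
    private static readonly Regex Pattern =
        new Regex(@"([\w.]*)!([\w.]*)\+&lt;([\w.]*)&gt;d__.*");

    // Same replacement string as in the post: $1, $2 and $3 are the captured groups.
    private const string Replacement = "($1) $2 ~~~async back to~~~ $3()";

    public static string Morph(string frame)
    {
        return Pattern.Replace(frame, Replacement);
    }

    public static void Main()
    {
        // Hypothetical state machine frame: module!type+&lt;method&gt;d__N.MoveNext()
        Console.WriteLine(Morph("MyApp!MyNamespace.Program+&lt;Compute2&gt;d__3.MoveNext()"));
        // prints: (MyApp) MyNamespace.Program ~~~async back to~~~ Compute2()
    }
}
</code></pre></div>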
<h2 id="dont-lose-yourxxxpats">Don’t lose your xxxPats!</h2>
<p>It is interesting to note that you can define your own pattern presets via the <strong>Preset</strong> menu item:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1__Pje1UcZ0SSLM3po9E-YQw.png"></p>
<p>As you can see here, I have defined my own <strong>Criteo Arbitrage</strong> preset. If you want to reuse the content of the <strong>GroupPats</strong>, <strong>FoldPats</strong>, and <strong>Fold%</strong> combo-boxes, click <strong>Save As Preset</strong> (or even <strong>Set As Startup Preset</strong> to get them when you start Perfview) and pick a name:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_KZNIZSMPlil-cWsbA5GW9w.png"></p>
<p>Feel free to use the <strong>Manage Presets</strong> dialog for easier preset manipulation:</p>
<p><img loading="lazy" src="/posts/2021-03-02_how-to-ease-async/1_a9MDAqx0lbasO48canlt3Q.png"></p>
<p>I hope that, now, you better understand the value of Perfview to analyze complicated callstacks.</p>
]]></content:encoded></item><item><title>Understanding “reversed” callstacks in Visual Studio and Perfview with async/await code</title><link>https://chrisnas.github.io/posts/2021-01-19_understanding-reversed-callsta/</link><pubDate>Tue, 19 Jan 2021 17:16:08 +0000</pubDate><guid>https://chrisnas.github.io/posts/2021-01-19_understanding-reversed-callsta/</guid><description>This post explains why profilers like Visual Studio could display “reversed” callstacks when dealing with async/await code.</description><content:encoded><![CDATA[<hr>
<h2 id="introduction">Introduction</h2>
<p>With my colleague <a href="https://twitter.com/ezsilmar">Eugene</a>, we spent a long time analyzing the performance of one of Criteo's main applications with Perfview. The application processes thousands of requests in an asynchronous pipeline full of <strong>async</strong>/<strong>await</strong> calls. During our research, we ended up with weird callstacks that looked kind of “reversed”. The goal of this post is to describe why this could happen (even in Visual Studio).</p>
<h2 id="lets-see-the-result-of-profiling-in-visualstudio">Let’s see the result of profiling in Visual Studio</h2>
<p>I wrote a simple .NET Core application that simulates a few <strong>async</strong>/<strong>await</strong> calls:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">static</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;pid = {Process.GetCurrentProcess().Id}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;press ENTER to start...&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">ReadLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">ComputeAsync</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;press ENTER to exit...&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">ReadLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">ComputeAsync</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">Task</span><span class="p">.</span><span class="n">WhenAll</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">Compute1</span><span class="p">(),</span>
</span></span><span class="line"><span class="cl">        <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="n">Compute1</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span> 
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><code>ComputeAsync</code> is starting a bunch of tasks that will await other <strong>async</strong> methods:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">Compute1</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPU</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">Compute2</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPUAfterCompute2</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">Compute2</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPU</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">Compute3</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPUAfterCompute3</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">Compute3</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">Task</span><span class="p">.</span><span class="n">Delay</span><span class="p">(</span><span class="m">1000</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPUinCompute3</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;DONE&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Unlike the <code>Compute1</code> and <code>Compute2</code> methods, the last <code>Compute3</code> method waits 1 second before consuming some CPU. The CPU consumption is a square root computation done by the <code>ConsumeCPUXXX</code> helpers:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[MethodImpl(MethodImplOptions.NoInlining)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ConsumeCPUinCompute3</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPU</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">[MethodImpl(MethodImplOptions.NoInlining)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ConsumeCPUAfterCompute3</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPU</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">[MethodImpl(MethodImplOptions.NoInlining)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ConsumeCPUAfterCompute2</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ConsumeCPU</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ConsumeCPU</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="m">1000</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">j</span> <span class="p">&lt;</span> <span class="m">1000000</span><span class="p">;</span> <span class="n">j</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">Math</span><span class="p">.</span><span class="n">Sqrt</span><span class="p">((</span><span class="kt">double</span><span class="p">)</span><span class="n">j</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>From Visual Studio, profile the CPU usage of this test program via <strong>Debug | Performance Profiler…</strong></p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_19VeO5ExRqIdORWkPalEsg.png"></p>
<p>In the summary result panel, click the <strong>Open Details…</strong> link</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_mgUex0R8dO0KebQoOY3Dgw.png"></p>
<p>And pick the <strong>Call Tree</strong> view</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_pHREftYmqPo6_o16Q9UIDg.png"></p>
<p>You should see two paths of execution:</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_7bCbVvcXMaqyVnJGod52EA.png"></p>
<p>If you open the last one, you should see the expected chain of calls:</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_UaWjYDCs6nJMrwpCN-TPmg.png"></p>
<p>… if the methods were synchronous; which is not the case. So Visual Studio did a great job in dealing with the implementation details of <strong>async</strong>/<strong>await</strong> to present a nice call stack.</p>
<p>However, if you open the first node, you get something more disturbing:</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_32whbyr8dVBDlYx8Y37t7g.png"></p>
<p>… if you don’t know how <strong>async</strong>/<strong>await</strong> is implemented. My <code>Compute3</code> code is definitely not calling <code>Compute2</code>, which is not calling <code>Compute1</code> either! This is where Visual Studio's smart frame/callstack reconstruction brings more confusion than anything else. So what’s going on?</p>
<h2 id="understanding-asyncawait-implementation">Understanding async/await implementation</h2>
<p>Unlike Visual Studio that is hiding real calls, you should be able to see what methods are really called when analyzing a memory dump with <strong>dotnet-dump</strong> and the <strong>pstacks</strong> command:</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_jl7s91j1oVmCfOk-1FLx9Q.png"></p>
<p>If you follow the arrows from the bottom to the top, you should see the following synchronous calls (synchronous because they appear as frames in a thread callstack):</p>
<ul>
<li>a timer callback is calling <code>d__4.MoveNext()</code>: this corresponds to the end of the <code>Task.Delay</code> in the <code>Compute3</code> method.</li>
<li><code>d__3.MoveNext()</code> gets called to continue the code after <code>await Compute3</code></li>
<li><code>d__2.MoveNext()</code> gets called to continue the code after <code>await Compute2</code></li>
<li><code>ConsumeCPUAfterCompute2()</code> gets called as expected</li>
<li><code>ConsumeCPU()</code> or <code>ConsumeCPUinCompute3()</code> get called as expected</li>
</ul>
<p>All these fancy method names come from the “state machine” types that are generated by the C# compiler when you (1) define <strong>async</strong> methods that (2) await other <strong>async</strong> methods (or any “awaitable” object). Their role is to manage a “state machine” to execute code synchronously up to an <strong>await</strong> call, and again up to the next <strong>await</strong> call, and again and again until the method returns.</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_z5p1oSFQhsBrFPMsqBYSuQ.png"></p>
<p>All these <code>d__*</code> types contain fields corresponding to the local variables and parameters, if any, of each <strong>async</strong> method. For example, here is what is generated for the <code>ComputeAsync</code> and <code>Compute1/2/3</code> <strong>async</strong> methods, which have neither locals nor parameters:</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_YjVkaPmjcdGN1SHKjPSX5w.png"></p>
<p>The integer <code>&lt;&gt;1__state</code> field keeps track of the “execution state” of the machine. For example, after the state machine is created in <code>Compute1</code>, this field is set to <strong>-1</strong>:</p>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_WaPLv6vAl2h2vyQMZ4GdIw.png"></p>
<p>I don’t want to dig into the builder details but let’s just say that the <code>MoveNext</code> method of the state machine <code>d__2</code> gets executed (by the same thread).</p>
<p>Before looking at the <code>MoveNext</code> implementation corresponding to the <code>Compute1</code> method (without exception handling), keep in mind that it has to:</p>
<ul>
<li>run all code up to an <strong>await</strong> call,</li>
<li>change the “execution state” (more on this later)</li>
<li>do some magic to execute that code in another thread (if needed — more on this later)</li>
<li>come back to continue the execution of the code after the <strong>await</strong> call</li>
<li>and do that up to the next <strong>await</strong> call again and again</li>
</ul>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_Ip734O0vxHnhdDv3siEPWA.png"></p>
<p>Since <code>&lt;&gt;1__state</code> is -1, the first “synchronous” part of the code is executed (i.e. calling the <code>ConsumeCPU</code> method).</p>
<p>The <code>Compute2</code> method is then called to get the corresponding awaitable object (here a <code>Task</code>). If the task completes immediately (i.e. no <strong>await</strong> call in the <strong>async</strong> method, such as a simple <code>Task.FromResult()</code>), <code>IsCompleted</code> will return <strong>true</strong> and the code after the <strong>await</strong> call will be run by the same thread. Yes, it means that <strong>async</strong>/<strong>await</strong> calls could be run synchronously by the same thread: why create a thread when it is not needed?</p>
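<p>This synchronous path is easy to observe with a small test. The following sketch is not part of the original test application: the <code>RunsOnSameThread</code> helper is a hypothetical name, and it simply compares the managed thread id before and after awaiting an already completed task.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Threading.Tasks;

public static class SyncCompletionSketch
{
    // Returns true when the code after the await runs on the thread
    // that was executing the code before the await.
    public static async Task&lt;bool&gt; RunsOnSameThread()
    {
        int before = Environment.CurrentManagedThreadId;

        // The task is already completed: its awaiter reports IsCompleted == true,
        // so no continuation is scheduled and execution simply goes on.
        await Task.FromResult(42);

        int after = Environment.CurrentManagedThreadId;
        return before == after;
    }

    public static void Main()
    {
        Console.WriteLine(RunsOnSameThread().GetAwaiter().GetResult()); // prints True
    }
}
</code></pre></div>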
<p>If the Task is passed to the ThreadPool to be executed by a worker thread, the <code>&lt;&gt;1__state</code> value is set to <strong>0</strong> (so the next time <code>MoveNext</code> is called, the next “synchronous” part, i.e. the code after the <strong>await</strong> call, will be executed). The code now calls <code>AwaitUnsafeOnCompleted</code> to do its magic: adding a continuation to the <code>Compute2</code> task (the first <strong>awaiter</strong> parameter) so that <code>MoveNext</code> will be called on that same state machine (the second <strong>this</strong> parameter) when the task ends. The current thread then quietly returns.</p>
<p>So when the <code>Compute2</code> task ends, its continuation runs to call <code>MoveNext</code>, this time with <code>&lt;&gt;1__state</code> as <strong>0</strong>, so the last two lines are executed: <code>awaiter.GetResult()</code> returns immediately because the <code>Task</code> returned by <code>Compute2</code> already ended and the last <code>ConsumeCPUAfterCompute2</code> method is now called.</p>
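<p>For readers who cannot see the screenshots, the steps above can be approximated with a hand-written state machine. This is only a sketch that mimics the shape of the compiler-generated code: the type and field names (<code>Compute1StateMachine</code>, <code>State</code>, <code>Builder</code>) are mine, and the CPU consuming helpers are replaced by console output.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class AsyncSketch
{
    // Hand-written stand-in for the compiler-generated d__2 type of Compute1.
    private sealed class Compute1StateMachine : IAsyncStateMachine
    {
        public int State = -1;                   // plays the role of &lt;&gt;1__state
        public AsyncTaskMethodBuilder Builder;   // produces the Task returned by Compute1
        private TaskAwaiter _awaiter;            // kept while the method is suspended

        public void MoveNext()
        {
            try
            {
                TaskAwaiter awaiter;
                if (State != 0)
                {
                    // First "synchronous" part: everything up to the await.
                    Console.WriteLine("Compute1: before await");
                    awaiter = Compute2().GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        // Remember where to resume, register the continuation and return.
                        State = 0;
                        _awaiter = awaiter;
                        Compute1StateMachine self = this;
                        Builder.AwaitUnsafeOnCompleted(ref awaiter, ref self);
                        return;
                    }
                }
                else
                {
                    // Called back by the continuation: restore the awaiter.
                    awaiter = _awaiter;
                    _awaiter = default(TaskAwaiter);
                    State = -1;
                }

                // Second "synchronous" part: everything after the await.
                awaiter.GetResult();
                Console.WriteLine("Compute1: after await");
            }
            catch (Exception ex)
            {
                State = -2;
                Builder.SetException(ex);
                return;
            }

            State = -2;
            Builder.SetResult();
        }

        // Nothing to do for a class-based state machine.
        public void SetStateMachine(IAsyncStateMachine stateMachine) { }
    }

    // What is left of the "async" method: create the state machine and start it.
    public static Task Compute1()
    {
        var stateMachine = new Compute1StateMachine();
        stateMachine.Builder = AsyncTaskMethodBuilder.Create();
        stateMachine.Builder.Start(ref stateMachine);
        return stateMachine.Builder.Task;
    }

    private static async Task Compute2()
    {
        await Task.Delay(100);
    }

    public static void Main()
    {
        Compute1().GetAwaiter().GetResult();
    }
}
</code></pre></div>
<p>If you break in the second <code>Console.WriteLine</code>, the callstack should show <code>MoveNext</code> called from the continuation of the <code>Compute2</code> task and not from <code>Compute1</code>, which is the “reversed” shape discussed in this post.</p>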
<p>Here is a summary of what is happening:</p>
<ul>
<li>Each time you see an <strong>async</strong> method, the C# compiler is generating a dedicated state machine type with a <code>MoveNext</code> method that is responsible for executing your code synchronously between <strong>await</strong> calls</li>
<li>each time you see an <strong>await</strong> call, it means that a continuation will be added to the <code>Task</code> wrapping the <strong>async</strong> method to be executed. That continuation code will call the <code>MoveNext</code> method of the state machine of the calling method to execute the next piece of code up to its next <strong>await</strong> call.</li>
</ul>
<p><img loading="lazy" src="/posts/2021-01-19_understanding-reversed-callsta/1_Zf0qb_2Y8oKs9KO1Fli1rg.png"></p>
<p>This is why Visual Studio, trying to smartly match each async method state machine <code>MoveNext</code> frame to the method itself, shows reversed callstacks: the shown frames are the ones corresponding to the continuations after the <strong>await</strong> calls (in green in the previous figure).</p>
<p>Note that I described in more detail how <strong>async</strong>/<strong>await</strong> works and what <code>AwaitUnsafeOnCompleted</code> does during a <a href="https://youtu.be/8Ans2u4Bsi8?t=995">DotNext conference session</a> with <a href="https://twitter.com/KooKiz">Kevin</a>, so feel free to watch the <a href="https://youtu.be/8Ans2u4Bsi8?t=995">recording at that particular time</a> if you want to go deeper.</p>
<p>The next post will describe what to do in Perfview to get more readable callstacks.</p>
<h3 id="stay-tuned">Stay tuned!</h3>
]]></content:encoded></item><item><title>Build your own .NET CPU profiler in C#</title><link>https://chrisnas.github.io/posts/2020-12-08_build-your-own-net/</link><pubDate>Tue, 08 Dec 2020 10:27:37 +0000</pubDate><guid>https://chrisnas.github.io/posts/2020-12-08_build-your-own-net/</guid><description>After describing memory allocation profiling it is now time to dig into the CPU sample profiling in C#!</description><content:encoded><![CDATA[<hr>
<p>The last series described how to get details about your .NET application allocation patterns in C#.</p>
<ul>
<li><a href="/posts/2020-04-18_build-your-own-net/">Get a sampling of .NET application allocations</a></li>
<li><a href="/posts/2020-05-18_build-your-own-net/">A simple way to get the call stack</a></li>
<li><a href="/posts/2020-06-19_build-your-own-net/">Getting the call stack by hand</a></li>
</ul>
<p>It is now time to do the same but for the CPU consumption of your .NET applications.</p>
<h2 id="thanks-you-mr-windowskernel">Thank you Mr Windows Kernel!</h2>
<p>Under Windows, the kernel ETW provider allows you to get notified every millisecond with the call stack of all threads running on a core. Unsurprisingly, it is easy to listen to these events with TraceEvent. As explained in an <a href="/posts/2018-07-26_grab-etw-session-providers/">old post</a>, you simply need to create a session, enable providers and listen to the right event.</p>
<p>For sampled CPU profiling, I’m using the <code>TraceLogEventSource</code> to wrap the event source and automatically get the stack frames symbol resolution:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">string</span> <span class="n">sessionName</span> <span class="p">=</span> <span class="s">&#34;Cpu_Profiling_Session+&#34;</span> <span class="p">+</span> <span class="n">Guid</span><span class="p">.</span><span class="n">NewGuid</span><span class="p">().</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">_session</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TraceEventSession</span><span class="p">(</span><span class="n">sessionName</span><span class="p">,</span> <span class="n">TraceEventSessionOptions</span><span class="p">.</span><span class="n">Create</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(!</span><span class="n">EnableProviders</span><span class="p">(</span><span class="n">_session</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_session</span><span class="p">.</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">_session</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">_profilingTask</span> <span class="p">=</span> <span class="n">Task</span><span class="p">.</span><span class="n">Factory</span><span class="p">.</span><span class="n">StartNew</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">using</span> <span class="p">(</span><span class="n">TraceLogEventSource</span> <span class="n">source</span> <span class="p">=</span> <span class="n">TraceLog</span><span class="p">.</span><span class="n">CreateFromTraceEventSession</span><span class="p">(</span><span class="n">_session</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// CPU sampling kernel events</span>
</span></span><span class="line"><span class="cl">        <span class="n">source</span><span class="p">.</span><span class="n">Kernel</span><span class="p">.</span><span class="n">PerfInfoSample</span> <span class="p">+=</span> <span class="p">(</span><span class="n">SampledProfileTraceData</span> <span class="n">data</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="p">...</span>
</span></span><span class="line"><span class="cl">        <span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// this call exits when the session is stopped</span>
</span></span><span class="line"><span class="cl">        <span class="n">source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You need to enable three providers:</p>
<ul>
<li>Kernel: get the profiling event every millisecond and be notified when a DLL gets loaded by a process to let TraceEvent manage the symbols</li>
<li>Clr: get the JIT events describing managed method details</li>
<li>ClrRundown: get the details of the methods that have already been JITted</li>
</ul>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">protected</span> <span class="kt">bool</span> <span class="n">EnableProviders</span><span class="p">(</span><span class="n">TraceEventSession</span> <span class="n">session</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">session</span><span class="p">.</span><span class="n">BufferSizeMB</span> <span class="p">=</span> <span class="m">256</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// Note: it could fail if the user does not have the required privileges</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">success</span> <span class="p">=</span> <span class="n">session</span><span class="p">.</span><span class="n">EnableKernelProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">ImageLoad</span> <span class="p">|</span>
</span></span><span class="line"><span class="cl">        <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Process</span> <span class="p">|</span>
</span></span><span class="line"><span class="cl">        <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Profile</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">stackCapture</span><span class="p">:</span> <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Profile</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">success</span><span class="p">)</span> <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this call always returns false  :^(</span>
</span></span><span class="line"><span class="cl">    <span class="n">session</span><span class="p">.</span><span class="n">EnableProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">ProviderGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">TraceEventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="kt">ulong</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// events related to JITed methods</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Jit</span> <span class="p">|</span>                       <span class="c1">// Turning on JIT events is necessary to resolve JIT compiled code </span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">JittedMethodILToNativeMap</span> <span class="p">|</span> <span class="c1">// This is needed if you want line number information in the stacks</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Loader</span>                      <span class="c1">// You must include loader events as well to resolve JIT compiled code.</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this provider will send events of already JITed methods</span>
</span></span><span class="line"><span class="cl">    <span class="n">session</span><span class="p">.</span><span class="n">EnableProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrRundownTraceEventParser</span><span class="p">.</span><span class="n">ProviderGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">TraceEventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="kt">ulong</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Jit</span> <span class="p">|</span>              <span class="c1">// We need JIT events to be rundown to resolve method names</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">JittedMethodILToNativeMap</span> <span class="p">|</span> <span class="c1">// This is needed if you want line number information in the stacks</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Loader</span> <span class="p">|</span>           <span class="c1">// As well as the module load events.  </span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">StartEnumeration</span>   <span class="c1">// This indicates to do the rundown now (at enable time)</span>
</span></span><span class="line"><span class="cl">        <span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The code to handle the event is really simple:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">Kernel</span><span class="p">.</span><span class="n">PerfInfoSample</span> <span class="p">+=</span> <span class="p">(</span><span class="n">SampledProfileTraceData</span> <span class="n">data</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span> <span class="p">!=</span> <span class="n">Pid</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">callstack</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">CallStack</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">callstack</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">MergeCallStack</span><span class="p">(</span><span class="n">callstack</span><span class="p">,</span> <span class="n">Reader</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I’m only interested in profiling a given process (hence the check on the process ID) and in events that carry a call stack. The call stack is returned by the <code>CallStack()</code> extension method (see a <a href="/posts/2020-05-18_build-your-own-net/">previous post</a> for more details). The main processing is done by the <code>MergeCallStack()</code> method. But before looking at the only complicated part, it is time to discuss a useful tip.</p>
<h2 id="tip-use-etlxluke">Tip: use ETLx, Luke!</h2>
<p>As in the previous posts about memory profiling, my goal is to demonstrate how to monitor applications as they run. However, when you monitor an application’s CPU consumption, you would like to avoid any noisy neighbor that could hijack some cores. So minimizing the work done by your profiling code is always a good idea. In addition, it could also be valuable to record the events and analyze them later. Microsoft <a href="https://github.com/microsoft/perfview/releases">PerfView</a> is the open source tool that I’m using the most to dig into CPU consumption. So the solution is to simply record the events and generate an .etlx file for PerfView.</p>
<p>The first code change is small: the session is created with a filename.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">string</span> <span class="n">sessionName</span> <span class="p">=</span> <span class="s">&#34;Cpu_Profiling_Session+&#34;</span> <span class="p">+</span> <span class="n">Guid</span><span class="p">.</span><span class="n">NewGuid</span><span class="p">().</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">_session</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TraceEventSession</span><span class="p">(</span><span class="n">sessionName</span><span class="p">,</span> <span class="n">_filename</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>I’m using a naming convention that contains the ID of the process I want to monitor, so it is easy to identify the recording when I analyze it in PerfView:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">profiler</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EtlCpuSampleProfiler</span><span class="p">(</span><span class="s">$&#34;trace-{parameters.pid}.etl&#34;</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The second step, generating the .etlx file, is a one-liner:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">traceLog</span> <span class="p">=</span> <span class="n">TraceLog</span><span class="p">.</span><span class="n">OpenOrConvert</span><span class="p">(</span><span class="n">_filename</span><span class="p">,</span> <span class="k">new</span> <span class="n">TraceLogOptions</span><span class="p">()</span> <span class="p">{</span> <span class="n">ConversionLog</span> <span class="p">=</span> <span class="n">SymbolMessages</span> <span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>ConversionLog</code> property of <code>TraceLogOptions</code> expects a <code>TextWriter</code> that receives all the messages related to symbol resolution.</p>
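<p>Any <code>TextWriter</code> can play that role. Here is a minimal sketch, assuming you want to keep the messages in memory with a <code>StringWriter</code> and look at them after the conversion. The <code>SymbolLog</code> class and its <code>GetMessages()</code> helper are hypothetical names used for illustration; they are not part of TraceEvent.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.IO;

// Hypothetical helper (not part of TraceEvent)
public static class SymbolLog
{
    // Returns the non-empty lines written to the in-memory log so far
    public static string[] GetMessages(StringWriter log)
    {
        return log.ToString().Split(
            new[] { '\r', '\n' },
            StringSplitOptions.RemoveEmptyEntries);
    }
}
</code></pre></div>
<p>Passing such a <code>StringWriter</code> as <code>ConversionLog</code> and calling <code>SymbolLog.GetMessages()</code> once <code>OpenOrConvert()</code> returns would list what was logged during the conversion.</p>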
<p>The kernel profiling samples are then parsed from the <code>TraceLog</code> in a more manual way: the events are selected based on the <code>TaskGuid</code> of the kernel profiling task and on their <code>Opcode</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// parse profiling kernel events</span>
</span></span><span class="line"><span class="cl"><span class="c1">// from https://github.com/microsoft/perfview/blob/master/src/TraceEvent/Samples/41_TraceLogMonitor.cs#L150</span>
</span></span><span class="line"><span class="cl"><span class="c1">// from https://docs.microsoft.com/en-us/windows/win32/etw/perfinfo</span>
</span></span><span class="line"><span class="cl"><span class="c1">// from https://github.com/microsoft/perfview/blob/master/src/TraceEvent/Parsers/KernelTraceEventParser.cs#L3128</span>
</span></span><span class="line"><span class="cl"><span class="c1">// and https://github.com/microsoft/perfview/blob/master/src/TraceEvent/Parsers/KernelTraceEventParser.cs#L2298</span>
</span></span><span class="line"><span class="cl"><span class="c1">//</span>
</span></span><span class="line"><span class="cl"><span class="n">Guid</span> <span class="n">perfInfoTaskGuid</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Guid</span><span class="p">(</span><span class="m">0xce1dbfb4</span><span class="p">,</span> <span class="m">0x137e</span><span class="p">,</span> <span class="m">0x4da6</span><span class="p">,</span> <span class="m">0x87</span><span class="p">,</span> <span class="m">0xb0</span><span class="p">,</span> <span class="m">0x3f</span><span class="p">,</span> <span class="m">0x59</span><span class="p">,</span> <span class="m">0xaa</span><span class="p">,</span> <span class="m">0x10</span><span class="p">,</span> <span class="m">0x2c</span><span class="p">,</span> <span class="m">0xbc</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kt">int</span> <span class="n">profileOpcode</span> <span class="p">=</span> <span class="m">46</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">data</span> <span class="k">in</span> <span class="n">traceLog</span><span class="p">.</span><span class="n">Events</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span> <span class="p">!=</span> <span class="n">Pid</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">TaskGuid</span> <span class="p">!=</span> <span class="n">perfInfoTaskGuid</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">((</span><span class="kt">uint</span><span class="p">)</span><span class="n">data</span><span class="p">.</span><span class="n">Opcode</span> <span class="p">!=</span> <span class="n">profileOpcode</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">callstack</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">CallStack</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">callstack</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">MergeCallStack</span><span class="p">(</span><span class="n">callstack</span><span class="p">,</span> <span class="n">Reader</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="how-to-merge-callstacks">How to “merge” call stacks</h2>
<p>In both the live and the file-based implementations, I end up merging call stacks by calling the <code>MergeCallStack()</code> method. Instead of jumping directly into the C# code, I prefer to describe what I’m expecting from “merging” call stacks.</p>
<p>If you think about which frames (i.e. method calls) would appear at the beginning of all these threads’ call stacks, it seems obvious that they should start with the same code: either the main thread startup, timer/thread pool initialization or custom thread bootstrap. In the case of server applications, the same request processing calls would lead to the code of specific handlers or controllers. Each time a common group of frames appears in different call stacks, it is more readable to show them as different branches growing from the same trunk, as in the Visual Studio Parallel Stacks window.</p>
<p><img loading="lazy" src="/posts/2020-12-08_build-your-own-net/1_Q6ry2HMPlwOTHGrhW0Avpg.png"></p>
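<p>The trunk-and-branches idea can be sketched independently of TraceEvent. What follows is only a minimal illustration, not the <code>MergedSymbolicStacks</code> implementation of this post: frames are simplified to plain strings, the <code>StackNode</code> type and its members are hypothetical names, and I’m assuming that a frame counts “as leaf” when a sample ends on it and “as node” when a sample passes through it.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical simplified tree node: a frame is a plain string here
public class StackNode
{
    public string Frame { get; }
    public int CountAsNode { get; private set; }  // samples passing through this frame
    public int CountAsLeaf { get; private set; }  // samples ending on this frame
    public List&lt;StackNode&gt; Children { get; } = new List&lt;StackNode&gt;();

    public StackNode(string frame)
    {
        Frame = frame;
    }

    // frames are ordered from the thread entry point (index 0) to the sampled method
    public void AddStack(IReadOnlyList&lt;string&gt; frames, int index = 0)
    {
        if (index &gt;= frames.Count) return;

        // reuse the branch that already starts with this frame or create a new one
        var child = Children.Find(c =&gt; c.Frame == frames[index]);
        if (child == null)
        {
            child = new StackNode(frames[index]);
            Children.Add(child);
        }

        if (index == frames.Count - 1)
        {
            // last frame of the sample
            child.CountAsLeaf++;
        }
        else
        {
            // intermediate frame: count it and continue with the next frame
            child.CountAsNode++;
            child.AddStack(frames, index + 1);
        }
    }
}
</code></pre></div>
<p>With this sketch, adding the stacks <code>Main/Run/Compute</code>, <code>Main/Run/Sleep</code> and <code>Main/Run/Compute</code> to an empty root gives a single <code>Main</code> child counted 3 times as node, a single <code>Run</code> grandchild also counted 3 times as node, and two branches under it: <code>Compute</code> counted twice as leaf and <code>Sleep</code> counted once.</p>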
<p>In order to build a “visual” representation, I have to count the number of times each frame appears at the same place in the recorded call stacks. My data structure looks like a tree where each node contains the current frame, the sampling count (as node or as leaf) and a list of different child frames corresponding to the different execution branches:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">MergedSymbolicStacks</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">int</span> <span class="n">_countAsNode</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">int</span> <span class="n">_countAsLeaf</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">Frame</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="kd">private</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">Symbol</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="kd">private</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">CountAsNode</span> <span class="p">=&gt;</span> <span class="n">_countAsNode</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">CountAsLeaf</span> <span class="p">=&gt;</span> <span class="n">_countAsLeaf</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">MergedSymbolicStacks</span><span class="p">&gt;</span> <span class="n">Stacks</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each frame contains both the address and the method signature that have been extracted from the callstack retrieved from the events:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">protected</span> <span class="k">void</span> <span class="n">MergeCallStack</span><span class="p">(</span><span class="n">TraceCallStack</span> <span class="n">callStack</span><span class="p">,</span> <span class="n">SymbolReader</span> <span class="n">reader</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">currentFrame</span> <span class="p">=</span> <span class="n">callStack</span><span class="p">.</span><span class="n">Depth</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">frames</span> <span class="p">=</span> <span class="k">new</span> <span class="n">SymbolicFrame</span><span class="p">[</span><span class="n">callStack</span><span class="p">.</span><span class="n">Depth</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// the first element of callstack is the last frame: we need to iterate on each frame</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// up to the first one before adding them into the MergedSymbolicStack</span>
</span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="n">callStack</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">codeAddress</span> <span class="p">=</span> <span class="n">callStack</span><span class="p">.</span><span class="n">CodeAddress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">codeAddress</span><span class="p">.</span><span class="n">Method</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">moduleFile</span> <span class="p">=</span> <span class="n">codeAddress</span><span class="p">.</span><span class="n">ModuleFile</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">moduleFile</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="c1">// TODO: this seems to trigger extremely slow retrieval of symbols </span>
</span></span><span class="line"><span class="cl">                <span class="c1">//       through HTTP requests: see how to delay it AFTER the user</span>
</span></span><span class="line"><span class="cl">                <span class="c1">//       stops the profiling</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(!</span><span class="n">_missingSymbols</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">moduleFile</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">_</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">                <span class="p">{</span>
</span></span><span class="line"><span class="cl">                    <span class="n">codeAddress</span><span class="p">.</span><span class="n">CodeAddresses</span><span class="p">.</span><span class="n">LookupSymbolsForModule</span><span class="p">(</span><span class="n">reader</span><span class="p">,</span> <span class="n">moduleFile</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">                    <span class="k">if</span> <span class="p">(</span><span class="n">codeAddress</span><span class="p">.</span><span class="n">Method</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">                    <span class="p">{</span>
</span></span><span class="line"><span class="cl">                        <span class="n">_missingSymbols</span><span class="p">[</span><span class="n">moduleFile</span><span class="p">]</span> <span class="p">=</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                    <span class="p">}</span>
</span></span><span class="line"><span class="cl">                <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">frames</span><span class="p">[--</span><span class="n">currentFrame</span><span class="p">]</span> <span class="p">=</span> <span class="k">new</span> <span class="n">SymbolicFrame</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="n">codeAddress</span><span class="p">.</span><span class="n">Address</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">codeAddress</span><span class="p">.</span><span class="n">FullMethodName</span>
</span></span><span class="line"><span class="cl">            <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">callStack</span> <span class="p">=</span> <span class="n">callStack</span><span class="p">.</span><span class="n">Caller</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">_stackCount</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_stacks</span><span class="p">.</span><span class="n">AddStack</span><span class="p">(</span><span class="n">frames</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>MergedSymbolicStacks.AddStack()</code> method does the real merging. The idea is to start from the bottom of the stack: if the frame has already been seen at this position, increment its sampling count; if not, remember it first and then increment its count. Then move to the next frame and repeat the same match/remember + increment step up to the top of the stack.</p>
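<p>As a rough illustration of this counting rule, here is a minimal, self-contained sketch. The <code>Node</code> class, its single <code>Count</code> field and the dictionary of children are assumptions made for the example; the real implementation shown in this post stores its children in a list and keeps separate node and leaf counters.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical, simplified node: NOT the MergedSymbolicStacks class of this post
public class Node
{
    public string Symbol { get; }

    // number of sampled stacks that go through this frame at this position
    public int Count { get; private set; }

    public Dictionary&lt;string, Node&gt; Children { get; } =
        new Dictionary&lt;string, Node&gt;(StringComparer.Ordinal);

    public Node(string symbol) { Symbol = symbol; }

    // frames are ordered from the bottom of the stack (first caller) to the top
    public void Add(string[] frames)
    {
        var current = this;
        current.Count++;  // the root counts every merged stack

        foreach (var symbol in frames)
        {
            if (!current.Children.TryGetValue(symbol, out var child))
            {
                // first time this frame is seen at this position: remember it
                child = new Node(symbol);
                current.Children.Add(symbol, child);
            }

            child.Count++;
            current = child;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var root = new Node(string.Empty);
        root.Add(new[] { "Main", "A", "B" });
        root.Add(new[] { "Main", "A", "C" });
        root.Add(new[] { "Main", "D" });

        var main = root.Children["Main"];
        Console.WriteLine(main.Count);                // 3
        Console.WriteLine(main.Children["A"].Count);  // 2
        Console.WriteLine(main.Children["D"].Count);  // 1
    }
}
</code></pre></div>
<p>The three stacks share the <code>Main</code> frame, so it is stored once with a count of 3, while <code>A</code> and <code>D</code> become two branches with counts of 2 and 1.</p>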
<p>Here is an animation of what it would look like on a piece of paper (like the one I wrote down before starting to write the C# implementation :^)</p>
<p><img loading="lazy" src="/posts/2020-12-08_build-your-own-net/1_31F18a8E4cGDevn4-V_pog.gif"></p>
<p>Here is the corresponding C# code to merge a stack (i.e. an array of frames):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">void</span> <span class="n">AddStack</span><span class="p">(</span><span class="n">SymbolicFrame</span><span class="p">[]</span> <span class="n">frames</span><span class="p">,</span> <span class="kt">int</span> <span class="n">index</span> <span class="p">=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_countAsNode</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">firstFrame</span> <span class="p">=</span> <span class="n">frames</span><span class="p">[</span><span class="n">index</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// search if the frame to add has already been seen</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">callstack</span> <span class="p">=</span> <span class="n">Stacks</span><span class="p">.</span><span class="n">FirstOrDefault</span><span class="p">(</span><span class="n">s</span> <span class="p">=&gt;</span> <span class="kt">string</span><span class="p">.</span><span class="n">CompareOrdinal</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">Symbol</span><span class="p">,</span> <span class="n">firstFrame</span><span class="p">.</span><span class="n">Symbol</span><span class="p">)</span> <span class="p">==</span> <span class="m">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// if not, we are starting a new branch</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">callstack</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">callstack</span> <span class="p">=</span> <span class="k">new</span> <span class="n">MergedSymbolicStacks</span><span class="p">(</span><span class="n">frames</span><span class="p">[</span><span class="n">index</span><span class="p">].</span><span class="n">Address</span><span class="p">,</span> <span class="n">frames</span><span class="p">[</span><span class="n">index</span><span class="p">].</span><span class="n">Symbol</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Stacks</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">callstack</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// it was the last frame of the stack</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">index</span> <span class="p">==</span> <span class="n">frames</span><span class="p">.</span><span class="n">Length</span> <span class="p">-</span> <span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">callstack</span><span class="p">.</span><span class="n">_countAsLeaf</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">callstack</span><span class="p">.</span><span class="n">AddStack</span><span class="p">(</span><span class="n">frames</span><span class="p">,</span> <span class="n">index</span> <span class="p">+</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Last but not least, the constructors of the class show how to create (1) the root instance and (2) each node in the tree:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">MergedSymbolicStacks</span><span class="p">()</span> <span class="p">:</span> <span class="k">this</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="kt">string</span><span class="p">.</span><span class="n">Empty</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// this will be the root of all stacks</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">MergedSymbolicStacks</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">frame</span><span class="p">,</span> <span class="kt">string</span> <span class="n">symbol</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Frame</span> <span class="p">=</span> <span class="n">frame</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">Symbol</span> <span class="p">=</span> <span class="n">symbol</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_countAsNode</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_countAsLeaf</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">Stacks</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">MergedSymbolicStacks</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The code to render the merged stack</p>
<p><img loading="lazy" src="/posts/2020-12-08_build-your-own-net/1_WIzDdkN_0nbUFiNnmKXe7A.png"></p>
<p>is not that complicated because everything is already in the tree of frames.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">RenderStack</span><span class="p">(</span><span class="n">MergedSymbolicStacks</span> <span class="n">stack</span><span class="p">,</span> <span class="n">IRenderer</span> <span class="n">visitor</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">isRoot</span><span class="p">,</span> <span class="kt">int</span> <span class="n">increment</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">alignment</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">string</span><span class="p">(</span><span class="sc">&#39; &#39;</span><span class="p">,</span> <span class="n">Padding</span> <span class="p">*</span> <span class="n">increment</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">padding</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">string</span><span class="p">(</span><span class="sc">&#39; &#39;</span><span class="p">,</span> <span class="n">Padding</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">currentFrame</span> <span class="p">=</span> <span class="n">stack</span><span class="p">.</span><span class="n">Frame</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// special root case</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">isRoot</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">visitor</span><span class="p">.</span><span class="n">WriteCount</span><span class="p">(</span><span class="s">$&#34;{Environment.NewLine}{alignment}{stack.CountAsNode, Padding} &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="n">visitor</span><span class="p">.</span><span class="n">WriteCount</span><span class="p">(</span><span class="s">$&#34;{Environment.NewLine}{alignment}{stack.CountAsLeaf + stack.CountAsNode, Padding} &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">visitor</span><span class="p">.</span><span class="n">WriteMethod</span><span class="p">(</span><span class="n">stack</span><span class="p">.</span><span class="n">Symbol</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">childrenCount</span> <span class="p">=</span> <span class="n">stack</span><span class="p">.</span><span class="n">Stacks</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">childrenCount</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">visitor</span><span class="p">.</span><span class="n">WriteFrameSeparator</span><span class="p">(</span><span class="s">&#34;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">nextStackFrame</span> <span class="k">in</span> <span class="n">stack</span><span class="p">.</span><span class="n">Stacks</span><span class="p">.</span><span class="n">OrderByDescending</span><span class="p">(</span><span class="n">s</span> <span class="p">=&gt;</span> <span class="n">s</span><span class="p">.</span><span class="n">CountAsNode</span> <span class="p">+</span> <span class="n">s</span><span class="p">.</span><span class="n">CountAsLeaf</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// increment when more than 1 children</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">childIncrement</span> <span class="p">=</span> <span class="p">(</span><span class="n">childrenCount</span> <span class="p">==</span> <span class="m">1</span><span class="p">)</span> <span class="p">?</span> <span class="n">increment</span> <span class="p">:</span> <span class="n">increment</span> <span class="p">+</span> <span class="m">1</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">RenderStack</span><span class="p">(</span><span class="n">nextStackFrame</span><span class="p">,</span> <span class="n">visitor</span><span class="p">,</span> <span class="kc">false</span><span class="p">,</span> <span class="n">childIncrement</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">increment</span> <span class="p">!=</span> <span class="n">childIncrement</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">visitor</span><span class="p">.</span><span class="n">WriteFrameSeparator</span><span class="p">(</span><span class="s">$&#34;{Environment.NewLine}{alignment}{padding}{nextStackFrame.CountAsNode + nextStackFrame.CountAsLeaf, Padding} &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">visitor</span><span class="p">.</span><span class="n">WriteFrameSeparator</span><span class="p">(</span><span class="s">$&#34;~~~~ &#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>IRenderer</code> interface implementations simply change the foreground color depending on the kind of information to display.</p>
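<p>A console implementation could look like the following sketch. The three method names are the ones called by <code>RenderStack()</code>; the parameter names, the <code>ConsoleRenderer</code> class and the colors are assumptions made for the example, not the actual code of the tool.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Method names come from the RenderStack() calls; the parameter names
// and the colors below are assumptions made for this sketch.
public interface IRenderer
{
    void WriteCount(string count);
    void WriteMethod(string method);
    void WriteFrameSeparator(string text);
}

// Hypothetical renderer: one color per kind of information
public class ConsoleRenderer : IRenderer
{
    public void WriteCount(string count) =&gt; Write(count, ConsoleColor.Cyan);
    public void WriteMethod(string method) =&gt; Write(method, ConsoleColor.White);
    public void WriteFrameSeparator(string text) =&gt; Write(text, ConsoleColor.DarkGray);

    private static void Write(string text, ConsoleColor color)
    {
        // restore the previous color even if the write fails
        var previous = Console.ForegroundColor;
        Console.ForegroundColor = color;
        try
        {
            Console.Write(text);
        }
        finally
        {
            Console.ForegroundColor = previous;
        }
    }
}
</code></pre></div>
<p>With this shape, rendering to another target (an HTML page for example) only requires another class that implements the same three methods; the tree traversal itself does not change.</p>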
<p>I have used the same “Visitor” pattern for the <a href="https://github.com/chrisnas/DebuggingExtensions/tree/master/src/ParallelStacks.Runtime"><strong>pstack</strong></a> tool/extension for WinDbg.</p>
<h2 id="not-for-adminonly">Not for Admin only</h2>
<p>I always thought that I needed to be a member of the Administrators group and to run elevated to be allowed to start a kernel profiling session. Well… This is in fact not the case! You have to dig into the documentation for <a href="https://docs.microsoft.com/en-us/windows/win32/etw/configuring-and-starting-a-systemtraceprovider-session">configuring and starting a <strong>SystemTraceProvider</strong> session</a> to read the following note:</p>
<p>If you want a non-administrators or a non-TCB process to be able to start a profiling trace session using the <code>SystemTraceProvider</code> on behalf of third party applications, then you need to grant the user profile privilege and then add this user to both the session <strong>GUID</strong> (created for the logger session) and the system trace provider <strong>GUID</strong> to enable the system trace provider. For more information, see the <a href="https://docs.microsoft.com/en-us/windows/desktop/api/Evntcons/nf-evntcons-eventaccesscontrol"><strong>EventAccessControl</strong></a> function.</p>
<p>Long story short, the user needs to be a member of the <strong>Performance Log Users</strong> group (makes sense) or be granted the <code>TRACELOG_ACCESS_REALTIME</code> permission. Obviously, you need an administrator account to do either, but this can be done once per machine by your IT department in a secure way.</p>
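<p>Before starting a session, a tool could check the group membership of the current user. The following is only a sketch for Windows: the <code>ProfilingAccessCheck</code> class and its method name are hypothetical, and the check relies on the well-known SID of the built-in Performance Log Users group.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Security.Principal;

// Hypothetical helper, Windows only
public static class ProfilingAccessCheck
{
    // Returns true if the token of the current user contains the built-in
    // "Performance Log Users" group
    public static bool IsPerformanceLogUser()
    {
        using (var identity = WindowsIdentity.GetCurrent())
        {
            var principal = new WindowsPrincipal(identity);
            var group = new SecurityIdentifier(
                WellKnownSidType.BuiltinPerformanceLoggingUsersSid,
                null);  // no domain SID is needed for a built-in group

            return principal.IsInRole(group);
        }
    }
}
</code></pre></div>
<p>Keep in mind that a user who has just been added to the group usually has to log off and log on again before the membership appears in the token.</p>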
<p>I wrapped a managed implementation of the corresponding code to add the permission in a <code>ProfilingPermission</code> class that hides all the P/Invoke and weird marshalling needed to call the native Windows API. Simply pass a user name to <code>EnableProfilerUser()</code> and it should work just fine.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="k">class</span> <span class="nc">ProfilingPermission</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kd">const</span> <span class="kt">uint</span> <span class="n">TRACELOG_GUID_ENABLE</span> <span class="p">=</span> <span class="m">0x0080</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kd">const</span> <span class="kt">int</span> <span class="n">NO_ERROR</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>  <span class="c1">// ERROR_SUCCESS in C++</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kd">const</span> <span class="kt">int</span> <span class="n">ERROR_INSUFFICIENT_BUFFER</span> <span class="p">=</span> <span class="m">122</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// read https://docs.microsoft.com/en-us/windows/win32/etw/configuring-and-starting-a-systemtraceprovider-session </span>
</span></span><span class="line"><span class="cl">    <span class="c1">// for more details </span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">EnableProfilerUser</span><span class="p">(</span><span class="kt">string</span> <span class="n">accountName</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// Kernel provider from https://github.com/microsoft/perfview/blob/master/src/TraceEvent/Parsers/KernelTraceEventParser.cs#L43</span>
</span></span><span class="line"><span class="cl">        <span class="n">Guid</span> <span class="n">kernelProviderGuid</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Guid</span><span class="p">(</span><span class="s">&#34;{9e814aad-3204-11d2-9a82-006008a86939}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="kt">byte</span><span class="p">[]</span> <span class="n">sid</span> <span class="p">=</span> <span class="n">LookupSidByName</span><span class="p">(</span><span class="n">accountName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// from https://docs.microsoft.com/en-us/windows/win32/etw/configuring-and-starting-a-systemtraceprovider-session</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">operation</span> <span class="p">=</span> <span class="p">(</span><span class="kt">uint</span><span class="p">)</span><span class="n">EventSecurityOperation</span><span class="p">.</span><span class="n">EventSecurityAddDACL</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">rights</span> <span class="p">=</span> <span class="n">TRACELOG_GUID_ENABLE</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kt">bool</span> <span class="n">allowOrDeny</span> <span class="p">=</span> <span class="kc">true</span><span class="p">;</span> <span class="c1">// true means ALLOW</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">result</span> <span class="p">=</span> <span class="n">EventAccessControl</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="k">ref</span> <span class="n">kernelProviderGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">operation</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">sid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">rights</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="n">allowOrDeny</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">result</span> <span class="p">!=</span> <span class="n">NO_ERROR</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">lastErrorMessage</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Win32Exception</span><span class="p">((</span><span class="kt">int</span><span class="p">)</span><span class="n">result</span><span class="p">).</span><span class="n">Message</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">$&#34;Failed to add ACL ({result.ToString()}) : {lastErrorMessage}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kd">static</span> <span class="kt">byte</span><span class="p">[]</span> <span class="n">LookupSidByName</span><span class="p">(</span><span class="kt">string</span> <span class="n">accountName</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">byte</span><span class="p">[]</span> <span class="n">sid</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">cbSid</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">StringBuilder</span> <span class="n">referencedDomainName</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StringBuilder</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">cchReferencedDomainName</span> <span class="p">=</span> <span class="p">(</span><span class="kt">uint</span><span class="p">)</span><span class="n">referencedDomainName</span><span class="p">.</span><span class="n">Capacity</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">SID_NAME_USE</span> <span class="n">sidUse</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">int</span> <span class="n">err</span> <span class="p">=</span> <span class="n">NO_ERROR</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(!</span><span class="n">LookupAccountName</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="n">accountName</span><span class="p">,</span> <span class="n">sid</span><span class="p">,</span> <span class="k">ref</span> <span class="n">cbSid</span><span class="p">,</span> <span class="n">referencedDomainName</span><span class="p">,</span> <span class="k">ref</span> <span class="n">cchReferencedDomainName</span><span class="p">,</span> <span class="k">out</span> <span class="n">sidUse</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">err</span> <span class="p">=</span> <span class="n">Marshal</span><span class="p">.</span><span class="n">GetLastWin32Error</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">err</span> <span class="p">==</span> <span class="n">ERROR_INSUFFICIENT_BUFFER</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">sid</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="p">[</span><span class="n">cbSid</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">                <span class="n">referencedDomainName</span><span class="p">.</span><span class="n">EnsureCapacity</span><span class="p">((</span><span class="kt">int</span><span class="p">)</span><span class="n">cchReferencedDomainName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">                <span class="n">err</span> <span class="p">=</span> <span class="n">NO_ERROR</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="p">(!</span><span class="n">LookupAccountName</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="n">accountName</span><span class="p">,</span> <span class="n">sid</span><span class="p">,</span> <span class="k">ref</span> <span class="n">cbSid</span><span class="p">,</span> <span class="n">referencedDomainName</span><span class="p">,</span> <span class="k">ref</span> <span class="n">cchReferencedDomainName</span><span class="p">,</span> <span class="k">out</span> <span class="n">sidUse</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">                    <span class="n">err</span> <span class="p">=</span> <span class="n">Marshal</span><span class="p">.</span><span class="n">GetLastWin32Error</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">err</span> <span class="p">!=</span> <span class="n">NO_ERROR</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">lastErrorMessage</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Win32Exception</span><span class="p">(</span><span class="n">err</span><span class="p">).</span><span class="n">Message</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">$&#34;LookupAccountName fails ({err.ToString()}) : {lastErrorMessage}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// display the SID associated to the given user</span>
</span></span><span class="line"><span class="cl">        <span class="n">IntPtr</span> <span class="n">ptrSid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(!</span><span class="n">ConvertSidToStringSid</span><span class="p">(</span><span class="n">sid</span><span class="p">,</span> <span class="k">out</span> <span class="n">ptrSid</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">err</span> <span class="p">=</span> <span class="n">Marshal</span><span class="p">.</span><span class="n">GetLastWin32Error</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">lastErrorMessage</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Win32Exception</span><span class="p">(</span><span class="n">err</span><span class="p">).</span><span class="n">Message</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;No SID string associated to user {accountName} ({err.ToString()}) : {lastErrorMessage}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">string</span> <span class="n">sidString</span> <span class="p">=</span> <span class="n">Marshal</span><span class="p">.</span><span class="n">PtrToStringAuto</span><span class="p">(</span><span class="n">ptrSid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">ProfilingPermission</span><span class="p">.</span><span class="n">LocalFree</span><span class="p">(</span><span class="n">ptrSid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;Account ({referencedDomainName}){accountName} mapped to {sidString}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">sid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;Sechost.dll&#34;, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">uint</span> <span class="n">EventAccessControl</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="k">ref</span> <span class="n">Guid</span> <span class="n">providerGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">operation</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="na">        [MarshalAs(UnmanagedType.LPArray)]</span> <span class="kt">byte</span><span class="p">[]</span> <span class="n">Sid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="kt">uint</span> <span class="n">right</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="kt">bool</span> <span class="n">allowOrDeny</span> <span class="c1">// true means ALLOW</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;kernel32.dll&#34;)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">static</span> <span class="kd">extern</span> <span class="n">IntPtr</span> <span class="n">LocalFree</span><span class="p">(</span><span class="n">IntPtr</span> <span class="n">hMem</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;advapi32.dll&#34;, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">bool</span> <span class="n">LookupAccountName</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="kt">string</span> <span class="n">systemName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="kt">string</span> <span class="n">accountName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="na">        [MarshalAs(UnmanagedType.LPArray)]</span> <span class="kt">byte</span><span class="p">[]</span> <span class="n">Sid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="k">ref</span> <span class="kt">uint</span> <span class="n">cbSid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">StringBuilder</span> <span class="n">referencedDomainName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="k">ref</span> <span class="kt">uint</span> <span class="n">cchReferencedDomainName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="k">out</span> <span class="n">SID_NAME_USE</span> <span class="n">nameUse</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;advapi32.dll&#34;, CharSet = CharSet.Auto, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">bool</span> <span class="n">ConvertSidToStringSid</span><span class="p">(</span>
</span></span><span class="line"><span class="cl"><span class="na">        [MarshalAs(UnmanagedType.LPArray)]</span> <span class="kt">byte</span><span class="p">[]</span> <span class="n">pSID</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="k">out</span> <span class="n">IntPtr</span> <span class="n">ptrSid</span><span class="p">);</span> <span class="c1">// can&#39;t be an out string because we need to explicitly call LocalFree on it;</span>
</span></span><span class="line"><span class="cl">                            <span class="c1">// the marshaller would call CoTaskMemFree in case of a string</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// from http://pinvoke.net/default.aspx/advapi32/LookupAccountName.html</span>
</span></span><span class="line"><span class="cl">    <span class="kd">enum</span> <span class="n">SID_NAME_USE</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeUser</span> <span class="p">=</span> <span class="m">1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeGroup</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeDomain</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeAlias</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeWellKnownGroup</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeDeletedAccount</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeInvalid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeUnknown</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">SidTypeComputer</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// from evntcons.h</span>
</span></span><span class="line"><span class="cl">    <span class="kd">enum</span> <span class="n">EventSecurityOperation</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">EventSecuritySetDACL</span> <span class="p">=</span> <span class="m">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">EventSecuritySetSACL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">EventSecurityAddDACL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">EventSecurityAddSACL</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">EventSecurityMax</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span> <span class="c1">// EVENTSECURITYOPERATION</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You are now ready to profile your application’s memory allocation patterns and CPU consumption!</p>
<hr>
<p><strong>Thanks for checking in with us again on our C# series. Like what you are reading? Head over to our latest blog posts on the topic:</strong></p>
<p><a href="https://medium.com/criteo-labs/build-your-own-net-memory-profiler-in-c-allocations-1-2-9c9f0c86cefd"><strong>Build your own .NET memory profiler in C#</strong></a>: <em>This post explains how to collect allocation details by writing your own memory profiler in C#.</em></p>
<p><a href="/posts/2020-05-18_build-your-own-net/"><strong>Build your own .NET memory profiler in C# — call stacks (2/2–1)</strong></a>: <em>This post explains how to get the call stack corresponding to the allocations with CLR events.</em></p>
]]></content:encoded></item><item><title>The .NET Core Journey at Criteo</title><link>https://chrisnas.github.io/posts/2020-07-31_the-net-core-journey/</link><pubDate>Fri, 31 Jul 2020 15:55:50 +0000</pubDate><guid>https://chrisnas.github.io/posts/2020-07-31_the-net-core-journey/</guid><description>This post shows the challenges we faced during the migration to .NET Core on containerized Linux for our main application.</description><content:encoded><![CDATA[<hr>
<p><img loading="lazy" src="/posts/2020-07-31_the-net-core-journey/1_WFbx_DPjik2EQydCiAahUA.jpeg"></p>
<h2 id="introduction">Introduction</h2>
<p>When I arrived at Criteo in late 2016, I joined the .NET Core “guild” (i.e. a group of people from different teams dedicated to a specific topic). The first meeting I attended included Microsoft folks led by Scott Hunter (head of .NET program management), among them David Fowler (SignalR and ASP.NET Core). The goal for Criteo was simple: moving a set of C# applications from Windows/.NET Framework to Linux/.NET Core. I guess that, for Microsoft, we were a customer with workloads that could be interesting to support with .NET Core. At that time, I did not realize how strong their commitment to work with us was. Our Open Source mindset was the selling point.</p>
<p>How complicated could it be? Well… this post will show you the challenges that we had to face to run, monitor and debug our applications.</p>
<hr>
<h2 id="try-it">Try it</h2>
<p>Once we got a build of all .NET Core assemblies (more on this in a forthcoming blog post), it was time to run a few applications. The first issues that we faced were related to features of .NET Framework that were missing in .NET Core. For example, we need cryptography support for <a href="https://github.com/dotnet/corefx/issues/4647">3DES and AES with cipher mode CFB</a> but it is (still) not available in .NET Core for Linux. Thanks to the Open Source status of .NET Core, we were able to <a href="https://github.com/criteo-forks/corefx/tree/aes_3des_cfb_mode_implementation_unix">add it to CoreFx</a>. However, since we did not implement it on macOS/Windows, as Microsoft requested for our change to be accepted as a Pull Request, we had to keep our Criteo-forked branch.</p>
<p>The second class of runtime problems we had to solve was due to differences between Windows and Linux, but also to the “containerization” of the runtime environment. Let’s take two examples involving the .NET Garbage Collector. First, our containers were using Linux cgroups to manage quotas, including the memory and the number of CPU cores usable by applications. However, at CLR startup, the GC was using the <strong>total</strong> number of CPU cores on the machine, instead of the number defined at the cgroup level, to compute how many heaps to allocate. We ended up with applications being instantly killed for running out of memory. This time, our fix was merged into the CLR repository.</p>
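<p>To make the first example more concrete, here is a minimal sketch (not the fix that was merged into the CLR) of how a core count can be derived from a cgroup CPU quota instead of the machine core count. The <code>CgroupCpu</code> class and its method are hypothetical names, and the input format assumed is the one of the cgroup v2 <code>cpu.max</code> file: a quota and a period in microseconds, or <code>max</code> when no quota is set.</p>
<pre><code class="language-csharp">using System;
using System.Globalization;

public static class CgroupCpu
{
    // Parses the content of a cgroup v2 "cpu.max" file: "quota period" in
    // microseconds, or "max period" when no quota is set.
    // Returns the number of usable cores, never more than the machine core count.
    public static int EffectiveCoreCount(string cpuMaxContent, int machineCoreCount)
    {
        string[] parts = cpuMaxContent.Trim().Split(' ');
        if (parts.Length != 2 || parts[0] == "max")
            return machineCoreCount;

        if (!long.TryParse(parts[0], NumberStyles.None, CultureInfo.InvariantCulture, out long quota) ||
            !long.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out long period) ||
            quota &lt;= 0 || period &lt;= 0)
            return machineCoreCount;

        // Round up: a quota of 1.5 cores still needs 2 cores to be fully used.
        long limit = (quota + period - 1) / period;
        return (int)Math.Min(limit, machineCoreCount);
    }
}
</code></pre>
<p>With a quota of 400000 and a period of 100000 on a 64-core host, this helper returns 4 instead of 64, which is the kind of value you want when sizing per-core structures.</p>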
<p>The second example is related to a GC optimization: during background generation 2 collections, each of the CLR threads working underneath is affinitized to a different CPU core to avoid locks. We were lucky enough to welcome <a href="https://twitter.com/@maoni0">Maoni Stephens</a> (Lead Dev on the GC) in our Paris office in early 2018 to share our weird allocation patterns that impacted the GC. During her stay, she was kind enough to help us investigate a behavior on our servers: when <a href="https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer?WT.mc_id=DT-MVP-5003325">SysInternals ProcessExplorer</a> was running, the garbage collections were taking more time than usual. Maoni found out that ProcessExplorer had an affinitized high-priority thread conflicting with the GC threads. While investigating longer response times on Linux compared to Windows, we also realized that GC threads were not affinitized as they were on Windows, and the issue was <a href="https://github.com/dotnet/coreclr/pull/24801">fixed by Jan Vorlicek</a>.</p>
<p><em>Here is our lesson: sometimes fixes are merged into the official release and sometimes they are not. If your workloads are pushing .NET to its limits, you will probably have to build and manage your <a href="https://github.com/criteo-forks/coreclr">own Core fork</a> and make it available to your deployments.</em></p>
<h2 id="monitor-it">Monitor it</h2>
<p>At Criteo, our Grafana dashboards measuring .NET Framework application health were based on metrics computed from Windows performance counters. Even without going to Linux, .NET Core no longer exposes performance counters, so we had to entirely rebuild our metrics collection system!</p>
<p>Based on Microsoft’s feedback, we decided to listen to CLR events emitted via ETW on Windows and LTTng on Linux. In addition to working on both operating systems, these events also provide accurate details about thread contention, exceptions and garbage collections that are not available with performance counters. Please refer to our <a href="/posts/2019-10-17_how-to-expose-your/">series of blog posts</a> for more details and reusable code samples to integrate these events into your own systems.</p>
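<p>To give an idea of what listening to CLR events looks like in code, here is a minimal in-process sketch based on the <code>EventListener</code> class from <code>System.Diagnostics.Tracing</code>, on .NET Core versions that expose runtime events to it. This is not the out-of-process ETW/LTTng pipeline described here; the <code>GcEventPrinter</code> class name is hypothetical, and <code>0x1</code> is the GC keyword of the <code>Microsoft-Windows-DotNETRuntime</code> provider.</p>
<pre><code class="language-csharp">using System;
using System.Diagnostics.Tracing;
using System.Threading;

public sealed class GcEventPrinter : EventListener
{
    // GC keyword of the CLR runtime provider
    private const EventKeywords GcKeyword = (EventKeywords)0x1;

    // Called once for each event source that exists or gets created in the process.
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Name == "Microsoft-Windows-DotNETRuntime")
            EnableEvents(eventSource, EventLevel.Informational, GcKeyword);
    }

    // Called for each event emitted by an enabled event source.
    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Console.WriteLine($"{eventData.EventName} (id={eventData.EventId})");
    }
}

public static class Program
{
    public static void Main()
    {
        using (new GcEventPrinter())
        {
            GC.Collect();      // trigger a collection so that GC events get emitted
            Thread.Sleep(500); // leave some time for the events to be delivered
        }
    }
}
</code></pre>
<p>A real metrics collector would aggregate the payloads (for example pause durations or generation sizes) instead of printing the event names.</p>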
<p>Our first Linux metrics collection implementation was based on LTTng and we presented our journey during the <a href="https://www.youtube.com/watch?v=pMl9RM9h2eg&amp;list=PLuo4E47p5_7bfeZyYIyNYM-f-2tmr0neu&amp;index=6">Tracing Summit in 2017</a>. Microsoft had already built <a href="https://github.com/microsoft/perfview/blob/master/documentation/TraceEvent/TraceEventLibrary.md">TraceEvent</a>, an assembly allowing .NET code to parse CLR events for both Windows and Linux. Unfortunately for us, the Linux part was only able to load trace files, but we needed live sessions like on Windows, where you can listen to events emitted by running applications. Since this code is Open Source, <a href="https://twitter.com/@GregoryLeocadie">Gregory</a> was able to add the <a href="https://github.com/microsoft/perfview/pull/340">live session feature</a> to TraceEvent.</p>
<p>With .NET Core 3.0, Microsoft provided a way to exchange events common to Linux and Windows called EventPipe. So… we moved our collection implementation from LTTng to EventPipe (look at our <a href="/posts/2019-10-17_how-to-expose-your/">blog series</a> and <a href="https://www.youtube.com/watch?v=Jpoy3O6x-wM">DotNext conference session</a> for more details and reusable code samples). With the new EventPipe implementation in the CLR came performance issues not seen by Microsoft. The reason is simple: some of our applications are running hundreds of threads to process thousands of requests per second and allocate memory like crazy. In that kind of context, the CLR has a lot to do and therefore a lot of events to generate and emit via LTTng or EventPipe.</p>
<p><img loading="lazy" src="/posts/2020-07-31_the-net-core-journey/1_P_gRXkTbBaDLhsQSnWDJIQ.png"></p>
<p>The initial implementation was <a href="https://github.com/dotnet/runtime/issues/12204">lacking some</a> filtering: too many events were generated, or expensive event payloads were created even though the events were not emitted. The Microsoft Diagnostic team was very responsive to our feedback and quickly fixed the problem.</p>
<p><em>Microsoft did not “just” move to Open Source: the teams are deeply integrated with the issue/pull request model of GitHub. So don’t be shy: if you find a problem, create an issue with a detailed reproduction or, even better, provide a pull request with the fix. Everyone in the community will benefit!</em></p>
<h2 id="run-it">Run it</h2>
<p>With these metrics, we started to investigate some performance differences (mostly response time) between Windows and containerized Linux.</p>
<p><img loading="lazy" src="/posts/2020-07-31_the-net-core-journey/1_h41QfdE5wVef3pD8DZ6twA.png"></p>
<p>We saw a huge performance gap on Linux, both in response time (x2) and in scalability (timeouts increasing with QPS). Our team spent a lot of time improving the situation, up to the point where it was possible to send the applications to production.</p>
<p>In the new containerized environment we faced the same kind of <em>noisy neighbor</em> symptoms that we had with Process Explorer. If the CPU cores are not dedicated to a container (as it was for us at the beginning), this scenario happens a lot. So we updated the scheduling system to dedicate CPU cores to containers.</p>
<p>In a totally different area, we found out that the way .NET Core handles network I/O continuations had an impact on our main application. To give a bit of context, this application has to handle a lot of requests and is response-time driven. During the processing of a request, the current thread might have to send an HTTP request before continuing its processing. Since this is done asynchronously, the thread is now available to process more incoming requests and this is good for throughput. However, it means that when the inner HTTP request comes back, all available threads might be processing new incoming requests and it will take time to complete the old one. The net effect is to increase the median response time and this is not something we want!</p>
<p>The .NET Core implementation relies on the .NET ThreadPool, whose threads are shared between all the async/await magic and the processing of incoming requests (the .NET Framework implementation is totally different and based on I/O completion ports on Windows). To solve the issue, <a href="https://twitter.com/KooKiz">Kevin</a> <a href="https://github.com/criteo-forks/corefx/commit/dda2c4d80fd2d74b3dc7e0833e2a6794f1e290d3">implemented a custom thread pool</a> to handle network I/O and we keep on <a href="https://github.com/criteo-forks/corefx/commit/2acc917aef47798243cc221afc9b360c86ed60b7">optimizing it</a>. When you work on this kind of deep area of code shared by so many different workloads, you realize that it is impossible to find a silver bullet.</p>
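<p>The general idea of isolating some work from the shared ThreadPool can be sketched with a custom <code>TaskScheduler</code> that runs its tasks on a fixed set of dedicated threads. This is only an illustration at the application level, not the implementation linked above (which lives inside our CoreFx fork); the <code>DedicatedThreadScheduler</code> class and the thread names are hypothetical.</p>
<pre><code class="language-csharp">using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class DedicatedThreadScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection&lt;Task&gt; _tasks = new BlockingCollection&lt;Task&gt;();
    private readonly Thread[] _threads;

    public DedicatedThreadScheduler(int threadCount)
    {
        _threads = new Thread[threadCount];
        for (int i = 0; i &lt; threadCount; i++)
        {
            _threads[i] = new Thread(Run) { IsBackground = true, Name = "io-continuation-" + i };
            _threads[i].Start();
        }
    }

    private void Run()
    {
        // Each dedicated thread only executes the tasks queued to this scheduler.
        foreach (Task task in _tasks.GetConsumingEnumerable())
            TryExecuteTask(task);
    }

    protected override void QueueTask(Task task) =&gt; _tasks.Add(task);

    // Never run a task inline on the calling thread: the point is to stay on the dedicated threads.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) =&gt; false;

    protected override IEnumerable&lt;Task&gt; GetScheduledTasks() =&gt; _tasks.ToArray();

    public override int MaximumConcurrencyLevel =&gt; _threads.Length;

    // Lets the dedicated threads exit once the pending tasks have been executed.
    public void Dispose() =&gt; _tasks.CompleteAdding();
}
</code></pre>
<p>Work is then sent to these threads with <code>Task.Factory.StartNew(action, CancellationToken.None, TaskCreationOptions.None, scheduler)</code>. The trade-off is the one mentioned above: the right number of dedicated threads depends on the workload.</p>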
<h2 id="debug-it">Debug it</h2>
<p>What would you do if something went wrong in an application? On Windows, with Visual Studio, we are able to remote debug a rogue application to set a breakpoint, look at fields and properties or even have a high-level view of what threads are doing with the ParallelStacks view. In the worst case, SysInternals <a href="https://docs.microsoft.com/en-us/visualstudio/debugger/remote-debugging-dotnet-core-linux-with-ssh?WT.mc_id=DT-MVP-5003325&amp;view=vs-2019">procdump</a> allows us to take a snapshot of the application and analyze it on a developer’s machine with WinDBG or Visual Studio.</p>
<p>In terms of remote debugging a Linux application, Microsoft provides an <a href="https://docs.microsoft.com/en-us/visualstudio/debugger/remote-debugging-dotnet-core-linux-with-ssh?WT.mc_id=DT-MVP-5003325">SSH-based solution</a> to attach to a running application. However, for security reasons, we are not allowed to run an SSH server in our Criteo containers. <em>The solution was to implement the communication protocol with VsDbg for Linux on top of WebSockets.</em></p>
<p><img loading="lazy" src="/posts/2020-07-31_the-net-core-journey/1_dBiRXngqIZIMqAyQQ1PryA.png"></p>
<p>Well… this was not enough. The hosting architecture (Marathon and Mesos in our case) ensures that applications in containers are running smoothly by sending requests to <em>health check</em> endpoints. If the application replies that everything is fine, then the container is safe. If the application does not answer as expected (including retries), then Marathon/Mesos kills the application and cleans up the container. Now think about what will happen if you set a breakpoint in the application and you dig into the content of data structures in the Visual Studio Watch/Quick Watch panels for a few minutes. Behind the scenes, the debugger has to freeze all application threads, including the thread pool ones responsible for answering health checks. As you have probably guessed already, the debugging session will not end well.</p>
<p>This is why the previous figure shows an arrow between Marathon and the Remote Debugger which acts as a proxy for the application health check. When a debugging session starts (i.e. when the WebSockets code executes the protocol), the Remote Debugger knows that it should answer OK instead of calling the application endpoint that might never answer.</p>
<p>When remote debugging is not enough, how do you take a memory snapshot of the application? For example, if the health check does not answer after a series of retries, the Remote Debugger calls the <a href="https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/xplat-minidump-generation.md">createdump tool</a> installed with the .NET Core runtime to generate a dump file. Again, since creating the memory dump of a 40+ GB application can take several minutes, the same health check proxy mechanism has been put in place.</p>
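<p>To make the proxy idea more concrete, here is a minimal C# sketch of the decision logic described above. It is not the Remote Debugger code: <code>HealthCheckProxy</code>, <code>ProbeResult</code>, the two delegates and the failure threshold are all hypothetical names chosen for illustration, and thread-safety is deliberately ignored.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical sketch: these types do not come from the real Remote Debugger.
public enum ProbeResult { Ok, Failed }

public sealed class HealthCheckProxy
{
    private readonly Func&lt;bool&gt; _probeApplication; // calls the real health check endpoint
    private readonly Action _createDump;           // e.g. runs createdump on the process
    private readonly int _maxFailures;             // failed probes tolerated before a dump
    private int _consecutiveFailures;

    // true while a debugging session or a dump generation freezes the application
    public bool ApplicationIsFrozen { get; set; }

    public HealthCheckProxy(Func&lt;bool&gt; probeApplication, Action createDump, int maxFailures)
    {
        _probeApplication = probeApplication;
        _createDump = createDump;
        _maxFailures = maxFailures;
    }

    // Called each time the orchestrator probes the container.
    public ProbeResult OnHealthCheck()
    {
        // frozen threads cannot answer: reply OK on behalf of the application
        if (ApplicationIsFrozen)
            return ProbeResult.Ok;

        if (_probeApplication())
        {
            _consecutiveFailures = 0;
            return ProbeResult.Ok;
        }

        _consecutiveFailures++;
        if (_consecutiveFailures &lt; _maxFailures)
            return ProbeResult.Failed;

        // too many failures: take a dump first, and only then report the failure
        ApplicationIsFrozen = true;
        try { _createDump(); }
        finally { ApplicationIsFrozen = false; }
        return ProbeResult.Failed;
    }
}
</code></pre></div>
<p>In this sketch, <code>maxFailures</code> is assumed to be lower than the number of failed probes after which the orchestrator kills the container; otherwise the dump would never have a chance to be generated.</p>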
<p>Once the dump file is created, the Remote Debugger lets Marathon kill the application. But wait! This is not enough because in that case, the container will be cleaned up and the disk storage will disappear. Not a problem: after a dump has been generated by createdump, the file is sent to a “Dump Navigator” application (one per data center). This application provides a simple HTML user interface to get high-level details of the application state such as thread stacks or managed heap content.</p>
<p><img loading="lazy" src="/posts/2020-07-31_the-net-core-journey/1_jcwiOFsn6SN305A_f_vuxQ.png"></p>
<p>On Windows, we have built our own set of <a href="https://github.com/chrisnas/DebuggingExtensions/blob/master/Documentation/gsose.md">extension commands</a> that allow us to investigate memory, thread pool starvation, thread contention, or timer leak scenarios in a Windows memory dump with WinDBG as shown during this <a href="https://www.youtube.com/watch?v=biDJkJ4L_K8">NDC Oslo conference session</a>. Note that they are also <a href="https://github.com/kevingosse/LLDB-LoadManaged">usable with LLDB</a> on Linux. These commands leverage the <a href="https://github.com/microsoft/clrmd">ClrMD Microsoft library</a> that gives you access to a live process or a memory dump in C#. Thanks to the Linux support that has been added to this library by Microsoft developers, it was easy to reuse the code in our Dump Navigator application. I definitely recommend looking at the API provided by ClrMD to automate and build your own tools. The <a href="/posts/2019-12-31_getting-another-view-on/">long Criteo blog series</a> is a good start in addition to my <a href="https://www.youtube.com/watch?v=O8c5WwfbGFU">DotNext conference session</a>.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Even though some of our main applications moved to .NET Core running on containerized Linux with a large set of monitoring/debugging tools, the journey is not over. We are now testing the preview of .NET 5.0 (like we did for .NET Core 3.0) to check if it supports Criteo-specific needs. If this is not the case, we will figure out why and find solutions to integrate into the code. Same for the tools: I have started to <a href="https://github.com/dotnet/diagnostics/pull/1376">add our extension commands</a> to Microsoft’s dotnet-dump CLI tool used to analyze both Windows and Linux dumps.</p>
<p>At least we can say that we helped not only ourselves but also Microsoft, and even the whole .NET Windows and Linux community, to understand how far .NET Core could go. This is where Open Source shines!</p>
<hr>
<p><strong>Stay tuned for the next article in our mini-series. Don’t forget to head over to our previous articles of this journey:</strong></p>
<p><a href="https://medium.com/criteo-labs/migrating-arbitrage-to-apache-mesos-3f474179ec0b"><strong>Migrating Arbitrage to Apache Mesos</strong></a>: <em>Lessons learned from migrating our largest application to our container platform.</em></p>
<p><a href="https://medium.com/criteo-labs/moving-net-to-linux-at-scale-d8ff49b42661"><strong>Moving .NET to Linux at Scale</strong></a>: <em>The story of a multi-year migration: How we changed Criteo’s whole foundation.</em></p>
]]></content:encoded></item><item><title>Build your own .NET memory profiler in C# — call stacks (2/2–2)</title><link>https://chrisnas.github.io/posts/2020-06-19_build-your-own-net/</link><pubDate>Fri, 19 Jun 2020 09:32:16 +0000</pubDate><guid>https://chrisnas.github.io/posts/2020-06-19_build-your-own-net/</guid><description>In this last episode I detail how to transform addresses from the stack into methods name and signature.</description><content:encoded><![CDATA[<hr>
<p>In the past two episodes of this series, I explained how to <a href="/posts/2020-04-18_build-your-own-net/">get a sampling of .NET application allocations</a> and <a href="/posts/2020-05-18_build-your-own-net/">one way to get the call stack</a> corresponding to the allocations, all with CLR events. In this last episode, I will detail how to transform addresses from the stack into method names and, when possible, signatures.</p>
<h2 id="from-managed-address-to-method-signature">From managed address to method signature</h2>
<p>In order to transform an address on the stack into a managed method name, you need to know where in memory (i.e. at which address) the JITted assembly code of the method is stored and what its size is:</p>
<p><img loading="lazy" src="/posts/2020-06-19_build-your-own-net/1_v73Nx1IxWIEQ3NzDZF0rsQ.png"></p>
<p>For each JITted method, the <code>MethodLoadVerbose</code>/<code>MethodDCStartVerboseV2</code> events provide this information, along with 3 properties used to rebuild the full method name and signature (more on this later). I’m storing each method description as a <code>MethodInfo</code> in a per-process <code>MethodStore</code>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">PerProcessProfilingState</span> <span class="p">:</span> <span class="n">IDisposable</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">MethodStore</span><span class="p">&gt;</span> <span class="n">_methods</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">MethodStore</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">MethodStore</span> <span class="p">:</span> <span class="n">IDisposable</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// JITed methods information (start address + size + signature)</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">MethodInfo</span><span class="p">&gt;</span> <span class="n">_methods</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The only interesting part of the <code>MethodInfo</code> class is the computation of the full method name stored in the <code>_fullName</code> field:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">MethodInfo</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">ulong</span> <span class="n">_startAddress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">int</span> <span class="n">_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">string</span> <span class="n">_fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>ComputeFullName</code> helper merges together the 3 properties given by the <code>MethodxxxVerbose</code> events including special processing for constructors:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">string</span> <span class="n">ComputeFullName</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">startAddress</span><span class="p">,</span> <span class="kt">string</span> <span class="n">namespaceAndTypeName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">string</span> <span class="n">signature</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">fullName</span> <span class="p">=</span> <span class="n">signature</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// constructor case: name = .ctor | namespaceAndTypeName = A.B.typeName | signature = ...  (parameters)</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// --&gt; A.B.typeName(parameters)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">name</span> <span class="p">==</span> <span class="s">&#34;.ctor&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="s">$&#34;{namespaceAndTypeName}{ExtractParameters(signature)}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// general case: name = Foo | namespaceAndTypeName = A.B.typeName | signature = ...  (parameters)</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// --&gt; A.B.Foo(parameters)</span>
</span></span><span class="line"><span class="cl">    <span class="n">fullName</span> <span class="p">=</span> <span class="s">$&#34;{namespaceAndTypeName}.{name}{ExtractParameters(signature)}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">string</span> <span class="n">ExtractTypeName</span><span class="p">(</span><span class="kt">string</span> <span class="n">namespaceAndTypeName</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">pos</span> <span class="p">=</span> <span class="n">namespaceAndTypeName</span><span class="p">.</span><span class="n">LastIndexOf</span><span class="p">(</span><span class="s">&#34;.&#34;</span><span class="p">,</span> <span class="n">StringComparison</span><span class="p">.</span><span class="n">Ordinal</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pos</span> <span class="p">==</span> <span class="p">-</span><span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">namespaceAndTypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="c1">// skip the .</span>
</span></span><span class="line"><span class="cl">    <span class="n">pos</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">namespaceAndTypeName</span><span class="p">.</span><span class="n">Substring</span><span class="p">(</span><span class="n">pos</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Only the parameters (not the return type) are extracted from the “return type SPACE SPACE (parameters)” signature format:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">string</span> <span class="n">ExtractParameters</span><span class="p">(</span><span class="kt">string</span> <span class="n">signature</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">pos</span> <span class="p">=</span> <span class="n">signature</span><span class="p">.</span><span class="n">IndexOf</span><span class="p">(</span><span class="s">&#34;  (&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">pos</span> <span class="p">==</span> <span class="p">-</span><span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="s">&#34;(???)&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// skip double space</span>
</span></span><span class="line"><span class="cl">    <span class="n">pos</span> <span class="p">+=</span> <span class="m">2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">parameters</span> <span class="p">=</span> <span class="n">signature</span><span class="p">.</span><span class="n">Substring</span><span class="p">(</span><span class="n">pos</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">parameters</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>With the starting address and the size of each JITted method, it is easy to find the one corresponding to a given address on the stack: look for the <code>MethodInfo</code> whose code range contains this address, i.e. the address is greater than or equal to the start address and strictly less than the start address + the code size:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">string</span> <span class="n">GetFullName</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_cache</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">address</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">fullName</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="c1">// look for managed methods</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">_methods</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">method</span> <span class="p">=</span> <span class="n">_methods</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">        
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">((</span><span class="n">address</span> <span class="p">&gt;=</span> <span class="n">method</span><span class="p">.</span><span class="n">StartAddress</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">address</span> <span class="p">&lt;</span> <span class="n">method</span><span class="p">.</span><span class="n">StartAddress</span> <span class="p">+</span> <span class="p">(</span><span class="kt">ulong</span><span class="p">)</span><span class="n">method</span><span class="p">.</span><span class="n">Size</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">fullName</span> <span class="p">=</span> <span class="n">method</span><span class="p">.</span><span class="n">FullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="n">_cache</span><span class="p">[</span><span class="n">address</span><span class="p">]</span> <span class="p">=</span> <span class="n">fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// look for native methods</span>
</span></span><span class="line"><span class="cl">    <span class="n">fullName</span> <span class="p">=</span> <span class="n">GetNativeMethodName</span><span class="p">(</span><span class="n">address</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">_cache</span><span class="p">[</span><span class="n">address</span><span class="p">]</span> <span class="p">=</span> <span class="n">fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">fullName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For the sake of performance, the <code>_cache</code> dictionary keeps track of the address/full name mappings so that the list of methods is scanned only once per address.</p>
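<p>To see the name computation and the address lookup working together, here is a condensed, self-contained sketch. It is a simplified variant of the code above, not the actual profiler source: the addresses, sizes and signature strings are made up for the example, and the <code>"?"</code> placeholder stands in for the call to <code>GetNativeMethodName</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

public sealed class MethodInfo
{
    public ulong StartAddress { get; }
    public int Size { get; }
    public string FullName { get; }

    public MethodInfo(ulong startAddress, int size, string namespaceAndTypeName, string name, string signature)
    {
        StartAddress = startAddress;
        Size = size;

        // keep only "(parameters)" from "return type SPACE SPACE (parameters)"
        var pos = signature.IndexOf("  (", StringComparison.Ordinal);
        var parameters = (pos == -1) ? "(???)" : signature.Substring(pos + 2);

        // constructors are shown as Namespace.Type(parameters)
        FullName = (name == ".ctor")
            ? $"{namespaceAndTypeName}{parameters}"
            : $"{namespaceAndTypeName}.{name}{parameters}";
    }
}

public sealed class MethodStore
{
    private readonly List&lt;MethodInfo&gt; _methods = new List&lt;MethodInfo&gt;();
    private readonly Dictionary&lt;ulong, string&gt; _cache = new Dictionary&lt;ulong, string&gt;();

    public void Add(MethodInfo method) =&gt; _methods.Add(method);

    public string GetFullName(ulong address)
    {
        if (_cache.TryGetValue(address, out var fullName))
            return fullName;

        fullName = "?"; // placeholder: the real code looks for a native function here
        foreach (var method in _methods)
        {
            // the range is [start, start + size)
            if (address &gt;= method.StartAddress &amp;&amp; address &lt; method.StartAddress + (ulong)method.Size)
            {
                fullName = method.FullName;
                break;
            }
        }

        _cache[address] = fullName;
        return fullName;
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new MethodStore();
        store.Add(new MethodInfo(0x1000, 0x40, "A.B.Worker", "Run", "void  (int32)"));
        store.Add(new MethodInfo(0x2000, 0x20, "A.B.Worker", ".ctor", "instance void  (class System.String)"));

        Console.WriteLine(store.GetFullName(0x1010)); // A.B.Worker.Run(int32)
        Console.WriteLine(store.GetFullName(0x2000)); // A.B.Worker(class System.String)
        Console.WriteLine(store.GetFullName(0x1040)); // ? (one byte past the end of Run)
    }
}
</code></pre></div>
<p>The last call shows why the upper bound of the range has to be exclusive: an address equal to start address + size already belongs to whatever comes after the method.</p>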
<p>It is now time to look at the details of the <code>GetNativeMethodName</code> helper that takes care of the native functions scenario.</p>
<h2 id="the-native-part-of-the-symbolsstory">The native part of the symbols story</h2>
<p>Unlike for JITted methods, the CLR does not send events to describe native functions, not even for the CLR itself. Instead, you need to find a way to map a call stack address to a native function by yourself. Unlike PerfView, I will be using the <strong>dbghelp</strong> native API instead of <strong>DIA</strong>, mostly because my scenario is to get the stacks while the applications are still running:</p>
<p><img loading="lazy" src="/posts/2020-06-19_build-your-own-net/1_hCkVFS_bkxBJUtq-pAywSQ.png"></p>
<p>After reading the <a href="https://docs.microsoft.com/en-us/archive/msdn-magazine/2002/march/under-the-hood-improved-error-reporting-with-dbghelp-5-1-apis?WT.mc_id=DT-MVP-5003325">March 2002 MSDN article about DBGHELP</a> by Matt Pietrek, the updated <a href="https://docs.microsoft.com/en-us/windows/win32/debug/dbghelp-functions#symbol-handler?WT.mc_id=DT-MVP-5003325">symbol-related Microsoft Docs</a> and the dbghelp.h include file from the Windows SDK, I wrote a C# wrapper around the dbghelp functions needed to get a method name from an address in a process address space:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">internal</span> <span class="kd">static</span> <span class="k">class</span> <span class="nc">NativeDbgHelp</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// from C:\Program Files (x86)\Windows Kits\10\Debuggers\inc\dbghelp.h</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">const</span> <span class="kt">uint</span> <span class="n">SYMOPT_UNDNAME</span> <span class="p">=</span> <span class="m">0x00000002</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">const</span> <span class="kt">uint</span> <span class="n">SYMOPT_DEFERRED_LOADS</span> <span class="p">=</span> <span class="m">0x00000004</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [StructLayout(LayoutKind.Sequential)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">struct</span> <span class="nc">SYMBOL_INFO</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">SizeOfStruct</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">TypeIndex</span><span class="p">;</span>      <span class="c1">// Type Index of symbol</span>
</span></span><span class="line"><span class="cl">        <span class="kd">private</span> <span class="kt">ulong</span> <span class="n">Reserved1</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kd">private</span> <span class="kt">ulong</span> <span class="n">Reserved2</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">Index</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">Size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">ModBase</span><span class="p">;</span>       <span class="c1">// Base Address of module containing this symbol</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">Flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">Value</span><span class="p">;</span>         <span class="c1">// Value of symbol, ValuePresent should be 1</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">Address</span><span class="p">;</span>       <span class="c1">// Address of symbol including base address of module</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">Register</span><span class="p">;</span>       <span class="c1">// register holding value or pointer to value</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">Scope</span><span class="p">;</span>          <span class="c1">// scope of the symbol</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">Tag</span><span class="p">;</span>            <span class="c1">// pdb classification</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">NameLen</span><span class="p">;</span>        <span class="c1">// Actual length of name</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">uint</span> <span class="n">MaxNameLen</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="na">        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 1024)]</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="kt">string</span> <span class="n">Name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;dbghelp.dll&#34;, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">bool</span> <span class="n">SymInitialize</span><span class="p">(</span><span class="n">IntPtr</span> <span class="n">hProcess</span><span class="p">,</span> <span class="kt">string</span> <span class="n">userSearchPath</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">invadeProcess</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;dbghelp.dll&#34;, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">uint</span> <span class="n">SymSetOptions</span><span class="p">(</span><span class="kt">uint</span> <span class="n">symOptions</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;dbghelp.dll&#34;, SetLastError = true, CharSet = CharSet.Ansi)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">ulong</span> <span class="n">SymLoadModule64</span><span class="p">(</span><span class="n">IntPtr</span> <span class="n">hProcess</span><span class="p">,</span> <span class="n">IntPtr</span> <span class="n">hFile</span><span class="p">,</span> <span class="kt">string</span> <span class="n">imageName</span><span class="p">,</span> <span class="kt">string</span> <span class="n">moduleName</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">baseOfDll</span><span class="p">,</span> <span class="kt">uint</span> <span class="n">sizeOfDll</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// use ANSI version to ensure the right size of the structure </span>
</span></span><span class="line"><span class="cl">    <span class="c1">// read https://docs.microsoft.com/en-us/windows/win32/api/dbghelp/ns-dbghelp-symbol_info</span>
</span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;dbghelp.dll&#34;, SetLastError = true, CharSet = CharSet.Ansi)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">bool</span> <span class="n">SymFromAddr</span><span class="p">(</span><span class="n">IntPtr</span> <span class="n">hProcess</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">address</span><span class="p">,</span> <span class="k">out</span> <span class="kt">ulong</span> <span class="n">displacement</span><span class="p">,</span> <span class="k">ref</span> <span class="n">SYMBOL_INFO</span> <span class="n">symbol</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="na">
</span></span></span><span class="line"><span class="cl"><span class="na">    [DllImport(&#34;dbghelp.dll&#34;, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">bool</span> <span class="n">SymCleanup</span><span class="p">(</span><span class="n">IntPtr</span> <span class="n">hProcess</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that you will need to download dbghelp.dll (and SymSrv.dll if needed) from <a href="https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk/?WT.mc_id=DT-MVP-5003325">the Windows SDK</a> and copy it next to your memory profiler binaries.</p>
<p>The usage of the dbghelp API is straightforward. First, for each new process, call <code>SymSetOptions</code>/<code>SymInitialize</code> with a handle to the process:</p>
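<p>Since the P/Invoke declarations reference dbghelp.dll by name only, a missing local copy could go unnoticed if another version gets picked up elsewhere on the machine. Here is a minimal sketch of a startup check; the <code>DbgHelpLocator</code> class and its method names are hypothetical and not part of the profiler source code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.IO;

// Hypothetical helper (not in the original code): checks that dbghelp.dll
// has been copied next to the profiler binaries.
public static class DbgHelpLocator
{
    // Path where the local copy of dbghelp.dll is expected
    public static string GetExpectedPath(string baseDirectory)
        =&gt; Path.Combine(baseDirectory, "dbghelp.dll");

    // True if a file with that name exists in the given folder
    public static bool IsDeployed(string baseDirectory)
        =&gt; File.Exists(GetExpectedPath(baseDirectory));
}
</code></pre></div>
<p>Calling <code>DbgHelpLocator.IsDeployed(AppContext.BaseDirectory)</code> before the first <code>SymInitialize</code> call would allow you to display an explicit error message instead of getting unexpected symbol resolution failures later.</p>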
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">bool</span> <span class="n">SymInitialize</span><span class="p">(</span><span class="n">IntPtr</span> <span class="n">hProcess</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// read https://docs.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symsetoptions for more details</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// maybe SYMOPT_NO_PROMPTS and SYMOPT_FAIL_CRITICAL_ERRORS could be used</span>
</span></span><span class="line"><span class="cl">    <span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SymSetOptions</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SYMOPT_DEFERRED_LOADS</span> <span class="p">|</span>   <span class="c1">// performance optimization</span>
</span></span><span class="line"><span class="cl">        <span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SYMOPT_UNDNAME</span>            <span class="c1">// C++ names are not mangled</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// https://docs.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-syminitialize</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// search path for symbols:</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//   - The current working directory of the application</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//   - The _NT_SYMBOL_PATH environment variable</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//   - The _NT_ALTERNATE_SYMBOL_PATH environment variable</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// passing false as last parameter means that we will need to call SymLoadModule64 </span>
</span></span><span class="line"><span class="cl">    <span class="c1">// each time a module is loaded in the process</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SymInitialize</span><span class="p">(</span><span class="n">hProcess</span><span class="p">,</span> <span class="kc">null</span><span class="p">,</span> <span class="kc">false</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">IntPtr</span> <span class="n">BindToProcess</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_process</span> <span class="p">=</span> <span class="n">Process</span><span class="p">.</span><span class="n">GetProcessById</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(!</span><span class="n">SymInitialize</span><span class="p">(</span><span class="n">_process</span><span class="p">.</span><span class="n">Handle</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">IntPtr</span><span class="p">.</span><span class="n">Zero</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">_process</span><span class="p">.</span><span class="n">Handle</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span> <span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;Error while binding pid #{pid} to DbgHelp:&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">Message</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">IntPtr</span><span class="p">.</span><span class="n">Zero</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In the case of protected processes, <code>Process.GetProcessById</code> (or the access to the <code>Handle</code> property) might throw an exception, hence the <code>try</code>/<code>catch</code>. The <code>_hProcess</code> field storing the process handle will be cleaned up in the <code>IDisposable.Dispose</code> implementation of the <code>MethodStore</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">void</span> <span class="n">Dispose</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_hProcess</span> <span class="p">==</span> <span class="n">IntPtr</span><span class="p">.</span><span class="n">Zero</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_hProcess</span> <span class="p">=</span> <span class="n">IntPtr</span><span class="p">.</span><span class="n">Zero</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">_process</span><span class="p">.</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>After a process has been bound, each time one of its modules is loaded, <code>SymLoadModule64</code> must be called. You can be notified of such a loaded module by enabling the Kernel provider with the <code>ImageLoad</code> keyword.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">session</span><span class="p">.</span><span class="n">EnableKernelProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">ImageLoad</span> <span class="p">|</span>
</span></span><span class="line"><span class="cl">    <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Process</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">None</span>
</span></span><span class="line"><span class="cl"><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The handler attached to the <code>ImageLoad</code> event will be called each time a DLL gets loaded.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">SetupListeners</span><span class="p">(</span><span class="n">ETWTraceEventSource</span> <span class="n">source</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// get notified when a module is loaded to map the corresponding symbols</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Kernel</span><span class="p">.</span><span class="n">ImageLoad</span> <span class="p">+=</span> <span class="n">OnImageLoad</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">const</span> <span class="kt">int</span> <span class="n">ERROR_SUCCESS</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnImageLoad</span><span class="p">(</span><span class="n">ImageLoadTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FilterOutEvent</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">GetProcessMethods</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">).</span><span class="n">AddModule</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">FileName</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ImageBase</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ImageSize</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">void</span> <span class="n">AddModule</span><span class="p">(</span><span class="kt">string</span> <span class="n">filename</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">baseOfDll</span><span class="p">,</span> <span class="kt">int</span> <span class="n">sizeOfDll</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">baseAddress</span> <span class="p">=</span> <span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SymLoadModule64</span><span class="p">(</span><span class="n">_hProcess</span><span class="p">,</span> <span class="n">IntPtr</span><span class="p">.</span><span class="n">Zero</span><span class="p">,</span> <span class="n">filename</span><span class="p">,</span> <span class="kc">null</span><span class="p">,</span> <span class="n">baseOfDll</span><span class="p">,</span> <span class="p">(</span><span class="kt">uint</span><span class="p">)</span><span class="n">sizeOfDll</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">baseAddress</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// 0 with ERROR_SUCCESS means that the module was already loaded</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">Marshal</span><span class="p">.</span><span class="n">GetLastWin32Error</span><span class="p">()</span> <span class="p">==</span> <span class="n">ERROR_SUCCESS</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;SymLoadModule64 failed for {filename}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now everything is in place to get a native function name from an address on the stack:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">string</span> <span class="n">GetNativeMethodName</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">symbol</span> <span class="p">=</span> <span class="k">new</span> <span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SYMBOL_INFO</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">symbol</span><span class="p">.</span><span class="n">MaxNameLen</span> <span class="p">=</span> <span class="m">1024</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">symbol</span><span class="p">.</span><span class="n">SizeOfStruct</span> <span class="p">=</span> <span class="p">(</span><span class="kt">uint</span><span class="p">)</span><span class="n">Marshal</span><span class="p">.</span><span class="n">SizeOf</span><span class="p">(</span><span class="n">symbol</span><span class="p">)</span> <span class="p">-</span> <span class="m">1024</span><span class="p">;</span>   <span class="c1">// char buffer is not counted</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// the ANSI version of SymFromAddr is called so each character is 1 byte long</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">NativeDbgHelp</span><span class="p">.</span><span class="n">SymFromAddr</span><span class="p">(</span><span class="n">_hProcess</span><span class="p">,</span> <span class="n">address</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">displacement</span><span class="p">,</span> <span class="k">ref</span> <span class="n">symbol</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">buffer</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StringBuilder</span><span class="p">(</span><span class="n">symbol</span><span class="p">.</span><span class="n">Name</span><span class="p">.</span><span class="n">Length</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// remove weird &#34;$##&#34; at the end of some symbols</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">pos</span> <span class="p">=</span> <span class="n">symbol</span><span class="p">.</span><span class="n">Name</span><span class="p">.</span><span class="n">LastIndexOf</span><span class="p">(</span><span class="s">&#34;$##&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">pos</span> <span class="p">==</span> <span class="p">-</span><span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">buffer</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="n">symbol</span><span class="p">.</span><span class="n">Name</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">            <span class="n">buffer</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="n">symbol</span><span class="p">.</span><span class="n">Name</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="n">pos</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// add offset if any</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">displacement</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">buffer</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="s">$&#34;+0x{displacement:x}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">buffer</span><span class="p">.</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// default value is just the address in hexadecimal</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="s">$&#34;0x{address:x}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that I needed to remove some unexpected <strong>$##</strong> strings at the end of some symbols.</p>
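<p>The name cleanup and the offset formatting do not depend on DbgHelp, so they could be isolated in a small helper that is easy to test. This is only a sketch: the <code>SymbolNames</code> class is hypothetical, and the symbol names used below are made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical helper (not in the original code): formats a symbol name
// and its displacement as returned by SymFromAddr.
public static class SymbolNames
{
    public static string Format(string name, ulong displacement)
    {
        // drop the "$##..." suffix found at the end of some symbols
        var pos = name.LastIndexOf("$##", StringComparison.Ordinal);
        var cleaned = (pos == -1) ? name : name.Substring(0, pos);

        // append the offset in hexadecimal if any
        return (displacement == 0)
            ? cleaned
            : $"{cleaned}+0x{displacement:x}";
    }
}
</code></pre></div>
<p>For example, <code>SymbolNames.Format("Foo$##123", 31)</code> returns <code>Foo+0x1f</code>, while a name without suffix and with a displacement of 0 is returned unchanged.</p>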
<p><em><strong>This is the last episode of the series about building your own memory profiler in C#. In case you missed the first episodes, check them out on Medium:</strong></em></p>
<p><a href="https://medium.com/criteo-labs/build-your-own-net-memory-profiler-in-c-call-stacks-2-2-1-f67b440a8cc"><strong>Build your own .NET memory profiler in C# — call stacks (2/2–1)</strong></a><br>
<em>This post explains how to get the call stack corresponding to the allocations with CLR events.</em></p>
<p><a href="/posts/2020-04-18_build-your-own-net/"><strong>Build your own .NET memory profiler in C#</strong></a><br>
<em>This post explains how to collect allocation details by writing your own memory profiler in C#.</em></p>
<hr>
<h2 id="resources">Resources</h2>
<ul>
<li>Source code available <a href="https://github.com/chrisnas/ClrEvents">on Github</a>.</li>
<li>Download <a href="https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk/?WT.mc_id=DT-MVP-5003325">Debugging Tools for Windows</a> for dbghelp.dll and SymSrv.dll</li>
<li>Matt Pietrek <a href="https://docs.microsoft.com/en-us/archive/msdn-magazine/2002/march/under-the-hood-improved-error-reporting-with-dbghelp-5-1-apis?WT.mc_id=DT-MVP-5003325">article in MSDN Magazine</a> about DBGHELP</li>
<li>DbgHelp samples: <a href="http://www.debuginfo.com/examples/dbghelpexamples.html">http://www.debuginfo.com/examples/dbghelpexamples.html</a></li>
</ul>
]]></content:encoded></item><item><title>Build your own .NET memory profiler in C# — call stacks (2/2–1)</title><link>https://chrisnas.github.io/posts/2020-05-18_build-your-own-net/</link><pubDate>Mon, 18 May 2020 12:07:01 +0000</pubDate><guid>https://chrisnas.github.io/posts/2020-05-18_build-your-own-net/</guid><description>This post explains how to get the call stack corresponding to the allocations with CLR events.</description><content:encoded><![CDATA[<hr>
<p>In the <a href="/posts/2020-04-18_build-your-own-net/">previous episode</a> of this series, you have seen how to get a sampling of .NET application allocations thanks to the <strong>AllocationTick</strong> and <strong>GCSampleObjectAllocation</strong>(<strong>High</strong>/<strong>Low</strong>) CLR events. However, this is often not enough to investigate unexpected memory consumption: you would need to know which part of the code is triggering the allocations. This post explains how to get the call stack corresponding to the allocations, again with CLR events.</p>
<p><img loading="lazy" src="/posts/2020-05-18_build-your-own-net/1_lYXf1qgB1ctzgi5_RKSDEw.jpeg"></p>
<h2 id="introduction">Introduction</h2>
<p>If you look carefully at the payload of the <code>TraceEvent</code> object mapped by Microsoft <strong>TraceEvent</strong> library (not my fault if they have the same name) for each CLR event, you won’t see anything related to a call stack. However, in the <strong>TraceEvent</strong> <a href="https://github.com/microsoft/perfview/blob/master/src/TraceEvent/Samples/41_TraceLogMonitor.cs#L204">sample 41</a>, the following line looks promising:</p>
<blockquote>
<p>var callStack = data.CallStack();</p>
</blockquote>
<p>with <code>data</code> being a <code>TraceEvent</code> object received for each CLR event!</p>
<p>This <code>CallStack</code> method is <a href="https://github.com/microsoft/perfview/blob/master/src/TraceEvent/TraceLog.cs#L10539">an extension method</a> provided by the <code>TraceLog</code> special kind of event source. You might not have noticed, but I used it in the <strong>AllocationTick</strong> code sample from the <a href="/posts/2020-04-18_build-your-own-net/">previous post</a>. This class (and many more helper classes) is doing a lot of work to:</p>
<ul>
<li>“attach” a call stack to each CLR event, i.e. a list of addresses of assembly code,</li>
<li>translate these addresses into string symbols (method names or full signatures): it listens to a bunch of JIT-related events for managed methods (more on this later) and uses the COM-based <a href="https://docs.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/debug-interface-access-sdk?WT.mc_id=DT-MVP-5003325?view=vs-2019">Debug Interface Access</a> (a.k.a. DIA) and <a href="https://www.nuget.org/packages/System.Reflection.Metadata"><strong>MetadataReaderProvider</strong></a> for native functions.</li>
</ul>
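<p>To make the translation of managed addresses more concrete, here is a minimal sketch of the underlying idea: if each JIT event provides the start address and the size of the native code generated for a method, resolving a frame boils down to finding the range that contains its address. The <code>JittedMethodMap</code> class and its members are hypothetical names used for illustration only; this is not how TraceEvent implements it internally.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Hypothetical helper (not part of TraceEvent): maps the native code range
// of each JITted method to its name.
public sealed class JittedMethodMap
{
    // Sorted by start address; the sketch assumes that ranges do not overlap.
    private readonly SortedList&lt;ulong, (uint Size, string Name)&gt; _methods =
        new SortedList&lt;ulong, (uint Size, string Name)&gt;();

    // Would be called for each JIT event (start address + size of the native code).
    public void Add(ulong startAddress, uint size, string fullName)
    {
        _methods[startAddress] = (size, fullName);
    }

    // Returns the name of the method whose code contains the address, or null.
    public string Resolve(ulong address)
    {
        IList&lt;ulong&gt; starts = _methods.Keys;
        int low = 0, high = starts.Count - 1, candidate = -1;

        // Binary search for the last method starting at or before the address.
        while (low &lt;= high)
        {
            int mid = low + (high - low) / 2;
            if (starts[mid] &lt;= address) { candidate = mid; low = mid + 1; }
            else { high = mid - 1; }
        }

        if (candidate &lt; 0) return null;

        // The address must also fall before the end of the candidate's code.
        var (size, name) = _methods.Values[candidate];
        return address - starts[candidate] &lt; size ? name : null;
    }
}
</code></pre></div>
<p>Keeping one such map per process for every managed process on the machine gives an idea of why the cache can grow large. The sketch ignores cases such as methods being compiled more than once.</p>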
<p>Notice that since events from all managed processes on the machine are handled by <code>TraceLog</code>, the internal cache describing JITted methods can consume a lot of memory. During my tests with two Visual Studio instances running, my test profiler consumed more than 500 MB before even handling call stacks. If you are in such an environment with multiple .NET processes, I will show how to “manually” get the same stacks (+ symbols in the next episode) in a cheaper way, with CLR events and a few methods from dbghelp.dll.</p>
<p><img loading="lazy" src="/posts/2020-05-18_build-your-own-net/1_jD1PSQqxKAbjsIOdHVSS_Q.png"></p>
<p>A new provider (more on <strong>ClrRundown</strong> later), new keywords and new events are needed to make all this work:</p>
<p><img loading="lazy" src="/posts/2020-05-18_build-your-own-net/1_tWU46jlpltvIqt0ieA_YCA.png"></p>
<h2 id="tracelog-the-easyway">TraceLog: the easy way</h2>
<p>As you have seen in the previous posts, the <code>TraceEventSession</code> class exposes a <code>Source</code> property of <code>ETWTraceEventSource</code> type. This source has event parsers properties from which you register handler methods that will be called when CLR events are received. Instead of directly using this source, you should wrap it with a <code>TraceLogEventSource</code> object that provides the same event parsers.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">await</span> <span class="n">Task</span><span class="p">.</span><span class="n">Factory</span><span class="p">.</span><span class="n">StartNew</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">using</span> <span class="p">(</span><span class="n">_session</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">SetupProviders</span><span class="p">(</span><span class="n">_session</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">using</span> <span class="p">(</span><span class="n">TraceLogEventSource</span> <span class="n">source</span> <span class="p">=</span> <span class="n">TraceLog</span><span class="p">.</span><span class="n">CreateFromTraceEventSession</span><span class="p">(</span><span class="n">_session</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">SetupListeners</span><span class="p">(</span><span class="n">source</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="n">source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="whats-new-with-providers">What’s new with providers?</h2>
<p>The code for my <code>SetupProviders</code> method is a little bit different from the previous post even though no new event listeners are needed:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">SetupProviders</span><span class="p">(</span><span class="n">TraceEventSession</span> <span class="n">session</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// Note: the kernel provider MUST be the first provider to be enabled</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// If the kernel provider is not enabled, the callstacks for CLR events are still received </span>
</span></span><span class="line"><span class="cl">    <span class="c1">// but the symbols are not found (except for the application itself)</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// TraceEvent implementation details triggered when a module (image) is loaded</span>
</span></span><span class="line"><span class="cl">    <span class="n">session</span><span class="p">.</span><span class="n">EnableKernelProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">ImageLoad</span> <span class="p">|</span>
</span></span><span class="line"><span class="cl">        <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Process</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">KernelTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">None</span>
</span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">session</span><span class="p">.</span><span class="n">EnableProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">ProviderGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">TraceEventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span>    <span class="c1">// this is needed in order to receive AllocationTick_V2 event</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="kt">ulong</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// required to receive AllocationTick events</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GC</span> <span class="p">|</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Jit</span> <span class="p">|</span>                      <span class="c1">// Turning on JIT events is necessary to resolve JIT compiled code </span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">JittedMethodILToNativeMap</span> <span class="p">|</span><span class="c1">// This is needed if you want line number information in the stacks</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Loader</span> <span class="p">|</span>                   <span class="c1">// You must include loader events as well to resolve JIT compiled code.</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Stack</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this provider will send events of already JITed methods</span>
</span></span><span class="line"><span class="cl">    <span class="n">session</span><span class="p">.</span><span class="n">EnableProvider</span><span class="p">(</span><span class="n">ClrRundownTraceEventParser</span><span class="p">.</span><span class="n">ProviderGuid</span><span class="p">,</span> <span class="n">TraceEventLevel</span><span class="p">.</span><span class="n">Informational</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="kt">ulong</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Jit</span> <span class="p">|</span>              <span class="c1">// We need JIT events to be rundown to resolve method names</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">JittedMethodILToNativeMap</span> <span class="p">|</span> <span class="c1">// This is needed if you want line number information in the stacks</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Loader</span> <span class="p">|</span>           <span class="c1">// As well as the module load events.  </span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">StartEnumeration</span>   <span class="c1">// This indicates to do the rundown now (at enable time)</span>
</span></span><span class="line"><span class="cl">        <span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><ul>
<li>The kernel provider needs to be enabled with the <strong>ImageLoad</strong> and <strong>Process</strong> keywords in order to let TraceEvent detect when a process loads “images” (i.e. dlls) and at which address (needed to convert Relative Virtual Addresses (RVA) to addresses in the address space). Note that this provider must be enabled before any other provider or your code will trigger an exception.</li>
<li>The CLR provider needs to be enabled with <strong>Jit</strong>, <strong>JittedMethodILToNativeMap</strong>, and <strong>Loader</strong> (in addition to the usual <strong>GC</strong> one).</li>
<li>The <strong>Stack</strong> keyword has to be set on the same CLR provider to receive call stack events for “normal” CLR events (more on this later).</li>
<li>The CLR Rundown provider is enabled with the same <strong>Jit</strong>, <strong>JittedMethodILToNativeMap</strong>, and <strong>Loader</strong> keywords. That way, JIT events corresponding to <em>already</em> JITted methods will be received (not only the new ones). This is important because otherwise, for processes started before the profiler, you won’t be able to map these methods to the address in memory of their JITted native code. This is the case for my AllocationTickProfiler sample.</li>
</ul>
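<p>As an illustration of the RVA conversion mentioned in the first bullet, here is a minimal sketch. It assumes that each image load notification gives the address at which the module is loaded and its size; the <code>LoadedModule</code> and <code>ModuleMap</code> types are hypothetical and are not part of TraceEvent.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Hypothetical description of a module (dll/exe) loaded in a process.
public sealed class LoadedModule
{
    public LoadedModule(string path, ulong baseAddress, uint size)
    {
        Path = path;
        BaseAddress = baseAddress;
        Size = size;
    }

    public string Path { get; }
    public ulong BaseAddress { get; }
    public uint Size { get; }

    // An RVA is an offset from the address at which the module is loaded.
    public ulong ToAddress(uint rva) =&gt; BaseAddress + rva;

    public bool Contains(ulong address) =&gt;
        address &gt;= BaseAddress &amp;&amp; address - BaseAddress &lt; Size;
}

public static class ModuleMap
{
    // Finds the module containing an address and computes the corresponding RVA.
    public static bool TryGetRva(
        IEnumerable&lt;LoadedModule&gt; modules, ulong address,
        out LoadedModule module, out uint rva)
    {
        foreach (var candidate in modules)
        {
            if (candidate.Contains(address))
            {
                module = candidate;
                rva = (uint)(address - candidate.BaseAddress);
                return true;
            }
        }

        module = null;
        rva = 0;
        return false;
    }
}
</code></pre></div>
<p>Without the image load information, an address found in a call stack cannot be attributed to a module, which is consistent with the symbols not being found when the kernel provider is not enabled.</p>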
<h2 id="callstacks-andsymbols">Callstacks and symbols</h2>
<p>Now, when an <strong>AllocationTick</strong> event is received, calling the <code>CallStack</code> extension method on the <code>GCAllocationTickTraceData</code> argument returns a <code>TraceCallStack</code> object. <a href="https://github.com/microsoft/perfview/blob/master/src/TraceEvent/TraceLog.cs#L7501">This class</a> is a linked list of <code>TraceCodeAddress</code> representing each stack frame (i.e. address in assembly code). These classes are at the heart of TraceEvent and PerfView call stack management. The method names and signatures are retrieved behind the scenes thanks to JIT events and the <code>SymbolReader</code> <a href="https://github.com/microsoft/perfview/blob/01b14294ca97b8f3bb2534624fb9cf2405881193/src/TraceEvent/Symbols/SymbolReader.cs#L21">class</a> that digs into .pdb files.</p>
<p>You first need to initialize a <code>SymbolReader</code> instance:</p>
<ul>
<li>Set the path to find the .pdb; including the Microsoft HTTP endpoint for public .NET versions symbols,</li>
<li>Allow pdb next to the executable to be loaded.</li>
</ul>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// By default a symbol Reader uses whatever is in the _NT_SYMBOL_PATH variable.  However you can override</span>
</span></span><span class="line"><span class="cl"><span class="c1">// if you wish by passing it to the SymbolReader constructor.  Since we want this to work even if you </span>
</span></span><span class="line"><span class="cl"><span class="c1">// have not set an _NT_SYMBOL_PATH, we add the Microsoft default symbol server path to be sure.</span>
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">symbolPath</span> <span class="p">=</span> <span class="k">new</span> <span class="n">SymbolPath</span><span class="p">(</span><span class="n">SymbolPath</span><span class="p">.</span><span class="n">SymbolPathFromEnvironment</span><span class="p">).</span><span class="n">Add</span><span class="p">(</span><span class="n">SymbolPath</span><span class="p">.</span><span class="n">MicrosoftSymbolServerPath</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">_symbolReader</span> <span class="p">=</span> <span class="k">new</span> <span class="n">SymbolReader</span><span class="p">(</span><span class="n">_symbolLookupMessages</span><span class="p">,</span> <span class="n">symbolPath</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// By default the symbol reader will NOT read PDBs from &#39;unsafe&#39; locations (like next to the EXE)  </span>
</span></span><span class="line"><span class="cl"><span class="c1">// because hackers might make malicious PDBs. If you wish to ignore this threat, you can override this</span>
</span></span><span class="line"><span class="cl"><span class="c1">// check to always return &#39;true&#39; for checking that a PDB is &#39;safe&#39;.  </span>
</span></span><span class="line"><span class="cl"><span class="n">_symbolReader</span><span class="p">.</span><span class="n">SecurityCheck</span> <span class="p">=</span> <span class="p">(</span><span class="n">path</span> <span class="p">=&gt;</span> <span class="kc">true</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Then, displaying a <code>TraceCallStack</code> from a received CLR event in a human-readable format is simple:</p>
<ul>
<li>Get one frame after the other from the linked list,</li>
<li>If the method of the frame’s <code>CodeAddress</code> is not resolved yet, load the symbols for its module,</li>
<li>Display the <code>FullMethodName</code> property of the frame (or the address if not found).</li>
</ul>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">DumpStack</span><span class="p">(</span><span class="n">TraceCallStack</span> <span class="n">callStack</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">while</span> <span class="p">(</span><span class="n">callStack</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">codeAddress</span> <span class="p">=</span> <span class="n">callStack</span><span class="p">.</span><span class="n">CodeAddress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">codeAddress</span><span class="p">.</span><span class="n">Method</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="kt">var</span> <span class="n">moduleFile</span> <span class="p">=</span> <span class="n">codeAddress</span><span class="p">.</span><span class="n">ModuleFile</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">moduleFile</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">Debug</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;Could not find module for Address 0x{codeAddress.Address:x}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="k">else</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">codeAddress</span><span class="p">.</span><span class="n">CodeAddresses</span><span class="p">.</span><span class="n">LookupSymbolsForModule</span><span class="p">(</span><span class="n">_symbolReader</span><span class="p">,</span> <span class="n">moduleFile</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(!</span><span class="kt">string</span><span class="p">.</span><span class="n">IsNullOrEmpty</span><span class="p">(</span><span class="n">codeAddress</span><span class="p">.</span><span class="n">FullMethodName</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;     {codeAddress.FullMethodName}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">            <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;     0x{codeAddress.Address:x}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">callStack</span> <span class="p">=</span> <span class="n">callStack</span><span class="p">.</span><span class="n">Caller</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that the first frame in the linked list is the last on the stack (i.e. last executed method).</p>
<p>As I mentioned at the beginning of the post, I faced OutOfMemory errors caused by the large memory usage of TraceEvent symbol management when a few other .NET applications were running. Let’s see how to get the call stacks in a less memory-consuming way.</p>
<h2 id="manually-rebuilding-the-allocations-callstack">Manually rebuilding the allocations call stack</h2>
<p>Instead of using the call stack and symbol management provided by <code>TraceLog</code> in TraceEvent, I would prefer to get them manually. If you remember the <a href="/posts/2020-04-18_build-your-own-net/">last post</a>, thanks to <strong>GCSampledObjectAllocation</strong> CLR events, it is possible to get a sampling of the allocation size and count per process and per type. What I would like to add to this per-type picture is the list of call stacks leading to these allocations.</p>
<h2 id="how-to-manually-get-clr-events-callstack">How to manually get CLR events call stack</h2>
<p>The first step is to understand how to get the CLR events call stacks. If you use the <code>TraceLog</code>-based code just presented, you should see the following kind of call stack:</p>
<p><img loading="lazy" src="/posts/2020-05-18_build-your-own-net/1_oZ7Y3712aL4ELCpUixnXrw.png"></p>
<p>The <code>ETWCallout</code> <a href="https://github.com/dotnet/runtime/blob/5178041776634bfbc4f868425710e60d95f7066f/src/coreclr/src/vm/eventtrace.cpp#L4423">CLR helper function</a> is in charge of sending a special event containing the call stack of other normal events from the four supported CLR providers. If you set the <strong>Stack</strong> keyword on the CLR provider, each time an event is sent by a thread, a <strong>ClrStackWalk</strong> event is sent just after. It means that each <strong>SampleObjectAllocation</strong> event is immediately followed by a <strong>ClrStackWalk</strong> event containing its call stack. However, since an application will probably use more than one thread, events from different threads can be interleaved, so the mapping between the two events must be done on a per-thread basis.</p>
<p>Each allocation event received by the <code>OnSampleObjectAllocation</code> handler exposes a <code>ThreadID</code> property, so it is easy to keep track of the last received allocation event per thread. In my case, the <code>ProcessAllocations</code> class stores this information in its <code>_perThreadLastAllocation</code> field:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">ProcessAllocations</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="n">AllocationInfo</span><span class="p">&gt;</span> <span class="n">_allocations</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">AllocationInfo</span><span class="p">&gt;</span> <span class="n">_perThreadLastAllocation</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now, each time a <strong>SampleObjectAllocation</strong> event is received, the id of the sending thread is passed to the updated <code>ProcessAllocations.AddAllocation()</code> method (as its first parameter, named <code>pid</code> in the code):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">AllocationInfo</span> <span class="n">AddAllocation</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">size</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">count</span><span class="p">,</span> <span class="kt">string</span> <span class="n">typeName</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">_allocations</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">typeName</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">info</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">info</span> <span class="p">=</span> <span class="k">new</span> <span class="n">AllocationInfo</span><span class="p">(</span><span class="n">typeName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">_allocations</span><span class="p">[</span><span class="n">typeName</span><span class="p">]</span> <span class="p">=</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">info</span><span class="p">.</span><span class="n">AddAllocation</span><span class="p">(</span><span class="n">size</span><span class="p">,</span> <span class="n">count</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// the last allocation is still here without the corresponding stack</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_perThreadLastAllocation</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">lastAlloc</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;no stack for the last allocation&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// keep track of the allocation for the given thread</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// --&gt; will be used when the corresponding call stack event will be received</span>
</span></span><span class="line"><span class="cl">    <span class="n">_perThreadLastAllocation</span><span class="p">[</span><span class="n">pid</span><span class="p">]</span> <span class="p">=</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>_perThreadLastAllocation</code> dictionary stores the last <code>AllocationInfo</code> per thread. Each time an allocation event is received, the corresponding <code>AllocationInfo</code> is stored in the dictionary. When a <strong>ClrStackWalk</strong> event is then received for a given thread, the stack is associated with this last <code>AllocationInfo</code>, and the entry is removed from the dictionary. If a stack event is missed (it never happened during my tests but who knows), the previous entry is still there when the next allocation arrives and an error message is logged.</p>
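<p>The pairing described above can be sketched in isolation. This is only a minimal illustration, not the code from the repository: the <code>PendingAllocations</code> class and its <code>OnAllocation</code>/<code>OnStack</code> methods are hypothetical names, and the allocation is reduced to its type name.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Hypothetical, simplified stand-in for the pairing logic
// (names are illustrative, not taken from the ClrEvents repository)
public class PendingAllocations
{
    // thread ID -&gt; type name of the last allocation still waiting for its call stack
    private readonly Dictionary&lt;int, string&gt; _pending = new Dictionary&lt;int, string&gt;();

    // Called for each allocation event.
    // Returns false when the previous allocation on this thread never received
    // its stack (i.e. a stack event was missed).
    public bool OnAllocation(int threadId, string typeName)
    {
        bool previousWasPaired = !_pending.ContainsKey(threadId);
        _pending[threadId] = typeName;
        return previousWasPaired;
    }

    // Called for each stack event.
    // Returns the type name the stack belongs to, or null when no allocation
    // is pending for this thread.
    public string OnStack(int threadId)
    {
        if (!_pending.TryGetValue(threadId, out var typeName)) return null;

        _pending.Remove(threadId);
        return typeName;
    }
}
</code></pre></div>
<p>Calling <code>OnAllocation</code> twice for the same thread without an <code>OnStack</code> in between makes the second call return <code>false</code>: this is the situation in which the real code writes its “no stack for the last allocation” message.</p>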
<p>The <code>ClrStackWalkTraceData</code> argument received by the <strong>ClrStackWalk</strong> listener has a <code>FrameCount</code> property that returns the number of frames in the call stack. In addition, its <code>InstructionPointer()</code> method takes a frame position in the stack (starting at 0) and returns the address (in assembly code) at this position on the call stack.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnClrStackWalk</span><span class="p">(</span><span class="n">ClrStackWalkTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FilterOutEvent</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">callstack</span> <span class="p">=</span> <span class="n">BuildCallStack</span><span class="p">(</span><span class="n">data</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">GetProcessAllocations</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">).</span><span class="n">AddStack</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ThreadID</span><span class="p">,</span> <span class="n">callstack</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">AddressStack</span> <span class="n">BuildCallStack</span><span class="p">(</span><span class="n">ClrStackWalkTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">length</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">FrameCount</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">AddressStack</span> <span class="n">stack</span> <span class="p">=</span> <span class="k">new</span> <span class="n">AddressStack</span><span class="p">(</span><span class="n">length</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// frame 0 is the last frame of the stack (i.e. last called method)</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">length</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">stack</span><span class="p">.</span><span class="n">AddFrame</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">InstructionPointer</span><span class="p">(</span><span class="n">i</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">stack</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>AddressStack</code> class returned by <code>BuildCallStack</code> keeps the frames as a list of addresses so that it can be stored in <code>AllocationInfo</code>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">AddressStack</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// the first frame is the address of the last called method </span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">List</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;</span> <span class="n">_stack</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">AddressStack</span><span class="p">(</span><span class="kt">int</span> <span class="n">capacity</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_stack</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;(</span><span class="n">capacity</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// No need to override GetHashCode because we don&#39;t want to use it as a key in a dictionary</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">override</span> <span class="kt">bool</span> <span class="n">Equals</span><span class="p">(</span><span class="kt">object</span> <span class="n">obj</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">obj</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span> <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">stack</span> <span class="p">=</span> <span class="n">obj</span> <span class="k">as</span> <span class="n">AddressStack</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">stack</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span> <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">frameCount</span> <span class="p">=</span> <span class="n">_stack</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">frameCount</span> <span class="p">!=</span> <span class="n">stack</span><span class="p">.</span><span class="n">_stack</span><span class="p">.</span><span class="n">Count</span><span class="p">)</span> <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">frameCount</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">_stack</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="p">!=</span> <span class="n">stack</span><span class="p">.</span><span class="n">_stack</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">IReadOnlyList</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;</span> <span class="n">Stack</span> <span class="p">=&gt;</span> <span class="n">_stack</span><span class="p">;</span> 
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">AddFrame</span><span class="p">(</span><span class="kt">ulong</span> <span class="n">address</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_stack</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">address</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This class overrides the <code>Equals</code> method for a single reason: I want to be able to detect when the “same” stack (i.e. with the exact same frame addresses) is received for a given type allocation. That way, I just need to keep a counter for each different <code>AddressStack</code> and not all call stacks in <code>AllocationInfo</code>. Remember that <code>AllocationInfo</code> is used to keep track of allocations per type:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">AllocationInfo</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">string</span> <span class="n">_typeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">ulong</span> <span class="n">_size</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">ulong</span> <span class="n">_count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">StackInfo</span><span class="p">&gt;</span> <span class="n">_stacks</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>StackInfo</code> class contains an <code>AddressStack</code> and how many times it led to this type of allocation.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">StackInfo</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">AddressStack</span> <span class="n">_stack</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">ulong</span> <span class="n">Count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">internal</span> <span class="n">StackInfo</span><span class="p">(</span><span class="n">AddressStack</span> <span class="n">stack</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Count</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_stack</span> <span class="p">=</span> <span class="n">stack</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">AddressStack</span> <span class="n">Stack</span> <span class="p">=&gt;</span> <span class="n">_stack</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>So, when a stack event is received, <code>AddStack</code> is called on the last <code>AllocationInfo</code> for the same thread:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">void</span> <span class="n">AddStack</span><span class="p">(</span><span class="kt">int</span> <span class="n">tid</span><span class="p">,</span> <span class="n">AddressStack</span> <span class="n">stack</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_perThreadLastAllocation</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">tid</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">lastAlloc</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">lastAlloc</span><span class="p">.</span><span class="n">AddStack</span><span class="p">(</span><span class="n">stack</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">_perThreadLastAllocation</span><span class="p">.</span><span class="n">Remove</span><span class="p">(</span><span class="n">tid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The job of the <code>AllocationInfo.AddStack()</code> method is to check whether a previous allocation was made with the same call stack (hence the <code>Equals</code> override). If this is the case, it just increments the corresponding <code>StackInfo</code> count. Otherwise, it creates a new <code>StackInfo</code> for this call stack, whose count starts at 0 and is immediately incremented to 1.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">internal</span> <span class="k">void</span> <span class="n">AddStack</span><span class="p">(</span><span class="n">AddressStack</span> <span class="n">stack</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">info</span> <span class="p">=</span> <span class="n">GetInfo</span><span class="p">(</span><span class="n">stack</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">info</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">info</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StackInfo</span><span class="p">(</span><span class="n">stack</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">_stacks</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">info</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">info</span><span class="p">.</span><span class="n">Count</span><span class="p">++;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">StackInfo</span> <span class="n">GetInfo</span><span class="p">(</span><span class="n">AddressStack</span> <span class="n">stack</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">_stacks</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">info</span> <span class="p">=</span> <span class="n">_stacks</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">stack</span><span class="p">.</span><span class="n">Equals</span><span class="p">(</span><span class="n">info</span><span class="p">.</span><span class="n">Stack</span><span class="p">))</span> <span class="k">return</span> <span class="n">info</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Knowing the code address of each frame in every call stack is nice, but it would be much more useful to translate these addresses into method names… You have to deal with two different cases: managed and native methods. I will cover these topics in the next episode.</p>
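<p>A side note on the linear search in <code>GetInfo</code>: it is fine as long as each type is allocated from a small number of distinct call stacks. If that number grows, one possible alternative is to hash the frames so that a dictionary can do the lookup. The following is only a sketch of that idea, not what the repository does: <code>FrameComparer</code> and <code>StackCounter</code> are hypothetical names, and stacks are represented as plain <code>ulong[]</code> arrays that must not be modified once added.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Hypothetical comparer (not part of the original code): two frame arrays are
// equal when they hold the same addresses in the same order
public sealed class FrameComparer : IEqualityComparer&lt;ulong[]&gt;
{
    public bool Equals(ulong[] x, ulong[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;

        for (int i = 0; i &lt; x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }

        return true;
    }

    public int GetHashCode(ulong[] frames)
    {
        // simple multiplicative hash; unchecked so that overflow wraps around
        unchecked
        {
            int hash = 17;
            foreach (var address in frames)
            {
                hash = hash * 31 + address.GetHashCode();
            }
            return hash;
        }
    }
}

// Hypothetical replacement for the List&lt;StackInfo&gt; + linear search pair
public class StackCounter
{
    private readonly Dictionary&lt;ulong[], ulong&gt; _counts =
        new Dictionary&lt;ulong[], ulong&gt;(new FrameComparer());

    public void AddStack(ulong[] frames)
    {
        _counts.TryGetValue(frames, out var count);  // count stays 0 if not found
        _counts[frames] = count + 1;
    }

    public ulong GetCount(ulong[] frames) =&gt;
        _counts.TryGetValue(frames, out var count) ? count : 0;

    public int DistinctStacks =&gt; _counts.Count;
}
</code></pre></div>
<p>The trade-off is the cost of hashing every frame of each incoming stack versus comparing it against each stack already recorded for the type.</p>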
<h2 id="resources">Resources</h2>
<ul>
<li>Source code available <a href="https://github.com/chrisnas/ClrEvents">on GitHub</a>.</li>
<li>TraceEvent <a href="https://github.com/microsoft/perfview/blob/master/src/TraceEvent/Samples/41_TraceLogMonitor.cs#L204">sample 41</a> source code.</li>
</ul>
<hr>
<p>Missed the first part of this story? Check this out:</p>
<p><a href="/posts/2020-04-18_build-your-own-net/"><strong>Build your own .NET memory profiler in C#</strong><br>
<em>This post explains how to collect allocation details by writing your own memory profiler in C#.</em></a></p>
<hr>
<p><strong>Interested in joining our journey? Check this out:</strong></p>
<p><a href="https://careers.criteo.com/working-in-R&amp;D"><strong>Product, Research &amp; Development | Criteo Careers</strong></a></p>
]]></content:encoded></item><item><title>Build your own .NET memory profiler in C# — Allocations (1/2)</title><link>https://chrisnas.github.io/posts/2020-04-18_build-your-own-net/</link><pubDate>Sat, 18 Apr 2020 09:48:53 +0000</pubDate><guid>https://chrisnas.github.io/posts/2020-04-18_build-your-own-net/</guid><description>This post explains how to collect allocation details by writing your own memory profiler in C#.</description><content:encoded><![CDATA[<hr>
<p>In a <a href="/posts/2019-05-28_spying-on-net-garbage/">previous post</a>, I explained how to get statistics about the .NET Garbage Collector such as suspension time or generation sizes. But what if you need more details about your application’s allocations, such as how many times instances of a given type were allocated and for what cumulative size, or even the allocation rate? This post explains how to get access to such information by writing your own memory profiler. The next one will show how to collect the stack trace of each sampled allocation.</p>
<h2 id="introduction">Introduction</h2>
<p>I have already used commercial tools to get detailed information about allocated type instances in an application: Visual Studio Profiler, dotTrace, ANTS Memory Profiler, or PerfView, to name a few. With these tools in mind, I started to look at the .NET profiler API documentation and it reminded me of the first time I read about the .NET profiler API. It was in December 2001 in <a href="https://docs.microsoft.com/en-us/archive/msdn-magazine/2001/december/under-the-hood-the-net-profiling-api-and-the-dnprofiler-tool?WT.mc_id=DT-MVP-5003325">Matt Pietrek’s MSDN Magazine article</a> (I still have the paper version). When your application starts, based on an environment variable, the .NET Framework (and now .NET Core) runtime loads a profiler COM object that implements a specific <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback9-interface?WT.mc_id=DT-MVP-5003325"><strong>ICorProfilerCallback</strong></a> interface (today, runtimes support the 9th version, the <strong>ICorProfilerCallback9</strong> interface). The methods of this interface are called by the runtime at specific moments during the application lifetime. For example, the <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectallocated-method?WT.mc_id=DT-MVP-5003325"><strong>ObjectAllocated</strong></a> method is called each time an instance of a class is allocated: perfect for the job, but it requires going back to COM and writing native code. Don’t be scared: I won’t go that way :^)</p>
<p><em>However, if you would like to get more details about writing your own .NET profiler in C or C++, I would recommend looking at the Microsoft ClrProfiler <em><a href="https://github.com/microsoftarchive/clrprofiler/tree/master/CLRProfiler"><em>initial .NET Framework implementation</em></a></em> and also Pavel Yosifovich’s DotNext session about <em><a href="https://www.youtube.com/watch?v=TqS4OEWn6hQ"><em>Writing a .NET Core cross platform profiler in an hour</em></a></em> with the corresponding (more recent and cross-platform) <em><a href="https://github.com/zodiacon/DotNextMoscow2019"><em>source code</em></a></em>.</em></p>
<p>Instead, several events emitted by the CLR provide interesting details:</p>
<p><img loading="lazy" src="/posts/2020-04-18_build-your-own-net/1_8RzRelU9Rgux0TJRdFhzzw.png"></p>
<p><img loading="lazy" src="/posts/2020-04-18_build-your-own-net/1_-DSm_89dq8yj8jI1aZ3ZYg.png"></p>
<p>The payload of the <strong>GCSampledObjectAllocation</strong> events provides a type ID instead of a plain text type name. In order to retrieve the type name given its ID, we need to listen to the <strong>TypeBulkType</strong> event that contains the mapping, as I described in <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">my post about finalizers</a>. This is why the last two keywords, <strong>GCHeapAndTypeNames</strong> and <strong>Type</strong>, are needed.</p>
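<p>The ID-to-name lookup boils down to a dictionary that is filled from one kind of event and read from another. Here is a minimal sketch of that idea; the <code>TypeNameCache</code> class and its methods are hypothetical names, and the wiring to the actual events is left out. Since a type ID is only meaningful inside the process that emitted it, I assume one cache per monitored process.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Hypothetical helper: remembers the type ID -&gt; type name pairs carried by
// TypeBulkType events so that allocation events can be resolved later
public class TypeNameCache
{
    private readonly Dictionary&lt;ulong, string&gt; _names = new Dictionary&lt;ulong, string&gt;();

    // To be called for each (ID, name) entry found in a TypeBulkType event
    public void Add(ulong typeId, string typeName)
    {
        _names[typeId] = typeName;
    }

    // To be called with the type ID found in an allocation event payload;
    // falls back to the hexadecimal ID when the mapping has not been received yet
    public string GetName(ulong typeId)
    {
        return _names.TryGetValue(typeId, out var name) ? name : $"#{typeId:x}";
    }
}
</code></pre></div>
<p>The fallback matters because nothing in this sketch guarantees that the mapping for a type is received before the first allocation event that refers to it.</p>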
<p>Remember that if both <strong>GCSampledObjectAllocationLow</strong> and <strong>GCSampledObjectAllocationHigh</strong> keywords are set, an event will be received for EACH allocation. This could be a performance issue both for the monitored application and the profiler. I would recommend starting with either low or high (more on this later).</p>
<p>Last but not least, enabling at least one of these keywords also <a href="https://github.com/dotnet/runtime/blob/fcd862e06413a000f9cafa9d2f359226c60b9b42/src/coreclr/src/vm/jitinterfacegen.cpp#L69">switches the CLR to “slower” allocators</a>. This is why you should check that it does not impact your application performance. These slower allocators are also used when your <strong>ICorProfilerCallback.Initialize</strong> method calls <strong>SetEventMask</strong> with the <strong>COR_PRF_ENABLE_OBJECT_ALLOCATED</strong> flag to receive allocation notifications.</p>
<p>When you use <a href="https://github.com/Microsoft/perfview/releases">PerfView</a> for memory investigations, you are relying on these events without knowing it. In the Collect/Run dialog, three checkboxes define how the memory profiling details are collected:</p>
<p><img loading="lazy" src="/posts/2020-04-18_build-your-own-net/1_8kuywni3W8PVleqg5NwWPw.png"></p>
<ul>
<li><em>.NET Alloc</em>: use a custom native C++ <strong>ICorProfilerCallback</strong> implementation (noticeable impact on the profiled application performance).</li>
<li><em>.NET SampAlloc</em>: use the same custom native profiler but with sampled events.</li>
<li><em>ETW .NET Alloc</em>: use <strong>GCSampledObjectAllocationHigh</strong> events.</li>
</ul>
<p>In all cases, the profiled application needs to be started after the collection begins.</p>
<h2 id="how-to-listen-to-allocation-events">How to listen to allocation events</h2>
<p>As I have already explained in previous posts, the Microsoft <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent/"><strong>TraceEvent</strong> nuget</a> helps you listen to CLR events. First, you create a <strong>TraceEventSession</strong> and set up the providers you want to receive events from:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">session</span><span class="p">.</span><span class="n">EnableProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">ProviderGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">TraceEventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span>    <span class="c1">// this is needed in order to receive AllocationTick_V2 event</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="kt">ulong</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// required to receive AllocationTick events</span>
</span></span><span class="line"><span class="cl">    <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GC</span> <span class="p">|</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// the CLR source code indicates that the provider must be set before the monitored application starts</span>
</span></span><span class="line"><span class="cl">    <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GCSampledObjectAllocationLow</span> <span class="p">|</span> 
</span></span><span class="line"><span class="cl">    <span class="c1">//ClrTraceEventParser.Keywords.GCSampledObjectAllocationHigh | </span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// required to receive the BulkType events that allows </span>
</span></span><span class="line"><span class="cl">    <span class="c1">// mapping between the type ID received in the allocation events</span>
</span></span><span class="line"><span class="cl">    <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GCHeapAndTypeNames</span> <span class="p">|</span>   
</span></span><span class="line"><span class="cl">    <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Type</span>
</span></span><span class="line"><span class="cl"><span class="p">));</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Second, you set up the handlers for the events you are interested in:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">SetupListeners</span><span class="p">(</span><span class="n">TraceLogEventSource</span> <span class="n">source</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Clr</span><span class="p">.</span><span class="n">GCAllocationTick</span> <span class="p">+=</span> <span class="n">OnAllocationTick</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Clr</span><span class="p">.</span><span class="n">GCSampledObjectAllocation</span> <span class="p">+=</span> <span class="n">OnSampleObjectAllocation</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// required to receive the mapping between type ID (received in GCSampledObjectAllocation)</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// and their name (received in TypeBulkType)</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Clr</span><span class="p">.</span><span class="n">TypeBulkType</span> <span class="p">+=</span> <span class="n">OnTypeBulkType</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And lastly, the processing of received events is done in a dedicated thread until the session is disposed of:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="k">await</span> <span class="n">Task</span><span class="p">.</span><span class="n">Factory</span><span class="p">.</span><span class="n">StartNew</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">using</span> <span class="p">(</span><span class="n">_session</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">SetupProviders</span><span class="p">(</span><span class="n">_session</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">SetupListeners</span><span class="p">(</span><span class="n">_session</span><span class="p">.</span><span class="n">Source</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        
</span></span><span class="line"><span class="cl">        <span class="n">_session</span><span class="p">.</span><span class="n">Source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now let’s see the difference between the two sets of events.</p>
<h2 id="the-allocationtick-way">The AllocationTick way</h2>
<p>My first idea was to use the <strong>AllocationTick</strong> event because it seemed easy: one sampled event with a size, a type name, and the LOH/ephemeral kind. However, the way this sampling works makes it impossible to get an exact per-type allocated size. Let’s have a look at this list of events received from a WPF test application:</p>
<pre tabindex="0"><code>Small | 105444 : FreezableContextPair[]
Small | 111908 : FreezableContextPair[]
Small | 106720 : System.String
Small | 102488 : System.String
Small | 107028 : System.TimeSpan[]
Small | 106100 : System.String
</code></pre><p>All allocations were <strong>small</strong> (i.e. not in the LOH: &lt; 85,000 bytes). The second column gives the cumulated size of all allocations needed to reach the 100 KB threshold, not the size allocated for this particular type! There is no easy way to make a valid guess of the size of the specific last allocation for which we get the type name.</p>
<p>For example, the first array of <strong>FreezableContextPair</strong> triggered the event for a cumulated size of 105,444 bytes. But how big was this array? We don’t know: it could have been 100,000 bytes because only 5,444 bytes were allocated before, or only 10,444 bytes because 95,000 bytes were allocated before. It would have been so useful if the size of the last allocated object had been passed in the event payload…</p>
<p>It is a little bit different (but not much better) for objects allocated in the LOH because they have to be at least 85,000 bytes long. For example, let’s allocate 4 byte arrays, each one 85,000 bytes long, and look at the corresponding events:</p>
<pre tabindex="0"><code>Large | 170064 : System.Byte[]
Large | 170064 : System.Byte[]
</code></pre><p>Two <strong>AllocationTick</strong> events are received with 170064 as the cumulated size. It is still hard to figure out the size of the last allocated array: the only thing we know is that it was larger than or equal to 85,000 bytes because it was allocated in the LOH.</p>
<p>For larger objects, it might seem a little bit more accurate. Let’s allocate 2 byte arrays, each one 110,000 bytes long:</p>
<pre tabindex="0"><code>Large | 195064 : System.Byte[]
Large | 110032 : System.Byte[]
</code></pre><p>There is a difference of ~85,000 bytes between the two events even though the same 110,000 bytes were allocated each time. You could subtract 85,000 bytes from the value to get an approximation of the size of the object allocated in the LOH: the larger the allocation, the smaller the relative error. But still, the error could be as large as 85,000 bytes…</p>
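<p>To make these numbers easier to reason about, here is a minimal sketch of a simplified model of the sampling. It is not the CLR implementation: it assumes that a counter is incremented by the size of each allocation, that an event reports the counter as soon as it reaches 100 KB (taken here as 100 * 1024 bytes), and that the counter then restarts from zero. The <strong>AllocationTickModel</strong> class, its <strong>Simulate</strong> method and the <strong>carriedOver</strong> parameter are hypothetical names used only for this illustration:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Simplified model of the AllocationTick sampling (an assumption, not the CLR code)
public static class AllocationTickModel
{
    // assumed threshold: 100 KB expressed as 100 * 1024 bytes
    public const long Threshold = 100 * 1024;

    // Replays a sequence of allocation sizes and returns the cumulated size
    // reported each time the running total reaches the threshold.
    // carriedOver stands for bytes already counted before the sequence starts.
    public static List&lt;long&gt; Simulate(IEnumerable&lt;long&gt; sizes, long carriedOver = 0)
    {
        var reported = new List&lt;long&gt;();
        long cumulated = carriedOver;
        foreach (long size in sizes)
        {
            cumulated += size;
            if (cumulated &gt;= Threshold)
            {
                // only the total is reported: the size of the last allocation is lost
                reported.Add(cumulated);
                cumulated = 0;
            }
        }
        return reported;
    }
}
</code></pre><p>Under these assumptions, replaying four allocations of 85,032 bytes (the per-array size suggested by 170064 / 2) gives 170064 twice. Replaying two allocations of 110,032 bytes with 85,032 bytes carried over from a previous allocation gives 195064 followed by 110032. This is one possible explanation for the two listings above, and it shows why the reported value mixes the last allocation with whatever was counted before it.</p>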
<p>So we can’t rely on the size provided by the <strong>AllocationTick</strong> event, only on the type name. In addition, you get a view of the objects allocated in the LOH. Maybe the other events will provide better results.</p>
<h2 id="the-gcsampledobjectallocation-way">The GCSampledObjectAllocation way</h2>
<p>When an object is allocated by the GC allocator, a <strong>GCSampledObjectAllocation</strong> event is emitted when one of the following conditions is met:</p>
<ul>
<li>Both <strong>GCSampledObjectAllocationLow</strong> and <strong>GCSampledObjectAllocationHigh</strong> keywords are set on the CLR provider,</li>
<li>The object size is larger than 10,000 bytes,</li>
<li>At least 1000 instances of the type have been allocated,</li>
<li>Just before the application exits, current statistics for all types <a href="https://github.com/dotnet/runtime/blob/61ec7c7bdacb70ffd51dece09e30179f86156a0d/src/coreclr/src/vm/eventtrace.cpp#L3668">are flushed</a>,</li>
<li>A <a href="https://github.com/dotnet/runtime/blob/61ec7c7bdacb70ffd51dece09e30179f86156a0d/src/coreclr/src/vm/eventtrace.cpp#L3067">complicated piece of code</a> decides based on time since the last event and the type allocation rate.</li>
</ul>
<p>Picking one or the other keyword <a href="https://github.com/dotnet/runtime/blob/61ec7c7bdacb70ffd51dece09e30179f86156a0d/src/coreclr/src/vm/eventtrace.cpp#L2902">changes the maximum number of milliseconds between two events</a> for a given type:</p>
<ul>
<li>High (10 ms) : 100 events / second</li>
<li>Low (200 ms) : 5 events / second</li>
</ul>
<p>You should use low or high depending on the memory allocation workload of the monitored application, to avoid putting too much load on the profiler (and even on the monitored application).</p>
<p>The interesting feature of these events is that, for a given type, the payload contains both the number of instances allocated since the last event and the cumulated size of these instances. Let’s take the same allocation of 4 byte arrays, each 85,000 bytes long:</p>
<pre tabindex="0"><code>226 | 103616 : System.Byte[]
  1 |  85012 : System.Byte[]
  1 |  85012 : System.Byte[]
  1 |  85012 : System.Byte[]
</code></pre><p>This time, we get the exact count in the first column (<strong>ObjectCountForTypeSample</strong>) and the exact cumulated size in the second column (<strong>TotalSizeForTypeSample</strong>). If the count is 1, we have the exact size of that allocation, and if it is bigger than 85,000 bytes, we know it has been allocated in the LOH. We get the same accuracy for the 2 byte arrays of 110,000 elements:</p>
<pre tabindex="0"><code>198 | 123552 : System.Byte[]
  1 | 110012 : System.Byte[]
</code></pre><p>Sounds good. However, you have to remember that profiled applications need to be started after the session is created: it means that you can’t write a tool that listens to a specific, already running process ID like you can with <strong>AllocationTick</strong>. Since events can come from any process started after the session, the state is kept per process: three dictionaries are used by <strong>PerProcessProfilingState</strong> to keep track of per-type allocations, type ID mappings, and process names:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">PerProcessProfilingState</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;</span> <span class="n">_processNames</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">ProcessTypeMapping</span><span class="p">&gt;</span> <span class="n">_perProcessTypes</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">ProcessTypeMapping</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">ProcessAllocationInfo</span><span class="p">&gt;</span> <span class="n">_perProcessAllocations</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">ProcessAllocationInfo</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;</span> <span class="n">Names</span> <span class="p">=&gt;</span> <span class="n">_processNames</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">ProcessTypeMapping</span><span class="p">&gt;</span> <span class="n">Types</span> <span class="p">=&gt;</span> <span class="n">_perProcessTypes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">Dictionary</span><span class="p">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">ProcessAllocationInfo</span><span class="p">&gt;</span> <span class="n">Allocations</span> <span class="p">=&gt;</span> <span class="n">_perProcessAllocations</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>SampledObjectAllocationMemoryProfiler</strong> class uses it to process the events:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">SampledObjectAllocationMemoryProfiler</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">TraceEventSession</span> <span class="n">_session</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">PerProcessProfilingState</span> <span class="n">_processes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="c1">// because we are not interested in self monitoring</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">int</span> <span class="n">_currentPid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">int</span> <span class="n">_started</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">SampledObjectAllocationMemoryProfiler</span><span class="p">(</span><span class="n">TraceEventSession</span> <span class="n">session</span><span class="p">,</span> <span class="n">PerProcessProfilingState</span> <span class="n">processes</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_session</span> <span class="p">=</span> <span class="n">session</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_processes</span> <span class="p">=</span> <span class="n">processes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_currentPid</span> <span class="p">=</span> <span class="n">Process</span><span class="p">.</span><span class="n">GetCurrentProcess</span><span class="p">().</span><span class="n">Id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The constructor of the profiler keeps track of its own process ID in <strong>_currentPid</strong> to skip its own events.</p>
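<p>Going back to the count and cumulated size carried by each <strong>GCSampledObjectAllocation</strong> event, the following minimal sketch shows how a pair of values could be interpreted. The <strong>SampledAllocation</strong> type and its members are hypothetical names introduced for this illustration only; the 85,000 bytes limit is the LOH threshold mentioned earlier in this post:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical helper: interprets the count and total size of one sampled event
public readonly struct SampledAllocation
{
    // LOH threshold mentioned in this post
    public const long LohThreshold = 85_000;

    public long Count { get; }      // value of ObjectCountForTypeSample
    public long TotalSize { get; }  // value of TotalSizeForTypeSample

    public SampledAllocation(long count, long totalSize)
    {
        Count = count;
        TotalSize = totalSize;
    }

    // the size is exact only when the event covers a single instance
    public bool HasExactSize =&gt; Count == 1;

    // average instance size: only an estimate when Count is greater than 1
    public double AverageSize =&gt; Count == 0 ? 0 : (double)TotalSize / Count;

    // only a single-instance event can be attributed to the LOH with certainty
    public bool IsKnownLoh =&gt; HasExactSize &amp;&amp; TotalSize &gt;= LohThreshold;
}
</code></pre><p>With the first listing above, the event with 226 instances and 103616 bytes only gives an average of about 458 bytes per instance, whereas each event with 1 instance and 85012 bytes gives an exact size that is large enough for the LOH.</p>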
<h2 id="gathering-typemapping">Gathering type mapping</h2>
<p>The processing of <strong>TypeBulkType</strong> events is quite straightforward: store the type ID/name association into a per-process dictionary:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnTypeBulkType</span><span class="p">(</span><span class="n">GCBulkTypeTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FilterOutEvent</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">ProcessTypeMapping</span> <span class="n">mapping</span> <span class="p">=</span> <span class="n">GetProcessTypesMapping</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">currentType</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">currentType</span> <span class="p">&lt;</span> <span class="n">data</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span> <span class="n">currentType</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">GCBulkTypeValues</span> <span class="k">value</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">Values</span><span class="p">(</span><span class="n">currentType</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">mapping</span><span class="p">[</span><span class="k">value</span><span class="p">.</span><span class="n">TypeID</span><span class="p">]</span> <span class="p">=</span> <span class="k">value</span><span class="p">.</span><span class="n">TypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">ProcessTypeMapping</span> <span class="n">GetProcessTypesMapping</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ProcessTypeMapping</span> <span class="n">mapping</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">_processes</span><span class="p">.</span><span class="n">Types</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="k">out</span> <span class="n">mapping</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">AssociateProcess</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">mapping</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ProcessTypeMapping</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">_processes</span><span class="p">.</span><span class="n">Types</span><span class="p">[</span><span class="n">pid</span><span class="p">]</span> <span class="p">=</span> <span class="n">mapping</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">mapping</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Remember that I chose to skip events from the current process; <strong>FilterOutEvent()</strong> detects them.</p>
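<p>The <strong>ProcessTypeMapping</strong> class itself is not listed here. Judging only from how it is used above (a constructor taking the process ID and an indexer from type ID to type name), a minimal sketch could look like the following. The <strong>ProcessId</strong> property and the placeholder returned for an unknown type ID are assumptions made for this illustration, so the real class may well differ:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Hypothetical sketch: per-process mapping between type IDs and type names
public class ProcessTypeMapping
{
    private readonly Dictionary&lt;ulong, string&gt; _typeNames = new Dictionary&lt;ulong, string&gt;();

    public ProcessTypeMapping(int processId)
    {
        ProcessId = processId;
    }

    public int ProcessId { get; }

    public string this[ulong typeId]
    {
        // assumption: return the ID in hexadecimal instead of throwing
        // when no TypeBulkType event has provided the name yet
        get =&gt; _typeNames.TryGetValue(typeId, out var name) ? name : &#34;0x&#34; + typeId.ToString(&#34;x&#34;);

        // a later event for the same ID simply overwrites the previous name
        set =&gt; _typeNames[typeId] = value;
    }
}
</code></pre><p>Returning a placeholder is only one possible design choice: it keeps the allocation handlers simple when an allocation event is processed before the corresponding mapping has been received.</p>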
<h2 id="how-to-get-processnames">How to get process names</h2>
<p>Even though each event contains the ID of the emitting process, it would be better to display its name instead. You could use <strong>Process.GetProcessById(pid).ProcessName</strong> when analyzing the details, but the process might be long gone by then.</p>
<p>Another solution would be to enable the Kernel ETW provider and listen to the <strong>ProcessStart</strong> event. The <strong>ImageFileName</strong> field of the payload contains the process filename with the extension. However, this obviously does not work on Linux.</p>
<p>The easiest solution is to use <strong>GetProcessById</strong>, but only when you receive the first type mapping for a given process. This is the role of the <strong>AssociateProcess</strong> method called in <strong>GetProcessTypesMapping</strong> shown previously:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">AssociateProcess</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_processes</span><span class="p">.</span><span class="n">Names</span><span class="p">[</span><span class="n">pid</span><span class="p">]</span> <span class="p">=</span> <span class="n">Process</span><span class="p">.</span><span class="n">GetProcessById</span><span class="p">(</span><span class="n">pid</span><span class="p">).</span><span class="n">ProcessName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;? {pid}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// we might not have access to the process</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It is now time to process allocation events.</p>
<h2 id="collecting-allocation-details">Collecting allocation details</h2>
<p>The <strong>GCSampledObjectAllocationTraceData</strong> payload contains the size and count of instances since the last event. We just need to store them for the corresponding process:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnSampleObjectAllocation</span><span class="p">(</span><span class="n">GCSampledObjectAllocationTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">FilterOutEvent</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="n">GetProcessAllocations</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">.</span><span class="n">AddAllocation</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="kt">ulong</span><span class="p">)</span><span class="n">data</span><span class="p">.</span><span class="n">TotalSizeForTypeSample</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="kt">ulong</span><span class="p">)</span><span class="n">data</span><span class="p">.</span><span class="n">ObjectCountForTypeSample</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">            <span class="n">GetProcessTypeName</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">TypeID</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">string</span> <span class="n">GetProcessTypeName</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">typeID</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(!</span><span class="n">_processes</span><span class="p">.</span><span class="n">Types</span><span class="p">.</span><span class="n">TryGetValue</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="k">out</span> <span class="kt">var</span> <span class="n">mapping</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">typeID</span><span class="p">.</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">name</span> <span class="p">=</span> <span class="n">mapping</span><span class="p">[</span><span class="n">typeID</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kt">string</span><span class="p">.</span><span class="n">IsNullOrEmpty</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="p">?</span> <span class="n">typeID</span><span class="p">.</span><span class="n">ToString</span><span class="p">()</span> <span class="p">:</span> <span class="n">name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>AddAllocation()</strong> helper method simply accumulates these numbers for a given type in the <strong>ProcessAllocationInfo</strong> associated with the related process.</p>
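<p>The accumulation could look like the following minimal sketch. The member names (<code>AddAllocation</code>, <code>GetAllocations</code>, <code>Count</code>, <code>Size</code>, <code>TypeName</code>) come from the code shown in this post, but the dictionary-based storage, the constructors and the <code>Add</code> helper are assumptions made for illustration; the implementation in the repository may differ. The sketch is not thread-safe.</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Running totals for one type (sketch, not the repository code).
public class AllocationInfo
{
    public AllocationInfo(string typeName) { TypeName = typeName; }

    public string TypeName { get; }
    public ulong Count { get; private set; }
    public ulong Size { get; private set; }

    // Hypothetical helper: adds one sampled event to the totals.
    internal void Add(ulong size, ulong count)
    {
        Size += size;
        Count += count;
    }
}

// Per-process storage keyed by type name (assumed layout).
public class ProcessAllocationInfo
{
    private readonly Dictionary&lt;string, AllocationInfo&gt; _allocations =
        new Dictionary&lt;string, AllocationInfo&gt;();

    // Same argument order as the call in OnSampleObjectAllocation: size, count, type name.
    public void AddAllocation(ulong size, ulong count, string typeName)
    {
        if (!_allocations.TryGetValue(typeName, out var info))
        {
            info = new AllocationInfo(typeName);
            _allocations[typeName] = info;
        }

        info.Add(size, count);
    }

    public IEnumerable&lt;AllocationInfo&gt; GetAllocations()
    {
        return _allocations.Values;
    }
}
</code></pre>
<p>With this layout, two events for the same type name end up in a single <code>AllocationInfo</code> whose <code>Count</code> and <code>Size</code> are the sums of the sampled values.</p>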
<h2 id="displaying-theresults">Displaying the results</h2>
<p>When the profiling session ends, it is easy to show the allocated count and size per type:</p>
<p><img loading="lazy" src="/posts/2020-04-18_build-your-own-net/1_1zQgewOknfy2R_SHJfGkKQ.png"></p>
<p>The code uses LINQ to get the top allocations, sorted either by count or by size:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">ShowResults</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="n">ProcessAllocationInfo</span> <span class="n">allocations</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">sortBySize</span><span class="p">,</span> <span class="kt">int</span> <span class="n">topTypesLimit</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;Memory allocations for {name}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;---------------------------------------------------------&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;    Count        Size   Type&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;---------------------------------------------------------&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">IEnumerable</span><span class="p">&lt;</span><span class="n">AllocationInfo</span><span class="p">&gt;</span> <span class="n">types</span> <span class="p">=</span> <span class="p">(</span><span class="n">sortBySize</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">?</span> <span class="n">allocations</span><span class="p">.</span><span class="n">GetAllocations</span><span class="p">().</span><span class="n">OrderByDescending</span><span class="p">(</span><span class="n">a</span> <span class="p">=&gt;</span> <span class="n">a</span><span class="p">.</span><span class="n">Size</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">:</span> <span class="n">allocations</span><span class="p">.</span><span class="n">GetAllocations</span><span class="p">().</span><span class="n">OrderByDescending</span><span class="p">(</span><span class="n">a</span> <span class="p">=&gt;</span> <span class="n">a</span><span class="p">.</span><span class="n">Count</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">topTypesLimit</span> <span class="p">!=</span> <span class="p">-</span><span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">types</span> <span class="p">=</span> <span class="n">types</span><span class="p">.</span><span class="n">Take</span><span class="p">(</span><span class="n">topTypesLimit</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">allocation</span> <span class="k">in</span> <span class="n">types</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;{allocation.Count,9} {allocation.Size,11}   {allocation.TypeName}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Another usage could be a long-running monitoring system that shows the allocation rate: a nice complement to the other GC metrics. However, compared to other profilers, one important feature is missing: if an unexpected number of instances is created, how do you know which part of the code is responsible for the spike?</p>
<p>The next post will explain how to enhance such a sampled memory profiler with call stacks per sampled allocation.</p>
<hr>
<h2 id="resources">Resources</h2>
<ul>
<li>Source code available <a href="https://github.com/chrisnas/ClrEvents">on GitHub</a>.</li>
<li><a href="/posts/2018-12-15_spying-on-net-garbage/">Spying on .NET Garbage Collector with TraceEvent</a></li>
<li>Pavel Yosifovich — <a href="https://www.youtube.com/watch?v=TqS4OEWn6hQ">Writing a .NET Core cross-platform profiler in an hour</a></li>
<li><a href="https://github.com/microsoftarchive/clrprofiler/tree/master/CLRProfiler">Original Microsoft ClrProfiler source code and documentation</a></li>
</ul>
<hr>
<p><strong>Like what you read? Don’t forget to check out part 2 on this topic:</strong></p>
<p><a href="/posts/2020-05-18_build-your-own-net/"><strong>Build your own .NET memory profiler in C# — call stacks (2/2–1)</strong>
<em>This post explains how to get the call stack corresponding to the allocations with CLR events.</em></a></p>
<hr>
<p><strong>Interested in joining our journey? Check this out:</strong></p>
<p><a href="https://careers.criteo.com/working-in-R&amp;D"><strong>Product, Research &amp; Development | Criteo Careers</strong>
<em>Product, Research &amp; Development at Criteo. At Criteo, come and meet our teams and join our R &amp; D and also enjoy…</em></a></p>
]]></content:encoded></item><item><title>Debugging Wednesday at Criteo — Cancel this task!</title><link>https://chrisnas.github.io/posts/2020-02-21_debugging-wednesday-cancel-thi/</link><pubDate>Fri, 21 Feb 2020 08:18:08 +0000</pubDate><guid>https://chrisnas.github.io/posts/2020-02-21_debugging-wednesday-cancel-thi/</guid><description>Last Wednesday “Debugging” day for Kevin and myself. Let’s share with you these frustrating but interesting minutes.</description><content:encoded><![CDATA[<hr>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_g3mc-sEc56gtAUP_rR4HHA.jpeg"></p>
<h2 id="introduction">Introduction</h2>
<p>Last Wednesday was a great day for <a href="https://twitter.com/KooKiz">Kevin</a> and me: we spent a lot of time investigating why a test was failing. Let’s share these frustrating but interesting minutes with you.</p>
<p>One of our colleagues came to us because an integration test would get stuck in some specific conditions. Here is the simplified code of the service that is supposed to do some background processing until it is stopped:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">Service</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">CancellationTokenSource</span> <span class="n">_cancellationSource</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">Task</span> <span class="n">_backgroundProcessing</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">Task</span> <span class="n">_cleanup</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In the test, the service is created. Its constructor creates two <code>Task</code> instances (the background processing itself is irrelevant for this discussion) that are (1) started when the service starts and (2) canceled when the service is stopped. A <code>CancellationTokenSource</code> is used to cancel the tasks if needed.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">Service</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_cancellationSource</span> <span class="p">=</span> <span class="k">new</span> <span class="n">CancellationTokenSource</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">cancellationToken</span> <span class="p">=</span> <span class="n">_cancellationSource</span><span class="p">.</span><span class="n">Token</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">_backgroundProcessing</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Task</span><span class="p">(</span><span class="n">DoStuffInTheBackground</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="n">cancellationToken</span><span class="p">,</span> <span class="n">TaskCreationOptions</span><span class="p">.</span><span class="n">LongRunning</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">_cleanup</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Task</span><span class="p">(</span><span class="n">DoStuffInTheBackground</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">        <span class="n">cancellationToken</span><span class="p">,</span> <span class="n">TaskCreationOptions</span><span class="p">.</span><span class="n">LongRunning</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>However, in the specific conditions of this test, the <code>Start</code> method would never be called, skipping straight to the <code>Stop</code> method.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">void</span> <span class="n">Start</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_backgroundProcessing</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">_cleanup</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">Stop</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_cancellationSource</span><span class="p">.</span><span class="n">Cancel</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">_backgroundProcessing</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">await</span> <span class="n">_cleanup</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The task is never started because the test skips the <code>Service.Start()</code> method, but it should transition to the “Canceled” state as soon as the <code>CancellationTokenSource</code> associated with the <code>CancellationToken</code> passed to the <code>Task</code> gets canceled. In that particular situation, we expect the <code>await _backgroundProcessing</code> code to return immediately in the <code>Stop()</code> method.</p>
<p>In our colleague’s Visual Studio, the debugger never came back from <code>await _backgroundProcessing</code>, indicating that the task never completed…</p>
<h2 id="reproduce-theproblem">Reproduce the problem</h2>
<p>I wanted to rule out any side effect related to our test framework or custom Criteo libraries and confirm my understanding of task cancellation, so I wrote a small console application targeting the same .NET Framework 4.7.2 with the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">static</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">ReproduceWorking</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">source</span> <span class="p">=</span> <span class="k">new</span> <span class="n">CancellationTokenSource</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">token</span> <span class="p">=</span> <span class="n">source</span><span class="p">.</span><span class="n">Token</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">t</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Task</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;Done&#34;</span><span class="p">),</span> <span class="n">token</span><span class="p">,</span> <span class="n">TaskCreationOptions</span><span class="p">.</span><span class="n">LongRunning</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">source</span><span class="p">.</span><span class="n">Cancel</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">try</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">await</span> <span class="n">t</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">catch</span> <span class="p">(</span><span class="n">TaskCanceledException</span> <span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And guess what?
I got the expected <code>TaskCanceledException</code>:</p>
<blockquote>
<p>System.Threading.Tasks.TaskCanceledException: A task was canceled.
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at TaskCancellation.Program.d__2.MoveNext()</p>
</blockquote>
<p>The next step was to double-check my understanding of how tasks behave with regard to cancellation, so I set a breakpoint right after the task is instantiated:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_lNlSNvggMbjwA7mAC33LPA.png"></p>
<p>And its status is <strong>Created</strong>.</p>
<p>Doing the same after <code>Cancel()</code> is called on the <code>CancellationTokenSource</code> gives the expected <strong>Canceled</strong> status:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_uDQ7b1Eq2QYYmVdT1jSn1Q.png"></p>
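<p>The same check can be done in code instead of in the debugger. The following sketch reads <code>Task.Status</code> before and after the cancellation of a task that is never started; the <code>StatusCheck</code> class and its <code>Observe</code> method are hypothetical names used only for this illustration.</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Threading;
using System.Threading.Tasks;

public static class StatusCheck
{
    // Returns the status of a never-started task before and after its token is canceled.
    public static (TaskStatus Before, TaskStatus After) Observe()
    {
        var source = new CancellationTokenSource();

        // Same constructor overload as in the console repro: (Action, CancellationToken, TaskCreationOptions).
        var t = new Task(
            () =&gt; Console.WriteLine("Done"), source.Token, TaskCreationOptions.LongRunning);

        var before = t.Status;   // the task has not been started yet
        source.Cancel();
        var after = t.Status;    // the task observed the cancellation of its token

        return (before, after);
    }
}
</code></pre>
<p>In line with what the debugger shows in the console repro, <code>Before</code> is <strong>Created</strong> and <code>After</code> is <strong>Canceled</strong>.</p>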
<p>So, let’s do the same when debugging the test code:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_v90VjEEUTCNHm3Xre_zeHQ.png"></p>
<p>It gives the same <strong>Created</strong> status after the task creation but after canceling the source:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_iOYZBxNZOLvrxarbcbj6Uw.png"></p>
<p>…the status does not switch to <strong>Canceled</strong> like in my Console repro!</p>
<h2 id="looking-for-cancellation">Looking for cancellation</h2>
<p>The only thing that came to my mind was: maybe the cancellation token is not taken into account. However, a <code>CancellationToken</code> is just a struct that keeps a reference to its <code>CancellationTokenSource</code>. It should be easy in the Visual Studio debugger to double-check that our task keeps track of the token somewhere and goes back to the cancellation source. Well… the token is not kept as a field of the task but stored inside the <code>m_contingentProperties</code> field <a href="https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs,674">deep in the task construction code path</a>.</p>
<p>Let’s look at the value in the <strong>QuickWatch</strong> pane after the cancellation source gets canceled:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_rJgSvnb9o7_Zut1D0f4fjg.png"></p>
<p>It sounds like the cancellation token is not canceled… But if we look at the source <code>Token</code> property of our canceled <code>CancellationTokenSource</code>,</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_AduGgofUrRBXhEHlqYYueA.png"></p>
<p>we don’t have the same value for <code>IsCancellationRequested</code>!</p>
<p>It’s as if we were not looking at the same cancellation source… To find out, we just have to compare the reference to our <code>CancellationTokenSource</code> with the one we see in the <code>m_contingentProperties</code> token of the task. To achieve that, we can copy the expression from QuickWatch, paste it into the <strong>Debug | Windows | Memory…</strong> pane and press ENTER to get the address where the object is stored in memory:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_UVx4QML3o6xp4Ecsu6cnbw.png"></p>
<p>After having done the same with <code>_cancellationSource</code>, I did not get the same address:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_cPXCRetw7iGMYQ6g1tMpcA.png"></p>
<p>It means that we were dealing with two different instances of <code>CancellationTokenSource</code>.</p>
<p>But this might be too C++ish for you… Kevin prefers leveraging the <em>Make Object ID</em> feature of the C# debugger. You simply right-click the Data Tip of the cancellation source and select <strong>Make Object ID</strong>:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_RCJafBYT7icR6YFQ6tkxdA.png"></p>
<p>Once this is done, a numeric identifier is displayed for this instance in Data Tip and any Watch window (#1 in this screenshot):</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_CTf7RGYgX2otS5K6rH9azQ.png"></p>
<p>When we looked at the cancellation source of the token stored by the task,</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_aHnPvqWuL5sM0cj9ztyA4A.png"></p>
<p>we didn’t see any ID so it was not the same object.</p>
<p>So we decided to use Make Object ID on this <code>m_source</code> that became marked as #2.</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_KZ48iXU0TobSm-A3YFUTIQ.png"></p>
<p>And when we looked at the <code>m_source</code> of the second cleanup <code>Task</code>, we realized that it was the same object but not the one we created!</p>
<p>We started to think that we were going crazy. So, let’s restart from the beginning and follow the <code>CancellationTokenSource</code> from its creation because we are sure that we passed a valid cancellation token (linked to this source) to the task constructor. Or… Did we? The QuickWatch gives a different answer just after the task gets created compared to what we’ve seen already: a token with an empty source property now!</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_z9hOeNeJ7_o6_qX08bwWaw.png"></p>
<p>I closed the QuickWatch pane and reopened it for Kevin to confirm. And like in a nightmare, the source was not null anymore…</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_l_2KlUeedUlG42RAej2jjA.png"></p>
<p>Visual Studio must be responsible for that weird behavior!</p>
<p>Kevin remembered the <em>“Enable property evaluation”</em> setting in the Options dialog. If it is checked (which is the default), the debugger fetches the value of each instance field and then calls the getter of each property in order to display its value.</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_XqOydO0OsS0noacmEDpEdg.png"></p>
<p>However, if you uncheck it, only the fields are displayed. So in our case, we then always got a null <code>m_contingentProperties</code> field (and, as expected, no properties were displayed):</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_33vZaFRFh1ii2X2APgaeXw.png"></p>
<p>The <code>m_contingentProperties</code> field is initialized in <code>EnsureContingentPropertiesInitialized()</code> when called by <code>AssignCancellationToken()</code> from the <code>TaskConstructorCore()</code> helper used by the <code>Task</code> constructor, but that did not seem to be the case here because the field was definitely <strong>null</strong>…</p>
<p>Kevin decided to set a new breakpoint on the <code>CancellationTokenSource</code> constructor (more on how to set a breakpoint on a .NET Framework method soon) to see where the one shown in the debugger was created, but the breakpoint was never hit. So the <code>CancellationTokenSource</code> #2 must have been created even before our own was created by our code. In fact, a static <code>CancellationTokenSource</code> is created and is set to <code>m_source</code> when <code>InitializeDefaultSource()</code> gets called by one of the getters. This explains why we saw the same instance #2 in the tokens of both tasks.</p>
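<p>The effect of a getter with a side effect can be sketched with a small class. <code>LazyHolder</code> is a hypothetical type written for this illustration; it is not the framework’s code, it only mimics the pattern of a field that stays null until a property getter assigns a shared static default to it.</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical type, for illustration only.
public sealed class LazyHolder
{
    // One default instance shared by every holder.
    private static readonly object s_default = new object();

    // Stays null until the Source getter runs.
    private object _source;

    // Reads the field without touching it.
    public bool IsInitialized
    {
        get { return _source != null; }
    }

    // Reading this property changes the state of the object.
    public object Source
    {
        get
        {
            if (_source == null)
            {
                _source = s_default;
            }

            return _source;
        }
    }
}
</code></pre>
<p>A debugger that evaluates properties calls <code>Source</code> while displaying the object, so <code>_source</code> appears to be set although the program itself never assigned it. Two different holders then show the same shared instance, much like the #2 object seen in both tasks.</p>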
<p>To sum up, we were now sure that the passed token was not “received” by the <code>Task</code>.</p>
<h2 id="eureka">Eureka!</h2>
<p>Maybe there is a magic trick done by the .NET Framework to lazily set the token source after the creation of the task. However, we did not find such code in <a href="https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs">the .NET Framework</a> and this is not what we see in our repro.</p>
<p>Back to the basics: are we sure that we are executing the code we think is executed? We looked for mscorlib in the <strong>Debug | Windows | Modules</strong> pane,</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_dL92Y22KRoMBJVlGaGefgQ.png"></p>
<p>and we opened it with a decompiler: the code of the methods called during the <code>Task</code> construction was the same as the one shown in <a href="https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs">https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs</a>.</p>
<p>Next, in order to better follow the execution and the passing of parameters (including our token), we decided to set a breakpoint on the <code>Task</code> <a href="https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs,590">internal method responsible</a> for its initialization:</p>
<pre tabindex="0"><code>internal void TaskConstructorCore(object action, object state, CancellationToken cancellationToken, TaskCreationOptions creationOptions, InternalTaskOptions internalOptions, TaskScheduler scheduler)
</code></pre><p>In the <strong>Debug | Windows | Breakpoints</strong> pane, click <strong>New</strong> | <strong>Function Breakpoint…</strong></p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_eODuofmyha3AvYYQ5hL52g.png"></p>
<p>and type the full name of the method. This works even for a method of a class defined in a .NET Framework assembly for which you do not have the source code:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_WhyV6fYlMtw7qbTGQoOSWQ.png"></p>
<p>We checked that the breakpoints were correctly set (i.e. no typo in the full name) by looking at the filled red circle:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_csd6pFIfQdTH853g1A8Uew.png"></p>
<p>The <code>cancellationToken</code> parameter should contain the token that we passed at <code>Task</code> creation. Unfortunately, the QuickWatch pane displayed a “cannot read memory” error that we had never seen in Visual Studio before!</p>
<p>At that point, we thought we were doomed, but we looked at the <strong>Call Stack</strong> pane and realized that the code was calling <a href="https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs,505">the wrong <strong>Task</strong> constructor</a>:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_MC5ZN4NUzKnGIiQTufUPbw.png"></p>
<pre tabindex="0"><code>public Task(Action&lt;object&gt; action, object state, TaskCreationOptions creationOptions)
</code></pre><p>Its signature is compatible with our code:</p>
<pre tabindex="0"><code>_backgroundProcessing = new Task(DoStuffInTheBackground, cancellationToken, TaskCreationOptions.LongRunning);
</code></pre><p>and this is why the compiler did not complain.</p>
<p>Our <code>CancellationToken</code> was passed as the <code>state</code> parameter, which is given directly to our <code>DoStuffInTheBackground</code> <code>Action</code>: the created <code>Task</code> had no idea that it was supposed to be its <code>CancellationToken</code>.</p>
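<p>The pitfall can be reproduced in isolation. Here is a minimal sketch (the <code>TaskOverloadDemo</code> class and its <code>Run</code> method are hypothetical names, not part of our original repro). It relies on the fact that a <code>Task</code> constructed with an already cancelled token ends up in the <code>Canceled</code> state right away, which makes it easy to tell whether the token was really received:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper written for this illustration only
public static class TaskOverloadDemo
{
    public static (bool StateOnly, bool WithToken, bool StateIsToken) Run()
    {
        using (var cts = new CancellationTokenSource())
        {
            cts.Cancel();
            CancellationToken token = cts.Token;

            // Task(Action&lt;object&gt;, object, TaskCreationOptions):
            // the token is boxed and only becomes the "state" of the callback
            var stateOnly = new Task(state =&gt; { }, token, TaskCreationOptions.LongRunning);

            // Task(Action&lt;object&gt;, object, CancellationToken, TaskCreationOptions):
            // the third argument is the token bound to the task itself
            var withToken = new Task(state =&gt; { }, token, token, TaskCreationOptions.LongRunning);

            // the first task only sees the token as its AsyncState
            bool stateIsToken = stateOnly.AsyncState is CancellationToken t &amp;&amp; t == token;

            return (stateOnly.IsCanceled, withToken.IsCanceled, stateIsToken);
        }
    }
}
</code></pre>
<p>With this sketch, <code>StateOnly</code> should be <strong>false</strong> (the task never received the token) while <code>WithToken</code> and <code>StateIsToken</code> should be <strong>true</strong>.</p>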
<p>Note that if we had noticed the <strong>Auto Completion</strong> (Ctrl + Shift + Space) hint, we might have figured out the root cause much sooner…</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_CWW78rrNIhnbsLqCi0zniw.png"></p>
<p>The fix was straightforward: use <a href="https://referencesource.microsoft.com/#mscorlib/system/threading/Tasks/Task.cs,533">the right constructor</a>:</p>
<pre tabindex="0"><code>public Task(Action&lt;object&gt; action, object state, CancellationToken cancellationToken, TaskCreationOptions creationOptions)
</code></pre><p>that accepts both a state for the callback and a <code>CancellationToken</code> for the <code>Task</code> to create:</p>
<pre tabindex="0"><code>_backgroundProcessing = new Task(DoStuffInTheBackground, cancellationToken, cancellationToken, TaskCreationOptions.LongRunning);
</code></pre><p>Under the debugger, we validated that the source was now the expected one:</p>
<p><img loading="lazy" src="/posts/2020-02-21_debugging-wednesday-cancel-thi/1_43MDI_vnkwcGjiLRoxj0qQ.png"></p>
<p>and the test did not hang anymore.</p>
<p>If we had been able, from the beginning, to step into .NET Framework compiled code as we do with the JetBrains ReSharper integration in Visual Studio, we would have found the issue almost immediately. Thankfully, Microsoft has just announced <a href="https://devblogs.microsoft.com/visualstudio/decompilation-of-c-code-made-easy-with-visual-studio?WT.mc_id=DT-MVP-5003325">decompilation of C# code made easy with Visual Studio</a>.</p>
<p>We wish we had it last Wednesday…</p>
<hr>
<p><strong>Interested in reading more about Christophe’s &amp; Kevin’s work? Check out their latest articles:</strong></p>
<p><a href="https://medium.com/criteo-labs/build-your-own-net-memory-profiler-in-c-allocations-1-2-9c9f0c86cefd"><strong>Build your own .NET memory profiler in C#</strong>
<em>This post explains how to collect allocation details by writing your own memory profiler in C#.</em></a></p>
<p><a href="https://medium.com/criteo-labs/switching-back-to-the-ui-thread-in-wpf-uwp-in-modern-c-5dc1cc8efa5e"><strong>Switching back to the UI thread in WPF/UWP, in modern C#</strong>
<em>Leveraging the async machinery to transparently switch to the UI thread when needed</em></a></p>
<hr>
<p><strong>If you are looking for a change and would love to work with these two, head over to our careers page and let us know if there is something that sounds like you!</strong></p>
<p><a href="https://careers.criteo.com/working-in-R&amp;D"><strong>Product, Research &amp; Development | Criteo Careers</strong>
<em>Come and meet our teams …</em></a></p>
]]></content:encoded></item><item><title>How to expose your custom counters in .NET Core</title><link>https://chrisnas.github.io/posts/2019-10-17_how-to-expose-your/</link><pubDate>Thu, 17 Oct 2019 12:42:17 +0000</pubDate><guid>https://chrisnas.github.io/posts/2019-10-17_how-to-expose-your/</guid><description>This post shows how to code your own counters: you’ll get the count and duration of ASP.NET requests processed with(out) GC as example.</description><content:encoded><![CDATA[<hr>
<p>This post of the series explains how to implement your own counters.</p>
<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<p>Part 3: <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">CLR Threading events with TraceEvent</a>.</p>
<p>Part 4: <a href="/posts/2018-12-15_spying-on-net-garbage/">Spying on .NET Garbage Collector with TraceEvent</a>.</p>
<p>Part 5: <a href="/posts/2019-02-12_building-your-own-java/">Building your own Java GC logs in .NET</a></p>
<p>Part 6: <a href="/posts/2019-05-28_spying-on-net-garbage/">Spying on .NET Core Garbage Collector with .NET Core EventPipes</a></p>
<p>Part 7: <a href="/posts/2019-07-23_net-core-counters-internals/">.NET Core Counters internals: how to integrate counters in your monitoring pipeline</a></p>
<h2 id="introduction">Introduction</h2>
<p>The <strong>EventPipe</strong> counters are the .NET Core replacement for Windows performance counters. In the <a href="/posts/2019-07-23_net-core-counters-internals/">previous post</a>, I explained how to listen to CLR event pipes to get the counters’ values over time, both on Windows and Linux. This post shows you how easy it is to provide your own counters via the same infrastructure.</p>
<p>The example I’m using is based on a real-world case we had to investigate at Criteo. We needed to correlate request duration with garbage collections, so we decided to add new metrics to our testing dashboard: number and duration of requests but split between those processed without being interrupted by a GC and the others.</p>
<p>For the sake of the ASP.NET Core code example, a <a href="https://docs.microsoft.com/en-us/aspnet/core/fundamentals/middleware/write?WT.mc_id=DT-MVP-5003325?view=aspnetcore-3.0">dedicated middleware</a> is created: it simply measures the time spent processing a request and checks whether the count of garbage collections changed between the start and the end of that processing:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span><span class="lnt">46
</span><span class="lnt">47
</span><span class="lnt">48
</span><span class="lnt">49
</span><span class="lnt">50
</span><span class="lnt">51
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">RequestMetricsMiddleware</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="n">RequestDelegate</span> <span class="n">_next</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">RequestMetricsMiddleware</span><span class="p">(</span><span class="n">RequestDelegate</span> <span class="n">next</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_next</span> <span class="p">=</span> <span class="n">next</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">async</span> <span class="n">Task</span> <span class="n">InvokeAsync</span><span class="p">(</span><span class="n">HttpContext</span> <span class="n">context</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// get the count of GCs before processing the request</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">collectionCountBeforeProcessingTheRequest</span> <span class="p">=</span> <span class="n">GetCurrentCollectionCount</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">sw</span> <span class="p">=</span> <span class="n">Stopwatch</span><span class="p">.</span><span class="n">StartNew</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">try</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// Call the next delegate/middleware in the pipeline</span>
</span></span><span class="line"><span class="cl">            <span class="k">await</span> <span class="n">_next</span><span class="p">(</span><span class="n">context</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">finally</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// compare the counter of GCs after processing the request</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// if the count changed, a garbage collection occurred during the processing </span>
</span></span><span class="line"><span class="cl">            <span class="c1">// and might have slowed it down and maybe reaching SLA limit: this could </span>
</span></span><span class="line"><span class="cl">            <span class="c1">// explain 9x-percentile in slow requests for example</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="p">(</span><span class="n">GetCurrentCollectionCount</span><span class="p">()</span> <span class="p">-</span> <span class="n">collectionCountBeforeProcessingTheRequest</span> <span class="p">!=</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="c1">// update with collection metric</span>
</span></span><span class="line"><span class="cl">                <span class="n">RequestCountersEventSource</span><span class="p">.</span><span class="n">Instance</span><span class="p">.</span><span class="n">AddRequestWithGcDuration</span><span class="p">(</span><span class="n">sw</span><span class="p">.</span><span class="n">ElapsedMilliseconds</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="k">else</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="c1">// update without collection metric</span>
</span></span><span class="line"><span class="cl">                <span class="n">RequestCountersEventSource</span><span class="p">.</span><span class="n">Instance</span><span class="p">.</span><span class="n">AddRequestWithoutGcDuration</span><span class="p">(</span><span class="n">sw</span><span class="p">.</span><span class="n">ElapsedMilliseconds</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">int</span> <span class="n">GetCurrentCollectionCount</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">int</span> <span class="n">count</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">GC</span><span class="p">.</span><span class="n">MaxGeneration</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">count</span> <span class="p">+=</span> <span class="n">GC</span><span class="p">.</span><span class="n">CollectionCount</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">count</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The interesting part is in the <code>RequestCountersEventSource</code> implementation.</p>
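<p>Before moving on, the GC detection idea used by the middleware can be tried outside of ASP.NET Core. The following is only a sketch: the <code>GcDetector</code> class and its method names are made up for the illustration, and, unlike the middleware, it sums all generations up to and including <code>GC.MaxGeneration</code>:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical helper written for this illustration only
public static class GcDetector
{
    // Sum of the collection counts of every generation
    public static int GetTotalCollectionCount()
    {
        int count = 0;
        for (int generation = 0; generation &lt;= GC.MaxGeneration; generation++)
        {
            count += GC.CollectionCount(generation);
        }

        return count;
    }

    // Returns true if at least one garbage collection was counted
    // while the given work was running
    public static bool RunAndDetectGc(Action work)
    {
        int before = GetTotalCollectionCount();
        work();
        return GetTotalCollectionCount() != before;
    }
}
</code></pre>
<p>For example, <code>GcDetector.RunAndDetectGc(() =&gt; GC.Collect())</code> returns <strong>true</strong> because a collection is forced during the work. Keep in mind that the counts are process-wide: a GC triggered by another thread is detected too, which is exactly what the middleware wants.</p>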
<h2 id="use-an-eventsource-luke">Use an EventSource, Luke!</h2>
<p>As explained in the previous post, an <code>EventSource</code> instance is used as the “server” part of the <strong>EventPipe</strong> communication channel. It exposes a name that identifies it and, more importantly, that is used as the <em>provider</em> name to listen to it with <strong>dotnet-trace</strong>, <strong>dotnet-counters</strong>, or your own listener.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[EventSource(Name = RequestCountersEventSource.SourceName)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">RequestCountersEventSource</span> <span class="p">:</span> <span class="n">EventSource</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// this name will be used as &#34;provider&#34; name with dotnet-counters</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ex: dotnet-counters monitor -p &lt;pid&gt; Sample.RequestCounters</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//</span>
</span></span><span class="line"><span class="cl">    <span class="kd">const</span> <span class="kt">string</span> <span class="n">SourceName</span> <span class="p">=</span> <span class="s">&#34;Sample.RequestCounters&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">RequestCountersEventSource</span><span class="p">()</span> 
</span></span><span class="line"><span class="cl">        <span class="p">:</span> <span class="k">base</span><span class="p">(</span><span class="n">RequestCountersEventSource</span><span class="p">.</span><span class="n">SourceName</span><span class="p">,</span> <span class="n">EventSourceSettings</span><span class="p">.</span><span class="n">EtwSelfDescribingEventFormat</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// create the counters: they&#39;ll be bound to this event source + CounterGroup</span>
</span></span><span class="line"><span class="cl">        <span class="n">CreateCounters</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This name is exposed via an <code>EventSourceAttribute</code> that decorates your <code>EventSource</code>-derived class (you could also pass it to the constructor). The counters are created in the constructor through the <code>CreateCounters</code> helper.</p>
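<p>The middleware shown earlier accesses the event source through a static <code>Instance</code> property that does not appear in the excerpt above. Here is a minimal sketch of how the naming and the singleton could fit together; <code>MinimalCountersEventSource</code> and the <code>Sample.MinimalCounters</code> name are assumptions made for the illustration:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System.Diagnostics.Tracing;

// Hypothetical event source written for this illustration only:
// the provider name comes from the attribute
[EventSource(Name = "Sample.MinimalCounters")]
public sealed class MinimalCountersEventSource : EventSource
{
    // single instance shared by the whole application
    public static readonly MinimalCountersEventSource Instance = new MinimalCountersEventSource();

    // private constructor to prevent the creation of other instances
    private MinimalCountersEventSource()
    {
    }
}
</code></pre>
<p>With this sketch, <code>MinimalCountersEventSource.Instance.Name</code> is <code>Sample.MinimalCounters</code>: this is the provider name a listener would have to use.</p>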
<h2 id="pick-the-right-counterclass">Pick the right Counter class</h2>
<p>Before looking at the implementation of the <code>CreateCounters</code> method, you need to understand what kinds of counters are available to you. In the previous post, I mentioned that the CLR was using <em>Mean</em> (that provides mean, max, and min values over the update interval) and <em>Sum</em> (to increment a single value) kinds of counters. Note that dotnet-counters will only show the mean value for <em>Mean</em> counters.</p>
<p>In addition, the counters could either automatically poll the value from a callback (the method used by the CLR today), or your code could change a counter value by calling the <code>WriteMetric</code> method. The <code>EventCounter</code> class provides this helper and it does its best to compute the min/max/mean in <a href="https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Diagnostics/Tracing/EventCounter.cs#L144">a lock-free way</a>.</p>
<p><img loading="lazy" src="/posts/2019-10-17_how-to-expose-your/1_8EnLvreQ7iS5gVkXhF8vfw.png"></p>
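<p>To make the <em>Mean</em> kind more concrete, here is a simplified sketch of the aggregation done over one update interval. It is not the <code>EventCounter</code> implementation: the <code>MeanAggregator</code> class is hypothetical and uses a plain <code>lock</code> instead of the lock-free approach mentioned above:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical class written for this illustration only
public sealed class MeanAggregator
{
    private readonly object _sync = new object();
    private int _count;
    private double _sum;
    private double _min = double.PositiveInfinity;
    private double _max = double.NegativeInfinity;

    // Equivalent of writing a new metric value during the interval
    public void Write(double value)
    {
        lock (_sync)
        {
            _count++;
            _sum += value;
            _min = Math.Min(_min, value);
            _max = Math.Max(_max, value);
        }
    }

    // Called at the end of each interval: returns the statistics
    // and resets the state for the next interval
    public (int Count, double Min, double Max, double Mean) Flush()
    {
        lock (_sync)
        {
            (int Count, double Min, double Max, double Mean) result = (0, 0.0, 0.0, 0.0);
            if (_count &gt; 0)
            {
                result = (_count, _min, _max, _sum / _count);
            }

            _count = 0;
            _sum = 0;
            _min = double.PositiveInfinity;
            _max = double.NegativeInfinity;
            return result;
        }
    }
}
</code></pre>
<p>After writing 10, 20 and 60, <code>Flush()</code> returns a count of 3, a min of 10, a max of 60 and a mean of 30; a second call to <code>Flush()</code> returns a count of 0 because the interval has been reset.</p>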
<p>The next question to answer is which one should you use.</p>
<p>In the case of the request with(out) GC example, I want to expose different metrics:</p>
<ul>
<li><em>Request count</em>: a <code>PollingCounter</code> will be used to expose an int field incremented each time a request is received.</li>
<li><em>Request count delta</em>: an <code>IncrementingPollingCounter</code> associated with the same int value will provide the delta (i.e., number of requests processed during an interval).</li>
<li><em>Request with GC and without GC counts</em>: two <code>PollingCounter</code> instances based on two int fields incremented when a request with (or without respectively) GC is processed.</li>
<li><em>Duration of requests with and without GC</em>: two <code>EventCounter</code> instances updated when requests are processed.</li>
</ul>
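<p>The difference between the first two bullets can be simulated without any tracing infrastructure. In this sketch, the <code>PollSimulator</code> class is hypothetical: it only mimics what is reported at each poll, the raw value for a polling counter and the delta since the previous poll for an incrementing one:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical class written for this illustration only
public sealed class PollSimulator
{
    private readonly Func&lt;double&gt; _provider;
    private double _previous;

    public PollSimulator(Func&lt;double&gt; provider)
    {
        _provider = provider;
    }

    // Polling counter flavor: the raw value returned by the callback
    public double PollRaw()
    {
        return _provider();
    }

    // Incrementing flavor: difference between the current value
    // and the value seen at the previous poll
    public double PollDelta()
    {
        double current = _provider();
        double delta = current - _previous;
        _previous = current;
        return delta;
    }
}
</code></pre>
<p>If the shared request count is 10 at the first poll and 25 at the second one, <code>PollRaw()</code> gives 10 then 25 while <code>PollDelta()</code> gives 10 then 15. This is why the same field can feed both counters.</p>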
<p>Here is the implementation of <code>CreateCounters</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">CreateCounters</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// the same request count can be used for two counters:</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// - raw request counter that will always increase</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// - increment counter that will automatically compute the delta</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//   between the current value and the value when the counter</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//   was previously sent</span>
</span></span><span class="line"><span class="cl">    <span class="n">_requestCount</span> <span class="p">??=</span> <span class="k">new</span> <span class="n">PollingCounter</span><span class="p">(</span><span class="s">&#34;request-count&#34;</span><span class="p">,</span> <span class="k">this</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_requestCountValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="n">DisplayName</span> <span class="p">=</span> <span class="s">&#34;Requests count&#34;</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="n">_requestCountDelta</span> <span class="p">??=</span> <span class="k">new</span> <span class="n">IncrementingPollingCounter</span><span class="p">(</span><span class="s">&#34;request-count-delta&#34;</span><span class="p">,</span> <span class="k">this</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_requestCountValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="n">DisplayName</span> <span class="p">=</span> <span class="s">&#34;New requests&#34;</span><span class="p">,</span> <span class="n">DisplayRateTimeScale</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TimeSpan</span><span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="m">1</span><span class="p">)</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// split the request counts between those for which a GC occurred or not </span>
</span></span><span class="line"><span class="cl">    <span class="c1">// during their processing</span>
</span></span><span class="line"><span class="cl">    <span class="n">_noGcRequestCount</span> <span class="p">??=</span> <span class="k">new</span> <span class="n">PollingCounter</span><span class="p">(</span><span class="s">&#34;no-gc-request-count&#34;</span><span class="p">,</span> <span class="k">this</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_noGcRequestCountValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="n">DisplayName</span> <span class="p">=</span> <span class="s">&#34;Requests (processed without GC) count&#34;</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="n">_withGcRequestCount</span> <span class="p">??=</span> <span class="k">new</span> <span class="n">PollingCounter</span><span class="p">(</span><span class="s">&#34;with-gc-request-count&#34;</span><span class="p">,</span> <span class="k">this</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">_withGcRequestsCountValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="n">DisplayName</span> <span class="p">=</span> <span class="s">&#34;Requests (processed during a GC) count&#34;</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// request duration counters (with or without GC happening during the processing)</span>
</span></span><span class="line"><span class="cl">    <span class="n">_noGcRequestDuration</span> <span class="p">??=</span> <span class="k">new</span> <span class="n">EventCounter</span><span class="p">(</span><span class="s">&#34;no-gc-request-duration&#34;</span><span class="p">,</span> <span class="k">this</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="n">DisplayName</span> <span class="p">=</span> <span class="s">&#34;Requests (processed without GC) duration in milli-seconds&#34;</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">_withGcRequestDuration</span> <span class="p">??=</span> <span class="k">new</span> <span class="n">EventCounter</span><span class="p">(</span><span class="s">&#34;with-gc-request-duration&#34;</span><span class="p">,</span> <span class="k">this</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="n">DisplayName</span> <span class="p">=</span> <span class="s">&#34;Requests (processed during a GC) duration in milli-seconds&#34;</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The request processing code of the ASP.NET Core middleware relies on the following helper methods to update the counters:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">internal</span> <span class="k">void</span> <span class="n">AddRequestWithoutGcDuration</span><span class="p">(</span><span class="kt">long</span> <span class="n">elapsedMilliseconds</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">IncRequestCount</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">Interlocked</span><span class="p">.</span><span class="n">Increment</span><span class="p">(</span><span class="k">ref</span> <span class="n">_noGcRequestCountValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// compute min/max/mean</span>
</span></span><span class="line"><span class="cl">    <span class="n">_noGcRequestDuration</span><span class="p">?.</span><span class="n">WriteMetric</span><span class="p">(</span><span class="n">elapsedMilliseconds</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">internal</span> <span class="k">void</span> <span class="n">AddRequestWithGcDuration</span><span class="p">(</span><span class="kt">long</span> <span class="n">elapsedMilliseconds</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">IncRequestCount</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">Interlocked</span><span class="p">.</span><span class="n">Increment</span><span class="p">(</span><span class="k">ref</span> <span class="n">_withGcRequestsCountValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// compute min/max/mean</span>
</span></span><span class="line"><span class="cl">    <span class="n">_withGcRequestDuration</span><span class="p">?.</span><span class="n">WriteMetric</span><span class="p">(</span><span class="n">elapsedMilliseconds</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">IncRequestCount</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Interlocked</span><span class="p">.</span><span class="n">Increment</span><span class="p">(</span><span class="k">ref</span> <span class="n">_requestCountValue</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And that’s it!</p>
<h2 id="how-to-get-these-custom-counters">How to get these custom counters?</h2>
<p>The controller of the ASP.NET Core sample application triggers (or not) garbage collections based on the parameter passed via the URL:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[Route(&#34;api/[controller]</span><span class="s">&#34;)]
</span></span></span><span class="line"><span class="cl"><span class="na">[ApiController]</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">RequestController</span> <span class="p">:</span> <span class="n">ControllerBase</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// GET: api/Request/5</span>
</span></span><span class="line"><span class="cl"><span class="na">    [HttpGet(&#34;{id}&#34;)]</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">Get</span><span class="p">(</span><span class="kt">int</span> <span class="n">id</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">id</span> <span class="p">==</span> <span class="p">-</span><span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="s">$&#34;pid = {Process.GetCurrentProcess().Id}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">((</span><span class="n">id</span> <span class="p">&gt;=</span> <span class="m">0</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">id</span> <span class="p">&lt;=</span> <span class="m">2</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">GC</span><span class="p">.</span><span class="n">Collect</span><span class="p">(</span><span class="n">id</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="s">$&#34;triggered GC {id}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">id</span> <span class="p">&lt;=</span> <span class="m">10</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// trigger a given number of GCs up to 10</span>
</span></span><span class="line"><span class="cl">            <span class="n">TriggerGCs</span><span class="p">(</span><span class="n">id</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="s">$&#34;triggered {id} garbage collections&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="s">$&#34;value = {id}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">TriggerGCs</span><span class="p">(</span><span class="kt">int</span> <span class="n">count</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">current</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">current</span> <span class="p">&lt;</span> <span class="n">count</span><span class="p">;</span> <span class="n">current</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">GC</span><span class="p">.</span><span class="n">Collect</span><span class="p">(</span><span class="m">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>As explained earlier, it is possible to see the counter values with <strong>dotnet-counters</strong> by using the event source name as a provider with the following command line:</p>
<blockquote>
<p>dotnet counters monitor -p &lt;pid&gt; <strong>Sample.RequestCounters</strong></p>
</blockquote>
<p>Then if you trigger a few requests with and without GC, you should see the numbers change:</p>
<p><img loading="lazy" src="/posts/2019-10-17_how-to-expose-your/1_M-GjqpcH8BL4oG02sk3sEg.png"></p>
<p>The code available on <a href="https://github.com/chrisnas/ClrEvents">GitHub</a> has been updated to provide the middleware and the event source classes that demonstrate how to expose custom .NET Core counters.</p>
]]></content:encoded></item><item><title>.NET Core Counters internals: how to integrate counters in your monitoring pipeline</title><link>https://chrisnas.github.io/posts/2019-07-23_net-core-counters-internals/</link><pubDate>Tue, 23 Jul 2019 15:30:37 +0000</pubDate><guid>https://chrisnas.github.io/posts/2019-07-23_net-core-counters-internals/</guid><description>This post shows how to easily get .NET Core counters. Their internals are also detailed for a better understanding of usage/limits</description><content:encoded><![CDATA[<hr>
<p>This post of the series digs into the implementation details of the new .NET Core counters.</p>
<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<p>Part 3: <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">CLR Threading events with TraceEvent</a>.</p>
<p>Part 4: <a href="/posts/2018-12-15_spying-on-net-garbage/">Spying on .NET Garbage Collector with TraceEvent</a>.</p>
<p>Part 5: <a href="/posts/2019-02-12_building-your-own-java/">Building your own Java GC logs in .NET</a></p>
<p>Part 6: <a href="/posts/2019-05-28_spying-on-net-garbage/">Spying on .NET Core Garbage Collector with .NET Core EventPipes</a></p>
<h2 id="introduction">Introduction</h2>
<p>As explained in <a href="/posts/2018-12-06_in-process-clr-event/">a previous post</a>, <a href="/posts/2018-12-06_in-process-clr-event/">.NET Core 2.2 introduced</a> the <a href="https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventlistener?WT.mc_id=DT-MVP-5003325?view=netcore-2.2">EventListener class</a> to receive in-proc CLR events both on Windows and Linux. Starting with .NET Core 3.0 Preview 6, the <strong>EventPipe</strong>-based infrastructure now makes it possible to get these events from another process. The <a href="https://github.com/dotnet/diagnostics">diagnostics repository</a> contains the cross-platform tools leveraging this infrastructure:</p>
<ul>
<li><a href="https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-dump-instructions.md"><strong>dotnet-dump</strong></a>: take memory snapshot and allow analysis based on most SOS commands</li>
<li><a href="https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-trace-instructions.md"><strong>dotnet-trace</strong></a>: collect events emitted by the Core CLR and generate trace file to be analyzed with Perfview</li>
<li><a href="https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-counters-instructions.md"><strong>dotnet-counters</strong></a>: collect the metrics corresponding to some performance counters that used to be exposed by the .NET Framework</li>
</ul>
<p>At Criteo, our metrics are exposed in Grafana dashboards and it is interesting to figure out how the new counters are implemented and see how to fetch them via the <strong>EventPipe</strong> infrastructure. With this knowledge in hand, I’ve implemented helpers to let you get counters in less than 10 lines of code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">_counterMonitor</span> <span class="p">=</span> <span class="k">new</span> <span class="n">CounterMonitor</span><span class="p">(</span><span class="n">_pid</span><span class="p">,</span> <span class="n">GetProviders</span><span class="p">());</span>
</span></span><span class="line"><span class="cl"><span class="n">_counterMonitor</span><span class="p">.</span><span class="n">CounterUpdate</span> <span class="p">+=</span> <span class="c1">// receive the value of one counter after the other</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">Task</span> <span class="n">monitorTask</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Task</span><span class="p">(()</span> <span class="p">=&gt;</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_counterMonitor</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span><span class="line"><span class="cl"><span class="n">monitorTask</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>At the end of this post you will be able to very easily integrate any counter to your own monitoring pipeline!</p>
<h2 id="net-core-replacement-fornet-framework-performance-counters">.NET Core replacement for .NET Framework Performance Counters</h2>
<p>With .NET Core being cross-platform, performance counters were gone and, as explained in the previous posts of the series, CLR events were the only way to get metrics about how your .NET Core applications were behaving. However, with .NET Core 3.0, it is now possible to view a few metrics thanks to the <strong>dotnet-counters</strong> tool.</p>
<p>You can download and install the tools automatically if you have installed .NET Core SDK 2.1+. Microsoft is currently working to provide other ways to directly download the tools binaries without having to install the SDK or recompile the diagnostics repository.</p>
<p>Use the following command line to install dotnet-counters:
<code>dotnet tool install --global dotnet-counters --version 3.0.0-preview7.19365.2</code></p>
<p>Note that you need to have the same version both for the Core CLR runtime and for the tools because, as you will soon see, the monitoring and the monitored applications are communicating via a dedicated protocol (that has changed between previews) on top of a transport layer different between Windows and Linux.</p>
<p>After the installation, run <code>dotnet counters monitor -p &lt;pid&gt;</code> to get a view of the counters that is refreshed every second.</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_tzmG5E7_XphPKWYrK_YNzg.png"></p>
<p>These counters are exposed by the <em>System.Runtime</em> provider and are detailed with the <code>list</code> argument:</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_vpUJf51QchDMQh9GPlWbRA.png"></p>
<p>This list is currently hard-coded in the <code>CreateKnownProviders</code> method. However, you are free to create your own provider and expose your application metrics as shown in <a href="https://github.com/dotnet/corefx/blob/master/src/System.Diagnostics.Tracing/documentation/EventCounterTutorial.md">this tutorial</a> (and in the forthcoming post). In addition, if you are using ASP.NET Core, starting from Preview 7, you can get a few counters from the “Microsoft.AspNetCore.Hosting” provider defined in <code>HostingEventSource.cs</code>.</p>
<h2 id="what-are-these-counters">What are these “counters”?</h2>
<p>Even though it is nice to have a console-based cross-platform tool to see the values of counters change, what would be the cost to get them into your own monitoring pipeline? For example, at Criteo, we are pushing our metrics to Graphite in order to get nice Grafana dashboards that show the evolution of metrics over time. In addition, it is also possible to define alerts based on thresholds for some metric values (when CPU &gt; 85% for more than 5 seconds for example).</p>
<p>In a nutshell, the dotnet-counters tool listens to another application via <strong>EventPipe</strong>. Unlike .NET Framework performance counters that are polled by the monitoring application, the counters are pushed by the monitored .NET Core process.</p>
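<p>To give an idea of what pushing a metric to Graphite could look like, here is a minimal sketch that formats one counter value for Graphite's plaintext protocol (one <code>path value timestamp</code> line per metric). The <code>GraphiteFormatter</code> class and the metric path used below are hypothetical names chosen for illustration; this is not the code used at Criteo:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Globalization;

// Hypothetical helper, not part of any library
public static class GraphiteFormatter
{
    // Builds one line of the Graphite plaintext protocol:
    //   &lt;metric path&gt; &lt;value&gt; &lt;unix timestamp in seconds&gt;\n
    public static string ToPlaintextLine(string metricPath, double value, DateTimeOffset timestamp)
    {
        if (string.IsNullOrWhiteSpace(metricPath))
            throw new ArgumentException("metric path is required", nameof(metricPath));

        // the invariant culture avoids ',' as decimal separator on some locales
        var formattedValue = value.ToString(CultureInfo.InvariantCulture);
        return $"{metricPath} {formattedValue} {timestamp.ToUnixTimeSeconds()}\n";
    }
}
</code></pre></div>
<p>A call such as <code>GraphiteFormatter.ToPlaintextLine("myapp.dotnet.cpu_usage", 12.5, DateTimeOffset.UtcNow)</code> returns a line that could then be written to the socket connected to your Graphite endpoint.</p>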
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_SuOEY89mW73PiMJkgHZsng.png"></p>
<p>In terms of implementation, these counters are values that you could get via .NET internal or public APIs if you were running in-proc as shown <a href="https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/src/System/Diagnostics/Eventing/RuntimeEventSource.cs#L47">in RuntimeEventSource.cs</a>:</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_NaWnrko0FZBfR1IPpzY0iw.png"></p>
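<p>As an illustration of the public APIs part, the following sketch reads a few of these values in-proc. The <code>InProcMetrics</code> class is a hypothetical name, the mapping to counter names given in the comments is an assumption, and <code>ThreadPool.ThreadCount</code> requires .NET Core 3.0 or later:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Threading;

// Hypothetical helper: reads runtime values through public APIs only
public static class InProcMetrics
{
    public static (int Gen0Count, long HeapBytes, long WorkingSetBytes, int ThreadPoolThreads) Snapshot()
    {
        return (
            GC.CollectionCount(0),                          // assumed to back gen-0-gc-count
            GC.GetTotalMemory(forceFullCollection: false),  // assumed to back gc-heap-size
            Environment.WorkingSet,                         // assumed to back working-set
            ThreadPool.ThreadCount);                        // assumed to back threadpool-thread-count
    }
}
</code></pre></div>
<p>The difference with the counters is that nothing has to run in your own code: the runtime computes and pushes the values for you.</p>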
<p>Unlike most of the events that previous posts of this series presented, counters are metrics that are computed by the CLR in the monitored application. They are supposed to provide a set of values changing over time in the monitored application without impacting performance or flooding the listener client. I highly recommend taking a look at <a href="https://github.com/dotnet/diagnostics/issues/346">this issue</a> for a deeper discussion about <strong>EventCounters</strong> compared to regular events.</p>
<p>As of Preview 7, two types of counters are used:</p>
<ul>
<li><em>Mean</em>: supposed to contain a mean of all values during the polling interval with its min and max values. However, based on <a href="https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Diagnostics/Tracing/PollingCounter.cs#L70">the current implementation</a>, all contain only the current value.</li>
<li><em>Sum</em>: contains an increment between the previous value and the current one</li>
</ul>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_Qxhek7OZy1N-wfDNdKecpQ.png"></p>
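<p>To make the difference between the two types more concrete, here is a small sketch of the arithmetic behind each of them as described above. The <code>CounterMath</code> class is a hypothetical name and this is not the runtime implementation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the two counter types
public static class CounterMath
{
    // "Mean" type: aggregate all the values written during one interval
    public static (double Mean, double Min, double Max) Aggregate(IReadOnlyCollection&lt;double&gt; values)
    {
        if (values.Count == 0)
            return (0, 0, 0);

        return (values.Average(), values.Min(), values.Max());
    }

    // "Sum" type: report how much an ever-growing value increased
    // between the previous interval and the current one
    public static double Increment(double previous, double current)
    {
        return current - previous;
    }
}
</code></pre></div>
<p>For example, three request durations of 10, 20 and 60 ms written during one interval give a mean of 30 with a min of 10 and a max of 60, and a gen 0 collection count going from 12 to 15 gives an increment of 3.</p>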
<p>The question is now to figure out how to get the values of the counters.</p>
<h2 id="how-to-receive-the-counters">How to receive the counters?</h2>
<p>Like the Perfview tool that relies on the <strong>TraceEvent</strong> library, dotnet-counters uses an API exposed by the <strong>Microsoft.Diagnostics.Tools.RuntimeClient</strong> assembly. Note that it is currently <a href="https://github.com/dotnet/diagnostics/issues/343">not (yet) available from NuGet</a> so you need to recompile it from the <a href="https://github.com/dotnet/diagnostics/issues/343">diagnostics git repo</a>.</p>
<p>To receive counters, you need to create an <strong>EventPipe</strong> session that communicates via IPC (named pipes on Windows and domain sockets on Linux) with the CLR of the monitored process. Here is an excerpt of the <code>CounterMonitor.StartMonitoring</code> <a href="https://github.com/dotnet/diagnostics/blob/master/src/Tools/dotnet-counters/CounterMonitor.cs#L177">implementation</a> that connects and listens to counter events:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">configuration</span> <span class="p">=</span> <span class="k">new</span> <span class="n">SessionConfiguration</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">circularBufferSizeMB</span><span class="p">:</span> <span class="m">1000</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">outputPath</span><span class="p">:</span> <span class="s">&#34;&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">providers</span><span class="p">:</span> <span class="n">Trace</span><span class="p">.</span><span class="n">Extensions</span><span class="p">.</span><span class="n">ToProviders</span><span class="p">(</span><span class="n">providerString</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">binaryReader</span> <span class="p">=</span> <span class="n">EventPipeClient</span><span class="p">.</span><span class="n">CollectTracing</span><span class="p">(</span><span class="n">_processId</span><span class="p">,</span> <span class="n">configuration</span><span class="p">,</span> <span class="k">out</span> <span class="n">_sessionId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">EventPipeEventSource</span> <span class="n">source</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EventPipeEventSource</span><span class="p">(</span><span class="n">binaryReader</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">Dynamic</span><span class="p">.</span><span class="n">All</span> <span class="p">+=</span> <span class="n">ProcessEvents</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The important method call is <code>EventPipeClient.CollectTracing()</code> that returns a <code>Stream</code> from which an <code>EventPipeEventSource</code> instance gets created. This class has been added to <strong>TraceEvent</strong> so you can now leverage the event parsing infrastructure on top of <strong>EventPipe</strong>! As shown in <a href="/posts/2018-07-26_grab-etw-session-providers/">a previous post</a>, it is easy to attach a listener to the source <code>All</code> .NET event and get notified each time an event is received after the <code>Process</code> method is called.</p>
<p>A few parameters are given to <code>CollectTracing</code> via the <code>SessionConfiguration</code> object: the size of the circular buffer used by the CLR and no file path because we want a live session. The last one is supposed to filter which providers and counters you would like to listen to: it expects a list of <code>Provider</code> instances. This struct <a href="https://github.com/dotnet/diagnostics/blob/master/src/Microsoft.Diagnostics.Tools.RuntimeClient/Eventing/Provider.cs#L10">is created with a few parameters</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">struct</span> <span class="nc">Provider</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kd">public</span> <span class="n">Provider</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">ulong</span> <span class="n">keywords</span> <span class="p">=</span> <span class="kt">ulong</span><span class="p">.</span><span class="n">MaxValue</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">                        <span class="n">EventLevel</span> <span class="n">eventLevel</span> <span class="p">=</span> <span class="n">EventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">                        <span class="kt">string</span> <span class="n">filterData</span> <span class="p">=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>As we have already mentioned, the name of the provider is “<em>System.Runtime</em>” for the Core CLR counters. The keywords and event level are expected to have these max values. The filter data string starts with “<em>EventCounterIntervalSec=</em>” followed by the refresh interval in seconds. Internally, the CLR in the monitored application <a href="https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Diagnostics/Tracing/CounterGroup.cs#L135">is creating a timer</a> with that frequency to push the counters via <strong>EventPipe</strong> (more on this later).</p>
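<p>For comparison, the same <em>EventCounterIntervalSec</em> key can be passed as an argument by an in-proc <code>EventListener</code> when it enables a provider. The following minimal sketch assumes .NET Core 3.0 or later. The <code>RuntimeCountersListener</code> class, its callback and the <code>TryReadCounter</code> helper are hypothetical names, and the payload keys (<em>Name</em>, <em>Mean</em>, <em>Increment</em>) are assumed to be the ones emitted for the two counter types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

// Hypothetical in-proc listener for the System.Runtime counters
public sealed class RuntimeCountersListener : EventListener
{
    // callback receiving the counter name and its latest value
    private readonly Action&lt;string, double&gt; _onCounter;

    public RuntimeCountersListener(Action&lt;string, double&gt; onCounter)
    {
        _onCounter = onCounter;
    }

    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name != "System.Runtime")
            return;

        // ask the runtime to push the counters every second
        var args = new Dictionary&lt;string, string&gt; { ["EventCounterIntervalSec"] = "1" };
        EnableEvents(source, EventLevel.Verbose, EventKeywords.All, args);
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (eventData.EventName != "EventCounters" || eventData.Payload == null || eventData.Payload.Count == 0)
            return;

        if (eventData.Payload[0] is IDictionary&lt;string, object&gt; payload &amp;&amp;
            TryReadCounter(payload, out var name, out var value))
        {
            // the callback may not be set yet if an event arrives during construction
            _onCounter?.Invoke(name, value);
        }
    }

    // Extracts the name and the value of a counter from an event payload
    public static bool TryReadCounter(IDictionary&lt;string, object&gt; payload, out string name, out double value)
    {
        name = null;
        value = 0;

        if (!payload.TryGetValue("Name", out var rawName) || rawName == null)
            return false;

        // assumption: "Mean" counters carry a Mean field, "Sum" counters an Increment field
        if (!payload.TryGetValue("Mean", out var rawValue) &amp;&amp; !payload.TryGetValue("Increment", out rawValue))
            return false;

        name = rawName.ToString();
        value = Convert.ToDouble(rawValue);
        return true;
    }
}
</code></pre></div>
<p>Creating an instance with <code>new RuntimeCountersListener((name, value) =&gt; Console.WriteLine($"{name} = {value}"))</code> and keeping it alive should be enough to see the counters written to the console about every second.</p>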
<p>Here is a helper class to easily create your providers:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">CounterHelpers</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kd">static</span> <span class="n">Provider</span> <span class="n">MakeProvider</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">int</span> <span class="n">refreshIntervalInSec</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">filterData</span> <span class="p">=</span> <span class="n">BuildFilterData</span><span class="p">(</span><span class="n">refreshIntervalInSec</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="k">new</span> <span class="n">Provider</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="m">0xFFFFFFFF</span><span class="p">,</span> <span class="n">EventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span> <span class="n">filterData</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kd">static</span> <span class="kt">string</span> <span class="n">BuildFilterData</span><span class="p">(</span><span class="kt">int</span> <span class="n">refreshIntervalInSec</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">refreshIntervalInSec</span> <span class="p">&lt;</span> <span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">ArgumentOutOfRangeException</span><span class="p">(</span><span class="n">nameof</span><span class="p">(</span><span class="n">refreshIntervalInSec</span><span class="p">),</span> <span class="s">$&#34;must be at least 1 second&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="s">$&#34;EventCounterIntervalSec={refreshIntervalInSec}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that <code>dotnet-counters</code> allows you to pass a subset of the counters with the <em>System.Runtime[counter1,counter2,counter3]</em> syntax: events for all System.Runtime counters will be received but only these three will be displayed in the console.</p>
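<p>If you want to support the same kind of syntax in your own tool, the filtering could be sketched as follows. The <code>CounterFilterParser</code> class is a hypothetical helper, not the dotnet-counters implementation, and it treats an empty set as “display everything”:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical helper, not the dotnet-counters implementation
public static class CounterFilterParser
{
    // Splits "System.Runtime[cpu-usage,working-set]" into the provider name
    // and the set of counters to display (empty set = display everything)
    public static (string Provider, HashSet&lt;string&gt; Counters) Parse(string spec)
    {
        var counters = new HashSet&lt;string&gt;();
        int open = spec.IndexOf('[');
        if (open &lt; 0)
            return (spec.Trim(), counters);

        int close = spec.IndexOf(']', open);
        if (close &lt; 0)
            throw new FormatException($"missing closing bracket in '{spec}'");

        var names = spec.Substring(open + 1, close - open - 1)
                        .Split(',', StringSplitOptions.RemoveEmptyEntries);
        foreach (var name in names)
        {
            var trimmed = name.Trim();
            if (trimmed.Length &gt; 0)
                counters.Add(trimmed);
        }

        return (spec.Substring(0, open).Trim(), counters);
    }
}
</code></pre></div>
<p>The provider name is used to build the <code>Provider</code> instance and the set is checked when a counter event is received to decide whether or not it should be displayed.</p>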
<h2 id="show-time-for-counterevents">Show time for counter events!</h2>
<p>Next, the important part of the job takes place in the <code>source.Dynamic.All</code> event listener. Each new counter value is received in the payload of an event named “<em>EventCounters</em>”.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">ProcessEvents</span><span class="p">(</span><span class="n">TraceEvent</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">EventName</span><span class="p">.</span><span class="n">Equals</span><span class="p">(</span><span class="s">&#34;EventCounters&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;</span> <span class="n">countersPayload</span> <span class="p">=</span> <span class="p">(</span><span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;)(</span><span class="n">data</span><span class="p">.</span><span class="n">PayloadValue</span><span class="p">(</span><span class="m">0</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;</span> <span class="n">kvPairs</span> <span class="p">=</span> <span class="p">(</span><span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;)(</span><span class="n">countersPayload</span><span class="p">[</span><span class="s">&#34;Payload&#34;</span><span class="p">]);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">name</span> <span class="p">=</span> <span class="kt">string</span><span class="p">.</span><span class="n">Intern</span><span class="p">(</span><span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;Name&#34;</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">displayName</span> <span class="p">=</span> <span class="kt">string</span><span class="p">.</span><span class="n">Intern</span><span class="p">(</span><span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;DisplayName&#34;</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">counterType</span> <span class="p">=</span> <span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;CounterType&#34;</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">counterType</span><span class="p">.</span><span class="n">Equals</span><span class="p">(</span><span class="s">&#34;Sum&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">OnSumCounter</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">kvPairs</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">counterType</span><span class="p">.</span><span class="n">Equals</span><span class="p">(</span><span class="s">&#34;Mean&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">OnMeanCounter</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">kvPairs</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="k">else</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">$&#34;Unsupported counter type &#39;{counterType}&#39;&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>Name</code> and <code>DisplayName</code> values are self-explanatory. The <em>Sum</em>/<em>Mean</em> type is retrieved from <code>CounterType</code>.</p>
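<p>To make the nesting of the payload more concrete, here is a small sketch that reads the same fields from a hand-built dictionary instead of a real event. The <code>CounterPayload</code> helper is hypothetical and only relies on the key names used in this post: the value is stored under “Increment” for <em>Sum</em> counters and under “Mean” for <em>Mean</em> counters.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: extracts the name, type and value of a counter
// from the outer dictionary (what data.PayloadValue(0) is cast to).
public static class CounterPayload
{
    public static (string name, string type, double value) Read(IDictionary&lt;string, object&gt; countersPayload)
    {
        // the interesting fields are stored in the inner "Payload" dictionary
        var kvPairs = (IDictionary&lt;string, object&gt;)countersPayload["Payload"];
        string name = kvPairs["Name"].ToString();
        string type = kvPairs["CounterType"].ToString();

        // the key of the value depends on the type of the counter
        string valueKey;
        switch (type)
        {
            case "Sum": valueKey = "Increment"; break;
            case "Mean": valueKey = "Mean"; break;
            default: throw new InvalidOperationException($"Unsupported counter type '{type}'");
        }

        // Convert.ToDouble avoids a ToString()/Parse() round trip
        double value = Convert.ToDouble(kvPairs[valueKey], CultureInfo.InvariantCulture);
        return (name, type, value);
    }
}
</code></pre></div>
<p>A fake payload is enough to try it without attaching to a process (the values are made up for the illustration):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">var fake = new Dictionary&lt;string, object&gt;
{
    ["Payload"] = new Dictionary&lt;string, object&gt;
    {
        ["Name"] = "assembly-count",
        ["DisplayName"] = "Number of Assemblies Loaded",
        ["CounterType"] = "Mean",
        ["Mean"] = 42.0
    }
};

var (name, type, value) = CounterPayload.Read(fake);
Console.WriteLine($"{name} ({type}) = {value}");
</code></pre></div>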
<p>The value of each counter is retrieved from the payload with the “Increment” key (<em>Sum</em> type) or the “Mean” key (<em>Mean</em> type).</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnSumCounter</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">string</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;</span> <span class="n">kvPairs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">double</span> <span class="k">value</span> <span class="p">=</span> <span class="kt">double</span><span class="p">.</span><span class="n">Parse</span><span class="p">(</span><span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;Increment&#34;</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// send the information to your metrics pipeline</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnMeanCounter</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">string</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;</span> <span class="n">kvPairs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">double</span> <span class="k">value</span> <span class="p">=</span> <span class="kt">double</span><span class="p">.</span><span class="n">Parse</span><span class="p">(</span><span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;Mean&#34;</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// send the information to your metrics pipeline</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>CounterMonitor</code> class has been added to my <a href="https://github.com/chrisnas/ClrEvents">GitHub</a> repository to expose a <code>CounterUpdate</code> C# event when a counter event is received:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">CounterMonitor</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">event</span> <span class="n">Action</span><span class="p">&lt;</span><span class="n">CounterEventArgs</span><span class="p">&gt;</span> <span class="n">CounterUpdate</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">OnSumCounter</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">string</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;</span> <span class="n">kvPairs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">double</span> <span class="k">value</span> <span class="p">=</span> <span class="kt">double</span><span class="p">.</span><span class="n">Parse</span><span class="p">(</span><span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;Increment&#34;</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// notify the subscribers, if any, that a new value was received</span>
</span></span><span class="line"><span class="cl">        <span class="n">CounterUpdate</span><span class="p">?.</span><span class="n">Invoke</span><span class="p">(</span><span class="k">new</span> <span class="n">CounterEventArgs</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">CounterType</span><span class="p">.</span><span class="n">Sum</span><span class="p">,</span> <span class="k">value</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">OnMeanCounter</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">string</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">IDictionary</span><span class="p">&lt;</span><span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">&gt;</span> <span class="n">kvPairs</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">double</span> <span class="k">value</span> <span class="p">=</span> <span class="kt">double</span><span class="p">.</span><span class="n">Parse</span><span class="p">(</span><span class="n">kvPairs</span><span class="p">[</span><span class="s">&#34;Mean&#34;</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// notify the subscribers, if any, that a new value was received</span>
</span></span><span class="line"><span class="cl">        <span class="n">CounterUpdate</span><span class="p">?.</span><span class="n">Invoke</span><span class="p">(</span><span class="k">new</span> <span class="n">CounterEventArgs</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">CounterType</span><span class="p">.</span><span class="n">Mean</span><span class="p">,</span> <span class="k">value</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The event argument exposes the expected properties, but others, such as a timestamp, could be added if needed:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">CounterEventArgs</span> <span class="p">:</span> <span class="n">EventArgs</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">internal</span> <span class="n">CounterEventArgs</span><span class="p">(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">string</span> <span class="n">displayName</span><span class="p">,</span> <span class="n">CounterType</span> <span class="n">type</span><span class="p">,</span> <span class="kt">double</span> <span class="k">value</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Counter</span> <span class="p">=</span> <span class="n">name</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">DisplayName</span> <span class="p">=</span> <span class="n">displayName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">Type</span> <span class="p">=</span> <span class="n">type</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">Value</span> <span class="p">=</span> <span class="k">value</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">Counter</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">string</span> <span class="n">DisplayName</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">CounterType</span> <span class="n">Type</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">double</span> <span class="n">Value</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">enum</span> <span class="n">CounterType</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Sum</span> <span class="p">=</span> <span class="m">0</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">Mean</span> <span class="p">=</span> <span class="m">1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="lets-show-somegraphs">Let’s show some graphs!</h2>
<p>With these helpers in hand, it is easy to integrate any counter into your monitoring pipeline. As an example, let’s see how to generate a .csv file that can be used to create charts in Excel.</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_e68YXXlCXrL6kY7HYs-Jow.png"></p>
<p>With a refresh rate of 1 second, one line containing the values of the CLR counters should be added to the .csv file every second. Since we get one event per counter, we need to know which counter event is the “last” one sent by the CLR for a given 1-second push of counters.</p>
<p>As mentioned earlier, the <a href="https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/src/System/Diagnostics/Eventing/RuntimeEventSource.cs#L47">RuntimeEventSource</a> class defines the CLR counters. Each one is an instance of a type derived from the <code>DiagnosticCounter</code> class that <a href="https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Diagnostics/Tracing/DiagnosticCounter.cs#L45">associates its instances</a> to a <code>CounterGroup</code> also bound to the <code>RuntimeEventSource</code>. The <code>CounterGroup</code> class sets up a repeating timer responsible for creating the payload of each of its <code>DiagnosticCounter</code>-derived instances and asking the event source to send it to the monitoring application via <strong>EventPipe</strong>.</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_U2SXMs1uV4x36fdjH7nKiA.png"></p>
<p>So we can rely on the order defined by the counter creation code in <code>RuntimeEventSource</code>: for a given push of counters, the name of the last one will be “<em>assembly-count</em>”. Beware that for additional counters (such as those for ASP.NET Core), you would need to check which one is the last of the series. Another workaround would be to rely on the timestamp of each event, but this could become flaky over time. It would have been great if a “<em>CounterSeries</em>” event containing the list of counter names were sent before the “<em>EventCounters</em>” events of each series (good idea for a pull request :^)</p>
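<p>If you would rather not hard-code the name of the last counter, one possible alternative (an idea given as a sketch, not what the code of this post does) is to consider that a series is complete as soon as a counter name already received in the current series comes back. The <code>SeriesAggregator</code> class below is hypothetical:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

// Hypothetical helper: groups counter updates into series without knowing
// the name of the last counter. A series is considered complete when a
// counter name that is already part of it is received again.
public sealed class SeriesAggregator
{
    private readonly List&lt;(string name, double value)&gt; _current = new List&lt;(string name, double value)&gt;();
    private readonly HashSet&lt;string&gt; _seen = new HashSet&lt;string&gt;();

    // raised with a copy of the completed series
    public event Action&lt;IReadOnlyList&lt;(string name, double value)&gt;&gt; SeriesCompleted;

    public void Add(string name, double value)
    {
        // HashSet.Add returns false when the name is already in the set
        if (!_seen.Add(name))
        {
            // the previous series is over: publish it and start a new one
            SeriesCompleted?.Invoke(_current.ToArray());
            _current.Clear();
            _seen.Clear();
            _seen.Add(name);
        }

        _current.Add((name, value));
    }
}
</code></pre></div>
<p>For example, with two counters per series:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">var aggregator = new SeriesAggregator();
aggregator.SeriesCompleted += series =&gt; Console.WriteLine(series.Count);

aggregator.Add("cpu-usage", 1);
aggregator.Add("assembly-count", 10);
aggregator.Add("cpu-usage", 2); // prints 2: the first series is complete
</code></pre></div>
<p>The price to pay is that a series is only published when the next one starts, so one refresh interval later, and the very last series is never published unless you flush it explicitly.</p>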
<p>The <code>CsvCounterListener</code> class wraps the few lines of code needed to handle the events and add a line into the .csv file each time a series of counters is received:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">class</span> <span class="nc">CsvCounterListener</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">string</span> <span class="n">_filename</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">readonly</span> <span class="kt">int</span> <span class="n">_pid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">CounterMonitor</span> <span class="n">_counterMonitor</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">List</span><span class="p">&lt;(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">double</span> <span class="k">value</span><span class="p">)&gt;</span> <span class="n">_countersValue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">CsvCounterListener</span><span class="p">(</span><span class="kt">string</span> <span class="n">filename</span><span class="p">,</span> <span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_filename</span> <span class="p">=</span> <span class="n">filename</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_pid</span> <span class="p">=</span> <span class="n">pid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_countersValue</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;(</span><span class="kt">string</span> <span class="n">name</span><span class="p">,</span> <span class="kt">double</span> <span class="k">value</span><span class="p">)&gt;();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">Start</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_counterMonitor</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">$&#34;Start can&#39;t be called multiple times&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">_counterMonitor</span> <span class="p">=</span> <span class="k">new</span> <span class="n">CounterMonitor</span><span class="p">(</span><span class="n">_pid</span><span class="p">,</span> <span class="n">GetProviders</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">        <span class="n">_counterMonitor</span><span class="p">.</span><span class="n">CounterUpdate</span> <span class="p">+=</span> <span class="n">OnCounterUpdate</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">Task</span> <span class="n">monitorTask</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Task</span><span class="p">(()</span> <span class="p">=&gt;</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">try</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">_counterMonitor</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span> <span class="n">x</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="n">Environment</span><span class="p">.</span><span class="n">FailFast</span><span class="p">(</span><span class="s">&#34;Error while listening to counters&#34;</span><span class="p">,</span> <span class="n">x</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">            <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">});</span>
</span></span><span class="line"><span class="cl">        <span class="n">monitorTask</span><span class="p">.</span><span class="n">Start</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">OnCounterUpdate</span><span class="p">(</span><span class="n">CounterEventArgs</span> <span class="n">args</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_countersValue</span><span class="p">.</span><span class="n">Add</span><span class="p">((</span><span class="n">args</span><span class="p">.</span><span class="n">DisplayName</span><span class="p">,</span> <span class="n">args</span><span class="p">.</span><span class="n">Value</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// we know that the last CLR counter is &#34;assembly-count&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">Counter</span> <span class="p">==</span> <span class="s">&#34;assembly-count&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">SaveLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">            <span class="n">_countersValue</span><span class="p">.</span><span class="n">Clear</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">bool</span> <span class="n">isHeaderSaved</span> <span class="p">=</span> <span class="kc">false</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="k">void</span> <span class="n">SaveLine</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(!</span><span class="n">isHeaderSaved</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">File</span><span class="p">.</span><span class="n">AppendAllText</span><span class="p">(</span><span class="n">_filename</span><span class="p">,</span> <span class="n">GetHeaderLine</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">            <span class="n">isHeaderSaved</span> <span class="p">=</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">File</span><span class="p">.</span><span class="n">AppendAllText</span><span class="p">(</span><span class="n">_filename</span><span class="p">,</span> <span class="n">GetCurrentLine</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">string</span> <span class="n">GetHeaderLine</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">StringBuilder</span> <span class="n">buffer</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StringBuilder</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">counter</span> <span class="k">in</span> <span class="n">_countersValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">buffer</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0}\t&#34;</span><span class="p">,</span> <span class="n">counter</span><span class="p">.</span><span class="n">name</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// remove last tab</span>
</span></span><span class="line"><span class="cl">        <span class="n">buffer</span><span class="p">.</span><span class="n">Remove</span><span class="p">(</span><span class="n">buffer</span><span class="p">.</span><span class="n">Length</span> <span class="p">-</span> <span class="m">1</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// add Windows-like new line because will be used in Excel</span>
</span></span><span class="line"><span class="cl">        <span class="n">buffer</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="s">&#34;\r\n&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">buffer</span><span class="p">.</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="kt">string</span> <span class="n">GetCurrentLine</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">StringBuilder</span> <span class="n">buffer</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StringBuilder</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">counter</span> <span class="k">in</span> <span class="n">_countersValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">buffer</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0}\t&#34;</span><span class="p">,</span> <span class="n">counter</span><span class="p">.</span><span class="k">value</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// remove last tab</span>
</span></span><span class="line"><span class="cl">        <span class="n">buffer</span><span class="p">.</span><span class="n">Remove</span><span class="p">(</span><span class="n">buffer</span><span class="p">.</span><span class="n">Length</span> <span class="p">-</span> <span class="m">1</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// add Windows-like new line because will be used in Excel</span>
</span></span><span class="line"><span class="cl">        <span class="n">buffer</span><span class="p">.</span><span class="n">Append</span><span class="p">(</span><span class="s">&#34;\r\n&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">buffer</span><span class="p">.</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">Stop</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_counterMonitor</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">$&#34;Stop can&#39;t be called before Start&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">_counterMonitor</span><span class="p">.</span><span class="n">Stop</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">_counterMonitor</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">_countersValue</span><span class="p">.</span><span class="n">Clear</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">IReadOnlyCollection</span><span class="p">&lt;</span><span class="n">Provider</span><span class="p">&gt;</span> <span class="n">GetProviders</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">providers</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p">&lt;</span><span class="n">Provider</span><span class="p">&gt;();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// create default &#34;System.Runtime&#34; provider with a refresh every second</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">provider</span> <span class="p">=</span> <span class="n">CounterHelpers</span><span class="p">.</span><span class="n">MakeProvider</span><span class="p">(</span><span class="s">&#34;System.Runtime&#34;</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">providers</span><span class="p">.</span><span class="n">Add</span><span class="p">(</span><span class="n">provider</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">providers</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="whats-next">What’s next?</h2>
<p>You have seen how easy it is to be notified of CLR counter updates. The integration into your own monitoring system should not be more complicated. However, you need to pay attention to the difference in meaning between the <em>Mean</em> and <em>Sum</em> counter types. For example, the value you get for a <em>Sum</em> counter such as <strong>gen-0-count</strong> is the difference between now and the previous computation. It means that you can’t get the “current” number of gen 0 collections at a given time.</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_XKDMcfmyoXAVetidWvYmcA.png"></p>
<p>This is not a problem in the Excel example because you can “rebuild” a column that will contain the “current” count based on the previous value + the diff returned by the counter.</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_3klYCATjxMGhjb_NRcrSZA.png"></p>
<p>Here is the resulting graph:</p>
<p><img loading="lazy" src="/posts/2019-07-23_net-core-counters-internals/1_a42c8gZoHWbMwKNBFSnsiA.png"></p>
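<p>Outside of Excel, the same “rebuild” step is just a running sum over the differences returned by a <em>Sum</em> counter. The following is a minimal sketch, not code from the repository: the <code>CounterMath</code> class, the <code>ToRunningTotal</code> method and the <code>initialValue</code> parameter are hypothetical names introduced for illustration.</p>

```csharp
using System;

public static class CounterMath
{
    // Rebuilds "current" totals from the per-interval differences reported by a Sum counter.
    // initialValue is an assumption: the count that was reached before the first difference.
    public static double[] ToRunningTotal(double[] increments, double initialValue = 0)
    {
        var totals = new double[increments.Length];
        double current = initialValue;
        for (int i = 0; i < increments.Length; i++)
        {
            current += increments[i];   // previous value + diff returned by the counter
            totals[i] = current;
        }

        return totals;
    }
}
```

<p>For example, the differences 2, 0 and 3 become the totals 2, 2 and 5 when starting from 0.</p>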
<p>In other cases, you might need to feed your monitoring system with real count values and benefit from advanced charting such as non derivative computation to show a rate based on a series of values. At the end of the day, it is just a question of choosing an initial value from which to rebuild a count. And if you think about it, you are often more interested in unexpected variations (i.e. differences returned by counters) when monitoring your application.</p>
<p>In addition to your business metrics, .NET Core Counters are usually enough to monitor the health of your applications. However, in order to investigate situations where counter values show weird results, you often need more details. For example, spikes in the garbage collection count might not be a problem if the pause time is not too long. Listening to specific CLR events as shown in previous posts of this series is a great way to unveil important metrics such as GC pause time, contention duration or exception names without a performance hit.</p>
<p>The code available on <a href="https://github.com/chrisnas/ClrEvents">GitHub</a> has been updated to provide the <code>CounterMonitor</code> and <code>CsvCounterListener</code> classes that demonstrate how to get .NET Core counters and generate a .csv file usable in Excel.</p>
]]></content:encoded></item><item><title>Spying on .NET Garbage Collector with .NET Core EventPipes</title><link>https://chrisnas.github.io/posts/2019-05-28_spying-on-net-garbage/</link><pubDate>Tue, 28 May 2019 07:38:15 +0000</pubDate><guid>https://chrisnas.github.io/posts/2019-05-28_spying-on-net-garbage/</guid><description>This post shows how to use .NET Core EventPipes to build Garbage Collector logs. The emitted raw CLR events are described in detail.</description><content:encoded><![CDATA[<hr>
<p>This post of the series shows how to generate GC logs in .NET Core with the new event pipes architecture and details the events emitted by the CLR during a collection.</p>
<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<p>Part 3: CLR Threading events with TraceEvent.</p>
<p>Part 4: <a href="/posts/2018-12-15_spying-on-net-garbage/">Spying on .NET Garbage Collector with TraceEvent</a>.</p>
<p>Part 5: <a href="/posts/2019-02-12_building-your-own-java/">Building your own Java GC logs in .NET</a></p>
<h2 id="introduction">Introduction</h2>
<p>The previous episode of the series introduced the notion of “GC log”, well known in the Java world, and how to implement it in .NET thanks to ETW and TraceEvent on Windows. This solution is easy but requires creating an ETW session (and remembering to close it)… and is not supported on Linux. However, <a href="/posts/2018-12-06_in-process-clr-event/">.NET Core 2.2 introduced</a> the <a href="https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventlistener?WT.mc_id=DT-MVP-5003325?view=netcore-2.2">EventListener class</a> as the best way to receive CLR events both on Windows and Linux, but only from inside the process itself. As of today, TraceEvent does not support live sessions with EventPipe/EventListener; only <a href="https://github.com/Microsoft/perfview/blob/master/src/TraceEvent/EventPipe/EventPipeEventSource.cs#L28">a file-based constructor is available</a>. This is unfortunate because it means that you can’t rely on the huge work done by TraceEvent to parse the CLR events, especially those related to garbage collections. The rest of the post explains how to decipher raw events.</p>
<p>There is also a bigger problem: the current .NET Core 2.2 implementation is <a href="https://github.com/dotnet/coreclr/issues/21380">not working for all CLR events</a>. Long story short, the <code>EventPipe</code> class relies on a specific Thread Local Storage slot that is not set by GC background worker threads, so the events raised by these threads are not emitted. In addition, there is no per-event timestamp information in 2.2. The implementation presented in this post relies on tests done with ETW traces and on the <a href="https://github.com/dotnet/coreclr/pull/21817">Pull Request</a> that fixes the issue for .NET Core 3.0, available in Preview 5.</p>
<h2 id="back-to-the-basics-what-events-are-emitted-by-thegc">Back to the basics: what events are emitted by the GC?</h2>
<p>The previous posts of the series were based on C# events raised by the TraceEvent parser with names different from the original CLR events and the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events?WT.mc_id=DT-MVP-5003325">corresponding Microsoft Docs</a>. When you implement your EventListener-derived class, each event is received as an <code>EventWrittenEventArgs</code> object in the <code>OnEventWritten</code> override. The <code>EventId</code> and <code>EventName</code> properties allow you to figure out which event is received. If you have worked with TraceEvent before, you might be used to the <code>Opcode</code> property. Even though a property with the same name exists in <code>EventWrittenEventArgs</code>, its value is completely different and should not be used.</p>
<p>The CLR versions the emitted events to be able to add information over time. For example, the <code>EventId</code> of the “GCStart” event is 1 but the <code>EventName</code> could be <em>GCStart</em>, <em>GCStart_V1</em> or <em>GCStart_V2</em>, even though the Microsoft Docs seem to be <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events#gcstart_v1_event?WT.mc_id=DT-MVP-5003325">stuck on version 1</a>. The following table lists the interesting GC events for .NET Core 2.2/3.0:</p>
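<p>To make the <code>OnEventWritten</code> flow more concrete, here is a minimal sketch of a listener that enables the GC events of the CLR provider and prints the id and name of each received event. The <code>GcEventListener</code> class name is hypothetical; the sketch assumes the runtime provider name <code>Microsoft-Windows-DotNETRuntime</code> and the GC keyword value 0x1, and it is not the implementation used later in this post.</p>

```csharp
using System;
using System.Diagnostics.Tracing;

// Hypothetical listener used only for illustration
public sealed class GcEventListener : EventListener
{
    // assumptions: name of the CLR provider and value of its GC keyword
    private const string ClrProviderName = "Microsoft-Windows-DotNETRuntime";
    private const EventKeywords GcKeyword = (EventKeywords)0x1;

    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        // called for each existing or newly created event source in the process
        if (eventSource.Name == ClrProviderName)
        {
            EnableEvents(eventSource, EventLevel.Informational, GcKeyword);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        // identify the event with EventId and EventName, not with Opcode
        Console.WriteLine($"{eventData.EventId,3} | {eventData.EventName}");
    }
}
```

<p>Creating an instance of the listener early in the application is enough to start receiving events; disposing it stops the notifications.</p>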
<p><img loading="lazy" src="/posts/2019-05-28_spying-on-net-garbage/1_NFTfqDwPckWMA7Pjv9oX5A.png"></p>
<p>Look at <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events?WT.mc_id=DT-MVP-5003325">the documentation related to each event</a>.</p>
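<p>Since the same event can be received with a different version suffix depending on the runtime, one possible approach is to normalize the name before comparing it. The following helper is only a sketch: <code>GcEventNames</code> and <code>RemoveVersionSuffix</code> are hypothetical names, and it assumes that the version suffix always looks like <em>_V</em> followed by digits, as in the names listed above. Relying on <code>EventId</code> is another option.</p>

```csharp
using System;

public static class GcEventNames
{
    // Strips a trailing "_V" + digits suffix: "GCStart_V2" becomes "GCStart".
    // Names without such a suffix are returned unchanged.
    public static string RemoveVersionSuffix(string eventName)
    {
        if (string.IsNullOrEmpty(eventName))
            return eventName;

        int index = eventName.LastIndexOf("_V", StringComparison.Ordinal);
        if (index <= 0 || index + 2 >= eventName.Length)
            return eventName;

        // everything after "_V" must be a digit
        for (int i = index + 2; i < eventName.Length; i++)
        {
            if (!char.IsDigit(eventName[i]))
                return eventName;
        }

        return eventName.Substring(0, index);
    }
}
```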
<p>If you go back to <a href="/posts/2019-02-12_building-your-own-java/">this previous article</a> of the series, you will notice that all details provided by the <code>TraceGC</code> argument are available except for the size of the objects before and after the collection. These values are embedded in the payload of the <em>GCPerHeapHistory</em> event by the GC code. Unfortunately, these details are not marshalled by the current <code>EventPipe</code> implementation to your <code>OnEventWritten</code> override (read <a href="https://github.com/dotnet/coreclr/issues/24506">https://github.com/dotnet/coreclr/issues/24506</a> for more details and when it will be fixed).</p>
<p>There is no strongly typed <code>EventArgs</code> per event and you need to know the name of the field you are interested in to get its index. From this index, you get its corresponding value from the <code>Payload</code> property of the received <code>EventWrittenEventArgs</code>. The following helper method is doing the heavy lifting for you:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">T</span> <span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="n">T</span><span class="p">&gt;(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">,</span> <span class="kt">string</span> <span class="n">fieldName</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// this is not optimal in terms of performance but should not be a problem</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">index</span> <span class="p">=</span> <span class="n">e</span><span class="p">.</span><span class="n">PayloadNames</span><span class="p">.</span><span class="n">IndexOf</span><span class="p">(</span><span class="n">fieldName</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">index</span> <span class="p">==</span> <span class="p">-</span><span class="m">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="k">default</span><span class="p">(</span><span class="n">T</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="n">T</span><span class="p">)</span> <span class="n">e</span><span class="p">.</span><span class="n">Payload</span><span class="p">[</span><span class="n">index</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now that all interesting events are known, it is time to figure out the sequence of events emitted during a garbage collection: a new line with the details should be added to the GC log file when the last event is received.</p>
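<p>One detail about the <code>GetFieldValue</code> helper shown earlier: its <code>(T)</code> cast unboxes the payload value, so it throws an <code>InvalidCastException</code> when the requested type is not exactly the boxed type (for example, asking for an <code>int</code> when the field is stored as a <code>uint</code>). If that is a concern, a more tolerant reader for numeric fields could look like the following sketch. The <code>PayloadReader</code> class and its <code>TryGetInt64</code> method are hypothetical and not part of the code of this post.</p>

```csharp
using System;
using System.Collections;
using System.Globalization;

public static class PayloadReader
{
    // Reads a numeric field by name and converts it to long, whatever its boxed integer type.
    // Returns false if the field is missing, null or not convertible.
    public static bool TryGetInt64(IList payloadNames, IList payload, string fieldName, out long value)
    {
        value = 0;
        if (payloadNames == null || payload == null)
            return false;

        int index = payloadNames.IndexOf(fieldName);
        if (index == -1 || index >= payload.Count || payload[index] == null)
            return false;

        try
        {
            value = Convert.ToInt64(payload[index], CultureInfo.InvariantCulture);
            return true;
        }
        catch (Exception ex) when (ex is InvalidCastException || ex is FormatException || ex is OverflowException)
        {
            return false;
        }
    }
}
```

<p>With an <code>EventWrittenEventArgs e</code>, it would be called as <code>PayloadReader.TryGetInt64(e.PayloadNames, e.Payload, "Depth", out long depth)</code>.</p>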
<h2 id="what-is-the-exact-sequence-of-gcevents">What is the exact sequence of GC events?</h2>
<p>So let’s go back to the main phases of a garbage collection with the related CLR events as shown in the following figure (courtesy of <a href="https://twitter.com/konradkokosa">Konrad Kokosa</a>, from <a href="https://www.amazon.com/Pro-NET-Memory-Management-Performance/dp/148424026X">his book</a>).</p>
<p><img loading="lazy" src="/posts/2019-05-28_spying-on-net-garbage/1_pNJJ5L4IlEaOsH6aHzQ4tQ.png"></p>
<p>These are the expected events for the most complicated case: a background collection with possible foreground ephemeral (gen0 and gen1) collections while the GC threads are concurrently sweeping. However, it is not possible to rely on this specific order of events because it changes depending on workstation/background mode and generation 2/ephemeral. Each type of collection triggers events in a different order, as shown below:</p>
<h2 id="gen0gen1-and-gen-2-non-concurrent">Gen0/Gen1 and Gen 2 (non concurrent)</h2>
<p><img loading="lazy" src="/posts/2019-05-28_spying-on-net-garbage/1_wxICgleCgQgNQKQ7eBd8Zg.png"></p>
<h2 id="gen-2-background">Gen 2 (background)</h2>
<p><img loading="lazy" src="/posts/2019-05-28_spying-on-net-garbage/1_mV2osuRu1bwXSgppAvmhvQ.png"></p>
<p>Here is a more visual view of what could happen (dark blue is gen 2 and light blue are ephemeral gen0/1):</p>
<p><img loading="lazy" src="/posts/2019-05-28_spying-on-net-garbage/1_CJb-0oh4Z1vntA2JQpZsog.png"></p>
<h2 id="when-exactly-does-a-gcstart">When exactly does a GC start…</h2>
<p>The <strong>GCTriggered</strong> event notifies that a new collection will start except in the case of foreground ephemeral gen0/gen1 collections triggered during a background gen2. In that case, you could rely on the <strong>GCStart</strong> event and check if a background gen2 is running. This <strong>GCStart</strong> event provides the condemned generation in its <code>Depth</code> property. So I keep track of both the current background GC (if any) and the foreground GC (if any) in a <code>GCInfo</code> object:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">internal</span> <span class="k">class</span> <span class="nc">GCInfo</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// When a background garbage collection (BGC) is started,</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// other foreground garbage collection (FGC) for gen 0 and 1 could happen</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// before the original BGC ends</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">GCDetails</span> <span class="n">CurrentBGC</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this could contain a FGC after a BGC has started</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// or a non-concurrent gen0/gen1/gen2 collection</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">GCDetails</span> <span class="n">GCInProgress</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>GCDetails</code> class keeps track of all the details gathered during a garbage collection:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">internal</span> <span class="k">class</span> <span class="nc">GCDetails</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">DateTime</span> <span class="n">TimeStamp</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">double</span> <span class="n">PauseDuration</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">Number</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">GCReason</span> <span class="n">Reason</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">GCType</span> <span class="n">Type</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">int</span> <span class="n">Generation</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">bool</span> <span class="n">IsCompacting</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="n">HeapDetails</span> <span class="n">Heaps</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>HeapDetails</code> stores the size of each generation after a collection:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">struct</span> <span class="nc">HeapDetails</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">long</span> <span class="n">Gen0Size</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">long</span> <span class="n">Gen1Size</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">long</span> <span class="n">Gen2Size</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="kt">long</span> <span class="n">LOHSize</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>GCDetails</code> instance is created when the <strong>GCStart</strong> event is received:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGcStart</span><span class="p">(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// This event is received after a collection is started</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">newGC</span> <span class="p">=</span> <span class="n">BuildGCDetails</span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// If a BGC is already started, FGC (0/1) are possible and will finish before the BGC</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;Depth&#34;</span><span class="p">)</span> <span class="p">==</span> <span class="m">2</span><span class="p">)</span> <span class="p">&amp;&amp;</span> 
</span></span><span class="line"><span class="cl">        <span class="p">((</span><span class="n">GCType</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;Type&#34;</span><span class="p">)</span> <span class="p">==</span> <span class="n">GCType</span><span class="p">.</span><span class="n">BackgroundGC</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_gcInfo</span><span class="p">.</span><span class="n">CurrentBGC</span> <span class="p">=</span> <span class="n">newGC</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_gcInfo</span><span class="p">.</span><span class="n">GCInProgress</span> <span class="p">=</span> <span class="n">newGC</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// forthcoming expected events for gen 0/1 collections are GCGlobalHeapHistory then GCHeapStats</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">GCDetails</span> <span class="n">BuildGCDetails</span><span class="p">(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="k">new</span> <span class="n">GCDetails</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">Number</span> <span class="p">=</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;Count&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">Generation</span> <span class="p">=</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;Depth&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">Type</span> <span class="p">=</span> <span class="p">(</span><span class="n">GCType</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;Type&#34;</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">Reason</span> <span class="p">=</span> <span class="p">(</span><span class="n">GCReason</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;Reason&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>At this point, it is important to remember whether a background or a foreground GC is starting. In the former case, the new <code>GCDetails</code> instance is stored in the <code>CurrentBGC</code> field; otherwise, it is stored in the <code>GCInProgress</code> field.</p>
<p>That way, when either <strong>GCGlobalHeapHistory</strong> or <strong>GCHeapStats</strong> is received, it is easy to know which GC is in progress: if a foreground GC is in progress, the event happens in its context (until the last one, <strong>GCHeapStats</strong>, which clears the <code>GCInProgress</code> field):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="n">GCDetails</span> <span class="n">GetCurrentGC</span><span class="p">(</span><span class="n">GCInfo</span> <span class="n">info</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">info</span><span class="p">.</span><span class="n">GCInProgress</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">info</span><span class="p">.</span><span class="n">GCInProgress</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">info</span><span class="p">.</span><span class="n">CurrentBGC</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="-suspend-pause-application-threads-and-end-of-ephemeral-collections">… suspend, pause application threads and end of ephemeral collections</h2>
<p>The suspension and pause times are not that complicated to compute. The garbage collector code relies on the <code>SuspendEE</code> and <code>RestartEE</code> methods provided by the .NET Execution Engine to suspend and restart the application threads respectively. Each of these methods emits a pair of <strong>GCxxxBegin</strong> and <strong>GCxxxEnd</strong> events. After <strong>GCSuspendEEBegin</strong> is emitted, the Execution Engine waits for the application threads to suspend their execution. When all threads are suspended, <strong>GCSuspendEEEnd</strong> gets emitted.</p>
<p>The <strong>GCRestartEEBegin</strong> event is emitted when the application threads begin to resume their execution. When all application threads are resumed, <strong>GCRestartEEEnd</strong> gets emitted. The elapsed time between the <strong>GCSuspendEEEnd</strong> and <strong>GCRestartEEBegin</strong> events is counted as <em>suspension time</em>: the application threads stay suspended while the GC does its work. However, for simplicity's sake, my current implementation sums both the time spent by the Execution Engine to suspend the threads and the pause time due to the GC work.</p>
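<p>If the two durations ever need to be reported separately, they can be accumulated from the same events. The following is only a minimal sketch: the <code>SuspensionTimings</code> class and its members are hypothetical names that are not part of the code of this post, and the handlers take a plain <code>DateTime</code> (such as the <code>TimeStamp</code> of the event) to stay self-contained:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

// Hypothetical helper (illustration only): keeps the two durations apart
public sealed class SuspensionTimings
{
    private DateTime? _suspendBegin;   // set when GCSuspendEEBegin is received
    private DateTime? _suspendEnd;     // set when GCSuspendEEEnd is received

    // time needed by the Execution Engine to suspend the application threads
    public double TimeToSuspendMs { get; private set; }

    // time during which the threads stay suspended while the GC works
    public double SuspendedForGcMs { get; private set; }

    public void OnSuspendEEBegin(DateTime timestamp)
    {
        _suspendBegin = timestamp;
    }

    public void OnSuspendEEEnd(DateTime timestamp)
    {
        // a missed GCSuspendEEBegin event simply contributes nothing
        if (_suspendBegin.HasValue)
        {
            TimeToSuspendMs += (timestamp - _suspendBegin.Value).TotalMilliseconds;
            _suspendBegin = null;
        }
        _suspendEnd = timestamp;
    }

    public void OnRestartEEBegin(DateTime timestamp)
    {
        // a missed GCSuspendEEEnd event simply contributes nothing
        if (_suspendEnd.HasValue)
        {
            SuspendedForGcMs += (timestamp - _suspendEnd.Value).TotalMilliseconds;
            _suspendEnd = null;
        }
    }
}
</code></pre></div>
<p>For example, if <strong>GCSuspendEEBegin</strong>, <strong>GCSuspendEEEnd</strong> and <strong>GCRestartEEBegin</strong> are received at 0, 2 and 12 milliseconds, <code>TimeToSuspendMs</code> is 2 and <code>SuspendedForGcMs</code> is 10.</p>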
<p>The suspension start time is kept in <strong>GCInfo</strong>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// time when SuspendEEBegin is received for this process</span>
</span></span><span class="line"><span class="cl"><span class="c1">// --&gt; from here, all app threads will be suspended until RestartEEEnd is received</span>
</span></span><span class="line"><span class="cl"><span class="c1">// Note that we don&#39;t know yet what will be the triggered GC</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="n">DateTime</span><span class="p">?</span> <span class="n">SuspensionStart</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="k">set</span><span class="p">;</span> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It will be set when the <strong>GCSuspendEEBegin</strong> event is received:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGcSuspendEEBegin</span><span class="p">(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// we don&#39;t know yet what will be the next GC corresponding to this suspension</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// so it is kept until next GCStart </span>
</span></span><span class="line"><span class="cl">    <span class="n">_gcInfo</span><span class="p">.</span><span class="n">SuspensionStart</span> <span class="p">=</span> <span class="n">e</span><span class="p">.</span><span class="n">TimeStamp</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This implementation decision does not provide the same level of suspension details as TraceEvent parsing does: there is no fine-grained suspension time for inner foreground collections.</p>
<p>The sibling <strong>GCRestartEEEnd</strong> event is used to (1) compute the total pause time and (2) detect when gen0/gen1/non-concurrent gen2 collections end:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGcRestartEEEnd</span><span class="p">(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">currentGC</span> <span class="p">=</span> <span class="n">GetCurrentGC</span><span class="p">(</span><span class="n">_gcInfo</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">currentGC</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// this should never happen, except if we are unlucky to have missed a GCStart event</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// compute suspension time</span>
</span></span><span class="line"><span class="cl">    <span class="kt">double</span> <span class="n">suspensionDuration</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_gcInfo</span><span class="p">.</span><span class="n">SuspensionStart</span><span class="p">.</span><span class="n">HasValue</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">suspensionDuration</span> <span class="p">=</span> <span class="p">(</span><span class="n">e</span><span class="p">.</span><span class="n">TimeStamp</span> <span class="p">-</span> <span class="n">_gcInfo</span><span class="p">.</span><span class="n">SuspensionStart</span><span class="p">.</span><span class="n">Value</span><span class="p">).</span><span class="n">TotalMilliseconds</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_gcInfo</span><span class="p">.</span><span class="n">SuspensionStart</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// bad luck: a xxxBegin event has been missed</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="n">currentGC</span><span class="p">.</span><span class="n">PauseDuration</span> <span class="p">+=</span> <span class="n">suspensionDuration</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// could be the end of a gen0/gen1 or of a non concurrent gen2 GC</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">currentGC</span><span class="p">.</span><span class="n">Generation</span> <span class="p">&lt;</span> <span class="m">2</span><span class="p">)</span> <span class="p">||</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">currentGC</span><span class="p">.</span><span class="n">Type</span> <span class="p">==</span> <span class="n">GCType</span><span class="p">.</span><span class="n">NonConcurrentGC</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">GcEvents</span><span class="p">?.</span><span class="n">Invoke</span><span class="p">(</span><span class="k">this</span><span class="p">,</span> <span class="n">BuildGcArgs</span><span class="p">(</span><span class="n">currentGC</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="n">_gcInfo</span><span class="p">.</span><span class="n">GCInProgress</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// in case of background gen2, just need to sum the suspension time</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// --&gt; its end will be detected during GcGlobalHistory event</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="detect-other-collections-end-and-moredetails">Detect other collections end (and more details)</h2>
<p>As shown in the events workflow figure, the <strong>GCRestartEEBegin</strong>/<strong>GCRestartEEEnd</strong> pair of events is used to detect the end of non-concurrent gen0/1/2 collections. It is more complicated to detect the end of a background gen2 collection or of an inner ephemeral gen0/1 collection: <strong>GCGlobalHeapHistory</strong> is used for the former and <strong>GCHeapStats</strong> for the latter. However, the payload of these two events does not tell whether or not we are in the middle of a background gen2 collection, which is why this state is kept in the <code>CurrentBGC</code> and <code>GCInProgress</code> fields. With these details in mind, the code of the different event handlers is quite straightforward.</p>
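<p>These rules can be summarized as a small decision function. This is just an illustration of the text above: the <code>GcEndDetection</code> class, the <code>EndEventFor</code> method and its parameters are hypothetical names, not part of the listener code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">// Hypothetical helper (illustration only)
public static class GcEndDetection
{
    // Returns the name of the event treated as the end of a collection
    public static string EndEventFor(int generation, bool isBackground, bool backgroundGcInProgress)
    {
        // a background gen2 collection ends with its last GCGlobalHeapHistory event
        if (isBackground &amp;&amp; generation == 2)
            return "GCGlobalHeapHistory";

        // an ephemeral gen0/1 collection running inside a background gen2 ends with GCHeapStats
        if (backgroundGcInProgress &amp;&amp; generation &lt; 2)
            return "GCHeapStats";

        // any other non-concurrent gen0/1/2 collection ends when the threads are restarted
        return "GCRestartEEEnd";
    }
}
</code></pre></div>
<p>With this sketch, <code>EndEventFor(0, false, true)</code> gives <code>GCHeapStats</code> whereas <code>EndEventFor(0, false, false)</code> gives <code>GCRestartEEEnd</code>: the same gen0 collection is closed by a different event depending on whether a background gen2 collection is running.</p>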
<p>The generation sizes are retrieved from the <strong>GCHeapStats</strong> event:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// This event provides the size of each generation after the collection</span>
</span></span><span class="line"><span class="cl"><span class="c1">// Note: last event for non background GC (will be GCGlobalHeapHistory for background gen 2)</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGcHeapStats</span><span class="p">(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">currentGC</span> <span class="p">=</span> <span class="n">GetCurrentGC</span><span class="p">(</span><span class="n">_gcInfo</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">currentGC</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">currentGC</span><span class="p">.</span><span class="n">Heaps</span><span class="p">.</span><span class="n">Gen0Size</span> <span class="p">=</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;GenerationSize0&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">currentGC</span><span class="p">.</span><span class="n">Heaps</span><span class="p">.</span><span class="n">Gen1Size</span> <span class="p">=</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;GenerationSize1&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">currentGC</span><span class="p">.</span><span class="n">Heaps</span><span class="p">.</span><span class="n">Gen2Size</span> <span class="p">=</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;GenerationSize2&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">currentGC</span><span class="p">.</span><span class="n">Heaps</span><span class="p">.</span><span class="n">LOHSize</span> <span class="p">=</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">ulong</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;GenerationSize3&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this is the last event for non background collections during a background gen2 collection</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">_gcInfo</span><span class="p">.</span><span class="n">CurrentBGC</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span> <span class="p">&amp;&amp;</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">currentGC</span><span class="p">.</span><span class="n">Generation</span> <span class="p">&lt;</span> <span class="m">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">       <span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">GcEvents</span><span class="p">?.</span><span class="n">Invoke</span><span class="p">(</span><span class="k">this</span><span class="p">,</span> <span class="n">BuildGcArgs</span><span class="p">(</span><span class="n">currentGC</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">        <span class="n">_gcInfo</span><span class="p">.</span><span class="n">GCInProgress</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Remember this is the last event received for a gen0/gen1/foreground gen2 collection so I’m using it to clear the <code>GCInProgress</code> field: the next event will be for the current background gen2 if any (<code>CurrentBGC</code> field is not null) or a new collection.</p>
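<p>To see how the two fields evolve, here is a minimal simulation of a background gen2 collection interleaved with a foreground gen0 collection. The <code>Gc</code> and <code>GcTracker</code> types are simplified stand-ins for <code>GCDetails</code> and <code>GCInfo</code> invented for this sketch, and clearing <code>CurrentBGC</code> when the background collection ends is an assumption:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.Collections.Generic;

// Simplified stand-in for GCDetails (illustration only)
public sealed class Gc
{
    public int Number;
    public int Generation;
    public bool IsBackground;
}

// Simplified stand-in for GCInfo plus the event handlers (illustration only)
public sealed class GcTracker
{
    public Gc GCInProgress;   // foreground or non-concurrent collection
    public Gc CurrentBGC;     // background gen2 collection

    // numbers of the collections detected as finished, in order
    public List&lt;int&gt; Ended { get; } = new List&lt;int&gt;();

    // a foreground collection always takes precedence over the background one
    private Gc Current =&gt; GCInProgress ?? CurrentBGC;

    public void OnStart(Gc gc)
    {
        if (gc.Generation == 2 &amp;&amp; gc.IsBackground)
            CurrentBGC = gc;
        else
            GCInProgress = gc;
    }

    public void OnHeapStats()
    {
        var gc = Current;
        if (gc == null)
            return;

        // last event of an ephemeral collection running inside a background gen2
        if (CurrentBGC != null &amp;&amp; gc.Generation &lt; 2)
        {
            Ended.Add(gc.Number);
            GCInProgress = null;
        }
    }

    public void OnGlobalHeapHistory()
    {
        var gc = Current;
        if (gc == null)
            return;

        // last event of a background gen2 collection
        if (gc.Generation == 2 &amp;&amp; gc.IsBackground)
        {
            Ended.Add(gc.Number);
            CurrentBGC = null;   // assumption of this sketch
        }
    }
}
</code></pre></div>
<p>Starting a background gen2 collection #10, then a gen0 collection #11, and replaying <strong>GCGlobalHeapHistory</strong>, <strong>GCHeapStats</strong> (both for #11), then <strong>GCHeapStats</strong> and <strong>GCGlobalHeapHistory</strong> (both for #10) leaves 11 then 10 in <code>Ended</code>, with both fields back to <code>null</code>.</p>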
<p>As of today with Preview 5, the before/after generation sizes are not marshalled through event pipes (see the <a href="https://github.com/dotnet/coreclr/issues/24506">corresponding bug</a> for more details) so the <strong>GCPerHeapHistory</strong> event does not bring any value.</p>
<p>The last <strong>GCGlobalHeapHistory</strong> event of a background gen 2 collection is also used to detect compaction:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// This event is used to figure out if a collection is compacting or not</span>
</span></span><span class="line"><span class="cl"><span class="c1">// Note: last event for background GC (will be GCHeapStats for ephemeral (0/1) and non concurrent gen 2 collections)</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGcGlobalHeapHistory</span><span class="p">(</span><span class="n">EventWrittenEventArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">currentGC</span> <span class="p">=</span> <span class="n">GetCurrentGC</span><span class="p">(</span><span class="n">_gcInfo</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// check unexpected event (we should have received a GCStart first)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">currentGC</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">globalMask</span> <span class="p">=</span> <span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="n">GCGlobalMechanisms</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;GlobalMechanisms&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">currentGC</span><span class="p">.</span><span class="n">IsCompacting</span> <span class="p">=</span>
</span></span><span class="line"><span class="cl">        <span class="p">(</span><span class="n">globalMask</span> <span class="p">&amp;</span> <span class="n">GCGlobalMechanisms</span><span class="p">.</span><span class="n">Compaction</span><span class="p">)</span> <span class="p">==</span> <span class="n">GCGlobalMechanisms</span><span class="p">.</span><span class="n">Compaction</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// this is the last event for gen 2 background collections</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">((</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;CondemnedGeneration&#34;</span><span class="p">)</span> <span class="p">==</span> <span class="m">2</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">currentGC</span><span class="p">.</span><span class="n">Type</span> <span class="p">==</span> <span class="n">GCType</span><span class="p">.</span><span class="n">BackgroundGC</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// this is the last event for gen 2 background collections</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">((</span><span class="n">GetFieldValue</span><span class="p">&lt;</span><span class="kt">uint</span><span class="p">&gt;(</span><span class="n">e</span><span class="p">,</span> <span class="s">&#34;CondemnedGeneration&#34;</span><span class="p">)</span> <span class="p">==</span> <span class="m">2</span><span class="p">)</span> <span class="p">&amp;&amp;</span> <span class="p">(</span><span class="n">currentGC</span><span class="p">.</span><span class="n">Type</span> <span class="p">==</span> <span class="n">GCType</span><span class="p">.</span><span class="n">BackgroundGC</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">GcEvents</span><span class="p">?.</span><span class="n">Invoke</span><span class="p">(</span><span class="k">this</span><span class="p">,</span> <span class="n">BuildGcArgs</span><span class="p">(</span><span class="n">currentGC</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">            <span class="n">ClearCollections</span><span class="p">(</span><span class="n">_gcInfo</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In case of a background gen 2, this is the last event so there should not be any collection in progress:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">ClearCollections</span><span class="p">(</span><span class="n">GCInfo</span> <span class="n">info</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">info</span><span class="p">.</span><span class="n">CurrentBGC</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">info</span><span class="p">.</span><span class="n">GCInProgress</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The next received event will start a new garbage collection cycle of events.</p>
<p>This post concludes the series about CLR events and how to use them to better understand how the runtime is behaving under the workloads of your applications. The code available on <a href="https://github.com/chrisnas/ClrEvents">Github</a> has been updated to provide the <code>EventListenerGcLog</code> class that uses the code demonstrated in this post to generate GC logs with event pipes.</p>
]]></content:encoded></item><item><title>Let’s debug the Core CLR with WinDBG!</title><link>https://chrisnas.github.io/posts/2019-04-04_let-debug-the-core/</link><pubDate>Thu, 04 Apr 2019 13:32:51 +0000</pubDate><guid>https://chrisnas.github.io/posts/2019-04-04_let-debug-the-core/</guid><description>This post of the series shows how you could easily debug the Core CLR in a real world case of insane thread contention duration.</description><content:encoded><![CDATA[<hr>
<p>This post of the series shows how we debugged the Core CLR to figure out insane contention duration.</p>
<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<p>Part 3: CLR Threading events with TraceEvent.</p>
<p>Part 4: <a href="/posts/2018-12-15_spying-on-net-garbage/">Spying on .NET Garbage Collector with TraceEvent</a>.</p>
<p>Part 5: <a href="/posts/2019-02-12_building-your-own-java/">Building your own Java GC logs in .NET</a>.</p>
<h2 id="introduction">Introduction</h2>
<p>Long before migrating our .NET applications to Linux, our first step was to build a monitoring pipeline based on LTTng instead of ETW on Windows. To achieve this goal, the open source TraceEvent Nuget package needed to be updated in order to listen to LTTng live sessions (only a file based implementation was provided by Microsoft; mostly to allow Perfview to open traces taken on Linux machines). This was a <a href="https://github.com/criteo-forks/perfview/pull/1">huge development task</a> that sometimes led to weird results. Among the metrics we wanted to monitor, the contention duration gave insane values such as thousands of minutes… per minute:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_chjw_0ZlNI1GH6wc2tgNBg.png"></p>
<p>As shown in <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">a previous episode</a>, this duration is computed as the time elapsed between the two events <strong>ContentionStart</strong> and <strong>ContentionStop</strong>. What could be the possible reasons to get such insane values?</p>
<ol>
<li>
<p>A lot of small contentions are happening</p>
</li>
<li>
<p>A few very long contentions are happening</p>
</li>
</ol>
<p>As a first step, it would be great to be able to debug the Core CLR and figure out what call stacks end up triggering these contention events. Unfortunately for us, the .NET debugging ecosystem on Linux is far from being as rich as on Windows. So this episode details the steps to compile and debug the Core CLR on Windows with WinDBG.</p>
<h2 id="from-the-source-to-debugging-theruntime">From the source to debugging the runtime</h2>
<p>To better understand the implementation details in the CLR, we needed to find where the two events are emitted. In fact, during the CLR compilation, a lot of helpers are created based on the name of the event. In our case, <code>FireEtwContentionStart_V1</code> and <code>FireEtwContentionStop</code> are the two helpers in charge. Both are called <a href="https://github.com/dotnet/coreclr/blob/master/src/vm/syncblk.cpp#L2993">in the <strong>AwareLock::EnterEpilogHelper</strong> function</a>.</p>
<p>As a Windows developer, I would like to debug the CLR code and set a breakpoint in <code>EnterEpilogHelper</code> with Visual Studio to see which call stacks end up in contention. However, I did not find a way to do it with Visual Studio. I turned to WinDBG and things got “easier”… in a certain way.</p>
<p>Here are the different steps you need to follow before setting a breakpoint on any Core CLR function in WinDBG:</p>
<ol>
<li>Clone the Core CLR repository from <a href="https://github.com/dotnet/coreclr">https://github.com/dotnet/coreclr</a></li>
<li>Build it:
<ul>
<li>Get the Visual Studio, .NET Core SDK, CMake, Python, Powershell prerequisites from <a href="https://github.com/dotnet/coreclr/blob/master/Documentation/building/windows-instructions.md">the documentation</a></li>
<li>Go to the root folder and type <code>.\build -skiptests</code> to build a DEBUG version of the Core CLR</li>
<li>Leave your desk and go to lunch (ok… maybe just take a coffee break)</li>
</ul>
</li>
</ol>
<ol start="3">
<li>When you come back, the result of the compilation should be available in the following folder:</li>
</ol>
<p>…\coreclr\bin\Product\Windows_NT.x64.debug.</p>
<ol start="4">
<li>The next step is to <a href="https://github.com/dotnet/coreclr/blob/master/Documentation/workflow/UsingYourBuild.md">use your custom Core CLR build</a> in the application:</li>
</ol>
<ul>
<li>the application must be <a href="https://docs.microsoft.com/en-us/dotnet/core/deploying/#self-contained-deployments-scd">self-contained</a> by adding a <code>RuntimeIdentifier</code> element set to <code>win-x64</code> (or <code>linux-x64</code> for Linux) in a <code>PropertyGroup</code> section of the .csproj.</li>
<li>publish the application by running <code>dotnet publish</code> or from within Visual Studio</li>
</ul>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_Xvue5jy9443m9Zjih19cQw.png"></p>
<ul>
<li>Click the <em>Configure</em> link and select Debug configuration</li>
</ul>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_UDj6pcKSZutk3E7Kbvkfjw.png"></p>
<ul>
<li>after clicking <em>Save</em> and <em>Publish</em>, you should now have the result under the \bin\Debug\netcoreapp2.2\publish folder.</li>
</ul>
<ol start="5">
<li>It is now time to copy the following files from the Core CLR output to your application publication folder:</li>
</ol>
<ul>
<li>coreclr.dll (for the native part of the CLR) and System.Private.CoreLib.dll (if the CLR C# code has been modified)</li>
<li>in the PDB subfolder, coreclr.pdb and System.Private.CoreLib.pdb</li>
<li>note that you might also need the sos.dll and mscordaccore.dll files for any investigation in WinDBG.</li>
</ul>
<p>If you wonder why the CoreFx repo is not rebuilt, the answer is simple: the contention related code is in the CoreCLR. Also, most of the managed “mscorlib” is defined in System.Private.CoreLib.dll that gets built with CoreCLR. The rest of the BCL is covered by CoreFX and not needed in this investigation.</p>
<h2 id="from-running-to-debugging-inwindbg">From running to debugging in WinDBG</h2>
<p>You should use <a href="https://github.com/dotnet/coreclr/blob/master/Documentation/workflow/UsingCoreRun.md">corerun.exe</a> instead of dotnet.exe to run an application with the debug version of the Core CLR you’ve just built.</p>
<p>Open up a command prompt in the <strong>coreclr\bin\Product\Windows_NT.x64.debug</strong> folder and type <code>corerun c:\&lt;your path to the bin\Debug\netcoreapp2.2\publish folder of your application&gt;\&lt;yourApp.dll&gt;</code></p>
<p>You have to tell <code>corerun</code> where to find the CoreFx assemblies via the <code>CORE_LIBRARIES</code> environment variable:</p>
<p><code>CORE_LIBRARIES=C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.2.0</code></p>
<p>If you forget about it, don’t be surprised if the application stops with <code>FileNotFoundException</code> for a missing assembly (usually <strong>System.Runtime</strong>)…</p>
<p>If, like me, your applications are running with server mode GC, you know that it is set in the application .csproj file and ends up in the runtimeconfig.json file. Unfortunately, this is not taken into account by <code>corerun</code> (yet?) and you need to set it explicitly (as well as the concurrent mode if you need it) through the following environment variables:</p>
<p><code>COMPlus_gcServer=1 COMPlus_gcConcurrent=1</code></p>
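<p>Since these settings are easy to get wrong, it may help to let the application report which GC flavor the runtime really selected. The following is only a minimal sketch: the <code>GcModeCheck</code> class and its <code>Describe</code> method are hypothetical names, while <code>GCSettings.IsServerGC</code> and <code>GCSettings.LatencyMode</code> come from <code>System.Runtime</code>.</p>

```csharp
using System;
using System.Runtime;

// Hypothetical helper (not part of the original post) to check the GC configuration.
public static class GcModeCheck
{
    // Builds a one-line description of the GC flavor selected by the runtime.
    public static string Describe()
    {
        var flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return $"GC mode: {flavor}, latency mode: {GCSettings.LatencyMode}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

<p>With the two environment variables set as above, I would expect the output to mention the server mode; as far as I know, a latency mode of <code>Interactive</code> corresponds to the concurrent GC being enabled and <code>Batch</code> to it being disabled.</p>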
<p>From there, (<a href="https://www.microsoft.com/en-us/p/windbg-preview/9pgjgd53tn86">install WinDBG if not already done</a> and) start the debugger: click the <em>File</em> menu and select <em>Launch Executable (advanced)</em> to set up a debugging session:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_ZyuzZ_qKw_bhpnCfq0WTDA.png"></p>
<p>The <em>Executable</em> text field points to the <strong>corerun.exe</strong> file generated during the compilation of the Core CLR. The same folder is used as <em>Start Directory</em> and the <em>Arguments</em> text field contains the full path of the application to debug. You could also attach to a running process but sometimes you need to access Core CLR data structures before any C#-compiled managed code of your application starts executing (to see how the garbage collector initializes for example).</p>
<p>As soon as you click the <em>Ok</em> button, the application starts but is almost immediately stopped by WinDBG:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_FFwwuUz6_xeF2AYBxn8K-A.png"></p>
<p>Don’t be scared by the last lines of the output: even though you read the word <strong>exception</strong>, this <code>int 3</code> assembly instruction tells you that a breakpoint has been set for you by WinDBG, has been hit when the application reached it and the application is now paused just before calling its entry point.</p>
<p>As you can see from the list of loaded modules, even though CoreRun.exe is there, no managed assembly (especially your application) has been loaded yet; not even the Core CLR itself! This means that you have to tell WinDBG to keep on executing the application until a point you would be interested in. To achieve that task, you will first need a quick tour of the WinDBG user interface even though this post is not there to replace the <a href="https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/debugging-using-windbg-preview">WinDBG online help</a> nor provide a detailed walkthrough.</p>
<p>The debugging section of the <em>Home</em> tab is not too different from what you get in Visual Studio:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_h5nRlDCO_PXfonPtcVwFgg.png"></p>
<p>The icons are even easier to understand because their action is also displayed. If you want to see the current call stack, select the <em>View</em> tab and click the <strong>Stack</strong> icon:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_UgBlXPkw57Gja-_ANnkfTw.png"></p>
<p>Like in Visual Studio, you are able to pin each panel wherever you want in the IDE:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_kqvC1tJsfN_lpaD6rclGjA.png"></p>
<p>The next step would be to set a breakpoint on the line of code you are interested in. But let’s be clear here: I’m talking about a line of code in a function exported by a native dll; not a line of C# code in a managed assembly. Remember that WinDBG is a native debugger and it debugs only native code. If you want to debug managed code with WinDBG, you need to use commands from the sos extension; but <a href="https://docs.microsoft.com/en-us/archive/blogs/tess/setting-breakpoints-in-net-code-using-bpmd?WT.mc_id=DT-MVP-5003325">this is another story</a>.</p>
<p>So let’s go back to the native world. Even though WinDBG does not have the notion of “solution” like Visual Studio provides, it is still possible to open a C++ file and set a breakpoint in it. Click the <em>File | Open Source File</em> menu and go to your Core CLR github repo to select syncblk.cpp under the \src\vm folder. Look for <code>AwareLock::EnterEpilogHelper</code> with CTRL+F (yes: search is working in WinDBG) and go down to the call to the <code>FireEtwContentionStart_V1</code> helper. Setting a breakpoint on this line is as simple as pressing <strong>F9</strong> like in Visual Studio. Press the <em>View</em> tab and click the <em>Breakpoint</em> button to see the result:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_O1BnT-PQKdoFGo0NxyiOtw.png"></p>
<p>Since the dll in which the breakpoint is set is not loaded yet, you can’t see the details of the breakpoint.</p>
<p>There is a way to tell WinDBG to continue the execution of the application until a dll gets loaded. For coreclr.dll, type the following command:</p>
<p><code>sxe ld:coreclr</code></p>
<p>and press <strong>F5</strong> (or type <code>g</code> as a command or click the green triangle in the <em>Home</em> toolbar) to resume the execution of the application. The <em>Breakpoints</em> panel shows more details now:</p>
<p><img loading="lazy" src="/posts/2019-04-04_let-debug-the-core/1_SCPPNlsp-9Oqp0uk2LcTUw.png"></p>
<p>Press <strong>F5</strong> to resume the execution and the breakpoint should be triggered when the first contention happens.</p>
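<p>If your own application does not contend often enough, a tiny test program can be used to hit the breakpoint on demand. This is only a sketch under my own assumptions: the <code>ContentionDemo</code> class, the number of threads and the <code>Thread.Sleep(1)</code> call used to hold the lock are all made up for the illustration.</p>

```csharp
using System;
using System.Threading;

// Hypothetical test program: several threads fight for the same monitor.
public static class ContentionDemo
{
    private static readonly object Gate = new object();
    private static int _counter;

    // Starts threadCount threads that each take the lock 'iterations' times
    // and returns the final value of the shared counter.
    public static int Run(int threadCount, int iterations)
    {
        _counter = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < iterations; j++)
                {
                    lock (Gate)
                    {
                        _counter++;
                        // hold the lock long enough for the other threads to wait on it
                        Thread.Sleep(1);
                    }
                }
            });
            threads[i].Start();
        }

        foreach (var thread in threads)
            thread.Join();

        return _counter;
    }

    public static void Main()
    {
        Console.WriteLine($"counter = {Run(4, 50)}");
    }
}
```

<p>Because the lock is held while sleeping, the other threads should quickly give up spinning and wait, which is the situation reported by the contention events.</p>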
<h2 id="from-symbols-to-call-stacks-inwindbg">From symbols to call stacks in WinDBG</h2>
<p>Before digging into call stacks, I would like to show you one of the differences between native dlls and managed assemblies. As a .NET developer, you are used to Intellisense and the strongly typed environment provided by the metadata stored in an assembly itself. For native dlls, the story is different. Exported functions are visible with tools such as <a href="http://www.dependencywalker.com/">Dependency Walker</a> or <code>dumpbin /exports</code> from the SDK. If the dll exports symbols built by the C++ compiler, their names get <em>mangled</em> to describe their signature. To get human readable symbols, you need the associated .pdb file. It will also be required to map a function address to its name in call stacks.</p>
<p>WinDBG allows you to browse these symbols with the <code>dt</code> command. For example, if you want to know all members defined by the <strong>AwareLock</strong> class, use the following command:</p>
<p><code>dt CoreClr!AwareLock::*</code></p>
<p>Like what was shown in the previous <em>Breakpoints</em> screenshot, the prefix of a name is the dll in which the symbol is defined. Next, use <code>!</code> as separator before the class name. Since Visual Studio is really slow to navigate the source code of the Core CLR or search in the thousands of include and C/C++ files, this is a very convenient way to navigate and learn its different parts. Don’t forget that the compilation could also inline functions (that won’t be visible in the symbols) and expand macros.</p>
<p>If you want to set a breakpoint on a function, use the <code>bp</code> command with the same syntax as <code>dt</code>. For example, the following command:</p>
<p><code>bp coreClr!AwareLock::EnterEpilogHelper</code></p>
<p>sets a breakpoint at the beginning of the function in which I already set a breakpoint.</p>
<p>These are the very basics of breakpoints in WinDBG. You are also able to define which actions to start when a breakpoint is hit. This is extremely powerful! For example, in the case of thread contention, you typically don’t want to stop the execution of the application because it will pause all threads and disturb the normal flow of execution that could lead to thread contention. Instead, you could ask WinDBG to dump the call stack leading to the function we are interested in and let the execution resume with the following syntax:</p>
<p><code>bp coreClr!AwareLock::EnterEpilogHelper &quot;!clrstack; g&quot;</code></p>
<p>The commands to execute after the breakpoint is hit are defined between quotes. In this example, I’m using the <code>clrstack</code> command exported by the sos.dll extension (that must be previously loaded via <code>.loadby sos coreclr</code>) and once it is done, <code>g</code> resumes the execution.</p>
<h2 id="whats-next">What’s next?</h2>
<p>Due to the automatic suspension of all threads when the <code>clrstack</code> command gets executed (before <code>g</code> resumes), the interactions between threads are not the same as normal execution outside of a debugger. I have even used <a href="https://github.com/criteo-forks/coreclr/commit/7394345097a78c7be3241939d357595ebad9b26a">some code available in DEBUG to dump the callstacks outside of a debugger</a> if the contention lasts longer than a threshold. However, it was not possible to reproduce the problem on Windows.</p>
<p>In parallel on Linux, another colleague investigated another lead: some events may also be skipped by our LTTng implementation. Due to complicated event management, if a <strong>ContentionStop</strong> and <strong>ContentionStart</strong> are missed, a possible previous <strong>ContentionStart</strong> could be used by the next <strong>ContentionStop</strong> and the duration would be unrelated to the real contention that happened.</p>
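<p>To make this scenario more concrete, here is a deliberately simplified sketch of the pairing logic for a single thread. The <code>ContentionPairing</code> class and the timestamps (expressed in seconds) are hypothetical; this is not the code of our LTTng implementation.</p>

```csharp
using System;

// Hypothetical, simplified pairing of start/stop events for one thread.
public class ContentionPairing
{
    // timestamp (in seconds) of the last start event seen; NaN when none is pending
    private double _pendingStart = double.NaN;

    public void OnStart(double timestamp)
    {
        _pendingStart = timestamp;
    }

    // Returns the computed duration in seconds, or NaN when no start is pending.
    public double OnStop(double timestamp)
    {
        if (double.IsNaN(_pendingStart))
            return double.NaN;

        var duration = timestamp - _pendingStart;
        _pendingStart = double.NaN;
        return duration;
    }

    public static void Main()
    {
        var pairing = new ContentionPairing();

        // first contention: its start is received at 10.0...
        pairing.OnStart(10.0);
        // ...but its stop (10.5) and the start of the second contention (70.0) are lost

        // the stop of the second contention gets paired with the stale start
        var duration = pairing.OnStop(70.25);
        Console.WriteLine($"computed: {duration} s, real: 0.25 s");
    }
}
```

<p>In this made-up example, a contention that really lasted a quarter of a second is reported as lasting about one minute, only because two events were dropped.</p>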
<p>So there could be a simpler way to narrow down the issue: instead of relying on two events, why not simply compute the duration of the contention in the <code>AwareLock::EnterEpilogHelper</code> function and emit only one new event with the duration as payload? Well… this will be the topic of the next episode of this series.</p>
<h2 id="references">References</h2>
<p>Series of videos from the Defrag Tools show where <a href="https://twitter.com/maoni0">Maoni Stephens</a> explains how to debug the Garbage Collector for a better understanding of its arcana:</p>
<ul>
<li><a href="https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-33-CLR-GC-Part-1">https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-33-CLR-GC-Part-1</a></li>
<li><a href="https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-34-CLR-GC-Part-2">https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-34-CLR-GC-Part-2</a></li>
<li><a href="https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-35-CLR-GC-Part-3">https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-35-CLR-GC-Part-3</a></li>
<li><a href="https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-36-CLR-GC-Part-4">https://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-36-CLR-GC-Part-4</a></li>
</ul>
]]></content:encoded></item><item><title>Building your own Java-like GC logs in .NET</title><link>https://chrisnas.github.io/posts/2019-02-12_building-your-own-java/</link><pubDate>Tue, 12 Feb 2019 10:11:39 +0000</pubDate><guid>https://chrisnas.github.io/posts/2019-02-12_building-your-own-java/</guid><description>This post of the series focuses on logging each GC details in a file and how to leverage it during investigations.</description><content:encoded><![CDATA[<hr>
<p>This post of the series focuses on logging the details of each GC in a file and how to leverage it during investigations.</p>
<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<p>Part 3: <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">Monitor Finalizers, contention and threads in your application</a>.</p>
<p>Part 4: <a href="/posts/2018-12-15_spying-on-net-garbage/">Spying on .NET Garbage Collector with TraceEvent</a>.</p>
<h2 id="introduction">Introduction</h2>
<p>I’m working in a team where we investigate issues in production: both for Java and .NET applications. This is a good opportunity to learn which features provided by Java are missing in .NET. One of the features heavily discussed with my colleague <a href="https://twitter.com/jpbempel">Jean-Philippe</a> is called the <em>GC Log</em>. It is possible to start an application with parameters that tell the GC to save tons of details about each garbage collection in a file: the GC Log. Based on this file, it is possible to extract the reason for a collection and the times of the different phases including the suspension time. This is a great source of information during investigations… when you know how the GC is working or by leveraging <a href="https://gceasy.io/">automatic report generation</a>.</p>
<p>In addition, you can also build your own UI to more easily understand what is going on and get a more visual representation of the situation.</p>
<p>In the short video above you can see the heap evolution during several days. Then, as this is an interactive HTML page, you can zoom in on an interesting period to have a more detailed view of the evolution between GCs.</p>
<p>Also for the pause time graph, you can follow the behavior of the GC with different kinds of pauses and associated phases. In this example, we have minor GCs happening and then an “initial mark” is triggered, followed by “final remark” and “cleanup” pauses. After an extra minor GC, we have a series of mixed GCs that is the result of what was planned by the GC after the marking phase.</p>
<p>In the .NET world, there is no such thing as a GC Log. However, as shown <a href="/posts/2018-12-15_spying-on-net-garbage/">in the previous post</a>, it is possible to use Perfview to analyze traces corresponding to collected CLR events. The GCStats view shows high level details in the “All GC Events” section. In addition to this HTML rendering, you can get access to the data itself in different formats:</p>
<p><img loading="lazy" src="/posts/2019-02-12_building-your-own-java/1_q01Myz2Qqtad8URqchuJ3Q.png"></p>
<p>The most complete one is the Raw Data XML file that you could parse to extract the details you need. This is very close to a .NET GC Log but it is complicated to build an automated process to get it from a production machine.</p>
<p>It would be great if you could tell a .NET application to generate such a GC Log like in Java instead of relying on manual steps with Perfview (and more scripts on Linux). This post will show you how to achieve this goal!</p>
<h2 id="defining-the-goal-basic-gclog-implementation">Defining the goal: basic GcLog implementation</h2>
<p>In Java, you have to turn the GC log on or off before the application starts and you can’t change it while it runs. Since I’m working with server applications, I would prefer to enable/disable the generation of a GC log file without having to stop and restart the application.</p>
<p>So I’ve defined the simple <strong>IGcLog</strong> interface:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="k">interface</span> <span class="nc">IGcLog</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">void</span> <span class="n">Start</span><span class="p">(</span><span class="kt">string</span> <span class="n">filename</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">void</span> <span class="n">Stop</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>In a dedicated <em>administration handler</em> (i.e. http endpoint of the application), the code could just use a class that implements this interface and call <code>Start</code> when the log is enabled and <code>Stop</code> when it is no longer needed.</p>
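<p>As an illustration, such a handler could look like the following sketch. Only <code>IGcLog</code> comes from this post: the <code>ConsoleGcLog</code> stand-in, the <code>GcLogAdminHandler</code> class and its <code>Handle</code> signature are assumptions made for the example, and the plumbing of a real http endpoint is left out.</p>

```csharp
using System;

public interface IGcLog
{
    void Start(string filename);

    void Stop();
}

// Hypothetical stand-in for a real implementation: it only traces the calls.
public class ConsoleGcLog : IGcLog
{
    public bool IsStarted { get; private set; }

    public void Start(string filename)
    {
        IsStarted = true;
        Console.WriteLine($"GC log started in {filename}");
    }

    public void Stop()
    {
        IsStarted = false;
        Console.WriteLine("GC log stopped");
    }
}

// Hypothetical administration handler: the name and the Handle signature are assumptions.
public class GcLogAdminHandler
{
    private readonly IGcLog _gcLog;

    public GcLogAdminHandler(IGcLog gcLog)
    {
        _gcLog = gcLog;
    }

    // Would be called by the http endpoint with the decoded request parameters.
    public void Handle(bool enable, string filename)
    {
        if (enable)
            _gcLog.Start(filename);
        else
            _gcLog.Stop();
    }
}

public static class Program
{
    public static void Main()
    {
        var handler = new GcLogAdminHandler(new ConsoleGcLog());
        handler.Handle(true, "gc.log");   // enable the log while the application runs
        handler.Handle(false, null);      // disable it without restarting
    }
}
```

<p>The handler depends only on the interface, so the class that really generates the log can be swapped without touching the endpoint.</p>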
<p>To make the implementation easier, I’ve written the following <code>GcLogBase</code> abstract class:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span><span class="lnt">42
</span><span class="lnt">43
</span><span class="lnt">44
</span><span class="lnt">45
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">abstract</span> <span class="k">class</span> <span class="nc">GcLogBase</span> <span class="p">:</span> <span class="n">IGcLog</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">protected</span> <span class="kt">string</span> <span class="n">Filename</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">private</span> <span class="n">StreamWriter</span> <span class="n">_fileWriter</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">Start</span><span class="p">(</span><span class="kt">string</span> <span class="n">filename</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="kt">string</span><span class="p">.</span><span class="n">IsNullOrEmpty</span><span class="p">(</span><span class="n">filename</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">ArgumentNullException</span><span class="p">(</span><span class="n">nameof</span><span class="p">(</span><span class="n">filename</span><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_fileWriter</span> <span class="p">!=</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">throw</span> <span class="k">new</span> <span class="n">InvalidOperationException</span><span class="p">(</span><span class="s">&#34;Start can&#39;t be called twice: Stop must be called first.&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">_fileWriter</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StreamWriter</span><span class="p">(</span><span class="n">filename</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">Filename</span> <span class="p">=</span> <span class="n">filename</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">OnStart</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">public</span> <span class="k">void</span> <span class="n">Stop</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="kt">string</span><span class="p">.</span><span class="n">IsNullOrEmpty</span><span class="p">(</span><span class="n">Filename</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">OnStop</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">Filename</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="n">_fileWriter</span><span class="p">.</span><span class="n">Flush</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">_fileWriter</span><span class="p">.</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">_fileWriter</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">protected</span> <span class="kt">bool</span> <span class="n">WriteLine</span><span class="p">(</span><span class="kt">string</span> <span class="n">line</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="p">(</span><span class="n">_fileWriter</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="kc">false</span><span class="p">;</span>   <span class="c1">// just in case the method is called AFTER Stop</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">_fileWriter</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">line</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="kc">true</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">protected</span> <span class="kd">abstract</span> <span class="k">void</span> <span class="n">OnStart</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">protected</span> <span class="kd">abstract</span> <span class="k">void</span> <span class="n">OnStop</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Its main goal is to hide the file management by providing the <code>WriteLine</code> method that child classes call to save the details of a garbage collection as a single line of text. The write operations are flushed when <code>Stop</code> is called. This combination gives buffered writes with low performance impact: don’t be scared if you don’t see the file size change right away, because the <code>StreamWriter</code> class is caching write operations.</p>
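<p>To see this buffering in isolation, here is a minimal sketch that is not part of the original code: the <code>BufferingDemo</code> class, its <code>Measure</code> method and the <code>GetLength</code> helper are hypothetical names, and the sample line is made up. It writes one short line through a <code>StreamWriter</code> and reads the file length through a second handle, before and after calling <code>Flush</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System.IO;

public static class BufferingDemo
{
    // Returns the file length seen before and after flushing the writer.
    public static (long BeforeFlush, long AfterFlush) Measure(string path)
    {
        using (var writer = new StreamWriter(path))
        {
            // a short line stays in the StreamWriter internal buffer
            writer.WriteLine(&#34;1234.5,42,0,NonConcurrentGC,AllocSmall&#34;);
            long before = GetLength(path);

            // Flush pushes the buffered characters down to the file
            writer.Flush();
            long after = GetLength(path);

            return (before, after);
        }
    }

    // Hypothetical helper: opens a second, read-only handle on the file
    // that tolerates the writer still being open.
    private static long GetLength(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            return stream.Length;
        }
    }
}
</code></pre></div>
<p>With the default buffer size, <code>BeforeFlush</code> should be 0 and <code>AfterFlush</code> greater than 0, which is the same effect you observe on the log file until <code>Stop</code> is called.</p>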
<p>The next step is to implement <code>OnStart</code> and <code>OnStop</code> in a derived class to enable/disable GC details retrieval.</p>
<h2 id="how-to-get-the-gc-details-the-easyway">How to get the GC details: the easy way?</h2>
<p>As already discussed in the previous posts of the series, the CLR emits traces (via ETW on Windows and LTTng on Linux) that can be collected in C#. You have already seen <a href="/posts/2018-12-15_spying-on-net-garbage/">how TraceEvent could help</a> collect and parse GC traces from any application, like PerfView does. With TraceEvent, the <code>TraceGC</code> <a href="https://github.com/Microsoft/perfview/blob/master/src/TraceEvent/Computers/TraceManagedProcess.cs#L1698">instance</a> received when a garbage collection ends contains tons of information: it’s mapped to the <code>GarbageCollectionArgs</code> <a href="https://github.com/chrisnas/ClrEvents/blob/master/src/ClrCounters/GarbageCollectionArgs.cs">structure</a> that you get while listening to the <code>GarbageCollection</code> <a href="https://github.com/chrisnas/ClrEvents/blob/master/src/ClrCounters/ClrEventsManager.cs#L24">event</a> of my <code>ClrEventsManager</code> helper class. The only information to provide is the ID of the .NET process I’m interested in: that way, it is easy to filter the events for this process only.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">EtwGcLog</span> <span class="n">gcLog</span> <span class="p">=</span> <span class="n">EtwGcLog</span><span class="p">.</span><span class="n">GetProcessGcLog</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">filename</span> <span class="p">=</span> <span class="n">GetUniqueFilename</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="n">gcLog</span><span class="p">.</span><span class="n">Start</span><span class="p">(</span><span class="n">filename</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// in a simple Console application, wait for the user to press ENTER.</span>
</span></span><span class="line"><span class="cl"><span class="c1">// in a more realistic case, keep track of the EtwGcLog instance and </span>
</span></span><span class="line"><span class="cl"><span class="c1">// call Stop to end the processing of events when needed.</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">gcLog</span><span class="p">.</span><span class="n">Stop</span><span class="p">();</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>GetUniqueFilename</strong> method builds a filename based on the process ID and the current date and time:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="kt">string</span> <span class="n">GetUniqueFilename</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">now</span> <span class="p">=</span> <span class="n">DateTime</span><span class="p">.</span><span class="n">Now</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">string</span> <span class="n">filename</span> <span class="p">=</span> <span class="n">Path</span><span class="p">.</span><span class="n">Combine</span><span class="p">(</span><span class="n">Environment</span><span class="p">.</span><span class="n">CurrentDirectory</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">        <span class="s">$&#34;{pid}_{now:yyyyMMdd_HHmmss}.csv&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">filename</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>GetProcessGcLog</code> method is a factory-like helper that builds an instance bound to the given process ID, or returns <code>null</code> if no process with that ID is running.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="n">EtwGcLog</span> <span class="n">GetProcessGcLog</span><span class="p">(</span><span class="kt">int</span> <span class="n">pid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">EtwGcLog</span> <span class="n">gcLog</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="kt">var</span> <span class="n">process</span> <span class="p">=</span> <span class="n">Process</span><span class="p">.</span><span class="n">GetProcessById</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">process</span><span class="p">.</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">gcLog</span> <span class="p">=</span> <span class="k">new</span> <span class="n">EtwGcLog</span><span class="p">(</span><span class="n">pid</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">catch</span> <span class="p">(</span><span class="n">System</span><span class="p">.</span><span class="n">ArgumentException</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// there is no running process with the given pid</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">gcLog</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The implementation of the <code>OnStart</code> and <code>OnStop</code> overrides is straightforward and builds <a href="https://medium.com/criteo-labs/spying-on-net-garbage-collector-with-traceevent-f49dc3117de">on the previous post</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">protected</span> <span class="kd">override</span> <span class="k">void</span> <span class="n">OnStart</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">string</span> <span class="n">sessionName</span> <span class="p">=</span> <span class="s">$&#34;GcLogEtwSession_{_pid.ToString()}_{Guid.NewGuid().ToString()}&#34;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;Starting {sessionName}...\r\n&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">_userSession</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TraceEventSession</span><span class="p">(</span><span class="n">sessionName</span><span class="p">,</span> <span class="n">TraceEventSessionOptions</span><span class="p">.</span><span class="n">Create</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">Task</span><span class="p">.</span><span class="n">Run</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// only want to receive GC events</span>
</span></span><span class="line"><span class="cl">        <span class="n">ClrEventsManager</span> <span class="n">manager</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ClrEventsManager</span><span class="p">(</span><span class="n">_userSession</span><span class="p">,</span> <span class="n">_pid</span><span class="p">,</span> <span class="n">EventFilter</span><span class="p">.</span><span class="n">GC</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">        <span class="n">manager</span><span class="p">.</span><span class="n">GarbageCollection</span> <span class="p">+=</span> <span class="n">OnGarbageCollection</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1">// this is a blocking call until the session is disposed</span>
</span></span><span class="line"><span class="cl">        <span class="n">manager</span><span class="p">.</span><span class="n">ProcessEvents</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;End of CLR event processing&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">});</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// add a header to the .csv file</span>
</span></span><span class="line"><span class="cl">    <span class="n">WriteLine</span><span class="p">(</span><span class="n">Header</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">protected</span> <span class="kd">override</span> <span class="k">void</span> <span class="n">OnStop</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// when the session is disposed, the call to ProcessEvents() returns</span>
</span></span><span class="line"><span class="cl">    <span class="n">_userSession</span><span class="p">.</span><span class="n">Dispose</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The created <code>TraceEventSession</code> is passed to the <code>ClrEventsManager</code> along with the process ID and a filter to receive only <strong>GarbageCollection</strong> event notifications. The <code>OnGarbageCollection</code> handler is super simple:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGarbageCollection</span><span class="p">(</span><span class="kt">object</span> <span class="n">sender</span><span class="p">,</span> <span class="n">GarbageCollectionArgs</span> <span class="n">e</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">Clear</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">StartRelativeMSec</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Number</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Generation</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Type</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Reason</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">IsCompacting</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">SuspensionDuration</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">PauseDuration</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">BGCFinalPauseDuration</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Gen0Size</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Gen1Size</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">Gen2Size</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">LOHSize</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeBefore</span><span class="p">[</span><span class="m">0</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeBefore</span><span class="p">[</span><span class="m">1</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeBefore</span><span class="p">[</span><span class="m">2</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeBefore</span><span class="p">[</span><span class="m">3</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeAfter</span><span class="p">[</span><span class="m">0</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeAfter</span><span class="p">[</span><span class="m">1</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0},&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeAfter</span><span class="p">[</span><span class="m">2</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">    <span class="n">_line</span><span class="p">.</span><span class="n">AppendFormat</span><span class="p">(</span><span class="s">&#34;{0}&#34;</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">ObjSizeAfter</span><span class="p">[</span><span class="m">3</span><span class="p">].</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">WriteLine</span><span class="p">(</span><span class="n">_line</span><span class="p">.</span><span class="n">ToString</span><span class="p">());</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each garbage collection appears as a textual line with the following columns separated by a comma:</p>
<p><img loading="lazy" src="/posts/2019-02-12_building-your-own-java/1_NmJaV_9pZrRCe-Le44koGA.png"></p>
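<p>One thing to watch for when producing such a comma-separated file: calling <code>ToString()</code> on a floating-point value uses the current culture, so on a machine where the decimal separator is a comma, a duration would be split into two columns. The following sketch is not part of the original code (the <code>CsvLine</code> class and its <code>Build</code> method are hypothetical names) and shows one possible way to format every value with the invariant culture, assuming the values to write are numbers.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Globalization;

public static class CsvLine
{
    // Joins the values with commas, formatting each one with the invariant
    // culture so that 12.5 never becomes &#34;12,5&#34; on a machine whose
    // current culture uses a comma as decimal separator.
    public static string Build(params IFormattable[] values)
    {
        var fields = new string[values.Length];
        for (int i = 0; i &lt; values.Length; i++)
        {
            fields[i] = values[i].ToString(null, CultureInfo.InvariantCulture);
        }

        return string.Join(&#34;,&#34;, fields);
    }
}
</code></pre></div>
<p>For example, <code>CsvLine.Build(1234.5, 42, 0, 1.25)</code> returns <code>1234.5,42,0,1.25</code> whatever the current culture is. Values that do not implement <code>IFormattable</code>, such as a <code>bool</code>, would still need their own <code>ToString()</code> call.</p>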
<p>The last twelve pieces of information require some explanation:</p>
<p>· <strong>xxxBefore</strong>: size of a generation before the collection; without free list</p>
<p>· <strong>xxxAfter</strong>: size of a generation after the collection; without free list</p>
<p>· <strong>xxxSize</strong>: size of a generation (including LOH) after the collection; including free list (i.e. fragmentation)</p>
<p>The computation of these sizes relies on inner fields of the <code>TraceGC</code> argument received from TraceEvent. The <strong>xxxSize</strong> values are grouped in the <strong>GenerationSize0/1/2/3</strong> fields of the <code>HeapStats</code> property. It is a little bit more complicated for the <strong>Before</strong>/<strong>After</strong> sizes. The Garbage Collector keeps track of these numbers in the <code>PerHeapHistories</code> field: an array of <code>GCPerHeapHistory</code> elements; one per heap (i.e. one per core for server GC). The next level is provided by the <code>GenData</code> field storing an array of <code>GCPerHeapHistoryGenData</code> elements; one per generation, with the LOH at the last index, 3. So, to compute the size of each generation, you need to iterate over each heap and sum the per-generation values:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kt">long</span><span class="p">[]</span> <span class="n">GetGenerationSizes</span><span class="p">(</span><span class="n">TraceGC</span> <span class="n">gc</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">before</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">sizes</span> <span class="p">=</span> <span class="k">new</span> <span class="kt">long</span><span class="p">[</span><span class="m">4</span><span class="p">];</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">gc</span><span class="p">.</span><span class="n">PerHeapHistories</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="n">sizes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">heap</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">heap</span> <span class="p">&lt;</span> <span class="n">gc</span><span class="p">.</span><span class="n">PerHeapHistories</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span> <span class="n">heap</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="c1">// LOH = 3</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">gen</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">gen</span> <span class="p">&lt;=</span> <span class="m">3</span><span class="p">;</span> <span class="n">gen</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">sizes</span><span class="p">[</span><span class="n">gen</span><span class="p">]</span> <span class="p">+=</span> <span class="n">before</span> <span class="p">?</span> 
</span></span><span class="line"><span class="cl">                <span class="n">gc</span><span class="p">.</span><span class="n">PerHeapHistories</span><span class="p">[</span><span class="n">heap</span><span class="p">].</span><span class="n">GenData</span><span class="p">[</span><span class="n">gen</span><span class="p">].</span><span class="n">ObjSpaceBefore</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="n">gc</span><span class="p">.</span><span class="n">PerHeapHistories</span><span class="p">[</span><span class="n">heap</span><span class="p">].</span><span class="n">GenData</span><span class="p">[</span><span class="n">gen</span><span class="p">].</span><span class="n">ObjSizeAfter</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">sizes</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>GetGenerationSizes</code> helper method does exactly that: it sums the values of either <code>ObjSpaceBefore</code> or <code>ObjSizeAfter</code> across all heaps.</p>
<p>As you have probably noticed from the implementation, it is possible that the <code>PerHeapHistories</code> field is not filled in, in which case all <strong>Before</strong>/<strong>After</strong> values are zero. This happens for a background gen2 collection. Also note that for gen0 and gen1 collections, the values for gen2 and LOH are also 0 (it makes sense that gen2 and LOH do not change during such a collection).</p>
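<p>To make the aggregation concrete, here is a minimal, self-contained sketch that does not depend on TraceEvent. The <code>SumPerGeneration</code> name and the jagged array of per-heap sizes are hypothetical stand-ins for <code>PerHeapHistories</code> and <code>GenData</code>, and the numbers are made up for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;

static class GenerationSizeSketch
{
    // perHeap[heap][gen]: hypothetical per-heap sizes in bytes (gen 3 = LOH).
    public static long[] SumPerGeneration(long[][] perHeap)
    {
        var sizes = new long[4];
        if (perHeap == null)
        {
            // Same convention as GetGenerationSizes: no per-heap data means all zeros.
            return sizes;
        }

        foreach (var heap in perHeap)
        {
            for (int gen = 0; gen &lt;= 3; gen++)
            {
                sizes[gen] += heap[gen];
            }
        }

        return sizes;
    }

    public static void Main()
    {
        // Two heaps, as with server GC on a two-core machine.
        var perHeap = new[]
        {
            new long[] { 100, 200, 3000, 500 },
            new long[] { 150, 250, 4000, 0 },
        };

        Console.WriteLine(string.Join(&#34;, &#34;, SumPerGeneration(perHeap)));
        // prints: 250, 450, 7000, 500
    }
}
</code></pre></div>
<p>Passing <code>null</code> returns four zeros, which mirrors what you get for a background gen2 collection.</p>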
<h2 id="a-little-bit-ofui">A little bit of UI</h2>
<p>Now that a .csv file containing the details of all garbage collections is available, it is time to provide some UI on top of it such as the following for GC pauses:</p>
<p><img loading="lazy" src="/posts/2019-02-12_building-your-own-java/1_nd462AscBlWil52op5ObsA.png"></p>
<p>Let’s start with what Excel champions can get:</p>
<ul>
<li><strong>Generation ratio</strong></li>
</ul>
<p><img loading="lazy" src="/posts/2019-02-12_building-your-own-java/1_CE3TCA5erE0a4k73kE9Khw.png"></p>
<ul>
<li><strong>Sizes of generations including Large Object Heap</strong></li>
</ul>
<p><img loading="lazy" src="/posts/2019-02-12_building-your-own-java/1_D1rIu1x65M0B9DfO_kvr6w.png"></p>
<ul>
<li><strong>Top 10 pauses (including suspension time comparison)</strong></li>
</ul>
<p><img loading="lazy" src="/posts/2019-02-12_building-your-own-java/1_-cXtvizxi1XFJAsKmuOaaA.png"></p>
<p>But you can get better interaction thanks to <a href="https://twitter.com/jpbempel">Jean-Philippe</a>. My colleague <a href="https://github.com/jpbempel/gclogs-analyzer">adapted his script for JVM to my .NET GC log .csv format</a>: it generates some nice zoomable HTML UI.</p>
<p>This short video above shows the heap evolution during ~20 minutes. Then, as this is an interactive HTML page you can focus on gen2 and LOH impact on memory consumption.</p>
<p>For the pause time graph over the same period, it is very easy to detect long pauses (even for gen0 collections) and zoom into a smaller period to figure out the impact of different collections.</p>
<p>The code available on <a href="https://github.com/chrisnas/ClrEvents">Github</a> has been updated to make the <strong>EtwGcLog</strong> class available to you.</p>
]]></content:encoded></item><item><title>Spying on .NET Garbage Collector with TraceEvent</title><link>https://chrisnas.github.io/posts/2018-12-15_spying-on-net-garbage/</link><pubDate>Sat, 15 Dec 2018 11:08:01 +0000</pubDate><guid>https://chrisnas.github.io/posts/2018-12-15_spying-on-net-garbage/</guid><description>This post of the series focuses on CLR events related to garbage collection in .NET.</description><content:encoded><![CDATA[<hr>
<p>This post of the series focuses on CLR events related to garbage collection in .NET.</p>
<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<p>Part 3: <a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">CLR Threading events with TraceEvent</a>.</p>
<h2 id="introduction">Introduction</h2>
<p>The allocator and garbage collector components of the CLR may have a real impact on the performance of your application. The Book of the Runtime describes the allocator/collector design goals in the must-read <a href="https://github.com/dotnet/coreclr/blob/master/Documentation/botr/garbage-collection.md">Garbage Collection Design page</a> written by Maoni Stephens, lead developer of the GC. In addition, Microsoft provides extensive <a href="https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/?WT.mc_id=DT-MVP-5003325">garbage collection documentation</a>. And if you want more details about the .NET garbage collector, take a look at <a href="https://www.amazon.com/Pro-NET-Memory-Management-Performance/dp/148424026X">Pro .NET Memory Management</a> by <a href="https://twitter.com/konradkokosa">Konrad Kokosa</a>. In this post, I will focus on the events emitted by the CLR and how you could use them to better understand how your application behaves with regard to memory consumption.</p>
<p>The impact on how your application behaves is mostly related to a couple of topics:</p>
<ul>
<li><em>How many times and how long your threads get suspended during a collection</em><br>
Desktop applications and games provide fluid user interfaces where glitches are less and less acceptable. At the opposite end of the spectrum, low latency server applications have short SLAs to answer each request. In both cases, applications cannot afford to freeze for too long while the high priority GC threads are cleaning up the .NET heaps for background GCs or blocking non-concurrent GCs.</li>
<li><em>How much memory is dedicated to your process</em><br>
With the rise of containers and their quotas, your application needs to trim down its memory consumption. For example, with server GC enabled, the amount of memory used by your application could grow large (depending on the number of cores) before a gen 0 collection kicks in (read <a href="https://github.com/aspnet/AspNetCore/issues/3409">this discussion</a> about real world cases, including the StackOverflow web site, and the possible solutions).
The memory pressure on the system is also taken into account by the GC and could lead to more collections being triggered (read Maoni Stephens’ blog post about <a href="https://devblogs.microsoft.com/dotnet/running-with-server-gc-in-a-small-container-scenario-part-0/?WT.mc_id=DT-MVP-5003325">how Windows jobs are taken into account by the GC and how to leverage them if needed</a>). It becomes more and more important to detect leaks and memory consumption spikes.</li>
</ul>
<p>In the previous post, you saw how to get the type name of instances being finalized. The CLR provides many more events related to memory management. They definitely help to understand the interactions between this crucial part of .NET and your own code. In this article, you will see how to replace <a href="/posts/2018-06-19_replace-net-performance-counters//">the not always consistent performance counters</a> such as generation sizes or collection counts. More importantly, you will get very useful metrics such as the type of GC (foreground or background) and the suspension time of your application threads.</p>
<h2 id="sequences-of-events-during-garbage-collection-phases">Sequences of events during Garbage Collection phases</h2>
<p>Ephemeral collections (of generation 0 and 1) are called “stop-the-world”: your application threads will be frozen during the whole collection. For generation 2 background collections, it is a little bit more complicated, as shown in the following figure (courtesy of <a href="https://twitter.com/konradkokosa">Konrad Kokosa</a>, from <a href="https://www.amazon.com/Pro-NET-Memory-Management-Performance/dp/148424026X">his book</a>):</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_INTuAJqcsWDbp8XM1ZtbMA.png"></p>
<p>The application threads are frozen during different phases:</p>
<ul>
<li>Initial internal step at the beginning of the collection,</li>
<li>At the end of the marking phase to reconcile the changes (allocations, reference updates) made while background collection threads are running (also if compaction is needed). Look for documentation about card table usage to get more details,</li>
<li>If a compaction occurs.</li>
</ul>
<p>Please read <a href="https://devblogs.microsoft.com/premier-developer/understanding-different-gc-modes-with-concurrency-visualizer/?WT.mc_id=DT-MVP-5003325">Understanding different GC modes with Concurrency Visualizer</a> to go deeper and blog posts from <a href="http://mattwarren.org/2017/01/13/Analysing-Pause-times-in-the-.NET-GC/">Matt</a> <a href="http://mattwarren.org/2016/06/20/Visualising-the-dotNET-Garbage-Collector/">Warren</a> and <a href="https://devblogs.microsoft.com/dotnet/gc-etw-events-2/?WT.mc_id=DT-MVP-5003325">Maoni Stephens</a> about GC pauses.</p>
<h2 id="what-are-the-available-garbage-collections-metrics">What are the available garbage collections metrics?</h2>
<p>The <a href="https://github.com/Microsoft/perfview/blob/master/documentation/Downloading.md">Perfview tool</a> could help you analyze how many garbage collections occurred and for which reason. Select Run in the Collect menu and click the Run Command button.</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_3UbnAgXjKzwMZqbrHQw_0Q.png"></p>
<p>You could also start collecting after the application is started with Collect | Collect. When you want to stop collecting information, click the Stop Collection button. When the .etl file gets generated, go to the GCStats node:</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_fJ0jJD4GejWKNO_4kaXuuA.png"></p>
<p>Look for your application to get statistics related to garbage collections. The first <em><strong>GC Rollup By Generation</strong></em> table gives you high level metrics such as the number of collections per generation and the mean pause time:</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_ebel0SDOuULdvUjs63MmQw.png"></p>
<p>The next two sections list the collections with a pause time longer than 200ms before the section that lists all generation 2 collections:</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_8AoASNYsf0FYNAAJTqWJpQ.png"></p>
<p>The <em><strong>Suspend Msec</strong></em> column gives you the time it took to suspend your application threads while <em><strong>Pause MSec</strong></em> counts the time during which your threads were actually suspended.</p>
<p>In addition to this, memory details such as the size of all generations after each collection are available:</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_mqZSpPC1ZEEsDkkpCxfrEw.png"></p>
<p>However, my goal is to get these details to feed monitoring dashboards <strong>as the application runs</strong>. I can’t use Perfview but I can still rely on the same CLR events.</p>
<h2 id="a-solution-for-runtimeplease">A solution for runtime please!</h2>
<p>Since version 2 of TraceEvent, there is an easy way to get already computed metrics about GC as <a href="https://devblogs.microsoft.com/dotnet/glad-part-2/?WT.mc_id=DT-MVP-5003325">described by Maoni Stephens</a>. It relies on the same code as Perfview for its <em>GCStats</em> window.</p>
<p>You only need to subscribe to two events; one when a GC starts and one when it ends:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">var</span> <span class="n">source</span> <span class="p">=</span> <span class="n">userSession</span><span class="p">.</span><span class="n">Source</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">NeedLoadedDotNetRuntimes</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">source</span><span class="p">.</span><span class="n">AddCallbackOnProcessStart</span><span class="p">((</span><span class="n">TraceProcess</span> <span class="n">proc</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">proc</span><span class="p">.</span><span class="n">AddCallbackOnDotNetRuntimeLoad</span><span class="p">((</span><span class="n">TraceLoadedDotNetRuntime</span> <span class="n">runtime</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">runtime</span><span class="p">.</span><span class="n">GCStart</span> <span class="p">+=</span> <span class="p">(</span><span class="n">TraceProcess</span> <span class="n">p</span><span class="p">,</span> <span class="n">TraceGC</span> <span class="n">gc</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// a GC is starting</span>
</span></span><span class="line"><span class="cl">        <span class="p">};</span>
</span></span><span class="line"><span class="cl">        <span class="n">runtime</span><span class="p">.</span><span class="n">GCEnd</span> <span class="p">+=</span> <span class="p">(</span><span class="n">TraceProcess</span> <span class="n">p</span><span class="p">,</span> <span class="n">TraceGC</span> <span class="n">gc</span><span class="p">)</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">        <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="c1">// a GC ends</span>
</span></span><span class="line"><span class="cl">        <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="p">});</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>TraceGC</strong> class provides many details that are beyond the scope of this post, but here are the main fields to use in a <strong>GCEnd</strong> event handler to monitor your applications:</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_DlGRChntSn43hNI6RPkjVw.png"></p>
<p>Note that the <strong>IsNotCompacting</strong> method <a href="https://github.com/Microsoft/perfview/issues/811">currently returns an invalid value</a>.</p>
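<p>As an illustration, a <strong>GCEnd</strong> handler could log a few of these values as follows. This is only a sketch: the <code>GcEndLogger</code> class is hypothetical, and the member names used here (<code>Number</code>, <code>Generation</code>, <code>Type</code>, <code>Reason</code>, <code>PauseDurationMSec</code>, <code>SuspendDurationMSec</code>, <code>HeapSizeAfterMB</code>) are the ones I expect <strong>TraceGC</strong> to expose, so check them against the version of TraceEvent you reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using Microsoft.Diagnostics.Tracing.Analysis;     // TraceProcess
using Microsoft.Diagnostics.Tracing.Analysis.GC;  // TraceGC

static class GcEndLogger
{
    // Hypothetical handler; plug it in with: runtime.GCEnd += GcEndLogger.OnGcEnd;
    public static void OnGcEnd(TraceProcess process, TraceGC gc)
    {
        Console.WriteLine(
            $&#34;{process.Name} (pid {process.ProcessID}): GC #{gc.Number} gen{gc.Generation} &#34; +
            $&#34;type={gc.Type} reason={gc.Reason} &#34; +
            $&#34;pause={gc.PauseDurationMSec:F2} ms suspension={gc.SuspendDurationMSec:F2} ms &#34; +
            $&#34;heap after={gc.HeapSizeAfterMB:F1} MB&#34;);
    }
}
</code></pre></div>
<p>In a monitoring service, you would send these values to your metrics pipeline instead of writing them to the console.</p>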
<h2 id="final-words">Final words</h2>
<p>I would like to mention one last event related to memory management. The <strong>GCAllocationTick</strong> CLR event (mapped by the <strong>ClrTraceEventParser.GCAllocationTick</strong> event) is emitted after ~100 KB has been allocated by your application. As you can infer from the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events#gcallocationtick_v2-event?WT.mc_id=DT-MVP-5003325">Microsoft documentation</a>, the <strong>GCAllocationTickTraceData</strong> argument received by your handler provides the following properties:</p>
<p><img loading="lazy" src="/posts/2018-12-15_spying-on-net-garbage/1_MeqPtEj3AD86o5-1B5gb5A.png"></p>
<p>As you can see, listening to this <strong>GCAllocationTick</strong> event gives you a sampling of the allocations made in your application. This is not as precise as what a .NET profiler (relying on expensive <a href="https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/icorprofilercallback-objectallocated-method?WT.mc_id=DT-MVP-5003325">ObjectAllocated</a> and ObjectsAllocatedByClass <strong>ICorProfilerCallback</strong> hooks) would provide but it is much less intrusive. However, I would not recommend systematically listening to this event in production, especially if your application is allocating GBs of memory per minute. Unlike what the documentation states, you need to set the verbosity to <strong>TraceEventLevel.Verbose</strong> (and not <strong>Informational</strong>) when you enable the CLR provider and this could impact your application performance due to the high number of emitted CLR events.</p>
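<p>Here is a minimal sketch of such a listener, assuming the TraceEvent NuGet package and an elevated process on Windows. The session name is hypothetical, and filtering on large allocations is just one possible use of the sampled data:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using Microsoft.Diagnostics.Tracing;              // TraceEventLevel
using Microsoft.Diagnostics.Tracing.Parsers;      // ClrTraceEventParser
using Microsoft.Diagnostics.Tracing.Parsers.Clr;  // GCAllocationTickTraceData, GCAllocationKind
using Microsoft.Diagnostics.Tracing.Session;      // TraceEventSession

static class AllocationTickSketch
{
    public static void Main()
    {
        // &#34;AllocationTickSketch&#34; is a hypothetical session name.
        using (var session = new TraceEventSession(&#34;AllocationTickSketch&#34;))
        {
            // Verbose level is needed to receive GCAllocationTick events.
            session.EnableProvider(
                ClrTraceEventParser.ProviderGuid,
                TraceEventLevel.Verbose,
                (ulong)ClrTraceEventParser.Keywords.GC);

            session.Source.Clr.GCAllocationTick += (GCAllocationTickTraceData data) =&gt;
            {
                // Only keep the samples triggered by an allocation in the LOH.
                if (data.AllocationKind != GCAllocationKind.Large)
                {
                    return;
                }

                Console.WriteLine(
                    $&#34;pid {data.ProcessID}: ~{data.AllocationAmount64} bytes allocated, last type = {data.TypeName}&#34;);
            };

            // Blocks until the session is stopped.
            session.Source.Process();
        }
    }
}
</code></pre></div>
<p>Keep in mind that the reported type is the one of the allocation that crossed the threshold, not of everything allocated since the previous event.</p>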
<p>This event could be very helpful in case of unusual LOH allocations because you would get the type of the objects in the LOH almost each time (the 85,000 bytes threshold is close to the 100 KB trigger limit) or simply to have a hint on the most allocated types over time. Note that you won’t get the callstack leading to the allocations triggering the event. Instead, for memory leak or memory usage analysis, I would definitely recommend using Perfview. Vance Morrison has published a series of videos that detail <a href="https://channel9.msdn.com/Series/PerfView-Tutorial/PerfView-Tutorial-9-NET-Memory-Investigation-Basics-of-GC-Heap-Snapshots?WT.mc_id=DT-MVP-5003325">.NET memory investigations</a>, <a href="https://channel9.msdn.com/Series/PerfView-Tutorial/Tutorial-10-Investigating-NET-Heap-Memory-Leaks-Part1-Collecting-the-data?WT.mc_id=DT-MVP-5003325">collecting the data</a> and <a href="https://channel9.msdn.com/Series/PerfView-Tutorial/Tutorial-11-Investigating-NET-Heap-Memory-Leaks-Part2-Analyzing-the-data?WT.mc_id=DT-MVP-5003325">analyzing the data</a> with Perfview. You will also find a lot of detailed memory-related investigation guidelines in <a href="https://www.amazon.com/Pro-NET-Memory-Management-Performance/dp/148424026X">Konrad Kokosa’s book</a>.</p>
<p>You now have a complete view of the CLR events that help to understand the different phases of a garbage collection and a few interactions (suspension) with the Execution Engine. You have everything in hand to replace the performance counters with CLR events: the metrics are more accurate and you get access to more information such as suspension time or contention time. The code presented in all episodes is available <a href="https://github.com/chrisnas/ClrEvents">on Github</a> with an easy to reuse <strong>ClrEventManager</strong> class that you could plug into your own applications or monitoring service!</p>
]]></content:encoded></item><item><title>[C#] Get-process-name challenge on a Friday afternoon</title><link>https://chrisnas.github.io/posts/2018-11-13_get-process-name-challenge/</link><pubDate>Tue, 13 Nov 2018 10:29:22 +0000</pubDate><guid>https://chrisnas.github.io/posts/2018-11-13_get-process-name-challenge/</guid><description>Unexpected CPU consumption</description><content:encoded><![CDATA[<hr>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_CDn3N44B8tI1cCzL3G-Qbw.png"></p>
<h2 id="unexpected-cpu-consumption">Unexpected CPU consumption</h2>
<p>At Criteo, CLR metrics are collected by a service that listens to ETW events (<a href="/posts/2018-09-28_monitor-finalizers-contention-threads/">see the related series</a>). This metrics collector is given the process name of applications to monitor. Since applications could crash, be stopped or restarted, the metrics collector must be able to detect such an event. The previous implementation was using ETW kernel events (TraceEvent <code>ProcessStart</code> and <code>ProcessStop</code> events from <code>ETWTraceEventSource.Kernel</code>). However, in rare cases, it seems that a new application start was not detected and therefore the metrics were not collected for it.</p>
<p>An easy fix for this situation is to poll the list of running processes every second and detect which ones are new or have exited since the last time the list was polled. The implementation is straightforward: just call <code>Process.GetProcesses()</code> and get the process name from the <code>Process.MainModule.FileName</code> property. After a few seconds testing this implementation on my laptop the fan started spinning: a quick look at Task Manager shows that the metrics collector is using ~10% CPU time!</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_D9NeWlaSCi8aDtJms3LujA.png"></p>
<p>I used the PSAPI functions that are P/Invoked here 20 years ago, but I don’t remember such an impact. For our monitoring service, we would like to keep the CPU impact below 1%.</p>
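<p>The polling idea described above can be sketched as follows. This is not the code of the metrics collector: the <code>ProcessPoller</code> class and its methods are hypothetical names, and the sketch only tracks process ids because resolving each path through <code>MainModule</code> is precisely the costly part investigated next:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

static class ProcessPoller
{
    // Ids only in 'current' are started processes; ids only in 'previous' have exited.
    public static (List&lt;int&gt; Started, List&lt;int&gt; Exited) Diff(ISet&lt;int&gt; previous, ISet&lt;int&gt; current)
    {
        var started = current.Where(pid =&gt; !previous.Contains(pid)).ToList();
        var exited = previous.Where(pid =&gt; !current.Contains(pid)).ToList();
        return (started, exited);
    }

    // Takes a snapshot of the ids of the running processes.
    public static HashSet&lt;int&gt; Snapshot()
    {
        var pids = new HashSet&lt;int&gt;();
        foreach (var process in Process.GetProcesses())
        {
            using (process)
            {
                pids.Add(process.Id);
            }
        }
        return pids;
    }

    public static void Main()
    {
        var previous = Snapshot();
        while (true)
        {
            Thread.Sleep(1000);
            var current = Snapshot();
            var (started, exited) = Diff(previous, current);
            foreach (var pid in started) Console.WriteLine($&#34;+ {pid}&#34;);
            foreach (var pid in exited) Console.WriteLine($&#34;- {pid}&#34;);
            previous = current;
        }
    }
}
</code></pre></div>
<p>Since Windows can reuse process ids, a process that exits and whose id is reassigned between two polls would go unnoticed with this simple comparison.</p>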
<h2 id="measure-measure-andprofile">Measure, measure… and profile</h2>
<p>This was a good opportunity to start profiling the metrics collector with dotTrace on a Friday afternoon!</p>
<p><code>NtProcessManager.GetModuleInfos</code> and <code>NtProcessManager.GetProcessIds</code> are at the top of the list of CPU-consuming methods, the worst offenders by far:</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_bXsVZkRy-W_33QDcKFXWCA.png"></p>
<p>The callstack to reach <code>GetModuleInfos()</code> shows the following:</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_rDChwSQuaJdUPbaJ7XDanw.png"></p>
<p>The <code>GetProcessPath()</code> method of the metrics collector is asking for the value of <code>Process.MainModule.FileName</code> property that ends up calling <code>GetModuleInfos</code>.</p>
<p>And the callstack for the <code>MainModule</code> getter execution shows the following:</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_CDn3N44B8tI1cCzL3G-Qbw.png"></p>
<p>This call stack looks weird for two reasons:</p>
<ul>
<li>Since the <code>Process</code> object exists, why is it necessary to call <code>OpenProcess</code> again to get the main module?</li>
<li>Why would <code>OpenProcess</code> need to call <code>GetProcessIds</code> (i.e. get the list of running processes) since its id is already known?!</li>
</ul>
<p>Just take a look at the decompiled source code to get the answer:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="n">SafeProcessHandle</span> <span class="n">OpenProcess</span><span class="p">(</span><span class="kt">int</span> <span class="n">processId</span><span class="p">,</span> <span class="kt">int</span> <span class="n">access</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">throwIfExited</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="n">SafeProcessHandle</span> <span class="n">safeProcessHandle</span> <span class="p">=</span> <span class="n">NativeMethods</span><span class="p">.</span><span class="n">OpenProcess</span><span class="p">(</span><span class="n">access</span><span class="p">,</span> <span class="kc">false</span><span class="p">,</span> <span class="n">processId</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="kt">int</span> <span class="n">lastWin32Error</span> <span class="p">=</span> <span class="n">Marshal</span><span class="p">.</span><span class="n">GetLastWin32Error</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">  <span class="k">if</span> <span class="p">(!</span><span class="n">safeProcessHandle</span><span class="p">.</span><span class="n">IsInvalid</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">safeProcessHandle</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  
</span></span><span class="line"><span class="cl">  <span class="c1">// error handling </span>
</span></span><span class="line"><span class="cl">  <span class="k">if</span> <span class="p">(</span><span class="n">processId</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">throw</span> <span class="k">new</span> <span class="n">Win32Exception</span><span class="p">(</span><span class="m">5</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="k">if</span> <span class="p">(</span><span class="n">ProcessManager</span><span class="p">.</span><span class="n">IsProcessRunning</span><span class="p">(</span><span class="n">processId</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="k">throw</span> <span class="k">new</span> <span class="n">Win32Exception</span><span class="p">(</span><span class="n">lastWin32Error</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">...</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And <code>IsProcessRunning</code> calls <code>GetProcessIds</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="kt">bool</span> <span class="n">IsProcessRunning</span><span class="p">(</span><span class="kt">int</span> <span class="n">processId</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">return</span> <span class="n">IsProcessRunning</span><span class="p">(</span><span class="n">processId</span><span class="p">,</span> <span class="n">GetProcessIds</span><span class="p">());</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It does not appear in the callstack most probably because it was inlined by the JIT.</p>
<p>So, the code of <code>ProcessManager.OpenProcess</code> calls the Win32 <code>OpenProcess</code> API to get… the handle of the process corresponding to the given id and desired access rights. From there, it spends most of its CPU time dealing with error cases (i.e. when the information of a process cannot be accessed, maybe due to access right limitations). We definitely don’t need all that in our case!</p>
<h2 id="what-next">What next?</h2>
<p>At that point of the investigation, my colleague <a href="https://twitter.com/KooKiz">Kevin</a> and I went in different directions. A few decades ago, I spent a lot of time digging into Windows internals using Win32 APIs and I remember that calling <code>PSAPI.GetModuleFileNameEx</code> with a process handle and 0 as module handle should return the path name of the process (BTW, this is also what <code>GetModuleInfos</code> ends up calling but more on that later). So it should not be too complicated to P/Invoke this function from PSAPI.dll.</p>
<p>In the early days of .NET programming, the <a href="https://pinvoke.net/">https://pinvoke.net/</a> web site was very useful to figure out the right syntax for a lot of APIs if you did not want to read the 1579 pages of the <a href="https://www.amazon.com/NET-COM-Complete-Interoperability-Guide-ebook/dp/B003AYZB7U">Complete Interoperability Guide</a>! The description of <code>GetModuleFileNameEx</code> is <a href="https://pinvoke.net/default.aspx/psapi/GetModuleFileNameEx.html">available</a>, and even comes with a code sample.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="na">[DllImport(&#34;psapi.dll&#34;, BestFitMapping = false, CharSet = CharSet.Auto, SetLastError = true)]</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="kd">extern</span> <span class="kt">int</span> <span class="n">GetModuleFileNameEx</span><span class="p">(</span><span class="n">SafeProcessHandle</span> <span class="n">processHandle</span><span class="p">,</span> <span class="n">IntPtr</span> <span class="n">moduleHandle</span><span class="p">,</span> <span class="n">StringBuilder</span> <span class="n">baseName</span><span class="p">,</span> <span class="kt">int</span> <span class="n">size</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that marshaling strings should always be done with care: the Win32 API is not always consistent when a pointer to a C-like string is supposed to be filled up by a function. An additional parameter is given to state the size of the buffer in which the characters of the string will be copied. In some cases, this parameter counts the number of characters and in others, it counts the number of bytes available in the buffer. I’ll let you imagine what a nightmare it was when you had to deal with ANSI/UNICODE strings. In the <code>GetModuleFileNameEx</code> case, the size parameter takes <a href="https://docs.microsoft.com/en-us/windows/win32/api/psapi/nf-psapi-getmodulefilenameexw?WT.mc_id=DT-MVP-5003325">the number of characters</a>.</p>
<p>If you take a look at the <code>NtProcessManager.GetModuleInfos</code> implementation, you find the following code:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">StringBuilder</span> <span class="n">stringBuilder2</span> <span class="p">=</span> <span class="k">new</span> <span class="n">StringBuilder</span><span class="p">(</span><span class="m">1024</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="p">(</span><span class="n">Microsoft</span><span class="p">.</span><span class="n">Win32</span><span class="p">.</span><span class="n">NativeMethods</span><span class="p">.</span><span class="n">GetModuleFileNameEx</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">safeProcessHandle</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">    <span class="k">new</span> <span class="n">HandleRef</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="n">handle</span><span class="p">),</span> 
</span></span><span class="line"><span class="cl">    <span class="n">stringBuilder2</span><span class="p">,</span> 
</span></span><span class="line"><span class="cl">    <span class="n">stringBuilder2</span><span class="p">.</span><span class="n">Capacity</span> <span class="p">*</span> <span class="m">2</span>
</span></span><span class="line"><span class="cl"> <span class="p">)</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since the capacity of the <code>StringBuilder</code> is set to 1024, this code tells <code>GetModuleFileNameEx</code> that it is allowed to write up to 2048 characters. This looks like a bug… but it is hard to trigger with the <a href="https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file">usual 260 characters limitation for filenames</a>. However, if, one day, you decide to use the extended syntax with the “\\?\” prefix to create a looooong folder for your application, the bug will trigger an <code>AccessViolationException</code> beyond 1025 characters.</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_GWNlpyBn9ykY2J6JVTTKNw.png"></p>
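<p>A possible explanation for the <code>* 2</code>, which is only my guess and not something the decompiled code confirms, is a characters-to-bytes conversion: a .NET <code>char</code> takes 2 bytes, so a 1024-character buffer is 2048 bytes long. The following minimal sketch (the <code>BufferSizes</code> class is a hypothetical name used for illustration) computes both values; only the character count is a valid size for <code>GetModuleFileNameEx</code>:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Text;

public static class BufferSizes
{
    public static void Main()
    {
        var buffer = new StringBuilder(1024);

        // what GetModuleFileNameEx expects: the number of characters
        int sizeInChars = buffer.Capacity;

        // what a chars-to-bytes conversion computes for UTF-16 characters
        int sizeInBytes = buffer.Capacity * sizeof(char);

        Console.WriteLine($&#34;{sizeInChars} characters = {sizeInBytes} bytes&#34;);
    }
}
</code></pre>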
<p>Here is my safer implementation:</p>
<pre tabindex="0"><code>// static so that the static method below can use it; a shared buffer is not thread-safe
private static readonly StringBuilder _baseNameBuilder = new StringBuilder(1024);
public static string GetProcessNameNative(Process p)
{
    _baseNameBuilder.Clear();
    if (GetModuleFileNameEx(p.SafeHandle, IntPtr.Zero, _baseNameBuilder, _baseNameBuilder.Capacity) == 0)
    {
        _baseNameBuilder.Append(&#34;???&#34;);
    }

    return _baseNameBuilder.ToString();
}
</code></pre><p>When I presented my oldie-but-goodie solution to Kevin, he told me that he had found a smarter one. While I was digging into my memories, he kept decompiling the implementation of the <code>Process</code> class and realized that it contains a <code>processInfo</code> private field:</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_ukveC2Y4V9Ls0ohF7Lk9Fw.png"></p>
<p>And the type of this field, an internal class, exposes a public field called… <code>processName</code>: exactly what we needed!</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_mAeJSykxpHEKbG25GTJSSQ.png"></p>
<p>So I was ready to implement a reflection-based solution like:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="n">Type</span> <span class="n">_processInfoType</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="n">FieldInfo</span> <span class="n">_processNameField</span> <span class="p">=</span> <span class="kc">null</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">static</span> <span class="kt">string</span> <span class="n">GetProcessNameByReflection</span><span class="p">(</span><span class="n">Process</span> <span class="n">p</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">processInfoField</span> <span class="p">=</span> <span class="k">typeof</span><span class="p">(</span><span class="n">System</span><span class="p">.</span><span class="n">Diagnostics</span><span class="p">.</span><span class="n">Process</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="p">.</span><span class="n">GetField</span><span class="p">(</span><span class="s">&#34;processInfo&#34;</span><span class="p">,</span> <span class="n">BindingFlags</span><span class="p">.</span><span class="n">Instance</span> <span class="p">|</span> <span class="n">BindingFlags</span><span class="p">.</span><span class="n">NonPublic</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">processInfo</span> <span class="p">=</span> <span class="n">processInfoField</span><span class="p">.</span><span class="n">GetValue</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">_processInfoType</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">_processInfoType</span> <span class="p">=</span> <span class="n">processInfo</span><span class="p">.</span><span class="n">GetType</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">        <span class="n">_processNameField</span> <span class="p">=</span> <span class="n">_processInfoType</span><span class="p">.</span><span class="n">GetField</span><span class="p">(</span><span class="s">&#34;processName&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">_processNameField</span><span class="p">.</span><span class="n">GetValue</span><span class="p">(</span><span class="n">processInfo</span><span class="p">).</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And Kevin was able to give me a definitely smarter solution based on compiled expressions:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="n">Func</span><span class="p">&lt;</span><span class="n">Process</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;</span> <span class="n">GetProcessNameAccessor</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">param</span> <span class="p">=</span> <span class="n">Expression</span><span class="p">.</span><span class="n">Parameter</span><span class="p">(</span><span class="k">typeof</span><span class="p">(</span><span class="n">Process</span><span class="p">),</span> <span class="s">&#34;arg&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">processInfoMember</span> <span class="p">=</span> <span class="n">Expression</span><span class="p">.</span><span class="n">Field</span><span class="p">(</span><span class="n">param</span><span class="p">,</span> <span class="s">&#34;processInfo&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">processNameMember</span> <span class="p">=</span> <span class="n">Expression</span><span class="p">.</span><span class="n">Field</span><span class="p">(</span><span class="n">processInfoMember</span><span class="p">,</span> <span class="s">&#34;processName&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kt">var</span> <span class="n">lambda</span> <span class="p">=</span> <span class="n">Expression</span><span class="p">.</span><span class="n">Lambda</span><span class="p">(</span><span class="k">typeof</span><span class="p">(</span><span class="n">Func</span><span class="p">&lt;</span><span class="n">Process</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;),</span> <span class="n">processNameMember</span><span class="p">,</span> <span class="n">param</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="p">(</span><span class="n">Func</span><span class="p">&lt;</span><span class="n">Process</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;)</span><span class="n">lambda</span><span class="p">.</span><span class="n">Compile</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">readonly</span> <span class="n">Func</span><span class="p">&lt;</span><span class="n">Process</span><span class="p">,</span> <span class="kt">string</span><span class="p">&gt;</span> <span class="n">_getProcessNameFunc</span> <span class="p">=</span> <span class="n">GetProcessNameAccessor</span> <span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="kd">public</span> <span class="kt">string</span> <span class="n">GetProcessNameByExpression</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">_getProcessNameFunc</span><span class="p">(</span><span class="n">_process</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>However, when I tested them on my laptop, I got null reference exceptions while they were working fine on Kevin’s machine… There was a tiny difference between us: I was calling <code>Process.GetProcessById</code> while Kevin was using <code>Process.GetProcesses</code> to get the <code>Process</code> instance. It looks like the two methods do not perform the same initialization. This kind of thing happens when you rely on undocumented implementation details… Also note that the .NET Core implementation is different (<a href="https://github.com/dotnet/corefx/blob/e34fa6ac5fcc49be5cb22f46119c6d99219483b6/src/System.Diagnostics.Process/src/System/Diagnostics/ProcessManager.Win32.cs">and does not contain the string length bug</a>).</p>
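<p>One way to protect against this kind of surprise is to treat every reflection step as optional and to fall back to the documented <code>Process.ProcessName</code> property. The sketch below assumes the <code>processInfo</code> and <code>processName</code> field names seen in the decompiled .NET Framework code; <code>ProcessNameReader</code> and <code>GetProcessNameOrFallback</code> are hypothetical names, and the fallback is the slow path that we wanted to avoid in the first place:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Diagnostics;
using System.Reflection;

public static class ProcessNameReader
{
    // undocumented implementation detail: null if the runtime has no such field
    private static readonly FieldInfo ProcessInfoField =
        typeof(Process).GetField(&#34;processInfo&#34;, BindingFlags.Instance | BindingFlags.NonPublic);

    public static string GetProcessNameOrFallback(Process p)
    {
        // null when the field is missing or was not initialized for this instance
        object processInfo = ProcessInfoField?.GetValue(p);
        FieldInfo nameField = processInfo?.GetType().GetField(&#34;processName&#34;);
        string name = nameField?.GetValue(processInfo) as string;

        // slower but documented way to get the name
        return name ?? p.ProcessName;
    }
}
</code></pre>
<p>With this shape, a <code>Process</code> instance coming from <code>Process.GetProcessById</code> no longer throws: it simply pays the cost of the regular property.</p>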
<h2 id="comparing-the-different-solutions">Comparing the different solutions</h2>
<p>So which solution should I pick for our metrics collector?</p>
<p>It’s time to do some benchmarking thanks to <a href="https://github.com/dotnet/BenchmarkDotNet">BenchmarkDotNet</a> and the results differ by orders of magnitude!</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_yg7dKTQt9vBfkqix21GNlA.png"></p>
<p>The undisputed winner is the solution based on expressions and the worst one is… the initial implementation, which does not even fall into the error case during our tests!</p>
<p>After updating the implementation with the compiled expression-based solution, the CPU usage of our metrics collector seems more reasonable:</p>
<p><img loading="lazy" src="/posts/2018-11-13_get-process-name-challenge/1_oyIUliDIkvCo8S3hCzLTRA.png"></p>
<p>It is now a good time to go back home… to write this article :^)</p>
]]></content:encoded></item><item><title>Monitor Finalizers, contention and threads in your application</title><link>https://chrisnas.github.io/posts/2018-09-28_monitor-finalizers-contention-threads/</link><pubDate>Fri, 28 Sep 2018 00:00:00 +0000</pubDate><guid>https://chrisnas.github.io/posts/2018-09-28_monitor-finalizers-contention-threads/</guid><description>This post of the series details more complicated CLR events related to finalizers and threading.</description><content:encoded><![CDATA[<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>Part 2: <a href="/posts/2018-07-26_grab-etw-session-providers/">Grab ETW Session, Providers and Events</a>.</p>
<h2 id="introduction">Introduction</h2>
<p>In the previous post, you saw how the TraceEvent nuget helps you decipher simple ETW events such as the one emitted when a first chance exception happens. Most situations trigger more than one event, which could make their processing more complicated.</p>
<h2 id="who-said-finalizer">Who said Finalizer?</h2>
<p>In the early days of .NET, you might have had to deal with native resources that you were responsible for cleaning up with the related unmanaged API or legacy COM component. It was a best practice to implement a ~finalizer method to ensure that everything was deleted the right way. These times are over for most of us now. If you don’t have an <strong>IntPtr</strong> field in your class, chances are that you don’t need a ~finalizer method.</p>
<p>The <a href="https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/implementing-dispose?WT.mc_id=DT-MVP-5003325">Microsoft documentation about IDisposable/Finalizer</a> often leads people to implement both even though only <strong>IDisposable</strong> is needed (i.e. some fields of the class implement <strong>IDisposable</strong>). Having a large number of finalizers could impact memory consumption by keeping objects alive for a longer time and maybe even increase the total duration of garbage collections. Last but not least, some finalizer code outside of your code base could “block” on locks during its cleanup and… drastically slow down everything else.</p>
<p>Getting the name of these types with TraceEvent is a two-step process. First, a <strong>TypeBulkType</strong> event is received: it contains a list of <strong>GCBulkTypeValues</strong>, each binding a <strong>TypeID</strong> integer to a string type name:</p>
<p><strong>OnTypeBulkType.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnTypeBulkType</span><span class="p">(</span><span class="n">GCBulkTypeTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span> <span class="p">!=</span> <span class="n">_processId</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// keep track of the id/name type associations</span>
</span></span><span class="line"><span class="cl">   <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">currentType</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">currentType</span> <span class="p">&lt;</span> <span class="n">data</span><span class="p">.</span><span class="n">Count</span><span class="p">;</span> <span class="n">currentType</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">GCBulkTypeValues</span> <span class="k">value</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">Values</span><span class="p">(</span><span class="n">currentType</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="n">_types</span><span class="p">[</span><span class="k">value</span><span class="p">.</span><span class="n">TypeID</span><span class="p">]</span> <span class="p">=</span> <span class="k">value</span><span class="p">.</span><span class="n">TypeName</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This association is needed because when a finalizer is notified via the <strong>GCFinalizeObject</strong> event, the received data only contains the type ID:</p>
<p><strong>OnGCFinalizeObject.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnGCFinalizeObject</span><span class="p">(</span><span class="n">FinalizeObjectTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span> <span class="p">!=</span> <span class="n">_processId</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// the type id should have been associated to a name via a previous TypeBulkType event</span>
</span></span><span class="line"><span class="cl">   <span class="n">NotifyFinalize</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">TimeStamp</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">TypeID</span><span class="p">,</span> <span class="n">_types</span><span class="p">[</span><span class="n">data</span><span class="p">.</span><span class="n">TypeID</span><span class="p">]);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Note that in the two code snippets, there is an explicit check to keep the events only from the process we are interested in: as explained in part 1, this is needed for older versions of Windows.</p>
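<p>If you start listening after the <strong>TypeBulkType</strong> event of a type has already been emitted, the id is unknown when <strong>GCFinalizeObject</strong> arrives. Assuming that <code>_types</code> is a <code>Dictionary</code> (its declaration is not shown here), the indexer would throw a <code>KeyNotFoundException</code> in that case. Here is a minimal sketch of a more defensive lookup; <code>TypeNameStore</code> and its methods are hypothetical names, and the <code>ulong</code> key is an assumption to adapt to the type of your ids:</p>
<pre tabindex="0"><code class="language-csharp" data-lang="csharp">using System;
using System.Collections.Generic;

public sealed class TypeNameStore
{
    private readonly Dictionary&lt;ulong, string&gt; _types = new Dictionary&lt;ulong, string&gt;();

    // to call for each value of a TypeBulkType event
    public void Add(ulong typeId, string typeName)
    {
        _types[typeId] = typeName;
    }

    // to call when a GCFinalizeObject event is received
    public string GetNameOrDefault(ulong typeId)
    {
        // unknown id: return a placeholder built from the id instead of throwing
        return _types.TryGetValue(typeId, out string name)
            ? name
            : $&#34;#{typeId}&#34;;
    }
}
</code></pre>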
<h2 id="thread-contention-duration">Thread contention duration</h2>
<p>With .NET CLR LocksAndThreads “Contention Rate / sec” and “Total # of Contentions” performance counters, you can monitor how many times threads have been blocked while waiting for a lock owned by another thread. However, you don’t know for how long. The two TraceEvent <strong>ContentionStart</strong> and <strong>ContentionStop</strong> events allow you to get this crucial piece of information.</p>
<p>As their names imply and <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/contention-etw-events?WT.mc_id=DT-MVP-5003325">the corresponding documentation explains</a>, these two events let you know respectively when a thread starts to wait on a lock and when the lock has been acquired. In addition to the process and thread identifiers, the <strong>ContentionTraceData</strong> event argument gives you the type of contention with its <strong>ContentionFlags</strong> property: either managed or native.</p>
<p><strong>ContentionTraceData.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">public</span> <span class="kd">sealed</span> <span class="k">class</span> <span class="nc">ContentionTraceData</span> <span class="p">:</span> <span class="n">TraceEvent</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="kd">public</span> <span class="n">ContentionFlags</span> <span class="n">ContentionFlags</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since contention is a per-thread waiting operation, you need to keep track of the starting time on a per-thread basis when <strong>ContentionStart</strong> happens.</p>
<p><strong>OnContentionStart.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnContentionStart</span><span class="p">(</span><span class="n">ContentionTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">ContentionInfo</span> <span class="n">info</span> <span class="p">=</span> <span class="n">_contentionStore</span><span class="p">.</span><span class="n">GetContentionInfo</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ThreadID</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">info</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="n">info</span><span class="p">.</span><span class="n">TimeStamp</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">TimeStamp</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">TimeStampRelativeMSec</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>ContentionStore</strong> class keeps track of the monitored processes and assigns them a <strong>ContentionInfo</strong> instance where the contention details are stored.</p>
<p>You retrieve this instance when the matching <strong>ContentionStop</strong> event occurs. The rest is just a matter of computing the time difference between the two events based on their <strong>TimeStampRelativeMSec</strong> property.</p>
<p><strong>OnContentionStop.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="k">void</span> <span class="n">OnContentionStop</span><span class="p">(</span><span class="n">ContentionTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">ContentionInfo</span> <span class="n">info</span> <span class="p">=</span> <span class="n">_contentionStore</span><span class="p">.</span><span class="n">GetContentionInfo</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">data</span><span class="p">.</span><span class="n">ThreadID</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">info</span> <span class="p">==</span> <span class="kc">null</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// unlucky case when we start to listen just after the ContentionStart event</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span> <span class="p">==</span> <span class="m">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="kt">var</span> <span class="n">contentionDurationMSec</span> <span class="p">=</span> <span class="n">data</span><span class="p">.</span><span class="n">TimeStampRelativeMSec</span> <span class="p">-</span> <span class="n">info</span><span class="p">.</span><span class="n">ContentionStartRelativeMSec</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">   <span class="kt">var</span> <span class="n">isManaged</span> <span class="p">=</span> <span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">ContentionFlags</span> <span class="p">==</span> <span class="n">ContentionFlags</span><span class="p">.</span><span class="n">Managed</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You are now able to detect when thread contention occurs but also if the contention duration increases over time.</p>
<p><img loading="lazy" src="/posts/2018-09-28_monitor-finalizers-contention-threads/0_SEfZiyEmSjrA-UGR.png"></p>
<p>If you want to test contention, here is the kind of code you could use:</p>
<p><strong>TestContention.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">_workers</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Task</span><span class="p">[</span><span class="n">workerCount</span><span class="p">];</span>
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p">&lt;</span> <span class="n">workerCount</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">_workers</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="p">=</span> <span class="n">Task</span><span class="p">.</span><span class="n">Run</span><span class="p">(</span><span class="kd">async</span> <span class="p">()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="k">while</span> <span class="p">(</span><span class="kc">true</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="p">{</span>
</span></span><span class="line"><span class="cl">         <span class="k">lock</span> <span class="p">(</span><span class="n">_lock</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">         <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="n">Thread</span><span class="p">.</span><span class="n">Sleep</span><span class="p">(</span><span class="m">5</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">         <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">   <span class="p">});</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Each task repeatedly acquires the same lock and sleeps 5 milliseconds before releasing it, so the other tasks have to wait for it.</p>
<h2 id="how-to-count-threads-or-monitor-the-threadpool-usage">How to count threads or monitor the ThreadPool usage?</h2>
<p>In <a href="/posts/2018-06-19_replace-net-performance-counters/">a previous post</a>, it was mentioned that the CLR performance counters related to threads do not provide an accurate count of the running threads. On Windows, the <strong>Process/Thread Count</strong> kernel performance counter gives the accurate value. If you build .NET Core applications to run on Linux, you have to find other ways such as those described <a href="https://stackoverflow.com/questions/268680/how-can-i-monitor-the-thread-count-of-a-process-on-linux">on Stack Overflow</a>. However, there is an easy programmatic way to get the number of running threads in an application that works on both Windows and Linux: read <strong>Process.GetProcessById(&lt;pid&gt;).Threads.Count</strong>, where &lt;pid&gt; is the process ID.</p>
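<p>As a minimal sketch of that programmatic approach, the helper below wraps the call and disposes the <strong>Process</strong> object once the count is read. The <strong>ThreadCounter</strong> class and <strong>GetThreadCount</strong> method names are made up for this illustration.</p>

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of the original code)
public static class ThreadCounter
{
    // Returns the number of OS threads currently alive in the given process.
    public static int GetThreadCount(int processId)
    {
        // Process is IDisposable: release the underlying handle once done
        using (Process process = Process.GetProcessById(processId))
        {
            return process.Threads.Count;
        }
    }
}
```

<p>For example, <strong>ThreadCounter.GetThreadCount(Process.GetCurrentProcess().Id)</strong> gives the count for the current process. The value is a snapshot: threads may start or exit right after it is read.</p>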
<p>Since this is a series dedicated to ETW, you would expect to simply listen to a few events to get the thread count. Well… It is almost that simple. Each time a thread gets started, the <strong>AppDomainResourceManagement/ThreadCreated</strong> event is emitted with basically the ID of the created thread as payload. In order to receive the sibling <strong>AppDomainResourceManagement/ThreadTerminated</strong> event, you need to set the <strong>AppDomain.MonitoringIsEnabled</strong> property to true in the monitored application. The <a href="https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/app-domain-resource-monitoring?WT.mc_id=DT-MVP-5003325">other ways described by the documentation</a> did not work for me.</p>
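<p>Here is a short sketch of what that looks like on the monitored application side (not in the listener). <strong>AppDomain.MonitoringIsEnabled</strong> is a static property; the <strong>ThreadEventsSetup</strong> wrapper is only there for illustration.</p>

```csharp
using System;

// Hypothetical wrapper: call it once at startup of the MONITORED application
public static class ThreadEventsSetup
{
    public static bool EnableThreadEvents()
    {
        // once enabled, monitoring stays on for the lifetime of the process
        AppDomain.MonitoringIsEnabled = true;
        return AppDomain.MonitoringIsEnabled;
    }
}
```

<p>Once enabled, monitoring cannot be turned off again, so this is typically done once in the entry point of the application.</p>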
<p>If you want to check that your applications are not putting too much pressure on the .NET thread pool, the CLR provides <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/thread-pool-etw-events?WT.mc_id=DT-MVP-5003325">many ETW events</a> that map to the following TraceEvent events:</p>
<p><img loading="lazy" src="/posts/2018-09-28_monitor-finalizers-contention-threads/1_l13vhZSphdzzg1RQdQW8Fg.png"></p>
<p>The <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/thread-pool-etw-events#threadpoolworkerthreadadjustmentadjustment?WT.mc_id=DT-MVP-5003325">ThreadPoolWorkerThreadAdjustmentAdjustment</a> event (there is no typo in this name) provides a <strong>Reason</strong> property. If its value is 0x06, then it means <strong>Starvation</strong>: if this event frequency is ~1 per second, it could be a good indication that the <strong>ThreadPool</strong> is receiving a burst of work items or tasks to process. In addition, the <strong>ThreadPoolWorkerThreadAdjustmentTraceData</strong> argument received by the handler also gives the count of threads via the <strong>NewWorkerThreadCount</strong> property.</p>
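<p>To turn the "~1 per second" rule of thumb into something measurable, here is a sketch of a sliding window counter. The <strong>StarvationMonitor</strong> class is hypothetical; it assumes that the handler passes the <strong>Reason</strong> value converted to an integer and the <strong>TimeStampRelativeMSec</strong> of the event.</p>

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: tracks how often starvation adjustments happen
public class StarvationMonitor
{
    private const int StarvationReason = 0x06;
    private readonly Queue<double> _timestampsMSec = new Queue<double>();
    private readonly double _windowMSec;

    public StarvationMonitor(double windowMSec = 10000)
    {
        _windowMSec = windowMSec;
    }

    // Returns the number of starvation adjustments seen during the last window.
    public int OnAdjustment(int reason, double timestampMSec)
    {
        if (reason == StarvationReason)
            _timestampsMSec.Enqueue(timestampMSec);

        // drop the entries that are older than the window
        while (_timestampsMSec.Count > 0 && timestampMSec - _timestampsMSec.Peek() > _windowMSec)
            _timestampsMSec.Dequeue();

        return _timestampsMSec.Count;
    }
}
```

<p>With a 10 second window, a returned value close to 10 corresponds to the one per second frequency mentioned above. The queue is not synchronized: it is meant to be called only from the thread that processes the events.</p>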
<p>With all these events, you should be able to monitor how the .NET <strong>ThreadPool</strong> is used in your application.</p>
<p>The next post will be entirely dedicated to garbage collection analysis.</p>
<hr>
<p><em>Co-authored with <a href="https://twitter.com/kookiz">Kevin Gosse</a></em></p>
]]></content:encoded></item><item><title>Grab ETW Session, Providers and Events</title><link>https://chrisnas.github.io/posts/2018-07-26_grab-etw-session-providers/</link><pubDate>Thu, 26 Jul 2018 00:00:00 +0000</pubDate><guid>https://chrisnas.github.io/posts/2018-07-26_grab-etw-session-providers/</guid><description>This post of the series shows how to easily listen to CLR events with the TraceEvent package.</description><content:encoded><![CDATA[<p>Part 1: <a href="/posts/2018-06-19_replace-net-performance-counters/">Replace .NET performance counters by CLR event tracing</a>.</p>
<p>In the previous post, you saw that the CLR is emitting traces that could (should?) replace the performance counters you are using to monitor your application and investigate when something goes wrong. The PerfView tool that was demonstrated is built on top of the <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent">Microsoft.Diagnostics.Tracing.TraceEvent NuGet package</a> and you should leverage it to build your own monitoring system. In addition, the <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent.Samples/">Microsoft.Diagnostics.Tracing.TraceEvent.Samples NuGet package</a> contains sample code to help you ramp up.</p>
<h2 id="manage-an-etw-session">Manage an ETW session</h2>
<p>Create a console application and add the TraceEvent NuGet package. Your project now contains a TraceEvent.ReadMe.txt and a more detailed _TraceEventProgrammersGuide.docx Word document. You should really take the time to read the latter: it describes the architecture in great detail and helps you understand what is going on behind the scenes.</p>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_32rDLO5VyR1XMDS0.png"></p>
<p>In the <strong>Main</strong> entry point, add the following code to list existing ETW sessions:</p>
<p><strong>ShowETWSessions.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;Current ETW sessions:&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">foreach</span><span class="p">(</span><span class="kt">var</span> <span class="n">session</span> <span class="k">in</span> <span class="n">TraceEventSession</span><span class="p">.</span><span class="n">GetActiveSessionNames</span><span class="p">())</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">session</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;--------------------------------------------&#34;</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Like the <strong>logman -ets</strong> command, this piece of code is handy while debugging. Why? If you stop the debugger before your code disposes the ETW session it created, the session is orphaned and, after a while, Windows simply refuses to create new ones. In addition to making your orphaned sessions easy to find, another good reason to give a meaningful name to your session is to be able to stop it. Type the following command line: <strong>logman -ets stop &lt;session name&gt;</strong> to close a running session and clean up the mess your debugging sessions might have created. This is definitely better than rebooting the machine.</p>
<p>The next step is to create a session. You get a <strong>TraceEventSession</strong> object either by attaching to an existing session or by creating a new one as shown in this code:</p>
<p><strong>MainETW.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kt">string</span> <span class="n">sessionName</span> <span class="p">=</span> <span class="s">&#34;EtwSessionForCLR_&#34;</span> <span class="p">+</span> <span class="n">Guid</span><span class="p">.</span><span class="n">NewGuid</span><span class="p">().</span><span class="n">ToString</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;Starting {sessionName}...\r\n&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="k">using</span> <span class="p">(</span><span class="n">TraceEventSession</span> <span class="n">userSession</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TraceEventSession</span><span class="p">(</span><span class="n">sessionName</span><span class="p">,</span> <span class="n">TraceEventSessionOptions</span><span class="p">.</span><span class="n">Create</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">Task</span><span class="p">.</span><span class="n">Run</span><span class="p">(()</span> <span class="p">=&gt;</span>
</span></span><span class="line"><span class="cl">   <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="c1">// register handlers for events on the session source</span>
</span></span><span class="line"><span class="cl">      <span class="c1">// more on this later...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="c1">// decide which provider to listen to with filters if needed</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">      <span class="c1">// process the events in a blocking call</span>
</span></span><span class="line"><span class="cl">   <span class="p">});</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// wait for the user to dismiss the session</span>
</span></span><span class="line"><span class="cl">   <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;Press ENTER to exit...&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">   <span class="n">Console</span><span class="p">.</span><span class="n">ReadLine</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Why is it necessary to run the code that manipulates the session in another thread with <strong>Task.Run</strong>? The call to the <strong>Process</strong> method is synchronous so it would not be possible to get the user input. When the user exits, the session is disposed and the <strong>Process</strong> method returns. If you close the session with logman, the <strong>Process</strong> method will also return. This behavior applies because the <strong>TraceEventSession.StopOnDispose</strong> property is set to true by default.</p>
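<p>Note that TraceEvent can also parse an existing .etl file: instead of creating a session, create an <strong>ETWTraceEventSource</strong> with the filename; the handlers are registered and <strong>Process</strong> is called in the same way. Passing a filename as an additional parameter to the <strong>TraceEventSession</strong> constructor does something different: the session writes the events to that file.</p>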
<p>The code running in the task first registers handlers to the events as you will soon see. Next, you need to enable the providers you are interested in receiving events from. In our case, only the CLR provider identified by ClrTraceEventParser.ProviderGuid is enabled (read <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-providers?WT.mc_id=DT-MVP-5003325">https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-providers</a> for more details about the two available CLR providers). In addition to the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-keywords-and-levels#etw-event-levels">verbosity level</a>, you should set the keywords with the ClrTraceEventParser.Keywords enumeration values corresponding to <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-keywords-and-levels?WT.mc_id=DT-MVP-5003325">the categories of events you want to receive</a>.</p>
<p><strong>ProviderAndSource.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// register handlers for events on the session source</span>
</span></span><span class="line"><span class="cl"><span class="c1">// more on this later...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// decide which provider to listen to with filters if needed</span>
</span></span><span class="line"><span class="cl"><span class="n">userSession</span><span class="p">.</span><span class="n">EnableProvider</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">   <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">ProviderGuid</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="n">TraceEventLevel</span><span class="p">.</span><span class="n">Verbose</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">   <span class="p">(</span><span class="kt">ulong</span><span class="p">)(</span>
</span></span><span class="line"><span class="cl">      <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Contention</span> <span class="p">|</span> <span class="c1">// thread contention timing</span>
</span></span><span class="line"><span class="cl">      <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Threading</span> <span class="p">|</span> <span class="c1">// threadpool events</span>
</span></span><span class="line"><span class="cl">      <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Exception</span> <span class="p">|</span> <span class="c1">// get the first chance exceptions</span>
</span></span><span class="line"><span class="cl">      <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GCHeapAndTypeNames</span> <span class="p">|</span> 
</span></span><span class="line"><span class="cl">      <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">Type</span> <span class="p">|</span> <span class="c1">// for finalizer and exceptions type names</span>
</span></span><span class="line"><span class="cl">      <span class="n">ClrTraceEventParser</span><span class="p">.</span><span class="n">Keywords</span><span class="p">.</span><span class="n">GC</span> <span class="c1">// garbage collector details</span>
</span></span><span class="line"><span class="cl"><span class="p">));</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// this is a blocking call until the session is disposed</span>
</span></span><span class="line"><span class="cl"><span class="n">userSession</span><span class="p">.</span><span class="n">Source</span><span class="p">.</span><span class="n">Process</span><span class="p">();</span>
</span></span><span class="line"><span class="cl"><span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">&#34;End of session&#34;</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Additional filters, such as which process to monitor, can be set by passing <strong>TraceEventProviderOptions</strong> to the <strong>EnableProvider()</strong> method. But wait! On a Windows 7 machine, this kind of filtering does not work and you get events from all processes… There is no specific documentation to look at… This is where knowing which Win32 APIs are called behind the scenes helps. Instead of decompiling the TraceEvent assembly, take a look at its implementation… because it is <a href="https://github.com/Microsoft/perfview/blob/master/src/TraceEvent">open sourced</a> with PerfView! The <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-keywords-and-levels?WT.mc_id=DT-MVP-5003325">Microsoft documentation</a> for the called <strong>EnableTraceEx2</strong> function states that <em>there are several types of scope filters that allow filtering based on the event ID, the process ID (PID), executable filename, the app ID, and the app package name.</em> <strong>This feature is supported on Windows 8.1, Windows Server 2012 R2, and later</strong>. If you need to filter events based on the process ID on older systems, don’t worry: each event provides it.</p>
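<p>When provider-side filtering is not available, the filtering can be done in the handler itself. The following sketch puts the previous pieces together and keeps only the events of one process. The <strong>FilterByProcess</strong> class and the way the process ID is read from the command line are assumptions made for this example; it needs the TraceEvent NuGet package and must run elevated on Windows.</p>

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers;
using Microsoft.Diagnostics.Tracing.Session;

public static class FilterByProcess
{
    public static void Main(string[] args)
    {
        // assumption: the ID of the process to monitor is the first argument
        int targetPid = int.Parse(args[0]);

        string sessionName = "EtwSessionForCLR_" + Guid.NewGuid().ToString();
        using (TraceEventSession userSession = new TraceEventSession(sessionName, TraceEventSessionOptions.Create))
        {
            Task.Run(() =>
            {
                userSession.Source.Clr.All += delegate (TraceEvent data)
                {
                    // events from every process are received: keep only ours
                    if (data.ProcessID != targetPid)
                        return;

                    Console.WriteLine($"{data.ProcessID,7} {data.EventName}");
                };

                userSession.EnableProvider(
                    ClrTraceEventParser.ProviderGuid,
                    TraceEventLevel.Verbose,
                    (ulong)ClrTraceEventParser.Keywords.Exception);

                // blocking call until the session is disposed
                userSession.Source.Process();
            });

            Console.WriteLine("Press ENTER to exit...");
            Console.ReadLine();
        }
    }
}
```

<p>The drawback of this approach is that the events of all processes still reach the listener before being dropped, so the cost grows with the number of .NET applications running on the machine.</p>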
<h2 id="listen-to-clr-events">Listen to CLR events</h2>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_Oe3Dr70NMSBEJp40.png"></p>
<p>The <strong>Source</strong> property of the session returns an <strong>ETWTraceEventSource</strong>. This class derives from <strong>TraceEventDispatcher</strong>, which provides the <strong>Process</strong> method. The <strong>TraceEventSource</strong> ancestor class is where the event handlers can be registered on the following properties:</p>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_TogfxrDCOMI4maqI.png"></p>
<p>Behind the scenes, each provider emits strongly typed traces that could be difficult to parse manually: don’t worry, the TraceEvent library does the job for you through dedicated parsers exposed by <strong>TraceEventSource</strong>.</p>
<p>In the case of .NET events, you usually rely on the <strong>ClrTraceEventParser</strong> that exposes, as .NET events, the 100+ different traces emitted by the CLR ETW provider… plus one called <strong>All</strong> just in case you want to see all of them.</p>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_Kp9Q3s1tAIzeLjin.png"></p>
<p>Here is the first naïve implementation to display all CLR traces as shown in the previous screenshot:</p>
<p><strong>NaiveListener.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="c1">// listen to all CLR events</span>
</span></span><span class="line"><span class="cl"><span class="n">userSession</span><span class="p">.</span><span class="n">Source</span><span class="p">.</span><span class="n">Clr</span><span class="p">.</span><span class="n">All</span> <span class="p">+=</span> <span class="k">delegate</span> <span class="p">(</span><span class="n">TraceEvent</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="c1">// skip verbose and unneeded events</span>
</span></span><span class="line"><span class="cl">   <span class="k">if</span> <span class="p">(</span><span class="n">SkipEvent</span><span class="p">(</span><span class="n">data</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">   <span class="c1">// raw dump of the events</span>
</span></span><span class="line"><span class="cl">   <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;{data.ProcessID,7} &lt;{data.ProviderName}:{data.ID}&gt;__[{data.OpcodeName}] {data.EventName} &lt;| {data.GetType().Name}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <strong>SkipEvent</strong> method is here just for you to filter out traces… and this is very helpful to remove meaningless noise when you are beginning to work with CLR traces and you need to see which events are generated.</p>
<p>Each trace payload is received as a generic <strong>TraceEvent</strong> object that exposes common properties:</p>
<ul>
<li><strong>ProcessID</strong>/<strong>ProcessName</strong>: information related to the process in which the trace has been emitted by the CLR</li>
<li><strong>ThreadID</strong>: numeric identifier of the thread from which the trace was sent</li>
<li><strong>ID</strong>: numeric identifier of the trace that helps you find the corresponding event in the Microsoft documentation (i.e. <em>Event ID</em>)</li>
<li><strong>OpcodeName</strong>: human readable name of the trace</li>
<li><strong>EventName</strong>: concatenation of task (= group such as “Contention” corresponding to the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-keywords-and-levels?WT.mc_id=DT-MVP-5003325">Keyword</a> mentioned earlier) and <strong>OpcodeName</strong> separated by ‘/’</li>
</ul>
<h2 id="but-what-are-the-events-to-listen-to">But… what are the events to listen to?</h2>
<p>So, the next big step is to learn which events are interesting for you to monitor. I would suggest you read the rest of this post before going to <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-events?WT.mc_id=DT-MVP-5003325">the Microsoft documentation</a> that describes all CLR events in detail.</p>
<p>Let’s start with the simplest case: one trace is emitted when an exception is thrown. Microsoft <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/exception-thrown-v1-etw-event?WT.mc_id=DT-MVP-5003325">documents the ExceptionThrown_V1</a> event with <strong>ExceptionKeyword</strong> and <strong>Warning</strong> verbosity level:</p>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_EX3lpKl42pBhyeXo.png"></p>
<p>Unfortunately, there is no <strong>ExceptionThrown_V1</strong> event at the <strong>ClrTraceEventParser</strong> level:</p>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_G4sPofzTLca9w_vl.png"></p>
<p>In fact, the ID property of the received traces maps to the “Event ID” of the documentation, so the <strong>ExceptionStart</strong> event happens to bring the same level of information as <strong>ExceptionThrown_V1</strong> via the <strong>ExceptionTraceData</strong> parameter passed to the handler:</p>
<p><img loading="lazy" src="/posts/2018-07-26_grab-etw-session-providers/0_OFHBgCPrdThmZHHR.png"></p>
<p>The corresponding handler implementation is straightforward:</p>
<p><strong>ExceptionStartHandler.cs</strong></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-csharp" data-lang="csharp"><span class="line"><span class="cl"><span class="kd">private</span> <span class="kd">static</span> <span class="k">void</span> <span class="n">OnExceptionStart</span><span class="p">(</span><span class="n">ExceptionTraceData</span> <span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">   <span class="n">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">$&#34;{data.EventName} --&gt; {data.ExceptionType} : {data.ExceptionMessage}&#34;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The interesting properties are <strong>ExceptionType</strong> for the name of the thrown exception and <strong>ExceptionMessage</strong> for its message. Note that if the exception contains inner exceptions, you know it by taking a look at the <strong>ExceptionFlags</strong> property but you can’t access them. Remember that you should have received earlier the trace corresponding to the inner exception when it was thrown so you are just missing the relationship between the two.</p>
<p>If, for an unknown reason, the number of exceptions rises for one of your applications, listening to this event is a very cheap way to know which exceptions are thrown and to search for them in the source code!</p>
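<p>To see at a glance which exceptions are thrown the most, the handler could feed a small aggregator such as the one sketched below. The <strong>ExceptionStats</strong> class is hypothetical; it only assumes that <strong>OnExceptionStart</strong> passes <strong>data.ExceptionType</strong> to <strong>Add</strong>.</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: counts the thrown exceptions per type name
public class ExceptionStats
{
    private readonly Dictionary<string, int> _countByType = new Dictionary<string, int>();

    // Call from the exception handler with the type name of the exception
    public void Add(string exceptionType)
    {
        // count stays at 0 when the type has not been seen yet
        _countByType.TryGetValue(exceptionType, out int count);
        _countByType[exceptionType] = count + 1;
    }

    // Most frequently thrown exception types first
    public IEnumerable<KeyValuePair<string, int>> GetTop(int max)
    {
        return _countByType
            .OrderByDescending(pair => pair.Value)
            .Take(max);
    }
}
```

<p>The dictionary is not synchronized: if <strong>GetTop</strong> is called from a thread other than the one processing the events, a lock is needed around both methods.</p>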
<p>The next episode will detail other important CLR traces you need to monitor your applications and even start an investigation.</p>
<hr>
<p><em>Co-authored with <a href="https://twitter.com/kookiz">Kevin Gosse</a></em></p>
]]></content:encoded></item><item><title>Replace .NET performance counters by CLR event tracing</title><link>https://chrisnas.github.io/posts/2018-06-19_replace-net-performance-counters/</link><pubDate>Tue, 19 Jun 2018 00:00:00 +0000</pubDate><guid>https://chrisnas.github.io/posts/2018-06-19_replace-net-performance-counters/</guid><description>This post of our new series shows why performance counters might not be the best solution to monitor your .NET application and why the CLR events will definitively be a better solution.</description><content:encoded><![CDATA[<h2 id="introduction">Introduction</h2>
<p>At Criteo, each .NET application provides custom metrics to monitor deviation and trigger alerts. This is the first line of defense against misbehaviors. The next step is to figure out what could be the cause of these deviations. After analyzing the source code changes, it is often necessary to dig deeper into the performance counters exposed by the CLR, such as the following:</p>
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_nufe9ma4xSKydxmH.png"></p>
<p>Again, these counters are used to detect possible deviations in usual patterns. For example, some applications are supposed to answer under a 50 ms threshold. When the corresponding “number of timeouts” or “request time” metrics start to increase, several reasons linked to the CLR might be partially responsible but it is hard to tell:</p>
<ul>
<li><em># of Exceps Thrown / sec</em>: if the number increases, performance may be impacted, but how do you get the list of these unusual first-chance exceptions caught by the applications?</li>
<li><em>Contention rate / sec</em>: an increase might suggest that threads are spending more time waiting for locks to be released, but for how many milliseconds? This information is not available among the performance counters.</li>
<li><em># Gen 2 Collections</em>: even if this counter does vary a lot, how could we be sure that blocking gen2 collections are responsible for the lack of responsiveness? No counter exposes how many milliseconds the application threads were frozen during a compacting collection.</li>
</ul>
<p>As you can see, this second line of defense is not enough to start an investigation with a clear assumption in mind.</p>
<p>In addition, some counters are not showing what you might think:</p>
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_qIKHUDB0aOR13eG9.png"></p>
<ul>
<li><em># Gen &lt;n&gt; Collections</em>: the gen0 counter is also incremented after gen1 and gen2 collections, and the gen1 counter is also incremented after gen2 collections. There is no counter for the exact number of collections triggered per generation, even though you could compute these values.</li>
</ul>
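<p>As a minimal sketch of that computation: since each counter also includes the collections of the older generations, subtracting the next generation's counter gives the number of collections that stopped at a given generation. With hypothetical readings of 100, 30 and 5 for gen0, gen1 and gen2, that would mean 70 gen0-only, 25 gen1-only and 5 gen2 collections. The snippet below reads the values with <code>GC.CollectionCount</code>, on the assumption that it follows the same cumulative convention as the performance counters; the class name is only illustrative.</p>
<pre><code class="language-csharp">using System;

static class GenerationCounts
{
    static void Main()
    {
        // Cumulative values: a gen2 collection also bumps the gen1 and gen0 counts,
        // and a gen1 collection also bumps the gen0 count.
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);

        // Exclusive values: collections that stopped at each generation.
        Console.WriteLine($"gen0 only: {gen0 - gen1}");
        Console.WriteLine($"gen1 only: {gen1 - gen2}");
        Console.WriteLine($"gen2:      {gen2}");
    }
}
</code></pre>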
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_1bvyQwbWXVljrTiZ.png"></p>
<ul>
<li><em># of current logical/physical threads</em>: it is not possible to make any link with the number of threads used by the thread pool or the TPL/Tasks. As you can see in the <a href="https://github.com/dotnet/coreclr/blob/release/1.0.0/src/vm/threads.cpp">early versions of the Core CLR</a> (i.e. where the performance counters code was not removed yet), the increment and decrement of counts do not seem thread-safe: that could explain some issues (always fewer threads than expected in our monitoring boards) we faced in the past.</li>
</ul>
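<p>To illustrate why a counter updated without synchronization can drift, here is a small C# sketch. It is not the CLR code (which is native C++), only an analogy: a plain <code>++</code> is a read, an add and a write, so concurrent updates can be lost, whereas <code>Interlocked.Increment</code> is atomic. The class and field names are made up for this example.</p>
<pre><code class="language-csharp">using System;
using System.Threading;
using System.Threading.Tasks;

static class CounterDemo
{
    static int _unsafeCount;  // updated with ++ (not atomic)
    static int _safeCount;    // updated with Interlocked.Increment (atomic)

    static void Main()
    {
        const int iterations = 1000000;

        Parallel.For(0, iterations, _ =&gt;
        {
            _unsafeCount++;                         // concurrent updates may be lost
            Interlocked.Increment(ref _safeCount);  // never loses an update
        });

        Console.WriteLine($"expected: {iterations}");
        Console.WriteLine($"unsafe:   {_unsafeCount}");
        Console.WriteLine($"safe:     {_safeCount}");
    }
}
</code></pre>
<p>On a multi-core machine, the unsafe total will typically end up below the expected value, which would be consistent with a thread count that reads lower than reality.</p>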
<p>You need to realize that performance counters are sampling-based and can keep showing a value that no longer represents the current reality. For example, most of the GC counters will not change until the next garbage collection occurs; i.e. the <em>% Time in GC</em> could stay misleading (until the next GC) and this is very far from the % CPU time you are used to.</p>
<p>If you are moving to .NET Core, you will discover a worse situation: <strong>there are no more performance counters on Windows to monitor your applications</strong>. And if you are targeting Linux… well…</p>
<p>However, as the rest of this series will demonstrate, the CLR provides even more details via strongly typed tracing through Event Tracing for Windows (ETW) and LTTng on Linux.</p>
<h2 id="clr-and-event-tracing-for-windows">CLR and Event Tracing for Windows</h2>
<p>The ETW framework has existed for a long time and allows consumers to listen to events emitted by providers, as explained in <a href="http://download.microsoft.com/download/3/A/7/3A7FA450-1F33-41F7-9E6D-3AA95B5A6AEA/MSDNMagazineApril2007en-us.chm">a 2007 MSDN Magazine article</a>.</p>
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_TxC5sfAh5Mfguhxn.png"></p>
<p>A tracing session wraps the providers and consumers together either for real-time processing or .etl file generation. If you already know <a href="https://github.com/microsoft/perfview/releases">the PerfView tool</a> or the <a href="https://docs.microsoft.com/en-us/windows-hardware/test/wpt?WT.mc_id=DT-MVP-5003325">Windows Performance Toolkit/xperf</a>, you should be familiar with the generated .etl files.</p>
<p>For debugging purposes, listening to events during a live session with a minimal impact on production machines is even more practical. The <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/clr-etw-events?WT.mc_id=DT-MVP-5003325">list of documented CLR events</a> is huge but fear not: this series of posts will focus on exceptions, finalizers, thread contention and garbage collection.</p>
<p>In addition to the documentation, I would recommend that you take a look at how the tracing is implemented in <a href="https://github.com/dotnet/coreclr/tree/master/src">.NET Core source code</a> for two reasons. First, you get access to the <a href="https://github.com/dotnet/coreclr/blob/master/src/vm/ClrEtwAll.man">exact payload schema</a> of all generated events (even those not documented). Second, by searching the Core CLR source code for the FireEtw-prefixed methods generated at build time, you will get a better understanding of when things are happening. Don’t forget the higher level <a href="https://github.com/dotnet/coreclr/blob/master/src/vm/eventtrace.cpp">ETW::-prefixed methods and enums</a> that are also called by the runtime to emit traces.</p>
<p>PerfView was mentioned earlier as a tool to analyze traces, but it is also useful for deciphering events produced by the CLR. In a trace, double-click the <strong>Events</strong> node:</p>
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_i4-ZIxUxo8nxFTT3.png"></p>
<p>In the new window that pops up, select an event on the left side to get the list of occurrences on the right side:</p>
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_djIq3H9GgyYmtnB7.png"></p>
<p>Right-click an event occurrence and select <strong>Dump Event</strong> to get its payload details:</p>
<p><img loading="lazy" src="/posts/2018-06-19_replace-net-performance-counters/0_BwE09NSYG7JXLjND.png"></p>
<p>For example, the payload contains the type name of the new instance that led the GC to trigger the <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/garbage-collection-etw-events#gcallocationtick_v2_event?WT.mc_id=DT-MVP-5003325">AllocationTick event</a> after ~100 KB were allocated.</p>
<p>In addition to <a href="https://docs.microsoft.com/en-us/dotnet/framework/performance/controlling-logging?WT.mc_id=DT-MVP-5003325">the tooling available</a> to get the traces, Microsoft provides <a href="https://www.nuget.org/packages/Microsoft.Diagnostics.Tracing.TraceEvent/">the Microsoft.Diagnostics.Tracing.TraceEvent NuGet package</a>. With this library, you will be able to build your own tool or listen to the CLR events from within your running applications to replace the performance counters. The next episode of the series will ramp you up with the implementation of a basic listener.</p>
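<p>As a rough preview only, here is a minimal sketch of what listening to the AllocationTick event with TraceEvent could look like. It assumes a Windows machine, a process running with administrator privileges (required to create a real-time ETW session) and a reference to the TraceEvent package; the session name is arbitrary and chosen for this example.</p>
<pre><code class="language-csharp">using System;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers;
using Microsoft.Diagnostics.Tracing.Session;

static class Program
{
    static void Main()
    {
        // "ClrAllocationTickSession" is a hypothetical session name.
        using (var session = new TraceEventSession("ClrAllocationTickSession"))
        {
            // Stop the session on Ctrl+C so that it does not outlive the process.
            Console.CancelKeyPress += (sender, e) =&gt; session.Dispose();

            // AllocationTick is emitted by the CLR provider with the GC keyword
            // at the Verbose level.
            session.EnableProvider(
                ClrTraceEventParser.ProviderGuid,
                TraceEventLevel.Verbose,
                (ulong)ClrTraceEventParser.Keywords.GC);

            // Strongly typed callback: the payload exposes the allocated type name.
            session.Source.Clr.GCAllocationTick += data =&gt;
                Console.WriteLine($"pid {data.ProcessID}: {data.TypeName}");

            // Blocks and dispatches events until the session is stopped.
            session.Source.Process();
        }
    }
}
</code></pre>
<p>This session receives events from every .NET process on the machine, so a real tool would most likely filter on the process ID.</p>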
<hr>
<p><em>Co-authored with <a href="https://twitter.com/kookiz">Kevin Gosse</a></em></p>
]]></content:encoded></item></channel></rss>