<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.3">Jekyll</generator><link href="https://www.ethan-shea.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.ethan-shea.com/" rel="alternate" type="text/html" /><updated>2024-08-09T03:59:08+00:00</updated><id>https://www.ethan-shea.com/feed.xml</id><title type="html">Ethan Shea</title><author><name>Ethan Shea</name></author><entry><title type="html">Setting up a .NET Project in 2024</title><link href="https://www.ethan-shea.com/posts/setting-up-dotnet-2024" rel="alternate" type="text/html" title="Setting up a .NET Project in 2024" /><published>2024-08-06T00:00:00+00:00</published><updated>2024-08-06T00:00:00+00:00</updated><id>https://www.ethan-shea.com/posts/setting-up-dotnet-2024</id><content type="html" xml:base="https://www.ethan-shea.com/posts/setting-up-dotnet-2024"><![CDATA[<p>The tools for .NET have advanced significantly since the .NET Framework days. Unfortunately, when consulting documentation, it can be difficult to tell what is current best practice and what is outdated.</p>

<p>This post represents a snapshot in the year 2024, targeting .NET 8.0/C# 12. The structure of this template is borrowed from my time at Microsoft, and heavily influenced by my own personal opinion. I feel that it strikes a balance between the most common practices found at Microsoft and developer productivity.</p>

<p>Not much has changed since I wrote <a href="/posts/setting-up-dotnet-2021">a similar article</a> in 2021. The updates in this post are mostly my own. I have been working in the TypeScript ecosystem for the past year, and am now able to bring some lessons back to .NET.</p>

<p>The repository is available on GitHub <a href="https://github.com/pensono/DotNetStarterProject/tree/2024">here</a> and as a NuGet template:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dotnet</span><span class="w"> </span><span class="nx">new</span><span class="w"> </span><span class="nx">install</span><span class="w"> </span><span class="nx">Pensono.DotNetStarterProject</span><span class="w">
</span><span class="n">mkdir</span><span class="w"> </span><span class="nx">MyProject</span><span class="w"> </span><span class="o">&amp;&amp;</span><span class="w"> </span><span class="nx">cd</span><span class="w"> </span><span class="nx">MyProject</span><span class="w">
</span><span class="n">dotnet</span><span class="w"> </span><span class="nx">new</span><span class="w"> </span><span class="nx">starterproject</span><span class="w">
</span></code></pre></div></div>

<h2 id="goals">Goals</h2>

<ul>
  <li>One command to build, one command to test. In this case, those commands are <code class="language-plaintext highlighter-rouge">dotnet build</code> and <code class="language-plaintext highlighter-rouge">dotnet test</code>. This makes integration with CI easy, and allows developers and CI to share the same pipeline.</li>
  <li>Use defaults when possible. Only special cases should be configured explicitly.</li>
  <li>Minimal and easy to install tooling dependencies.</li>
  <li>Use official tools as much as possible.</li>
</ul>

<p>With this setup, dependencies are so limited that Visual Studio is not required to be productive.</p>

<h2 id="prerequisites">Prerequisites</h2>

<p>The following dependencies should be installed:</p>

<ul>
  <li><a href="https://dotnet.microsoft.com/download">.NET</a></li>
  <li><a href="https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell">PowerShell</a> (Recommended)</li>
</ul>

<p>If it’s likely that team members have old .NET versions installed, you can enforce a minimum through a <a href="https://docs.microsoft.com/en-us/dotnet/core/tools/global-json"><code class="language-plaintext highlighter-rouge">global.json</code></a> file in the root. There’s also some versioning information here which will become relevant later.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">global.json</code> </pre>
</div>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"sdk"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"8.0.303"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"rollForward"</span><span class="p">:</span><span class="w"> </span><span class="s2">"latestMajor"</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="nl">"msbuild-sdks"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"Microsoft.Build.Traversal"</span><span class="p">:</span><span class="w"> </span><span class="s2">"4.1.0"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h2 id="overview">Overview</h2>

<p>Below is a directory listing of the project. Each item will be explained in its section. Items marked with an asterisk are considered optional or project-dependent.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>StarterProject
│   .gitignore
│   Directory.Build.props
│   Directory.Packages.props
│   dirs.proj
│   global.json
│   README.md
├───deployment*
├───docs*
├───shell
│       Init.ps1
│       MyTool.psm1
│       VisualStudio.psm1
├───src
│   ├───MyComponent
│   │   │   StarterProject.MyComponent.csproj
│   │   │   Source.cs
│   │   └───Folder
│   │           MoreSource.cs
│   └───AnotherComponent
│           StarterProject.AnotherComponent.csproj
│           AnotherSource.cs
├───test
│   └───MyComponent
│           SourceTest.cs
│           StarterProject.Test.MyComponent.csproj
└───tools*
</code></pre></div></div>

<h2 id="top-level-configuration">Top-level configuration</h2>

<p>There are some properties not set by default which should be used on new .NET projects. These can be configured in <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>, which is applied to all projects within the directory. Other global configurations can be made here as well. I have included some packaging-related ones for the sake of example.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project&gt;</span>
    <span class="c">&lt;!-- General --&gt;</span>
    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;TargetFramework&gt;</span>net8.0<span class="nt">&lt;/TargetFramework&gt;</span>
        <span class="nt">&lt;LangVersion&gt;</span>12.0<span class="nt">&lt;/LangVersion&gt;</span>
        <span class="nt">&lt;Nullable&gt;</span>enable<span class="nt">&lt;/Nullable&gt;</span>
        <span class="nt">&lt;Features&gt;</span>strict<span class="nt">&lt;/Features&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>

    <span class="c">&lt;!-- Build --&gt;</span>
    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;TreatWarningsAsErrors&gt;</span>true<span class="nt">&lt;/TreatWarningsAsErrors&gt;</span>
        <span class="nt">&lt;ManagePackageVersionsCentrally&gt;</span>true<span class="nt">&lt;/ManagePackageVersionsCentrally&gt;</span>
        <span class="nt">&lt;EnforceCodeStyleInBuild&gt;</span>true<span class="nt">&lt;/EnforceCodeStyleInBuild&gt;</span> <span class="c">&lt;!-- Enable linter --&gt;</span>
        <span class="nt">&lt;UseArtifactsOutput&gt;</span>true<span class="nt">&lt;/UseArtifactsOutput&gt;</span>
        <span class="nt">&lt;RepositoryRoot&gt;</span>$(MSBuildThisFileDirectory)<span class="nt">&lt;/RepositoryRoot&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
    
    <span class="c">&lt;!-- Packaging --&gt;</span>
    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;IsPackable&gt;</span>false<span class="nt">&lt;/IsPackable&gt;</span>
        <span class="nt">&lt;IsPublishable&gt;</span>false<span class="nt">&lt;/IsPublishable&gt;</span>

        <span class="c">&lt;!-- These properties will be used if packaging is enabled for a project --&gt;</span>
        <span class="nt">&lt;IncludeSymbols&gt;</span>true<span class="nt">&lt;/IncludeSymbols&gt;</span>
        <span class="nt">&lt;SymbolPackageFormat&gt;</span>snupkg<span class="nt">&lt;/SymbolPackageFormat&gt;</span>
        <span class="nt">&lt;EmbedUntrackedSources&gt;</span>true<span class="nt">&lt;/EmbedUntrackedSources&gt;</span>
        <span class="nt">&lt;Authors&gt;</span>Author One; Author Two<span class="nt">&lt;/Authors&gt;</span>
        <span class="nt">&lt;PackageLicenseExpression&gt;</span>GPL-3.0-only<span class="nt">&lt;/PackageLicenseExpression&gt;</span>
        <span class="nt">&lt;Description&gt;</span>Example project description.<span class="nt">&lt;/Description&gt;</span>
        <span class="nt">&lt;PackageTags&gt;</span>dotnet<span class="nt">&lt;/PackageTags&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>
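
<p>As a rough sketch of the “special cases are configured explicitly” idea, a single project can opt in to packaging by overriding the repository-wide default in its own project file. The choice of <code class="language-plaintext highlighter-rouge">AnotherComponent</code> as the packaged project is an assumption made purely for illustration; the template itself does not do this.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/AnotherComponent/StarterProject.AnotherComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;PropertyGroup&gt;</span>
    <span class="c">&lt;!-- Hypothetical opt-in: overrides the IsPackable=false default from Directory.Build.props --&gt;</span>
    <span class="nt">&lt;IsPackable&gt;</span>true<span class="nt">&lt;/IsPackable&gt;</span>
  <span class="nt">&lt;/PropertyGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>Because <code class="language-plaintext highlighter-rouge">Directory.Build.props</code> is imported before the body of the project file, the value set in the project wins, and the shared packaging metadata (authors, license, symbols) should then apply when <code class="language-plaintext highlighter-rouge">dotnet pack</code> is run on that project.</p>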

<p>Here’s the <code class="language-plaintext highlighter-rouge">.gitignore</code> being used. Note that <code class="language-plaintext highlighter-rouge">.sln</code> files are being ignored because they will be generated as needed and not checked in. More on that later.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">.gitignore</code> </pre>
</div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>artifacts/*

**/TestResults/

*.sln
.vs/
</code></pre></div></div>

<h2 id="source-organization">Source Organization</h2>

<p>The key to the source organization is the use of the <a href="https://github.com/microsoft/MSBuildSdks/tree/master/src/Traversal"><code class="language-plaintext highlighter-rouge">Microsoft.Build.Traversal</code> SDK</a>. It allows projects to be organized anywhere in the repository, and referenced through a top-level <code class="language-plaintext highlighter-rouge">dirs.proj</code>. As your project grows, it may make sense to create more <code class="language-plaintext highlighter-rouge">dirs.proj</code> files referencing subsets of the codebase for different services or teams. In <a href="/posts/setting-up-dotnet-2021#source-organization">the previous iteration of this post</a>, I recommended creating a structure of intermediate <code class="language-plaintext highlighter-rouge">dirs.proj</code> files at each level. However, I now realize that life is too short to spend time tediously managing this directory structure.</p>

<p>The version of the <code class="language-plaintext highlighter-rouge">Microsoft.Build.Traversal</code> package is specified in <code class="language-plaintext highlighter-rouge">global.json</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">dirs.proj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.Build.Traversal"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"src\**\*.*proj"</span><span class="nt">/&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"test\**\*.*proj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>
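
<p>To illustrate the idea of a traversal file covering only a subset of the codebase, here is a minimal sketch. The <code class="language-plaintext highlighter-rouge">build</code> folder and the file name are hypothetical and not part of the template; the file is deliberately kept outside <code class="language-plaintext highlighter-rouge">src</code> and <code class="language-plaintext highlighter-rouge">test</code> so that the top-level globs do not pick it up.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">build/MyComponent.proj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.Build.Traversal"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="c">&lt;!-- Hypothetical subset: only MyComponent and its tests, relative to this file --&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"..\src\MyComponent\**\*.*proj"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"..\test\MyComponent\**\*.*proj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>Under these assumptions, <code class="language-plaintext highlighter-rouge">dotnet build build/MyComponent.proj</code> would build just that slice, while a plain <code class="language-plaintext highlighter-rouge">dotnet build</code> at the root continues to build everything.</p>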

<p>Source files are split into two folders, <code class="language-plaintext highlighter-rouge">src</code> and <code class="language-plaintext highlighter-rouge">test</code>. Within each folder is a tree of projects.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>StarterProject
│   dirs.proj
├───src
│   ├───MyComponent
│   │       StarterProject.MyComponent.csproj
│   └───AnotherComponent
│           StarterProject.AnotherComponent.csproj
└───test
    └───MyComponent
            StarterProject.Test.MyComponent.csproj
</code></pre></div></div>

<p>Dependencies are made between projects using project references. This enables project boundaries to signify self-contained components. .NET will prohibit circular dependencies. Within a project, folders can be used to group files if more than one namespace is needed.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/MyComponent/StarterProject.MyComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"$(RepositoryRoot)/src/AnotherComponent/StarterProject.AnotherComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>
<p>Note that project references begin with <code class="language-plaintext highlighter-rouge">$(RepositoryRoot)</code>, which was defined earlier in <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>. In some repositories, all project references are relative to the current project and contain many <code class="language-plaintext highlighter-rouge">../</code>s at the front. By making all references relative to the repository root, managing project files becomes less tedious. Find and replace can be used to update paths, and new project files created by copy-paste will always have the correct configuration.</p>

<h3 id="optional-organizing-test-code-with-source-code">Optional: Organizing Test Code with Source Code</h3>

<p>Having worked in the TypeScript world for the last year, I have realized that organizing test files next to source files is much easier to manage. It’s still reasonable to organize integration tests as totally separate programs in their own directory.</p>

<p>The file tree looks something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>StarterProject
│   dirs.proj
├───src
│   ├───MyComponent
│   │   └───Folder
│   │       Source.cs
│   │       Source.test.cs
│   │       StarterProject.MyComponent.csproj
│   │       StarterProject.MyComponent.Test.csproj
│   └───AnotherComponent
│           StarterProject.AnotherComponent.csproj
└───test
    └───IntegrationTest
        StarterProject.IntegrationTest.csproj
</code></pre></div></div>

<p>If you wish to do this, exclude test files from each of your source projects, and replicate the project references from the source project to the test project since the test project now builds the source code into the test binary.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/MyComponent/StarterProject.MyComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"$(RepositoryRoot)/src/AnotherComponent/StarterProject.AnotherComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;Compile</span> <span class="na">Remove=</span><span class="s">"**\*.test.cs"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/MyComponent/StarterProject.MyComponent.Test.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"$(RepositoryRoot)/src/AnotherComponent/StarterProject.AnotherComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Microsoft.NET.Test.Sdk"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Moq"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"xunit"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"xunit.runner.visualstudio"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<h2 id="dependency-management">Dependency Management</h2>

<p>NuGet is the package manager of choice for .NET applications. It is configured in two parts: <code class="language-plaintext highlighter-rouge">Directory.Packages.props</code>, which gives the version number for each package, and the individual project files, which reference those packages without specifying a version.</p>

<p>Here’s what <code class="language-plaintext highlighter-rouge">Directory.Packages.props</code> may look like. Dependencies are sorted by usage, then alphabetically by package name.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Packages.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project&gt;</span>
  <span class="c">&lt;!-- Runtime --&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="na">Version=</span><span class="s">"4.0.1"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="c">&lt;!-- Test --&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Microsoft.NET.Test.Sdk"</span> <span class="na">Version=</span><span class="s">"16.8.0"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Moq"</span> <span class="na">Version=</span><span class="s">"4.20.70"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"xunit"</span> <span class="na">Version=</span><span class="s">"2.9.0"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"xunit.runner.visualstudio"</span> <span class="na">Version=</span><span class="s">"2.8.2"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>A project can then reference one of these packages.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/AnotherComponent/StarterProject.AnotherComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>Central package management is opt-in, so each project must have <code class="language-plaintext highlighter-rouge">ManagePackageVersionsCentrally</code> set to <code class="language-plaintext highlighter-rouge">true</code>. This can be done globally in <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;ManagePackageVersionsCentrally&gt;</span>true<span class="nt">&lt;/ManagePackageVersionsCentrally&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
</code></pre></div></div>
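
<p>Occasionally a single project needs a different version from the one pinned centrally. NuGet’s central package management provides a <code class="language-plaintext highlighter-rouge">VersionOverride</code> attribute for that case. The following is only a sketch: the idea that <code class="language-plaintext highlighter-rouge">MyComponent</code> needs an older Serilog, and the version number shown, are assumptions for illustration.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/MyComponent/StarterProject.MyComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="c">&lt;!-- Illustrative only: replaces the central Serilog version for this project alone --&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="na">VersionOverride=</span><span class="s">"3.1.1"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>Overrides like this are probably best kept rare, since each one undoes part of the benefit of having a single list of versions.</p>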

<h2 id="internal-tooling">Internal Tooling</h2>

<p>It can be useful to have a collection of scripts related to the project checked in. PowerShell is my automation language of choice, not only for its integration with .NET, but also because scripts tend to be easier to write and more maintainable than other scripting alternatives. PowerShell can be used on both <a href="https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell-core-on-linux">Linux</a> and Windows.</p>

<p>An entrypoint is defined as follows, which imports all other PowerShell scripts where commands are defined. In this case there are only two.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">shell/Init.ps1</code> </pre>
</div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Import-Module</span><span class="w"> </span><span class="bp">$PSScriptRoot</span><span class="nx">/VisualStudio.psm1</span><span class="w">
</span><span class="n">Import-Module</span><span class="w"> </span><span class="bp">$PSScriptRoot</span><span class="nx">/MyTool.psm1</span><span class="w"> </span><span class="c"># Optional</span><span class="w">

</span><span class="n">Write-Host</span><span class="w"> </span><span class="nt">-ForegroundColor</span><span class="w"> </span><span class="nx">Cyan</span><span class="w"> </span><span class="s2">"Welcome to StarterProject shell"</span><span class="w">
</span></code></pre></div></div>

<p>This can be invoked directly when starting the shell. Running this script will load any commands that the <code class="language-plaintext highlighter-rouge">.psm1</code> files export.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject&gt; .\shell\Init.ps1
Welcome to StarterProject shell
</code></pre></div></div>

<h3 id="solution-generation">Solution Generation</h3>

<p>While developers can use any editor, many will want to work from Visual Studio. Visual Studio requires a solution file in order to be run. Within Microsoft, it is quite common not to check in <code class="language-plaintext highlighter-rouge">.sln</code> files and instead generate them using one of many tools. Here is a short PowerShell script which can be used to do the same thing.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">shell/VisualStudio.psm1</code> </pre>
</div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">function</span><span class="w"> </span><span class="nf">Start-VisualStudio</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nv">$solutionName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">Get-Item</span><span class="w"> </span><span class="o">.</span><span class="p">)</span><span class="o">.</span><span class="nf">Name</span><span class="w">

    </span><span class="n">dotnet</span><span class="w"> </span><span class="nx">new</span><span class="w"> </span><span class="nx">sln</span><span class="w"> </span><span class="nt">--force</span><span class="w"> </span><span class="nt">--name</span><span class="w"> </span><span class="nv">$solutionName</span><span class="w">
    </span><span class="n">dotnet</span><span class="w"> </span><span class="nx">sln</span><span class="w"> </span><span class="nx">add</span><span class="w"> </span><span class="p">@(</span><span class="err">Get-ChildItem</span><span class="w"> </span><span class="err">-Recurse</span><span class="w"> </span><span class="err">*.csproj</span><span class="p">)</span><span class="w">
    </span><span class="n">start</span><span class="w"> </span><span class="s2">"</span><span class="nv">$solutionName</span><span class="s2">.sln"</span><span class="w"> </span><span class="c"># This part only works on Windows</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="n">Export-ModuleMember</span><span class="w"> </span><span class="o">*-*</span><span class="w">
</span></code></pre></div></div>

<p>Running it will generate the solution file and launch Visual Studio if installed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject&gt; .\shell\Init.ps1
Welcome to StarterProject shell
PS StarterProject&gt; Start-VisualStudio
The template "Solution File" was created successfully.
Project `src\AnotherComponent\StarterProject.AnotherComponent.csproj` added to the solution.
Project `src\MyComponent\StarterProject.MyComponent.csproj` added to the solution.
Project `test\MyComponent\StarterProject.Test.MyComponent.csproj` added to the solution.
Project `tools\MyTool\StarterProject.MyTool.csproj` added to the solution.
</code></pre></div></div>

<p>The command can also be run from a different location within the repo to generate a solution with a smaller scope.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject\src\MyComponent&gt; Start-VisualStudio
The template "Solution File" was created successfully.
Project `StarterProject.MyComponent.csproj` added to the solution.
</code></pre></div></div>

<p class="notice--info"><strong>Note:</strong> The <a href="https://github.com/microsoft/SlnGen">slngen tool</a> is a more robust alternative to this script with better MSBuild integration. However, because it has dependencies on Visual Studio and MSBuild which require extra configuration, it is not included in this guide.</p>

<h2 id="testing">Testing</h2>

<p>There are <a href="https://docs.microsoft.com/en-us/dotnet/core/testing/#testing-tools">several popular options</a> for testing in .NET.</p>

<ul>
  <li><a href="https://github.com/Microsoft/testfx-docs">MSTest</a>, Microsoft’s official framework</li>
  <li><a href="https://xunit.net/">xUnit</a>, the open source testing framework</li>
  <li><a href="https://nunit.org/">NUnit</a>, ported from Java’s JUnit</li>
</ul>

<p>This guide will choose xUnit out of personal preference.</p>

<p>Tests are organized with a hierarchy that parallels the code being tested. This gives something like the following structure.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>├───src
│   └───MyComponent
│           StarterProject.MyComponent.csproj
│           Source.cs
└───test
    └───MyComponent
            StarterProject.Test.MyComponent.csproj
            SourceTest.cs
</code></pre></div></div>

<p>Tests use project references to refer to the code they are testing. As with the source projects, the paths are relative to <code class="language-plaintext highlighter-rouge">$(RepositoryRoot)</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">test/MyComponent/StarterProject.Test.MyComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"$(RepositoryRoot)/src/MyComponent/StarterProject.MyComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Microsoft.NET.Test.Sdk"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Moq"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"xunit"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"xunit.runner.visualstudio"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<h2 id="linting">Linting</h2>

<p>Code style analyzers were added in .NET 5. To enforce them during the build, a <code class="language-plaintext highlighter-rouge">.editorconfig</code> file must be created and the <code class="language-plaintext highlighter-rouge">EnforceCodeStyleInBuild</code> property must be enabled. With this property set, violations of <code class="language-plaintext highlighter-rouge">IDExxxx</code> rules are reported as build diagnostics.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;EnforceCodeStyleInBuild&gt;</span>true<span class="nt">&lt;/EnforceCodeStyleInBuild&gt;</span> <span class="c">&lt;!-- Enable linter --&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">.editorconfig</code> file is too large to reproduce here, but you can see an <a href="https://github.com/pensono/DotNetStarterProject/blob/2024/.editorconfig">example</a> in the DotNetStarterProject repo.</p>

<p>Code quality analyzers (<code class="language-plaintext highlighter-rouge">CAxxxx</code>) are enabled by default.</p>

<h2 id="optional-docs">Optional: <code class="language-plaintext highlighter-rouge">/docs</code></h2>

<p>The <code class="language-plaintext highlighter-rouge">/docs</code> folder is a great place to store documentation alongside the code. A simple wiki can be created here as a collection of markdown files. By checking documentation into the repo through pull requests, it undergoes the same quality gates as the rest of the code.</p>

<h2 id="optional-deployment">Optional: <code class="language-plaintext highlighter-rouge">/deployment</code></h2>

<p>If the project will be run as a service, <code class="language-plaintext highlighter-rouge">/deployment</code> is a good place to put any configuration or automation related to making deployments.</p>

<h2 id="optional-tools">Optional: <code class="language-plaintext highlighter-rouge">/tools</code></h2>

<p>Any ad-hoc tools can be placed here. If they are written in .NET, a simple wrapper in the <code class="language-plaintext highlighter-rouge">shell</code> folder can be written to invoke <code class="language-plaintext highlighter-rouge">dotnet run</code>. This will compile and run the program.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">shell/MyTool.psm1</code> </pre>
</div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">function</span><span class="w"> </span><span class="nf">Invoke-MyTool</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="n">dotnet</span><span class="w"> </span><span class="nx">run</span><span class="w"> </span><span class="nt">--project</span><span class="w"> </span><span class="nx">tools/MyTool/StarterProject.MyTool.csproj</span><span class="w"> </span><span class="o">--</span><span class="w"> </span><span class="err">@</span><span class="nx">args</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="n">Export-ModuleMember</span><span class="w"> </span><span class="o">*-*</span><span class="w">
</span></code></pre></div></div>

<p>Running it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject&gt; .\shell\Init.ps1
Welcome to StarterProject shell
PS StarterProject&gt; Invoke-MyTool arg1 arg2
Hello from MyTool! Arguments: [arg1,arg2]
</code></pre></div></div>]]></content><author><name>Ethan Shea</name></author><summary type="html"><![CDATA[What's the best way to set up a new project?]]></summary></entry><entry><title type="html">Syntax Highlighting for Embedded Languages inside VS Code</title><link href="https://www.ethan-shea.com/posts/vs-code-embedded-language-syntax" rel="alternate" type="text/html" title="Syntax Highlighting for Embedded Languages inside VS Code" /><published>2023-11-07T00:00:00+00:00</published><updated>2023-11-07T00:00:00+00:00</updated><id>https://www.ethan-shea.com/posts/vs-code-embedded-language-syntax</id><content type="html" xml:base="https://www.ethan-shea.com/posts/vs-code-embedded-language-syntax"><![CDATA[<p>Wrote a cool DSL and want to get syntax highlighting within a host language? Want to add syntax highlighting support for an existing language? This guide will walk through how to modify a VS code extension to do so.</p>

<p>In this example, a DSL for programming board games is embedded into TypeScript’s tag functions. When we’re done, it will look something like this:</p>

<p><img src="/assets/posts/vs-code-embedded-language-syntax/example.png" alt="" class="align-center" /></p>

<p class="notice--warning"><strong>Note:</strong> This guide will assume you have written a regular syntax-highlighting extension. VS Code has a <a href="https://code.visualstudio.com/api/language-extensions/syntax-highlight-guide">guide</a> on how to do this.</p>

<p class="notice--info"><strong>Note:</strong> The <code class="language-plaintext highlighter-rouge">&gt; Developer: Inspect Editor Tokens and Scopes</code> command in VS Code is useful for debugging which scopes are assigned.</p>

<h2 id="creating-the-injection-syntax">Creating the injection syntax</h2>

<p>First, create a file like the following to find points within your host language’s source. In this example, <code class="language-plaintext highlighter-rouge">begin</code> is a regex which will find the start of a string template literal utilizing a tag function called <code class="language-plaintext highlighter-rouge">ludi</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">syntaxes/ts-injection.tmLanguage.json</code> </pre>
</div>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"scopeName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"source.ts.embedded.ludi"</span><span class="p">,</span><span class="w">
  </span><span class="err">//</span><span class="w"> </span><span class="err">source.ts</span><span class="w"> </span><span class="err">is</span><span class="w"> </span><span class="err">the</span><span class="w"> </span><span class="err">scope</span><span class="w"> </span><span class="err">for</span><span class="w"> </span><span class="err">our</span><span class="w"> </span><span class="err">host</span><span class="w"> </span><span class="err">language</span><span class="w">
  </span><span class="err">//</span><span class="w"> </span><span class="err">`L`</span><span class="w"> </span><span class="err">means</span><span class="w"> </span><span class="err">to</span><span class="w"> </span><span class="err">inject</span><span class="w"> </span><span class="err">this</span><span class="w"> </span><span class="err">grammar</span><span class="w"> </span><span class="err">before</span><span class="w"> </span><span class="err">`source.ts`</span><span class="w">
  </span><span class="err">//</span><span class="w"> </span><span class="err">`-string</span><span class="w"> </span><span class="err">-comment`</span><span class="w"> </span><span class="err">means</span><span class="w"> </span><span class="err">to</span><span class="w"> </span><span class="err">exclude</span><span class="w"> </span><span class="err">string</span><span class="w"> </span><span class="err">and</span><span class="w"> </span><span class="err">comment</span><span class="w"> </span><span class="err">scopes</span><span class="w">
  </span><span class="nl">"injectionSelector"</span><span class="p">:</span><span class="w"> </span><span class="s2">"L:source.ts -string -comment"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"patterns"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"begin"</span><span class="p">:</span><span class="w"> </span><span class="s2">"(?:ludi)</span><span class="se">\\</span><span class="s2">s*`"</span><span class="p">,</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="err">Look</span><span class="w"> </span><span class="err">for</span><span class="w"> </span><span class="err">strings</span><span class="w"> </span><span class="err">like</span><span class="w"> </span><span class="err">ludi`</span><span class="w">
      </span><span class="nl">"end"</span><span class="p">:</span><span class="w"> </span><span class="s2">"`"</span><span class="p">,</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="err">Look</span><span class="w"> </span><span class="err">for</span><span class="w"> </span><span class="err">the</span><span class="w"> </span><span class="err">ending</span><span class="w"> </span><span class="err">`</span><span class="w">
      </span><span class="nl">"contentName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"meta.embedded.block.ludi.typescript source.ludi"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"patterns"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
        </span><span class="p">{</span><span class="w">
          </span><span class="err">//</span><span class="w"> </span><span class="err">Process</span><span class="w"> </span><span class="err">the</span><span class="w"> </span><span class="err">ludi</span><span class="w"> </span><span class="err">scope</span><span class="w"> </span><span class="err">inside</span><span class="w">
          </span><span class="nl">"include"</span><span class="p">:</span><span class="w"> </span><span class="s2">"source.ludi"</span><span class="w">
        </span><span class="p">}</span><span class="w">
      </span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
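
<p>To get a feel for which host-language lines the <code class="language-plaintext highlighter-rouge">begin</code> pattern picks up, here is a rough sketch that runs the same pattern as a JavaScript regex. VS Code evaluates grammars with the Oniguruma engine rather than JavaScript’s, so treat this as an approximation that happens to behave the same for a pattern this simple. The sample lines and the <code class="language-plaintext highlighter-rouge">board 3x3</code> DSL text are made up for illustration.</p>

```typescript
// The `begin` pattern from the grammar above, written as a JavaScript regex.
const begin = /(?:ludi)\s*`/;

// Hypothetical host-language lines; the DSL text inside is invented.
const samples: string[] = [
  "const game = ludi`board 3x3`;",    // tag directly followed by a backtick
  "const game = ludi   `board 3x3`;", // whitespace between tag and backtick
  "const text = `board 3x3`;",        // plain template literal, no tag
  "const game = preludi`board 3x3`;", // no word boundary, so this matches too
];

const matches: boolean[] = samples.map((line) => begin.test(line));
console.log(matches); // [ true, true, false, true ]
```

<p>The last sample suggests that adding a word boundary in front of <code class="language-plaintext highlighter-rouge">ludi</code> may be worthwhile if other tag functions in your codebase end with the same letters.</p>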

<h2 id="adding-the-injection-to-the-plugin">Adding the injection to the plugin</h2>

<p>Next, the grammar is added to <code class="language-plaintext highlighter-rouge">package.json</code>. In this example, support for the entire ludi language already exists, so only the last section is added.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">package.json</code> </pre>
</div>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ludi"</span><span class="p">,</span><span class="w">
  </span><span class="err">...</span><span class="w">
  </span><span class="nl">"contributes"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="err">//</span><span class="w"> </span><span class="err">Language</span><span class="w"> </span><span class="err">configuration</span><span class="w">
    </span><span class="nl">"languages"</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ludi"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"aliases"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"Ludi"</span><span class="p">,</span><span class="w"> </span><span class="s2">"ludi"</span><span class="p">],</span><span class="w">
      </span><span class="nl">"extensions"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">".ludi"</span><span class="p">],</span><span class="w">
      </span><span class="nl">"configuration"</span><span class="p">:</span><span class="w"> </span><span class="s2">"./language-configuration.json"</span><span class="w">
    </span><span class="p">}],</span><span class="w">
    </span><span class="nl">"grammars"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
      </span><span class="err">//</span><span class="w"> </span><span class="err">Grammar</span><span class="w"> </span><span class="err">for</span><span class="w"> </span><span class="err">Ludi</span><span class="w">
      </span><span class="p">{</span><span class="w">
        </span><span class="nl">"language"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ludi"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"scopeName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"source.ludi"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">"./syntaxes/ludi.tmLanguage.json"</span><span class="w">
      </span><span class="p">},</span><span class="w">

      </span><span class="err">//</span><span class="w"> </span><span class="err">Injecting</span><span class="w"> </span><span class="err">into</span><span class="w"> </span><span class="err">typescript's</span><span class="w"> </span><span class="err">grammar.</span><span class="w"> </span><span class="err">This</span><span class="w"> </span><span class="err">part</span><span class="w"> </span><span class="err">is</span><span class="w"> </span><span class="err">new.</span><span class="w">
      </span><span class="p">{</span><span class="w">
        </span><span class="nl">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">"./syntaxes/ts-injection.tmLanguage.json"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"scopeName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"source.ts.embedded.ludi"</span><span class="p">,</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="err">Scope</span><span class="w"> </span><span class="err">matches</span><span class="w"> </span><span class="err">`scopeName`</span><span class="w"> </span><span class="err">in</span><span class="w"> </span><span class="err">the</span><span class="w"> </span><span class="err">grammar</span><span class="w">
        </span><span class="nl">"injectTo"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"source.ts"</span><span class="p">],</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="err">Host</span><span class="w"> </span><span class="err">language's</span><span class="w"> </span><span class="err">scope</span><span class="w">
        </span><span class="nl">"embeddedLanguages"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
          </span><span class="nl">"meta.embedded.block.ludi.typescript"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ludi"</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="err">The</span><span class="w"> </span><span class="err">language</span><span class="w"> </span><span class="err">id</span><span class="w"> </span><span class="err">defined</span><span class="w"> </span><span class="err">above</span><span class="w">
        </span><span class="p">}</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">]</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>That’s it, you’re done!</p>]]></content><author><name>Ethan Shea</name></author><summary type="html"><![CDATA[Wrote a cool DSL and want to get syntax highlighting within a host language? Want to add syntax highlighting support for an existing language? This guide will walk through how to modify a VS code extension to do so.]]></summary></entry><entry><title type="html">Malformed vs. Invalid</title><link href="https://www.ethan-shea.com/posts/malformed-vs-invalid" rel="alternate" type="text/html" title="Malformed vs. Invalid" /><published>2023-06-05T00:00:00+00:00</published><updated>2023-06-05T00:00:00+00:00</updated><id>https://www.ethan-shea.com/posts/malformed-vs-invalid</id><content type="html" xml:base="https://www.ethan-shea.com/posts/malformed-vs-invalid"><![CDATA[<p>It’s a small nuisance, but I often see error messages using the words “malformed” and “invalid” interchangeably. These words actually have different meanings. The distinction can become important when an issue is being debugged.</p>

<h2 id="malformed">Malformed</h2>

<p>If a payload is “malformed” that means it is not <em>syntactically</em> valid. This means that there is some syntax issue keeping it from being well-formed, so trying to parse it will result in an error. If the payload happens to be in a binary format, it could be said that it cannot be deserialized.</p>

<p>For example, the following JSON is malformed for several reasons:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"key"</span><span class="p">:</span><span class="w"> </span><span class="s2">"value,
  "</span><span class="err">key</span><span class="mi">2</span><span class="err">:</span><span class="s2">" value"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"list"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="err">}</span><span class="w">
</span></code></pre></div></div>

<h2 id="invalid">Invalid</h2>

<p>If a payload is “invalid” that means it has <em>failed validation</em>. The payload could be parsed and therefore is well-formed, but it does not meet certain validation constraints. Since a program must be able to parse a payload in order to validate it, all invalid payloads are also well-formed<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>

<p>The fundamental difference is that these constraints enforce whether a program is willing to accept a payload rather than whether a program can understand it.</p>

<p>For example, <code class="language-plaintext highlighter-rouge">age</code> is not allowed to be negative in the following payload:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"age"</span><span class="p">:</span><span class="w"> </span><span class="mi">-12</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The JSON communicates that the <code class="language-plaintext highlighter-rouge">age</code> key maps to the number <code class="language-plaintext highlighter-rouge">-12</code>, but knowing that ages must be non-negative a program may choose to reject it.</p>
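
<p>The two failure modes can be kept apart in code by parsing first and validating second. The following is a minimal sketch in TypeScript; the <code class="language-plaintext highlighter-rouge">classify</code> function, the <code class="language-plaintext highlighter-rouge">Verdict</code> type, and the rule that <code class="language-plaintext highlighter-rouge">age</code> must be a non-negative number are assumptions made for this example.</p>

```typescript
type Verdict = "malformed" | "invalid" | "accepted";

// Hypothetical helper: reports *why* a payload was rejected.
function classify(text: string): Verdict {
  let parsed: unknown;
  try {
    // Step 1: can the payload be understood at all?
    parsed = JSON.parse(text);
  } catch {
    return "malformed";
  }

  // Step 2: is the program willing to accept what it understood?
  if (typeof parsed !== "object" || parsed === null) {
    return "invalid";
  }
  const age = (parsed as { age?: unknown }).age;
  if (typeof age === "number") {
    return age >= 0 ? "accepted" : "invalid";
  }
  return "invalid";
}

console.log(classify('{"age": -12, "list": [}')); // malformed
console.log(classify('{"age": -12}'));            // invalid
console.log(classify('{"age": 12}'));             // accepted
```

<p>An error message built from this result can say “malformed” only when parsing failed and “invalid” only when a constraint failed, which tells whoever is debugging where to look.</p>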

<h2 id="edge-cases">Edge Cases</h2>

<p>What about this payload, where <code class="language-plaintext highlighter-rouge">count</code> is expected to be an integer?</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"count"</span><span class="p">:</span><span class="w"> </span><span class="s2">"123"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>This case may seem ambiguous, but even though an integer is expected, the payload is syntactically valid<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. The program reading this payload can choose whether it is invalid or not. The syntax of JSON allows any value to be of any type so there are no problems with well-formedness.</p>

<p>The previous payload is distinct from this one below where the value of <code class="language-plaintext highlighter-rouge">count</code> is a number.</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"count"</span><span class="p">:</span><span class="w"> </span><span class="mi">123</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>It’s worth noting that some JSON parsers will automatically convert strings to integers when possible, but this just changes what a program is willing to accept and not the definition of JSON syntax.<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">3</a></sup></p>
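
<p>As a concrete data point, JavaScript’s built-in <code class="language-plaintext highlighter-rouge">JSON.parse</code> does not perform this conversion, so the two payloads above stay distinct after parsing. Any leniency is a decision made afterwards by the program. The sketch below shows one such decision; <code class="language-plaintext highlighter-rouge">readCount</code> is a hypothetical helper and its acceptance rules are only an example.</p>

```typescript
// The parser keeps the quoted and unquoted values distinct.
const quoted: unknown = JSON.parse('{"count": "123"}').count;
const bare: unknown = JSON.parse('{"count": 123}').count;
console.log(typeof quoted, typeof bare); // string number

// Hypothetical lenient reader: accepts integers and strings of digits.
function readCount(value: unknown): number | undefined {
  if (typeof value === "number") {
    return Number.isInteger(value) ? value : undefined;
  }
  if (typeof value === "string") {
    return /^-?\d+$/.test(value) ? Number(value) : undefined;
  }
  return undefined;
}

console.log(readCount(quoted)); // 123
console.log(readCount(bare));   // 123
console.log(readCount("abc"));  // undefined
```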

<p>Not all formats have the same approach to mismatched types. For example, in a protobuf document the types are specified, making the distinction a matter of well-formedness rather than validity.</p>

<div class="language-protobuf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">message</span> <span class="nc">Payload</span> <span class="p">{</span>
  <span class="kt">int32</span> <span class="na">count</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>What about this payload with duplicated keys?</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"key"</span><span class="p">:</span><span class="w"> </span><span class="s2">"value1"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"key"</span><span class="p">:</span><span class="w"> </span><span class="s2">"value2"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Again, it is syntactically valid, but this time it is also ambiguous. I would argue that the most reasonable thing to do is to reject the payload as invalid due to the ambiguity. That said, there are many cases where the payload must be interpreted anyway and some reasonable choice must be made.</p>
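
<p>As one illustration of such a choice, JavaScript’s built-in <code class="language-plaintext highlighter-rouge">JSON.parse</code> accepts the payload and keeps the last value for a repeated key. This is a short sketch of that behavior, not a recommendation; other parsers are free to choose differently.</p>

```typescript
const duplicated = '{"key": "value1", "key": "value2"}';

// JSON.parse does not throw here; the later value overwrites the earlier one.
const result: { key: string } = JSON.parse(duplicated);

console.log(result.key);                 // value2
console.log(Object.keys(result).length); // 1
```

<p>Note that the first value is silently discarded, so a program that wants to reject duplicates cannot detect them from the parsed object alone.</p>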

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">

      <p>It’s possible (and performant) to write a parser which validates fields in a streaming fashion as they are parsed rather than all at once after the parsing is complete. Therefore, a document could be rejected for being invalid even though it is later malformed. For example:</p>

      <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"age"</span><span class="p">:</span><span class="w"> </span><span class="mi">-12</span><span class="p">,</span><span class="w">
  </span><span class="nl">"list"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="err">}</span><span class="w">
</span></code></pre></div>      </div>

      <p>In this case, either message is useful. Both issues must be solved eventually. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">

      <p><a href="https://www.json.org/json-en.html">https://www.json.org/json-en.html</a></p>

      <p><a href="https://stackoverflow.com/questions/15368231/can-json-numbers-be-quoted">https://stackoverflow.com/questions/15368231/can-json-numbers-be-quoted</a> <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>There are situations where a number must be encoded as a string, but again this is beyond the scope of syntax. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Ethan Shea</name></author><summary type="html"><![CDATA[What's the difference between these words?]]></summary></entry><entry><title type="html">Falsehoods Programmers Believe About Floating Point Numbers</title><link href="https://www.ethan-shea.com/posts/falsehoods-about-floats" rel="alternate" type="text/html" title="Falsehoods Programmers Believe About Floating Point Numbers" /><published>2022-10-22T00:00:00+00:00</published><updated>2022-10-22T00:00:00+00:00</updated><id>https://www.ethan-shea.com/posts/falsehoods-about-floats</id><content type="html" xml:base="https://www.ethan-shea.com/posts/falsehoods-about-floats"><![CDATA[<p>This is a list of falsehoods programmers tend to believe about floating point numbers, specifically the IEEE-754 floating point numbers used ubiquitously today.</p>

<p>When using floating point, it is easy to write programs which seem to compute the right answer but are actually hiding subtle bugs. In serious applications, numerical computing quickly gets complicated, requiring the consideration of many factors, like the accumulation of error, numerical stability, and how the numbers flow through the program. Knowing some floating point quirks provides a good foundation for when your math starts to look off.</p>

<p>While this list does not go into the details of correctly using floating point numbers, it does enumerate a number of assumptions often made by programmers.</p>

<p><strong>All of these assumptions are wrong</strong></p>

<ul>
  <li>Floating point arithmetic is exact
<!-- Counterexample: Any arithmetic which rounds --></li>
  <li>Floating point arithmetic is always inexact
<!-- Counterexample: Any arithmetic which can be computed without rounding at all, like 1/2 --></li>
  <li>The properties of arithmetic (commutativity, associativity, distributivity, inverse) hold</li>
  <li>The error in floating point math always tends to average itself out
<!-- Counterexample: Accumulators which keep adding in error --></li>
  <li>Floating point math is precise enough for programs which manage money
<!-- Counterexample: rounding error :( --></li>
  <li>A list of numbers can be summed in any order without affecting the result
<!-- Counterexample: If any rounding occurs -->
<!-- https://riskledger.com/blog/floating-point-numbers/ --></li>
  <li>A list of numbers can be multiplied in any order without affecting the result
<!-- Counterexample: If any rounding occurs --></li>
  <li>Floating point can’t be used for integer math
<!-- Counterexample: Integers are exact up to 2^53 for doubles and 2^24 for floats. Just don't divide and get fractional results --></li>
  <li>Floating point numbers are either 64 or 32 bits
<!-- Counterexample: half, quad, quarter  --></li>
  <li>Floating point numbers have <code class="language-plaintext highlighter-rouge">2^n</code> bits
<!-- Counterexample: x86 extended format, which has 80 bits --></li>
  <li>If two floating point numbers have different bits, they are not equal
<!-- Counterexample: 0 == -0 --></li>
  <li>If two floating point numbers have the same bits, they are equal
<!-- Counterexample: NaN != NaN --></li>
  <li>The reciprocals of two equal numbers are also equal
<!-- Counterexample: 1/-0 != 1/0, -inf != inf --></li>
  <li>There is only one way to encode NaN
<!-- Counterexample: All nans can carry "signaling information" --></li>
  <li>Floating point functions supported by the CPU are computed as accurately as possible
<!-- Beginning of section 3 http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf, counterexamples include sin and log--></li>
  <li>Arithmetic operations execute in a constant amount of time
<!-- Counterexample: division?? --></li>
  <li>Addition/multiplication operations execute in a constant amount of time
<!-- Counterexample: denormals --></li>
  <li>Floating point math is always executed on specialized hardware
<!-- Counterexample: Soft floating point math, common today in microcontrollers without FPUs --></li>
  <li>Exceptions in floating point math always throw
<!-- Counterexample: Exception bits can be turned off --></li>
  <li>Floating point math always rounds the same way
<!-- Counterexample: Configuration bits --></li>
  <li>Programs built with the same compiler brand will produce the exact same results
<!-- New optimizations --></li>
  <li>Programs built with the same compiler version will produce the exact same results
<!-- Optimization switches can change  --></li>
  <li>Debug and release mode give identical results
<!-- FluidSim example --></li>
  <li>CPUs with the same instruction set produce the exact same results executing floating point instructions
<!-- Counterexample: Differing numbers of internal bits between AMD and Intel, in x87 instruction sets --></li>
  <li>32 bit and 64 bit versions of the same program running on the same machine will produce the same results
<!-- 32 bit can't use SSE2 registers, which may execute differently -->
<!-- https://stackoverflow.com/questions/20963419/cross-platform-floating-point-consistency --></li>
</ul>
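
<p>A handful of these can be checked in a few lines. The sketch below uses TypeScript, whose <code class="language-plaintext highlighter-rouge">number</code> type is an IEEE-754 double; the names in <code class="language-plaintext highlighter-rouge">checks</code> are made up for this example, and it only covers the falsehoods that are visible from a high-level language.</p>

```typescript
// Typed as number so the compiler treats these as ordinary runtime values.
const posZero: number = 0;
const negZero: number = -0;
const notANumber: number = posZero / posZero;

const checks: { [name: string]: boolean } = {
  // Rounding makes some arithmetic inexact...
  sumIsInexact: 0.1 + 0.2 !== 0.3,
  // ...but values that need no rounding are computed exactly.
  halvesAreExact: 0.5 + 0.25 === 0.75,
  // Associativity fails once rounding occurs, so summation order matters.
  additionNotAssociative: (0.1 + 0.2) + 0.3 !== 0.1 + (0.2 + 0.3),
  // Integer math works up to 2^53 for doubles, then neighbors collide.
  integersExactBelowLimit: 2 ** 53 - 1 === Number.MAX_SAFE_INTEGER,
  integersCollideAboveLimit: 2 ** 53 + 1 === 2 ** 53,
  // Different bits, yet equal.
  zerosCompareEqual: posZero === negZero,
  zerosAreDistinguishable: !Object.is(posZero, negZero),
  // Equal numbers, unequal reciprocals (Infinity vs -Infinity).
  reciprocalsDiffer: 1 / posZero !== 1 / negZero,
  // Same bits, yet not equal.
  nanNotEqualToItself: notANumber !== notANumber,
};

const allHold: boolean = Object.keys(checks).every((name) => checks[name] === true);
console.log(checks);
console.log(allHold); // true
```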

<!-- Not good enough to post, but still interesting -->
<!-- - The dynamic range of numbers greater than 1 is equal to the dynamic range of numbers between 0 and 1 -->
<!-- Counterexample: 1.79769•10e308 for numeric_limits::max vs 4.94066•10e−324 for numeric_limits::denorm_min -->
<!-- - All floating point configuration is exposed in the library -->
<!-- Counterexample: Round to even is not in the C++ standard library but exists in intel processors -->

<h2 id="further-reading">Further Reading</h2>
<ul>
  <li><a href="https://randomascii.wordpress.com/category/floating-point/">Bruce Dawson’s Blogs on Floating Point</a></li>
  <li><a href="https://floating-point-gui.de/">Floating Point Guide</a> - Michael Borgwardt</li>
  <li><a href="https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html">What Every Computer Scientist Should Know About Floating-Point Arithmetic</a> - David Goldberg
<!-- https://www.lahey.com/float.htm -->
<!-- https://randomascii.wordpress.com/2012/04/05/floating-point-complexities/ -->
<!-- https://randomascii.wordpress.com/2013/07/16/floating-point-determinism/ --></li>
</ul>]]></content><author><name>Ethan Shea</name></author><summary type="html"><![CDATA[Floating point bugs are subtle, but preventable with the right know-how]]></summary></entry><entry><title type="html">Setting up a .NET Project in 2021</title><link href="https://www.ethan-shea.com/posts/setting-up-dotnet-2021" rel="alternate" type="text/html" title="Setting up a .NET Project in 2021" /><published>2021-02-17T00:00:00+00:00</published><updated>2021-02-17T00:00:00+00:00</updated><id>https://www.ethan-shea.com/posts/setting-up-dotnet-2021</id><content type="html" xml:base="https://www.ethan-shea.com/posts/setting-up-dotnet-2021"><![CDATA[<p class="notice--info"><strong>Note:</strong> <a href="/posts/setting-up-dotnet-2024">An updated version of this post is available</a>.</p>

<p>In recent years, there have been many advances in .NET tooling. Unfortunately, when consulting documentation it can be difficult to pull out what is currently best practice and what is outdated.</p>

<p>This post represents a snapshot in the year 2021. The guidelines here are not official guidance from the .NET team and are not endorsed by Microsoft, but represent a combination of what my team at Microsoft uses as well as my own personal preference. The project in this post will target .NET 5, C#9.0 and use the .NET 5 SDK.</p>

<p>The repository is available on GitHub <a href="https://github.com/pensono/DotNetStarterProject/tree/2021">here</a>.</p>

<h2 id="goals">Goals</h2>

<ul>
  <li>One command to build, one command to test. In this case, those commands are <code class="language-plaintext highlighter-rouge">dotnet build</code> and <code class="language-plaintext highlighter-rouge">dotnet test</code>. This makes integration with CI easy, and allows developers and CI to share the same pipeline.</li>
  <li>Use defaults when possible. Only special cases should be configured explicitly.</li>
  <li>Minimal and easy to install tooling dependencies.</li>
  <li>Use official tools as much as possible.</li>
</ul>

<p>With this setup, dependencies are so limited that Visual Studio is not required to be productive.</p>

<h2 id="prerequisites">Prerequisites</h2>

<p>The following dependencies should be installed:</p>

<ul>
  <li><a href="https://dotnet.microsoft.com/download">.NET</a></li>
  <li><a href="https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell">PowerShell</a> (Recommended)</li>
</ul>

<p>If it’s likely that team members have old .NET versions installed, you can enforce a minimum through a <a href="https://docs.microsoft.com/en-us/dotnet/core/tools/global-json"><code class="language-plaintext highlighter-rouge">global.json</code></a> file in the root. There’s also some versioning information here which will become relevant later.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">global.json</code> </pre>
</div>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"sdk"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"5.0.103"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"rollForward"</span><span class="p">:</span><span class="w"> </span><span class="s2">"latestMajor"</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="nl">"msbuild-sdks"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"Microsoft.Build.Traversal"</span><span class="p">:</span><span class="w"> </span><span class="s2">"3.0.3"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h2 id="overview">Overview</h2>

<p>Below is a directory listing of the project. Each item will be explained in its section. Items marked with an asterisk are considered optional or project-dependent.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>StarterProject
│   .gitignore
│   Directory.Packages.props
│   Directory.Build.props
│   dirs.proj
│   global.json
│   README.md
├───deployment*
├───docs*
├───shell
│       Init.ps1
│       MyTool.psm1
│       VisualStudio.psm1
├───src
│   │   dirs.proj
│   ├───MyComponent
│   │   │   StarterProject.MyComponent.csproj
│   │   │   Source.cs
│   │   └───Folder
│   │           MoreSource.cs
│   └───AnotherComponent
│           StarterProject.AnotherComponent.csproj
│           AnotherSource.cs
├───test
│   │   dirs.proj
│   └───MyComponent
│           StarterProject.Test.MyComponent.csproj
│           ExampleTest.cs
└───tools*
</code></pre></div></div>

<h2 id="top-level-configuration">Top-level configuration</h2>

<p>Some properties that should be used on new .NET projects are not set by default. These can be configured in <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>, which is applied to all projects within the directory. Other global configurations can be made here as well. I have included some packaging-related ones for the sake of example.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project&gt;</span>
    <span class="c">&lt;!-- General --&gt;</span>
    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;TargetFramework&gt;</span>net5.0<span class="nt">&lt;/TargetFramework&gt;</span>
        <span class="nt">&lt;LangVersion&gt;</span>9.0<span class="nt">&lt;/LangVersion&gt;</span>
        <span class="nt">&lt;Nullable&gt;</span>enable<span class="nt">&lt;/Nullable&gt;</span>
        <span class="nt">&lt;Features&gt;</span>strict<span class="nt">&lt;/Features&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>

    <span class="c">&lt;!-- Build --&gt;</span>
    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;TreatWarningsAsErrors&gt;</span>true<span class="nt">&lt;/TreatWarningsAsErrors&gt;</span>
        <span class="nt">&lt;ManagePackageVersionsCentrally&gt;</span>true<span class="nt">&lt;/ManagePackageVersionsCentrally&gt;</span>
        <span class="nt">&lt;EnforceCodeStyleInBuild&gt;</span>true<span class="nt">&lt;/EnforceCodeStyleInBuild&gt;</span> <span class="c">&lt;!-- Enable linter --&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
    
    <span class="c">&lt;!-- Packaging --&gt;</span>
    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="c">&lt;!-- Enable packaging on a per-project basis. --&gt;</span>
        <span class="nt">&lt;IsPackable&gt;</span>false<span class="nt">&lt;/IsPackable&gt;</span>
        <span class="nt">&lt;IsPublishable&gt;</span>false<span class="nt">&lt;/IsPublishable&gt;</span>

        <span class="nt">&lt;IncludeSymbols&gt;</span>true<span class="nt">&lt;/IncludeSymbols&gt;</span>
        <span class="nt">&lt;SymbolPackageFormat&gt;</span>snupkg<span class="nt">&lt;/SymbolPackageFormat&gt;</span>
        <span class="nt">&lt;EmbedUntrackedSources&gt;</span>true<span class="nt">&lt;/EmbedUntrackedSources&gt;</span>
        <span class="nt">&lt;Authors&gt;</span>Author One; Author Two<span class="nt">&lt;/Authors&gt;</span>
        <span class="nt">&lt;PackageLicenseExpression&gt;</span>GPL-3.0-only<span class="nt">&lt;/PackageLicenseExpression&gt;</span>
        <span class="nt">&lt;Description&gt;</span>Example project description.<span class="nt">&lt;/Description&gt;</span>
        <span class="nt">&lt;PackageTags&gt;</span>dotnet<span class="nt">&lt;/PackageTags&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>
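
<p>Because <code class="language-plaintext highlighter-rouge">IsPackable</code> is switched off for the whole repository above, a project that should produce a NuGet package has to opt back in from its own project file. A minimal sketch, assuming <code class="language-plaintext highlighter-rouge">AnotherComponent</code> is the project being packaged (that choice is only for illustration). Properties set in a project file are evaluated after <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>, so they take precedence.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/AnotherComponent/StarterProject.AnotherComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;PropertyGroup&gt;</span>
    <span class="c">&lt;!-- Illustrative: overrides the repository-wide default from Directory.Build.props --&gt;</span>
    <span class="nt">&lt;IsPackable&gt;</span>true<span class="nt">&lt;/IsPackable&gt;</span>
  <span class="nt">&lt;/PropertyGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>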

<p>Here’s the <code class="language-plaintext highlighter-rouge">.gitignore</code> being used. Note that <code class="language-plaintext highlighter-rouge">.sln</code> files are being ignored because they will be generated as needed and not checked in. More on that later.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">.gitignore</code> </pre>
</div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>**/bin
**/obj

**/TestResults/

*.sln
.vs/
</code></pre></div></div>

<h2 id="source-organization">Source Organization</h2>

<p>The key to the source organization is the use of the <a href="https://github.com/microsoft/MSBuildSdks/tree/master/src/Traversal"><code class="language-plaintext highlighter-rouge">Microsoft.Build.Traversal</code> SDK</a>. It allows projects to be hierarchically structured within the repository. Each folder has a file called <code class="language-plaintext highlighter-rouge">dirs.proj</code> or a <code class="language-plaintext highlighter-rouge">.csproj</code> for the project. The <code class="language-plaintext highlighter-rouge">dirs.proj</code> references where the child project files are located. The version of this package is specified in <code class="language-plaintext highlighter-rouge">global.json</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">dirs.proj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.Build.Traversal"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"src/dirs.proj"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"test/dirs.proj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/dirs.proj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.Build.Traversal"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"MyComponent/StarterProject.MyComponent.csproj"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"AnotherComponent/StarterProject.AnotherComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>It’s also possible to define one <code class="language-plaintext highlighter-rouge">dirs.proj</code> which automatically references any projects under <code class="language-plaintext highlighter-rouge">src</code> and <code class="language-plaintext highlighter-rouge">test</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">dirs.proj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.Build.Traversal"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"src\**\*.*proj"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"test\**\*.*proj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>Source files are split into two folders, <code class="language-plaintext highlighter-rouge">src</code> and <code class="language-plaintext highlighter-rouge">test</code>. Within each folder is a tree of projects.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>StarterProject
│   dirs.proj
├───src
│   │   dirs.proj
│   ├───MyComponent
│   │   └───Folder
│   │       StarterProject.MyComponent.csproj
│   └───AnotherComponent
│           StarterProject.AnotherComponent.csproj
└───test
    │   dirs.proj
    └───MyComponent
            StarterProject.Test.MyComponent.csproj
</code></pre></div></div>

<p>Dependencies between projects are expressed using project references. This implies that project boundaries are drawn around self-contained components. .NET prohibits circular dependencies between projects. Within a project, folders can be used to group files if more than one namespace is needed.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/MyComponent/StarterProject.MyComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"../AnotherComponent/StarterProject.AnotherComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Newtonsoft.Json"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<h2 id="dependency-management">Dependency Management</h2>

<p>NuGet is the package manager of choice for .NET applications. Its configuration is split into two parts: <code class="language-plaintext highlighter-rouge">Directory.Packages.props</code> gives the version number for each package, and each project file references the packages it needs without repeating the version.</p>

<p class="notice--info"><strong>Note:</strong> The functionality described is currently in preview, but represents the direction of the .NET SDK. A stable alternative is the <a href="https://github.com/Microsoft/MSBuildSdks/blob/master/src/CentralPackageVersions/README.md">CentralPackageVersions</a> SDK, which does the same thing with slightly more boilerplate.</p>

<p>Here’s what <code class="language-plaintext highlighter-rouge">Directory.Packages.props</code> may look like. Dependencies are sorted by usage, then alphabetically by package name.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Packages.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project&gt;</span>
  <span class="c">&lt;!-- Runtime --&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Newtonsoft.Json"</span> <span class="na">Version=</span><span class="s">"12.0.3"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="na">Version=</span><span class="s">"2.10.0"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="c">&lt;!-- Test --&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Microsoft.NET.Test.Sdk"</span> <span class="na">Version=</span><span class="s">"16.8.0"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"Moq"</span> <span class="na">Version=</span><span class="s">"4.13.1"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"xunit"</span> <span class="na">Version=</span><span class="s">"2.4.1"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageVersion</span> <span class="na">Include=</span><span class="s">"xunit.runner.visualstudio"</span> <span class="na">Version=</span><span class="s">"2.4.1"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>A project can then reference one of these packages.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">src/AnotherComponent/StarterProject.AnotherComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Newtonsoft.Json"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Serilog"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<p>Since this functionality is currently in preview, each project must have <code class="language-plaintext highlighter-rouge">ManagePackageVersionsCentrally</code> set to <code class="language-plaintext highlighter-rouge">true</code>. This can be done globally in <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>. The default value of this property will be <code class="language-plaintext highlighter-rouge">true</code> in future versions of the .NET SDK.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;ManagePackageVersionsCentrally&gt;</span>true<span class="nt">&lt;/ManagePackageVersionsCentrally&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
</code></pre></div></div>

<h2 id="internal-tooling">Internal Tooling</h2>

<p>It can be useful to have a collection of scripts related to the project checked in. PowerShell is my automation language of choice, not only for its integration with .NET, but also because scripts tend to be easier to write and more maintainable than other scripting alternatives. PowerShell can be used on both <a href="https://docs.microsoft.com/en-us/powershell/scripting/install/installing-powershell-core-on-linux?view=powershell-7.1">Linux</a> and Windows.</p>

<p>The entry point below imports the other PowerShell modules, which is where the commands are defined. In this case there are only two.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">shell/Init.ps1</code> </pre>
</div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Import-Module</span><span class="w"> </span><span class="bp">$PSScriptRoot</span><span class="nx">/VisualStudio.psm1</span><span class="w">
</span><span class="n">Import-Module</span><span class="w"> </span><span class="bp">$PSScriptRoot</span><span class="nx">/MyTool.psm1</span><span class="w"> </span><span class="c"># Optional</span><span class="w">

</span><span class="n">Write-Host</span><span class="w"> </span><span class="nt">-ForegroundColor</span><span class="w"> </span><span class="nx">Cyan</span><span class="w"> </span><span class="s2">"Welcome to StarterProject shell"</span><span class="w">
</span></code></pre></div></div>

<p>This can be invoked directly when starting the shell. Running this script will load any commands that the <code class="language-plaintext highlighter-rouge">.psm1</code> files export.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject&gt; .\shell\Init.ps1
Welcome to StarterProject shell
</code></pre></div></div>

<h3 id="solution-generation">Solution Generation</h3>

<p>While developers can use any editor, many will want to work from Visual Studio. Visual Studio requires a solution file in order to be run. Within Microsoft, it is quite common not to check in <code class="language-plaintext highlighter-rouge">.sln</code> files and instead generate them using one of many tools. Here is a short PowerShell script which can be used to do the same thing.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">shell/VisualStudio.psm1</code> </pre>
</div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">function</span><span class="w"> </span><span class="nf">Start-VisualStudio</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nv">$solutionName</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">Get-Item</span><span class="w"> </span><span class="o">.</span><span class="p">)</span><span class="o">.</span><span class="nf">Name</span><span class="w">

    </span><span class="n">dotnet</span><span class="w"> </span><span class="nx">new</span><span class="w"> </span><span class="nx">sln</span><span class="w"> </span><span class="nt">--force</span><span class="w"> </span><span class="nt">--name</span><span class="w"> </span><span class="nv">$solutionName</span><span class="w">
    </span><span class="n">Get-ChildItem</span><span class="w"> </span><span class="nt">-Recurse</span><span class="w"> </span><span class="o">*.</span><span class="nf">csproj</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="kr">ForEach</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">dotnet</span><span class="w"> </span><span class="nx">sln</span><span class="w"> </span><span class="nx">add</span><span class="w"> </span><span class="bp">$_</span><span class="o">.</span><span class="nf">FullName</span><span class="w"> </span><span class="p">}</span><span class="w">
    </span><span class="n">start</span><span class="w"> </span><span class="s2">"</span><span class="nv">$solutionName</span><span class="s2">.sln"</span><span class="w"> </span><span class="c"># This part only works on Windows</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="n">Export-ModuleMember</span><span class="w"> </span><span class="o">*-*</span><span class="w">
</span></code></pre></div></div>

<p>Running it will generate the solution file and launch Visual Studio if installed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject&gt; .\shell\Init.ps1
Welcome to StarterProject shell
PS StarterProject&gt; Start-VisualStudio
The template "Solution File" was created successfully.
Project `src\AnotherComponent\StarterProject.AnotherComponent.csproj` added to the solution.
Project `src\MyComponent\StarterProject.MyComponent.csproj` added to the solution.
Project `test\MyComponent\StarterProject.Test.MyComponent.csproj` added to the solution.
Project `tools\MyTool\StarterProject.MyTool.csproj` added to the solution.
</code></pre></div></div>

<p>The command can also be run from a different location within the repo to generate a solution with a smaller scope.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject\src\MyComponent&gt; Start-VisualStudio
The template "Solution File" was created successfully.
Project `StarterProject.MyComponent.csproj` added to the solution.
</code></pre></div></div>

<p class="notice--info"><strong>Note:</strong> The <a href="https://github.com/microsoft/SlnGen">slngen tool</a> is a more robust alternative to this script with better MSBuild integration. However, because it has dependencies on Visual Studio and MSBuild which require extra configuration, it is not included in this guide.</p>

<h2 id="testing">Testing</h2>

<p>There are <a href="https://docs.microsoft.com/en-us/dotnet/core/testing/#testing-tools">several popular options</a> for testing in .NET.</p>

<ul>
  <li><a href="https://github.com/Microsoft/testfx-docs">MSTest</a>, Microsoft’s official framework</li>
  <li><a href="https://xunit.net/">xUnit</a>, the open source testing framework</li>
  <li><a href="https://nunit.org/">NUnit</a>, ported from Java’s JUnit</li>
</ul>

<p>This guide will choose xUnit out of personal preference.</p>

<p>Tests are organized with a hierarchy that parallels the code being tested. This gives something like the following structure.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>├───src
│   │   dirs.proj
│   └───MyComponent
│           StarterProject.MyComponent.csproj
│           Source.cs
└───test
    │   dirs.proj
    └───MyComponent
            StarterProject.Test.MyComponent.csproj
            SourceTest.cs
</code></pre></div></div>

<p>Tests use relative project references to refer to the code they are testing.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">test/MyComponent/StarterProject.Test.MyComponent.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;ProjectReference</span> <span class="na">Include=</span><span class="s">"../../src/MyComponent/StarterProject.MyComponent.csproj"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>

  <span class="nt">&lt;ItemGroup&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Microsoft.NET.Test.Sdk"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"Moq"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"xunit"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;PackageReference</span> <span class="na">Include=</span><span class="s">"xunit.runner.visualstudio"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;/ItemGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>

<h2 id="linting">Linting</h2>

<p>Code style analyzers were added in .NET 5. To enable them during the build, a <code class="language-plaintext highlighter-rouge">.editorconfig</code> file must be created and the <code class="language-plaintext highlighter-rouge">EnforceCodeStyleInBuild</code> property should be enabled. With this property set, violations of <code class="language-plaintext highlighter-rouge">IDExxxx</code> rules are reported at build time.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="nt">&lt;EnforceCodeStyleInBuild&gt;</span>true<span class="nt">&lt;/EnforceCodeStyleInBuild&gt;</span> <span class="c">&lt;!-- Enable linter --&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">.editorconfig</code> file is too large to reproduce here, but you can see an <a href="https://github.com/pensono/DotNetStarterProject/blob/2021/.editorconfig">example</a> in the StarterProject repo.</p>

<p>Code quality analyzers (<code class="language-plaintext highlighter-rouge">CAxxxx</code>) are enabled by default.</p>
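
<p>The set of code quality rules enabled by default can grow with newer SDK releases. Since this setup treats warnings as errors and lets <code class="language-plaintext highlighter-rouge">global.json</code> roll forward to newer SDKs, it may be worth pinning the rule set. A minimal sketch, assuming the <code class="language-plaintext highlighter-rouge">AnalysisLevel</code> MSBuild property that shipped with the .NET 5 SDK. The value <code class="language-plaintext highlighter-rouge">5.0</code> is my own choice for illustration and is not taken from the starter repository.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">Directory.Build.props</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nt">&lt;PropertyGroup&gt;</span>
        <span class="c">&lt;!-- Assumption: keep the CAxxxx rule set at the .NET 5 defaults --&gt;</span>
        <span class="nt">&lt;AnalysisLevel&gt;</span>5.0<span class="nt">&lt;/AnalysisLevel&gt;</span>
    <span class="nt">&lt;/PropertyGroup&gt;</span>
</code></pre></div></div>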

<h2 id="optional-docs">Optional: <code class="language-plaintext highlighter-rouge">/docs</code></h2>

<p>The <code class="language-plaintext highlighter-rouge">/docs</code> folder is a great place to store documentation alongside the code. A simple wiki can be created here as a collection of markdown files. By checking documentation into the repo through pull requests, it undergoes the same quality gates as the rest of the code.</p>

<h2 id="optional-deployment">Optional: <code class="language-plaintext highlighter-rouge">/deployment</code></h2>

<p>If the project will be run as a service, <code class="language-plaintext highlighter-rouge">/deployment</code> is a good place to put any configuration or automation related to making deployments.</p>

<h2 id="optional-tools">Optional: <code class="language-plaintext highlighter-rouge">/tools</code></h2>

<p>Any ad-hoc tools can be placed here. If they are written in .NET, a simple wrapper in the <code class="language-plaintext highlighter-rouge">shell</code> folder can be written to invoke <code class="language-plaintext highlighter-rouge">dotnet run</code>. This will compile and run the program.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">shell/MyTool.psm1</code> </pre>
</div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">function</span><span class="w"> </span><span class="nf">Invoke-MyTool</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="n">dotnet</span><span class="w"> </span><span class="nx">run</span><span class="w"> </span><span class="nt">--project</span><span class="w"> </span><span class="nx">tools/MyTool/StarterProject.MyTool.csproj</span><span class="w"> </span><span class="o">--</span><span class="w"> </span><span class="err">@</span><span class="nx">args</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="n">Export-ModuleMember</span><span class="w"> </span><span class="o">*-*</span><span class="w">
</span></code></pre></div></div>
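
<p>For <code class="language-plaintext highlighter-rouge">dotnet run</code> to work, the tool’s project has to build an executable. The contents of the tool’s project file are not shown in this post, so the following is only a sketch of what it could look like, assuming the target framework and the other settings are inherited from <code class="language-plaintext highlighter-rouge">Directory.Build.props</code>.</p>

<div class="highlighter-rouge syntax-header">
  <pre class="highlight"><code class="">tools/MyTool/StarterProject.MyTool.csproj</code> </pre>
</div>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Project</span> <span class="na">Sdk=</span><span class="s">"Microsoft.NET.Sdk"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;PropertyGroup&gt;</span>
    <span class="c">&lt;!-- Illustrative: build a console application rather than a library --&gt;</span>
    <span class="nt">&lt;OutputType&gt;</span>Exe<span class="nt">&lt;/OutputType&gt;</span>
  <span class="nt">&lt;/PropertyGroup&gt;</span>
<span class="nt">&lt;/Project&gt;</span>
</code></pre></div></div>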

<p>Running it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS StarterProject&gt; .\shell\Init.ps1
Welcome to StarterProject shell
PS StarterProject&gt; Invoke-MyTool arg1 arg2
Hello from MyTool! Arguments: [arg1,arg2]
</code></pre></div></div>]]></content><author><name>Ethan Shea</name></author><summary type="html"><![CDATA[A lot has changed in the .NET world recently. How should a project be started from scratch?]]></summary></entry><entry><title type="html">Encoding Techniques in Verified Serializers</title><link href="https://www.ethan-shea.com/posts/verified-serializers" rel="alternate" type="text/html" title="Encoding Techniques in Verified Serializers" /><published>2018-03-28T00:00:00+00:00</published><updated>2018-03-28T00:00:00+00:00</updated><id>https://www.ethan-shea.com/posts/verified-serializers</id><content type="html" xml:base="https://www.ethan-shea.com/posts/verified-serializers"><![CDATA[<p>Written by <a href="/">Ethan Shea</a>, edited by <a href="https://jamesrwilcox.com/">James Wilcox</a>.</p>

<p>Serialization converts in-memory data to an external representation, typically a
list or stream of bytes, which is then ready to be stored on disk or sent over
the network.</p>

<p>This post describes Cheerios, a verified library for serialization in Coq.
Cheerios uses typeclasses to make it easy to create new serializers by composing
existing serializers, such that the correctness proofs also compose.  We first
give an overview of the core definitions of Cheerios and show how to build
simple serializers for booleans, natural numbers, and pairs.  Then, we describe
two generic strategies for serializing recursive “container-like” types, such as
lists and trees, and discuss the tradeoffs in proof effort between the
strategies. A recurring theme is the challenge of expressing decoders via
<em>structural recursion</em>.</p>

<p>This post is generated from a literate Coq <a href="https://github.com/pensono/Ethan-Cheerios/blob/master/blog_comparison.v">file</a>, which we encourage you to step through.</p>

<h2 id="defining-serialization">Defining Serialization</h2>

<p>In order to define serialization, three things are needed: a type for the serialization function, a type for the deserialization function, and a correctness specification. The correctness spec should roughly show that serialization and deserialization are inverses. This enables the proof that any object can be serialized then deserialized into the same object. We’ll start with serialization because it conceptually comes first in the process.</p>

<p>In order to serialize something, all of its information must be mapped into bits. It makes sense then to define a serializer for some type <code class="language-plaintext highlighter-rouge">A</code> as <code class="language-plaintext highlighter-rouge">A -&gt; list bool</code>. Take the following type for example, representing Olympic medals:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Inductive</span><span class="w"> </span><span class="no">medal</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">Gold</span><span class="w"> </span><span class="p">|</span><span class="w"> </span><span class="no">Silver</span><span class="w"> </span><span class="p">|</span><span class="w"> </span><span class="no">Bronze</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>A serialization function should map each case to a sequence of bits. There are many ways this could be done, each with different tradeoffs that will be explored later. For now, we just pick one.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Definition</span><span class="w"> </span><span class="no">medal_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">m</span><span class="p">:</span><span class="w"> </span><span class="no">medal</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">m</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">Gold</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">true</span><span class="p">]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">Silver</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">false</span><span class="p">]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">Bronze</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">]</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>As it turns out, this first attempt at a type will be exactly what is needed.</p>

<p>Now a type for the deserializer can be determined. We want something that acts as an
inverse to the serialization function we picked. At first thought, <code class="language-plaintext highlighter-rouge">list bool -&gt; A</code> seems
like a good option. This would allow the correctness spec to be <code class="language-plaintext highlighter-rouge">deserialize (serialize a) = a</code>.
However, this runs into problems pretty quickly.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Fail</span><span class="w"> </span><span class="k">Definition</span><span class="w"> </span><span class="no">medal_deserialize</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">medal</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">true</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Gold</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">false</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Silver</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Bronze</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Coq catches the mistake and points out that <code class="language-plaintext highlighter-rouge">bools</code> is not matched exhaustively. What if
it’s empty? Fundamentally, this problem is encountered because not every sequence of booleans
decodes into a <code class="language-plaintext highlighter-rouge">medal</code>. Even non-empty sequences such as <code class="language-plaintext highlighter-rouge">[false; true]</code> pose issues. Since
these sequences are not produced by the serializer, they can be considered erroneous.
In Cheerios we handle this case by returning the <code class="language-plaintext highlighter-rouge">option</code> constructor <code class="language-plaintext highlighter-rouge">None</code> to indicate an
error.</p>

<p>This makes the spec become <code class="language-plaintext highlighter-rouge">deserialize (serialize a) = Some a</code>. In English: deserialization
on any serialized stream always succeeds and returns the correct value.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Definition</span><span class="w"> </span><span class="no">medal_deserialize1</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="no">medal</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">true</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="no">Gold</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">false</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="no">Silver</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="no">Bronze</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
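
<p>As a quick sanity check, the sketch below restates the definitions so far and verifies the spec <code class="language-plaintext highlighter-rouge">deserialize (serialize a) = Some a</code> for this first deserializer. It also confirms that two streams the serializer never produces are rejected. The lemma and example names are hypothetical, chosen for illustration, and are not part of Cheerios.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Restated from the post so the snippet checks on its own. *)
Inductive medal := Gold | Silver | Bronze.

Definition medal_serialize (m : medal) : list bool :=
  match m with
  | Gold =&gt; [true; true]
  | Silver =&gt; [true; false]
  | Bronze =&gt; [false]
  end.

Definition medal_deserialize1 (bools : list bool) : option medal :=
  match bools with
  | [true; true] =&gt; Some Gold
  | [true; false] =&gt; Some Silver
  | [false] =&gt; Some Bronze
  | _ =&gt; None
  end.

(* Hypothetical name: the round-trip spec for this first deserializer. *)
Lemma medal_roundtrip1 :
  forall m : medal, medal_deserialize1 (medal_serialize m) = Some m.
Proof.
  intros m.
  destruct m; reflexivity.
Qed.

(* Streams the serializer never produces fall through to None. *)
Example empty_stream_rejected : medal_deserialize1 [] = None.
Proof. reflexivity. Qed.

Example malformed_stream_rejected : medal_deserialize1 [false; true] = None.
Proof. reflexivity. Qed.
</code></pre></div></div>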

<p>This works for a single medal encoded in the bitstream, but problems arise when the work above is reused for a type that requires composition, like a pair of medals. Serialization works just fine, but deserialization is problematic.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Definition</span><span class="w"> </span><span class="no">medal_serialize_pair</span><span class="w"> </span><span class="p">(</span><span class="no">medals</span><span class="p">:</span><span class="w"> </span><span class="no">medal</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">medal</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="no">medal_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">fst</span><span class="w"> </span><span class="no">medals</span><span class="p">)</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">medal_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">snd</span><span class="w"> </span><span class="no">medals</span><span class="p">)</span><span class="pi">.</span><span class="w">

</span><span class="no">Fail</span><span class="w"> </span><span class="k">Definition</span><span class="w"> </span><span class="no">medal_deserialize_pair</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w">
    </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">medal</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">medal</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
</span><span class="p">(</span><span class="no">medal_deserialize1</span><span class="w"> </span><span class="no">bools</span><span class="p">,</span><span class="w"> </span><span class="no">medal_deserialize1</span><span class="w"> </span><span class="no">hmmm</span><span class="p">)</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>When deserializing the first medal, the entire list is consumed. There is nothing to pass into the second call to <code class="language-plaintext highlighter-rouge">medal_deserialize1</code> because it is not known how much of the list has been deserialized. The deserializer needs a way to communicate back to the caller how much of the stream remains. In Cheerios, this is represented with the type <code class="language-plaintext highlighter-rouge">medal * list bool</code>, where
the deserialized medal and the remaining portion of the stream are returned. This is wrapped in an option to allow the entire
deserialization operation to fail. Failure happens at this level because once an error is encountered, it is
impossible in general to resume deserialization of the remaining content.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Definition</span><span class="w"> </span><span class="no">medal_deserialize</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w">
    </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">medal</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">Gold</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">Silver</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">Bronze</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
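
<p>To see the remainder being handed back, here is a small sketch that runs this deserializer on a few concrete streams. The definition is restated from above so the snippet stands alone, and the example names are hypothetical.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Restated from the post so the snippet checks on its own. *)
Inductive medal := Gold | Silver | Bronze.

Definition medal_deserialize (bools : list bool)
    : option (medal * list bool) :=
  match bools with
  | true :: true :: bools =&gt; Some (Gold, bools)
  | true :: false :: bools =&gt; Some (Silver, bools)
  | false :: bools =&gt; Some (Bronze, bools)
  | _ =&gt; None
  end.

(* Silver followed by Bronze: the first call stops after Silver
   and returns the untouched tail [false]. *)
Example silver_then_bronze :
  medal_deserialize [true; false; false] = Some (Silver, [false]).
Proof. reflexivity. Qed.

(* Feeding that tail back in yields Bronze and an empty remainder. *)
Example bronze_alone :
  medal_deserialize [false] = Some (Bronze, []).
Proof. reflexivity. Qed.

(* A stream cut off in the middle of a medal is still an error. *)
Example truncated_stream : medal_deserialize [true] = None.
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>The caller decides what to do with the remainder, which is what makes it possible to chain one deserializer after another.</p>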

<p>As we will see shortly, this type is sufficient to support both composition and malformed inputs.
Let’s take a moment to generalize these definitions before continuing so we can arrive at a definition for the spec.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Definition</span><span class="w"> </span><span class="no">serializer</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">deserializer</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>How does this alter the correctness specification? We can start by taking what
we had last time and making it typecheck:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">deser</span><span class="w"> </span><span class="p">(</span><span class="no">ser</span><span class="w"> </span><span class="no">a</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="p">[])</span><span class="w">
</span></code></pre></div></div>

<p>However, this still doesn’t address the problem with the remaining bools. How do we reason
about deserialization with any other input following? Another attempt leads us to something
like this:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">deser</span><span class="w"> </span><span class="p">(</span><span class="no">ser</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">ser</span><span class="w"> </span><span class="no">b</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">ser</span><span class="w"> </span><span class="no">b</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>

<p>This works, but now exactly two objects must be encoded in the stream. We can’t
easily reason about deserializing a single object, or more than two, this way.
Generalizing again for what comes after gives:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">deser</span><span class="w"> </span><span class="p">(</span><span class="no">ser</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>

<p>Now the dependence on a second object is removed and as a side effect the spec
becomes stronger, allowing any data to follow rather than just data produced
by some serializer.</p>

<p>Note that the spec only needs to worry about encodings which the serializer produces.
This eliminates the need to reason about the error cases that were necessary in the
deserializer. However, as a result, the spec says nothing about how malformed bitstrings are
parsed, or whether every deserialized value can be generated by exactly one bitstring. These
may be useful properties to prove, but Cheerios does not handle deserialization
from unknown and unverified sources, so this minimal spec is enough.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Definition</span><span class="w"> </span><span class="no">ser_deser_spec</span><span class="w"> </span><span class="no">A</span><span class="w">
           </span><span class="p">(</span><span class="no">ser</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">serializer</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w">
           </span><span class="p">(</span><span class="no">deser</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">deserializer</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">forall</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">),</span><span class="w">
      </span><span class="p">(</span><span class="no">deser</span><span class="w"> </span><span class="p">(</span><span class="no">ser</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">bools</span><span class="p">))</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
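
<p>One way to convince yourself that this spec subsumes the two earlier attempts is to derive them from it. The sketch below does so under the definitions above, restated so it checks on its own. The lemma names are hypothetical, and the second lemma assumes both objects share one type and one serializer, which is a simplification.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Restated from the post so the snippet checks on its own. *)
Definition serializer (A : Type) := A -&gt; list bool.
Definition deserializer (A : Type) :=
  list bool -&gt; option (A * list bool).

Definition ser_deser_spec A
           (ser : serializer A)
           (deser : deserializer A) :=
  forall (a : A) (bools : list bool),
    deser (ser a ++ bools) = Some (a, bools).

(* First attempt: nothing follows the object (take bools := []). *)
Lemma spec_empty_tail :
  forall (A : Type) (ser : serializer A) (deser : deserializer A),
    ser_deser_spec A ser deser -&gt;
    forall a : A, deser (ser a) = Some (a, []).
Proof.
  intros A ser deser H a.
  unfold ser_deser_spec in H.
  specialize (H a []).
  rewrite app_nil_r in H.
  exact H.
Qed.

(* Second attempt: another serialized object follows
   (take bools := ser b). *)
Lemma spec_two_objects :
  forall (A : Type) (ser : serializer A) (deser : deserializer A),
    ser_deser_spec A ser deser -&gt;
    forall a b : A, deser (ser a ++ ser b) = Some (a, ser b).
Proof.
  intros A ser deser H a b.
  exact (H a (ser b)).
Qed.
</code></pre></div></div>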

<p>Wrapping this up in a class gives us the following definition, which bundles
three things: a serializer, a deserializer, and a proof of correctness.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Class</span><span class="w"> </span><span class="no">Serializer</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="no">serialize</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">;</span><span class="w">
    </span><span class="no">deserialize</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">);</span><span class="w">
    </span><span class="no">ser_deser_identity</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">ser_deser_spec</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="no">deserialize</span><span class="w">
</span><span class="p">}</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>In general, the correctness proofs tend to be straightforward and
repetitive, but this first one is included here to show the structure.
Concretely this becomes:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Theorem</span><span class="w"> </span><span class="no">medal_ser_deser_identity</span><span class="w"> </span><span class="p">:</span><span class="w">
  </span><span class="no">ser_deser_spec</span><span class="w"> </span><span class="no">medal</span><span class="w"> </span><span class="no">medal_serialize</span><span class="w"> </span><span class="no">medal_deserialize</span><span class="pi">.</span><span class="w">
</span><span class="k">Proof</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">ser_deser_spec</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">medal_deserialize</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">medal_serialize</span><span class="pi">.</span><span class="w">
  </span><span class="kp">intros</span><span class="w"> </span><span class="no">m</span><span class="pi">.</span><span class="w">
  </span><span class="kp">destruct</span><span class="w"> </span><span class="no">m</span><span class="p">;</span><span class="w"> </span><span class="ne">reflexivity</span><span class="pi">.</span><span class="w">
</span><span class="k">Qed</span><span class="pi">.</span><span class="w">

</span><span class="k">Instance</span><span class="w"> </span><span class="no">MedalSerializer</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">Serializer</span><span class="w"> </span><span class="no">medal</span><span class="pi">.</span><span class="w">
</span><span class="k">Proof</span><span class="pi">.</span><span class="w">
</span><span class="ne">exact</span><span class="w"> </span><span class="p">{|</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">medal_serialize</span><span class="p">;</span><span class="w">
         </span><span class="no">deserialize</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">medal_deserialize</span><span class="p">;</span><span class="w">
         </span><span class="no">ser_deser_identity</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">medal_ser_deser_identity</span><span class="p">;</span><span class="w">
       </span><span class="p">|}</span><span class="pi">.</span><span class="w">
</span><span class="k">Defined</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
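
<p>With the instance registered, the overloaded <code class="language-plaintext highlighter-rouge">serialize</code> and <code class="language-plaintext highlighter-rouge">deserialize</code> can be used directly and Coq picks the medal implementation from the types. The following is a self-contained sketch of that usage. It restates the definitions above, declares the instance with <code class="language-plaintext highlighter-rouge">:=</code> rather than in proof mode (a stylistic choice on my part; either way the instance stays transparent so the examples can compute), and uses hypothetical example names.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Restated from the post so the snippet checks on its own. *)
Inductive medal := Gold | Silver | Bronze.

Definition medal_serialize (m : medal) : list bool :=
  match m with
  | Gold =&gt; [true; true]
  | Silver =&gt; [true; false]
  | Bronze =&gt; [false]
  end.

Definition medal_deserialize (bools : list bool)
    : option (medal * list bool) :=
  match bools with
  | true :: true :: bools =&gt; Some (Gold, bools)
  | true :: false :: bools =&gt; Some (Silver, bools)
  | false :: bools =&gt; Some (Bronze, bools)
  | _ =&gt; None
  end.

Definition ser_deser_spec (A : Type)
           (ser : A -&gt; list bool)
           (deser : list bool -&gt; option (A * list bool)) :=
  forall (a : A) (bools : list bool),
    deser (ser a ++ bools) = Some (a, bools).

Class Serializer (A : Type) : Type := {
    serialize : A -&gt; list bool;
    deserialize : list bool -&gt; option (A * list bool);
    ser_deser_identity : ser_deser_spec A serialize deserialize
}.

Theorem medal_ser_deser_identity :
  ser_deser_spec medal medal_serialize medal_deserialize.
Proof.
  unfold ser_deser_spec.
  intros m bools.
  destruct m; reflexivity.
Qed.

Global Instance MedalSerializer : Serializer medal :=
  {| serialize := medal_serialize;
     deserialize := medal_deserialize;
     ser_deser_identity := medal_ser_deser_identity |}.

(* The expected result type tells Coq to use MedalSerializer. *)
Example silver_with_trailing_data :
  deserialize (serialize Silver ++ [true]) = Some (Silver, [true]).
Proof. reflexivity. Qed.

(* Two medals back to back: the first call returns Gold and leaves
   the encoding of Bronze untouched. *)
Example two_medals_in_a_row :
  deserialize (serialize Gold ++ serialize Bronze)
  = Some (Gold, serialize Bronze).
Proof. reflexivity. Qed.
</code></pre></div></div>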

<p>Generalizing the pair deserializer to arbitrary types <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> comes
naturally now that there are better type signatures for serialization
and deserialization. Wrapping all three components in a section avoids some
boilerplate. Note that the type system requires serializers for both <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> in order
for the <code class="language-plaintext highlighter-rouge">A * B</code> serializer to function.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Section</span><span class="w"> </span><span class="no">PairSerializer</span><span class="pi">.</span><span class="w">
</span><span class="k">Variable</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="pi">.</span><span class="w">
</span><span class="k">Variable</span><span class="w"> </span><span class="no">B</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="pi">.</span><span class="w">
</span><span class="k">Variable</span><span class="w"> </span><span class="no">serA</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">Serializer</span><span class="w"> </span><span class="no">A</span><span class="pi">.</span><span class="w">
</span><span class="k">Variable</span><span class="w"> </span><span class="no">serB</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">Serializer</span><span class="w"> </span><span class="no">B</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">pair_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">p</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">B</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="no">serialize</span><span class="w"> </span><span class="p">(</span><span class="no">fst</span><span class="w"> </span><span class="no">p</span><span class="p">)</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="p">(</span><span class="no">snd</span><span class="w"> </span><span class="no">p</span><span class="p">)</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">pair_deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> 
    </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">((</span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">B</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
    </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">b</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">((</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">b</span><span class="p">),</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Theorem</span><span class="w"> </span><span class="no">pair_ser_deser_identity</span><span class="w"> </span><span class="p">:</span><span class="w"> 
  </span><span class="no">ser_deser_spec</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">B</span><span class="p">)</span><span class="w"> </span><span class="no">pair_serialize</span><span class="w"> </span><span class="no">pair_deserialize</span><span class="pi">.</span><span class="w">
</span><span class="k">Proof</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">ser_deser_spec</span><span class="pi">.</span><span class="w">
  </span><span class="kp">intros</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">pair_serialize</span><span class="pi">.</span><span class="w">
  </span><span class="kp">rewrite</span><span class="w"> </span><span class="no">app_ass</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">pair_deserialize</span><span class="pi">.</span><span class="w">
  </span><span class="kp">rewrite</span><span class="w"> </span><span class="no">ser_deser_identity</span><span class="p">,</span><span class="w"> </span><span class="no">ser_deser_identity</span><span class="pi">.</span><span class="w">
  </span><span class="kp">rewrite</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="no">surjective_pairing</span><span class="pi">.</span><span class="w">
  </span><span class="ne">reflexivity</span><span class="pi">.</span><span class="w">
</span><span class="k">Qed</span><span class="pi">.</span><span class="w">

</span><span class="k">Instance</span><span class="w"> </span><span class="no">PairSerializer</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">Serializer</span><span class="w"> </span><span class="p">(</span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">B</span><span class="p">)</span><span class="pi">.</span><span class="w">
</span><span class="k">Proof</span><span class="pi">.</span><span class="w">
</span><span class="ne">exact</span><span class="w"> </span><span class="p">{|</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">pair_serialize</span><span class="p">;</span><span class="w">
         </span><span class="no">deserialize</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">pair_deserialize</span><span class="p">;</span><span class="w">
         </span><span class="no">ser_deser_identity</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">pair_ser_deser_identity</span><span class="p">;</span><span class="w">
       </span><span class="p">|}</span><span class="pi">.</span><span class="w">
</span><span class="k">Defined</span><span class="pi">.</span><span class="w">

</span><span class="k">End</span><span class="w"> </span><span class="no">PairSerializer</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Note that the variable <code class="language-plaintext highlighter-rouge">bools</code> is shadowed several times in this definition. Normally this can complicate
code, but in this case it improves clarity because <code class="language-plaintext highlighter-rouge">bools</code> always refers to “what’s left to parse”.</p>

<p>Now, we will build a simple (inefficient<sup id="fnref:efficient" role="doc-noteref"><a href="#fn:efficient" class="footnote" rel="footnote">1</a></sup>) serializer/deserializer for a more useful datatype, <code class="language-plaintext highlighter-rouge">nat</code>s.
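The encoding is essentially the unary representation of the natural number: one <code class="language-plaintext highlighter-rouge">true</code> per successor, followed by a terminating <code class="language-plaintext highlighter-rouge">false</code>.</p>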

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">nat_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">n</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">nat</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">n</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">O</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">S</span><span class="w"> </span><span class="no">n</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">]</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="p">(</span><span class="no">nat_serialize</span><span class="w"> </span><span class="no">n</span><span class="p">)</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">nat_deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">nat</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
    </span><span class="kr">match</span><span class="w"> </span><span class="no">nat_deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">n</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">S</span><span class="w"> </span><span class="no">n</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">O</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="c">(* Deserializing an empty stream *)</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Theorem</span><span class="w"> </span><span class="no">nat_ser_deser_identity</span><span class="w"> </span><span class="p">:</span><span class="w">
  </span><span class="no">ser_deser_spec</span><span class="w"> </span><span class="no">nat</span><span class="w"> </span><span class="no">nat_serialize</span><span class="w"> </span><span class="no">nat_deserialize</span><span class="pi">.</span><span class="w">
</span><span class="k">Proof</span><span class="pi">.</span><span class="w">
  </span><span class="kp">unfold</span><span class="w"> </span><span class="no">ser_deser_spec</span><span class="pi">.</span><span class="w">
  </span><span class="kp">intros</span><span class="w"> </span><span class="no">n</span><span class="p">;</span><span class="w"> </span><span class="kp">induction</span><span class="w"> </span><span class="no">n</span><span class="p">;</span><span class="w"> </span><span class="kp">intros</span><span class="pi">.</span><span class="w">
  </span><span class="p">-</span><span class="w"> </span><span class="kp">simpl</span><span class="pi">.</span><span class="w"> </span><span class="ne">reflexivity</span><span class="pi">.</span><span class="w">
  </span><span class="p">-</span><span class="w"> </span><span class="kp">simpl</span><span class="pi">.</span><span class="w">
    </span><span class="kp">rewrite</span><span class="w"> </span><span class="no">IHn</span><span class="pi">.</span><span class="w">
    </span><span class="ne">reflexivity</span><span class="pi">.</span><span class="w">
</span><span class="k">Qed</span><span class="pi">.</span><span class="w">

</span><span class="k">Instance</span><span class="w"> </span><span class="no">NatSerializer</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">Serializer</span><span class="w"> </span><span class="no">nat</span><span class="pi">.</span><span class="w">
</span><span class="k">Proof</span><span class="pi">.</span><span class="w">
</span><span class="ne">exact</span><span class="w"> </span><span class="p">{|</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">nat_serialize</span><span class="p">;</span><span class="w">
         </span><span class="no">deserialize</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">nat_deserialize</span><span class="p">;</span><span class="w">
         </span><span class="no">ser_deser_identity</span><span class="w"> </span><span class="p">:=</span><span class="w"> </span><span class="no">nat_ser_deser_identity</span><span class="p">;</span><span class="w">
       </span><span class="p">|}</span><span class="pi">.</span><span class="w">
</span><span class="k">Defined</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
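
<p>As a quick sanity check of the unary encoding, a few concrete cases can be stated against the definitions above. This is only a sketch: it assumes it is checked in the same file, right after <code class="language-plaintext highlighter-rouge">NatSerializer</code>, with <code class="language-plaintext highlighter-rouge">ListNotations</code> in scope, and the <code class="language-plaintext highlighter-rouge">Example</code> names are made up for illustration.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Assumes nat_serialize and nat_deserialize from above are in scope. *)

(* 3 is encoded as three true bits and a terminating false. *)
Example nat_serialize_3 :
  nat_serialize 3 = [true; true; true; false].
Proof. reflexivity. Qed.

(* Deserialization stops at the first false and hands back the rest. *)
Example nat_deserialize_2_then_rest :
  nat_deserialize [true; true; false; true] = Some (2, [true]).
Proof. reflexivity. Qed.

(* A stream that ends before the terminator is rejected. *)
Example nat_deserialize_truncated :
  nat_deserialize [true; true] = None.
Proof. reflexivity. Qed.
</code></pre></div></div>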

<p>Notice that the information about when to <em>stop</em> deserialization of each element must be encoded
into the stream itself. For example, with the following definition of <code class="language-plaintext highlighter-rouge">nat_serialize</code>, deserialization
of <code class="language-plaintext highlighter-rouge">nat * nat</code> would become problematic.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">nat_serialize_broken</span><span class="w"> </span><span class="p">(</span><span class="no">n</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">nat</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">n</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">O</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">S</span><span class="w"> </span><span class="no">n</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">]</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="p">(</span><span class="no">nat_serialize_broken</span><span class="w"> </span><span class="no">n</span><span class="p">)</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Under this definition, it’s unclear what deserializing <code class="language-plaintext highlighter-rouge">[true, true, true]</code> as a pair of <code class="language-plaintext highlighter-rouge">nat</code>s should
return. It could be <code class="language-plaintext highlighter-rouge">(0,3)</code>, <code class="language-plaintext highlighter-rouge">(1,2)</code>, <code class="language-plaintext highlighter-rouge">(2,1)</code> or <code class="language-plaintext highlighter-rouge">(3,0)</code>. To remove this ambiguity, the information about when to stop must be
encoded explicitly in the stream, rather than implicitly by treating the end of the stream as a terminator.
Consider the pair of <code class="language-plaintext highlighter-rouge">nat</code>s serialized as <code class="language-plaintext highlighter-rouge">[true, false, true, true, false]</code> by the working serializer.
It is unambiguously <code class="language-plaintext highlighter-rouge">(1, 2)</code>. When deserializing, it is known precisely when each <code class="language-plaintext highlighter-rouge">nat</code>
finishes (when <code class="language-plaintext highlighter-rouge">false</code> is reached), and when the pair finishes (when the second <code class="language-plaintext highlighter-rouge">nat</code> finishes).
This information about the structure of the encoded
data plays a crucial part in showing <code class="language-plaintext highlighter-rouge">ser_deser_identity</code>.</p>
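
<p>The ambiguity can also be checked mechanically. The sketch below is self-contained and uses a hypothetical name, <code class="language-plaintext highlighter-rouge">unary_no_stop</code>, for a unary encoding with no terminator (it is not one of the definitions from this development). It shows that different pairs collapse to the same stream and, more generally, that the stream only determines the sum of the two numbers.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Hypothetical: unary encoding without a terminating false. *)
Fixpoint unary_no_stop (n : nat) : list bool :=
  match n with
  | O =&gt; []
  | S n' =&gt; true :: unary_no_stop n'
  end.

(* The pairs (1, 2) and (2, 1) produce the same three bits. *)
Example ambiguous_1_2_vs_2_1 :
  unary_no_stop 1 ++ unary_no_stop 2 = unary_no_stop 2 ++ unary_no_stop 1.
Proof. reflexivity. Qed.

(* In general, concatenating two encodings only remembers a + b. *)
Lemma unary_no_stop_app : forall a b,
  unary_no_stop a ++ unary_no_stop b = unary_no_stop (a + b).
Proof.
  intros a b; induction a.
  - simpl. reflexivity.
  - simpl. rewrite IHa. reflexivity.
Qed.
</code></pre></div></div>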

<h2 id="list-serialization">List Serialization</h2>

<p>When serializing lists (or any variable-sized collection) there must be some information
about the structure in the serialized stream. Imagine this is not done, and a pair of lists is serialized
into the bitstream. This would produce an encoding which looks like the figure below. It’s
impossible to tell where one list stops and the next begins just by looking at the stream.</p>

<p><img src="/assets/posts/verified-serializers/list_broken.png" alt="" class="align-center" /></p>

<p>This serializer is broken for the same reason as the broken <code class="language-plaintext highlighter-rouge">nat</code> serializer: the information in a serialized
object must be entirely contained within the bitstream. Note that we don’t run into this problem with any
collection of fixed size, like a pair or vector. It is clear when to stop deserializing a <code class="language-plaintext highlighter-rouge">Vec 5</code>: after 5
elements have been deserialized. In this case, the information about the shape of the data is encoded in the
type. Since the type is known to both the serializer and the deserializer, it does not need to be encoded
in the bitstream.</p>

<p>Let’s start by solving this problem with a “continue” bit before every element. If it is true, an element
follows; if it is false, the end of the list has been reached. This appears as follows:</p>

<p><img src="/assets/posts/verified-serializers/list_interleaved.png" alt="" class="align-center" /></p>

<p>Let’s see what this looks like in code.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">list_serialize_inter</span><span class="w"> </span><span class="p">(</span><span class="no">l</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">h</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">]</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="no">h</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">list_serialize_inter</span><span class="w"> </span><span class="no">t</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
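
<p>To make the layout concrete, here is a small self-contained sketch of the same scheme specialized to lists of <code class="language-plaintext highlighter-rouge">nat</code>. The names <code class="language-plaintext highlighter-rouge">unary_bits</code> and <code class="language-plaintext highlighter-rouge">nats_ser_inter</code> are hypothetical stand-ins for <code class="language-plaintext highlighter-rouge">nat_serialize</code> and <code class="language-plaintext highlighter-rouge">list_serialize_inter</code>, so that the snippet can be checked in a fresh file.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Hypothetical local copy of the unary nat encoding. *)
Fixpoint unary_bits (n : nat) : list bool :=
  match n with
  | O =&gt; [false]
  | S n' =&gt; true :: unary_bits n'
  end.

(* A "continue" bit precedes every element; false closes the list. *)
Fixpoint nats_ser_inter (l : list nat) : list bool :=
  match l with
  | [] =&gt; [false]
  | h :: t =&gt; [true] ++ unary_bits h ++ nats_ser_inter t
  end.

(* [1; 2] becomes: continue, 1, continue, 2, stop. *)
Example nats_ser_inter_1_2 :
  nats_ser_inter [1; 2] =
    [true] ++ [true; false] ++ [true] ++ [true; true; false] ++ [false].
Proof. reflexivity. Qed.
</code></pre></div></div>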

<p>With this scheme, deserialization again proves to be difficult. In the attempted definition below, <code class="language-plaintext highlighter-rouge">bools_after_elem</code>
is not a syntactic subterm of <code class="language-plaintext highlighter-rouge">bools</code>, so the termination checker rejects the definition. Because <code class="language-plaintext highlighter-rouge">bools_after_elem</code> is returned from a call to <code class="language-plaintext highlighter-rouge">deserialize</code>, the subterm relationship is hidden from the checker. When executed, the
definition <em>does</em> terminate, since <code class="language-plaintext highlighter-rouge">bools_after_elem</code> is a strict suffix of <code class="language-plaintext highlighter-rouge">bools</code>,
but the termination checker cannot see this:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Fail</span><span class="w"> </span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">list_deserialize_inter</span><span class="w">
  </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">list</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">([],</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
    </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">bools_after_elem</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">list_deserialize_inter</span><span class="w"> </span><span class="no">bools_after_elem</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">tail</span><span class="p">,</span><span class="w"> </span><span class="no">bools_after_list</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
          </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">tail</span><span class="p">,</span><span class="w"> </span><span class="no">bools_after_list</span><span class="p">)</span><span class="w">
      </span><span class="kr">end</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Intuitively, this deserialization function cannot be defined
without general recursion. To solve this recursion problem, the same
information encoded in the “continue” bits can be moved to the front of the list’s
encoding in the form of a size. Then the rest of the deserializer can recurse on the
number of elements remaining.</p>

<p><img src="/assets/posts/verified-serializers/list_front.png" alt="" class="align-center" /></p>

<p>Programmatically,</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">list_serialize_elts</span><span class="w"> </span><span class="p">(</span><span class="no">l</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">h</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="no">h</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">list_serialize_elts</span><span class="w"> </span><span class="no">t</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">list_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">l</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="no">nat_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">length</span><span class="w"> </span><span class="no">l</span><span class="p">)</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">list_serialize_elts</span><span class="w"> </span><span class="no">l</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">list_deserialize_elts</span><span class="w"> </span><span class="p">(</span><span class="no">size</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">nat</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w">
      </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">list</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">size</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">O</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">([],</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">S</span><span class="w"> </span><span class="no">size</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
    </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">n</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">list_deserialize_elts</span><span class="w"> </span><span class="no">size</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">tail</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">n</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">tail</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
      </span><span class="kr">end</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">list_deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">size</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">list_deserialize_elts</span><span class="w"> </span><span class="no">size</span><span class="w"> </span><span class="no">bools</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
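
<p>For a concrete round trip, the sketch below specializes the size-first scheme to lists of <code class="language-plaintext highlighter-rouge">nat</code>. It is self-contained so that it can be checked in a fresh file, and every name in it (<code class="language-plaintext highlighter-rouge">unary_ser</code>, <code class="language-plaintext highlighter-rouge">unary_deser</code>, <code class="language-plaintext highlighter-rouge">nats_ser</code>, <code class="language-plaintext highlighter-rouge">nats_deser</code> and their helpers) is a hypothetical stand-in for the corresponding typeclass-based definition above.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Hypothetical local copies of the unary nat encoding. *)
Fixpoint unary_ser (n : nat) : list bool :=
  match n with
  | O =&gt; [false]
  | S n' =&gt; true :: unary_ser n'
  end.

Fixpoint unary_deser (bools : list bool) : option (nat * list bool) :=
  match bools with
  | true :: rest =&gt;
    match unary_deser rest with
    | None =&gt; None
    | Some (n, rest') =&gt; Some (S n, rest')
    end
  | false :: rest =&gt; Some (O, rest)
  | [] =&gt; None
  end.

(* Elements only, with no shape information. *)
Fixpoint nats_ser_elts (l : list nat) : list bool :=
  match l with
  | [] =&gt; []
  | h :: t =&gt; unary_ser h ++ nats_ser_elts t
  end.

(* Length first, then the elements. *)
Definition nats_ser (l : list nat) : list bool :=
  unary_ser (length l) ++ nats_ser_elts l.

(* Structural recursion on the number of elements still expected. *)
Fixpoint nats_deser_elts (size : nat) (bools : list bool)
    : option (list nat * list bool) :=
  match size with
  | O =&gt; Some ([], bools)
  | S size' =&gt;
    match unary_deser bools with
    | None =&gt; None
    | Some (n, rest) =&gt;
      match nats_deser_elts size' rest with
      | None =&gt; None
      | Some (tail, rest') =&gt; Some (n :: tail, rest')
      end
    end
  end.

Definition nats_deser (bools : list bool) : option (list nat * list bool) :=
  match unary_deser bools with
  | None =&gt; None
  | Some (size, rest) =&gt; nats_deser_elts size rest
  end.

(* Size 2, then the elements 1 and 2. *)
Example nats_ser_1_2 :
  nats_ser [1; 2] = [true; true; false; true; false; true; true; false].
Proof. reflexivity. Qed.

(* Trailing bits after the list are returned untouched. *)
Example nats_round_trip :
  nats_deser (nats_ser [1; 2] ++ [true]) = Some ([1; 2], [true]).
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">nats_deser_elts</code> recurses on <code class="language-plaintext highlighter-rouge">size</code> rather than on the remaining bits, which is what lets the termination checker accept it.</p>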

<p>This gives a deserializer which can be defined using only structural recursion, just
by moving the information around. It’s worth noting that because the size information
is grouped together instead of spread apart, it would be much easier to make the encoding
format more efficient by swapping in a more efficient <code class="language-plaintext highlighter-rouge">nat</code> serializer. The only property
lost with this encoding is that it is no longer possible to reason about a tail of the
list in isolation; the size must also be considered.</p>

<h2 id="binary-trees">Binary Trees</h2>

<p>To continue exploring this idea of serializing shape, we need to look at a more complicated data
structure such as a binary tree. Our definition of a binary tree is straightforward:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Inductive</span><span class="w"> </span><span class="no">tree</span><span class="p">:</span><span class="w"> </span><span class="kr">Type</span><span class="w"> </span><span class="p">:=</span><span class="w"> 
</span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w">
</span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="no">tree</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Just as with lists, there are two general approaches to serializing trees: interleaved and up-front.</p>

<h3 id="interleaved-tree-serializer">Interleaved Tree Serializer</h3>

<p>For the interleaved shape tree serializer, the concept of a “path” is needed. A path is simply the list of
directions taken from the root to reach some node. We’ll use <code class="language-plaintext highlighter-rouge">true</code> to represent left and <code class="language-plaintext highlighter-rouge">false</code>
to represent right. The head of the list is the first direction taken from the root.
Below is the path <code class="language-plaintext highlighter-rouge">[true; false]</code>.</p>

<p><img src="/assets/posts/verified-serializers/path.png" alt="" class="align-center" /></p>
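
<p>To make the path convention concrete, here is a minimal sketch of following a path down a tree. The names <code class="language-plaintext highlighter-rouge">tree_lookup</code> and <code class="language-plaintext highlighter-rouge">sample_tree</code> are hypothetical and exist only for this illustration. The tree type is restated with <code class="language-plaintext highlighter-rouge">A</code> as a parameter so the snippet stands alone.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Restated so this sketch is self-contained. *)
Inductive tree (A : Type) : Type :=
| leaf : tree A
| node : A -&gt; tree A -&gt; tree A -&gt; tree A.
Arguments leaf {A}.
Arguments node {A} _ _ _.

(* Hypothetical helper: walk from the root, true = left, false = right.
   Returns the element stored at the end of the path, if there is one. *)
Fixpoint tree_lookup {A : Type} (t : tree A) (path : list bool) : option A :=
  match t with
  | leaf =&gt; None
  | node a l r =&gt;
      match path with
      | [] =&gt; Some a
      | true :: rest =&gt; tree_lookup l rest
      | false :: rest =&gt; tree_lookup r rest
      end
  end.

Definition sample_tree : tree nat :=
  node 1 (node 2 leaf (node 3 leaf leaf)) leaf.

(* Left from the root reaches 2, then right from there reaches 3. *)
Example lookup_true_false :
  tree_lookup sample_tree [true; false] = Some 3.
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>The empty path addresses the root, and every extra direction moves one level further down.</p>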

<p>Using the concept of a path, the position and data of any node can be serialized. When this is done for
all nodes in the tree, all information captured by the original data structure has been encoded.<sup id="fnref:tree_efficient" role="doc-noteref"><a href="#fn:tree_efficient" class="footnote" rel="footnote">2</a></sup></p>

<p>A fully interleaved encoding cannot be deserialized without general recursion, but an
interleaved structure is still usable if there is just enough information up front to recurse on. The
number of nodes in the tree provides a nice metric. Our serializer will not be truly interleaved since
we require this header, but information about the shape will still be interleaved in the encoding.</p>

<p>The encoding using an interleaved structure looks like this:</p>

<p><img src="/assets/posts/verified-serializers/tree_interleaved.png" alt="" class="align-center" /></p>

<p>Serialization is performed as follows:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_size</span><span class="w"> </span><span class="p">(</span><span class="no">t</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">nat</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="mi">0</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="no">tree_size</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="no">tree_size</span><span class="w"> </span><span class="no">r</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_serialize_subtree_inter</span><span class="w"> 
    </span><span class="p">(</span><span class="no">t</span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">path</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[]</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="no">a</span><span class="w">
      </span><span class="o">++</span><span class="w"> </span><span class="no">tree_serialize_subtree_inter</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="p">(</span><span class="no">path</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">])</span><span class="w">
      </span><span class="o">++</span><span class="w"> </span><span class="no">tree_serialize_subtree_inter</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">(</span><span class="no">path</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">])</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">tree_serialize_inter</span><span class="w"> </span><span class="p">(</span><span class="no">t</span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="no">nat_serialize</span><span class="w"> </span><span class="p">(</span><span class="no">tree_size</span><span class="w"> </span><span class="no">t</span><span class="p">)</span><span class="w"> </span><span class="o">++</span><span class="w"> 
  </span><span class="no">tree_serialize_subtree_inter</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="p">[]</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Deserialization is more complicated. As elements are parsed, they are inserted into the tree built so far.
The insertion function used is not particularly robust, but no issues arise during deserialization as long as every node is
preceded by all of its ancestors. This is the case with a preorder traversal, and with other
traversals such as breadth-first search, so it meets our needs.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_insert</span><span class="w"> </span><span class="p">(</span><span class="no">into</span><span class="w"> </span><span class="no">t</span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)(</span><span class="no">path</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">):</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">into</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">t</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="c">(* not supported *)</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="p">(</span><span class="no">tree_insert</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="no">path</span><span class="p">)</span><span class="w"> </span><span class="no">r</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="p">(</span><span class="no">tree_insert</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="no">path</span><span class="p">)</span><span class="w">
      </span><span class="kr">end</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_deserialize_inter_impl</span><span class="w">
         </span><span class="p">(</span><span class="no">remaining</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">nat</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">root</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w">
         </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">remaining</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">S</span><span class="w"> </span><span class="no">n</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
    </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">path</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
        </span><span class="no">tree_deserialize_inter_impl</span><span class="w">
          </span><span class="no">n</span><span class="w">
          </span><span class="p">(</span><span class="no">tree_insert</span><span class="w"> </span><span class="no">root</span><span class="w"> </span><span class="p">(</span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="no">leaf</span><span class="p">)</span><span class="w"> </span><span class="no">path</span><span class="p">)</span><span class="w">
          </span><span class="no">bools</span><span class="w">
      </span><span class="kr">end</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">O</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">root</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">tree_deserialize_inter</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">nat_deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w"> 
  </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">size</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
    </span><span class="no">tree_deserialize_inter_impl</span><span class="w"> </span><span class="no">size</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="no">bools</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
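
<p>A small worked example may help show why the ordering matters. The sketch below restates <code class="language-plaintext highlighter-rouge">tree_insert</code> from above (with <code class="language-plaintext highlighter-rouge">A</code> made implicit so it stands alone) and adds a hypothetical helper, <code class="language-plaintext highlighter-rouge">single</code>, for building one-element trees.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

Inductive tree (A : Type) : Type :=
| leaf : tree A
| node : A -&gt; tree A -&gt; tree A -&gt; tree A.
Arguments leaf {A}.
Arguments node {A} _ _ _.

(* Same behaviour as the insertion function above. *)
Fixpoint tree_insert {A : Type} (into t : tree A) (path : list bool) : tree A :=
  match into with
  | leaf =&gt; t
  | node a l r =&gt;
      match path with
      | [] =&gt; t
      | true :: path =&gt; node a (tree_insert l t path) r
      | false :: path =&gt; node a l (tree_insert r t path)
      end
  end.

(* Hypothetical helper for this example only. *)
Definition single (n : nat) : tree nat := node n leaf leaf.

(* Preorder: each node arrives after its ancestors, so every insert lands on a leaf. *)
Example insert_preorder :
  tree_insert
    (tree_insert (tree_insert leaf (single 1) []) (single 2) [true])
    (single 3) [true; false]
  = node 1 (node 2 leaf (node 3 leaf leaf)) leaf.
Proof. reflexivity. Qed.

(* Out of order: 3 is placed at the root because its ancestors are missing,
   and is then silently replaced when 1 arrives with the empty path. *)
Example insert_out_of_order :
  tree_insert (tree_insert leaf (single 3) [true; false]) (single 1) []
  = single 1.
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>The second example is the kind of silent data loss that the deserializer avoids by relying on the serializer’s traversal order.</p>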

<p>Because of this concept of a path, which is a global address of any particular node, reasoning about a tree
becomes much more difficult. In particular, we must now prove that every insertion is made on a leaf of the
tree so it does not overwrite data or fall off the end.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">leaf_insertable</span><span class="w"> </span><span class="p">(</span><span class="no">into</span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)(</span><span class="no">path</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">):</span><span class="w"> </span><span class="kr">Prop</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">into</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
    </span><span class="c">(* Only if the path and tree run out at the same time
       should we be able to insert *)</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">True</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">False</span><span class="w">
      </span><span class="kr">end</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">False</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">(</span><span class="no">leaf_insertable</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">path</span><span class="p">)</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">path</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">(</span><span class="no">leaf_insertable</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="no">path</span><span class="p">)</span><span class="w">
      </span><span class="kr">end</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>
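
<p>As a quick sanity check of this predicate, here is a sketch with three concrete paths. The definitions are restated so the snippet stands alone, and the tree <code class="language-plaintext highlighter-rouge">t12</code> is a hypothetical example that does not appear in the original proof.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

Inductive tree (A : Type) : Type :=
| leaf : tree A
| node : A -&gt; tree A -&gt; tree A -&gt; tree A.
Arguments leaf {A}.
Arguments node {A} _ _ _.

(* Same predicate as above, with A implicit. *)
Fixpoint leaf_insertable {A : Type} (into : tree A) (path : list bool) : Prop :=
  match into with
  | leaf =&gt;
      match path with
      | [] =&gt; True
      | _ =&gt; False
      end
  | node _ l r =&gt;
      match path with
      | [] =&gt; False
      | true :: path =&gt; leaf_insertable l path
      | false :: path =&gt; leaf_insertable r path
      end
  end.

(* Hypothetical example tree: 1 at the root, 2 as its left child. *)
Definition t12 : tree nat := node 1 (node 2 leaf leaf) leaf.

(* The right child of 2 is a leaf, so inserting there is allowed. *)
Example right_of_2_ok : leaf_insertable t12 [true; false].
Proof. cbv. exact I. Qed.

(* The left child of the root is occupied by 2: inserting would overwrite data. *)
Example left_of_root_taken : ~ leaf_insertable t12 [true].
Proof. cbv. intro H. exact H. Qed.

(* The right child of the root is a leaf, so this path falls off the end. *)
Example path_too_long : ~ leaf_insertable t12 [false; true].
Proof. cbv. intro H. exact H. Qed.
</code></pre></div></div>

<p>The two rejected paths correspond to the two failure modes mentioned above: overwriting an existing node and running past the end of the tree.</p>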

<p>The proof for this serializer is quite large (about 150 lines) and uninteresting, so it has been omitted. It can be found <a href="https://github.com/pensono/Ethan-Cheerios/blob/f214e92e02bbeebd93c06c44efd73782325a44fa/blog_comparison.v#L717-L877">here</a>.</p>

<h3 id="up-front-tree-serializer">Up-front Tree Serializer</h3>

<p>Alternatively, the structure may be recorded at the beginning and then filled in as the tree is parsed. To do this, a tree’s shape can be reasoned about as the type <code class="language-plaintext highlighter-rouge">tree unit</code>, and its elements as the type <code class="language-plaintext highlighter-rouge">list A</code>.</p>

<p>This technique requires serialization and deserialization to be a two-step process, which has the advantage
of better mapping to the information stored in the tree (shape and element data), but the disadvantage
of being more complicated.</p>
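
<p>One way to picture the split into shape and elements is as a pair of projections out of a tree. This is only a sketch: <code class="language-plaintext highlighter-rouge">tree_shape</code>, <code class="language-plaintext highlighter-rouge">tree_elements</code> and <code class="language-plaintext highlighter-rouge">sample</code> are hypothetical names used for illustration, and the tree type is restated so the snippet stands alone.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

Inductive tree (A : Type) : Type :=
| leaf : tree A
| node : A -&gt; tree A -&gt; tree A -&gt; tree A.
Arguments leaf {A}.
Arguments node {A} _ _ _.

(* Forget the elements and keep only the structure. *)
Fixpoint tree_shape {A : Type} (t : tree A) : tree unit :=
  match t with
  | leaf =&gt; leaf
  | node _ l r =&gt; node tt (tree_shape l) (tree_shape r)
  end.

(* Forget the structure and keep only the elements, in preorder. *)
Fixpoint tree_elements {A : Type} (t : tree A) : list A :=
  match t with
  | leaf =&gt; []
  | node a l r =&gt; a :: tree_elements l ++ tree_elements r
  end.

Definition sample : tree nat :=
  node 1 (node 2 leaf leaf) (node 3 leaf leaf).

Example sample_shape :
  tree_shape sample = node tt (node tt leaf leaf) (node tt leaf leaf).
Proof. reflexivity. Qed.

Example sample_elements : tree_elements sample = [1; 2; 3].
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>Together the two results carry everything the original tree did, which is what lets the encoding store them one after the other.</p>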

<p>The shape is encoded similarly to HTML with three symbols:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">[true; true]</code>: The beginning of a <code class="language-plaintext highlighter-rouge">node</code></li>
  <li><code class="language-plaintext highlighter-rouge">[true; false]</code>: The end of a <code class="language-plaintext highlighter-rouge">node</code></li>
  <li><code class="language-plaintext highlighter-rouge">[false]</code>: A <code class="language-plaintext highlighter-rouge">leaf</code></li>
</ul>

<p>Each <code class="language-plaintext highlighter-rouge">node</code> requires exactly two subtrees between its start and end marker. Storing the shape as <code class="language-plaintext highlighter-rouge">tree unit</code>
works because <code class="language-plaintext highlighter-rouge">unit</code> contains no information, so <code class="language-plaintext highlighter-rouge">tree unit</code> only contains the information that
the <code class="language-plaintext highlighter-rouge">tree</code> portion of <code class="language-plaintext highlighter-rouge">tree A</code> describes, which is the shape. The shape is recorded in a preorder
traversal and the elements are encoded in the same order, which makes it easy to marry the two together.</p>

<p>A visual representation of this encoding:</p>

<p><img src="/assets/posts/verified-serializers/tree_front.png" alt="" class="align-center" /></p>

<p>And in code:</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_serialize_shape</span><span class="w"> </span><span class="p">(</span><span class="no">t</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">false</span><span class="p">]</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">true</span><span class="p">]</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">tree_serialize_shape</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="o">++</span><span class="w">
                  </span><span class="no">tree_serialize_shape</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="p">[</span><span class="no">true</span><span class="p">;</span><span class="w"> </span><span class="no">false</span><span class="p">]</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_serialize_data_preorder</span><span class="w"> </span><span class="p">(</span><span class="no">t</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="c">(* No data contained within leaf nodes *)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">serialize</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="o">++</span><span class="w">
                  </span><span class="no">tree_serialize_data_preorder</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="o">++</span><span class="w">
                  </span><span class="no">tree_serialize_data_preorder</span><span class="w"> </span><span class="no">r</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">tree_serialize_front</span><span class="w"> </span><span class="p">(</span><span class="no">t</span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="p">)</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="no">tree_serialize_shape</span><span class="w"> </span><span class="no">t</span><span class="w"> </span><span class="o">++</span><span class="w"> </span><span class="no">tree_serialize_data_preorder</span><span class="w"> </span><span class="no">t</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_deserialize_shape</span><span class="w"> 
    </span><span class="p">(</span><span class="no">bools</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">progress</span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="p">(</span><span class="no">list</span><span class="w"> </span><span class="p">(</span><span class="no">tree</span><span class="w"> </span><span class="no">unit</span><span class="p">)))</span><span class="w">
    </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">tree</span><span class="w"> </span><span class="no">unit</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
    </span><span class="kr">match</span><span class="w"> </span><span class="no">progress</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">leaf</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">level</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">progress</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="no">tree_deserialize_shape</span><span class="w">
        </span><span class="no">bools</span><span class="w">
        </span><span class="p">((</span><span class="no">leaf</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">level</span><span class="p">)</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">progress</span><span class="p">)</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
    </span><span class="no">tree_deserialize_shape</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">([]</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">progress</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
    </span><span class="kr">match</span><span class="w"> </span><span class="no">progress</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="c">(* end without a beginning *)</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">level</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
      </span><span class="kr">match</span><span class="w"> </span><span class="no">level</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">r</span><span class="p">;</span><span class="w"> </span><span class="no">l</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">node</span><span class="w"> </span><span class="no">tt</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
      </span><span class="kr">end</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">level</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">parent</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">progress</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">level</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">[</span><span class="no">r</span><span class="p">;</span><span class="w"> </span><span class="no">l</span><span class="p">]</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
        </span><span class="no">tree_deserialize_shape</span><span class="w">
          </span><span class="no">bools</span><span class="w">
          </span><span class="p">((</span><span class="no">node</span><span class="w"> </span><span class="no">tt</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">parent</span><span class="p">)</span><span class="w"> </span><span class="p">::</span><span class="w"> </span><span class="no">progress</span><span class="p">)</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
      </span><span class="kr">end</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Fixpoint</span><span class="w"> </span><span class="no">tree_deserialize_front_elts</span><span class="w">
    </span><span class="p">(</span><span class="no">shape</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">tree</span><span class="w"> </span><span class="no">unit</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w">
    </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">shape</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">leaf</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">leaf</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">node</span><span class="w"> </span><span class="p">_</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
    </span><span class="kr">match</span><span class="w"> </span><span class="no">deserialize</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
    </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">a</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
      </span><span class="kr">match</span><span class="w"> </span><span class="no">tree_deserialize_front_elts</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
      </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">l</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> 
        </span><span class="kr">match</span><span class="w"> </span><span class="no">tree_deserialize_front_elts</span><span class="w"> </span><span class="no">r</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="kp">with</span><span class="w">
        </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
        </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">r</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">node</span><span class="w"> </span><span class="no">a</span><span class="w"> </span><span class="no">l</span><span class="w"> </span><span class="no">r</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w">
        </span><span class="kr">end</span><span class="w">
      </span><span class="kr">end</span><span class="w">
    </span><span class="kr">end</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">

</span><span class="k">Definition</span><span class="w"> </span><span class="no">tree_deserialize_front</span><span class="w"> </span><span class="p">(</span><span class="no">bools</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w">
    </span><span class="p">:</span><span class="w"> </span><span class="no">option</span><span class="w"> </span><span class="p">(</span><span class="no">tree</span><span class="w"> </span><span class="no">A</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="no">list</span><span class="w"> </span><span class="no">bool</span><span class="p">)</span><span class="w"> </span><span class="p">:=</span><span class="w">
  </span><span class="kr">match</span><span class="w"> </span><span class="no">tree_deserialize_shape</span><span class="w"> </span><span class="no">bools</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="kp">with</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">None</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="no">None</span><span class="w">
  </span><span class="p">|</span><span class="w"> </span><span class="no">Some</span><span class="w"> </span><span class="p">(</span><span class="no">shape</span><span class="p">,</span><span class="w"> </span><span class="no">bools</span><span class="p">)</span><span class="w"> </span><span class="p">=&gt;</span><span class="w">
    </span><span class="no">tree_deserialize_front_elts</span><span class="w"> </span><span class="no">shape</span><span class="w"> </span><span class="no">bools</span><span class="w">
  </span><span class="kr">end</span><span class="pi">.</span><span class="w">
</span></code></pre></div></div>

<p>Because this encoding is more recursive, reasoning about it is significantly easier. We can consider
any portion of the shape in isolation from all the others because nothing ties it to any global state.</p>

<p>Again, the proof for this serializer is large (about 70 lines) and uninteresting, so it has been omitted. It can be found <a href="https://github.com/pensono/Ethan-Cheerios/blob/f214e92e02bbeebd93c06c44efd73782325a44fa/blog_comparison.v#L1002-L1071">here</a>.</p>

<h2 id="conclusion">Conclusion</h2>

<p>It’s worth noting that possible encodings for a given type are restricted by information dependencies
within that type. Imagine a list is encoded as follows:</p>

<p><img src="/assets/posts/verified-serializers/list_size_end.png" alt="" class="align-center" /></p>

<p>Since the size of the list is at the end, rather than at the beginning, information about how to deserialize
the structure isn’t known until it’s too late. Similarly, the size can’t be put anywhere in the middle (say
after the first element), because of the possibility of an empty list. Before deserializing each element,
it must be known that it actually is an element of the list, and not some other data coming after the list.</p>

<p>This is why the interleaved list serializer works. Right before each element, a continue bit marks
that the list continues, so the deserializer knows the bits that follow belong to the list.</p>
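
<p>As a rough illustration of the continue bit, here is a minimal sketch specialized to lists of booleans, where every element takes exactly one bit. The names <code>list_serialize_interleaved</code> and <code>list_deserialize_interleaved</code> are made up for this example and are not the library’s definitions; restricting elements to single bits is an assumption that keeps the deserializer structurally recursive.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Hypothetical sketch: each element is a single boolean. *)
Fixpoint list_serialize_interleaved (l : list bool) : list bool :=
  match l with
  | [] =&gt; [false]  (* stop bit: the list ends here *)
  | x :: l' =&gt;
    (* continue bit, then the element, then the rest of the list *)
    true :: x :: list_serialize_interleaved l'
  end.

Fixpoint list_deserialize_interleaved (bools : list bool)
    : option (list bool * list bool) :=
  match bools with
  | false :: rest =&gt; Some ([], rest)
  | true :: x :: rest =&gt;
    match list_deserialize_interleaved rest with
    | None =&gt; None
    | Some (l, rest') =&gt; Some (x :: l, rest')
    end
  | _ =&gt; None  (* ran out of bits before seeing a stop bit *)
  end.

(* Data following the list is left untouched. *)
Example interleaved_round_trip :
  list_deserialize_interleaved
    (list_serialize_interleaved [true; false] ++ [true; true])
  = Some ([true; false], [true; true]).
Proof. reflexivity. Qed.

Lemma interleaved_spec : forall l rest,
  list_deserialize_interleaved (list_serialize_interleaved l ++ rest)
  = Some (l, rest).
Proof.
  induction l as [| x l' IH]; intros rest; simpl.
  - reflexivity.
  - rewrite IH. reflexivity.
Qed.
</code></pre></div></div>

<p>The trailing <code>rest</code> in the lemma is the important part: the deserializer stops exactly at the stop bit, so it never mistakes the data after the list for another element.</p>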

<p>This is also why the tree serializers are able to encode the shape at the front or the end. In both cases, the size
is known, so deserializing additional elements is justified. The question of how to arrange these elements can
be reasoned about independently of the elements themselves; therefore, the shape of the tree can be encoded without
regard to where the element data is located.</p>
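
<p>One way to see this separation is to split a tree into its shape and its elements and then put the two back together. The following is only a sketch: the <code>tree</code> type is re-declared here so the block stands alone (mirroring the <code>leaf</code> and <code>node</code> constructors used above), and <code>shape</code>, <code>elements</code>, and <code>fill</code> are hypothetical helpers, not part of the library.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Re-declared for this sketch only. *)
Inductive tree (A : Type) : Type :=
| leaf : tree A
| node : A -&gt; tree A -&gt; tree A -&gt; tree A.
Arguments leaf {A}.
Arguments node {A} _ _ _.

(* Forget the elements, keep the arrangement. *)
Fixpoint shape {A : Type} (t : tree A) : tree unit :=
  match t with
  | leaf =&gt; leaf
  | node _ l r =&gt; node tt (shape l) (shape r)
  end.

(* Forget the arrangement, keep the elements (preorder). *)
Fixpoint elements {A : Type} (t : tree A) : list A :=
  match t with
  | leaf =&gt; []
  | node a l r =&gt; a :: elements l ++ elements r
  end.

(* Pour elements back into a shape, returning any leftovers. *)
Fixpoint fill {A : Type} (s : tree unit) (elts : list A)
    : option (tree A * list A) :=
  match s with
  | leaf =&gt; Some (leaf, elts)
  | node _ l r =&gt;
    match elts with
    | [] =&gt; None  (* the shape asks for more elements than we have *)
    | a :: elts =&gt;
      match fill l elts with
      | None =&gt; None
      | Some (l', elts) =&gt;
        match fill r elts with
        | None =&gt; None
        | Some (r', elts) =&gt; Some (node a l' r', elts)
        end
      end
    end
  end.

Definition example_tree : tree nat := node 1 (node 2 leaf leaf) leaf.

Example fill_round_trip :
  fill (shape example_tree) (elements example_tree)
  = Some (example_tree, []).
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>Because <code>shape</code> never looks at an element and <code>elements</code> never looks at the arrangement, the two pieces can be written to the bit stream in whichever order is convenient, as long as the shape is available by the time the elements need to be placed.</p>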

<p>One might expect to be able to speculatively parse elements of the bit stream and stop
when an invalid element is reached. But this requires that we never accidentally interpret whatever comes after in the
bit stream as an element. If the encodings of different types were guaranteed not to overlap, this would be possible.
But in our model, serializers can choose arbitrary encodings, so it is not.</p>
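
<p>To make the overlap concrete, consider a deliberately broken list encoding that writes the elements back to back with no length and no continue bits. The names <code>serialize_bool</code> and <code>list_serialize_bare</code> are invented for this sketch. Two different values end up with the same bits, so no deserializer can tell them apart.</p>

<div class="language-coq highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Require Import List.
Import ListNotations.

(* Hypothetical one-bit encoding of a boolean. *)
Definition serialize_bool (b : bool) : list bool := [b].

(* Hypothetical, intentionally ambiguous list encoding:
   just the elements, with nothing marking where the list ends. *)
Definition list_serialize_bare (l : list bool) : list bool :=
  flat_map serialize_bool l.

(* The list [true] followed by the boolean false produces the same
   bits as the list [true; false] followed by nothing. *)
Example bare_encoding_is_ambiguous :
  list_serialize_bare [true] ++ serialize_bool false
  = list_serialize_bare [true; false] ++ [].
Proof. reflexivity. Qed.
</code></pre></div></div>

<p>A speculative parser reading these bits has no valid reason to stop after the first element, since the second bit is also a perfectly good boolean.</p>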

<p>Beyond practical necessity, serialization can be used as a forcing function to
understand the information contained within data structures. Requiring a well-defined
format forces the information contained in a structure to be deduced and formalized.
For example, a list needs to have a length, and a tree needs to have a shape. From there,
the encoding of this information is flexible, although some encodings are easier to work with
than others.</p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:efficient" role="doc-endnote">
      <p>A linked list of booleans is not computationally efficient, and could be replaced with another more sensible structure such as a stream of bytes. <a href="#fnref:efficient" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:tree_efficient" role="doc-endnote">
      <p>It’s worth noting that this representation could be made more efficient by recording locations relative to the previous node instead of absolute ones. However, this fact does not significantly change how hard it is to reason about the tree. Recording relative locations would allow us to reason about subtrees instead of parts of some tree, but we still must reason about insertions. <a href="#fnref:tree_efficient" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Ethan Shea</name></author><summary type="html"><![CDATA[This post describes Cheerios, a verified library for serialization in Coq.]]></summary></entry></feed>