{"id":298,"date":"2023-01-15T23:22:40","date_gmt":"2023-01-16T04:22:40","guid":{"rendered":"https:\/\/www.schveiguy.com\/blog\/?p=298"},"modified":"2023-02-04T21:28:53","modified_gmt":"2023-02-05T02:28:53","slug":"the-cost-of-compile-time-in-d","status":"publish","type":"post","link":"https:\/\/www.schveiguy.com\/blog\/2023\/01\/the-cost-of-compile-time-in-d\/","title":{"rendered":"The Cost of Compile Time in D"},"content":{"rendered":"\n<p>When I was creating my <a href=\"https:\/\/dconf.org\/2022\/online\/index.html#steves\">presentation for dconf online 2022<\/a>, I was looking at alternatives to building constraints. If you watched my talk, you can see the fruit of that experiment in my <a href=\"https:\/\/github.com\/schveiguy\/strawman\">strawman library<\/a> (which is very much a proof-of-concept, and not ready for real use).<\/p>\n\n\n\n<p>But it got me thinking &#8212; how much more expensive are these strawman constraints than the current Phobos range constraints? But even before I went that far, I started looking at some of the phobos constraints, and realized even there, we can achieve some savings.<\/p>\n\n\n\n<p>Consider the constraint for isInputRange:<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-d\">enum bool isInputRange(R) =\n    is(typeof(R.init) == R)\n    &amp;&amp; is(ReturnType!((R r) =&gt; r.empty) == bool)\n    &amp;&amp; (is(typeof((return ref R r) =&gt; r.front)) ||\n        is(typeof(ref (return ref R r) =&gt; r.front)))\n    &amp;&amp; !is(ReturnType!((R r) =&gt; r.front) == void)\n    &amp;&amp; is(typeof((R r) =&gt; r.popFront));<\/code><\/pre>\n\n\n\n<p>Let&#8217;s focus on one aspect of this, the use of the <a href=\"https:\/\/dlang.org\/phobos\/std_traits.html#ReturnType\"><code>ReturnType<\/code> template<\/a>. What does that do? Essentially, it takes the parameter (in this case a lambda function) and evaluates to the return type of the callable.<\/p>\n\n\n\n<p>But&#8230;. we have that as part of the language, don&#8217;t we? Yeah, it&#8217;s called <a href=\"https:\/\/dlang.org\/spec\/type.html#typeof\"><code>typeof<\/code><\/a>. <code>typeof<\/code> gives you the &#8220;type of&#8221; an expression. And it&#8217;s a direct link into the compiler&#8217;s semantic analysis &#8212; no additional semantic computation is needed.<\/p>\n\n\n\n<p>To see what we are comparing against, let&#8217;s take a look at the <code>ReturnType<\/code> template (and its dependencies):<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-d\">template ReturnType(alias func)\nif (isCallable!func)\n{\n    static if (is(FunctionTypeOf!func R == return))\n        alias ReturnType = R;\n    else\n        static assert(0, &quot;argument has no return type&quot;);\n}\n\ntemplate FunctionTypeOf(alias func)\nif (isCallable!func)\n{\n    static if ( (is(typeof(&amp; func) Fsym : Fsym*) &amp;&amp; is(Fsym == function)) || is(typeof(&amp; func) Fsym == delegate))\n    {\n        alias FunctionTypeOf = Fsym; \/\/ HIT: (nested) function symbol\n    }\n    else static if (is(typeof(&amp; func.opCall) Fobj == delegate) || is(typeof(&amp; func.opCall!()) Fobj == delegate))\n    {\n        alias FunctionTypeOf = Fobj; \/\/ HIT: callable object\n    }\n    else static if (\n            (is(typeof(&amp; func.opCall) Ftyp : Ftyp*) &amp;&amp; is(Ftyp == function)) ||\n            (is(typeof(&amp; func.opCall!()) Ftyp : Ftyp*) &amp;&amp; is(Ftyp == function))\n        )\n    {\n        alias FunctionTypeOf = Ftyp; \/\/ HIT: callable type\n    }\n    else static if (is(func T) || is(typeof(func) T))\n    {\n        static if (is(T == function))\n            alias FunctionTypeOf = T;    \/\/ HIT: function\n        else static if (is(T Fptr : Fptr*) &amp;&amp; is(Fptr == function))\n            alias FunctionTypeOf = Fptr; \/\/ HIT: function pointer\n        else static if (is(T Fdlg == delegate))\n            alias FunctionTypeOf = Fdlg; \/\/ HIT: delegate\n        else\n            static assert(0);\n    }\n    else\n        static assert(0);\n}\n\ntemplate isCallable(alias callable)\n{\n    \/\/ 20 lines of code\n}\n\ntemplate isSomeFunction(alias T)\n{\n    \/\/ 15 lines of code\n}<\/code><\/pre>\n\n\n\n<p>Whoa, that&#8217;s a lot of code to tell me what the type of something is! Why is it so complex? The reason is because in order to determine the return type of something, we have to use the <code>typeof<\/code> primitive, but this needs a valid expression. For a callable, that means we need a valid set of parameters. All of that needs to be introspected by the library, which is simply given a symbol and doesn&#8217;t know anything about that symbol without context.<\/p>\n\n\n\n<p>However we have context! We know exactly how to call the lambda function we have constructed, with an <code>R<\/code>! Why do we need this complexity for something that should be a simple call? As most well-versed in writing generic library code know, this is not an easy thing to do (sometimes generic types can&#8217;t be easily constructed, or you might have issues with disabled copying, etc.). In addition, <code>ReturnType<\/code> is built to handle all sorts of callable things, not just lambda functions.<\/p>\n\n\n\n<p>But <code>isInputRange<\/code> doesn&#8217;t actually need to construct, or even have a valid <code>R<\/code> for generating the expression, all it needs is an already existing <code>R<\/code> to call methods on it. We can do this using a reinterpret cast of <code>null<\/code> to an <code>R*<\/code> and now we have an &#8220;already made&#8221; <code>R<\/code>. Yes, this would crash if actually run, but we don&#8217;t ever need to run it, we just need to get its type! And so, here is an equivalent <code>isInputRange<\/code> template that does not use <code>ReturnType<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-d\">enum isInputRange(R) =\n    is(typeof(R.init) == R)\n    &amp;&amp; is(typeof(() { return (*cast(R*)null).empty; }()) == bool)\n    &amp;&amp; (is(typeof((return ref R r) =&gt; r.front)) ||\n        is(typeof(ref (return ref R r) =&gt; r.front)))\n    &amp;&amp; !is(typeof(() { return (*cast(R*)null).front; }()) == void)\n    &amp;&amp; is(typeof((R r) =&gt; r.popFront));\n<\/code><\/pre>\n\n\n\n<p>The difference here is we have a no-argument lambda, and so we don&#8217;t have to rely on library tricks or introspection to know how to call it (and as you can see, we call it with no parameters as expected).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Measuring the results<\/h2>\n\n\n\n<p>Given an <code>isInputRange<\/code> template that is completely independent of <code>std.traits<\/code>, what is the result? How much does it save?<\/p>\n\n\n\n<p>To test this, I wrote a program generator that created 10000 identical but independently named input ranges, that are tested like this:<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-d\">struct S0 { int front; void popFront() {}; bool empty = false; }\nstatic assert(isInputRange!S0);\nstruct S1 { int front; void popFront() {}; bool empty = false; }\nstatic assert(isInputRange!S1);\n...\nstruct S9999 { int front; void popFront() {}; bool empty = false; }\nstatic assert(isInputRange!S9999);<\/code><\/pre>\n\n\n\n<p>Running on my Linux system, using DMD 2.101.2, I get the following results:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>COMMAND<\/strong><\/td><td><strong>TIME<\/strong><\/td><td><strong>MEMORY USAGE<\/strong><\/td><\/tr><tr><td>dmd -version=usePhobos<\/td><td>2.75s<\/td><td>1.755G<\/td><\/tr><tr><td>dmd -version=useTypeof<\/td><td>1.47s<\/td><td>621M<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Looking at the savings, it&#8217;s quite significant &#8212; almost 50% time savings, and over 65% memory savings. Note that each call to <code>ReturnType<\/code> is unique, and so it will execute its own semantic analysis. Using the compiler&#8217;s <a href=\"https:\/\/dlang.org\/dmd-osx.html#switch-vtemplates\">-vtemplates<\/a> switch, we can see that using the current Phobos adds quite a few dependent templates. For each usage of <code>isInputRange<\/code>, we see:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>2 distinct instantiations of <code>ReturnType<\/code><\/li>\n\n\n\n<li>4 instantiations of <code>isCallable<\/code> (2 distinct)<\/li>\n\n\n\n<li>2 distinct instantiations of <code>FunctionTypeOf<\/code><\/li>\n\n\n\n<li>2 distinct instantiations of <code>isSomeFunction<\/code><\/li>\n<\/ul>\n\n\n\n<p>All that adds up to an additional 8 distinct template instantiations, and 10 total instantiations. A distinct template instantiation will run semantic analysis, but a non-distinct one will just find the existing template in the symbol table and return it.<\/p>\n\n\n\n<p>Using the measurement numbers we can somewhat extrapolate that each <code>ReturnType<\/code> instantiation adds 64 microseconds, and consumes 56.7K of RAM. The RAM consumption comes from storing the additional template instantiation symbols in the symbol table.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Such small savings, why is it important? It&#8217;s important because this is a perfect example of &#8220;death by 1000 paper cuts&#8221;. Each little template instantiation gives us a bit of convenience, but adds a tiny cost. These costs can add up significantly, and produce an overall compiler experience that is frustratingly slow, or worse, runs out of memory (yes, I have had this happen)! For something such as <code>isInputRange<\/code>, which almost nobody ever looks at or needs to, the cost is not well spent &#8212; especially considering how short and readable the alternative is!<\/p>\n\n\n\n<p>When you reach for something in <code>std.traits<\/code>, consider what the compile-time cost might be, and don&#8217;t always assume that a small call will be efficient. Are you writing something people have to understand easily? If not, make the messy details as complex as needed to avoid such costs. If you can write the same thing using builtins, it will run faster, and it might even work better. I like to prefer compiler builtins such as <code>typeof<\/code>, <a href=\"https:\/\/dlang.org\/spec\/expression.html#is_expression\"><code>is<\/code> expressions<\/a> and <a href=\"https:\/\/dlang.org\/spec\/traits.html\"><code>__traits<\/code><\/a> to <code>std.traits<\/code> whenever possible, as long as the cognitive load of the resulting code isn&#8217;t too great (and yes, it can be).<\/p>\n\n\n\n<p>I do plan to submit a PR to streamline everything I can about the range traits, maybe we can all pitch in and see where some of this interdependent fat can be trimmed all throughout Phobos!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When I was creating my presentation for dconf online 2022, I was looking at alternatives to building constraints. If you watched my talk, you can see the fruit of that experiment in my strawman library (which is very much a proof-of-concept, and not ready for real use). But it got me thinking &#8212; how much [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,2],"tags":[11,13,12],"class_list":["post-298","post","type-post","status-publish","format-standard","hentry","category-dlang","category-prog","tag-dlang","tag-performance","tag-templates"],"_links":{"self":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts\/298"}],"collection":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/comments?post=298"}],"version-history":[{"count":10,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts\/298\/revisions"}],"predecessor-version":[{"id":315,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts\/298\/revisions\/315"}],"wp:attachment":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/media?parent=298"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/categories?post=298"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/tags?post=298"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}