{"id":121,"date":"2016-05-23T21:28:58","date_gmt":"2016-05-24T01:28:58","guid":{"rendered":"http:\/\/www.schveiguy.com\/blog\/?p=121"},"modified":"2017-03-12T14:02:29","modified_gmt":"2017-03-12T18:02:29","slug":"have-your-voldemort-types-and-keep-your-disk-space-too","status":"publish","type":"post","link":"https:\/\/www.schveiguy.com\/blog\/2016\/05\/have-your-voldemort-types-and-keep-your-disk-space-too\/","title":{"rendered":"Have your Voldemort types, and keep your disk space too!"},"content":{"rendered":"<p>A recent <a href=\"https:\/\/issues.dlang.org\/show_bug.cgi?id=15831\">issue<\/a> I discovered (and no doubt has been encountered before) is that using <a href=\"http:\/\/www.drdobbs.com\/cpp\/voldemort-types-in-d\/232901591\">Voldemort types<\/a> in D can result in insane symbol bloat. However, at DConf 2016, a presentation by <a href=\"http:\/\/dconf.org\/2016\/talks\/panteleev.html\">Vladimir Panteleev<\/a> gave me an idea to help solve the problem. This allows one to create a Voldemort type, but cuts out most of the template bloat that can <a href=\"https:\/\/forum.dlang.org\/post\/nhkc3i$92q$1@digitalmars.com\">impede<\/a> your <a href=\"https:\/\/forum.dlang.org\/post\/pzhrikhoetyfwvcmwjwx@forum.dlang.org\">project<\/a>.<\/p>\n<h3>Voldemort Wrappers<\/h3>\n<p>Voldemort wrappers are a way to create chain constructed types &#8212; types where you wrap one type in another type, but the construction of the wrapper is done via an <a href=\"http:\/\/dlang.org\/spec\/template.html#function-templates\">Implicit Function Template Instantiation<\/a> (IFTI) factory function. The type itself is defined <em>inside<\/em> the function, and so is not able to be named by an external entity (hence the term <a href=\"https:\/\/en.wikipedia.org\/wiki\/Lord_Voldemort\">Voldemort<\/a>). This is a very nice encapsulation, because the type doesn&#8217;t interfere with any other symbols, and all creation of the type itself is funneled through the approved factory function.<\/p>\n<p>An example of a Voldemort Wrapper is the <a href=\"http:\/\/dlang.org\/phobos\/std_range.html#.chain\"><code>chain<\/code><\/a> function from Phobos. <code>chain<\/code> takes 2 ranges with the same element type and makes a range that will traverse the first, and then the second, as if they were one range (for more info on ranges, I recommend reading Ali \u00c7ehreli&#8217;s <a href=\"http:\/\/ddili.org\/ders\/d.en\/ranges.html\">chapter<\/a> on the subject). The full <code>chain<\/code> function gives us lots of niceties, such as implementing all the common features between the two ranges. However, for demonstration purposes, we will write an <code>inputChain<\/code> function that only works on like-typed input ranges:<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=simplechain.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>Now, we can write a simple test that chains together ranges without any allocation!<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=testchain.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>And the result:<\/p>\n<pre>$ .\/testchain\r\nhello, world!<\/pre>\n<p>This is all pretty straightforward stuff, and isn&#8217;t groundbreaking. But what is hidden from you here is the alarming space-cost for Voldemort wrapper types.<\/p>\n<h3>Exponential Symbols<\/h3>\n<p>Let&#8217;s print out the name of the nameless type (yes, it does still have a name, even though you can&#8217;t access it). This is a bit tricky, because simply printing <code>typeof(ch).stringof<\/code> results in the name <code>Chain<\/code>. However, this isn&#8217;t what we want, what we want is the <em>fully qualified and instantiated<\/em> type name. The easiest way to get this is to create an exception with the type name in it:<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=simplechainexception.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>The result of running this with our previous main file is a stack trace that starts with:<\/p>\n<pre>$ .\/testchain\r\nobject.Exception@simplechain2.d(13)\r\n----------------\r\n4\u00a0\u00a0 testchain\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0x00000001063efa24 pure @safe dchar simplechain.inputChain!(immutable(char)[], immutable(char)[]).inputChain(immutable(char)[], immutable(char)[]).Chain.front() + 144\r\n...<\/pre>\n<p>Here is the <code>Chain<\/code> type in a &#8220;nicer&#8221; format (I have replaced <code>immutable(char)[]<\/code> with the more commonly known alias <code>string<\/code>):<\/p>\n<pre>simplechain.inputChain!(string, string).inputChain(string, string).Chain<\/pre>\n<p>Here, we can see that the type of <code>ch<\/code> isn&#8217;t just <code>Chain<\/code>, it contains the full signature of the function <code>Chain<\/code> comes from ((Note that yes, the mangled symbol name (the one actually stored in the object file) reflects all of these pieces. I&#8217;m using exceptions to print out the name because they are easier to read and understand, but the same problem exists with mangled names as well.)) . The reason you see <code>inputChain<\/code> twice, is because <code>inputChain<\/code> is a template function. There are two symbols, one for the template (denoted by the instantiation symbol &#8216;<strong><code>!<\/code><\/strong>&#8216;), and one for the function itself, which we will cover later. While this in itself isn&#8217;t extremely troubling (and actually makes a lot of sense), the trouble becomes apparent when you try to chain 3 strings together (using UFCS):<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=testchain2.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>Compiling and getting the exception, the type of <code>ch<\/code> is now:<\/p>\n<pre>simplechain.inputChain!(\r\n    simplechain.inputChain!(string, string).inputChain(string, string).Chain,\r\n    string)\r\n .inputChain(\r\n    simplechain.inputChain!(string, string).inputChain(string, string).Chain,\r\n    string)\r\n.Chain<\/pre>\n<p>I&#8217;ve tried to use indents to show you the pieces of this. First, we have the template. The template takes two parameters (two different ranges in fact). The first template parameter is the resulting type of the first <code>inputChain<\/code> call (you should recognize this from before). Note that this contains not only the template instantation, but the full signature of the function call as well. The second parameter is simply another string. And we get the repeated information for the function parameters.<\/p>\n<p>If you continue this pattern, perhaps with more <code>inputChain<\/code> calls tacked onto the end of the call (as one would do with range pipelines in Phobos), then you can see how this will get progressively worse. The first argument to each call is going to be a recursive expansion of each previous call. I believe the growth of the symbol name is on the order of <code>&lt;strong&gt;O(2&lt;sup&gt;n&lt;\/sup&gt;)&lt;\/strong&gt;<\/code>, meaning we have exponential growth. However, for name mangling, the expansion is <code>&lt;strong&gt;O(3&lt;sup&gt;n&lt;\/sup&gt;)&lt;\/strong&gt;<\/code>, because unshown here is the return type of each level of function.<\/p>\n<h3>Abandoning the Dark Lord<\/h3>\n<p>So with such growth, a small range pipeline of Voldemort wrappers can add up to megabyte-long symbol names. But notice that the type itself is dependent only on the <em>template<\/em> parameters, not the <em>function<\/em> parameters ((In D, there is such a thing as a <a href=\"http:\/\/dlang.org\/spec\/struct.html#nested\">nested struct<\/a>. Such a struct can utilize the stack frame of the function itself, giving access to variables and other definitions inside the function)).<\/p>\n<p>We can solve the problem by moving the struct outside the function itself, to be included in the module namespace. Make this a private struct, and repeat all the template paraphernalia, and we have a \u201csolution\u201d:<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=simplechain3.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>And the resulting type:<\/p>\n<pre>simplechain.Chain!(simplechain.Chain!(string, string).Chain, string).Chain<\/pre>\n<p>Not too bad as a name, and this solves the exponential growth. But we have lost all the niceties that make Voldemort types so attractive \u2014 avoiding namespace pollution, avoiding repeating template specification, and encapsulation. This solution leaves a lot to be desired.<\/p>\n<h3>Using eponymous templates<\/h3>\n<p>So let\u2019s look at a better way, that allows us to keep the benefits of Voldemort types, but without the baggage. In D, all templated functions, enums, types, etc. are actually a short form of a special type of template called an <a href=\"http:\/\/dlang.org\/spec\/template.html#implicit_template_properties\">eponymous template<\/a>. When you compile <code>inputChain<\/code>, the compiler really treats it as something that looks like this:<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=eponymous.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>An eponymous template function still works with IFTI, so it\u2019s equivalent to the original. However, now we have access to a namespace that we didn\u2019t have before \u2014 the space inside the template, but outside the function itself. As shown by Vladimir Panteleev\u2019s <a href=\"http:\/\/thecybershadow.net\/d\/dconf2016\/#\/37\">DConf 2016 talk<\/a>, access to this space is forbidden by the compiler to outside functions and types because it always resolves to the eponymously named member.<\/p>\n<p>So let\u2019s put our struct there:<\/p>\n<div class=\"oembed-gist\"><script src=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080.js?file=simplechain4.d\"><\/script><noscript>View the code on <a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Gist<\/a>.<\/noscript><\/div>\n<p>And the resulting type:<\/p>\n<pre>simplechain.inputChain!(simplechain.inputChain!(string, string).Chain, string).Chain<\/pre>\n<p>Note that the Chain type is safely buried inside the template namespace, without providing access to any outside callers. If you used the above type name, you would get a compiler error.<\/p>\n<p>I call this the Horcrux ((If you don&#8217;t get this, then you need to read more Harry Potter)) method. If we compare this to Voldemort, it&#8217;s pretty much on par with all the features, except Horcrux wrappers do not support access to the function call stack or any definitions inside the function (unless you move them into Horcrux space as well), and the declaration is a little clunky. However, you may have some advantages. For example, if you had overloaded functions that return the same type, they could both be in the same template, and share the type externally, making them even less repetitive than the equivalent Voldemorts. You could also put unit tests inside that would now have access to the structs directly.<\/p>\n<p>There is some <a href=\"http:\/\/forum.dlang.org\/post\/nhldut$1ooe$1@digitalmars.com\">effort<\/a> to fix the compiler to avoid creating such huge symbols, but until this happens, I will be splitting my functions Horcrux style.<\/p>\n<hr \/>\n<p><em><a href=\"https:\/\/gist.github.com\/schveiguy\/21b7a54715245b8c56b8ef70bbca4080\">Here<\/a> is the Github Gist with all the code included in the article.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A recent issue I discovered (and no doubt has been encountered before) is that using Voldemort types in D can result in insane symbol bloat. However, at DConf 2016, a presentation by Vladimir Panteleev gave me an idea to help solve the problem. This allows one to create a Voldemort type, but cuts out most [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,2],"tags":[],"class_list":["post-121","post","type-post","status-publish","format-standard","hentry","category-dlang","category-prog"],"_links":{"self":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts\/121"}],"collection":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/comments?post=121"}],"version-history":[{"count":10,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts\/121\/revisions"}],"predecessor-version":[{"id":141,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/posts\/121\/revisions\/141"}],"wp:attachment":[{"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/media?parent=121"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/categories?post=121"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.schveiguy.com\/blog\/wp-json\/wp\/v2\/tags?post=121"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}