Category Archives: Programming

Programming languages and software development

How to Keep Using D1 Operator Overloads

D1 style operator overloads have been deprecated in D2 since version 2.088, released in 2019. Version 2.100, released last month, saw those operator overloads removed completely from the language. However, using D’s fabulous metaprogramming capability, it is possible to write a mixin template shim that will allow your D1 style operator overloads to keep working.

For sure, the best path forward is to switch to the new style of operator overloads. But there can be good reasons to keep using the old ones. Maybe you really love the simplicity of them. Maybe you use them already for virtual functions in classes, and don’t want to change. Maybe you just don’t want to do much code editing to an old project.

Whatever the reason, this post will show you how to do it easily and succinctly!

D1 Operator Overloads vs. D2 Operator Overloads

An operator overload is a way for a custom type to handle operators (e.g. + and -). In D1 these were handled using plain named functions, such as opAdd for addition or opMul for multiplication. For an example to work with, here is a struct type that uses an integer to represent its internal state:

struct S {
   int x;
   S opAdd(S other) {
      return S(x + other.x);
   }
   S opSub(S other) {
      return S(x - other.x);
   }
   S opMul(S other) {
      return S(x * other.x);
   }
   S opDiv(S other) {
      assert(other.x != 0, "divide by zero!");
      return S(x / other.x);
   }
}

void main() {
   S s1 = S(6);
   S s2 = S(3);
   assert(s1 + s2 == S( 9));
   assert(s1 - s2 == S( 3));
   assert(s1 * s2 == S(18));
   assert(s1 / s2 == S( 2));
}

Note how repetitive the operator code is! Plus, we only handled 4 operations. There are actually 11 math and bitwise binary (2-arg) operations that could be potentially overloaded for an integer. This doesn’t count unary operations (e.g. S s3 = -s1) or operations where S is on the right side of the op, with maybe an int on the left side (e.g. opAdd_r, opMul_r). If we needed to overload based on operand type, we could branch out into template functions, but that might not be that much less code.

D2 decided that a better way to handle bulk operations would be to use templates in order to handle operators. Instead of calling opAdd for + and opMul for *, it will call opBinary!"+" and opBinary!"*" respectively. This means we can handle all the operations in one function. To process them all, we can rewrite S like this:

struct S {
   int x;
   S opBinary(string op)(S other) {
      static if(op == "/" || op == "%")
         assert(other.x != 0, "divide by zero!");
      return mixin("S(x ", op, " other.x)");
   } 
}

void main() {
   S s1 = S(6);
   S s2 = S(3);
   assert( s1 + s2  == S( 9));
   assert( s1 - s2  == S( 3));
   assert( s1 * s2  == S(18));
   assert( s1 / s2  == S( 2));
   assert( s1 % s2  == S( 0));
   assert((s1 | s2) == S( 7));
   // and so on
}

Note how we not only have only one function (with a slight difference for the division operators), but we handle all math operations! The code is easier to write, less error prone, and less verbose.
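The same template approach also covers the unary and right-hand-side cases mentioned earlier. Here is a minimal sketch that is not part of the original example: the constraint on opUnary and the choice of int as the left-hand operand are assumptions made purely for illustration.

```d
struct S {
   int x;

   // Handles -s, +s and ~s in one function. The constraint rejects other
   // unary operators (an illustrative choice, not a requirement).
   S opUnary(string op)() if (op == "-" || op == "+" || op == "~") {
      return mixin("S(", op, "x)");
   }

   // Handles an int on the left side, e.g. 2 + s or 20 - s.
   S opBinaryRight(string op)(int lhs) {
      return mixin("S(lhs ", op, " x)");
   }
}

void main() {
   S s = S(6);
   assert(-s == S(-6));
   assert(~s == S(~6));
   assert(2 + s == S(8));
   assert(20 - s == S(14));
}
```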

Aliasing Operators

But what if you already have operators in D1 style, and you don’t want to change them, or merge them into one super-function?

D allows you to alias member functions to another symbol, and opBinary is no exception. Here is the original type, but with aliases for each of the operators:

struct S {
   int x;
   S opAdd(S other) {
      return S(x + other.x);
   }
   S opSub(S other) {
      return S(x - other.x);
   }
   S opMul(S other) {
      return S(x * other.x);
   }
   S opDiv(S other) {
      assert(other.x != 0, "divide by zero!");
      return S(x / other.x);
   }

   alias opBinary(string op : "+") = opAdd;
   alias opBinary(string op : "-") = opSub;
   alias opBinary(string op : "*") = opMul;
   alias opBinary(string op : "/") = opDiv;
}

Note that we are using a few cool features of D metaprogramming here. The aliases are eponymous templates which means I don’t have to write out the template long form, and we are using template parameter specialization to avoid having to use a single template and look for the covered operations inside the template, or having to use template constraints to filter out the operations we cover.
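To make the shorthand concrete, here is a sketch of what one of those aliases looks like when the template is written out in long form. Only the + operator is covered, so the example also checks that - is not available.

```d
struct S {
   int x;
   S opAdd(S other) {
      return S(x + other.x);
   }

   // Long form of: alias opBinary(string op : "+") = opAdd;
   // The specialization means this template only matches op == "+".
   template opBinary(string op : "+") {
      alias opBinary = opAdd;
   }
}

void main() {
   assert(S(6) + S(3) == S(9));
   // No specialization matches "-", so subtraction does not compile.
   static assert(!__traits(compiles, S(6) - S(3)));
}
```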

But we can do even better than this! Nobody wants to write this boilerplate code tailored to each type which may not all cover the same exact operators.

Mixin Templates

A mixin template is a template with a set of declarations in it. Wherever you mixin that template, it’s (almost) as if you typed all those declarations directly. Using the power of D’s compile-time introspection, it’s possible to handle every single possible operator overload that D1 could offer, by writing aliases to the D1 style operator overload, automatically.

In order to do this, we are going to have three rules. First is that we don’t care if the operators are properly written in D1 style. As long as the names match, we will forward to them. We also don’t need to worry about overloads based on the types or parameters accepted, as aliases are just name rewrites. Second, this mixin MUST be added at the end of the type, because otherwise, the entire type’s members may not have been analyzed by the compiler (this may change in a future version of D). Third, D does not allow overloads between the mixed-in functions and regular functions — the regular functions will take precedence. So you cannot define any D2 style operators of a specific name (e.g. opBinary). If you want D2 operators, convert the whole thing, don’t use some D1 and some D2.

Let’s write just the opAdd declaration in a mixin template, and see how it works.

mixin template D1Ops() {
   static if(__traits(hasMember, typeof(this), "opAdd"))
      alias opBinary(string op : "+") = opAdd;
}

There’s a lot of meta code in here, so I’ll explain it all.

The mixin template declaration is telling the compiler that this is a template specifically for mixins. Technically, you can use any template for mixins, but declaring it a mixin template requires that it’s only used in that way.

If you don’t know what static if is, I highly recommend reading a tutorial on D metaprogramming, as it’s essential for almost every metaprogramming task. Suffice it to say, the contained code is only included if the condition is true.

__traits(hasMember, T, "opAdd") is a specialized condition that is true only if the specified type T (in this case, the type of the struct the mixin is being added to) contains a member having the name opAdd.

And finally, the alias is as we wrote before.
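If static if and __traits(hasMember) are new to you, this small standalone sketch shows both in isolation. The types WithAdd and Plain and the function describe are hypothetical names invented for the example.

```d
struct WithAdd {
   int x;
   WithAdd opAdd(WithAdd other) { return WithAdd(x + other.x); }
}

struct Plain {
   int x;
}

// Both checks are evaluated entirely at compile time.
static assert( __traits(hasMember, WithAdd, "opAdd"));
static assert(!__traits(hasMember, Plain, "opAdd"));

// static if picks one branch at compile time; the other is not compiled in.
string describe(T)() {
   static if (__traits(hasMember, T, "opAdd"))
      return T.stringof ~ " has opAdd";
   else
      return T.stringof ~ " has no opAdd";
}

void main() {
   assert(describe!WithAdd() == "WithAdd has opAdd");
   assert(describe!Plain() == "Plain has no opAdd");
}
```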

Now, how would we use this inside our type?

struct S {
   int x;
   S opAdd(S other) {
      return S(x + other.x);
   }
   S opSub(S other) {
      return S(x - other.x);
   }
   S opMul(S other) {
      return S(x * other.x);
   }
   S opDiv(S other) {
      assert(other.x != 0, "divide by zero!");
      return S(x / other.x);
   }

   mixin D1Ops;
}

That’s it! Now opAdd is hooked via the aliased opBinary instead of via the D1 operator overload. Therefore, S + S will compile on 2.100 and later. However, the other operator overloads will not.

Why do it this way? As we will see, using the static if allows us to mixin the template regardless of whether opAdd is present or not. Using this feature, we can handle every possible situation with regards to existing operator overloads.

Using the Full Power of D

Adding each and every operator overload to the mixin is going to be very repetitive. But there is no need to do this, D is a superpower in metaprogramming! All we need to do is lay out the operation mappings, and we can use another specialized metaprogramming feature, static foreach, to avoid having to repeat the same boilerplate over and over.

With this, we can handle every binary operation that the struct might have written D1 style:

mixin template D1Ops() {
   static foreach(op, d1;
     ["+" : "opAdd", "-" : "opSub", "*" : "opMul", "/" : "opDiv",
      "%" : "opMod"]) {
      static if(__traits(hasMember, typeof(this), d1))
         alias opBinary(string s : op) = mixin(d1);
   }
}

Let’s look at the new things we have added to the mixin template. The first thing is an associative array of string to string, indicating which ops should map to which D1 function names. static foreach is a feature which will, at compile time, loop over all the elements in a thing that normally you would iterate at runtime (in this case, the associative array). It’s as if you wrote all those things out directly one at a time, with the symbols op and d1 mapped to the keys and values of the associative array containing the operation mappings.

See how our static if has changed a bit, instead of using a string literal, we use the d1 symbol, which in the first loop is "opAdd", in the second loop is "opSub" and so on.

In addition, there is one minor change in the alias. Because we must alias the opBinary call to a symbol, and not a string, we must fetch the symbol based on its string name. mixin(d1) does this. This is a relatively new feature, in older compilers we could still achieve this with a single mixin statement for the whole alias statement, but just calling mixin on d1 is a lot cleaner looking.
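For compilers that do not accept mixin on the right side of an alias, the whole alias statement can be generated as a string instead. This is a rough sketch of that approach; the name D1OpsStringMixin is hypothetical, and only two mappings are listed to keep it short.

```d
mixin template D1OpsStringMixin() {
   static foreach(op, d1; ["+" : "opAdd", "-" : "opSub"]) {
      static if(__traits(hasMember, typeof(this), d1))
         // Builds, for example: alias opBinary(string s : "+") = opAdd;
         mixin(`alias opBinary(string s : "` ~ op ~ `") = ` ~ d1 ~ `;`);
   }
}

struct S {
   int x;
   S opAdd(S other) { return S(x + other.x); }
   S opSub(S other) { return S(x - other.x); }

   mixin D1OpsStringMixin;
}

void main() {
   assert(S(6) + S(3) == S(9));
   assert(S(6) - S(3) == S(3));
}
```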

With that, our final code looks like this:

mixin template D1Ops() {
   static foreach(op, d1;
     ["+" : "opAdd", "-" : "opSub", "*" : "opMul", "/" : "opDiv",
      "%" : "opMod"]) {
      static if(__traits(hasMember, typeof(this), d1))
         alias opBinary(string s : op) = mixin(d1);
   }
}

struct S {
   int x;
   S opAdd(S other) {
      return S(x + other.x);
   }
   S opSub(S other) {
      return S(x - other.x);
   }
   S opMul(S other) {
      return S(x * other.x);
   }
   S opDiv(S other) {
      assert(other.x != 0, "divide by zero!");
      return S(x / other.x);
   }

   mixin D1Ops;
}

void main() {
   S s1 = S(6);
   S s2 = S(3);
   assert( s1 + s2  == S( 9));
   assert( s1 - s2  == S( 3));
   assert( s1 * s2  == S(18));
   assert( s1 / s2  == S( 2));
}

You’ll notice that I intentionally included opMod in the mixin, even though our type does not have it. This demonstrates the power of the static if to only provide aliases if the appropriate D1 operator overload exists.

Filling it out

All that is left for opBinary is to fill out the mappings to handle any possible existing D1 binary operations. As long as you have a D1-style operator, the mixin will generate an alias to cover it.

And finally, any other D1 style operations as listed in the changelog, such as opUnary or opBinaryRight can also be covered by adding another loop. You could even nest the mappings if you wanted to, or include the name of the template to alias as part of the mapping. Or you might notice that all the opBinaryRight operators are the same as the opBinary operators (except in), and just do both at the same time.
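As a sketch of what those extra loops might look like, the following covers three D1 unary operators and the right-hand versions of two binary operators. The template name D1MoreOps is hypothetical and the mappings are deliberately incomplete; treat it as an outline rather than a full shim.

```d
mixin template D1MoreOps() {
   // D1 unary operators: opNeg is -x, opPos is +x, opCom is ~x.
   static foreach(op, d1; ["-" : "opNeg", "+" : "opPos", "~" : "opCom"]) {
      static if(__traits(hasMember, typeof(this), d1))
         alias opUnary(string s : op) = mixin(d1);
   }
   // D1 right-hand operators reuse the binary names with an _r suffix.
   static foreach(op, d1; ["+" : "opAdd", "-" : "opSub"]) {
      static if(__traits(hasMember, typeof(this), d1 ~ "_r"))
         alias opBinaryRight(string s : op) = mixin(d1 ~ "_r");
   }
}

struct S {
   int x;
   S opNeg() { return S(-x); }
   S opSub_r(int lhs) { return S(lhs - x); }

   mixin D1MoreOps;
}

void main() {
   assert(-S(3) == S(-3));
   assert(10 - S(3) == S(7));
}
```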

You also might consider not using static foreach for this, and actually writing them all out by hand, simply because static foreach is slightly expensive, and so is constructing an associative array at compile-time. Remember, once this template is done, there will never need to be any updates to it. The advantage of using a loop is you have to write a lot less code, which makes it a lot less error prone.

And if you aren’t in the mood to do it yourself, here is a gist mapping the entire suite of D1 operator overloads.

Comparing Exceptions and Errors in D

What are Exceptions and Errors in D? Why is there a difference? Why does D consider Errors throwable inside a nothrow function? Sometimes decisions seem arbitrary, but when you finally understand the reasoning, you can better appreciate why things are the way they are.

Throwing a Throwable

I’m not going to go into all the details of how throwing works in D or in any other language (that is easy to find online). But simply put, an exception is an “exceptional” case, which shouldn’t occur in normal code. The reason throwing is preferred to other types of error handling (such as returning an error code, or a combination error/value) is because an exception requires handling. If you don’t handle it, someone else will. And the default is to print out most of the state that was happening when the exception happened, and exit the program.

As a side note, the most recent compiler has a new feature called @mustuse. When attached to a struct or union type, it requires that a value of that type returned from a function (which might contain an error) must be dealt with rather than silently discarded.

That being said, throwing is relatively expensive, meaning that you should only do it in truly exceptional cases, and not use it for mundane flow control.

In D, you throw an exception or error simply by using the throw statement, which requires an object instance that is a derivative of Throwable:

int div(int x, int y) {
    if(y == 0) throw new Exception("Divide by zero!");
    return x / y;
}

Then you can catch it somewhere else. When you catch it, the exception contains all the information on how it was generated, including the file/line that generated the exception and all the places along the call stack that got to that point. The beauty of exception handling is you can put the handler anywhere along the call stack — where it’s most needed.

Consider a web server, where you might want an exception handler at the part that handles the HTTP request, where you can return the appropriate HTTP error code, and maybe a nice page sent back to the user. Instead of having to propagate some error deep in your page handler code up the call stack so it can be properly handled, you just throw the exception where it happens, and catch it where you want to handle it. The language takes care of the rest!
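Here is a small sketch of that idea using the div function from above. The handleRequest function and its status strings are hypothetical stand-ins for a real request handler.

```d
import std.conv : to;
import std.stdio : writeln;

int div(int x, int y) {
    if(y == 0) throw new Exception("Divide by zero!");
    return x / y;
}

// Hypothetical request handler: the catch sits far from the throw site.
string handleRequest(int x, int y) {
    try {
        return "200 OK: " ~ div(x, y).to!string;
    } catch(Exception e) {
        // e.msg, e.file and e.line describe why and where it was thrown.
        return "500 Internal Server Error: " ~ e.msg;
    }
}

void main() {
    writeln(handleRequest(6, 3));
    writeln(handleRequest(6, 0));
}
```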

Stack Unwinding

One of the bookkeeping tasks that the language has to deal with is unwinding the stack. If for instance you have structs on the stack with destructors, those destructors have to be called, or else your program integrity is compromised. Imagine if a reference-counted smart pointer didn’t decrement its reference count when an exception was thrown, or if a mutex was left locked.

There’s also scope-guard statements which help properly design initialization/cleanup code without having to remember cleanup at the end of scopes, or in every spot where a return statement exists. Those must also be run when an exception is thrown.
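To see unwinding in action, this sketch records each cleanup step in a log while an Exception propagates. The names Guard, work, and log are made up for the example.

```d
string[] log; // records the order in which cleanup runs

struct Guard {
    ~this() { log ~= "destructor"; }
}

void work() {
    Guard g;
    scope(exit)    log ~= "scope(exit)";
    scope(failure) log ~= "scope(failure)";
    throw new Exception("boom");
}

void main() {
    try {
        work();
    } catch(Exception e) {
        log ~= "caught";
    }
    // Cleanup runs in reverse lexical order, and before the handler.
    assert(log == ["scope(failure)", "scope(exit)", "destructor", "caught"]);
}
```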

nothrow functions

A nothrow function is one that cannot let an Exception escape to its caller. That means you must handle all possible exceptions that might be thrown inside your function or inside any throwing function you call. A nothrow function’s purpose is to inform the compiler that it can omit cleanup code for exception throwing.

This allows the compiler to both output less code, and also gives the optimizer more possibilities to work with, making nothrow functions preferable to ones that throw.
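As a quick illustration, here is a sketch of a nothrow wrapper around a throwing function. The name divOrZero and the fallback value of 0 are arbitrary choices for the example.

```d
int div(int x, int y) {
    if(y == 0) throw new Exception("Divide by zero!");
    return x / y;
}

// Accepted as nothrow because every Exception from div is handled here.
// Without the try/catch, the compiler rejects the nothrow attribute.
int divOrZero(int x, int y) nothrow {
    try {
        return div(x, y);
    } catch(Exception) {
        return 0; // fallback value chosen for illustration
    }
}

void main() {
    assert(divOrZero(6, 3) == 2);
    assert(divOrZero(6, 0) == 0);
}
```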

Stack Unwinding for Errors

However, a nothrow function is still allowed to throw an Error. How does that work?

How it works is that the compiler still omits exception cleanup code, and the code that catches the Error is not allowed to continue the program. If it does, the program may obviously be in an invalid state. You can think of the throw and catch of an Error to be like a plain goto instruction.

The following code example and output demonstrates how cleanup code is skipped:

// For example:
void foo() nothrow {
   throw new Error("catch me!");
}

void bar() nothrow {
   import core.stdc.stdio;
   scope(exit) printf("cleaning up...\n");
   foo();
}

void main() {
   bar();
}

object.Error@(0): catch me!
----------------
./onlineapp.d:3 nothrow void onlineapp.foo() [0x55db91086345]
./onlineapp.d:9 nothrow void onlineapp.bar() [0x55db91086350]
./onlineapp.d:13 _Dmain [0x55db9108636c]

It can be tempting to catch an Error and use that as a control flow mechanism. For example, an array out of bounds access is a frequent error that you may want to just recover from. But the stack frames may not be properly cleaned up, which means things like mutex unlocks, or reference decrements didn’t happen along the way up the stack.

In short, your program is in an undetermined state. Continuing execution risks damaging the data used by the program, or crashing the user’s application.

How to handle Errors

Don’t. The only exception (pun intended) is when you are testing code. And actually the language guarantees proper stack unwinding for assert errors thrown inside unittests and contracts.

As a rule of thumb, an Error is for programming errors (that is, conditions you expect to be enforced by the programmer are incorrect), and an Exception is for environment/user errors.

If you do catch an Error, the only proper action is to perform some possible final action (such as logging the error) and exit the program. And make sure any final actions you perform can’t be thwarted by undetermined state.

Edit: More Pitfalls!

After much discussion on the D forum, one user (frame) noted that you can return from a scope(failure) statement.

I didn’t go over exactly what a scope guard statement was, but essentially there are 3 conditions that you can use to run cleanup code: exit, success, and failure. I used the scope(exit) code above to show an example of skipping cleanup code.

A scope(failure) statement executes when a function is exiting because a Throwable is thrown. However, an Error is a derivative of Throwable, so this includes Error! Normally, this isn’t a problem, because after the statement is done, the code normally just rethrows the Throwable. However, you are allowed (per the spec) to simply return normally, use goto to exit the statement, or throw an Exception. Any of these mechanisms will mask the fact that an Error was thrown, and that the program is now in a possibly invalid state.

I recommend at this point NOT to use these mechanisms, and I have advocated on an existing dlang issue that the language revoke this allowance.

So what if you want to return a code if an Exception is thrown? Well, the compiler actually rewrites a scope(failure) statement like:

// scope(failure) <code>; // is rewritten as
try {
   ... // all code after the scope(failure) statement
} catch(Throwable _caught) {
   <code>
   throw _caught;
}

Instead, you can expand the statement and change the Throwable to an Exception to make sure you aren’t inadvertently masking an Error from propagating:

try {
   ... // normal function code
} catch(Exception) {
   return 10;
}

Have your Voldemort types, and keep your disk space too!

A recent issue I discovered (and no doubt has been encountered before) is that using Voldemort types in D can result in insane symbol bloat. However, at DConf 2016, a presentation by Vladimir Panteleev gave me an idea to help solve the problem. This allows one to create a Voldemort type, but cuts out most of the template bloat that can impede your project.

Voldemort Wrappers

Voldemort wrappers are a way to create chain constructed types — types where you wrap one type in another type, but the construction of the wrapper is done via an Implicit Function Template Instantiation (IFTI) factory function. The type itself is defined inside the function, and so is not able to be named by an external entity (hence the term Voldemort). This is a very nice encapsulation, because the type doesn’t interfere with any other symbols, and all creation of the type itself is funneled through the approved factory function.

An example of a Voldemort Wrapper is the chain function from Phobos. chain takes 2 ranges with the same element type and makes a range that will traverse the first, and then the second, as if they were one range (for more info on ranges, I recommend reading Ali Çehreli’s chapter on the subject). The full chain function gives us lots of niceties, such as implementing all the common features between the two ranges. However, for demonstration purposes, we will write an inputChain function that only works on like-typed input ranges:

Now, we can write a simple test that chains together ranges without any allocation!
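The article's own listings are not reproduced here, so what follows is a minimal sketch of what such a function and test might look like. The template constraint and the use of a static struct are my assumptions, not necessarily what the original code did.

```d
import std.algorithm.comparison : equal;
import std.range.primitives; // isInputRange, ElementType, array range primitives
import std.stdio : write, writeln;

auto inputChain(R1, R2)(R1 r1, R2 r2)
    if(isInputRange!R1 && isInputRange!R2 &&
       is(ElementType!R1 == ElementType!R2))
{
    // The Voldemort type: declared inside the function, so callers cannot name it.
    static struct Chain {
        R1 r1;
        R2 r2;

        @property bool empty() { return r1.empty && r2.empty; }
        @property auto front() { return r1.empty ? r2.front : r1.front; }
        void popFront() {
            if(r1.empty) r2.popFront();
            else r1.popFront();
        }
    }
    return Chain(r1, r2);
}

void main() {
    auto ch = inputChain("hello, ", "world!");
    assert(equal(ch, "hello, world!"));
    foreach(c; ch) write(c); // walks both strings without allocating
    writeln();
}
```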

And the result:

$ ./testchain
hello, world!

This is all pretty straightforward stuff, and isn’t groundbreaking. But what is hidden from you here is the alarming space-cost for Voldemort wrapper types.

Exponential Symbols

Let’s print out the name of the nameless type (yes, it does still have a name, even though you can’t access it). This is a bit tricky, because simply printing typeof(ch).stringof results in the name Chain. However, this isn’t what we want; what we want is the fully qualified and instantiated type name. The easiest way to get this is to create an exception with the type name in it:

The result of running this with our previous main file is a stack trace that starts with:

$ ./testchain
object.Exception@simplechain2.d(13)
----------------
4   testchain                           0x00000001063efa24 pure @safe dchar simplechain.inputChain!(immutable(char)[], immutable(char)[]).inputChain(immutable(char)[], immutable(char)[]).Chain.front() + 144
...

Here is the Chain type in a “nicer” format (I have replaced immutable(char)[] with the more commonly known alias string):

simplechain.inputChain!(string, string).inputChain(string, string).Chain

Here, we can see that the type of ch isn’t just Chain, it contains the full signature of the function Chain comes from [1]. The reason you see inputChain twice is because inputChain is a template function. There are two symbols, one for the template (denoted by the instantiation symbol ‘!’), and one for the function itself, which we will cover later. While this in itself isn’t extremely troubling (and actually makes a lot of sense), the trouble becomes apparent when you try to chain 3 strings together (using UFCS):

Compiling and getting the exception, the type of ch is now:

simplechain.inputChain!(
    simplechain.inputChain!(string, string).inputChain(string, string).Chain,
    string)
 .inputChain(
    simplechain.inputChain!(string, string).inputChain(string, string).Chain,
    string)
.Chain

I’ve tried to use indents to show you the pieces of this. First, we have the template. The template takes two parameters (two different ranges in fact). The first template parameter is the resulting type of the first inputChain call (you should recognize this from before). Note that this contains not only the template instantiation, but the full signature of the function call as well. The second parameter is simply another string. And we get the repeated information for the function parameters.

If you continue this pattern, perhaps with more inputChain calls tacked onto the end of the call (as one would do with range pipelines in Phobos), then you can see how this will get progressively worse. The first argument to each call is going to be a recursive expansion of each previous call. I believe the growth of the symbol name is on the order of O(2^n), meaning we have exponential growth. However, for name mangling, the expansion is O(3^n), because unshown here is the return type of each level of function.

Abandoning the Dark Lord

So with such growth, a small range pipeline of Voldemort wrappers can add up to megabyte-long symbol names. But notice that the type itself is dependent only on the template parameters, not the function parameters [2].

We can solve the problem by moving the struct outside the function itself, to be included in the module namespace. Make this a private struct, and repeat all the template paraphernalia, and we have a “solution”:
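In outline, the module-level version might look like the sketch below. It assumes the same simplified range logic as a basic input-range chain, and the constraint is my own addition.

```d
import std.algorithm.comparison : equal;
import std.range.primitives;

// Module-level and private: other modules cannot use it, but it is now
// an ordinary named symbol and the template parameters are repeated.
private struct Chain(R1, R2) {
    R1 r1;
    R2 r2;

    @property bool empty() { return r1.empty && r2.empty; }
    @property auto front() { return r1.empty ? r2.front : r1.front; }
    void popFront() {
        if(r1.empty) r2.popFront();
        else r1.popFront();
    }
}

auto inputChain(R1, R2)(R1 r1, R2 r2)
    if(isInputRange!R1 && isInputRange!R2 &&
       is(ElementType!R1 == ElementType!R2))
{
    return Chain!(R1, R2)(r1, r2);
}

void main() {
    auto ch = "hello, ".inputChain("world").inputChain("!");
    assert(equal(ch, "hello, world!"));
}
```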

And the resulting type:

simplechain.Chain!(simplechain.Chain!(string, string).Chain, string).Chain

Not too bad as a name, and this solves the exponential growth. But we have lost all the niceties that make Voldemort types so attractive — avoiding namespace pollution, avoiding repeating template specification, and encapsulation. This solution leaves a lot to be desired.

Using eponymous templates

So let’s look at a better way, that allows us to keep the benefits of Voldemort types, but without the baggage. In D, all templated functions, enums, types, etc. are actually a short form of a special type of template called an eponymous template. When you compile inputChain, the compiler really treats it as something that looks like this:

An eponymous template function still works with IFTI, so it’s equivalent to the original. However, now we have access to a namespace that we didn’t have before — the space inside the template, but outside the function itself. As shown by Vladimir Panteleev’s DConf 2016 talk, access to this space is forbidden by the compiler to outside functions and types because it always resolves to the eponymously named member.

So let’s put our struct there:
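Here is a sketch of that arrangement: the template is written out in long form, with the struct placed beside the eponymous function rather than inside it. As before, the constraint and the simplified range logic are my assumptions.

```d
import std.algorithm.comparison : equal;
import std.range.primitives;

template inputChain(R1, R2)
    if(isInputRange!R1 && isInputRange!R2 &&
       is(ElementType!R1 == ElementType!R2))
{
    // Lives in the template's namespace: outside the function body,
    // but hidden behind the eponymous member.
    struct Chain {
        R1 r1;
        R2 r2;

        @property bool empty() { return r1.empty && r2.empty; }
        @property auto front() { return r1.empty ? r2.front : r1.front; }
        void popFront() {
            if(r1.empty) r2.popFront();
            else r1.popFront();
        }
    }

    // The eponymous member: IFTI still deduces R1 and R2 from the arguments.
    auto inputChain(R1 r1, R2 r2) {
        return Chain(r1, r2);
    }
}

void main() {
    auto ch = "hello, ".inputChain("world").inputChain("!");
    assert(equal(ch, "hello, world!"));
}
```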

And the resulting type:

simplechain.inputChain!(simplechain.inputChain!(string, string).Chain, string).Chain

Note that the Chain type is safely buried inside the template namespace, without providing access to any outside callers. If you used the above type name, you would get a compiler error.

I call this the Horcrux [3] method. If we compare this to Voldemort, it’s pretty much on par with all the features, except Horcrux wrappers do not support access to the function call stack or any definitions inside the function (unless you move them into Horcrux space as well), and the declaration is a little clunky. However, you may have some advantages. For example, if you had overloaded functions that return the same type, they could both be in the same template, and share the type externally, making them even less repetitive than the equivalent Voldemorts. You could also put unit tests inside that would now have access to the structs directly.

There is some effort to fix the compiler to avoid creating such huge symbols, but until this happens, I will be splitting my functions Horcrux style.


Here is the Github Gist with all the code included in the article.

  1. Note that yes, the mangled symbol name (the one actually stored in the object file) reflects all of these pieces. I’m using exceptions to print out the name because they are easier to read and understand, but the same problem exists with mangled names as well.
  2. In D, there is such a thing as a nested struct. Such a struct can utilize the stack frame of the function itself, giving access to variables and other definitions inside the function.
  3. If you don’t get this, then you need to read more Harry Potter.

Import Changes in D 2.071 [Updated]

Note: This post has been updated on 8/29/2016 with new information on mixin template imports.

In the upcoming version of D, several changes have been made to the import system, including fixes for 2 of the oldest bugs in D history.

There’s bound to be a lot of confusion on this, so I wrote this to try and explain the rules, and the reasoning behind some of the changes. I’ll also explain how you can mitigate any issues you have in your code base.

Bugs 313 and 314

Links: 313 and 314

Description

Private imports are not supposed to infiltrate the modules they are imported in. If you import a, and a imports b privately, then you should not have any access to b’s symbols. However, before this was fixed, you could access b symbols via the Fully Qualified Name. A FQN is where you list all packages, including subpackages, separated by dots, to access a symbol. For example std.stdio.writeln.

In addition, when importing a module using static, renamed, or selective imports, the imported symbols were incorrectly made public to importing modules.

An example:

With 2.070 and prior versions, compiling this works just fine. With 2.071 and above, you will get either a deprecation warning, or an error.

Note that the private qualifier is only for illustration. This is the default import protection for any imports.

For an example of how selective imports add public symbols:

With 2.070, this compiled just fine. However, printf is supposed to be a private symbol of module ex2_a. With 2.071 and above, this will trigger a deprecation warning. In the future, the code will trigger an error.

Selective imports and FQN

A combination of both 313 and 314 is when you use a selective import, and expect the Fully Qualified Name to also be imported. This is not what the selective import was supposed to do; it was only supposed to add the symbols requested.

An example:

In this example, std.stdio.writeln is not actually supposed to be imported, only write is supposed to be imported (and even the FQN std.stdio.write isn’t imported!). We have to import std.range, because otherwise this would not compile (ironically, the package std is not imported by the selective import unless there is another import of the FQN).

In 2.070, this produces no warning or error. In 2.071 and beyond, this will produce a deprecation warning, and eventually an error.

Fixing problematic code

In order to fix such code, you have to decide what was intended. If your code really was supposed to publicly import the symbols, prepend public to the import statement. This brings all the symbols imported into the namespace of the module, so any importing module also sees those symbols. In our example 2 above, this would mean adding public to the import statement in ex2_a.d

If the imported module was not supposed to publicly expose the symbols, then you need to fix all importing modules with this problem. In our example, this would mean adding import core.stdc.stdio; to the top of ex2_main.d.

In the case of accidentally exposing the FQN of symbols that were privately imported, this is typically an issue with the importing module, not the imported one. In this case, you need to add an import. In our example 1 case, this would mean adding an import for ex1_a module to ex1_main.d.

For example 3, you can achieve the original behavior by both selectively importing the symbol, and statically importing the module. Just add static import std.stdio; to your scoped imports. Alternatively, you can add writeln to the selectively imported symbols, and use the unqualified name instead of the FQN.

For an example of how Phobos was fixed for this problem (there were thousands of messages in every build with deprecation warnings), see the PR I created.

Bug 10378

Links: 10378 Pull Request

Description

Another import-related bug fix is to prevent unintentional hijacking of symbols inside a scoped import. Such imports are not at module level, but appear inside a function or other scope. These imports are only valid within the scope. Prior to 2.071, such imports were equivalent to importing every symbol in the imported module into the namespace of that scope. This overrode any other symbol in that namespace, including local variables in outer scopes. An example:

In 2.070 and prior, the assert above used ex4_a’s definition of foo, not the local variable. In 2.071 and beyond, the local foo has precedence. The precedence rules work like this:

  1. Any local symbols are examined first. This includes selective imports which are aliased into the local scope.
  2. Any module-level symbols are examined.
  3. Any symbols imported are examined, starting with the most derived scope imports, all the way to module-level imports.

Note that this may be a breaking change, as demonstrated by the example.
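The rules above can be sketched in a single file by colliding with names from std.conv (a stand-in for the ex4_a module, which is not reproduced here). The function names and the module-level text function are hypothetical, and the behavior shown assumes 2.071 or later.

```d
// Module-level function whose name collides with std.conv.text.
string text(int x) { return "module-level text"; }

string localWins() {
    string to = "local variable"; // collides with the std.conv.to template
    import std.conv;              // non-selective scoped import
    return to;                    // rule 1: the local symbol is found first
}

string moduleLevelWins() {
    import std.conv;
    return text(42);              // rule 2: module-level beats imported symbols
}

string fullyQualified() {
    import std.conv;
    return std.conv.text(42);     // the FQN still reaches the imported symbol
}

void main() {
    assert(localWins() == "local variable");
    assert(moduleLevelWins() == "module-level text");
    assert(fullyQualified() == "42");
}
```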

Why did we change this?

This was changed because any symbol added to an import can drastically affect any code that uses non-selective scoped imports, hijacking the symbol in ways that the author cannot predict. While there is still potential for hijacking, since scoped imports override any module-level or higher level scoped imports, at least symbols that are locally defined are not affected. These are the symbols under direct control of the author of the module, and they should always have precedence.

A common change to a module is to move imports inside the scope of functions or types that are the only users of that import. This helps avoid namespace pollution. However, given that local module functions had precedence over imported ones, but scoped imports would take precedence away, this move was not always what the user intended. For this reason, module functions now always have precedence over non-selective scoped imports.

Fixing problematic code

This one is a little more nuanced. It may be that you wished to override the local symbols! In this case, use a selective import. Selective imports alias the symbols selected into the local scope, overriding any other symbols defined at that point. In our example, if we expected foo to refer to ex4_a.foo, then we would use an import like this: import ex4_a: foo;

In addition, you can use the FQN instead of using the simple name. I would recommend using a static or renamed import in that case.
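Both fixes can be sketched in one file, again using std.conv in place of ex4_a. The module-level text function and the rename conv are illustrative names only.

```d
// Module-level function whose name collides with std.conv.text.
string text(int x) { return "module-level text"; }

string selective() {
    import std.conv : text; // aliases std.conv.text into this scope
    return text(42);        // the selective import overrides the module-level text
}

string renamed() {
    import conv = std.conv; // nothing is added unqualified
    return conv.text(42) ~ " / " ~ text(42);
}

void main() {
    assert(selective() == "42");
    assert(renamed() == "42 / module-level text");
}
```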

Imports from mixin templates [1]

Links: Forum discussion, issue 15925

Description

A somewhat controversial change with 2.071 is the effect mixins can have with imports. If you have a mixin template which imports a module, then use that template within a class or struct, the import is only considered while inside the mixin template. It is not considered when inside the class or struct. For example:

The previous version would allow the import to be considered where the mixin occurs. In order to have a mixin template add an imported symbol, you can selectively import the symbol. In this case, static import will not work:

Why did we change this?

The explanation seems to be that allowing such imports can create a form of hijacking. Since a class-level import would override a module-level import, a user may not realize that the mixed-in import is present and is therefore overriding a module-level import that is in the local module. The hijacking can come after the fact, in the imported module, without the user’s knowledge or any changes in their code.

Fixing problematic code

There isn’t a very easy way to rectify this problem. The only solution is to selectively import all the symbols you may need from that other module within the mixin.

Transitional Switches

The new version of the compiler comes with two new transitional switches that you can use to find or ignore these errors (note that these affect the mixin template imports as well):

-transition=checkimports: This switch will warn you if you have code that behaved differently prior to issue 10378 being fixed. Note that this may slow down compilation notably, hence it’s not the default [2].

-transition=import: This switch reverts behavior back to the import rules prior to 10378 being fixed. Only use this switch as a stop-gap measure until you can fix the code!

General Recommendations

Because importing external modules that are outside your control can lead to hijacking, I recommend never importing a module at a scoped level that isn’t selective, static, or renamed. This gives you full control over what invades your namespace. The compiler will protect you now a little bit better, but it’s always better to defend against namespace pollution from an uncontrolled module.

References

D programming language

D Import Spec

Issue 313

Issue 314

Issue 10378

Issue 15925

D Compiler Download

  1. Thanks to captaindet for bringing this issue to my attention.
  2. Thanks to Dicebot for pointing this out.