Perl: When DWIM Doesn’t

We’ve written in the past of our love for Perl. We meant it. But in any loving relationship, there will also be hard parts and unpleasant surprises. These are some tales of unpleasant surprises.

Surprise One: Bonus Feature

Here is some code that sets up a global $config hash, setting a file path the application should read data from.

our $config;
$config->{file_paht} = "/opt/app/data_file";

And the code that reads the data file:

open (my $fh, "<", $config->{file_path})
    or  die "can't open $config->{file_path}: $!";

my $data;
{
    local $/ = undef;
    $data = <$fh>;
}

You probably spotted that file_paht typo before we did. A warning or error would have helped us spot it earlier, but instead we got a bonus feature.

Perl decided what we really wanted was an anonymous temporary file, and provided us one. A brand-new, anonymous tempfile, that could never have been written to, opened for reading.
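The special case can be reproduced in isolation. The following is a minimal sketch, assuming a perl built with PerlIO and a writable temporary directory; the hash contents simply mirror the typo above.

```perl
use strict;
use warnings;

# Same typo as above: the key we set is not the key we read.
my $config = { file_paht => "/opt/app/data_file" };

# $config->{file_path} is undef, so this is really open($fh, "<", undef).
# Three-argument open treats an undef filename as a request for an
# anonymous temporary file, so the call succeeds instead of failing.
my $opened = open(my $fh, "<", $config->{file_path});

# Slurp whatever the handle offers; the temp file is brand new and empty.
my $data = do { local $/; <$fh> };

printf "open %s, read %d bytes\n",
    ($opened ? "succeeded" : "failed: $!"),
    length($data // "");
```

Because the open succeeds, the `or die` branch in the original code never fires, and the application carries on with no data.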

This bonus feature is documented as a “special” case in the sixteenth or seventeenth paragraph of perldoc -f open. Special, indeed. So special that to debug it we ran an strace, thinking …

where the f^H Sam Hill did that open(“/tmp/PerlIO_Z2sAqY”, O_RDWR…) come from?

… and grepped the source to find the answer, and re-read perldoc -f open to try to find our sanity.

Avoiding this bug requires being more defensive, which is always a good idea when reading disk files in production code:

if (exists $config->{file_path}  and  -r $config->{file_path}) {
    ...
}
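One way to package that kind of check is a small helper that refuses to call open at all when the path is unusable. This is only a sketch: the name slurp_config_file is hypothetical, and it tests for a defined, non-empty value rather than key existence, since an undef value would reach open just the same.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: slurp the file named by $config->{$key}.
sub slurp_config_file {
    my ($config, $key) = @_;

    my $path = $config->{$key};

    # Refuse undef and empty paths before open() ever sees them.
    croak "config key '$key' is missing or empty"
        unless defined $path and length $path;

    open(my $fh, "<", $path)
        or croak "can't open $path: $!";

    local $/ = undef;    # slurp mode
    my $data = <$fh>;
    close $fh;
    return $data;
}

my $config = { file_paht => "/opt/app/data_file" };    # typo kept on purpose
my $data = eval { slurp_config_file($config, "file_path") };
print "refused: $@" unless defined $data;
```

Letting open itself report an unreadable file, rather than testing -r first, avoids a gap between the check and the open.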

In writing this article we began to consider this case a bug in perl, and went to file one at rt.perl.org, only to find that the wonderful Perl 5 Porters had beaten us to it, and that there is a current thread on the mailing list concerning this bug. Thanks p5p!

Surprise Two: When DWIM Doesn’t

We are constantly A/B testing at Shutterstock. Sometimes we need to usurp a random test assignment to view specific variants. The overrides are cached in the session:

$session->{ab_variant_overrides} = [34, 29];

Code checks if the variants are being usurped, and builds the appropriate template data structure:

if (exists $session->{ab_variant_overrides}) {
    # template expects custom_overrides to be [int, ...]
    $template->{custom_overrides} = $session->{ab_variant_overrides};
}

At one point we needed a quick hack to do something special inside a usurper variant:

if (grep { $_ == 42 } @{ $session->{ab_variant_overrides} }) {
    # give them something special
}

You may have spotted a bug in that code. If we haven’t yet assigned to $session->{ab_variant_overrides}, we’ll be dereferencing an undefined value. What should happen in that case?

One might expect Perl’s fatal “Can’t use an undefined value as an ARRAY reference” under strictures. Instead, the presence of the grep springs an empty array reference into place and assigns it to $session->{ab_variant_overrides}. Oops.

This behavior is hinted at in item 6 of the “Making References” section in perlref:

“References of the appropriate type can spring into existence if you dereference them in a context that assumes they exist. Because we haven’t talked about dereferencing yet, we can’t show you any examples yet.”
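The effect is easy to reproduce with an empty session. A minimal sketch, using a bare hash reference as a stand-in for the real session object:

```perl
use strict;
use warnings;

my $session = {};    # no overrides assigned yet

# grep aliases $_ to each element, so the dereference happens in a
# modifiable context and Perl vivifies the missing array reference.
my @special = grep { $_ == 42 } @{ $session->{ab_variant_overrides} };

print "overrides key exists after grep\n"
    if exists $session->{ab_variant_overrides};
```

From this point on, the `exists` test in the template code succeeds, even though nobody asked for an override.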

A quick fix here is to be more defensive by changing the dereferencing:

grep { $_ == 42 } @{ $session->{ab_variant_overrides} || [] }
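A short sketch of the fix in action, again with a bare hash reference standing in for the session; the override list containing 42 is made up for illustration.

```perl
use strict;
use warnings;

my $session = {};    # no overrides assigned yet

# "|| []" hands grep a throwaway empty array when the key is unset, so the
# hash element itself is only read, never dereferenced.
my @special = grep { $_ == 42 } @{ $session->{ab_variant_overrides} || [] };
my $still_absent = !exists $session->{ab_variant_overrides};
print "overrides key still absent\n" if $still_absent;

# With overrides in place the check finds the variant as before.
# (The value 42 is added here purely for illustration.)
$session->{ab_variant_overrides} = [34, 42];
my $has_special = grep { $_ == 42 } @{ $session->{ab_variant_overrides} || [] };
print "special variant active\n" if $has_special;
```

On perl 5.10 and later the defined-or operator `//` would serve equally well in place of `||` here, since an array reference is always true.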

What are your tales of surprise?

5 Responses to Perl: When DWIM Doesn’t

  1. Kris Arnold says:

    On the subject of defensive coding, when I am using hashes (frequently) and when I am worried about autovivification of hash keys (whenever I am using hashes), I like to take advantage of Hash::Util::lock_keys.

    After “locking” a hash, an error will be thrown when attempting to access a nonexistent key like file_paht.
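A minimal sketch of the lock_keys approach applied to the config example from the post. It assumes the set of legal keys is known up front; here that is just file_path.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

my $config = {};

# Restrict the hash to the keys the application knows about.
lock_keys(%$config, qw(file_path));

# The typo is now a runtime error instead of a silently created key.
my $stored = eval { $config->{file_paht} = "/opt/app/data_file"; 1 };
my $error  = $@;
print "rejected: $error" unless $stored;

# The correctly spelled key is still assignable.
$config->{file_path} = "/opt/app/data_file";
```

The error surfaces at the assignment, which is where the typo actually lives, rather than later at the open call.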

  2. Daniel Lopes says:

    Hello douglas,

    Apart from what Kris said, may I also recommend reading the book ‘Modern Perl’? It has a lot of tips that help with the constraints you need to put in your software architecture to prevent these kinds of troubles, namely:

    - the use of the // operator for undef conditionals
    - the smart match operator ~~
    - the ‘autovivification’ pragma
    - using the strict and warnings pragmas and a perl version number
    - the various contexts perl uses and how it behaves in each situation

    Perl is not a silver bullet. If you enforce implementation rules from the beginning you can have more fun and a far more effective experience.

  3. dams says:

    You may want to check out the autovivification module from the great Vincent Pit. It removes corner cases of autovivification. https://metacpan.org/module/autovivification

  4. fenway says:

    In general, “exists” is just unreliable with auto-vivification. “defined” serves just as well. How often do you really need to check for the existence of a hash key alone?
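One case where autovivification and existence checks interact badly is nested lookups. A small sketch, with made-up key names:

```perl
use strict;
use warnings;

my $session = {};

# Asking about a nested key vivifies every intermediate level on the way.
my $found = exists $session->{ab}{variant_overrides};

print "inner key found: ", ($found ? "yes" : "no"), "\n";
print "outer 'ab' key exists now: ",
    (exists $session->{ab} ? "yes" : "no"), "\n";
```

Note that swapping `exists` for `defined` on the nested lookup vivifies the intermediate hash in the same way; the safe pattern is to test each level before descending.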

  5. spacebat says:

    Your defensive check still isn’t watertight:

    if (exists $config->{file_path} and -r $config->{file_path}) { ... }

    If the value associated with the “file_path” key is undefined for some reason, you get the temporary file again. This will prevent opening a temporary file:

    if (defined $config->{file_path} and -r $config->{file_path}) { ... }

    Nice post BTW.