I'm wondering about using constants in Perl regexes. I want to do something similar to:

use constant FOO => "foo";
use constant BAR => "bar";
$somvar =~ s/prefix1_FOO/prefix2_BAR/g;

Of course, in there, FOO resolves to the three letters F O O instead of expanding to the constant.

I looked online, and someone was suggesting using either ${\FOO} or @{[FOO]}. Someone else mentioned (?{FOO}). I was wondering if anyone could shed some light on which of these is correct, and whether any of them has an advantage. Alternatively, is it better to just use a non-constant variable? (Performance is a factor in my case.)

In terms of performance, there's little reason to prefer a constant over a variable. It doesn't make a great deal of difference - perl will compile a regex anyway.

For example:

#!/usr/bin/perl
use warnings;
use strict;
use Benchmark qw(:all);
use constant FOO => "foo";
use constant BAR => "bar";
my $FOO_VAR = 'foo';
my $BAR_VAR = 'bar';
sub pattern_replace_const {
   my $somvar = "prefix1_foo test";
   $somvar =~ s/prefix1_${\FOO}/prefix2_${\BAR}/g;
}
sub pattern_replace_var {
   my $somvar = "prefix1_foo test";
   $somvar =~ s/prefix1_$FOO_VAR/prefix2_$BAR_VAR/g;
}
cmpthese(
   1_000_000,
   {  'const' => \&pattern_replace_const,
      'var'   => \&pattern_replace_var
   }
);

Gives:

          Rate const   var
const 917095/s    --   -1%
var   923702/s    1%    --

Really not enough in it to worry about.

However, it may be worth noting that you can precompile a regex with qr// and use that, which (provided the regex is static) might improve performance. It might not, though, because perl can detect static regexes and does that itself.

             Rate      var    const compiled
var      910498/s       --      -2%      -9%
const    933097/s       2%       --      -7%
compiled 998502/s      10%       7%       --

With code like:

my $compiled_regex = qr/prefix1_$FOO_VAR/;
sub compiled_regex {
    my $somvar = "prefix1_foo test";
    $somvar =~ s/$compiled_regex/prefix2_$BAR_VAR/g;
}

Honestly though - this is a micro-optimisation. The regex engine is fast compared to your code, so don't worry about it. If performance is critical to your code, then the correct way of dealing with it is to write the code first, and then profile it to look for hotspots to optimise.

I think the point of constants is not that they may be faster than variables, but that they are constant. – Borodin Feb 9, 2017 at 14:40

In my particular case, the values I'm using are indeed constant. I was thinking of moving to constants for this reason along with the potential performance benefits. – user2766918 Feb 9, 2017 at 14:59

Also constants may be used to assemble several regexps using common components. That may improve maintainability. – U. Windl Jan 13, 2021 at 12:39

The problem shown is due to those constants being barewords (built at compile time). As the documentation for the constant pragma puts it:

Constants defined using this module cannot be interpolated into strings like variables.

In the current implementation (of the constant pragma) they are "inlinable subroutines".

This problem can be solved nicely by using a module like Const::Fast:

use Const::Fast;
const my $foo => 'FOO';
const my $bar => 'BAR';
my $var = 'prefix1_FOO_more';
$var =~ s/prefix1_$foo/prefix2_$bar/g;

Now they do get interpolated. Note that more complex replacements may need /e.

These are built at runtime so you can assign results of expressions to them. In particular, you can use the qr operator, for example

const my $patt => qr/$foo/i;  # case-insensitive 

The qr is the recommended way to build regex patterns. (It interpolates unless the delimiter is '.) The performance gain is most often tiny, but you get a proper regular expression, which can be built and used as such (and in this case a constant as well).
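As a rough sketch of what "built and used as such" can look like, the following assembles a larger pattern from smaller qr// pieces. A plain scalar stands in for the read-only value here (an assumption, so that the example needs only core Perl), and the names $word, $prefix, $full and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $word = 'foo';                 # stands in for a constant of your chosen sort

# Each qr// object is compiled once and keeps its own flags when it is
# interpolated into a larger pattern.
my $foo_re = qr/\Q$word\E/i;      # case-insensitive component
my $prefix = qr/prefix\d+_/;      # case-sensitive component
my $full   = qr/^$prefix$foo_re\b/;

for my $s ('prefix1_FOO test', 'prefix2_foo', 'other_foo') {
    printf "%-18s %s\n", $s, ($s =~ $full ? 'matches' : 'no match');
}
```

The \Q...\E pair is there so that a value containing regex metacharacters would still be matched literally.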

I recommend Const::Fast module over the other one readily, and in fact over all others at this time. See a recent article with a detailed discussion of both. Here is a review of many other options.

I strongly recommend using a constant (of your chosen sort) for things meant to be read-only. That is good for the health of the code, and of developers who come into contact with it (yourself in the proverbial six months included).

Since constants created with use constant are subroutines, we need to be able to run code in order to have them evaluated and replaced by the given values. We can't just "interpolate" (evaluate) a variable -- it's not a variable.

A way to run code inside a string (which need be interpolated, so effectively double quoted) is to de-reference, where there's an expression in a block under a reference; then the expression is evaluated. So we need to first make that reference. So either

say "@{ [FOO] }";  # make array reference, then dereference
say "${ \FOO }";   # make scalar reference then dereference

prints foo. See the docs for why this works and for examples. Thus one can do the same inside a regex, and both in matching and replacement parts

s/prefix1_${\FOO}/prefix2_${\BAR}/g;

(or with @{[...]}), since they are evaluated as interpolated strings.
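For comparison, here is a small self-contained sketch that runs both dereference forms side by side, plus a third form that uses /e so that the replacement is ordinary Perl code and the constant can simply be called. The sample string and the variable names are assumptions made for illustration only.

```perl
use strict;
use warnings;
use constant FOO => 'foo';
use constant BAR => 'bar';

my $text = 'prefix1_foo and prefix1_foo again';   # made-up sample input

# Form 1: scalar-reference trick on both sides.
(my $by_ref = $text) =~ s/prefix1_${\FOO}/prefix2_${\BAR}/g;

# Form 2: array-reference trick on both sides.
(my $by_aref = $text) =~ s/prefix1_@{[FOO]}/prefix2_@{[BAR]}/g;

# Form 3: copy the constant into a scalar for the pattern, and use /e
# so the replacement is evaluated as code and BAR is a normal call.
my $foo = FOO;
(my $by_eval = $text) =~ s/prefix1_\Q$foo\E/'prefix2_' . BAR/ge;

print "$_\n" for $by_ref, $by_aref, $by_eval;
```

All three substitutions should produce the same string; the difference is only in how readable each one is.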

Which is "better"? These are tricks. There is rarely, if ever, a need for doing this. It has a very good chance to confuse the reader. So I just wouldn't recommend resorting to these at all.

As for (?{ code }), that is a regex feature, whereby code is evaluated inside a pattern (matching side only). It is complex and tricky and very rarely needed. See about it in perlretut and in perlre.
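To make the distinction concrete, here is a hedged sketch of how (?{ code }) behaves with a constant. The block is zero-width: it runs the code but does not match the constant's value. The related (??{ code }) construct, which is not mentioned in the question and is brought in here only for contrast, treats the returned value as a sub-pattern. The sample strings and the helper names are hypothetical.

```perl
use strict;
use warnings;
use constant FOO => 'foo';

# (?{ ... }) only runs code; it consumes no characters, so the text
# after "prefix1_" is never compared with FOO.
sub match_code_block {
    my ($s) = @_;
    return $s =~ /^prefix1_(?{ FOO })$/ ? 1 : 0;
}

# (??{ ... }) uses the value returned by the code as a sub-pattern.
sub match_postponed {
    my ($s) = @_;
    return $s =~ /^prefix1_(??{ FOO })$/ ? 1 : 0;
}

for my $s ('prefix1_foo', 'prefix1_xyz') {
    printf "%-12s code block: %d  postponed: %d\n",
        $s, match_code_block($s), match_postponed($s);
}
```

Under these assumptions, only the postponed form tells the two inputs apart, which is one more reason to prefer plain interpolation or qr// over either construct.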

Discussing speed of these things isn't really relevant. They are certainly outside the realm of clean and idiomatic code, while you'd be hard pressed to detect runtime differences.

But if you must use one of these, I'd much rather interpolate inside a scalar reference via a trick than reach for a complex regex feature.

@U.Windl Indeed, it makes for a better example using a variable, changed w/ a comment. Thank you – zdim Jan 13, 2021 at 21:27

According to PerlMonks, you had better create an already-interpolated pattern if you are concerned about performance:

use constant PATTERN => 'def';
my $regex = qr/${\(PATTERN)}/; #options such as /m can go here.
if ($string =~ $regex) { ... }

Here is the link to the whole discussion.
