I've run into a problem with the re module in Python 3.6.5.
I have this pattern in my regular expression:
'\\nRevision: (\d+)\\n'
But when I run it, I get a DeprecationWarning.
I searched for the problem on SO and haven't found an answer. What should I use instead of \d+? Just [0-9]+, or maybe something else?
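In a normal (non-raw) string literal, Python processes backslash escapes before the re module ever sees the pattern. \d is not an escape sequence that string literals recognize, so Python leaves the backslash in place and, since version 3.6, emits a DeprecationWarning.
Declare your regex pattern as a raw string instead by prepending r, as below: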
r'\nRevision: (\d+)\n'
This also means you can drop the doubled backslashes before n: in a raw string, \n stays as a backslash followed by n, and re itself interprets that sequence as a newline character.
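As a quick check of the point above, here is a minimal sketch. The sample text is made up for illustration; only the pattern comes from the question.

```python
import re

# Hypothetical sample text; only the pattern is taken from the question.
sample = "commit log\nRevision: 1234\nAuthor: someone"

pattern = r'\nRevision: (\d+)\n'

# In a raw string the backslash is kept, so r'\n' is two characters
# (a backslash and the letter n), not a single newline character.
raw_newline_length = len(r'\n')

# The re module interprets the two-character sequence \n as a newline.
match = re.search(pattern, sample)
revision = match.group(1) if match else None

print(raw_newline_length, revision)  # prints: 2 1234
```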
You get the warning for
'\\nRevision: (\d+)\\n'
because Python treats \d as an invalid escape sequence. Python leaves that substring unchanged, but it has warned about it since version 3.6:
Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the result. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences only recognized in string literals fall into the category of unrecognized escapes for bytes literals.
Changed in version 3.6: Unrecognized escape sequences produce a DeprecationWarning. In a future Python version they will be a SyntaxWarning and eventually a SyntaxError.
(source)
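To see the warning without editing a file, one option is to compile the offending literal from a string and record what Python emits. This is only a sketch: the variable names and the "<example>" filename are placeholders, and the warning category depends on the interpreter version, as the quoted documentation describes.

```python
import warnings

# The question's literal, kept verbatim inside a raw string so that this
# example file itself does not contain an invalid escape sequence.
source = r"pattern = '\\nRevision: (\d+)\\n'"

with warnings.catch_warnings(record=True) as caught:
    # Record every warning, including ones that are hidden by default.
    warnings.simplefilter("always")
    compile(source, "<example>", "exec")

# Expect DeprecationWarning or SyntaxWarning, depending on the version.
categories = [w.category.__name__ for w in caught]
messages = [str(w.message) for w in caught]

for category, message in zip(categories, messages):
    print(category, message)
```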
Thus, you can fix this warning either by escaping that backslash properly or by using a raw string.
That is, escape more:
'\\nRevision: (\\d+)\\n'
Or use a raw string literal (where \ doesn't start an escape sequence):
r'\nRevision: (\d+)\n'
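The two fixes should produce exactly the same pattern string, so the choice is one of readability. A small sketch to confirm this, using a made-up sample string:

```python
import re

escaped = '\\nRevision: (\\d+)\\n'  # every backslash doubled
raw = r'\nRevision: (\d+)\n'        # raw string literal

# Both literals evaluate to the same sequence of characters.
same_pattern = escaped == raw

# Hypothetical input, used only to show both patterns behave alike.
sample = "\nRevision: 42\n"
revisions = [re.search(p, sample).group(1) for p in (escaped, raw)]

print(same_pattern, revisions)  # prints: True ['42', '42']
```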