Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams Please post the actual code you're using. Note though, splitting that string on white space will not produce the command you want because your bash -c command will not be a single argument, that's probably the root of your problems. Either leave it as a string with shell=True or build it up as a list instead of a string, or try shlex.split instead of naive string split Eric Renouf Jul 12, 2017 at 15:00 Also, please don't add information in the comments, instead edit your question with the updated information Eric Renouf Jul 12, 2017 at 15:01

Immediate Issue: Shell Quoting vs Python Quoting

In an unquoted context in shell, \; prevents ; from being parsed as a command separator. When invoking a command with an argument list from Python without shell=True , however, there is no shell to interpret and remove that backslash, and that extra quoting makes behavior equivalent to passing '\;' in shell -- meaning the argument passed to find isn't the exact string ; it's expecting, and that you get a syntax error.

However, the original code is incorrect (in security-impacting ways!) on other counts as well; one of the below solutions should be used.

Smallest Change: Using find Safely

Personally, if you're willing to be a little flexible on the output names (ie. testmkv-27311.mkv.avi ), I would write this more like the following:

subprocess.call([
  'find', '/',
  '-name', 'testmkv-27311.mkv',
  '-exec',
      'ffmpeg',
          '-i', '{}',
          '-vcodec', 'copy',
          '-acodec', 'copy',
          '-codec:v', 'libx264',
          '-profile:v', 'high',
          '-preset', 'slow',
          '-b:v', '500k',
          '-maxrate', '500k',
          '-bufsize', '1000k',
          '-vf', 'scale=-1:480',
          '-threads', '0',
          '-codec:a', 'libfdk_aac',
          '-b:a', '128k',
          '{}.mp4',

Notably:

  • We are not invoking a shell, and thus avoiding shell-injection-style security vulnerabilities.
  • Quoting is all done purely with Python syntax -- no nested shell quoting is used or required.
  • POSIX does not require find -exec to support {} as a substring (rather than a complete argument). As such, this answer is (like the code in the question) not as portable as might be hoped.
  • Alternate Approach: Native Python File Searching

    That said, there isn't really a compelling reason to use find at all, when the Python standard library can do searches for you, and will make it easy to update the names yourself:

    def convert(filename_in, filename_out):
      subprocess.call([
        'ffmpeg',
          '-i', filename_in,
          '-vcodec', 'copy',
          '-acodec', 'copy',
          '-codec:v', 'libx264',
          '-profile:v', 'high',
          '-preset', 'slow',
          '-b:v', '500k',
          '-maxrate', '500k',
          '-bufsize', '1000k',
          '-vf', 'scale=-1:480',
          '-threads', '0',
          '-codec:a', 'libfdk_aac',
          '-b:a', '128k',
          filename_out,
    target = 'testmkv-27311.mkv'
    for root, dir, files in os.walk('/'):
      if target in files:
        filename_in = os.path.join(dir, target)
        filename_out = filename_in[:-3]+'.mp4'
        convert(filename_in, filename_out)
    

    Alternate Safe Approach: Hardcoded Script Iterating Over Arguments

    As yet another approach to doing this securely (not having find modify or generate code from names, which allows those names to be parsed as code by the shell), you could have your bash script iterate over its arguments:

    bash_script=r'''
    for filename; do
      ffmpeg -i "$filename" \
             -vcodec copy \
             -acodec copy \
             -codec:v libx264 \
             -profile:v high \
             -preset slow \
             -b:v 500k \
             -maxrate 500k \
             -bufsize 1000k \
             -vf scale=-1:480 \
             -threads 0 \
             -codec:a libfdk_aac \
             -b:a 128k \
             "${filename%.mkv}.mp4}"
    subprocess.call([
      'find', '/',
        '-name', 'testmkv-27311.mkv',
        '-exec', 'bash', '-c', bash_script, '_',
        '{}', '+'])
    

    In this case, the _ is the placeholder for $0 in the script; subsequent arguments (the filenames found by find) are passed in $1, $2, etc; and for filename; do (which defaults to for filename in "$@"; do) will iterate over them.

    Using -exec ... {} + passes as many as filenames as possible to each command, rather than invoking the command once per filename; this reduces the number of shells invoked, and thus enhances performance.

    Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.