When I run my script in bash, I get the error: sh: 2: Syntax error: "|" unexpected. I don't know why. I want to use pipelines here, and a Perl script with the same command works, but I need it in Python.

Example of input (text file):

Kribbella flavida
Saccharopolyspora erythraea
Nocardiopsis dassonvillei
Roseiflexus sp.

Script:

#!/usr/bin/python
import sys
import os

input_ = open(sys.argv[1],'r')
output_file = sys.argv[2]
#stopwords = open(sys.argv[3],'r')
names_board = []
for name in input_:
    names_board.append(name.rstrip())
    print(name)
for row in names_board:
    print(row)
    os.system("esearch -db pubmed -query %s | efetch -format xml | xtract -pattern PubmedArticle -element AbstractText >> %s" % (name, output_file))

What operating system are you using? Have you read man esearch, man efetch, and man xtract?
– Eli Sadoff, Nov 5, 2016 at 14:34

A possibly unrelated problem is that you aren't quoting the query and the output file name in the command. Use:

os.system('esearch -db pubmed -query "%s" | efetch -format xml | xtract -pattern PubmedArticle -element AbstractText >> "%s"' % (name, output_file))

However, even that is not foolproof for all legal file names (such as file names that contain a double quote). I would recommend using the subprocess module instead of os.system, which leaves the shell out of the process altogether:

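import subprocess
import sys

# name is the query string taken from the loop in the question
esearch = ["esearch", "-db", "pubmed", "-query", name]
efetch = ["efetch", "-format", "xml"]
xtract = ["xtract", "-pattern", "PubmedArticle", "-element", "AbstractText"]
with open(sys.argv[2], "a") as output_file:
    # Connect the three processes directly; no shell parses the command
    p1 = subprocess.Popen(esearch, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(efetch, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()  # lets esearch get SIGPIPE if efetch exits early
    subprocess.call(xtract, stdin=p2.stdout, stdout=output_file)
    p2.stdout.close()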
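
If the shell pipeline has to stay, one way to handle the quoting concern raised above is the standard library's shlex.quote (Python 3.3+), which quotes a string for a POSIX shell. The following is only a sketch: build_command is a hypothetical helper name, and the query and file name are made-up sample values.

```python
import shlex


def build_command(query, output_path):
    """Return the question's pipeline with both values quoted for a POSIX shell.

    build_command is a hypothetical helper, not part of any library.
    """
    template = (
        "esearch -db pubmed -query {} | efetch -format xml | "
        "xtract -pattern PubmedArticle -element AbstractText >> {}"
    )
    # shlex.quote wraps each value so the shell sees it as exactly one word,
    # even if it contains spaces or double quotes.
    return template.format(shlex.quote(query), shlex.quote(output_path))


# Sample values (assumptions): a query with a space, a file name with quotes.
command = build_command("Roseiflexus sp.", 'my "odd" file.txt')
print(command)
```

The resulting string could then be passed to os.system as before. Avoiding the shell with subprocess, as above, remains the more robust option.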

The problem is that name still contains the newline that terminates each line read from the input file. The script appends name.rstrip() to names_board but passes the unstripped name to os.system. When you interpolate name into the shell command, the newline is inserted too, and the shell treats it as the end of the first command. The second line then starts with a pipe symbol, which is a syntax error: a pipe symbol must come between commands on the same line. Passing the stripped value (row in the script) avoids this.

A good hint is that sh reports the error at line 2, although the command appears to consist of a single line. After substitution it is two lines, and the second one is the problem.
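
The effect can be seen without running esearch at all by building the command string and looking at how many lines the shell would receive. This is a minimal sketch: TEMPLATE is a shortened copy of the question's command, and build_command is a hypothetical helper name.

```python
# Shortened version of the question's command (unquoted, as in the original).
TEMPLATE = "esearch -db pubmed -query %s | efetch -format xml"


def build_command(name):
    """Interpolate name into the shell command the same way the question does."""
    return TEMPLATE % name


raw = "Kribbella flavida\n"  # a line as read from the input file

broken = build_command(raw)               # newline ends up inside the command
fixed = build_command(raw.rstrip("\n"))   # newline removed before interpolation

# The broken command is two shell lines, and the second begins with the pipe.
print(broken.splitlines())
print(fixed.splitlines())
```

In the question's script, the stripped values are already stored in names_board, so using row instead of name in the os.system call has the same effect as the rstrip here.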
