How to split one string into multiple strings separated by at least one space in bash shell?

I have a string containing many words with at least one space between each two. How can I split the string into individual words so I can loop through them?
The string is passed as an argument, e.g. ${2} == "cat cat file". How can I loop through it?
Also, how can I check if a string contains spaces?
Do you just need to loop (e.g. execute a command for each of the words)? Or do you need to store a list of words for later use? – DVK Sep 24 '09 at 20:07
Did you try just passing the string variable to a for loop? Bash, for one, will split on whitespace automatically.
sentence="This is   a sentence."
for word in $sentence
do
    echo "$word"
done
                @MobRule - the only drawback of this is that you cannot easily capture the output for further processing (at least I don't recall a way). See my "tr" solution below for something that sends its output to STDOUT.
                    – DVK
                Sep 24 '09 at 20:04
                Actually this trick is not only a wrong solution, it also is extremely dangerous due to shell globbing.  touch NOPE; var='* a *'; for a in $var; do echo "[$a]"; done outputs [NOPE] [a] [NOPE] instead of the expected [*] [a] [*] (LFs replaced by SPC for readability).
                    – Tino
                May 13 '15 at 9:55
I like the conversion to an array, to be able to access individual elements:
sentence="this is a story"
stringarray=($sentence)
Now you can access individual elements directly (indexing starts at 0):
echo "${stringarray[0]}"
or expand the array back into words in order to loop:
for i in "${stringarray[@]}"
do
  : # do whatever on $i
done
Of course, looping through the string directly was answered before, but that answer has the disadvantage of not keeping track of the individual elements for later use:
for i in $sentence
do
  : # do whatever on $i
done
See also Bash Array Reference.
                Sadly not quite perfect, because of shell-globbing: touch NOPE; var='* a *'; arr=($var); set | grep ^arr= outputs arr=([0]="NOPE" [1]="a" [2]="NOPE") instead of the expected arr=([0]="*" [1]="a" [2]="*")
                    – Tino
                May 13 '15 at 10:48
                @Tino: if you do not want globbing to interfere then simply turn it off. The solution will then work fine with wildcards as well. It is the best approach in my opinion.
                    – Alexandros
                Dec 17 '18 at 13:19
                @Alexandros My approach is to only use patterns, which are secure by-default and working in every context perfectly.  A requirement to change shell-globbing to get a secure solution is more than just a very dangerous path, it's already the dark side.  So my advice is to never get accustomed to use pattern like this here, because sooner or later you will forget about some detail, and then somebody exploits your bug.  You can find proof for such exploits in the press.  Every.  Single.  Day.
                    – Tino
                Dec 18 '18 at 15:36
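The "turn globbing off" suggestion discussed in these comments could be sketched roughly as follows. This is only an illustration under stated assumptions: split_words and the global array words are hypothetical names that appear in none of the answers, IFS is assumed to hold its default value, and the temporary directory merely recreates the touch NOPE test from the comments.

```shell
#!/usr/bin/env bash
# split_words is a hypothetical helper, not taken from any answer here.
# It splits its first argument on the default IFS into the global array
# `words`, switching globbing off only while the split happens.
split_words() {
  local restore_glob=0
  case $- in
    *f*) ;;                          # noglob already set: leave it alone
    *)   set -f; restore_glob=1 ;;   # disable globbing for the split
  esac
  words=($1)                         # unquoted on purpose: split, but do not glob
  if (( restore_glob )); then set +f; fi
}

# Same setup as the test in the comments: a file that "*" would match.
demo_dir=$(mktemp -d) && cd "$demo_dir" && touch NOPE
split_words '* a   *'
printf '[%s]\n' "${words[@]}"        # prints [*], [a] and [*] on separate lines
```

Restoring the previous state, instead of always running set +f, keeps the helper from re-enabling globbing for a caller that had switched it off deliberately.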
After that, individual words in $text will be in $1, $2, $3, etc. For robustness, one usually does
set -- junk $text
shift
to handle the case where $text is empty or starts with a dash. For example:
text="This is          a              test"
set -- junk $text
shift
for word; do
  echo "[$word]"
done
This prints
[This]
[is]
[a]
[test]
                This is an excellent way to split the var so that individual parts may be accessed directly. +1; solved my problem
                    – Cheekysoft
                Jul 26 '11 at 11:28
                I was going to suggest using awk but set is much easier.  I'm now a set fanboy.  Thanks @Idelic!
                    – Yzmir Ramirez
                Aug 18 '12 at 1:47
                Please be aware of shell globbing if you do such things: touch NOPE; var='* a *'; set -- $var; for a; do echo "[$a]"; done outputs [NOPE] [a] [NOPE] instead of the expected [*] [a] [*].  Only use it if you are 101% sure that there are no shell metacharacters in the string being split!
                    – Tino
                May 13 '15 at 10:03
                @Tino: That issue applies everywhere, not only here, but in this case you could just set -f before set -- $var and set +f afterwards to disable globbing.
                    – Idelic
                May 14 '15 at 5:11
                @Idelic: Good catch.  With set -f your solution is safe, too.  But set +f is the default of each shell, so it is an essential detail, which must be noted, because others are probably not aware of it (as I was, too).
                    – Tino
                May 14 '15 at 12:50
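One way to keep the set -f / set +f pair from being forgotten is to do the split inside a subshell, so that both the option change and the replaced positional parameters disappear when it exits. The following is a minimal sketch of that idea, not something proposed in the thread; the variable names text and result are chosen for illustration only.

```shell
#!/usr/bin/env bash
text='* a   *'

# $( ... ) runs in a subshell, so `set -f` and the new positional
# parameters do not leak into the calling shell.
result=$(
  set -f                 # no globbing inside this subshell
  set -- junk $text      # "junk" guards against empty text or a leading dash
  shift
  for word; do
    printf '[%s]' "$word"
  done
)
echo "$result"           # prints [*][a][*]
```

The trade-off is that the words themselves are not available to the parent shell, only whatever the subshell prints.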
Probably the easiest and most secure way in Bash 3 and above is:
var="string    to  split"
read -ra arr <<<"$var"
(where arr is the array that receives the split parts of the string) or, if there might be newlines in the input and you want more than just the first line:
var="string    to  split"
read -ra arr -d '' <<<"$var"
(please note the space in -d ''; it cannot be omitted), but this might leave an unexpected newline in the result, because <<<"$var" implicitly appends an LF.
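The stray newline shows up when LF is not part of IFS. A possible way around it, sketched here as an assumption rather than as part of the original answer, is to feed read through printf '%s\0' so that the input ends with the NUL that -d '' waits for and no LF is added. The array names with_lf and clean are made up for the comparison.

```shell
#!/usr/bin/env bash
var='a:b'

# Here-string: the appended LF ends up inside the last field,
# because LF is not in IFS for this call.
IFS=: read -r -d '' -a with_lf <<<"$var"

# printf supplies the NUL delimiter and appends no LF.
IFS=: read -r -d '' -a clean < <(printf '%s\0' "$var")

printf '%q\n' "${with_lf[1]}" "${clean[1]}"   # prints $'b\n' and then b
```

A side effect of the second form is that read finds its delimiter and returns success, whereas the here-string variant hits end of input first and returns a non-zero status.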
Example:
touch NOPE
var="* a  *"
read -ra arr <<<"$var"
for a in "${arr[@]}"; do echo "[$a]"; done
Outputs the expected
[*]
[a]
[*]
as this solution (in contrast to all previous solutions here) is not prone to unexpected and often uncontrollable shell globbing.
This also gives you the full power of IFS, which is probably what you want:
Example:
IFS=: read -ra arr < <(grep "^$USER:" /etc/passwd)
for a in "${arr[@]}"; do echo "[$a]"; done
Outputs something like:
[tino]
[x]
[1000]
[1000]
[Valentin Hilbig]
[/home/tino]
[/bin/bash]
As you can see, spaces can be preserved this way, too:
IFS=: read -ra arr <<<' split  :   this    '
for a in "${arr[@]}"; do echo "[$a]"; done
outputs
[ split  ]
[   this    ]
Please note that the handling of IFS in Bash is a subject of its own, so run your own tests. Some interesting cases:
unset IFS: Splits on runs of SPC, TAB and NL, and ignores them at the start and end of the line
IFS='': No field separation, just reads everything
IFS=' ': Splits on runs of SPC (and SPC only)
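The cases listed above can be tried out with a few one-liners. This is a small sketch with made-up variable names (by_space, by_colon, whole); it also shows a point the list does not spell out, namely that a non-whitespace separator such as ":" does not collapse, so two in a row produce an empty field.

```shell
#!/usr/bin/env bash
# Whitespace separators collapse: two spaces still mean one boundary.
IFS=' ' read -ra by_space <<<'a  b'
# Non-whitespace separators do not collapse: "::" yields an empty field.
IFS=: read -ra by_colon <<<'a::b'
# Empty IFS: no splitting, and leading/trailing blanks survive.
IFS= read -r whole <<<'  a  b  '

echo "${#by_space[@]} ${#by_colon[@]} [$whole]"   # prints: 2 3 [  a  b  ]
```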
One last example:
var=$'\n\nthis is\n\n\na test\n\n'
IFS=$'\n' read -ra arr -d '' <<<"$var"
i=0; for a in "${arr[@]}"; do let i++; echo "$i [$a]"; done
outputs
1 [this is]
2 [a test]
while
unset IFS
var=$'\n\nthis is\n\n\na test\n\n'
read -ra arr -d '' <<<"$var"
i=0; for a in "${arr[@]}"; do let i++; echo "$i [$a]"; done
outputs
1 [this]
2 [is]
3 [a]
4 [test]
If you are not used to $'ANSI-ESCAPED-STRING', get used to it; it's a timesaver.
If you do not include -r (as in read -a arr <<<"$var"), then read interprets backslash escapes.  This is left as an exercise for the reader.
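For readers who would rather see the effect of -r than work it out, here is a short sketch. The input string and the array names cooked and raw are chosen purely for illustration.

```shell
#!/usr/bin/env bash
line='a\ b c'

read -a cooked <<<"$line"    # backslash escapes the space: "a b" stays one word
read -ra raw <<<"$line"      # backslash is kept as an ordinary character

echo "${#cooked[@]} ${#raw[@]}"      # prints: 2 3
printf '[%s]' "${cooked[@]}"; echo   # prints: [a b][c]
printf '[%s]' "${raw[@]}"; echo      # prints: [a\][b][c]
```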
For the second question:
To test for something in a string I usually stick to case, as it can check for multiple cases at once (note: case only executes the first match; if you need fallthrough, use multiple case statements), and this need is quite often the case (pun intended):
case "$var" in
'')                empty_var;;                # variable is empty
*' '*)             have_space "$var";;        # have SPC
*[[:space:]]*)     have_whitespace "$var";;   # have whitespaces like TAB
*[^-+.,A-Za-z0-9]*) have_nonalnum "$var";;    # non-alphanum-chars found
*[-+.,]*)          have_punctuation "$var";;  # some punctuation chars found
*)                 default_case "$var";;      # if all above does not match
esac
So you can set the return value to check for SPC like this:
case "$var" in (*' '*) true;; (*) false;; esac
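The one-liner above can be wrapped in small functions so that it reads naturally in an if. This is a minimal sketch; has_space and has_whitespace are hypothetical names, not something defined in the answer.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: return 0 (true) if the argument matches, 1 otherwise.
has_space()      { case $1 in (*' '*)         return 0;; (*) return 1;; esac; }
has_whitespace() { case $1 in (*[[:space:]]*) return 0;; (*) return 1;; esac; }

if has_space "cat cat file"; then
  echo "contains a space"
fi
if ! has_space $'tab\tonly' && has_whitespace $'tab\tonly'; then
  echo "no space, but other whitespace"
fi
```

Because these are shell patterns and not unquoted expansions, nothing here is subject to globbing against the file system.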
Why case?  Because it is usually a bit more readable than regex sequences, and thanks to shell metacharacters it handles 99% of all needs very well.
                This answer deserves more upvotes, due to the globbing issues highlighted, and its comprehensiveness
                    – Brian Agnew
                Mar 7 '16 at 12:04
                @brian Thanks.  Please note that you can use set -f or set -o noglob to switch off globbing, so that shell metacharacters no longer do harm in this context.  But I am not really a friend of that, as it gives up much of the power of the shell, and switching this setting back and forth is very error prone.
                    – Tino
                Mar 14 '16 at 13:55
                Wonderful answer, indeed deserves more upvotes. Side note on case's fallthrough: you can use ;& to achieve that. Not quite sure in which version of bash that appeared. I'm a 4.3 user
                    – Sergiy Kolodyazhnyy
                Jan 11 '17 at 8:19
                @Serg thanks for noting, as I did not know this yet!  So I looked it up, it appeared in Bash4. ;& is the forced fallthrough without pattern check like in C.  And there also is ;;& which just continues to do the further pattern checks.  So ;; is like if ..; then ..; else if .. and ;;& is like if ..; then ..; fi; if .., where ;& is like m=false; if ..; then ..; m=:; fi; if $m || ..; then .. -- one never stops learning (from others) ;)
                    – Tino
                Jan 14 '17 at 8:40
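The ;;& terminator described in this comment could be used to collect every matching property of a string in a single case statement. The sketch below assumes Bash 4 or later, as the comment states; describe is a hypothetical function name.

```shell
#!/usr/bin/env bash
# describe is a hypothetical function; ;;& needs Bash 4 or later.
describe() {
  local found=""
  case $1 in
    *' '*)         found+="space "     ;;&   # keep testing the patterns below
    *[[:digit:]]*) found+="digit "     ;;&
    *[[:upper:]]*) found+="uppercase " ;;
  esac
  found=${found% }                 # drop the trailing blank
  echo "${found:-nothing}"
}

describe 'a 1 B'    # prints: space digit uppercase
describe 'abc'      # prints: nothing
```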
                In BASH echo "X" | can usually be replaced by <<<"X", like this: grep -s " " <<<"This contains SPC".  You can spot the difference if you do something like echo X | read var in contrast to read var <<< X.  Only the latter imports variable var into the current shell, while to access it in the first variant you must group like this: echo X | { read var; handle "$var"; }
                    – Tino
                May 13 '15 at 11:16
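The difference between the pipe and the here-string described in this comment can be observed directly. The sketch assumes Bash's default settings, in particular that the lastpipe option is off; the variable names piped and direct are illustrative.

```shell
#!/usr/bin/env bash
unset piped direct

echo X | read -r piped      # read runs in a subshell; its variable is lost
read -r direct <<<X         # read runs in the current shell

echo "piped=[${piped-}] direct=[$direct]"   # prints: piped=[] direct=[X]
```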
(A) To split a sentence into its words (space separated) you can simply use the default IFS by using
array=( $string )
For example, running the following snippet
#!/bin/bash
sentence="this is the \"sentence\"   'you' want to split"
words=( $sentence )
len="${#words[@]}"
echo "words counted: $len"
printf "%s\n" "${words[@]}" ## print array
will output
words counted: 8
this
is
the
"sentence"
'you'
want
to
split
As you can see, you can use single or double quotes too without any problem.
Notes:

-- this is basically the same as mob's answer, but this way you store the array for later use. If you only need a single loop, you can use his answer, which is one line shorter :)

-- please refer to this question for alternate methods to split a string based on delimiter.
(B) To check for a character in a string you can also use a regular expression match.

For example, to check for the presence of a space character you can use:
regex='\s{1,}'
if [[ "$sentence" =~ $regex ]]
then
        echo "Space here!";
fi
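Since \s is not part of POSIX extended regular expressions, whether the pattern above matches can depend on the platform's regex library. A variant using the POSIX character class might look like the following sketch; has_whitespace_re is a hypothetical name and the sample sentence is made up.

```shell
#!/usr/bin/env bash
# has_whitespace_re is a hypothetical helper, not part of the answer above.
has_whitespace_re() {
  local re='[[:space:]]+'      # POSIX character class instead of \s
  [[ $1 =~ $re ]]              # the function returns the status of the match
}

sentence="this is the sentence"
if has_whitespace_re "$sentence"; then
  echo "Space here!"
fi
```

Keeping the pattern in a variable, as the answer does, avoids quoting surprises on the right-hand side of =~.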
                For regex hint (B) a +1, but  -1 for wrong solution (A) as this is error prone to shell globbing. ;)
                    – Tino
                May 13 '15 at 10:53
site design / logo © 2019 Stack Exchange Inc; user contributions licensed under cc by-sa 3.0
                            with attribution required.