I have searched for a similar question here, but surprisingly could not find any.

In GNU bash, there is (a construct? a structure? a data type?) called "arrays". Arrays are well documented in the bash documentation, so I think that I understand the basics.

But then the term "list" also comes up in the documentation. For example, it is used when talking about filename expansion (emphasis mine):

If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of filenames matching the pattern (see Pattern Matching).

Therefore, I have three questions:

  • What does "list" mean here?
  • Is it used with the same meaning as in the description of the for loop?
  • I am somewhat lost in the world of whitespace in bash. If this "list" is a concept separate from arrays (as I think it is), is it treated specially when it comes to whitespace and IFS, or in the same way as an array?

    There is another use of the "list" term when talking about sequence of one or more pipelines, but I am aware that it most probably means a different kind of lists.

    UPDATE

    Since I see that this "list structure" works very similarly to how arrays work, what are the differences between them?

    UPDATE 2

    What are the use cases in which "lists" are preferred over arrays? For example, let us compare. Let us create two files:

    $ touch file1.txt file2.txt

    When it comes to lists, I can do the following:

    $ A=*.txt ; echo $A
    file1.txt file2.txt
    

    And when it comes to arrays, I can do the following:

    $ B=(*.txt) ; echo ${B[@]}
    file1.txt file2.txt
    

    While these two results are exactly the same, are there any cases when arrays and lists return different results?

    UPDATE 3

    I might have confused something, because in the above example it seems to be a list "wrapped" in an array. I do not know whether it makes a difference.

    In update 2, they are only the same because you have two file names that contain no whitespace. Try creating some files that do have whitespace in their names, then compare printf '%s\n' $A, printf '%s\n' ${B[@]}, printf '%s\n' "$A", and printf '%s\n' "${B[@]}". – chepner Aug 22, 2019 at 16:43

    Thanks, @chepner. I see that another point is which tools are used for display. This fits my understanding of the list from my second-to-last comment on codeforester's answer: Bash is made to treat something as a list, not to define something as a list. By the way, it has been some time, so after your comment I had to recall this whole thread. But it is nice to recognize that I already (almost) understand it. ;) – Silv Aug 25, 2019 at 20:44
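The comparison suggested in the comment above can be sketched as follows. This is a minimal illustration that assumes default shell options and the default IFS; the file names and the count_words helper are made up for the example, and it counts words instead of printing them so the difference is easy to see.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory; the file names are hypothetical.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch "plain.txt" "two words.txt"

A=*.txt      # plain string assignment: A holds the literal pattern *.txt
B=(*.txt)    # array assignment: the glob is expanded now, one element per file

count_words() { echo "$#"; }   # helper: report how many arguments arrived

unquoted_scalar=$(count_words $A)          # pattern is globbed here: 2 file names
quoted_scalar=$(count_words "$A")          # 1 word: the literal pattern
unquoted_array=$(count_words ${B[@]})      # elements are re-split on whitespace: 3 words
quoted_array=$(count_words "${B[@]}")      # one word per element: 2 words

echo "\$A=$unquoted_scalar \"\$A\"=$quoted_scalar \${B[@]}=$unquoted_array \"\${B[@]}\"=$quoted_array"

# Clean up the temporary directory.
cd / && rm -rf -- "$workdir"
```

Only the quoted array expansion reliably gives back exactly the file names that were stored.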

    There is no data type called list in Bash. We just have arrays. In the documentation that you have quoted, the term "list" doesn't refer to a data type (or anything technical) - it just means a sequence of file names.

    However, glob expansions work very similarly to array expansions as far as sequential looping is concerned:

    for file in *.txt; do          # loop through the matching files
                                   # no need to worry about white spaces or glob characters in file names
      echo "file=$file"
    done

    is the same as

    files=(*.txt)                  # put the list of matching files in an array
    for file in "${files[@]}"; do  # loop through the array
      echo "file=$file"
    done

    However, if you hardcode the file names, you need quotes to prevent word splitting and globbing:

    for file in verycramped.txt "quite spacious.txt" "too much space.txt" "*ry nights.txt"; do ...
    
    files=(verycramped.txt "quite spacious.txt" "too much space.txt" "*ry nights.txt")
    for file in "${files[@]}"; do ...
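As a rough illustration of why the quotes matter for hardcoded names, the sketch below counts loop iterations with and without them. It reuses names from the example above but leaves out the one containing a glob character, so the result does not depend on the contents of the current directory.

```shell
#!/usr/bin/env bash
# Quoted: each name is one shell word, so the loop runs once per file name.
quoted=0
for file in verycramped.txt "quite spacious.txt" "too much space.txt"; do
  quoted=$((quoted + 1))
done

# Unquoted: the shell splits at every blank, so the loop sees fragments.
unquoted=0
for file in verycramped.txt quite spacious.txt too much space.txt; do
  unquoted=$((unquoted + 1))
done

echo "quoted=$quoted unquoted=$unquoted"   # quoted=3 unquoted=6
```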
    
  • Word Splitting - Greg's Wiki
  • Word Splitting - Bash Manual
  • Word splitting in Bash with IFS set to a non-whitespace character
  • I just assigned a variable, but echo $variable shows something else
    In other words, it's just used in the English sense to mean a sequence of items, and not in any technical sense. – that other guy Oct 19, 2018 at 23:38

    The C language has specifier-qualifier lists in a declaration, function parameter lists, lists of replacement tokens in a macro: none of these are data structures in a C program. The word "list" is just used informally. – Kaz Oct 20, 2018 at 0:07

    Also, it is not a good practice to modify the post and add more questions after it has been answered. – codeforester Oct 20, 2018 at 4:42

    codeforester, I have read them (maybe too quickly?). Please excuse my lack of understanding. They seem to answer a couple of my other questions, especially about whitespace; they are very helpful, but the questions that I am asking are about the concept of a "list", and such questions are not answered by any of those articles (or at least I do not see it there). And thank you for the clarification about updates. But I did this because I thought that other users would be able to see my doubts; they just clarify my doubts from the first three questions. – Silv Oct 20, 2018 at 14:21

    POSIX doesn't describe variables in terms of implementation details, but rather the operations and their results: the programming model. Variables look much like character strings. How they are stored is not visible to the shell programmer; you have to read the source code of the shell you're using. Bash could be representing them differently from zsh, from pdksh, dash, ... yet portable scripts behave the same way. – Kaz Oct 20, 2018 at 16:59

    The term "list" is not really a specific technical term in bash; it is used in the grammar to refer to a sequence of commands (such as the body of a for loop, or the contents of a script), and this use has shown up in the documentation of program structure, but that's a very specific type of list.

    In the context you ask about, I'd say a "list" is a value that consists of any number (including 0) of shell words. The arguments to a single command are such a list.

    A shell word, in turn, is what you might call a single string in another language. Normally, when you type a command line, it is separated into words at unquoted blanks (spaces and tabs), and the results of unquoted expansions are split further on the characters listed in $IFS (by default space, tab, and newline). You can avoid both by using any of the various quoting mechanisms, and thus create shell words that contain IFS characters.

    If you wish to store a list in a shell parameter, that parameter must be an array; in that case, each word of the list becomes an element of the array. For example, the list of arguments passed into a command is available as the positional parameters, which behave like a default, unnamed array: they are accessed via $ followed by the index that would go between the square brackets in a named array reference, e.g. "$@" for all the elements turned back into a list, "$1" for the first argument, etc. ("$0" holds the command name, but it is not part of "$@".)

    When an array is expanded back into a list of words, you have three options: the elements of the array can be kept as they originally were, irrespective of contents ("$@"); they can be concatenated together, joined by the first character of $IFS (a space by default), into one big single shell word ("$*"); or they can be split again into words using the usual IFS-delimiter rules, with the results also subject to filename expansion ($@ or $* without the quotation marks).
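The three expansion options can be sketched with the positional parameters of a function. The function and helper names here are made up for illustration, and the default IFS is assumed except where it is changed explicitly.

```shell
#!/usr/bin/env bash
count_words() { echo "$#"; }   # helper: report how many arguments arrived

demo() {
  kept=$(count_words "$@")       # each argument preserved as one word
  joined=$(count_words "$*")     # all arguments joined into a single word
  resplit=$(count_words $@)      # unquoted: split again on IFS characters
  # "$*" joins with the first character of IFS; a comma is used here to show it.
  joined_text=$(IFS=,; echo "$*")
}

demo "one" "two words" "three"
echo "kept=$kept joined=$joined resplit=$resplit"   # kept=3 joined=1 resplit=4
echo "$joined_text"                                 # one,two words,three
```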

    Except for a few builtins like mapfile (a.k.a. readarray), bash doesn't have much support for arrays. For example, the environment can only contain strings, so you can't export an array. You can't pass an array into a function as an array, although you can certainly use the value of an array (or a slice of an array) as (some or all of) the list of arguments passed to a function. You can also pass the name of an array to a function, which can then use name-references and eval to manipulate that array in its caller's scope, but as with all mechanisms for reaching out of one's lexical scope in any language, this is generally considered bad practice. And of course, a function can't return an array, but then a bash function can't return anything but a one-byte numeric exit code. It can output text, but that text is unstructured; if the caller captures it with command or process substitution, it's up to that caller to parse the text however it desires – such as making an array containing one element word for each line of output, which is the default behavior of mapfile/readarray.
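Two of the workarounds mentioned above can be sketched as follows: capturing a function's text output into an array with mapfile, and passing an array by name using a nameref. The function names are hypothetical, and the sketch assumes bash 4.3 or later (mapfile needs 4.0, local -n needs 4.3).

```shell
#!/usr/bin/env bash
# Hypothetical function that "returns" several values by printing one per line.
list_items() {
  printf '%s\n' "first item" "second item" "third"
}

# Process substitution keeps mapfile in the current shell;
# -t strips the trailing newline from each element.
mapfile -t items < <(list_items)
echo "got ${#items[@]} elements; second is '${items[1]}'"

# Passing an array by name: arr_ref becomes an alias for the caller's array.
append_suffix() {
  local -n arr_ref=$1
  local i
  for i in "${!arr_ref[@]}"; do
    arr_ref[i]="${arr_ref[i]}.bak"
  done
}

append_suffix items
printf '%s\n' "${items[@]}"
```

The nameref version modifies the caller's array in place, which is the "reaching out of one's scope" caveat the answer describes.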

    Anyway, the point is, lists in this context are values, while arrays are containers that store list values. Technically, shell parameters (a.k.a. "variables") can be arrays, and as arrays they can hold lists; they can't be lists, and it doesn't really make sense to refer to an "array value". But informally, "array" and "list" are often used interchangeably; that's the nature of lazy humans and the shell's fluidity.

    MarkReed, thanks for the answer. My understanding is now greater. So, in my own words (please confirm or correct): *.txt produces a "list", but it is a "list" in the human sense, yes? Bash treats it as a "string" (in an arbitrary sense), which is to be split on the characters that bash is aware of in the particular situation (space, tab, newline, colon, slash etc.)? So, in conclusion, you mean a "value" == a "string", arbitrarily understood? (By the way, I now have new questions about the $@ and $* that you mentioned, but that is outside the scope of the main questions about "list".) – Silv Oct 20, 2018 at 15:13

    Well, in this context, *.txt only "produces" what bash causes it to produce, so it doesn't make sense to talk about it "producing" something that bash then parses. Bash parses the command line you type; if that includes the literal 5-character sequence *.txt outside of any quotation marks, it will replace that sequence with a list of all the filenames matching that pattern. Now here we are talking about a list of words in shell terms, as an actual artifact in bash's memory, not just a human list. – Mark Reed Oct 20, 2018 at 21:13

    OK. I think that the point is that I am still thinking about a "list" in an assignment. And this thinking is wrong. I have read in another Stack Overflow thread (I do not remember where exactly) that such a sequence of characters in an assignment is treated just as a value, like you said, that is, a "string" in my understanding. It is treated as a list only when bash reads the value of this variable and then splits it, etc. Is that true? – Silv Oct 20, 2018 at 21:40

    Yup; A=*.txt is a simple string assignment. In shell terms, it results in $A containing a single word. After that assignment, the shell sees the two command lines ls *.txt and ls $A as EXACTLY THE SAME. In each case you type only two shell words, but the number of words actually passed to ls depends on the contents of the directory. On the other hand, A=(*.txt) will consult the directory when you do the assignment; then A is an array containing the list of matching filenames, and even if you then delete them, ls "${A[@]}" will pass all of them as arguments to ls. – Mark Reed Oct 21, 2018 at 2:31

    Even an array is really just syntactic sugar for managing separate variables, each of which has a string value. There is no array value anywhere. – chepner Aug 22, 2019 at 16:26
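The timing difference described in that last exchange can be sketched like this. It is a minimal illustration assuming default shell options; the temporary directory and file names are made up for the example.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch file1.txt file2.txt

A=*.txt      # stores the pattern itself, not the file names
B=(*.txt)    # stores the file names that exist right now

# Change the directory contents after both assignments.
rm -- file1.txt file2.txt
touch other.txt

now=( $A )   # unquoted $A: the pattern is only globbed at this point
echo "pattern: $A"          # pattern: *.txt
echo "scalar:  ${now[*]}"   # scalar:  other.txt
echo "array:   ${B[*]}"     # array:   file1.txt file2.txt

# Clean up the temporary directory.
cd / && rm -rf -- "$workdir"
```

The array still lists the deleted files, while the scalar reflects whatever matches at the moment it is expanded.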

    A list in bash is a sequence of one or more pipelines separated by control operators. From man bash:

    Lists
       A list is a sequence of one or more pipelines separated by one of the 
       operators ;, &, &&, or ||, and optionally terminated by one of ;, &, or 
       <newline>. 
       Of these list operators, && and || have equal precedence, followed by 
       ; and &, which have equal precedence.
       A sequence of one or more newlines may appear in a list instead of a 
       semicolon to delimit commands.
       If a command is terminated by the control operator &, the shell 
       executes the command in the background in a subshell. The shell does 
       not wait for the command to finish, and the return status is 0. 
       Commands separated by a ; are executed sequentially; the shell waits 
       for each command to terminate in turn. The return status is the exit 
       status of the last command executed.
       AND and OR lists are sequences of one or more pipelines separated by 
       the && and || control operators, respectively. AND and OR lists are 
       executed with left associativity. An AND list has the form
              command1 && command2
       command2 is executed if, and only if, command1 returns an exit status 
       of zero.
       An OR list has the form
              command1 || command2
       command2 is executed if and only if command1 returns a non-zero exit 
       status. The return status of AND and OR lists is the exit status of 
       the last command executed in the list.
    

    A List is used in forming Compound Commands (see man bash).
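A few of the rules quoted from the manual can be checked with a short sketch. The variable names are arbitrary and exist only to record which commands actually ran.

```shell
#!/usr/bin/env bash
# AND list: the second command runs only if the first succeeds.
true && and_result="ran"
# OR list: the second command runs only if the first fails.
false || or_result="ran"

# These second commands are skipped, so the variables stay unset.
false && skipped_and="ran"
true || skipped_or="ran"

# && and || have equal precedence and associate to the left,
# so this is parsed as (false && echo a) || echo b.
mixed=$(false && echo "a" || echo "b")

# The return status of a list is that of the last command executed.
false || true
status=$?

# A list used as the condition of a compound command.
if true && true; then
  compound="both succeeded"
fi

echo "and=$and_result or=$or_result mixed=$mixed status=$status compound=$compound"
```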

    There is another use of the "list" term when talking about sequence of one or more pipelines, but I am aware that it most probably means a different kind of lists.

    Both:

    $ A=*.txt ; echo $A
    
    $ B=(*.txt) ; echo ${B[@]}
    

    technically are Lists in bash.

    DavidCRankin, thanks for the answer. I now see that the term "list" is more ambiguous in bash than one may think. To be clear, I am asking about the result of *.txt, for example in an assignment, rather than about the whole line. But I think it is very good that you confirmed that sense of "lists". +1 – Silv Oct 20, 2018 at 15:17
