    $ cat ex1.sh
    #!/bin/bash
    # read "FILE.txt" line by line
    while read -r line    #←This line is one of the puzzling parts for me: how can "read" read a file instead of the keyboard?
    do
        echo $line
    done <FILE.txt        #←This line is another puzzling part for me: what does "done < FILE" mean???
Although the above example consists of only a few lines, its syntax is perplexing. As someone with limited knowledge, I find it hard to understand, and the author hasn't explained the principles thoroughly (probably copied from somewhere?).
Still, failing to fully understand a piece of writing is not a big deal. I tested the example, and it works perfectly fine, so I wanted to adapt it for my own application.
My application is also quite simple: I just want to display one line at a time and require pressing any key to proceed to the next line. So, I added an extra command to rewrite it as follows:
    $ cat ex2.sh
    #!/bin/bash
    while read -r line
    do
        echo $line
        read -p "Press any key to continue" -n 1    #←This is the line I added
    done <FILE.txt
Strangely enough, my modified program, ex2.sh, doesn't work as expected! Who's the culprit? Is it something peculiar in my understanding? Or is it the line I added myself?
So, I turned into a keyboard detective, determined to catch the culprit and bring them to justice.
After several nights of intense investigation, I finally caught the culprit – it's none other than "File Descriptor," often abbreviated as "fd".
In the original example (ex1.sh), the ability to read a file line by line is achieved through a shell simplification that hides the intricacies of file descriptor operations. However, my modified example (ex2.sh) doesn't work properly because the shell's hidden file descriptor operations have stolen stdin (standard input).
In the literature within my country, mentions of "file descriptor" are rare, and even when they are touched upon, they fail to address the core issues. Therefore, I decided to jot down my insights from these past few days, serving as both my personal reminder and potentially assisting others who encounter the same problem. This is particularly relevant for shell scripts that involve file reading and keyboard input prompts, where the use of file descriptors might eliminate the need to rely on tools like awk or sed to accomplish tasks.
In simple terms, a "file descriptor" is a number that Unix-like operating systems assign when a process opens a file. The kernel uses this number as an index to track the process's input/output on that open file.
For instance, while browsing this article, your browser may have opened around 20 files (HTML files and various image files), each with a unique index (such as 100, 101, 102, and so on). These index numbers are the file descriptors (fd). However, maintaining these file descriptors is the responsibility of the kernel, and regular users need not concern themselves with this level of detail.
fd Number | Name | Function |
0 | stdin | Standard Input |
1 | stdout | Standard Output |
2 | stderr | Standard Error |
Syntax | Function | Example | Example note |
COMMAND 1> | Redirect stdout | echo '123' >fileA | fd 1 output to a file |
COMMAND 1>> | Append stdout | seq 100 200 >> fileA | fd 1 append output to a file |
COMMAND 2> | Redirect stderr | find / -name '*.conf' 2>/dev/null | fd 2 output to a file |
COMMAND 2>> | Append stderr | seq 1 10 2>>fileA | fd 2 append output to a file |
COMMAND 0< | Redirect stdin | cat < fileA | fd 0 replaced by a file |
For output redirection, the syntax is COMMAND [fd]>, where fd defaults to 1 if omitted. For input redirection, the syntax is COMMAND [fd]<, where fd defaults to 0 if omitted.
Redirection can also change the original output to go to stderr or vice versa.
The syntax is X>&Y (where X is the original fd, and Y is the redirected fd; if X is omitted, it defaults to 1). For example, redirecting stderr (2) to stdout (1) is written as "2>&1".
Syntax | Function | Example |
2>&1 | Redirect stderr (2) to stdout (1) | ls -R /home > fileA 2>&1 |
1>&2 | Redirect stdout (1) to stderr (2) | find / -name '*readme.txt' 1>&2 2>/dev/null |
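Because the shell applies redirections from left to right, the position of "2>&1" on the command line changes the result. The following is a minimal sketch of that ordering rule, not a definitive reference; the helper name `both` and the log file names are made up for this illustration.

```shell
#!/bin/bash
# Sketch: redirections are applied left to right, so their order matters.
# "both" and the log file names are hypothetical, chosen for this demo.
tmpdir=$(mktemp -d)

# Helper that writes one line to stdout (fd 1) and one to stderr (fd 2).
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Case 1: fd 1 is pointed at the file first, then fd 2 becomes a copy of fd 1.
# Both lines land in a.log.
both > "$tmpdir/a.log" 2>&1

# Case 2: fd 2 becomes a copy of fd 1 while fd 1 is still the capture pipe of
# $(...), and only then is fd 1 pointed at the file. stderr escapes the file.
leaked=$(both 2>&1 > "$tmpdir/b.log")

echo "a.log has $(grep -c . "$tmpdir/a.log") lines"
echo "b.log has $(grep -c . "$tmpdir/b.log") line; captured instead: $leaked"
```

In case 2 the stderr line never reaches b.log, because fd 2 was copied from fd 1 before fd 1 was moved to the file.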
    $ seq 1 1000000
    1
    2
    3
    4
    Ctrl+Z                             ←Press <Ctrl+Z> to pause
    [1]+  Stopped    seq 1 1000000     ←The program is stopped
    $ jobs -p                          ←List the PIDs of paused commands
    2373                               ←The PID of the command "seq 1 1000000" is 2373
    $ ls -lgG /proc/2373/fd/           ←List /proc/<PID>/fd to observe fd usage
    total 0
    lrwx------ 1 64 2015-04-26 22:28 0 -> /dev/tty1
    lrwx------ 1 64 2015-04-26 22:28 1 -> /dev/tty1
    lrwx------ 1 64 2015-04-26 22:28 2 -> /dev/tty1
In the above example, the directory "/proc/<PID>/fd/" contains 3 files, corresponding to file descriptors 0, 1, and 2, respectively. These are linked to "/dev/tty1" (in graphical interface tests, it could be "/dev/pts/N").
This means that in the example, stdin (fd 0), stdout (fd 1), and stderr (fd 2) are all connected to the tty (terminal) or /dev/pts/N (virtual terminal).
Let's modify the experiment with the command seq 1 1000000 > fileA 2>&1 and observe the results:
    lrwx------ 1 64 2015-04-26 15:04 0 -> /dev/tty1
    l-wx------ 1 64 2015-04-26 15:04 1 -> /home/basalt/fileA
    l-wx------ 1 64 2015-04-26 15:04 2 -> /home/basalt/fileA
In this example, stdin (fd 0) remains as the tty, but stdout (fd 1) and stderr (fd 2) are both redirected to "fileA."
Therefore, when a command becomes confusing due to piping and redirection, you can gain clarity by observing the information provided by the file descriptors in the directory "/proc/<PID>/fd/."
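The same observation can be made without pausing a job. The sketch below assumes a Linux system with /proc mounted; it relies on /proc/self referring to whichever process opens it, which here is the ls command itself. The scratch file name comes from mktemp and is only for illustration.

```shell
#!/bin/bash
# Sketch, Linux only: /proc/self refers to the process that opens it, so
# "ls -l /proc/self/fd" lists the fds of the ls process itself.
listing=$(mktemp)   # hypothetical scratch file that receives the output

# stdout and stderr of ls are both redirected to the scratch file, so the
# entries for fd 1 and fd 2 in the listing point back at that same file.
ls -l /proc/self/fd > "$listing" 2>&1

cat "$listing"
```

The listing also shows one extra fd that ls opened to read the directory /proc/self/fd itself.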
Now, if you modify it further to seq 1 100 > fileB >&2, after the computation, the contents of the file "fileB" are empty. Why is that? Take a look at "/proc/<PID>/fd/" to understand!
Excluding fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), and system-reserved fd 10 to 255, general users are advised to only use fd 3 to 9 for redirection purposes.
(fd 255 is usually reserved for shell scripts, and process substitution may use fd 63 or fd 62, so it's best to avoid using the system's own fd 10 to 255 to prevent conflicts.)
To use fd 3 to 9, use the exec command. exec has two main uses. Given a command, it replaces the current shell process with that command instead of running it as a child process. Given only redirections, it applies those redirections to the current shell itself, and this second use is the one that matters here.
There are two types of redirection: redirecting one fd to another fd and redirecting an fd to a file. Let's explain each of them:
A principle to keep in mind is that the input source for input redirection goes on the right side of "<". For example, to make fd 7 read from whatever fd 0 (stdin) currently points to, you write exec 7<&0.
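As a rough illustration of the two kinds of exec redirection, the sketch below opens a file on fd 7 and then makes fd 8 a copy of fd 7. The data file and the variable names are assumptions made for this example only.

```shell
#!/bin/bash
# Sketch of the two kinds of exec redirection, using a hypothetical data file.
data=$(mktemp)
printf 'one\ntwo\n' > "$data"

exec 7< "$data"   # fd -> file: open the file for reading on fd 7
exec 8<&7         # fd -> fd: fd 8 becomes a copy of fd 7 (source on the right)

# The two fds share one read position, so the second read continues
# where the first one stopped.
read -r first  <&7
read -r second <&8
echo "first=$first second=$second"

exec 7<&- 8<&-    # close both fds when done
```

Because fd 8 is a copy rather than a second open of the file, reading through either fd advances the same position in the file.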
Is it a bit abstract? Let's test it step by step and observe the results. Here's an example of an operation:
    $ exec 8>/tmp/fd_test       ←Create fd 8 and redirect it to the file "/tmp/fd_test"
    $ echo $$                   ←View the current shell's PID
    2633                        ←Current shell's PID
    $ ls -lgG /proc/2633/fd     ←Observe fd usage
    total 0
    lr-x------ 1 64 2015-05-04 10:17 0 -> /dev/tty1
    l-wx------ 1 64 2015-05-04 10:17 1 -> /dev/tty1
    l-wx------ 1 64 2015-05-04 10:17 2 -> /dev/tty1
    lrwx------ 1 64 2015-05-04 10:58 255 -> /dev/tty1
    lr-x------ 1 64 2015-05-04 10:17 8 -> /tmp/fd_test    ←fd 8 created and redirected to the file
From the above example, the exec 8>/tmp/fd_test command opens fd 8, and that fd 8 is then redirected to the file "/tmp/fd_test". Now, let's redirect fd 1 (stdout) to fd 8, which means writing stdout to the file "/tmp/fd_test".
Okay, let's continue the experiment.
    $ echo 'hello world !' >&8    ←Write a string to fd 8
    $ cat /tmp/fd_test            ←Verify the content
    hello world !
As a recap, redirecting stdout (1) to stderr (2) is written as "1>&2" or ">&2". Similarly, redirecting fd 1 to fd 8 in the above example is written as "1>&8" or ">&8".
Remember to close fd 8 if it's no longer needed.
The shell script "ex3.sh" below demonstrates opening multiple fds and closing them after use:
Example:
    $ cat ex3.sh
    #!/bin/bash
    # the following creates fd 3~5 and redirects them to file1~file3
    exec 3>/tmp/file1
    exec 4>/tmp/file2
    exec 5>/tmp/file3
    # the following writes strings to fd 1, which is redirected to fd 3~5
    echo '1234' >&3
    echo 'abcd' >&4
    echo 'I II III IV' >&5
    # the following closes fd 3~5
    exec 3>&-
    exec 4>&-
    exec 5>&-
The pipeline functionality allows the stdout of one command to become the stdin of the next command, meaning the stderr part cannot pass through the pipeline to the next command. However, if we want to process stderr after the pipeline, we can use "X>&Y" for redirection. In the following example, stderr is piped to tr to convert it to uppercase, but stdout is not piped.
    $ exec 6>&1                                              ←Redirect fd 6 to fd 1
    $ ls -l /root /etc/fstab 2>&1 1>&6 | tr a-z A-Z          ←stderr is piped to tr to convert to uppercase
    -rw-r--r-- 1 root root 608 2014-09-26 15:47 /etc/fstab   ←stdout remains unchanged as it is redirected to fd 6
    LS: CANNOT OPEN DIRECTORY /ROOT: PERMISSION DENIED       ←stderr is converted to uppercase by tr
    $ exec 6>&-                                              ←Close fd 6
If a file is redirected to fd 0 (most commonly used), a more accurate way to describe it would be to say "file replacing keyboard" (emphasizing "replacing").
For example, exec < FILE redirects the file to fd 0 (stdin), which means that the input source is no longer stdin (keyboard), but instead replaced by the file content (keyboard input becomes ineffective). This allows the original commands that were entered from the keyboard (stdin) to be read from the file, one line at a time.
In an interactive shell, the most essential interaction involves the keyboard and screen. Without a keyboard, how can interaction occur? Hence, entering exec < FILE (if such a file exists) at an interactive prompt makes the shell take its input from FILE and exit once it reaches the end of the file. However, in non-interactive shells (such as shell scripts), the file simply replaces stdin (keyboard).
An example of a script file, "ex4.sh", reads the first three lines of the file "/etc/fstab".
    $ cat ex4.sh
    #!/bin/bash
    exec < /etc/fstab    # fd 0 (stdin) = file "/etc/fstab"
    # the following reads lines 1~3 of "/etc/fstab"
    read line1
    read line2
    read line3
    # the following prints lines 1~3 of "/etc/fstab"
    echo $line1
    echo $line2
    echo $line3
    $ cat ex5.sh
    #!/bin/bash
    # read file "/etc/fstab" line by line
    exec 0< /etc/fstab    # fd 0 (stdin) = file
    while read line       # now the read command reads from the file instead of stdin
    do
        echo $line
    done
    $ exec 3>/tmp/fd_test    ←Redirect fd 3 to the file "/tmp/fd_test"
    $ echo "line1" >&3       ←Write the string to fd 3
    $ cat /tmp/fd_test       ←Verify the content
    line1
    $ exec 9<&3              ←Open fd 9 and redirect fd 3 to fd 9 (now fd 9 is equal to fd 3)
    $ echo "line2" >&9       ←Write the string to fd 9
    $ cat /tmp/fd_test       ←Verify the content
    line1
    line2
    $ exec 9>&-              ←Close fd 9
    $ exec 3<&-              ←Close fd 3
    $ echo 1234567890 > File    ←Create a file named "File"
    $ exec 3<> File             ←Open the file "File" and redirect it bidirectionally to fd 3
    $ read -n 4 <&3             ←Read the first 4 characters from fd 3
    $ echo -n "." >&3           ←Write a period "." to fd 3
    $ exec 3>&-                 ←Close fd 3
    $ cat File                  ←Verify the content
    1234.67890
If an fd is closed on a single command rather than through exec, the close is temporary: it applies only to that one command.
In the following example, stderr is temporarily closed. (Note: You cannot close stdin and stdout in an interactive shell.)
Example: (Tested as a non-root user)
    $ find / -name 'readme.*' 2>&-    ←Find all files named "readme.*" in the filesystem & close stderr
    /usr/share/icons/Bluecurve/48x48/mimetypes/readme.png
    /usr/share/doc/cyrus-sasl-lib-2.1.22/readme.html
    /usr/share/doc/words-3.0/readme.txt
    /usr/share/icons/Bluecurve/48x48/mimetypes/readme.png
For permanently closing an fd, it is necessary to use exec, for example, exec 9>&-. Remember to close unused fds to allow other programs to use them and to avoid potential conflicts where multiple programs compete for the same fd number, leading to hard-to-detect bugs.
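The difference between a temporary and a permanent close can be sketched as follows. This is an illustrative example only; the log file is a scratch file created with mktemp, and the variable name `temp_status` is made up for the demo.

```shell
#!/bin/bash
# Sketch: temporary close (one command) versus permanent close (exec).
# The log file is a hypothetical scratch file.
log=$(mktemp)
exec 9> "$log"              # open fd 9 for the whole shell

# fd 9 is closed only for this group, so the write inside it fails.
# The shell's "Bad file descriptor" message is discarded via 2>/dev/null.
{ echo "first" >&9; } 9>&- 2>/dev/null
temp_status=$?

echo "second" >&9           # fd 9 is still open in the shell itself
exec 9>&-                   # now fd 9 is closed for good

cat "$log"
```

Only "second" reaches the log: the first write ran while fd 9 was temporarily closed, and the shell restored fd 9 as soon as that command finished.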
Also, note that only stdout passes through a pipeline, so fds that a command does not need should be temporarily closed for it. For example, the earlier pipeline example that used exec 6>&1 is safer when rewritten as follows:
Example:
    $ exec 6>&1
    $ ls -l /root /etc/fstab 2>&1 1>&6 6>&- | tr a-z A-Z 6>&-    ←Temporarily close fd 6 for "tr"
    $ exec 6>&-                                                  ←Permanently close fd 6 for the shell
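For readers who want to try this pattern without access to /root, here is a self-contained sketch. The function `noisy` is a hypothetical stand-in for the ls command, and the whole pipeline runs inside a command substitution only so that the combined result can be inspected afterwards.

```shell
#!/bin/bash
# Sketch: send only stderr through the pipe, keeping stdout untouched.
# "noisy" is a hypothetical stand-in for "ls -l /root /etc/fstab".
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# Everything runs inside $(...) so the final result can be inspected.
result=$(
    exec 6>&1                                    # fd 6 = copy of stdout
    noisy 2>&1 1>&6 6>&- | tr 'a-z' 'A-Z' 6>&-   # stderr -> pipe, stdout -> fd 6
    exec 6>&-                                    # close fd 6 again
)
echo "$result"
```

The stdout line arrives unchanged, while the stderr line comes back in uppercase because it alone went through tr.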
    lr-x------ 1 64 2015-04-26 14:45 0 -> /home/basalt/FILE.txt    ←stdin changed to a file
    lrwx------ 1 64 2015-04-26 14:45 1 -> /dev/tty1
    lrwx------ 1 64 2015-04-26 14:45 10 -> /dev/tty1               ←additional fd 10 opened
    lrwx------ 1 64 2015-04-26 14:45 2 -> /dev/tty1
    lr-x------ 1 64 2015-04-26 14:45 255 -> /home/basalt/ex1.sh
    exec 10<&0          #←Backup fd 0 to fd 10
    exec < FILE.txt     #←stdin=file
    while read -r line
    do
        echo $line
    done                #←Original command was done<FILE.txt
    exec 0<&10          #←Restore fd 0 from fd 10
    $ cat ex6.sh
    #!/bin/bash
    # read "FILE.txt" line by line
    exec 7<FILE.txt            #←fd 7=FILE.txt
    while read -u 7 line       #←read reads from fd 7 instead of stdin
    do
        echo $line
        read -p "Press any key to continue" -n 1    #←stdin is still the keyboard
    done
    exec 7<&-                  #←Close fd 7 once the loop is finished
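A variation on ex6.sh, offered as a sketch rather than as the canonical form, attaches the file to fd 7 on the loop itself (done 7< FILE), so the fd is open only while the loop runs and no separate close is needed. The helper name `page_file` and the sample data are assumptions for this illustration, and the "key presses" are supplied from a here-string so the sketch can run unattended.

```shell
#!/bin/bash
# Sketch: open the file on fd 7 only for the duration of the loop.
# "page_file" is a hypothetical helper name, not part of the article.
page_file() {
    local line key
    while read -r -u 7 line; do        # lines come from fd 7
        echo "$line"
        # The key press comes from stdin (fd 0), which the loop leaves alone.
        read -r -s -n 1 -p "Press any key to continue" key
    done 7< "$1"                       # fd 7 is closed again after "done"
}

# Demo with a scratch file; the "key presses" x and y are supplied from a
# here-string so the sketch can run unattended.
list=$(mktemp)
printf 'alpha\nbeta\n' > "$list"
pages=$(page_file "$list" <<< "xy")
echo "$pages"
```

When run from a terminal without the here-string, the function waits for a real key press after each line, just like ex6.sh.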
On the internet, I stumbled upon a classic case where many people encountered the issue of stdin being taken away without understanding the reason. A user wanted to write a shell script to list files in the working directory and then prompt whether to delete them, but it didn't work, so the user sought help online.
The problematic shell script looks like this:
    $ cat ex7.sh
    #!/bin/bash
    while read file_name
    do
        rm -iv $file_name
    done < <(ls)
$ cat ex8.sh |