
Implementation of Shell Redirection and Pipes

<a href="http://www.sarathlakshman.com/2012/09/24/implementation-overview-of-redirection-and-pipe-operators-in-shell/">Original article</a>

I have always been fascinated by the design of Unix. I am still curious about it, and I enjoy the philosophy of 'write programs that do one thing and do it well'. The aim of this blog post is to walk through some interesting aspects of how file descriptors are implemented, and to illustrate how gracefully that design helps build useful Unix shell features. Every process has three default file descriptors: stdin with descriptor number 0, stdout with descriptor number 1, and stderr with descriptor number 2.

Let us go through some basic system calls. The calls below are written in a simplified form, not with their exact syntax. Please refer to the man pages for the correct function prototypes.

fork()

The fork system call creates a copy of the current process and marks the new process as a child of the process that called fork. It returns zero in the child process and the child's pid in the parent process. The child gets a copy of everything, including the file descriptors and the virtual memory. Memory pages are not copied up front: when either process writes to a page, the kernel does a copy-on-write and creates a private copy of that page for the writing process.

exec(binary_path)

The exec system call overwrites the current process with the executable image from a file. For example, exec("/bin/ls") replaces the code image in memory with the binary from /bin/ls and executes it. The file descriptor table remains the same as that of the original process (apart from descriptors explicitly marked close-on-exec).

open(file, mode)

Opens a file and returns a file descriptor associated with it.

Important: by design, when the kernel allocates a file descriptor, it always picks the lowest-numbered descriptor that is not currently in use.

close(fd)

Closes an open file descriptor, which makes its number available for reuse.

dup(fd)

The dup system call creates a new file descriptor that is a duplicate of the fd passed as its argument. Like open, it returns the lowest descriptor number that is not in use.

pipe(int arr[2])

Creates a pipe and stores the read descriptor at array index zero and the write descriptor at array index one.

read(fd, buff, len)

Reads up to len bytes from file descriptor fd into buff.

write(fd, buff, len)

Writes up to len bytes from buff to file descriptor fd.

Let us go through some interesting shell features that we use frequently and look at how they are implemented.

$ cmd1 > stdout.txt

The above command redirects the stdout of cmd1 to the file stdout.txt.

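To implement this, the shell has to link the stdout of cmd1 to a file descriptor for stdout.txt opened in write mode. In terms of the calls above: close descriptor 1, open stdout.txt so that it takes over the freed number 1, and then exec cmd1, which keeps writing to descriptor 1 without knowing that it is now a file.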

$ cmd1 2> stdout.txt

The above command redirects the stderr of cmd1 to the file stdout.txt.

$ cmd2 > stdout_stderr.txt 2>&1

The above command redirects both stdout and stderr of cmd2 to the file stdout_stderr.txt.

$ cmd3 < input.txt

The above command feeds the contents of input.txt to the stdin of cmd3.

$ cmd1 | cmd2

This command says that cmd2 receives its stdin from the stdout of cmd1.

Isn't that awesome? With a simple design, and without any code change to the individual programs, it is possible to connect the input and output streams of separate programs. Hats off to the designers of Unix.