Due Date: See website for due date (Late days may be used.)
This project must be done in groups of 2 students. Use Piazza and the grouper app to find a partner (URL).
A shell receives line-by-line input from a terminal. If the user inputs a built-in command, the shell will execute this command. Otherwise, the shell will interpret the input as the name of a program to be executed, along with arguments to be passed to it. In this case, the shell will fork a new child process and execute the program in the context of the child. Normally, the shell will wait for a command to complete before reading the next command from the user. Such programs are said to run as “foreground” jobs. If the user appends an ampersand ‘&’ to a command, the command is started in the “background” and the shell will return to the prompt immediately.
The shell provides job control. A user may interrupt foreground jobs, send foreground jobs into the background, and vice versa. At a given point in time, a shell may run zero or more background jobs and zero or one foreground jobs. If there is a foreground job, the shell waits for it to complete before printing another prompt and reading the next command. In addition, the shell informs the user about status changes of the jobs it manages. For instance, jobs may exit, or terminate due to a signal, or be stopped for several reasons.
At a minimum, we expect that your shell has the ability to start foreground and background jobs and implements the built-in commands ‘jobs,’ ‘fg,’ ‘bg,’ ‘kill,’ and ‘stop.’ The semantics of these commands should match the semantics of the same-named commands in bash or tcsh. The ability to correctly respond to ^C (SIGINT) and ^Z (SIGTSTP) is expected, as are informative messages about the status of the children managed. Like bash or tcsh, you should use consecutively numbered small integers to enumerate your jobs.
For the minimum functionality, the shell need not support pipes (|), I/O redirection (< > >>), nor the ability to run programs that require exclusive access to the terminal (e.g., vim).
We expect most students to implement pipes, I/O redirection, and management of the controlling terminal, so that jobs that require exclusive access to the terminal obtain such access. Beyond that, esh’s extensibility, described in Section 7, should allow for plenty of creative freedom.
You will need to use fork(), a variant of exec*(), and the waitpid() system calls.
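The interplay of these three calls can be sketched as follows. This is a minimal illustration, not the required design: the helper names run_foreground and describe_status are hypothetical, and a real shell would also need job bookkeeping and signal handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of esh): run argv[0] as a foreground child.
 * Returns the raw wait status, or -1 if fork() or waitpid() failed. */
int run_foreground(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the process image; execvp() returns only on error. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent: WUNTRACED makes waitpid() also report stopped children. */
    int status;
    if (waitpid(pid, &status, WUNTRACED) < 0)
        return -1;
    return status;
}

/* Hypothetical helper: turn a wait status into a short description. */
const char *describe_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status) == 0 ? "exited normally"
                                        : "exited with error";
    if (WIFSIGNALED(status))
        return "terminated by signal";
    if (WIFSTOPPED(status))
        return "stopped";
    return "unknown";
}
```

A caller would build a NULL-terminated argument vector such as { "ls", "-l", NULL }, pass it to run_foreground(), and print describe_status() of the result.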
You will need to catch SIGCHLD to learn about when the shell’s child processes change status. Since child processes execute concurrently with respect to the parent shell, it is impossible to predict when a child will exit (or terminate with a signal), and thus it is impossible to predict when this signal will arrive. In the worst case, a child may have terminated by the time the parent returns from fork()!
You will need to block the signal in those sections of your code where you access data structures that are also needed by the handler that is executed when this signal arrives. For example, consider the data structure used to maintain the current set of jobs. A new job is added after a child process has been forked; a job may be removed when SIGCHLD is received. To avoid a situation where the job has not yet been added when SIGCHLD arrives, or – worse – a situation in which SIGCHLD arrives while the shell is adding the job, the parent should block SIGCHLD until after it has finished adding the job to the list. If the SIGCHLD signal is delivered to the shell while the shell blocks this signal, it is marked pending and will be received as soon as the shell unblocks this signal.
Use sigprocmask(2) to block and unblock signals. To set up signal handlers, use the sigaction(2) system call. Set sa_flags to SA_RESTART. The mask of blocked signals is inherited when fork() is called. Consequently, the child will need to unblock any signals the parent blocked before calling fork().
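The blocking discipline described above might look like the following sketch. The fixed-size job table and the names add_job, remove_job, job_count, install_sigchld_handler, and spawn_job are assumptions made for illustration; your shell's job list will look different.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 16                 /* hypothetical fixed-size job table */
static pid_t job_table[MAX_JOBS];   /* 0 marks a free slot */

static void add_job(pid_t pid)
{
    for (size_t i = 0; i < MAX_JOBS; i++)
        if (job_table[i] == 0) { job_table[i] = pid; return; }
}

static void remove_job(pid_t pid)
{
    for (size_t i = 0; i < MAX_JOBS; i++)
        if (job_table[i] == pid) job_table[i] = 0;
}

static void block_sigchld(sigset_t *old)
{
    sigset_t chld;
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, old);
}

/* Handler: reap every child that changed status; drop finished ones. */
static void sigchld_handler(int sig)
{
    (void)sig;
    int saved_errno = errno, status;
    pid_t pid;
    while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0)
        if (WIFEXITED(status) || WIFSIGNALED(status))
            remove_job(pid);
    errno = saved_errno;
}

int install_sigchld_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Count jobs with SIGCHLD blocked, since the handler mutates the table. */
int job_count(void)
{
    sigset_t old;
    int n = 0;
    block_sigchld(&old);
    for (size_t i = 0; i < MAX_JOBS; i++)
        if (job_table[i] != 0) n++;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return n;
}

pid_t spawn_job(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    sigset_t old;
    block_sigchld(&old);                       /* block before fork() */
    pid_t pid = fork();
    if (pid == 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);  /* child inherits the mask */
        execvp(argv[0], argv);
        _exit(127);
    }
    if (pid > 0)
        add_job(pid);                          /* handler cannot run here */
    sigprocmask(SIG_SETMASK, &old, NULL);      /* pending SIGCHLD delivered */
    return pid;
}
```

Even if the child exits before fork() returns in the parent, the handler runs only after the job has been added, so remove_job() always finds it.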
Each process in Unix is part of a group. Process groups are treated as an ensemble for the purpose of signal delivery and when waiting for processes. Specifically, the kill(2), killpg(2), and waitpid(2) system calls support the naming of process groups as possible targets. [1]
[1] Note the idiosyncrasies of the API: kill(-pid, sig) does the same as killpg(pid, sig). Make sure to use the correct call.
Each process group has a designated leader, which is one of the processes in the group. To create a new group with itself as the leader, a process simply calls setpgid(0, 0). The id of a process group is the process id of the leader. Child processes inherit the process group of their parent process initially. They can then form their own group if desired, or their parent process can place them into a different process group via setpgid().
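As a rough sketch of how process groups are formed and then addressed as a unit, consider the hypothetical helpers below (spawn_in_group, signal_group, and reap_group are illustrative names, not part of the provided code). Calling setpgid() in both the child and the parent is a common way to make sure the group exists no matter which of the two runs first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child running argv and place it in process
 * group pgid. Passing pgid == 0 makes the child the leader of a new group. */
pid_t spawn_in_group(pid_t pgid, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        setpgid(0, pgid);          /* child sets its own group ... */
        execvp(argv[0], argv);
        _exit(127);
    }

    /* ... and the parent makes the same request. One of the two calls is
     * redundant; the parent's call may fail harmlessly if the child has
     * already called exec. */
    setpgid(pid, pgid);
    return pid;
}

/* Send sig to every process in the group; same effect as killpg(pgid, sig). */
int signal_group(pid_t pgid, int sig)
{
    return kill(-pgid, sig);
}

/* Wait for all children in the group; returns how many were reaped. */
int reap_group(pid_t pgid)
{
    int reaped = 0, status;
    while (waitpid(-pgid, &status, 0) > 0)
        reaped++;
    return reaped;
}
```

The first process of a job would be started with spawn_in_group(0, argv); later processes of the same job would pass the first child's pid as pgid.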
In addition to signals and waitpid, process groups are used to manage access to the terminal, as described next.
Managing Access To The Terminal
Running multiple processes on the same terminal creates a sharing issue: if multiple processes attempt to read from the terminal, which process should receive the input? Similarly, some programs – such as vi – output to the terminal in a way that does not allow them to share the terminal with others. [2]
To solve this problem, Unix introduced the concept of a foreground process group. Each terminal maintains such a group. If a process in a process group that is not the foreground process group attempts to perform an operation that would require exclusive access to a terminal, it is sent a signal: SIGTTOU or SIGTTIN, depending on whether the attempted use was for an output (write) or input (read) operation. The default action taken in response to these signals is to suspend the process. If that happens, the process’s parent (i.e., your shell) can learn about this status change by calling waitpid(). WIFSTOPPED(status) will be true in this case. To allow this process to continue, its process group must be made the foreground process group of the controlling terminal via tcsetpgrp(), and then the process must be sent a SIGCONT signal. The state of the terminal must be saved when the process is suspended and restored when it is continued. (The shell will typically take this action in response to a ’fg’ command issued by the user.)
Signals that are sent as a result of user input, such as SIGINT or SIGTSTP, are also sent to a terminal’s foreground process group by the operating system.
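The hand-off of the terminal between the shell and a job could be sketched as below. The helper names reclaim_terminal and continue_in_foreground are hypothetical. Blocking SIGTTOU around the terminal calls is an addition not spelled out above: without it, a shell that is not currently in the foreground group would itself be stopped by tcsetpgrp() or tcsetattr(). Restoring the shell's own terminal state after reclaiming the terminal is omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Block SIGTTOU so that terminal calls made while the shell is not the
 * foreground process group do not stop the shell itself. */
static void block_sigttou(sigset_t *old)
{
    sigset_t ttou;
    sigemptyset(&ttou);
    sigaddset(&ttou, SIGTTOU);
    sigprocmask(SIG_BLOCK, &ttou, old);
}

/* Hypothetical helper: called after a foreground job stopped. Saves the
 * terminal state the job left behind and makes the shell's own process
 * group the foreground group again. Returns 0 on success, -1 on error. */
int reclaim_terminal(int tty_fd, struct termios *job_state)
{
    assert(job_state != NULL);

    sigset_t old;
    block_sigttou(&old);
    int rc = tcgetattr(tty_fd, job_state);
    if (rc == 0)
        rc = tcsetpgrp(tty_fd, getpgrp());
    sigprocmask(SIG_SETMASK, &old, NULL);
    return rc;
}

/* Hypothetical helper: what 'fg' might do. Gives the terminal to the job's
 * process group, restores the job's saved terminal state (if any), and
 * resumes every process in the job. Returns 0 on success, -1 on error. */
int continue_in_foreground(int tty_fd, pid_t pgid,
                           const struct termios *job_state)
{
    sigset_t old;
    block_sigttou(&old);
    int rc = tcsetpgrp(tty_fd, pgid);
    if (rc == 0 && job_state != NULL)
        rc = tcsetattr(tty_fd, TCSADRAIN, job_state);
    sigprocmask(SIG_SETMASK, &old, NULL);

    if (rc == 0)
        rc = kill(-pgid, SIGCONT);
    return rc;
}
```

Both helpers fail with -1 when tty_fd does not refer to a terminal, for example when the shell's input is redirected from a file, so callers should check the return value.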
Pipes and I/O Redirection
To implement pipes, use the pipe(2) system call. A pipe must be set up by the parent shell process before a child is forked. A forked child inherits the file descriptors that refer to the two ends of the pipe. The child must then use the dup2(2) system call to redirect its stdout to the pipe’s write end or its stdin to the pipe’s read end, as needed.
Note that all processes that are part of a pipeline are children of the shell, e.g., if a user runs a | b then the process executing b is not a child process of the process executing the program a.
[2] Note that regular output via write(2) does not require exclusive access, unless the terminal’s ’tostop’ flag is set. The terminal will simply interleave such output.
Generally, a pipeline of commands is considered one job. All processes that form part of a pipeline should thus be part of the same process group.
Although the parent shell process creates pipes for each pair of communicating children before they are forked, it will not itself write to the pipes or read from the pipes it creates. Therefore, you must make sure that the parent shell process closes the file descriptors referring to the pipe’s ends after the children have been forked. This is necessary for two reasons: first, to avoid leaking file descriptors; second, to ensure the proper behavior of programs such as /bin/cat if the user asks the shell to execute them. To see why, we must first discuss what happens to file descriptors on fork(), close(), and exit().
Each file descriptor represents a reference to an underlying kernel object. Upon fork(), both the child and the parent process have access to any object the parent process may have created (i.e., open files or other kernel objects). Closing a file descriptor in the (parent) shell process affects only the current process’s access to the underlying object. Hence when the parent shell closes the file descriptors referring to the pipe it created, the child processes will still be able to access the pipe’s ends, allowing them to communicate with the other commands in the pipeline.
The actual object (such as a pipe or file) is closed only when the last process that has an open file descriptor referring to the object closes that file descriptor. If you fail to close the pipe’s file descriptors in the parent process (your shell), you compromise the correct functioning of programs that rely on taking action when their standard input stream reaches end of file. For instance, the /bin/cat program will exit if its standard input stream reaches EOF, which in the case of a pipe happens if and only if all descriptors referring to the pipe’s write end are closed. So if cat’s standard input stream is connected to a pipe for which the shell still has an open descriptor to the write end, cat will never “see” EOF on its standard input stream and will appear stuck.
Lastly, note that when a process exits for whatever reason, including a signal, all file descriptors it had open are closed by the kernel as if the process has called close() before exiting.
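The descriptor handling discussed in this section can be sketched for a two-command pipeline as follows. The helper names spawn_stage and run_pipeline are hypothetical, and process-group placement, error reporting, and longer pipelines are left out to keep the sketch short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one pipeline stage. in_fd/out_fd are the descriptors that should
 * become the child's stdin/stdout; pipe_fds are closed in the child once
 * the redirection is in place. */
static pid_t spawn_stage(char *const argv[], int in_fd, int out_fd,
                         const int pipe_fds[2])
{
    pid_t pid = fork();
    if (pid == 0) {
        if (in_fd != STDIN_FILENO)
            dup2(in_fd, STDIN_FILENO);
        if (out_fd != STDOUT_FILENO)
            dup2(out_fd, STDOUT_FILENO);
        close(pipe_fds[0]);     /* the dup2 copies remain open */
        close(pipe_fds[1]);
        execvp(argv[0], argv);
        _exit(127);
    }
    return pid;
}

/* Hypothetical helper: run `left | right` and return the wait status of
 * the right-hand command, or -1 on failure. */
int run_pipeline(char *const left[], char *const right[])
{
    assert(left != NULL && right != NULL);

    int fds[2];                 /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    /* Both stages are children of the shell, not of each other. */
    pid_t lpid = spawn_stage(left, STDIN_FILENO, fds[1], fds);
    pid_t rpid = spawn_stage(right, fds[0], STDOUT_FILENO, fds);

    /* The shell never uses the pipe itself. If it kept the write end open,
     * the right-hand command would never see EOF. */
    close(fds[0]);
    close(fds[1]);

    int left_status, right_status = -1;
    if (lpid > 0)
        waitpid(lpid, &left_status, 0);
    if (rpid > 0 && waitpid(rpid, &right_status, 0) < 0)
        right_status = -1;
    return right_status;
}
```

Removing the two close() calls in run_pipeline() is an easy way to reproduce the "stuck cat" behavior: the right-hand command then blocks forever waiting for end of file.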
Additional information can be found in the GNU C library manual, available at http://www.gnu.org/s/libc/manual/html_node/index.html. Read, in particular, the sections on Signal Handling and Job Control.
We will provide a test driver to test your project, and tests for the basic and advanced functionality. The tests may be found on rlogin in
/web/courses/cs3214/spring2020/projects/eshtests/. The basic and advanced tests are also in the Gitlab repository that you forked to start the project. If updates to the tests come out, you will have to pull from the remote repository to update your local copy.
Static and Dynamic Analysis Tools
While we encourage you to utilize the normal debugging practices (such as using gdb, strace, and printf), we have developed analysis tools designed to flag common errors that students encounter when programming this project.
These analysis tools—EshMD and ShellTrace—use static and dynamic analysis to reason about your code.
Static analysis involves looking at your source code without running it to find paths that could potentially lead to a buggy execution. A static analysis performs symbolic execution to reason about the possible states your program can reach.
Dynamic analysis runs your code on various test cases and looks at the behavior of your program to detect bugs. This differs from static analysis because it only looks at how your code runs for that specific input/test.
Together, the analysis tools will look at your submitted program and point to locations in your code where your shell is not operating properly. This can be a very useful tool to determine the reasons your shell is crashing or why a test is failing.
You can submit your project for analysis using these tools by running (in your src folder):
or by using the course website (URL).
You can submit your code for analysis as many times as you want, at any point in the development cycle. This might be useful to catch bugs as you implement more features.
There will be a small survey at the end to share your thoughts and experiences using the analysis tools to help debug this project.
It is often impossible to anticipate the future uses and needs of a system or application. Extensible architectures address this problem by allowing the loading of plug-ins that provide additional functionality or enhance built-in functionality.
When started with the ’-p dir’ flag, ’esh’ will dynamically load shared libraries contained in the directory ’dir.’ Multiple -p flags may be provided. Each shared library must define a strong global symbol named esh_module, which shall refer to an instance of struct esh_plugin. This struct contains information about the plug-in, including a set of function pointers to invoke the plug-in’s functionality.
Multiple plug-ins may be loaded; a plug-in may specify its rank relative to others. Your shell should invoke the plug-ins’ functions in increasing rank order. If plug-ins share the same rank, their execution order is not defined. Some functionality (e.g., built-ins) requires that invocation stop if a plug-in provides this functionality.
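The rank ordering and the stop-on-first-handler rule might be sketched as below. Note that struct demo_plugin, its fields, and all function names here are invented for this illustration; they are not the real struct esh_plugin, whose definition you should take from the provided header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded plug-in (NOT the real esh_plugin). */
struct demo_plugin {
    int rank;                                   /* lower rank runs first */
    bool (*process_builtin)(const char *cmd);   /* NULL if not provided */
};

/* qsort comparator: ascending rank, written to avoid integer overflow. */
static int compare_rank(const void *a, const void *b)
{
    const struct demo_plugin *pa = a, *pb = b;
    return (pa->rank > pb->rank) - (pa->rank < pb->rank);
}

/* Order the loaded plug-ins once, after all -p directories were scanned. */
void sort_plugins(struct demo_plugin *plugins, size_t n)
{
    qsort(plugins, n, sizeof plugins[0], compare_rank);
}

/* Offer cmd to each plug-in in rank order and stop at the first one that
 * claims it. Returns the index of that plug-in, or -1 if none handled it. */
int dispatch_builtin(const struct demo_plugin *plugins, size_t n,
                     const char *cmd)
{
    assert(cmd != NULL);
    for (size_t i = 0; i < n; i++)
        if (plugins[i].process_builtin != NULL &&
            plugins[i].process_builtin(cmd))
            return (int)i;
    return -1;
}

/* Two toy handlers used only to exercise the dispatcher. */
bool handle_cd(const char *cmd) { return strcmp(cmd, "cd") == 0; }
bool handle_nothing(const char *cmd) { (void)cmd; return false; }
```

Because qsort() is not stable, plug-ins with equal rank may run in any order, which matches the rule that their relative order is not defined.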
You will need to make some modifications to your shell to be able to host plugins. We recommend that you first be able to host plugins on your shell before attempting to write plugins.
Here are some ideas for plug-ins:
- Change current directory (cd)
- Glob expansion (e.g., *.c)
- Setting and unsetting environment variables
- Timing commands: “time” or time-outs.
- Shell variables
- pushd, popd, etc.
- Command-line history (perhaps using GNU’s History library)
- Backquote substitution
- Smart command-line completion
- Embedding applications: scripting languages, web servers, etc.
A side-note on Unix philosophy – in general, Unix implements functionality using many small programs and utilities. As such, built-in commands are often only those that must be implemented within the shell, such as cd. In addition, essential commands such as ’kill’ are often built-in to make sure an operator can execute those commands even if no new processes can be forked. Your plug-ins should generally stay with this philosophy and implement only functionality that is not already available using Unix commands or that would be better implemented using separate programs. If in doubt, ask.
You will note that the functions to read from the terminal and to parse the command line are invoked indirectly as function pointers that are part of esh_shell. Advanced plug-ins may replace those if desired.
You will receive credit for every plug-in you write, and for every plug-in written by others which your shell can successfully load and run. You should publish plug-ins you have developed on the forum.
It is ok to sit together and debug a situation that arises if a plug-in written by one group does not run successfully in another group’s shell.
However, you may not share any code – electronically or otherwise – for the shell or a plug-in – across groups. To allow others access to your plug-ins, we provide a shared place to which to copy them. Create a directory with your SLO id in
/web/courses/cs3214/spring2020/projects/student-plugins
For each plug-in you wish to share, create a subdirectory within that directory, e.g. gback/cd, gback/glob, etc. In that subdirectory, copy the .so file, but do not include the corresponding .c file. In addition, provide a description of the plugin as a .txt file and a Python test for the plugin, as described below.
In addition, note that the code contained in the plug-ins you load will run with the full privileges of the user executing the shell. In practice, this setup requires that you trust the provider of the plug-in. The “Acceptable Use of Information Systems” policy, published at http://www.vt.edu/about/acceptable-use.html, applies. If you are in doubt whether a plug-in you’ve written would violate this policy, please ask first.