innercoder.com

A blog for Linux programming enthusiasts

Linux Shell Programming Basics


A shell is an interactive program that, at its core, forks new processes and waits for their termination. In other words, it presents a prompt, the user enters a command, and the shell parses the input and executes it. Good error and signal handling is needed, and the return value of every system call must be checked. The man pages come in really handy for this, since a very good reference is just a terminal screen away. Writing a shell is considered a good way to learn the basics of system calls and signal handling. This post serves as a quick analysis of the code in the GitHub repository.

Note: the complete code of an older version is posted at the end. Check my GitHub repo for an updated version, using the link above.

This shell features basic error and signal handling, use of system calls (fork, waitpid, and execve through the execvp wrapper), background processes, and directory changes. Rather than implementing many features from the beginning, it focuses on a smaller set of features that are well implemented.

Below is the basic structure for each new process. The same structure can be reused later on when a job list is implemented. The job list is still on the TODO list and can serve as the second part of this project.

The last two members store the input buffer and an array of pointers to the individual arguments, respectively. Since the same buffer variable is used to capture every input line, each process structure holds its own copy of the input.

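struct proc_st {
        int bg;                 /* 1 if the command runs in the background */
        int exec;               /* 1 if the command must be forked and executed */
        int commc;              /* number of arguments stored in commv */
        int fd_out;             /* output descriptor (not used in the code shown here) */
        int fd_in;              /* input descriptor (not used in the code shown here) */
        pid_t pid;              /* process ID (not used in the code shown here) */
        char pbuff[MAX_LEN];    /* copy of the input line */
        char *commv[];          /* NULL-terminated argument vector (flexible array member) */
};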
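
Because commv is a flexible array member, the structure has to be allocated with extra room for the argument pointers. A minimal sketch of such an allocation follows. The helper name proc_new and the values chosen for MAX_LEN and MAX_ARGS are assumptions made for illustration; they are not taken from the repository.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>

#define MAX_LEN  1024   /* assumed value */
#define MAX_ARGS 64     /* assumed value */

struct proc_st {
        int bg;
        int exec;
        int commc;
        int fd_out;
        int fd_in;
        pid_t pid;
        char pbuff[MAX_LEN];
        char *commv[];
};

/* proc_new is a hypothetical helper: it allocates a zeroed proc_st with
 * room for MAX_ARGS argument pointers. Returns NULL on failure. */
struct proc_st *proc_new(void)
{
        struct proc_st *proc;

        /* calloc zeroes the struct and the trailing pointer array */
        proc = calloc(1, sizeof(*proc) + MAX_ARGS * sizeof(char *));
        if (proc != NULL)
                proc->exec = 1; /* assume executable until parsing says otherwise */
        return proc;
}
```

The key point is that the size of the pointer array is counted in pointers (MAX_ARGS * sizeof(char *)), not in bytes.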

Here are the function prototypes of the program so far. They are used for error feedback, signal handling, and command identification, respectively.

/* function prototypes */
void error_msg(char *);
void sig_handler(int);
void comm_identi(struct proc_st *, char *);

The following lines initialize the signal handlers. The first argument of signal() identifies the signal and the second sets the action to perform when it is received. SIG_IGN ignores the signal. For SIGCHLD, a handler function is given. Why the special case for SIGCHLD? Since the shell allows background processes and the default action for SIGCHLD is to ignore it, the shell needs a handler to learn about finished background processes and reap them, so that no zombie processes are left behind.

/* initialize signal handlers */
signal(SIGINT, SIG_IGN);
signal(SIGQUIT, SIG_IGN);
signal(SIGCHLD, sig_handler);
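
The handler itself is not shown in this post. One common way to write it is sketched below, under some assumptions: it installs the handler with sigaction() instead of signal(), it reaps children with waitpid() and WNOHANG, and the names install_sigchld_handler and reaped_count are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* hypothetical counter, only here so the effect can be observed */
static volatile sig_atomic_t reaped_count = 0;

/* Reap every child that has already terminated, without blocking. */
static void sig_handler(int signo)
{
        int saved_errno = errno; /* waitpid() may overwrite errno */

        (void)signo;
        while (waitpid(-1, NULL, WNOHANG) > 0)
                reaped_count++;
        errno = saved_errno;
}

/* hypothetical helper: returns 0 on success, -1 on error */
int install_sigchld_handler(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sig_handler;
        sigemptyset(&sa.sa_mask);
        /* restart interrupted calls; ignore stopped (not exited) children */
        sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
        return sigaction(SIGCHLD, &sa, NULL);
}
```

Keep in mind that a handler waiting on -1 can also reap a foreground child before the main loop's waitpid() sees it, in which case that call fails with ECHILD.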

Here is the core of the program. It currently does a lot of work on its own; splitting it up is a near-future enhancement for readability and functionality.

  • Line 4: obtains input from the user with fgets().
  • Line 15: sets an internal flag that tells the program to send the command arguments for execution.
  • Line 16: sends the input string for parsing and identification.
  • Lines 19-26: if the command is executable, fork; the child runs it with execvp() and exits if that fails.
  • Lines 27-32: the parent waits for a foreground job, or prints the PID of a background job.
  • Lines 38-41: leave the loop at end of input instead of spinning on a failed fgets().

printf("[SHELL]> ");
for (;;) {

  while (fgets(buff, MAX_LEN, stdin) != NULL) {

      /* room for the struct plus MAX_ARGS argument pointers */
      proc = malloc(sizeof(struct proc_st) + MAX_ARGS * sizeof(char *));
      if (proc == NULL) {
          error_msg("malloc error");
          exit(EXIT_FAILURE);
      }

      /* initialize to zero */
      memset(proc, 0, sizeof(struct proc_st));
      proc->exec = 1;
      comm_identi(proc, buff);

      /* skip empty lines and built-in commands */
      if (proc->exec == 1 && proc->commv[0] != NULL) {
          if ((pid = fork()) < 0) {
              error_msg("fork error");
          } else if (pid == 0) { /* child */
              execvp(proc->commv[0], proc->commv);
              /* execvp() returns only on failure */
              error_msg("could not execute");
              _exit(127);
          } else if (proc->bg == 0) { /* parent, foreground job */
              if (waitpid(pid, &status, 0) < 0)
                  error_msg("waitpid error");
          } else { /* parent, background job */
              printf("PID %d %s\n", pid, proc->pbuff);
          }
      }
      free(proc);
      printf("[SHELL]> ");
  }

  /* fgets() returned NULL: leave on end of input, retry otherwise */
  if (feof(stdin))
      break;
  clearerr(stdin);
}
exit(0);
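
The fork, exec, and wait steps can also be isolated in a single function. The following is a minimal sketch, not the repository's code: the name run_command, its return convention, and the use of exit status 127 for a failed exec are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv (NULL-terminated).
 * Foreground (bg == 0): returns the child's exit status, or -1 on error.
 * Background (bg != 0): returns 0 right after the fork. */
int run_command(char *const argv[], int bg)
{
        pid_t pid;
        int status;

        if (argv == NULL || argv[0] == NULL)
                return -1;

        pid = fork();
        if (pid < 0) {
                perror("fork");
                return -1;
        }
        if (pid == 0) {
                execvp(argv[0], argv);
                /* only reached when execvp() fails */
                perror(argv[0]);
                _exit(127);
        }
        if (bg) {
                printf("PID %d\n", (int)pid);
                return 0;
        }
        /* retry if a signal interrupts the wait */
        while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                        perror("waitpid");
                        return -1;
                }
        }
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling _exit() in the child after a failed execvp() matters: without it the child would fall through and continue running the shell's own loop as a second copy.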

The other important function performs parsing of the input string and identification of built-in commands. Parsing is done with the strtok() function, although there are many other ways to implement it. The strcmp() function is used for a simple identification of built-in commands.

  • Lines 5-15: strips the newline, saves a copy of the line, and parses it with the strtok() function. Line 12 also limits the number of arguments to the size of the pointer array.
  • Line 17: terminates the command pointer array with NULL, as execvp() requires.
  • Lines 19-37: this section checks for built-in commands and the background marker.

void comm_identi(struct proc_st *proc, char *buffer)
{
        char *tmp;

        /* replace newline with null */
        buffer[strcspn(buffer, "\n")] = '\0';
        strncpy(proc->pbuff, buffer, MAX_LEN - 1);
        proc->pbuff[MAX_LEN - 1] = '\0';

        /* split on blanks, leaving room for the terminating NULL */
        tmp = strtok(buffer, " \t");
        while (tmp != NULL && proc->commc < MAX_ARGS - 1) {
                proc->commv[(proc->commc)++] = tmp;
                tmp = strtok(NULL, " \t");
        }

        proc->commv[proc->commc] = NULL;

        /* check for built-in functions */
        if (proc->commc >= 1) {
                if (!strcmp(proc->commv[0], "exit")) {
                        free(proc);
                        exit(EXIT_SUCCESS);
                }

                if (!strcmp(proc->commv[0], "cd")) {
                        proc->exec = 0;
                        if (proc->commv[1] == NULL || chdir(proc->commv[1]) != 0)
                                error_msg("chdir() error");
                }

                /* check if it is a background process */
                if (!strcmp(proc->commv[(proc->commc) - 1], "&")) {
                        proc->bg = 1;
                        proc->commv[--(proc->commc)] = NULL;
                }
        }

        return;
}
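
As mentioned, strtok() is only one way to do the parsing. A minimal sketch of the same split using the reentrant strtok_r() is shown below. The function name split_args and its signature are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: split line in place on blanks and newlines.
 * Stores at most max - 1 pointers in argv, then a terminating NULL.
 * Returns the number of arguments stored. */
int split_args(char *line, char **argv, int max)
{
        char *save = NULL;
        char *tok;
        int argc = 0;

        if (max < 1)
                return 0;

        for (tok = strtok_r(line, " \t\n", &save);
             tok != NULL && argc < max - 1;
             tok = strtok_r(NULL, " \t\n", &save))
                argv[argc++] = tok;

        argv[argc] = NULL;
        return argc;
}
```

Like strtok(), this modifies the input buffer by writing null bytes over the delimiters, so the returned pointers stay valid only as long as the buffer is left untouched.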

If interested, download the code and modify it to your liking. Fix any bugs or improve on it. That's the only way to learn. If you are unsure about any libc function, use the man pages. Though they might be a little puzzling for beginners at first, they are straight to the point and very informative. I also recommend Stevens's Advanced Programming in the UNIX Environment, Kerrisk's The Linux Programming Interface, and Bryant and O'Hallaron's Computer Systems: A Programmer's Perspective.

Update for Basic Shell v2:

  • better handling of signals. Removed unsafe function calls, but sigaction() is still needed for even better handling.
  • implemented a jump table for built-in commands.
  • removed unneeded allocations and frees.
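
A jump table for built-in commands can be sketched as an array that pairs each command name with a function pointer. The version below is an illustration only. The names builtin_fn, builtins, find_builtin, do_cd, and do_exit are assumptions and may differ from what the repository uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* each built-in receives the NULL-terminated argument vector */
typedef int (*builtin_fn)(char **argv);

static int do_cd(char **argv)
{
        if (argv[1] == NULL) {
                fprintf(stderr, "cd: missing argument\n");
                return -1;
        }
        if (chdir(argv[1]) != 0) {
                perror("cd");
                return -1;
        }
        return 0;
}

static int do_exit(char **argv)
{
        (void)argv;
        exit(EXIT_SUCCESS);
}

/* the jump table: command name -> handler */
static const struct {
        const char *name;
        builtin_fn fn;
} builtins[] = {
        { "cd",   do_cd   },
        { "exit", do_exit },
};

/* Return the handler for name, or NULL if it is not a built-in. */
builtin_fn find_builtin(const char *name)
{
        size_t i;

        for (i = 0; i < sizeof(builtins) / sizeof(builtins[0]); i++)
                if (strcmp(name, builtins[i].name) == 0)
                        return builtins[i].fn;
        return NULL;
}
```

With this layout, adding a new built-in means writing one function and adding one row to the table, instead of extending a chain of strcmp() checks.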

Let me know your thoughts in the comments.
