Through the Looking Glass

Exploring Unix processes in illumos

Posted on September 17, 2016

I started reading Solaris Internals (SI) this week, and in the first chapter the process model is discussed. I’ve read APUE, and I’ve written plenty of C code; I knew what a process was, but I didn’t know what it was. How is it actually manifested in code? What does it look like? It had been a long while since I’d last written a technical blog post, or really done a technical deep dive into anything of substance, so it was time to find out and maybe write a short post about it.

Let’s dive in.

I’m using an OmniOS virtual machine running on OS X:

OmniOS 5.11     omnios-r151018-95eaa7e  June 2016
phoenix% git log -1
commit b6bc2fd4673eae6c96e2aea9e16105dd32a66b7b
Author: Dan McDonald <danmcd@omniti.com>
Date:   Wed Sep 14 13:10:21 2016 -0400

    7364 NVMe driver performance can be improved by caching nvme_dma_t
    structs for PRPL. (fix lint)

I installed the basic development tools using IPS, but I needed Joyent’s pkgsrc to get mg(1), my lightweight C editor and pager of choice.

Some background

APUE p10 defines a process as

An executing instance of a program…a term used on almost every page of this text.

I can verify the claim in the latter half of that sentence; I quickly realised that the most I could say about a process concretely was that it is “the thing that runs,” and began digging through APUE. This turned out to be a more difficult task than I thought. A more hand-wavey description would include things like the program’s address space (including the text and data segments), and Unix context information like UID, CWD, the controlling TTY, the parent process ID, the environment, and so forth.

As I read through the APUE chapters on processes (chapters 7-9), I found a useful concise list on page 233, albeit couched in terms of what child processes inherit from their parent.

At this point in my research, I realised that I only sort-of knew what a process was, and that I didn’t really understand them.

The Solaris process model1 is described on SI p19:

The following objects form the nucleus of the Solaris kernel threads model:

SI p18 notes

The kernel thread is the core unit of execution managed by the Solaris kernel.

and SI p19 notes

A process is an abstraction that contains the execution environment for a user program. It consists of a virtual memory environment (an address space), program resources such as an open file list, and other components of the process environment that are shared by all the threads within each process.

The hunt is on

One of the initial challenges was figuring out where to find things; I’d never looked through the illumos code base before. The first thing I do when confronted with a codebase I don’t know is to open up a pair of terminals (in tmux(1)), one running cscope(1)2 and the other where I begin looking through the source.

In this case, I figured LWPs would be easier to track down as a searchable identifier. This turned out to be right: a grep(1) for “lwp” returned a lot of results, but one of those was lwpid_t. Plugging this into cscope(1)’s “Find this global definition:” yielded two definitions:

Global definition: lwpid_t

  File  Line
1 lwp.h 68 typedef uint_t lwpid_t;
2 lwp.h 68 typedef uint_t lwpid_t;

They’re both just a typedef3:

typedef uint_t lwpid_t;

Searching for users of this identifier brings up head/proc_service.h:

/*
 *  Description:
 *      Types, global variables, and function definitions for provider
 * of import functions for users of libc_db and librtld_db.
 */

Instead of reading the contents, I started by looking at the included headers. One that looked interesting was sys/procfs_isa.h. This can be found in uts/{intel,sparc}/sys/procfs_isa.h.

What’s in uts?

KERNEL MAKEFILE STRUCTURE

So, this looks to be the actual kernel. There are architecture-specific directories (like the intel and sparc directories above), various other directories, and common.

I made an assumption here: the process definition is going to be the same for all architectures. In practical terms, it means that I started my search in common/:

phoenix% find common -name \*proc\*
common/sys/procfs.h
common/sys/processor.h
common/sys/contract/process.h
common/sys/contract/process_impl.h
common/sys/procset.h
common/sys/proc.h
common/io/kb8042/at_keyprocess.c
common/syscall/processor_bind.c
common/syscall/sigprocmask.c
common/syscall/processor_info.c
common/fs/proc
common/fs/smbsrv/smb_process_exit.c
common/avs/ns/solaris/nsc_proc.c
common/os/procset.c
common/os/proc.c
common/os/rctl_proc.c
common/contract/process.c

The first two things that stood out were sys/proc.h and os/proc.c.

struct proc

Fortunately, this assumption was correct: sys/proc.h contains a definition for struct proc:

/*
 * One structure allocated per active process.  It contains all
 * data needed about the process while the process may be swapped
 * out.  Other per-process data (user.h) is also inside the proc structure.
 * Lightweight-process data (lwp.h) and the kernel stack may be swapped out.
 */
typedef struct  proc {
    /* omitting the fields for brevity */
} proc_t;

This is entirely too long (and not entirely relevant) to duplicate here, but you can see it on GitHub.

Down the rabbit hole

Looking at the structure doesn’t really tell us much about how it’s used, though. For that, we’ll need to venture outside the header file. Looking through os/proc.c, we see the following functions defined:

/*
 * Install process context ops for the current process.
 */
void
installpctx(
        proc_t *p,
        void    *arg,
        void    (*save)(void *),
        void    (*restore)(void *),
        void    (*fork)(void *, void *),
        void    (*exit)(void *),
        void    (*free)(void *, int))

/*
 * Remove a process context ops from the current process.
 */
int
removepctx(
        proc_t *p,
        void    *arg,
        void    (*save)(void *),
        void    (*restore)(void *),
        void    (*fork)(void *, void *),
        void    (*exit)(void *),
        void    (*free)(void *, int))

void
savepctx(proc_t *p)

void
restorepctx(proc_t *p)

void
forkpctx(proc_t *p, proc_t *cp)

/*
 * exitpctx is called during thread/lwp exit to perform any actions
 * needed when an LWP in the process leaves the processor for the last
 * time. This routine is not intended to deal with freeing memory; freepctx()
 * is used for that purpose during proc_exit(). This routine is provided to
 * allow for clean-up that can't wait until thread_free().
 */
void
exitpctx(proc_t *p)

/*
 * freepctx is called from proc_exit() to get rid of the actual context ops.
 */
void
freepctx(proc_t *p, int isexec)

Reading through these functions, I noticed a few have the following:

ASSERT(p == curthread->t_procp);

So, we have two puzzle pieces here: proc_t and curthread.

I wondered if curthread was the current running thread in the kernel (recall that kernel threads are distinct from user threads, and that every process has at least one user thread, the main thread). If so, this might shed some light on the subject. Plugging curthread into cscope(1)’s “Find this global definition” yields five results:

Global definition: curthread

  File                                  Line
1 lib/libc/inc/thr_uberdata.h           1239 #define curthread (_curthread())
2 lib/libfakekernel/common/sys/thread.h   81 #define curthread (_curthread())
3 lib/libzpool/common/sys/zfs_context.h  176 #define curthread ((void *)(uintptr_t)thr_self())
4 uts/common/sys/thread.h                531 #define curthread (threadp())
5 /usr/include/sys/thread.h              531 #define curthread (threadp())

The most interesting here seems to be #4, common/sys/thread.h, which means we should track down threadp. cscope(1) shows this as being defined in intel/asm/thread.h4:

/*
 * 0x10 is offsetof(struct cpu, cpu_thread)
 * 0x18 is the same thing for the _LP64 version.
 * (It's also the value of CPU_THREAD in assym.h)
 * Yuck.
 */

extern __GNU_INLINE struct _kthread
*threadp(void)
{
        struct _kthread *__value;

#if defined(__amd64)
        __asm__ __volatile__(
            "movq %%gs:0x18,%0"         /* CPU_THREAD */
            : "=r" (__value));
#elif defined(__i386)
        __asm__ __volatile__(
            "movl %%gs:0x10,%0"         /* CPU_THREAD */
            : "=r" (__value));
#else
#error  "port me"
#endif
        return (__value);
}

My Intel assembly is fairly rusty and it’s been a while since I’ve used inline assembly, so this took some extra digging around. %gs is a segment register; Intel CPUs have several of these.

%0 isn’t a fixed register: in GCC’s extended inline assembly it names the first operand, and the "=r" constraint tells the compiler to place that output in a general-purpose register of its choosing (often RAX in practice).

I think this function is loading the 64-bit value at offset 0x18 from the GS segment base (which, per the comment, is the cpu_thread field of struct cpu) into that register, and returning it as a pointer to a struct _kthread. This definition also shows us two other interesting definitions: cpu_thread and _kthread. I suspect that struct _kthread is typedef’d to kthread_t. However, while interesting, this function didn’t really develop my understanding of processes. It is useful background information to keep in mind.

After looking at this, my understanding of a process is this: a structure that contains a lot of contextual information and a pointer to a thread.

At this point, though, it still feels like going down a rabbit hole. We still haven’t seen where processes are created. Searching for occurrences of proc_t yields 1,336 lines: too many to reliably scan through. Narrowing cscope down to focus only on uts/ (or even uts/{common,intel}) cuts down on some of this — but not by much.

main.c

I had an intuition that uts/common/os/, the directory where proc.c lives, might be a good place to look. Somehow, the uts/common/os/main.c source file jumped out at me, and this appeared to be paydirt:

/* well known processes */
proc_t *proc_sched;             /* memory scheduler */
proc_t *proc_init;              /* init */
proc_t *proc_pageout;           /* pageout daemon */
proc_t *proc_fsflush;           /* fsflush daemon */

/* omitted some code for brevity */

kmem_cache_t *process_cache;    /* kmem cache for proc structures */

In this file, there’s a main5 function. The first definition in this function is

proc_t		*p = ttoproc(curthread);	/* &p0 */

ttoproc is defined in uts/common/sys/thread.h6 as

#define ttoproc(x)      ((x)->t_procp)

This header file has the commentary

/*
 * The thread object, its states, and the methods by which it
 * is accessed.
 */

I also confirmed my earlier intuition that struct _kthread is typedef’d to kthread_t, and found that it has a t_procp field, which means that the pointer fetched via %gs is a kthread_t.

So p is set up at the very start of main, which means there’s already a valid kthread_t running by then. The immediate question is: is the proc_t *p expected to be initialised at this point? The first access to p is

p->p_mstart = gethrtime();

Clearly, memory has been set aside for this already by that point. Reading through main, I found the startup invocation interesting. That’s not defined in this file, though; it’s defined in uts/i86pc/os/startup.c (which also has a neat 64-bit vmem layout diagram). The startup function has the commentary

 * In a 32-bit OS, boot loads the kernel text at 0xfe800000 and kernel data
 * at 0xfec00000.  On a 64-bit OS, kernel text and data are loaded at
 * 0xffffffff.fe800000 and 0xffffffff.fec00000 respectively.  Those
 * addresses are fixed in the binary at link time.

Maybe the process is set aside in the compiled binary?

The following set of function calls look interesting:

startup_memlist();
startup_kmem();
startup_vm();

A glance through this file and those functions didn’t show anything related to process-memory setup, so it was back to main.c. At this point, I realised I’d missed something:

ASSERT(curthread == CPU->cpu_thread);

curthread is already running, and remember that kthread_t has a process pointer, so it’s likely this is set up even earlier.

This was a fun afternoon exploration7; though I haven’t answered my question, I did get to look through the codebase. I also noticed that SI chapter 2 is on the process model, so there’s a pretty good chance that if I just sit down and read the book I’ll find the answer.

Epilogue

Some updates on this:

@AmazingDim pointed me to this section of the illumos Developer’s Guide. This explains things like “uts” meaning “UNIX Timesharing System.”


  1. The following quotes (actually, most of section 1.4) were the main spark that started me down this.

  2. While I’m using cscope and mg in a terminal, I’ll try to remember to link to GitHub where possible.

  3. One of the things that I didn’t do this time was show full paths in cscope; part of this is that I quickly set up the dev environment instead of copying over my normal dev environment. The latter includes a lot of helper scripts and shell aliases. My favourite is rscope, which is an alias for `cscope -p 99 $(find . -name \*.[ch])`. This might have given me the notion to look in the uts/ directory earlier than I ended up doing so.

  4. There’s a corresponding definition in the sparc/asm directory as well; because this is an Intel machine, I’ve elected to focus on the Intel parts.

  5. In retrospect, searching for the main function might have been a better route, but the journey was a great deal of fun.

  6. Note that this is under common/sys, not intel/asm, unlike the previous example.

  7. If this is just on page 20 of SI, and the book has around a thousand pages, this is going to be quite a long read.