
The Single UNIX Specification, Version 2 Threads Extensions

The Single UNIX Specification, Version 2 includes the threads model and interfaces defined in IEEE Std 1003.1c-1995 together with a number of extensions. These extensions, known as the X/Open Threads Extension, are based on widely accepted existing industry practice. They were developed by the Aspen Group and submitted to The Open Group's Base Working Group (the group that develops operating system interface specifications within The Open Group). This article is a brief introduction to these extensions. It assumes a working knowledge of the threads model specified in POSIX.1c and threads programming concepts in general.

Introduction

The X/Open Threads Extension is built upon the threads model and interfaces defined in IEEE Std 1003.1c-1995, commonly known as POSIX.1c or Pthreads. POSIX.1c contains much optional functionality. When POSIX.1c was incorporated into the Single UNIX Specification, Version 2, the majority of this optional functionality was made mandatory, and additional functionality, known as the Aspen threads extensions, was incorporated.

The Aspen Group

Over the past few years, almost all UNIX system vendors have implemented some flavor of a threads package based on the POSIX.1c interfaces. Each vendor found that the POSIX.1c interfaces did not completely address its threading requirements. Consequently, each vendor implemented extensions to its threads package to meet those requirements.

Unfortunately for application developers, not all vendors implemented the same exact set of extensions. To make matters worse, vendors often added the same functionality but used different interface names or parameter sets. In short, this resulted in proprietary threads interfaces that are not portable across implementations, yet certain applications, such as database engines, were making heavy use of these proprietary interfaces.

Fortunately, many of the threads extensions developed were general enough that they are easily supported on any UNIX system threads implementation. In late 1995, the Aspen Group formed a subgroup to standardize the interfaces and functionality of the common thread extensions that various UNIX system vendors had implemented. The threads extensions that came out of this work by the Aspen Group comprise extensions that were made for OSF DCE 1.0 as well as others by Sun, HP, and Digital. The Aspen Group handed the completed work over to X/Open in 1996 as a submission for consideration for inclusion in the next revision of the Single UNIX Specification.

The following extensions to POSIX.1c were formulated by the Aspen Group: extended mutex attributes (the new type attribute), read-write locks and read-write lock attributes objects, a thread concurrency level, a thread stack guard size, and parallel I/O operations. A total of 19 new functions were specified.

The Aspen Group carefully followed the threads programming model specified in POSIX.1c when developing these extensions. As with POSIX.1c, all the new functions return zero on success; otherwise, an error number is returned to indicate the error.

The concept of attribute objects was introduced in POSIX.1c to allow implementations to extend the standard without changing the existing interfaces. Attribute objects were defined for threads, mutexes, and condition variables. Attribute objects are defined as implementation-dependent opaque types to aid extensibility, and functions are defined to allow attributes to be set or retrieved. The Aspen Group followed this model when adding the new type attribute to pthread_mutexattr_t and the new read-write lock attributes object, pthread_rwlockattr_t.

Extended Mutex Attributes

POSIX.1c defines a mutex attributes object as an implementation-dependent opaque object of type pthread_mutexattr_t, and specifies a number of attributes for this object together with functions which manipulate these attributes. These attributes include process-shared, protocol, and prioceiling.

The Single UNIX Specification, Version 2 specifies another mutex attribute called type. The type attribute allows applications to specify the behavior of mutex locking operations in situations where the POSIX.1c behavior is undefined. The OSF DCE threads implementation, based on Draft 4 of POSIX.1c, specified a similar attribute. Note that the names of the attributes have changed somewhat from the OSF DCE threads implementation.

The Single UNIX Specification, Version 2 also extends the specifications of the following POSIX.1c functions which manipulate mutexes:

pthread_mutex_lock() pthread_mutex_trylock() pthread_mutex_unlock()

These extensions take account of the new type mutex attribute and specify behavior which was declared as undefined in POSIX.1c. How a calling thread acquires or releases a mutex now depends upon the mutex type attribute.

The type attribute can have the following values:

 Value  Definition
 PTHREAD_MUTEX_NORMAL   Basic mutex with no specific error checking built in. Does not report a deadlock error.
 PTHREAD_MUTEX_RECURSIVE   Allows any thread to recursively lock a mutex. The mutex must be unlocked an equal number of times to release the mutex.
 PTHREAD_MUTEX_ERRORCHECK   Detects and reports simple usage errors; that is, an attempt to unlock a mutex that is not locked by the calling thread or that is not locked at all, or an attempt to relock a mutex the thread already owns.
 PTHREAD_MUTEX_DEFAULT   The default mutex type. May be mapped to any of the above mutex types or may be an implementation-dependent type.

Normal mutexes do not detect deadlock conditions; for example, a thread will hang if it tries to relock a normal mutex that it already owns. Attempting to unlock a mutex locked by another thread, or unlocking an unlocked mutex, results in undefined behavior. Normal mutexes will usually be the fastest type of mutex available on a platform but provide the least error checking.

Recursive mutexes are useful for converting old code where it is difficult to establish clear boundaries of synchronization. A thread can relock a recursive mutex without first unlocking it. The relocking deadlock that can occur with normal mutexes cannot occur with this type of mutex. However, multiple locks of a recursive mutex require the same number of unlocks before another thread can acquire the mutex. Furthermore, this type of mutex maintains the concept of an owner: a thread attempting to unlock a recursive mutex that another thread has locked returns with an error, as does a thread attempting to unlock a recursive mutex that is not locked at all. Never use a recursive mutex with condition variables, because the implicit unlock performed by pthread_cond_wait() or pthread_cond_timedwait() will not actually release the mutex if it has been locked multiple times.

Errorcheck mutexes provide error checking and are useful primarily as a debugging aid. A thread attempting to relock an errorcheck mutex without first unlocking it returns with an error. Again, this type of mutex maintains the concept of an owner. Thus, a thread attempting to unlock an errorcheck mutex which another thread has locked returns with an error. A thread attempting to unlock an errorcheck mutex that is not locked also returns with an error. It should be noted that errorcheck mutexes will almost always be much slower than normal mutexes due to the extra state checks performed.

The default mutex type provides implementation-dependent error checking. The default mutex may be mapped to one of the other defined types or may be something entirely different. This enables each vendor to provide the mutex semantics which the vendor feels will be most useful to their target users. Most vendors will probably choose to make normal mutexes the default so as to give applications the benefit of the fastest type of mutexes available on their platform. Check your implementation's documentation.

An application developer can use any of the mutex types almost interchangeably as long as the application does not depend upon the implementation detecting (or failing to detect) any particular errors. Note that a recursive mutex can be used with condition variable waits as long as the application never recursively locks the mutex.

Two functions are provided in the Single UNIX Specification, Version 2 for manipulating the type attribute of a mutex attributes object. This attribute is set or returned in the type parameter of these functions. The pthread_mutexattr_settype() function is used to set a specific type value while pthread_mutexattr_gettype() is used to return the type of the mutex. Setting the type attribute of a mutex attributes object affects only mutexes initialized using that mutex attributes object. Changing the type attribute does not affect mutexes previously initialized using that mutex attributes object.
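As a minimal sketch of how these pieces fit together (error handling is abbreviated, and the program must be linked with the platform's threads library), the following creates a recursive mutex via the type attribute and shows that each lock must be balanced by an unlock:

    #include <pthread.h>
    #include <stdio.h>

    int main(void)
    {
        pthread_mutexattr_t attr;
        pthread_mutex_t     mutex;
        int                 rc;

        pthread_mutexattr_init(&attr);
        rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        if (rc != 0) {
            fprintf(stderr, "pthread_mutexattr_settype: error %d\n", rc);
            return 1;
        }
        pthread_mutex_init(&mutex, &attr);
        pthread_mutexattr_destroy(&attr);   /* the mutex keeps its own copy of the attributes */

        pthread_mutex_lock(&mutex);         /* first lock */
        pthread_mutex_lock(&mutex);         /* relocking succeeds for a recursive mutex */
        pthread_mutex_unlock(&mutex);       /* mutex is still held after one unlock */
        pthread_mutex_unlock(&mutex);       /* released only after the matching unlock */

        pthread_mutex_destroy(&mutex);
        return 0;
    }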

Read-Write Locks and Attributes

Read-write locks (also known as readers-writer locks) allow a thread to exclusively lock some shared data while updating that data, or allow any number of threads to have simultaneous read-only access to the data.

Unlike a mutex, a read-write lock distinguishes between reading data and writing data. A mutex excludes all other threads. A read-write lock allows other threads access to the data, providing no thread is modifying the data. Thus, a read-write lock is less primitive than either a mutex-condition variable pair or a semaphore.

Application developers should consider using a read-write lock rather than a mutex to protect data that is frequently referenced but seldom modified. Most threads (readers) will be able to read the data without waiting and will only have to block when some other thread (a writer) is in the process of modifying the data. Conversely, a thread that wants to change the data is forced to wait until there are no readers. This type of lock is often used to facilitate parallel access to data on multiprocessor platforms or to avoid context switches on single-processor platforms where multiple threads access the same data.

If a read-write lock becomes unlocked and there are multiple threads waiting to acquire the write lock, the implementation's scheduling policy determines which thread shall acquire the read-write lock for writing. If there are multiple threads blocked on a read-write lock for both read locks and write locks, it is unspecified whether the readers or a writer acquire the lock first. However, for performance reasons, implementations often favor writers over readers to avoid potential writer starvation.

A read-write lock object is an implementation-dependent opaque object of type pthread_rwlock_t as defined in <pthread.h>. There are two different sorts of locks associated with a read-write lock: a read lock and a write lock.

The pthread_rwlockattr_init() function initializes a read-write lock attributes object with the default value for all the attributes defined in the implementation. After a read-write lock attributes object has been used to initialize one or more read-write locks, changes to the read-write lock attributes object, including destruction, do not affect previously initialized read-write locks.

Implementations must provide at least the read-write lock attribute process-shared. This attribute can have the following values:

 Value  Definition
 PTHREAD_PROCESS_SHARED   Any thread of any process that has access to the memory where the read-write lock resides can manipulate the read-write lock.
 PTHREAD_PROCESS_PRIVATE   Only threads created within the same process as the thread that initialized the read-write lock can manipulate the read-write lock. This is the default value.

The pthread_rwlockattr_setpshared() function is used to set the process-shared attribute of an initialized read-write lock attributes object while the function pthread_rwlockattr_getpshared() obtains the current value of the process-shared attribute.

A read-write lock attributes object is destroyed using the pthread_rwlockattr_destroy() function. The effect of subsequent use of the read-write lock attributes object is undefined.

A thread creates a read-write lock using the pthread_rwlock_init() function. The attributes of the read-write lock can be specified by the application developer, otherwise the default implementation-dependent read-write lock attributes are used if the pointer to the read-write lock attributes object is NULL. In cases where the default attributes are appropriate, the PTHREAD_RWLOCK_INITIALIZER macro can be used to initialize statically allocated read-write locks.
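The following sketch shows both initialization styles; the lock names are invented for illustration and error handling is abbreviated. The process-shared attribute is set explicitly here only to show where a process-shared lock would be requested instead:

    #include <pthread.h>
    #include <stdio.h>

    /* Statically allocated read-write lock initialized with the default attributes. */
    static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;

    /* Dynamically initialized read-write lock using an attributes object. */
    static pthread_rwlock_t cache_lock;

    static int init_cache_lock(void)
    {
        pthread_rwlockattr_t attr;
        int rc;

        pthread_rwlockattr_init(&attr);
        rc = pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_PRIVATE);
        if (rc == 0)
            rc = pthread_rwlock_init(&cache_lock, &attr);
        pthread_rwlockattr_destroy(&attr);   /* does not affect the already-initialized lock */
        return rc;
    }

    int main(void)
    {
        if (init_cache_lock() != 0) {
            fprintf(stderr, "could not initialize cache_lock\n");
            return 1;
        }
        pthread_rwlock_rdlock(&table_lock);  /* the statically initialized lock is ready to use */
        pthread_rwlock_unlock(&table_lock);

        pthread_rwlock_destroy(&cache_lock);
        return 0;
    }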

A thread which wants to apply a read lock to the read-write lock can use either pthread_rwlock_rdlock() or pthread_rwlock_tryrdlock(). If pthread_rwlock_rdlock() is used, the thread acquires a read lock if a writer does not hold the write lock and there are no writers blocked on the write lock. If a read lock is not acquired, the calling thread blocks until it can acquire a lock. However, if pthread_rwlock_tryrdlock() is used, the function returns immediately with the error EBUSY if any thread holds a write lock or there are blocked writers waiting for the write lock.

A thread which wants to apply a write lock to the read-write lock can use either of two functions: pthread_rwlock_wrlock() or pthread_rwlock_trywrlock(). If pthread_rwlock_wrlock() is used, the thread acquires the write lock if no other reader or writer threads hold the read-write lock. If the write lock is not acquired, the thread blocks until it can acquire the write lock. However, if pthread_rwlock_trywrlock() is used, the function returns immediately with the error EBUSY if any thread is holding either a read or a write lock.

The pthread_rwlock_unlock() function is used to release a lock held by the calling thread on a read-write lock object. Results are undefined if the read-write lock is not held by the calling thread. If this function releases a read lock and other read locks are currently held on the read-write lock object, the object remains in the read-locked state but the calling thread is no longer one of its owners. If this function releases the last read lock for the read-write lock object, or if it releases a write lock, the read-write lock object is put in the unlocked state.
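Putting the locking functions together, a minimal sketch of the reader/writer pattern follows; the shared counter and thread counts are purely illustrative, and error handling is omitted for brevity:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
    static int shared_value = 0;

    static void *reader(void *arg)
    {
        (void)arg;
        pthread_rwlock_rdlock(&lock);   /* may block if a writer holds the lock or writers are waiting */
        printf("reader sees %d\n", shared_value);
        pthread_rwlock_unlock(&lock);
        return NULL;
    }

    static void *writer(void *arg)
    {
        (void)arg;
        pthread_rwlock_wrlock(&lock);   /* waits until no thread holds a read or write lock */
        shared_value++;
        pthread_rwlock_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t r[3], w;
        int i;

        pthread_create(&w, NULL, writer, NULL);
        for (i = 0; i < 3; i++)
            pthread_create(&r[i], NULL, reader, NULL);

        pthread_join(w, NULL);
        for (i = 0; i < 3; i++)
            pthread_join(r[i], NULL);

        pthread_rwlock_destroy(&lock);
        return 0;
    }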

The same POSIX working group which developed POSIX.1b and POSIX.1c is currently developing the IEEE PASC P1003.1j draft standard, which specifies a set of extensions for realtime and threaded programming. This includes readers-writer locks which are nearly identical to the Single UNIX Specification, Version 2 read-write locks. The Aspen Group was aware of this draft standard, but felt that there was an immediate and urgent need for standardization in the area of read-write locks.

The following table maps the Single UNIX Specification, Version 2 read-write lock functions to their equivalent IEEE PASC P1003.1j draft 5 functions:

 SUS, V2  IEEE PASC P1003.1j
 pthread_rwlock_init()  rwlock_init()
 pthread_rwlock_destroy()  rwlock_destroy()
 pthread_rwlock_rdlock()  rwlock_rlock()
 pthread_rwlock_tryrdlock()  rwlock_tryrlock()
 pthread_rwlock_wrlock()  rwlock_wlock()
 pthread_rwlock_trywrlock()  rwlock_trywlock()
 pthread_rwlock_unlock()  rwlock_unlock()
 pthread_rwlockattr_init()  rwlock_attr_init()
 pthread_rwlockattr_destroy()  rwlock_attr_destroy()
 pthread_rwlockattr_setpshared()  rwlock_attr_setpshared()
 pthread_rwlockattr_getpshared()  rwlock_attr_getpshared()

The Aspen Group chose function names which are different from those used in the IEEE PASC P1003.1j draft standard to avoid name space conflicts with those interfaces. Note that draft 5 requires the header <semaphore.h> while the Single UNIX Specification, Version 2 requires the <pthread.h> header. However, it is hoped that the final POSIX.1j standard will adopt the Aspen function names and headers instead of the current ones.

Thread Concurrency Level

On threads implementations that multiplex user threads onto a smaller set of kernel execution entities, the system attempts to create a reasonable number of kernel execution entities for the application upon application startup.

On some implementations, these kernel entities are retained by user threads that block in the kernel. Other implementations do not time-slice user threads, so multiple compute-bound user threads cannot share a single kernel thread. On such implementations, an application may use up all the available kernel execution entities before it runs out of user threads. The process may then be left with user threads capable of doing work for the application but with no way to schedule them.

The pthread_setconcurrency() function enables an application to request more kernel entities; that is, specify a desired concurrency level. However, this function merely provides a hint to the implementation. The implementation is free to ignore this request or to provide some other number of kernel entities. If an implementation does not multiplex user threads onto a smaller number of kernel execution entities, the pthread_setconcurrency() function has no effect.

The pthread_setconcurrency() function may also have an effect on implementations where the kernel mode and user mode schedulers cooperate to ensure that ready user threads are not prevented from running by other threads blocked in the kernel.

The pthread_getconcurrency() function always returns the value set by a previous call to pthread_setconcurrency(). However, if pthread_setconcurrency() was not previously called, this function shall return zero to indicate that the threads implementation is maintaining the concurrency level.
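A minimal sketch of using these two functions follows; the requested level of 4 is purely illustrative, and on some platforms an XSI feature-test macro such as _XOPEN_SOURCE=500 may be needed at compile time to expose the declarations:

    #include <pthread.h>
    #include <stdio.h>

    int main(void)
    {
        int rc;

        rc = pthread_setconcurrency(4);   /* hint: four kernel execution entities */
        if (rc != 0)
            fprintf(stderr, "pthread_setconcurrency: error %d\n", rc);

        /* Returns the last value passed to pthread_setconcurrency(),
         * or 0 if the implementation is maintaining the level itself. */
        printf("requested concurrency level: %d\n", pthread_getconcurrency());
        return 0;
    }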

Thread Stack Guard Size

DCE threads introduced the concept of a thread stack guard size. Most thread implementations add a region of protected memory to a thread's stack, commonly known as a guard region, as a safety measure to prevent a stack pointer overflow in one thread from corrupting the contents of another thread's stack. The default value of the guardsize attribute is PAGESIZE bytes; the value of PAGESIZE itself is implementation-dependent.

Some application developers may wish to change the stack guard size. When an application creates a large number of threads, the extra page allocated for each stack may strain system resources. In addition to the extra page of memory, the kernel's memory manager has to keep track of the different protections on adjoining pages. When this is a problem, the application developer may request a guard size of 0 bytes to conserve system resources by eliminating stack overflow protection.

Conversely, an application that allocates large data structures such as arrays on the stack may wish to increase the default guard size in order to detect stack overflow. If a thread allocates two pages for a data array, a single guard page provides little protection against stack overflow, since the thread can corrupt adjoining memory beyond the guard page.

The Single UNIX Specification, Version 2 defines a new attribute of the thread attributes object, guardsize, which allows applications to specify the size of the guard region of a thread's stack.

Two functions are provided for manipulating a thread's stack guard size. The pthread_attr_setguardsize() function sets the thread guardsize attribute, and the pthread_attr_getguardsize() function retrieves the current value.

An implementation may round up the requested guard size to a multiple of the configurable system variable PAGESIZE. In this case, pthread_attr_getguardsize() returns the guard size specified by the previous pthread_attr_setguardsize() function call and not the rounded up value.

If an application is managing its own thread stacks using the stackaddr attribute, the guardsize attribute is ignored and no stack overflow protection is provided. In this case, it is the responsibility of the application to manage stack overflow along with stack allocation.
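As a sketch of how an application might enlarge the guard region for threads that place large arrays on their stacks (the choice of two pages is illustrative, and error handling is abbreviated):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *worker(void *arg)
    {
        (void)arg;
        return NULL;
    }

    int main(void)
    {
        pthread_attr_t attr;
        pthread_t      tid;
        size_t         guard;
        long           pagesize = sysconf(_SC_PAGESIZE);

        pthread_attr_init(&attr);
        /* Ask for two pages of guard space; the implementation may round
         * the value it actually uses up to a multiple of PAGESIZE. */
        pthread_attr_setguardsize(&attr, (size_t)(2 * pagesize));

        /* Returns the value as stored by setguardsize, not the rounded-up value. */
        pthread_attr_getguardsize(&attr, &guard);
        printf("guardsize attribute: %lu bytes\n", (unsigned long)guard);

        pthread_create(&tid, &attr, worker, NULL);
        pthread_join(tid, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }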

Parallel I/O

Many I/O intensive applications, such as database engines, attempt to improve performance through the use of parallel I/O. However, POSIX.1 does not support parallel I/O very well because the current offset of a file is an attribute of the file descriptor.

Suppose two or more threads independently issue read requests on the same file. To read specific data from a file, a thread must first call lseek() to seek to the proper offset in the file, and then call read() to retrieve the required data. If more than one thread does this at the same time, the first thread may complete its seek call, but before it gets a chance to issue its read call a second thread may complete its seek call, resulting in the first thread accessing incorrect data when it issues its read call. One workaround is to lock the file descriptor while seeking and reading or writing, but this reduces parallelism and adds overhead.

Instead, the Single UNIX Specification, Version 2 provides two functions to make seek/read and seek/write operations atomic. The file descriptor's current offset is unchanged, thus allowing multiple read and write operations to proceed in parallel. This improves the I/O performance of threaded applications. The pread() function is used to do an atomic read of data from a file into a buffer. Conversely, the pwrite() function does an atomic write of data from a buffer to a file.
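A minimal single-threaded sketch of the two calls follows; the file name and offset are invented for illustration. In a threaded application, each thread would simply issue its own pread() or pwrite() with its own offset, with no lseek() and no locking of the file descriptor:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *msg = "record-A";
        char        buf[16];
        ssize_t     n;
        int         fd;

        fd = open("demo.dat", O_RDWR | O_CREAT, 0644);   /* hypothetical file name */
        if (fd == -1) {
            perror("open");
            return 1;
        }

        /* Write the record at offset 1024 without moving the file offset. */
        if (pwrite(fd, msg, strlen(msg), 1024) == -1)
            perror("pwrite");

        /* Read it back from the same offset, again leaving the file offset alone. */
        n = pread(fd, buf, sizeof(buf) - 1, 1024);
        if (n > 0) {
            buf[n] = '\0';
            printf("read back: %s\n", buf);
        }

        close(fd);
        return 0;
    }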

Functional Overview

The <pthread.h> header defines two new types: the read-write lock object pthread_rwlock_t and the read-write lock attributes object pthread_rwlockattr_t.

The <pthread.h> header defines the following new macros: PTHREAD_MUTEX_NORMAL, PTHREAD_MUTEX_RECURSIVE, PTHREAD_MUTEX_ERRORCHECK, PTHREAD_MUTEX_DEFAULT, and PTHREAD_RWLOCK_INITIALIZER.

All of the following functions have their prototypes defined in <pthread.h>:

pthread_mutexattr_gettype() pthread_mutexattr_settype()
pthread_rwlock_init() pthread_rwlock_destroy() pthread_rwlock_rdlock() pthread_rwlock_tryrdlock()
pthread_rwlock_wrlock() pthread_rwlock_trywrlock() pthread_rwlock_unlock()
pthread_rwlockattr_init() pthread_rwlockattr_destroy() pthread_rwlockattr_getpshared() pthread_rwlockattr_setpshared()
pthread_attr_getguardsize() pthread_attr_setguardsize()
pthread_getconcurrency() pthread_setconcurrency()

The remaining two new functions, pread() and pwrite(), have their prototypes defined in <unistd.h>.

More Information

More information on the Single UNIX Specification, Version 2 can be obtained from the following sources:

Acknowledgements

This article was authored by Finnbarr Murphy, Dave Butenhof (of Digital), and Andrew Josey (of The Open Group).



Read or download the complete Single UNIX Specification from http://www.UNIX-systems.org/go/unix.

Copyright © 1997-1998 The Open Group

UNIX is a registered trademark of The Open Group.